Frame extraction in the cluster using entry points #88

Draft. Wants to merge 34 commits into base: main.

Commits (34):
4a38bef
edit bash script for Aug23 day3
sfmig Nov 1, 2023
3a9afca
draft bash script that copies logs to destination and reencodes videos
sfmig Nov 1, 2023
e0aac64
add reencoding step
sfmig Nov 1, 2023
f2ada6f
add reencoding step and log ffmpeg params (WIP)
sfmig Nov 2, 2023
6ec8ebd
frame extraction with logs but without reencoding
sfmig Nov 2, 2023
156e769
reencoding and frame extraction for day2
sfmig Nov 13, 2023
27aa6e7
add loops to logs. change filename of reencoded logs.
sfmig Nov 15, 2023
18f9093
remove parent directory name from extracted image filename
sfmig Nov 15, 2023
2c38437
script for day3 job
sfmig Nov 15, 2023
f32fa77
fix for loop
sfmig Nov 15, 2023
a645a3b
repeat reencoded failed jobs
sfmig Nov 16, 2023
e085c39
Sep2023 day4 job
sfmig Nov 17, 2023
82341d7
repeat day 1-04 and 05 on reencoded with less frames
sfmig Nov 17, 2023
24f1cc2
Merge branch 'main' into smg/frame-extraction-w-reencoding
sfmig Nov 20, 2023
c60335f
add clarification for array job syntax
sfmig Nov 20, 2023
cbaff37
remove logs comment
sfmig Nov 20, 2023
5a9d5cb
remove parent directory from name of extracted frame
sfmig Nov 20, 2023
7b6b318
clarify TODO about all files in directory
sfmig Nov 20, 2023
78dd34b
clarify sbatch syntax for array job
sfmig Nov 20, 2023
c83d4d0
add option to reencode or not the videos
sfmig Nov 20, 2023
5d15106
print to log if frame extraction fails
sfmig Nov 20, 2023
cba1639
fix path for new structure
sfmig Nov 20, 2023
5977bd3
clarify reencoding is optional
sfmig Nov 20, 2023
e16ac59
move check earlier
sfmig Nov 20, 2023
0bce7f2
check just below input list
sfmig Nov 20, 2023
02fc3a6
fix number of array jobs
sfmig Nov 20, 2023
622d2e4
actually fix number of array jobs...
sfmig Nov 20, 2023
b7934d1
fix path and if statements
sfmig Nov 20, 2023
60d69b7
derive extension from input video
sfmig Nov 20, 2023
bbad7de
bash script for day4 01-Right rep
sfmig Nov 20, 2023
797ffe5
Merge branch 'smg/frame-extraction-w-reencoding' of github.com:Sainsb…
sfmig Nov 20, 2023
2513601
Merge branch 'main' into smg/frame-extraction-w-reencoding
sfmig Nov 20, 2023
a66f14e
delete frame extraction only script and frame extraction local
sfmig Nov 20, 2023
06fbd0e
a draft bash script for running frame extraction with entry points
sfmig Nov 20, 2023
91 changes: 0 additions & 91 deletions bash_scripts/run_frame_extraction_array.sh

This file was deleted.

45 changes: 0 additions & 45 deletions bash_scripts/run_frame_extraction_local.sh

This file was deleted.

149 changes: 149 additions & 0 deletions bash_scripts/run_frame_extraction_w_entry_points.sh
@@ -0,0 +1,149 @@
#!/bin/bash

#SBATCH -p gpu # partition
#SBATCH -N 1 # number of nodes
#SBATCH --mem 64G # memory pool for all cores
#SBATCH -n 2 # number of cores
#SBATCH -t 3-00:00 # time (D-HH:MM)
#SBATCH --gres gpu:1 # request 1 GPU (of any kind)
#SBATCH -o slurm_array.%N.%A-%a.out
#SBATCH -e slurm_array.%N.%A-%a.err
#SBATCH --mail-type=ALL
#SBATCH [email protected]

# Run this script as
#   sbatch --array=0-n%m run_frame_extraction_w_entry_points.sh --config=input.json
#
# The idea is that this script changes as little as possible!
# Instead, input.json is the only file modified, and its content is printed to the logs.
#
# NOTE on the sbatch option "--array=0-n%m":
#   it runs the array tasks with indices 0 to n (n+1 tasks in total),
#   with no more than m running at a time.
#   The number of array tasks must match the number of input files,
#   so for N input files use --array=0-(N-1)%m.


# ---------------------
# Create conda env
# ----------------------
# conda env create
# git clone repo
# pip install package



# ----------------------
# Input config
# ----------------------
# Print full json file to logs
# https://www.baeldung.com/linux/jq-command-json#1-prettify-json

# Check json
# Some config fields are mandatory


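# Define defaults for optional fields,
# used only if they are not defined in the config.
# NOTE: OUTPUT_DIR, OUTPUT_SUBDIR and REENCODED_VIDEOS_DIR must be read from
# the config before this point; if they are empty, these defaults resolve to
# paths directly under the filesystem root.
LOG_DIR="$OUTPUT_DIR/$OUTPUT_SUBDIR/logs"
REENCODED_VIDEOS_SUBDIR="$REENCODED_VIDEOS_DIR/$OUTPUT_SUBDIR"
# flag_reencode_input_videos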

# ----------------------
# Input data
# ----------------------
# Read input videos from json file
# https://jqlang.github.io/jq/
# INPUT_DATA_LIST=()

# Check that the number of inputs matches SLURM_ARRAY_TASK_COUNT
# (each array task processes one input); if not, exit
if [[ "$SLURM_ARRAY_TASK_COUNT" -ne "${#INPUT_DATA_LIST[@]}" ]]; then
    echo "The number of array tasks does not match the number of inputs"
    exit 1
fi


# ----------------------
# Output locations
# ----------------------
# Read output dir and subdir from json
# OUTPUT_DIR=/ceph/zoo/users/sminano/crabs_bboxes_labels
# OUTPUT_SUBDIR="Sep2023_day4_reencoded"

# Create location of SLURM logs
mkdir -p "$LOG_DIR"  # create if it doesn't exist

# read reencoding flag from json
# flag_reencode_input_videos
# https://stackoverflow.com/a/28185962

# Define location of reencoded videos if required
if [ "$flag_reencode_input_videos" = true ] ; then
    # Read reencoded dir from json
    # REENCODED_VIDEOS_DIR=/ceph/zoo/users/sminano/crabs_reencoded_videos
    # REENCODED_VIDEOS_SUBDIR=$REENCODED_VIDEOS_DIR/$OUTPUT_SUBDIR
    mkdir -p "$REENCODED_VIDEOS_SUBDIR"  # create if it doesn't exist
fi


# ------------------------
# Command line tool
# ------------------------
# Each array task processes exactly one input video, the one at index
# SLURM_ARRAY_TASK_ID, so no loop over the tasks is needed here.
# (A loop over {1..${SLURM_ARRAY_TASK_COUNT}} would not iterate as intended
# anyway: bash performs brace expansion before variable expansion.)

# Input video
SAMPLE=${INPUT_DATA_LIST[${SLURM_ARRAY_TASK_ID}]}
echo "Input video: $SAMPLE"
echo "--------"

# Input filename without parent directory or extension (used to name the logs)
filename=$(basename "$SAMPLE")
filename_no_ext="${filename%.*}"

# --------------------------
# Reencode video - if required (CLI tool)
# --------------------------
if [ "$flag_reencode_input_videos" = true ] ; then
    echo "Reencoding ..."
    reencode-video ...

    # # Check status
    # if [ "$?" -ne 0 ]; then
    #     echo "Reencoding failed! Please check .err log"
    # else
    #     echo "Reencoded video: $REENCODED_VIDEO_PATH"
    # fi
    # echo "--------"
fi


# -------------------
# Extract frames
# -------------------
echo "Extracting frames ..."
extract-frames ...

# # Check status
# if [ "$?" -ne 0 ]; then
#     echo "Frame extraction failed! Please check .err log"
# else
#     echo "Frames extracted from video: $FRAME_EXTRACTION_INPUT_VIDEO"
# fi
# echo "--------"


# -------------------
# Logs
# -------------------
# Reencoded videos log
# If reencoding was requested, copy the .err and .out files next to the reencoded video too
# filename: {reencoded video name}.{slurm_array}.{slurm_job_id}
# TODO: make a nicer log
if [ "$flag_reencode_input_videos" = true ] ; then
    for ext in err out
    do
        cp "slurm_array.$SLURMD_NODENAME.$SLURM_ARRAY_JOB_ID-$SLURM_ARRAY_TASK_ID.$ext" \
            "$REENCODED_VIDEOS_SUBDIR/${filename_no_ext}_RE.slurm_array.$SLURM_ARRAY_JOB_ID-$SLURM_ARRAY_TASK_ID.$ext"
    done
fi

# Frame extraction logs
# Move logs for this job to subdir with extracted frames
for ext in err out
do
    mv "slurm_array.$SLURMD_NODENAME.$SLURM_ARRAY_JOB_ID-$SLURM_ARRAY_TASK_ID.$ext" "$LOG_DIR"
done
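
A note on per-task input selection in array scripts of this kind. The sketch below is illustrative only: the video paths are hypothetical, and the SLURM variables are set by hand so that the snippet runs outside the cluster. It shows why `{1..$n}` does not iterate in bash, how one task can pick its input by `SLURM_ARRAY_TASK_ID`, how the filename stem and extension can be derived with parameter expansion, and how an exit status can be tested directly in the `if`.

```bash
#!/bin/bash
# Hypothetical inputs; in the real script these would come from the JSON config.
INPUT_DATA_LIST=(
    "/data/videos/04.09.2023-01-Right.MOV"
    "/data/videos/04.09.2023-02-Left.MOV"
    "/data/videos/04.09.2023-03-Right.mp4"
)

# Set by hand for this demo; SLURM sets these for each task of a real array job.
SLURM_ARRAY_TASK_COUNT=3
SLURM_ARRAY_TASK_ID=1

# Brace expansion runs before variable expansion, so {1..$n} is not a range:
# it stays a single literal word.
n=$SLURM_ARRAY_TASK_COUNT
literal=( {1..$n} )
echo "words: ${#literal[@]}, first word: ${literal[0]}"

# If a counted loop is really needed, a C-style loop works with variables.
count=0
for (( i = 0; i < n; i++ )); do
    count=$(( count + 1 ))
done
echo "C-style loop iterations: $count"

# In an array job, each task handles only the input at its own index.
SAMPLE=${INPUT_DATA_LIST[$SLURM_ARRAY_TASK_ID]}
filename=$(basename "$SAMPLE")      # drop the parent directories
filename_no_ext=${filename%.*}      # drop the shortest trailing ".ext"
extension=${filename##*.}           # keep only what follows the last "."
echo "Task $SLURM_ARRAY_TASK_ID: $filename_no_ext ($extension)"

# Testing the command directly in the `if` avoids relying on "$?" afterwards.
fake_extract_frames() { return 1; }  # stand-in for a failing CLI call
if fake_extract_frames; then
    status="ok"
else
    status="failed"
    echo "Frame extraction failed! Please check .err log"
fi
```

Only the last dot is treated as the extension separator, so dates with dots in the filename (as in the hypothetical names above) survive in the stem.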
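
The config-reading steps are still placeholders in this draft. One possible shape for them, using `jq` as the script's own comments suggest, is sketched below. The JSON field names (`input_videos`, `output_dir`, `output_subdir`, `reencode_input_videos`), the paths, and the helper `parse_config_arg` are all assumptions made for illustration; the PR does not define the config schema. The snippet writes a temporary config file so it can run on its own, and skips the `jq` part if `jq` is not installed.

```bash
#!/bin/bash
# Hypothetical helper: extract the path from a `--config=<path>` argument.
parse_config_arg() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            --config=*) echo "${arg#--config=}"; return 0 ;;
        esac
    done
    return 1
}

# Hypothetical config; the field names are illustrative, not the PR's schema.
CONFIG_FILE=$(mktemp)
cat > "$CONFIG_FILE" <<'EOF'
{
  "input_videos": ["/data/videos/a.MOV", "/data/videos/b.MOV"],
  "output_dir": "/data/out",
  "output_subdir": "Sep2023_day4"
}
EOF

# In the sbatch script this would be: CONFIG_PATH=$(parse_config_arg "$@")
CONFIG_PATH=$(parse_config_arg --config="$CONFIG_FILE")

if command -v jq > /dev/null; then
    # Print the full (prettified) config to the logs
    jq . "$CONFIG_PATH"

    # Check mandatory fields; `jq -e` exits non-zero if the result is false or null
    if jq -e 'has("input_videos") and has("output_dir")' "$CONFIG_PATH" > /dev/null; then
        mandatory_ok=true
    else
        mandatory_ok=false
        echo "Config is missing a mandatory field" >&2
    fi

    # Read the list of input videos into a bash array, one element per line
    mapfile -t INPUT_DATA_LIST < <(jq -r '.input_videos[]' "$CONFIG_PATH")

    OUTPUT_DIR=$(jq -r '.output_dir' "$CONFIG_PATH")
    OUTPUT_SUBDIR=$(jq -r '.output_subdir' "$CONFIG_PATH")

    # `// false` supplies a default when the optional field is absent
    flag_reencode_input_videos=$(jq -r '.reencode_input_videos // false' "$CONFIG_PATH")

    # Defaults that depend on the values read above
    LOG_DIR="$OUTPUT_DIR/$OUTPUT_SUBDIR/logs"
    echo "Inputs: ${#INPUT_DATA_LIST[@]}, logs in: $LOG_DIR, reencode: $flag_reencode_input_videos"
else
    echo "jq not found; skipping the config-reading part of this sketch" >&2
fi

rm -f "$CONFIG_FILE"
```

Reading the config first and deriving `LOG_DIR` afterwards keeps the defaults from being built out of empty variables.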