Commit

Clarify array job syntax
sfmig committed Nov 20, 2024
1 parent a742e5d commit acb9ca6
Showing 3 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion guides/DetectAndTrackHPC.md
@@ -43,7 +43,7 @@
- The `CKPT_PATH` variable, which is the path to the trained detector model.
- The `VIDEOS_DIR` variable, which defines the path to the videos directory.
- The `VIDEO_FILENAME` variable, which allows us to define a wildcard expression to select a subset of videos in the directory. See the examples in the bash script comments for the syntax.
-- Remember that the number of videos to run inference on needs to match the number of jobs in the array. To change the number of jobs in the array job, edit the line that start with `#SBATCH --array=0-n%m`. That command specifies to run `n` separate jobs, but not more than `m` at a time.
+- Remember that the number of videos to run inference on needs to match the number of jobs in the array. To change the number of jobs in the array job, edit the line that starts with `#SBATCH --array=0-n%m` and set `n` to the total number of jobs minus 1. The variable `m` refers to the number of jobs that can be run at a time.
Less frequently, one may need to edit:
2 changes: 1 addition & 1 deletion guides/EvaluatingModelsHPC.md
@@ -42,7 +42,7 @@
When launching an array job, we may want to edit the following variables in the bash script:
- The `MLFLOW_CKPTS_FOLDER` and the `CKPT_FILENAME` variables, define which trained models we would like to evaluate. See the examples in the bash script comments for the syntax.
-- The number of trained models to evaluate needs to match the number of jobs in the array. To change the number of jobs in the array job, edit the line that start with `#SBATCH --array=0-n%m`. That command specifies to run `n` separate jobs, but not more than `m` at a time.
+- The number of trained models to evaluate needs to match the number of jobs in the array. To change the number of jobs in the array job, edit the line that starts with `#SBATCH --array=0-n%m` and set `n` to the total number of jobs minus 1. The variable `m` refers to the number of jobs that can be run at a time.
- The `MLFLOW_FOLDER`. By default, we point to the "scratch" folder at `/ceph/zoo/users/sminano/ml-runs-all/ml-runs-scratch` . This folder holds runs that we don't need to keep. For runs we would like to keep, we will instead point to the folder at `/ceph/zoo/users/sminano/ml-runs-all/ml-runs`.
Less frequently, one may need to edit:
2 changes: 1 addition & 1 deletion guides/TrainingModelsHPC.md
@@ -65,7 +65,7 @@
Additionally for an array job, one may want to edit the number of jobs in the array (by default set to 3):
-- this would mean editing the line that start with `#SBATCH --array=0-n%m` in the `run_training_array.sh` script. That command specifies to run `n` separate jobs, but not more than `m` at a time.
+- this would mean editing the line that start with `#SBATCH --array=0-n%m` in the `run_training_array.sh` script. You will need to set `n` to the total number of jobs minus 1. The variable `m` refers to the number of jobs that can be run at a time.
- if the number of jobs in the array is edited, the variable `LIST_SEEDS` needs to be modified accordingly, otherwise we will get an error when launching the job.
1. **Edit the config YAML file if required**
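
As a rough illustration of the `--array=0-n%m` syntax clarified in this commit, the sketch below derives the array specification from a job count and a concurrency limit. The helper name `array_spec` and its arguments are hypothetical and are not part of the repository's scripts; the only assumption taken from the guides is that array indices start at 0, so `n` is the total number of jobs minus 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): build the value that
# follows `#SBATCH --array=` from a job count and a concurrency limit.
array_spec() {
    local n_jobs="$1"        # total number of jobs in the array
    local max_parallel="$2"  # maximum number of jobs running at a time (m)
    # Indices start at 0, so the last index n is n_jobs - 1.
    echo "0-$(( n_jobs - 1 ))%${max_parallel}"
}

# Example: 3 jobs in total, at most 2 running at a time.
echo "#SBATCH --array=$(array_spec 3 2)"
```

For 3 jobs with at most 2 running at a time, this prints `#SBATCH --array=0-2%2`, which is the line one would place in the bash script by hand.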