Data augmentation #141

sfmig · 2024-03-27T12:55:35Z

Rebase after #203 is merged

This PR adds a few data augmentation transforms that we think could be helpful.

Specifically,

* adds some preselected transforms with reasonable values to the config yaml file,
* adds a CLI option --no_data_augmentation to skip all data augmentation during training,
* adds a CLI option --log_data_augmentation to log the data augmentations linked to the datamodule as MLflow artefacts,
* adds a notebook for visualisation,
* adds data augmentation tests.
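
As a rough illustration of how the two CLI flags and the config section could interact, here is a minimal sketch. Only the flag names come from the description above; the config keys, the transform names, the boolean (store_true) flag style, and the `select_transforms` helper are assumptions made for the example and are not taken from this PR's code.

```python
import argparse

# Hypothetical config section: the real keys and values live in the
# project's config yaml and are not shown in this PR description.
CONFIG = {
    "data_augmentation": {
        "horizontal_flip": {"p": 0.5},
        "brightness": {"factor": 0.2},
    }
}


def parse_args(argv=None):
    """Parse the two flags named in the PR (assumed here to be boolean switches)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--no_data_augmentation",
        action="store_true",
        help="Skip all data augmentation during training.",
    )
    parser.add_argument(
        "--log_data_augmentation",
        action="store_true",
        help="Log the data augmentation transforms as artefacts.",
    )
    return parser.parse_args(argv)


def select_transforms(config, args):
    """Return the (name, params) pairs to apply during training.

    Augmentation is applied only if the config has a 'data_augmentation'
    key and the user did not pass --no_data_augmentation.
    """
    if args.no_data_augmentation or "data_augmentation" not in config:
        return []
    return list(config["data_augmentation"].items())
```

With this shape, `select_transforms(CONFIG, parse_args(["--no_data_augmentation"]))` returns an empty list, while calling it with no flags returns the two configured entries in the order they appear in the config.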

codecov-commenter · 2024-03-27T13:01:09Z

Codecov Report

Attention: Patch coverage is 62.22222% with 17 lines in your changes missing coverage. Please review.

Project coverage is 37.91%. Comparing base (87babb5) to head (54e45b6).

Files	Patch %	Lines
crabs/detection_tracking/detection_utils.py	35.71%	9 Missing ⚠️
crabs/detection_tracking/train_model.py	20.00%	8 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #141      +/-   ##
==========================================
+ Coverage   37.05%   37.91%   +0.85%     
==========================================
  Files          20       20              
  Lines        1414     1440      +26     
==========================================
+ Hits          524      546      +22     
- Misses        890      894       +4


nikk-nikaznan

Cool stuff! I am happy with this: it runs well, and I tried commenting out some of the variables as well. The only thing is that I think we need to add a guide, or a section in the README, for this. Since we have never tested this properly, the defaults in the config will not necessarily give the best results, and some transforms might even harm the model. But this will be great for anyone who wants to start an ablation study on data augmentation.

sfmig · 2024-06-28T09:45:55Z

aah good call!
Yeah, my idea was to run a study first, then get a rough estimate of the best-performing set of parameters (rough because I am not optimising the parameters per transform, for example), and then have those as the defaults, as you say.

But I think having a guide on how to run a study like this could be helpful - I opened an issue.

thanks Nik!
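
One way such a study could be set up is to enumerate on/off combinations of the transforms and train once per combination. The sketch below only generates the combinations; the transform names are placeholders, the `ablation_subsets` helper is hypothetical, and none of it reflects the guide tracked in the issue.

```python
from itertools import combinations


def ablation_subsets(transform_names):
    """Yield every on/off combination of the given transforms, smallest first."""
    for size in range(len(transform_names) + 1):
        for subset in combinations(transform_names, size):
            yield subset


# Placeholder names: substitute the keys used in the config yaml.
names = ["horizontal_flip", "brightness", "rotation"]
runs = list(ablation_subsets(names))
```

Each tuple in `runs` is the set of transforms to keep enabled for one training run. The number of runs doubles with every added transform, so a leave-one-out subset of these combinations may be more practical for longer transform lists.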

* Move checkpoint type computation to utils
* Refactor checkpointing in training script
* Get ckpt type if ckpt is passed
* optionally apply a data augmentation method (WIP)
* fix config syntax in code
* add data augmentation notebook
* notebook to explore params of individual transformations
* add transforms from config
* Add keywords to datamodule params
* Optionally skip data augmentation
* If data augmentation key in config, apply
* Update tests
* Change tests to read default config
* Refactor transform functions and clean up
* update notebook
* Fix data augmentation default config
* Optionally log data augmentation transforms as artifacts
* Rename skip to 'no_data_augmentation'

sfmig force-pushed the smg/data-augm branch from c392410 to 723be96 Compare April 12, 2024 12:35

sfmig mentioned this pull request Apr 12, 2024

Improvements and other ideas for training detection #96

Closed

3 tasks

sfmig force-pushed the smg/data-augm branch 3 times, most recently from a1c0a7c to 90da5cb Compare June 27, 2024 13:15

sfmig added 2 commits June 27, 2024 16:04

Move checkpoint type computation to utils

12315cb

Refactor checkpointing in training script

9d41ed4

sfmig force-pushed the smg/data-augm branch from 1546145 to 7111fcb Compare June 27, 2024 15:09

sfmig added 14 commits June 27, 2024 18:28

Get ckpt type if ckpt is passed

94361ab

optionally apply a data augmentation method (WIP)

8e38530

fix config syntax in code

27c2376

add data augmentation notebook

21f4b4e

notebook to explore params of individual transformations

4865cc6

add transforms from config

42be084

Add keywords to datamodule params

b2fdde8

Optionally skip data augmentation

633c6c8

If data augmentation key in config, apply

a4be766

Update tests

ad406d7

Change tests to read default config

3ad8420

Refactor transform functions and clean up

6b1a723

update notebook

3df8635

Fix data augmentation default config

43f852b

sfmig force-pushed the smg/data-augm branch from 7111fcb to 43f852b Compare June 27, 2024 17:30

Optionally log data augmentation transforms as artifacts

5119bd1

sfmig force-pushed the smg/data-augm branch from d499ce7 to 5119bd1 Compare June 27, 2024 17:43

Rename skip to 'no_data_augmentation'

54e45b6

sfmig marked this pull request as ready for review June 27, 2024 18:02

sfmig requested a review from nikk-nikaznan June 27, 2024 18:02

nikk-nikaznan approved these changes Jun 28, 2024

nikk-nikaznan mentioned this pull request Jun 28, 2024

Refactor tracking #193

Merged

sfmig merged commit 81db31e into main Jun 28, 2024
6 checks passed

sfmig deleted the smg/data-augm branch June 28, 2024 09:46

sfmig mentioned this pull request Jun 28, 2024

Small refactoring to load from ckpt #203

Merged
