
Implement validation loss during training #159

Open
sfmig opened this issue Apr 12, 2024 · 0 comments
sfmig commented Apr 12, 2024

Logging validation loss is trickier than expected due to the model we are using.

  • We are using torchvision's fasterrcnn_resnet50_fpn_v2, which returns losses only in train mode and predictions only in eval mode
  • What can we do?
    • redefine the eager_outputs() method of the parent class GeneralizedRCNN?
      • not sufficient: even if we hack this method, the loss is still an empty dict when run in validation, so we probably need to do something like this
    • use a custom forward method?
      • ptrblck to the rescue
      • probably our best bet, but I think it's harder than what Patrick suggests: all of the stages (rpn, roi_heads) behave the same way as the "parent" network, i.e. they only compute losses during training, so detector_losses and proposal_losses are empty dicts in eval mode.
      • this person reimplemented all the relevant bits in an old version of the codebase; we could probably replicate that (preferably without patching torchvision directly).
    • use another model?
      • we could have a go at YOLO: v8 is the latest, and v5 seems straightforward to add to our pipeline.
      • other detection models available in PyTorch here
      • a comparison of some convolutional approaches for detection here