I have searched the YOLOv5 issues and found no similar bug report.
YOLOv5 Component
Training
Bug
While fine-tuning a pruned model for several epochs, I found that the loss intermittently becomes NaN. Stepping through with breakpoints, I traced the problem to metrics.py.
When a predicted bounding box has a width or height of 0, the result can be NaN, because the CIoU computation uses h1 and h2 as divisors.
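For reference, the aspect-ratio consistency term from the standard CIoU formulation is written out below. The symbol names follow the report (w1, h1 and w2, h2 for the two boxes); which box is the prediction and which is the target in metrics.py is an assumption here, not something the report states.

```latex
v = \frac{4}{\pi^{2}} \left( \arctan\frac{w_2}{h_2} - \arctan\frac{w_1}{h_1} \right)^{2}
```

If either height is exactly 0, the corresponding ratio is undefined. In IEEE floating-point arithmetic, 0/0 evaluates to NaN, which then propagates through the rest of the loss.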
Environment
No response
Minimal Reproducible Example
No response
Additional
No response
Are you willing to submit a PR?
Yes I'd like to help by submitting a PR!
👋 Hello @tobymuller233, thank you for your interest in YOLOv5 🚀! It seems like you're encountering a nan values issue during training, and there might be a potential bug in the metrics.py file. To assist, we'll need a bit more information.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us understand and debug the issue. This would include steps to replicate the bug, relevant sections of your code, and any specific error messages.
Additionally, it would be helpful to know more about your environment setup, such as the version of Python, PyTorch, and any other dependencies you are using.
If you have any further insights, like dataset characteristics or specific conditions that might trigger this issue, do share those as well.
Please note that this is an automated response, and an Ultralytics engineer will review your issue and provide further assistance soon. Thank you for your patience and help in improving YOLOv5! 🚀✨
@tobymuller233 thank you for reporting this potential issue with loss computation. You've identified an important edge case where predictions with zero width or height could cause NaN values during CIoU loss calculation.
Before proceeding with a PR, please verify this behavior using the latest version of YOLOv5 as there have been several loss computation improvements. If you can provide a minimal reproducible example (MRE) following our MRE guide, it would help us investigate the issue more effectively.
For now, you could add a small epsilon value to prevent division by zero in the height calculations. However, we should also investigate why the model is predicting zero-sized bounding boxes during training, as this may indicate other underlying issues with the training process or data.
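The epsilon workaround suggested above can be sketched as follows. This is a minimal, pure-Python illustration of the idea and not the code in metrics.py: the function name `aspect_ratio_term`, the default `EPS = 1e-7`, and the choice to clamp the height (rather than add epsilon to it) are all assumptions made for the example.

```python
import math

EPS = 1e-7  # assumed value; choose one suited to your box scale and dtype


def aspect_ratio_term(w1, h1, w2, h2, eps=EPS):
    """Hypothetical helper: CIoU aspect-ratio term v with guarded divisors."""
    # Clamp each height to at least eps so w / h stays finite
    # even when a box has zero height.
    h1 = max(h1, eps)
    h2 = max(h2, eps)
    return (4 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2


# A degenerate box (zero width and height) no longer breaks the computation.
v = aspect_ratio_term(0.0, 0.0, 2.0, 1.0)
print(math.isfinite(v))
```

With tensors, the analogous guard would be clamping the height to a minimum of epsilon before dividing. Note that this only hides the symptom; as mentioned above, zero-sized predictions are still worth investigating.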
If you'd like to submit a PR, please ensure it includes: