Initial refactor / restructure of the codebase #73
Comments
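Spoke a bit about this on Thursday (@nikk-nikaznan). My CoR-like idea is something closer to SLEAP's, although their objects don't have an inheritance structure. It'd be something in code on our side. I was thinking something like:

    import abc


    class OhDearSomethingWentWrongError(Exception):
        """Raised when a pipeline step reports a failure."""


    class PipelineStep(abc.ABC):
        @abc.abstractmethod
        def run(self):
            """Run the step; return a result object with an is_ok() method."""


    class VideoConversionStep(PipelineStep):
        def __init__(self, encoder, list_of_videos):
            self.encoder = encoder
            self.list_of_videos = list_of_videos

        def run(self):
            # run the actual encoder code
            return self.encoder.encode(self.list_of_videos)


    class FrameExtractionStep(PipelineStep):
        def run(self):
            ...


    class Pipeline:
        def __init__(self, steps: list[PipelineStep]):
            self.steps = steps

        def run(self):
            # run the steps in order and stop at the first failure
            for step in self.steps:
                output = step.run()
                if not output.is_ok():
                    raise OhDearSomethingWentWrongError()

Other options we discussed:
My kneejerk would be to try snakemake first.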
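The sketch above relies on every step returning something with an `is_ok()` method, which isn't pinned down anywhere yet. One possible shape for that, purely as an illustration: `StepResult`, `StepFailedError`, `run_steps` and the two placeholder steps are all made-up names, not existing code in the repo.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical result type: what a step hands back to the pipeline."""

    ok: bool
    message: str = ""

    def is_ok(self) -> bool:
        return self.ok


class StepFailedError(Exception):
    """Hypothetical error raised when a step reports a failure."""


def run_steps(steps):
    """Call each step in order; stop at the first result that is not ok."""
    completed = []
    for step in steps:
        result = step()
        if not result.is_ok():
            raise StepFailedError(result.message)
        completed.append(result)
    return completed


def convert_videos():
    # placeholder for the real encoder call
    return StepResult(ok=True, message="videos converted")


def extract_frames():
    # placeholder that simulates a failing step
    return StepResult(ok=False, message="no frames found")
```

With this, `run_steps([convert_videos])` returns a single result, while `run_steps([convert_videos, extract_frames])` raises `StepFailedError` carrying the failing step's message. Carrying a message on the result is just one choice; raising directly inside the step would also work.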
This should deal with the following issues:
After the first chunk of light refactoring (#86), we merged the new structure to main, and then went on to close the currently open PRs:
The codebase is getting a bit wild 🐆, and I think some steps are consolidated enough now to make them a bit more established.
Roughly the pipeline would involve the following steps:
The last three steps are less well defined at this point.
@nikk-nikaznan and I chatted a bit today and some ideas came up:
One option could be to follow a more functional programming approach: we have config files for each of the pipeline steps holding the main parameters, and scripts that take these config files as CLI arguments. The steps are then run using bash scripts. I am not very keen on more config files, but I think this would be the easiest to transition to at the moment. This is sort of what we are doing now with the frame extraction step.
Would this functional programming option be similar to the chain of responsibility @samcunliffe suggested?
Another option would be a more OOP approach; I was thinking maybe something similar to SLEAP's pipelines?
Nik and Matt also suggested having a look at DVC - seems very well suited for ML but still flexible, might be a good investment to learn about it.
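To make the first option (config file per step, script taking it as a CLI argument) a bit more concrete, here is a rough sketch. Everything in it is an assumption for illustration: the JSON format, the `n_frames` and `videos` keys, the `--config` flag and the function names are made up, and the step body is only a placeholder rather than our actual frame extraction code.

```python
import argparse
import json
from pathlib import Path


def load_config(config_path):
    """Read the step's parameters from a JSON config file (format assumed)."""
    return json.loads(Path(config_path).read_text())


def extract_frames(config):
    """Placeholder step: only reports what it would do with the config."""
    return (
        f"would extract {config['n_frames']} frames "
        f"from {len(config['videos'])} video(s)"
    )


def main(argv=None):
    """Parse the CLI arguments, load the config and run the step."""
    parser = argparse.ArgumentParser(description="Run one pipeline step.")
    parser.add_argument("--config", required=True, help="path to the step config")
    args = parser.parse_args(argv)
    summary = extract_frames(load_config(args.config))
    print(summary)
    return summary

# In a real script, main() would be called under an
# `if __name__ == "__main__":` guard at the bottom of the file.
```

A bash script would then just call one such script per step, each with its own config file, in the order the pipeline needs.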
Any thoughts more than welcome, happy to discuss further at our next gemba.