Improvements on "docker biocontainers" to bio.tools metadata sync #12
Comments
Current files in the form *.biocontainers.yaml come from a parallel repository, https://github.com/BioContainers/tools-metadata/, which hasn't been updated for the last three years. Removing these. The biocontainers sync will be revisited (see BioContainers/ci#12 and BioContainers/ci#13).
Regarding the new 'filepath': the 'bio.tools ID' is an optional part of the submitted Dockerfile (some tools do not have a bio.tools ID). How should we manage this situation? (Currently, I believe we already use the bio.tools ID in the path if available, and otherwise default to the software name in the path of the provided Dockerfile.)
Also, regarding the 'biocontainer ID': what should we use? [Tool_name]_[version]? (E.g. diann:1.8.1_cv2? Or should we remove the _cv2, to make sure we update the tool yaml rather than create a new one?) The cv1 / cv2 is linked to the 'biocontainer Dockerfile version', not to the tool version itself (example here). Should we have separate files? As an example, with the 'cadd-with-script' PR, using the cadd bio.tools ID (cadd_phred), we would have:
Each update to the Dockerfile (for the same version of cadd), would add another file.
And if we had a PR with cadd itself (instead of cadd-scripts-xxx), it would be
So, the way it works in the import now (didn't use to) is that:
I would say that if cv2 replaces cv1 but keeps the same metadata and the same tool in the same version, we should use the same ID (e.g.
@hmenager In all cases, we add a file in Regarding the
The question is:
GitHub already takes care of versioning, and all the different versions will be in https://github.com/BioContainers anyway. It might be good to have a look at exactly what we want in terms of metadata content and formatting.
For the biocontainers ID, we need just the software name, and we only need the latest version!
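As a rough illustration of "just the software name", the sketch below splits a container tag such as diann:1.8.1_cv2 into the software name and a tool version with the _cvN Dockerfile revision removed. The helper names are hypothetical and not part of the BioContainers CI, and the tag layout (name:version_cvN) is assumed from the example quoted in this thread.

```python
import re
from typing import Tuple

# Hypothetical helpers for illustration only; not part of the BioContainers CI.
# Assumes tags shaped like "name:version_cvN", as in "diann:1.8.1_cv2".
_CV_SUFFIX = re.compile(r"_cv\d+$")


def split_container_tag(tag: str) -> Tuple[str, str]:
    """Return (software name, tool version without the _cvN suffix)."""
    name, _, version = tag.partition(":")
    return name, _CV_SUFFIX.sub("", version)


def biocontainers_id(tag: str) -> str:
    """Use only the software name as the ID, ignoring the version."""
    return split_container_tag(tag)[0]
```

With this choice, cv1 and cv2 of the same tool map to the same ID, so a rebuilt Dockerfile would update the existing yaml file instead of adding a new one.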
Just as a reminder for myself: if we only need the latest version, we need a way to skip the bio.tools part of the CI for some PRs, just in case someone makes a PR with an older version 🤔 (since there are many versioning schemes, and they are difficult to parse).
(discussed with @mboudet today)
There are a few flaws that need to be addressed in the CI process (as implemented in https://github.com/BioContainers/ci/blob/master/github-ci/src/biocontainersci/biotools.py) that updates the metadata in the RSEc each time a new pull request is merged on the biocontainers containers repository:
Unique biocontainers filenames
We need to generate unique filenames for the biocontainers metadata files generated, e.g. instead of data/fastqc/biocontainers.yaml, https://github.com/research-software-ecosystem/content/blob/master/data/fastqc/fastqc.biocontainers.yaml. Here, the new filename pattern is data/[bio.tools ID]/[biocontainers ID].biocontainers.yaml. This will avoid collisions when multiple containers refer to the same software in bio.tools; without it, any new container wrapping a bio.tools entry already packaged in another container would end up replacing the contents of the previous file.
Generate files locally
biocontainers metadata files should be generated, at least as an option, in a local copy of the git repository, instead of creating a pull request, for easier testing.
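The filename pattern and the local-generation option could be sketched roughly as follows. This is a minimal illustration, not the code in biotools.py: the function names are hypothetical, the yaml content is passed in as already-serialized text, and falling back to the container name when no bio.tools ID is available is an assumption taken from the discussion in the comments.

```python
from pathlib import Path
from typing import Optional


def metadata_path(content_root: str, biotools_id: Optional[str], container_id: str) -> Path:
    """Build data/[bio.tools ID]/[biocontainers ID].biocontainers.yaml.

    Assumption: when the Dockerfile declares no bio.tools ID, the
    container ID is used as the directory name instead.
    """
    tool_dir = biotools_id or container_id
    return Path(content_root) / "data" / tool_dir / f"{container_id}.biocontainers.yaml"


def write_metadata_locally(content_root: str, biotools_id: Optional[str],
                           container_id: str, yaml_text: str) -> Path:
    """Write the metadata file into a local checkout; no pull request is created."""
    target = metadata_path(content_root, biotools_id, container_id)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(yaml_text, encoding="utf-8")
    return target
```

Because the container ID is part of the filename, two containers that point at the same bio.tools entry end up in the same directory but in different files.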
Batch files generation
It would be practical to enable generating/updating metadata files for all the containers available in the repository, instead of only one, by crawling all Dockerfile files in a local checkout of BioContainers/containers and generating/updating the *.biocontainers.yaml files of a local checkout of research-software-ecosystem/content.
Review metadata mapping
Carry out an exhaustive metadata review, to check that all metadata (at least LABEL, FROM, MAINTAINER) are mapped to the yaml file.
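A batch crawl combined with a first pass at the metadata review might look like the sketch below. It is an illustration under stated assumptions, not the existing CI code: the function names are hypothetical, the parser is a simplified regex-based reader that only handles FROM, MAINTAINER and key=value LABEL instructions (with backslash line continuations), and for multi-stage builds it keeps only the last FROM.

```python
import re
from pathlib import Path

# Simplified, hypothetical Dockerfile reader for illustration only.
INSTRUCTION = re.compile(r"^\s*(FROM|LABEL|MAINTAINER)\s+(.*)$", re.IGNORECASE)
LABEL_PAIR = re.compile(r'([\w.\-]+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_dockerfile(text):
    """Extract FROM, MAINTAINER and LABEL key/value pairs from Dockerfile text."""
    meta = {"from": None, "maintainer": None, "labels": {}}
    # Join lines that end with a backslash continuation.
    joined = re.sub(r"\\\s*\n", " ", text)
    for line in joined.splitlines():
        match = INSTRUCTION.match(line)
        if not match:
            continue
        keyword, rest = match.group(1).upper(), match.group(2).strip()
        if keyword == "FROM":
            meta["from"] = rest  # the last FROM wins
        elif keyword == "MAINTAINER":
            meta["maintainer"] = rest
        else:
            for key, value in LABEL_PAIR.findall(rest):
                meta["labels"][key] = value.strip('"')
    return meta


def crawl_dockerfiles(containers_root):
    """Yield (path, metadata) for every Dockerfile under a local checkout."""
    for dockerfile in sorted(Path(containers_root).rglob("Dockerfile")):
        text = dockerfile.read_text(encoding="utf-8", errors="replace")
        yield dockerfile, parse_dockerfile(text)
```

Comparing the keys collected here against the keys actually written to each yaml file would give a simple list of labels that are not yet mapped.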