-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow deferred datasets to behave as URIs #19077
base: dev
Are you sure you want to change the base?
Conversation
I recommended is_prefix because maybe like say the SRA tools could do something more optimal if the URI is an SRA URL, but this does not conflict with that, makes more sense conceptually for our current use cases, etc.. so I think I like your implementation better than my suggestion. Really nice work - this seems perfect to me. |
d819107
to
a94a80d
Compare
Should I use the new tool execution testing framework for a94a80d? |
I think the test is fine as is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great stuff @davelopez!
lib/galaxy/tools/evaluation.py
Outdated
deferred_input = self.tool.inputs.get(input_name) | ||
if not deferred_input: | ||
# TODO: Can this ever happen? It seems like it happens in the test suite. | ||
# For example, in test_metadata_validator_on_deferred_input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BAM file in this test is a special case. BAM files do have a second file associated to it as metadata, a bai
file.
The bai
file is probably created during materialization. Can this trigger it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inputs in the pileup
tool are:
<inputs>
<param name="input1" type="data" format="unsorted.bam" multiple="true" min="1" label="BAM Inputs">
<validator check="bam_index" message="Metadata missing, click the pencil icon in the history item and use the auto-detect feature to correct this issue." type="metadata" />
</param>
<param name="reference" type="data" format="fasta" label="Fasta Input"/>
</inputs>
But during evaluation, we got 2 inputs (apart from $reference
) as deferred $input1
and $input11
. Maybe that $input11
was the index file as you mentioned? I would have expected that to be under $input1.metadata.bam_index
and not in its own input entry. Also at this point, nothing has been materialized yet, but maybe it was "staged" to be materialized... 🤔
Anyway, if this is expected I will just remove the TODO comment, I wanted to make sure this was a known case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea what $input11
is. @jmchilton can you please have a look? Is it ok to merge this PR or is this something we should look at?
Co-authored-by: Nicola Soranzo <[email protected]>
Co-authored-by: Nicola Soranzo <[email protected]>
Co-authored-by: Bjoern Gruening <[email protected]>
There might be cases with additional deferred inputs outside of the input definitions see test_metadata_validator_on_deferred_input.
a94a80d
to
72735eb
Compare
The integration test failure is unrelated |
Closes #19065
Some tools might accept URIs as input. In those cases there is no need to materialize deferred datasets in Galaxy, we can just pass the source URI directly to the tool.
To do that, we can now use
allow_uri_if_protocol
when declaring the input param in the tool.With the above example, if you provide a deferred dataset as
input
which source URI starts with one of the listed protocols (http, https or s3), when you reference your$input
it will contain the source URI, otherwise, it will be the path to the materialized file.You can also use
allow_uri_if_protocol="*"
to treat all deferred datasets as URIs regardless of the protocol.How to test the changes?
allow_uri_if_protocol
in the input param definition with an appropriate value to filter the protocol or*
.License