-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Merge filtering functionality #11
base: master
Are you sure you want to change the base?
Conversation
Currently dimplot tests are not passing due to some issue with the ident used:
That failure is anterior to the additions of this PR (the part that uses the new functionality is still commented in the PR). |
Was going to say that maybe we could filter by more row/gene-wise metrics, but just saw this: satijalab/seurat#147. Another reason not to expend too much future effort on Seurat- Scater much nicer in R. |
^^^ Was just going to suggest this ^^^ |
action = "store", | ||
default = 0, | ||
type = 'integer', | ||
help = "Include only genes with detected expression in at least this many cells." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realise we only have the one gene-wise metric to use (min cells), but maybe the interface should echo that of the the cell-wise metrics anyway? Then we could add other metrics in future. See https://github.com/ebi-gene-expression-group/bioconductor-scater-scripts/blob/develop/scater-filter.R, where we have cell- and feature-wise upper and lower thresholds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it would take us to the philosophical discussion of how much we want to "improve" packages through the wrappers, as opposed to simply expose their functionality and then pick the best one for each task by the different tools once they are interoperable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're being faithful to the underlying methods we should just shoe-horn this into the object creation as Seurat does :-P. But if we're being keen on consistency with the other tools (and existing arguments for cell-wise filtering) as you were saying the other day - which I think is a good plan - then I think we can allow ourselves some lattitude to 'tidy up' :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that being keen on consistency shouldn't mean stretching the packages to do things that they don't have implementations for. In that case one would rather do a PR to the upstream package with the additional functionality. We should aim to be consistent in the sense of trying to put forward equivalent interfaces for each major step on each major tool, but only within the existing abilities of each tool.
Ok, this
is not true. I had to reduce the number of genes on the random sampling originally as the number of genes wasn't enough after filtering with --min-cells. |
This is ready for review. |
Ping @pinin4fjords @nh3 could I have a review please? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I maintain my argument on the interface to the new functionality, but otherwise LGTM
Set back to WIP again as working on adding more filtering options. |
This PR merges genes and cells filtering functionality into a single module.