Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Merge filtering functionality #11

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

pcm32
Copy link
Member

@pcm32 pcm32 commented Jan 22, 2019

This PR merges genes and cells filtering functionality into a single module.

@pcm32
Copy link
Member Author

pcm32 commented Jan 22, 2019

Currently dimplot tests are not passing due to some issue with the ident used:

✗ Plot dimension reduction
   (in test file r-seurat-scripts-post-install-tests.bats, line 191)
     `run rm -f $pca_image_file && seurat-dim-plot.R -i $cluster_seurat_object -r pca -a $pca_dim_one -b $pca_dim_two -p $pt_size -l $label_size -d $do_label -f $group_by -t '$pca_plot_title' -m $pca_do_bare -u $pca_cols_use -x $pca_coord_fixed -n $pca_no_axes -k $pca_dark_theme -q $pca_plot_order -w $pca_png_width -j $pca_png_height -o $pca_image_file' failed
   Error in DimPlot(seurat_object, reduction.use = opt$reduction_use, dim.1 = opt$dim_1,  : 
     invalid ident in plot.order
   Execution halted

That failure is anterior to the additions of this PR (the part that uses the new functionality is still commented in the PR).

@pinin4fjords
Copy link
Member

pinin4fjords commented Jan 22, 2019

Was going to say that maybe we could filter by more row/gene-wise metrics, but just saw this: satijalab/seurat#147. Another reason not to expend too much future effort on Seurat- Scater much nicer in R.

@pinin4fjords
Copy link
Member

^^^ Was just going to suggest this ^^^

action = "store",
default = 0,
type = 'integer',
help = "Include only genes with detected expression in at least this many cells."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise we only have the one gene-wise metric to use (min cells), but maybe the interface should echo that of the the cell-wise metrics anyway? Then we could add other metrics in future. See https://github.com/ebi-gene-expression-group/bioconductor-scater-scripts/blob/develop/scater-filter.R, where we have cell- and feature-wise upper and lower thresholds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it would take us to the philosophical discussion of how much we want to "improve" packages through the wrappers, as opposed to simply expose their functionality and then pick the best one for each task by the different tools once they are interoperable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're being faithful to the underlying methods we should just shoe-horn this into the object creation as Seurat does :-P. But if we're being keen on consistency with the other tools (and existing arguments for cell-wise filtering) as you were saying the other day - which I think is a good plan - then I think we can allow ourselves some lattitude to 'tidy up' :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that being keen on consistency shouldn't mean stretching the packages to do things that they don't have implementations for. In that case one would rather do a PR to the upstream package with the additional functionality. We should aim to be consistent in the sense of trying to put forward equivalent interfaces for each major step on each major tool, but only within the existing abilities of each tool.

@pcm32
Copy link
Member Author

pcm32 commented Jan 22, 2019

Ok, this

That failure is anterior to the additions of this PR (the part that uses the new functionality is still commented in the PR).

is not true. I had to reduce the number of genes on the random sampling originally as the number of genes wasn't enough after filtering with --min-cells.

@pcm32
Copy link
Member Author

pcm32 commented Jan 22, 2019

This is ready for review.

@pcm32 pcm32 requested a review from nh3 January 22, 2019 21:53
@pcm32
Copy link
Member Author

pcm32 commented Jan 24, 2019

Ping @pinin4fjords @nh3 could I have a review please? Thanks.

Copy link
Member

@pinin4fjords pinin4fjords left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I maintain my argument on the interface to the new functionality, but otherwise LGTM

@pcm32 pcm32 changed the title Merge filtering functionality [WIP] Merge filtering functionality Feb 8, 2019
@pcm32
Copy link
Member Author

pcm32 commented Feb 8, 2019

Set back to WIP again as working on adding more filtering options.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants