Filtering per batch and by percentiles #280

Open
2 tasks
pcm32 opened this issue Jan 17, 2023 · 0 comments
Labels
persist-seq Requests from Persist-Seq

pcm32 commented Jan 17, 2023

Our current filtering setup is very rough: it only allows fixed thresholds (e.g. >200 genes per cell). In practice, people want to filter out the topmost 5% or the lowest n% of a specific metric. It is also desirable to do this per batch / sample instead of over the whole dataset, and possibly to limit the filtering to specific samples / obs fields.

  • Implement %-based filtering that ranks elements by a specific field and allows removal of > n% or < m% of that metric. This should not replace the existing hard-threshold setting and must be backward compatible (previous calls should work as they are). This requires work both here and in Scanpy scripts / Seurat scripts.
  • Implement the ability to apply the above (and any other filtering command) per sample / batch / observation field. Again, this requires work both here and in Scanpy scripts / Seurat scripts.
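
To make the two tasks concrete, here is a minimal sketch in plain Python of what rank-based percentage filtering with an optional grouping field could look like. Everything in it is an assumption for illustration: the function name `percent_filter`, the parameters `drop_low_pct`, `drop_high_pct` and `group_key`, and the field names `batch` and `n_genes` are hypothetical and are not part of the existing Scanpy scripts / Seurat scripts interface. It also does not cover the existing fixed thresholds, which would stay as they are.

```python
import math
from collections import defaultdict


def percent_filter(records, metric, drop_low_pct=0.0, drop_high_pct=0.0, group_key=None):
    """Drop the lowest/highest percentage of records ranked by `metric`.

    Hypothetical helper, not an existing API. If `group_key` is given,
    the ranking and the cut are done separately within each group
    (e.g. per batch); otherwise they are done over all records at once.
    """
    if drop_low_pct < 0 or drop_high_pct < 0 or drop_low_pct + drop_high_pct >= 100:
        raise ValueError("percentages must be >= 0 and sum to less than 100")

    # Collect record indices per group; a single group when no key is given.
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[rec[group_key] if group_key is not None else None].append(idx)

    keep = set()
    for indices in groups.values():
        ranked = sorted(indices, key=lambda i: records[i][metric])
        n = len(ranked)
        # Round down, so small groups never lose more than the requested share.
        n_low = math.floor(n * drop_low_pct / 100)
        n_high = math.floor(n * drop_high_pct / 100)
        keep.update(ranked[n_low:n - n_high])

    # Preserve the original record order in the output.
    return [rec for i, rec in enumerate(records) if i in keep]


# Toy data: two batches of 10 cells with very different gene counts.
cells = [{"batch": "A", "n_genes": g} for g in range(1, 11)]
cells += [{"batch": "B", "n_genes": g} for g in range(101, 111)]

# Cut 10% at each end over the whole dataset vs. within each batch.
global_kept = percent_filter(cells, "n_genes", drop_low_pct=10, drop_high_pct=10)
per_batch_kept = percent_filter(
    cells, "n_genes", drop_low_pct=10, drop_high_pct=10, group_key="batch"
)
```

On this toy data the global call only trims the low end of batch A and the high end of batch B, while the per-batch call trims both ends of each batch, which is the behaviour the second task asks for. Rounding the cut size down is one possible design choice; a quantile-based cut-off would be another.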
@pcm32 pcm32 added the persist-seq Requests from Persist-Seq label Jan 17, 2023