-
Notifications
You must be signed in to change notification settings - Fork 424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Seurat divided in several modules #2195
Conversation
Hi @bellenger-l I wasn't at the Galaxy Europeans days and hadn't heard there was any update on Seurat. There was already some work on creating Seurat modules here: #2057 (comment) |
Hi @mblue9 So I contributed to the existing seurat folder in order to follow these guidelines. You are of course welcome to contribute too. In addition, your seurat_scripts tool suite lives in a separate folder so we can also develop this even more modular approache. Best, |
Ok, communication on that from someone in IUC would have been nice but thanks to you for the update. |
I agree with @mblue9 , deciding (as opposed to discussing) things in a in-person meeting is something we should avoid, to prevent frustration from everyone that couldn't be there. For meetings at the GCC we usually keep notes, I think we should have done that in this case as well (which is not your fault at all @bellenger-l ). I'm not at all familiar with scRNA-seq, but I understand @pcm32 and @pinin4fjords (and maybe others) have worked on streamlining some seurat scripts in https://github.com/ebi-gene-expression-group/r-seurat-scripts, how much overlap is there with this PR ? |
Hi @bellenger-l we went to some lengths to try discuss this as much as possible with as many members of the community in #2057, we even had a large call with people in the US, Europe and in principle Australia (but I think @mblue9 couldn’t make it), where @bgruening and @bebatut were also present. We really wanted to avoid duplications and join forces. That is why instead of writing our own Galaxy wrappers, we contributed on top of @mblue9's for Seurat. We have made some progress, not only on Seurat, but also along the same lines with Scater (#2117 plus some things I haven't committed - https://github.com/ebi-gene-expression-group/bioconductor-scater-scripts), SC3 (https://github.com/ebi-gene-expression-group/bioconductor-sc3-scripts - we were about to PR things on IUC for this) and Scanpy (#2159 https://github.com/ebi-gene-expression-group/scanpy-scripts). We have not only made Galaxy wrappers, but also intermediate conda packages to expose similar granularity for all these tools outside of Galaxy. It would be great if you could add these additions on top of @mblue9 IUC pull request and we could all join forces and benefit from more man power. Here at the EBI and the Sanger we have 5 part time positions and 1 full time position working on this, so we’re keen to make sure we use the resource effectively. We would love to benefit as well from your contributions on this, and I think that the best way would be if you could contribute to @mblue9 original PR as we are doing. We would be very happy as well to receive PRs on https://github.com/ebi-gene-expression-group/r-seurat-scripts where you could add any missing functionality and build upon the one we have. The idea of those intermediate packages is to give a coherent usage across different workflows environments and shell usage. We believe that the best way forward towards making this work available not only to Galaxy but for other workflows environments is to leave the scripting in intermediate conda packages (r-seurat-scripts, scanpy-scripts, etc). If you follow an approach where you put the R scripting logic with the Galaxy wrappers, then to reuse this from other WF environments you need to duplicate that scripting layer. Contributing those additiional functionalities to the intermediate conda packages means your work is reusable by so many more people. |
Dear fellows, I am the boss of @bellenger-l and I feel I should say a few words to release a bit the pressure she is currently experiencing. It is clear that there is still divergent opinions on the way the things should be done, and I also see that some discussions are still lacking to better develop and avoid duplication of efforts. This said, I would advocate that a PR is just a Pull Request. In my view, PR is to expose work and to discuss it. Thus nothing that cannot be adjusted or reorganised any more. On a side note, where is the @mblue9 original PR you mentioned ? I missed it. I would also appreciate if @bgruening who probably have a more general view on the debate as us could say a word on behalf on the group that discussed the scRNAseq tools in Freiburg. Kind regards |
Esteemed @bellenger-l and @drosofff I apologise if somehow I appeared to be putting any pressure on you guys. I only pursued that we would all work together instead of competing, specially considering that our view at the level of granularity is so congruent. I tried to sound as inviting as I could, but sometimes maybe with the rush of the day I didn't have the time to find the best words. @mblue9 contributions are here #2047. We have some additional commits on top of her work (https://github.com/mblue9/tools-iuc/commits/seurat_scripts/tools/seurat_scripts) but I think that this got there after the PR got merged (which I wasn't aware it was). |
Hi, I don't feel that well these days, I will try to solve this issue. I thought someone took notes, but if not I will provide a summary from my memories. Sorrry, not my week. |
To start tests you will need a |
There is the old |
Oh sorry, didn't notice this, it's fine. |
As @mvdbeek and @pcm32 have already covered, it's not good that a decision like this was made in a meeting that people from the community who had already invested time, work and energy weren't at, weren't informed was happening or what had been discussed. @bellenger-l and @drosofff I'm sorry for any pressure you feel and that you got caught up in this. It must be very frustrating for your group to have put time into contributing this work and to get a response like this. Your work and effort is appreciated. |
HI all, first of all, let's clarify some points from my point of view.
I will assemble a summary of what we discussed and what is available in various PRs during the day and attach it to #2057 @drosofff I think @mblue9 is pointing to a branch, not a PR, see here: https://github.com/mblue9/tools-iuc/tree/seurat_scripts/tools/seurat_scripts
I'm not sure I have, but we will summarize everything today and put it online. Sorry for being later here. |
Thanks @bgruening. I agree with your points. |
Thanks for the clarification @bgruening. I hope you get better soon. |
I also wanted to stress that this was not an IUC meeting, only 3 persons from IUC (out of 10 attendees) were there. It was organised on the spot and hold during a lunch break. Therefore nothing of what was discussed should be considered binding for the IUC. The goals of the meeting were to discuss: 1) how to make the various scRNA toolkits interoperable in Galaxy and 2) to coordinate the tool development to not duplicate efforts. It seems we failed spectacularly on the second one, for which I apologise to all the persons involved. In the future, I'd recommend to open WIP pull requests early in the tool development to make it easier to track who is working on what, and to get feedback on the approach before a lot of work is invested. |
Thanks for the comments @nsoranzo, appreciate it.
Just to clarify, that was a reason why the issue here was opened: #2057, as a way to discuss and keep track of the various efforts on Galaxy scRNA work (there are many as we can see), and why it's a shame that it wasn't updated, that could have avoided this kerfuffle. And the reason the Seurat module work was a branch and not a PR was that it was unclear whether part of the wrapper generation could be automated or not. But I agree a PR would have been better, I should have done that. Just to add, there's already a merged Seurat tool in IUC here: #2047, which made this PR extra surprising and confusing. But hopefully the communication all round on these things (PRs, updates etc) gets better. |
Please see my comment here: #2057 (comment) with a link to a GDoc, please leave comments in the Doc and keep discussing there.
Afaik this PR is building on-top of your work. Its in your seurat folder. |
I think that we need to clarify something, our current (@mblue9 + EBI/Sanger) work of Seurat is still on this branch of @mblue9: https://github.com/mblue9/tools-iuc/tree/seurat_scripts/tools/seurat_scripts As you can see from those files, these are way anterior to ARTbio's, and it was mostly a lack of coordination that that wasn't PR'ed here after we started using it. As you will see many of the proposed modules by ARTbio where already present there and making use of our r-seurat-script intermediate layer. I would suggest then that we put the complentary work from ARTbio on top of those scripts; we will be happy to add the extra needed method on r-seurat-scripts (or if @bellenger-l wants to PR it to keep authorship of those modules, that is fine as well). I can take care of merging these two efforts in a single branch and PR it again, keeping the commits of each one of the involved parties. Would that work? |
@pcm32 I am fine with your idea or accepting this PR (it's nice work) or trying both. But let's see what the decision makers say. |
…seuset' thanks to gdata::mv
Hi all- I'm a colleague of @pcm32 . Despite a few crossed wires along the way I think this is quite positive, we're clearly converging on a strategy. We all have separate R scripts (rather than embedded code), we're talking about interoperability via Loom, etc etc. I don't think we at the EBI would mind adopting the general granularity decisions made here if it facilitated progress. The only point at which our approach diverges is that we have the base scripts in external Bioconda packages (so that we can also use them in non-Galaxy ways, which we feel is a great bonus for consistency across platforms). Is that something everyone could get on board with? If so, I think with a bit of work we could converge the approaches fairly quickly in the new year and get our 'many hands' working together. |
We are open to any option that can be tested. I suggest that @mblue9 makes a PR of her work. Then we can peacefully continue the discussion that has been initiated here and in the google doc. There is nothing that can be done with the previous effort duplication. However it will probably easy to merge two PRs in a single one at the end without too much coding effort. We do not really care of the commit authorships in this particular case. Our main motivation is to get galaxy tools available for better analyses and we will work in line with what was decided by the community. |
Excellent, in that case I'll open a new PR which will include these changes and @mblue9 ones (all original commits) and then we can work on that branch to merge scripts. |
Thanks a lot @pcm32. @bellenger-l I sincerely apologise to you for this very non-peaceful experience. I was quite angry here but not with you, you weren't aware of what had already been discussed. I just really didn't like what had happened - the unilateral decision making and lack of communication in light of the prior discussion, as I don't think it's good in a community. But this is usually an excellent community, and also the omission of making a PR was indeed a major oversight on my part. So I'm sorry if I made this an unpleasant experience for you, I think your work is great and I hope you make more PRs here. |
Hi all, now #2231 includes all known efforts for modular Seurat, I hope that we can keep on working there to merge the functionalities. I will start to add @bellenger-l functionality to our scripts layer, so that we can use what they have in addition from there. I would suggest that work ceases here and is continued at the other branch. I'm happy to give anyone who needs it write access to our iuc checkout or work on a pull request basis. |
We have a much newer set of tools for this if there is an interest. |
:) .. sure, if they fit for IUC .. closing this for now .. reopen if needed |
As discussed in the Galaxy Europeans days, ARTbio divided the Seurat tool in different modules :
* seurat-filtering : Filter out low quality cells and genes
* seurat-normalization : Cell level normalization
* seurat-feature-selection : Find variable genes
* seurat-confounder-removal : Scale data and confounder removal
* seurat-dim-reduction : Dimensional reduction (PCA and tSNE)
* seurat-clustering : Cluster cells
* seurat-diff-expression : Differential expression analysis
The first tool
seurat-filtering
takes an expression matrix as input, the other tools take a rdata file containing a single Seurat R object. All return at least a rdata file containing the updated Seurat object and if requested the Rscript used by the tool, some return a PDF file containing plots (Filtering, Normalization, Feature selection, Dimensional reduction, Clustering and Differential expression module) and one a tabular file (seurat-diff-expression
).FOR CONTRIBUTOR:
FOR REVIEWER:
iuc
has access to associated toolshed repo(s)<command/>
'single quoted'
<![CDATA[ ... ]]>
tagstext
or havingoptional="true"
attribute are checked withif str($param)
before being usedformat
attribute containing datatypes recognised by Galaxy<![CDATA[ ... ]]>
tags