Slow Conversion #95
Comments
Hi @ozgurdemir, the conversion uses a single thread. The obvious fix is multi-threading; there is an old issue (#23) about it. This functionality has not been implemented, and I cannot estimate if or when it will be. Although reading and converting individual spectra can be parallelized relatively easily, the final assembly of the mzML file (especially the indexed one) is much more difficult to do in parallel, and this step will determine the performance (that is one of the reasons why multi-threaded processing is not there). There may, of course, be other performance issues which, once solved, would improve overall performance even in a single-threaded design. One can run multiple copies of ThermoRawFileParser in parallel to make better use of the computer's resources. Of course, that only helps when converting several raw files.
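A minimal sketch of the "run multiple copies in parallel" suggestion, assuming `ThermoRawFileParser.exe` is on the PATH and using only the `-i`, `-b` and `-f 1` options that appear in the command quoted in this issue. The helper names `build_command` and `convert_all`, the worker count, and the output naming (same path with an `.mzML` suffix) are illustrative assumptions, not part of the parser.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumption: the executable is reachable under this name on the PATH.
PARSER = "ThermoRawFileParser.exe"


def build_command(raw_file, parser=PARSER):
    """Build one conversion command, writing the mzML next to the raw file."""
    raw_file = Path(raw_file)
    output = raw_file.with_suffix(".mzML")
    # Only options taken from the command shown in this issue are used.
    return [parser, "-i", str(raw_file), "-b", str(output), "-f", "1"]


def convert_all(raw_files, workers=4, run=subprocess.run):
    """Run one parser process per raw file, at most `workers` at a time.

    Threads are sufficient here: each thread only waits for its own
    external process, so the conversions themselves run in parallel.
    Returns a dict mapping each input file to its process exit code.
    """
    raw_files = [str(f) for f in raw_files]
    commands = [build_command(f) for f in raw_files]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        exit_codes = list(pool.map(lambda cmd: run(cmd).returncode, commands))
    return dict(zip(raw_files, exit_codes))
```

Calling `convert_all(["a.RAW", "b.RAW"], workers=2)` would start two parser processes side by side. This does nothing for a single file; it only spreads several independent conversions over the available cores.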
Hi @caetera, thanks for the rapid response. I'm not seeing this for one particular file only, so you're right, this is probably more of a question than an issue. I was just wondering why the conversion from one format to another takes so much time, without knowing anything about the process itself. Maybe there are some calculations, compression, etc. involved. I agree, parallelizing code is always tricky. Plus, if there are bottlenecks in the single-threaded implementation, they will still be present in a multi-core implementation. I'm not familiar with C# tooling, but did you ever profile the conversion process to detect bottlenecks?
Hi, I agree with @caetera: going multi-threaded could make it faster but will be a challenge. I'm sure improvements could be made even without it, and suggestions are always welcome (it was my first C# project). Could you share the RAW file? I'll see what I can do with profiling. Also, using the --noPeakPicking flag increases the file size, as you are probably aware. Thanks for using the parser.
Is there a possibility to speed up the conversion process? The conversion of a file with ~200 scans takes around 20 seconds:
Command used:
ThermoRawFileParser.exe -i test.RAW -b test.mzML --noPeakPicking -f 1
real 1m2.862s
user 0m21.265s
sys 0m3.149s
Resulting mzML file size: 52 MB
Thanks for this application btw.