Use multiple processors #7

Open
thesamovar opened this issue Apr 27, 2015 · 14 comments

@thesamovar
Contributor

In the old KlustaKwik there was some benefit to using multiple cores, but not a huge one, because the problem was memory-bandwidth limited. However, in KK2 the memory usage is reduced by orders of magnitude (especially for larger problems), so we might well see much better speed improvements from multiple processors.

There is a technical issue. As far as I know, Numba does not support multiple processors except through the vectorize decorator (and then only in the 'pro' version), which is not something we can use in KK2. I don't see any way around this. This might mean we have to stick to Cython.

@rossant any thoughts?

@rossant
Member

rossant commented Apr 27, 2015

How do you want to use multiple cores exactly?

@thesamovar
Contributor Author

The main one is in the E-step. We have a key loop which, for each cluster, involves iterating over all spikes. I use an OpenMP parallel for over this inner loop over spikes in the C++ version. I'd like to do the equivalent in the Python version.
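
As a rough illustration of the loop structure described above (not the KK2 implementation), the sketch below loops over clusters serially and splits the inner loop over spikes across worker threads. The names `e_step`, `score`, and `n_workers` are hypothetical, and the scoring function is only a placeholder. Pure-Python threads will not give the speedup of an OpenMP parallel for because of the GIL; the point is only how the work is partitioned.

```python
from concurrent.futures import ThreadPoolExecutor


def score(cluster_mean, spike):
    # Placeholder score (assumption): squared distance to the cluster mean.
    # The real E-step would compute a log-likelihood instead.
    return sum((x - m) ** 2 for x, m in zip(spike, cluster_mean))


def e_step(spikes, cluster_means, n_workers=4):
    """Assign each spike to its best-scoring cluster.

    Outer loop: clusters (serial). Inner loop: spikes (split across workers).
    """
    n_spikes = len(spikes)
    best_score = [float("inf")] * n_spikes
    best_cluster = [-1] * n_spikes
    # Contiguous chunks of spike indices, roughly one per worker.
    chunk = max(1, (n_spikes + n_workers - 1) // n_workers)
    ranges = [(start, min(start + chunk, n_spikes))
              for start in range(0, n_spikes, chunk)]

    def process(cluster, mean, start, stop):
        # Each task writes only to its own slice, so no locking is needed.
        for i in range(start, stop):
            s = score(mean, spikes[i])
            if s < best_score[i]:
                best_score[i] = s
                best_cluster[i] = cluster

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for cluster, mean in enumerate(cluster_means):
            futures = [pool.submit(process, cluster, mean, a, b)
                       for a, b in ranges]
            for f in futures:
                f.result()  # finish this cluster before starting the next
    return best_cluster
```

Because every spike index belongs to exactly one chunk, the workers never write to the same element, which is the property that makes the inner loop safe to parallelize.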

@rossant
Member

rossant commented Apr 27, 2015

Maybe we can use this feature to implement a parallel for loop with Numba?

@rossant
Member

rossant commented Apr 27, 2015

thesamovar added this to the 0.2 milestone Apr 30, 2015

@thesamovar
Contributor Author

Note to myself: to do this in Cython using OpenMP, we don't have access to the keyword that makes a copy of a variable for each thread, but we can allocate the copies in a list/array, one per thread, and then access them using the thread index.
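
The workaround in this note can be sketched in plain Python (an illustration of the pattern, not the Cython code used in KK2): instead of a thread-private variable, allocate one slot per thread up front and let each thread touch only the slot at its own index. `accumulate_per_thread` and `n_threads` are hypothetical names, and the summation stands in for whatever per-thread scratch state the real loop needs.

```python
import threading


def accumulate_per_thread(values, n_threads=4):
    # One accumulator slot per thread, allocated before the threads start.
    partial = [0.0] * n_threads

    def worker(thread_index):
        # Strided split: thread t handles items t, t + n_threads, ...
        for i in range(thread_index, len(values), n_threads):
            # Only this thread ever touches partial[thread_index].
            partial[thread_index] += values[i]

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Combine the per-thread results once all threads have finished.
    return sum(partial)
```

Since no two threads share a slot, no locking is needed inside the loop; the only synchronization point is the final reduction after the threads have joined.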

@rossant
Member

rossant commented Apr 30, 2015

Do you think Numba will let us use multiple CPUs here?

@thesamovar
Contributor Author

I think it can be done but might be simpler using Cython. Am happy to switch to Numba but since everything is in Cython at the moment I'll stick with that for now. The big advantage of Numba to me would be that I wouldn't have to type all the variables explicitly, and we could mix and match arrays with different dtypes (e.g. float32, float64, int16, int32, int64). This is possible in Cython but gets complicated when you have multiple arrays each of which could have different dtypes.

@thesamovar
Contributor Author

OK this is done for the E-step now and it works pretty well. I'll leave it open in case we want to do the M-step too, but the E-step is most of the work.

thesamovar modified the milestones: 1.x, 0.2 May 2, 2015

@c-wilson

Is it possible to set the number of threads that klustakwik will use? Right now it's using all of my physical and virtual CPUs; I'd like to be able to specify how many, if possible. I'm using it through phy and have set OMP_NUM_THREADS=1. Thanks!

@thesamovar
Contributor Author

I'll look into this. I created a new issue, #67, that you can follow if you want.

@thesamovar
Contributor Author

OK, I fixed this. It was indeed ignoring OMP_NUM_THREADS, but that was by design (long story). I've added a new parameter, num_cpus, which you can set to the number of CPUs you want to use. This is now in the current git master branch.

@c-wilson

Great. Just to make sure I understand: to use this, I add "num_cpus=12" to the klustakwik2 dictionary of my prm?

@thesamovar
Contributor Author

Yes, if you have the latest version of KK2.
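
For reference, the setting discussed above might look like this in a prm file, assuming the file is plain Python and the KK2 options live in a `klustakwik2` dictionary as described in the question. The value 12 is just the example from this thread, and any other keys are omitted.

```python
# Fragment of a .prm parameter file (assumed layout; other keys omitted).
klustakwik2 = dict(
    num_cpus=12,  # number of CPUs KK2 is allowed to use
)
```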

@rossant
Member

rossant commented Jul 15, 2015

Note that others have reported a bug in phy where KK2 params were not properly taken into account. It should be fixed this week.
