Configuring Parallelism Between Libraries #19

Open
thomasjpfan opened this issue Sep 5, 2024 · 0 comments

thomasjpfan commented Sep 5, 2024

Parallelism in Python has two semantics:

  1. "Spawners": Starts up multiple workers that let other code do work. For example:
    1. concurrent.futures.ProcessPoolExecutor(max_workers=4)
    2. concurrent.futures.ThreadPoolExecutor(max_workers=4)
    3. joblib.Parallel(n_jobs=4)
  2. "Computers": Actually use the CPU to do the work. For example:
    1. A @ B does matrix multiplication with BLAS
    2. scipy.fft.fft(..., workers=8)
    3. list(range(10)) (Pure Python that is single core)
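The two roles can be seen in a few lines of stdlib-only Python (a minimal sketch; the function names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def compute(n):
    # "Computer": pure-Python work that actually burns a core
    return sum(i * i for i in range(n))

# "Spawner": starts 4 workers and hands the work to the computer
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(compute, [10, 100, 1000]))
```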

Spawner Configuring the Computer

Scikit-learn uses both semantics and automatically configures parallelism to prevent oversubscription. With 24 CPU cores:

halving_search = HalvingRandomSearchCV(
	HistGradientBoostingRegressor(), ..., n_jobs=4,
)
halving_search.fit(...)

HalvingRandomSearchCV spawns 4 workers with multiprocessing and then uses threadpoolctl to configure each worker to use 6 CPU cores for OpenMP.

Failure mode

On a machine with 24 CPU cores, here is an example that oversubscribes and stalls:

def f(x):
    ...
    x @ A  # matmul, which uses all 24 cores by default
    ...

# Spawns 4 multiprocessing workers to run f:
# 4 workers x 24 BLAS threads = 96 threads on 24 cores
quad_vec(f, ..., workers=4)

The user is responsible for preventing oversubscription:

from threadpoolctl import threadpool_limits

def f(x):
    with threadpool_limits(limits=6, user_api='blas'):
        x @ A

Underlying Questions

As free-threaded Python becomes real, more users will run library code with multithreading and ultimately hit this problem. There are two questions:

  1. Should libraries with "spawners" be responsible for setting the number of cores for their workers?
  2. If so, how should this configuration be communicated between libraries?
    1. Currently, Python libraries generally have four ways to configure parallelism:
      1. Environment variable, export OMP_NUM_THREADS=8
      2. Set globally, torch.set_num_threads(8).
      3. Context manager, with threadpool_limits(limits=8)
      4. Function signature, fft(..., workers=8)
    2. A likely solution is a thread-local config using contextvars that is (somehow) shared between libraries.
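One possible shape for such a shared config, sketched with stdlib contextvars (all names here are hypothetical, not an existing API):

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared knob: the thread budget a "computer" may use.
MAX_THREADS = contextvars.ContextVar("max_threads", default=None)

def computer_fft(n):
    # A compute library consults the shared config instead of
    # requiring a workers= keyword at every call site.
    limit = MAX_THREADS.get()
    workers = limit if limit is not None else 24  # pretend 24 cores
    return f"fft({n}) with {workers} threads"

def spawner(tasks, n_jobs):
    # The spawner divides the machine among its workers and records
    # the per-worker budget in the context each task runs under.
    MAX_THREADS.set(24 // n_jobs)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # A fresh context copy per task avoids re-entering one
        # Context object from several threads at once.
        futs = [
            pool.submit(contextvars.copy_context().run, computer_fft, t)
            for t in tasks
        ]
        return [f.result() for f in futs]

results = spawner([8, 16], n_jobs=4)
```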

Session Notes
