Memory Issue using Dask_histogram/Coffea/Dask_awkward #3237
JuanDuarte2003 asked this question in Q&A · Unanswered · 0 replies
I've been experiencing the following problem when trying to build some histograms with dask-histogram on a Dask cluster.
I have a YAML file with paths to ROOT trees in CMS EOS storage. In total, there are 660 ROOT files divided into 24 datasets (not necessarily with the same number of files each). Each ROOT tree has fields with weights used to construct the histograms: one "nominal" weight and a set of systematic weights. I load this YAML file in Python and then preprocess the full dataset.
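A minimal sketch of what the fileset and preprocessing step could look like — the YAML keys, file paths, and tree names below are illustrative, not the actual layout, and the coffea call is shown only in comments:

```python
# The dict below would in practice come from the YAML file, e.g.:
#   import yaml
#   with open("fileset.yaml") as f:
#       fileset = yaml.safe_load(f)
# Keys and paths here are hypothetical stand-ins.
fileset = {
    "dataset_A": {"files": {"root://eos.example/file1.root": "Events"}},
    "dataset_B": {"files": {"root://eos.example/file2.root": "Events"}},
}

# With recent coffea, preprocessing would then look roughly like:
#   from coffea.dataset_tools import preprocess
#   dataset_runnable, dataset_updated = preprocess(fileset, step_size=100_000)

# Here we only verify the shape of the fileset dict.
for name, spec in fileset.items():
    assert "files" in spec and len(spec["files"]) > 0
```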
Then, I have a list of variables and bin edges to create the histograms, which I also load from a YAML file.
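Such a variables file might be laid out along these lines (the field names and values are purely illustrative):

```yaml
# Hypothetical layout for the variables/binning file
variables:
  - name: jet_pt
    bins: 50
    range: [0.0, 500.0]
  - name: dijet_mass      # computed from other variables
    bins: 40
    range: [0.0, 2000.0]
```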
The variables can either be stored in the ROOT files or be computed from other variables. Then, I pass this to a Coffea processor:

`Processor = HistogramProcessor(process, var_list, bin_edges_list)`

Here `constants` is just an imported script used to load some numbers. As you can see, aside from the 5 axes associated with the histograms, I also added a 6th axis for the various systematic weights mentioned above. This processor should return `hist_dask` when applied over the datasets. Then, to convert the `hist_dask` objects into futures and save them to disk after computation, I used the following function.
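A simplified stand-in for a processor like the one described above — the real version would subclass `coffea.processor.ProcessorABC` and fill a `hist.dask.Hist` with the 5 kinematic axes plus a `StrCategory` axis for the systematics; the class and field names below are hypothetical, and only the per-systematic bookkeeping is shown:

```python
# Mock processor illustrating the "extra axis for systematics" structure.
# A real coffea processor would build hist.dask.Hist objects instead.
class HistogramProcessor:
    def __init__(self, process, var_list, bin_edges_list, systematics):
        self.process_name = process
        self.var_list = var_list
        self.bin_edges_list = bin_edges_list
        self.systematics = systematics  # e.g. ["nominal", "jesUp", "jesDown"]

    def process(self, events):
        # Real version (sketched in comments, assuming a StrCategory axis):
        #   h = hist.dask.Hist(*kinematic_axes,
        #                      hist.axis.StrCategory(self.systematics, name="syst"))
        #   for syst in self.systematics:
        #       h.fill(..., syst=syst, weight=events[f"weight_{syst}"])
        # Here we just accumulate the weights per systematic.
        counts = {syst: 0.0 for syst in self.systematics}
        for event in events:
            for syst in self.systematics:
                counts[syst] += event.get(f"weight_{syst}", 0.0)
        return {self.process_name: counts}

proc = HistogramProcessor("ttbar", ["jet_pt"], [[0, 100, 200]],
                          ["nominal", "jesUp"])
out = proc.process([{"weight_nominal": 1.0, "weight_jesUp": 1.1}])
```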
Since the dataset has various keys, I loop over them and apply the function to each one.
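A sketch of what such a compute-and-save helper might look like — the original function isn't shown here, so the names are illustrative; with `dask.distributed` the compute step would typically be `client.compute(hist_dask[key]).result()`:

```python
import os
import pickle

# Compute each (possibly lazy) histogram and pickle the result to disk.
# Anything with a .compute() method is computed first; plain objects
# are saved as-is, which keeps this sketch runnable without dask.
def save_histograms(hists, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    for key, h in hists.items():
        result = h.compute() if hasattr(h, "compute") else h
        with open(os.path.join(out_dir, f"{key}.pkl"), "wb") as f:
            pickle.dump(result, f)

# Usage, with a plain list standing in for a dask histogram:
save_histograms({"ttbar": [1, 2, 3]}, "hists_out")
```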
Initially, it works fine: the dashboard shows that it's computing. But at some point, the cluster gets killed. When sending the histograms to the cluster, Dask shows some warnings about the graph size, which could indicate a memory problem. I therefore talked with someone from the IT team who works directly with the cluster, and he didn't find any problems on that side.
Is there maybe some way to build the hist.dask objects directly on the cluster, so they don't have to go through the scheduler, or something like that? That would solve my main problem. Still, it seems strange that Dask dies without any further messages.
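One common mitigation for oversized task graphs is to compute the histograms in small batches rather than submitting all 24 datasets at once, so the scheduler only ever holds a few graphs. A sketch with plain callables standing in for dask computations (with dask one would use `dask.compute(*batch)` or `client.compute(batch)` instead):

```python
# Compute a dict of deferred computations a few keys at a time,
# limiting how much graph the scheduler must hold at once.
def compute_in_batches(work, batch_size=4):
    keys = list(work)
    results = {}
    for i in range(0, len(keys), batch_size):
        batch = keys[i:i + batch_size]
        # dask equivalent: dask.compute(*(work[k] for k in batch))
        for k in batch:
            results[k] = work[k]()
    return results

results = compute_in_batches({f"ds{i}": (lambda i=i: i * i) for i in range(6)},
                             batch_size=2)
```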