Specify minimal blosc2 version
FrancescAlted committed Aug 31, 2024
1 parent f64484a commit e2917bc
Showing 1 changed file with 3 additions and 3 deletions.
posts/ndim-reductions.rst (3 additions, 3 deletions)
@@ -10,10 +10,12 @@
NumPy is widely recognized for its ability to perform efficient computations and manipulations on multidimensional arrays. This library is fundamental for many aspects of data analysis and science due to its speed and flexibility in handling numerical data. However, when datasets reach considerable sizes, working with uncompressed data can result in prolonged access times and intensive memory usage, which can negatively impact overall performance.

-Python-Blosc2 leverages the power of NumPy to perform reductions on compressed multidimensional arrays. But, by compressing data with Blosc2, it is possible to reduce the memory and storage space required to store large datasets, while maintaining fast reduction times. This is especially beneficial for systems with memory constraints, as it allows for faster data access and operation.
+`Python-Blosc2 <https://www.blosc.org/python-blosc2>`_ leverages the power of NumPy to perform reductions on compressed multidimensional arrays. But, by compressing data with Blosc2, it is possible to reduce the memory and storage space required to store large datasets, while maintaining fast reduction times. This is especially beneficial for systems with memory constraints, as it allows for faster data access and operation.

In this blog, we will explore how Python-Blosc2 can perform data reductions in in-memory `NDArray <https://www.blosc.org/python-blosc2/reference/ndarray.html>`_ objects (or any other object fulfilling the `LazyArray interface <https://www.blosc.org/python-blosc2/reference/lazyarray.html>`_) and how the speed of these operations can be optimized by using different chunk shapes, compression levels and codecs. We will then compare the performance of Python-Blosc2 with NumPy.

+**Note**: The code snippets shown in this blog are part of a `Jupyter notebook <https://github.com/Blosc/python-blosc2/blob/main/doc/getting_started/tutorials/04.reductions.ipynb>`_ that you can run on your own machine. For that, you will need to install a recent version of Python-Blosc2: `pip install 'blosc2>=3.0.0b3'`; feel free to experiment with different parameters and share your results with us!

The 3D array
------------

@@ -157,6 +159,4 @@ Understanding the balance between space savings and the additional time required

Besides the sum() reduction exercised here, Blosc2 supports a fair range of reduction operators (mean, std, min, max, all, any, etc.), and you are invited to `explore them <https://www.blosc.org/python-blosc2/reference/reduction_functions.html>`_. Moreover, it is also possible to use reductions even for very large arrays that are stored on disk. This opens the door to a wide range of possibilities for data analysis and science, allowing for efficient reductions on large datasets that are compressed on-disk and with minimal memory usage. We will explore this in a forthcoming blog.

-Finally, you can find the code for this blog on a `notebook in the Blosc2 repository <https://github.com/Blosc/python-blosc2/blob/main/doc/getting_started/tutorials/04.reductions.ipynb>`_. Feel free to experiment with different parameters and share your results with us!

We would like to thank `ironArray <https://ironarray.io>`_ for supporting the development of the computing capabilities of Blosc2. Thanks also to NumFOCUS for recently providing a small grant that is helping us improve the documentation for the project. Last but not least, we would like to thank the Blosc community for the many valuable insights and feedback that have helped us improve the performance and usability of Blosc2.
