phy extract-waveforms saves waveforms with wrong dtype if raw data file is encoded as float32 #44

Open
johnmbarrett opened this issue May 17, 2024 · 0 comments

phy extract-waveforms saves waveforms with the wrong dtype if the raw data file is encoded as float32.

Steps to reproduce:

  1. Download and unzip the example dataset: https://drive.google.com/file/d/1mshkvPaxKpHjWK4z67HtfXlUvyuprX9B/view?usp=sharing
  2. In your command line, navigate to the folder where you unzipped it
  3. Run phy extract-waveforms params.py
  4. Start Python and run the following commands:
import numpy as np
np.load('_phy_spikes_subset.waveforms.npy')

Expected behaviour:

The waveforms are loaded without error.

Actual behaviour:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\jmb9770\Anaconda3\envs\phy\Lib\site-packages\numpy\lib\npyio.py", line 456, in load
    return format.read_array(fid, allow_pickle=allow_pickle,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jmb9770\Anaconda3\envs\phy\Lib\site-packages\numpy\lib\format.py", line 839, in read_array
    array.shape = shape
    ^^^^^^^^^^^
ValueError: cannot reshape array of size 31004592 into shape (63534,61,16)
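
As a quick sanity check on the numbers in the error message, the sketch below compares the element count implied by the requested shape with the size NumPy actually reported. The variable names are illustrative only; the two values are copied from the traceback above.

```python
import math

# Values copied from the ValueError in the traceback
requested_shape = (63534, 61, 16)
reported_size = 31004592

# Number of elements the header's shape calls for
expected_size = math.prod(requested_shape)

# Ratio between what the header promises and what was actually read
ratio = expected_size / reported_size
print(expected_size, ratio)  # 62009184 2.0
```

The file holds exactly half as many 8-byte items as the shape requires, which is consistent with 4-byte values being read as 8-byte values.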

Environment info:

OS: Windows 10 x64
Python version: 3.11.9
Conda version: 23.3.1
phy version: 2.0b5
phylib version: 2.4.3

Additional info:

The culprit appears to be line 657 of phylib/io/traces.py, where the dtype of the waveforms is inferred if sample2unit is None and otherwise set to float. The phy extract-waveforms command never sets sample2unit, so it always defaults to 1.0, and the written waveforms are therefore declared as dtype float, which NumPy treats as float64. If the raw data file from which the waveforms are loaded is of integer type, the multiplication by 1.0 coerces the data to float64, so the waveforms are written correctly. If the raw data file is of type float32, however, no such coercion takes place, and the NpyWriter byte-copies the float32-encoded waveforms into a waveforms .npy file whose header claims dtype float64.
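
The promotion behaviour described above can be illustrated with plain NumPy. This is a minimal sketch, not the phylib code path: the array names and the small (61, 16) shape are made up for the demonstration, and only the default value of sample2unit is taken from the description above.

```python
import numpy as np

sample2unit = 1.0  # default value described in this issue

# Hypothetical stand-ins for one waveform read from an integer or float32 raw file
raw_int16 = np.zeros((61, 16), dtype=np.int16)
raw_float32 = np.zeros((61, 16), dtype=np.float32)

scaled_int = raw_int16 * sample2unit      # integer input is promoted
scaled_float = raw_float32 * sample2unit  # float32 input keeps its dtype

print(scaled_int.dtype)    # float64
print(scaled_float.dtype)  # float32

# Reinterpreting the float32 bytes as float64, as a reader trusting a
# float64 header would, yields half as many items.
misread = np.frombuffer(scaled_float.tobytes(), dtype=np.float64)
print(scaled_float.size, misread.size)  # 976 488
```

The halved item count in the last line mirrors the size mismatch in the reported ValueError.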

@oliche self-assigned this May 18, 2024