Release 0.999 · otsaloma/dataiter

DataFrame.from_arrow: Remove strings_as_object argument
DataFrame.from_pandas: Remove strings_as_object argument
DataFrame.read_csv: Remove strings_as_object argument
DataFrame.read_parquet: Remove strings_as_object argument
GeoJSON.read: Remove strings_as_object argument
ListOfDicts.to_data_frame: Remove strings_as_object argument
read_csv: Remove strings_as_object argument
read_geojson: Remove strings_as_object argument
read_parquet: Remove strings_as_object argument
Vector.as_string: Remove length argument
Vector.is_na: Fix to work in multidimensional cases where the elements of an object vector are arrays/vectors
Vector.rank: Change default method to "min"
Vector.rank: Remove method "average"
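
To illustrate what the new Vector.rank default means for tied values, here is a minimal pure-Python sketch. It does not call Dataiter. The helper names rank_min and rank_average are hypothetical, and the sketch assumes that "min" follows the usual convention where tied values all receive the lowest rank in their group. Missing-value handling is not modeled.

```python
def rank_min(values):
    """Rank from 1; tied values share the lowest rank in their group."""
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        # Keep only the first (lowest) sorted position of each value.
        first_position.setdefault(value, position)
    return [first_position[value] for value in values]


def rank_average(values):
    """Rank from 1; tied values share the mean rank of their group."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


scores = [10, 20, 20, 30]
print(rank_min(scores))      # [1, 2, 2, 4]
print(rank_average(scores))  # [1.0, 2.5, 2.5, 4.0]
```

Under this assumption, code that relied on fractional ranks from "average" would see integer ranks for ties instead.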

This release is a breaking change: it switches the string data type
from the fixed-width str_ (a.k.a. <U#) to the variable-width StringDType
introduced in NumPy 2.0. The main benefit is greatly reduced memory use,
because a fixed-width array reserves the width of its longest string for
every element. This makes strings usable without special care and
without falling back to object. The note about stability below release
0.99 still applies.

Note that because StringDType exists only in NumPy >= 2.0, NPZ or Pickle
files saved with this release cannot be opened using Dataiter < 0.99 and
NumPy < 2.0. If you need that kind of interoperability, consider using
the Parquet file format.
