- DataFrame.from_arrow: Remove strings_as_object argument
- DataFrame.from_pandas: Remove strings_as_object argument
- DataFrame.read_csv: Remove strings_as_object argument
- DataFrame.read_parquet: Remove strings_as_object argument
- GeoJSON.read: Remove strings_as_object argument
- ListOfDicts.to_data_frame: Remove strings_as_object argument
- read_csv: Remove strings_as_object argument
- read_geojson: Remove strings_as_object argument
- read_parquet: Remove strings_as_object argument
- Vector.as_string: Remove length argument
- Vector.is_na: Fix to work in multidimensional cases where the elements of an object vector are arrays/vectors
- Vector.rank: Change default method to "min"
- Vector.rank: Remove method "average"

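
For readers unsure how the "min" and "average" ranking methods differ, the following is a minimal sketch in plain NumPy. It is not Dataiter's implementation; the helper names rank_min and rank_average are hypothetical, and it assumes the conventional definitions with ranks starting at 1.

```python
import numpy as np

def rank_min(x):
    """Tied elements all get the smallest rank in their group (hypothetical helper)."""
    x = np.asarray(x)
    # Number of elements strictly smaller than each element, plus one.
    return np.searchsorted(np.sort(x), x, side="left") + 1

def rank_average(x):
    """Tied elements all get the mean of the ranks they span (hypothetical helper)."""
    x = np.asarray(x)
    s = np.sort(x)
    lowest = np.searchsorted(s, x, side="left") + 1   # first rank in the tie group
    highest = np.searchsorted(s, x, side="right")     # last rank in the tie group
    return (lowest + highest) / 2

x = [10, 20, 20, 30]
print(rank_min(x))      # [1 2 2 4]
print(rank_average(x))  # [1.  2.5 2.5 4. ]
```

With "min", ranks stay integers, whereas "average" produces fractional ranks for ties.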
This is a breaking change to switch the string data type from the
fixed-width str_ (a.k.a. <U#) to the variable-width StringDType
introduced in NumPy 2.0. The main benefit is greatly reduced memory use,
making strings usable without needing to be careful or falling back to
object. The note about stability below release 0.99 still applies.
Note that as StringDType is only in NumPy >= 2.0, any NPZ or Pickle
files saved cannot be opened using Dataiter < 0.99 and NumPy < 2.0. If
you need that kind of interoperability, consider using the Parquet file
format.
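
If files may end up in mixed environments, a reader can check for StringDType up front instead of failing partway through loading. This is a minimal sketch using only NumPy; the helper names has_string_dtype and require_string_dtype are hypothetical and not part of Dataiter.

```python
import numpy as np

def has_string_dtype():
    """Return True if the installed NumPy provides StringDType."""
    # np.dtypes itself is missing in older NumPy releases, hence getattr.
    dtypes = getattr(np, "dtypes", None)
    return hasattr(dtypes, "StringDType")

def require_string_dtype():
    """Raise a clear error before attempting to open an NPZ or Pickle file."""
    if not has_string_dtype():
        raise RuntimeError(
            f"NumPy {np.__version__} lacks StringDType; "
            "NumPy >= 2.0 is needed, or use a Parquet file instead"
        )

print(has_string_dtype())
```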