0.999

@otsaloma otsaloma released this 15 Dec 14:54
  • DataFrame.from_arrow: Remove strings_as_object argument
  • DataFrame.from_pandas: Remove strings_as_object argument
  • DataFrame.read_csv: Remove strings_as_object argument
  • DataFrame.read_parquet: Remove strings_as_object argument
  • GeoJSON.read: Remove strings_as_object argument
  • ListOfDicts.to_data_frame: Remove strings_as_object argument
  • read_csv: Remove strings_as_object argument
  • read_geojson: Remove strings_as_object argument
  • read_parquet: Remove strings_as_object argument
  • Vector.as_string: Remove length argument
  • Vector.is_na: Fix to work in multidimensional cases where the elements of an object vector are arrays/vectors
  • Vector.rank: Change default method to "min"
  • Vector.rank: Remove method "average"
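
For readers unfamiliar with the "min" ranking method, the sketch below shows the general idea: tied values all receive the lowest rank in their group. This is a plain-Python illustration, not Dataiter's implementation; the helper name rank_min and the 1-based numbering are assumptions made for the example.

```python
def rank_min(values):
    """Return 1-based ranks where tied values share the lowest rank.

    Illustrative helper only (hypothetical name), not part of Dataiter.
    """
    ordered = sorted(values)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Keep only the first (lowest) position seen for each distinct value.
        first_position.setdefault(value, position)
    return [first_position[value] for value in values]

scores = [10, 20, 20, 30]
ranks = rank_min(scores)
print(ranks)  # [1, 2, 2, 4]
```

With an "average" method, the two tied values would instead share rank 2.5; code that relied on fractional ranks needs to be adjusted accordingly.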

This release is a breaking change: the string data type switches from
the fixed-width str_ (a.k.a. <U#) to the variable-width StringDType
introduced in NumPy 2.0. The main benefit is greatly reduced memory use,
so strings can be used without special care and without falling back to
object. The note about stability below release 0.99 still applies.
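
The memory difference can be estimated with a back-of-envelope sketch. A fixed-width <U# array stores 4 bytes per character and pads every element to the length of the longest string, so a single long value inflates the whole column. The helper names and the sample column below are hypothetical, and the variable-width figure counts only the UTF-8 text itself, ignoring the per-element bookkeeping that StringDType adds, so treat it as a lower bound rather than an exact measurement.

```python
def fixed_width_bytes(strings):
    """Estimate bytes used by a fixed-width <U# array holding these strings."""
    if not strings:
        return 0
    # Every element is padded to the longest string (minimum width 1),
    # at 4 bytes per character.
    width = max(1, max(len(s) for s in strings))
    return 4 * width * len(strings)

def utf8_payload_bytes(strings):
    """Total UTF-8 bytes of the text: a lower bound for variable-width storage."""
    return sum(len(s.encode("utf-8")) for s in strings)

# Hypothetical column: mostly one-character values plus one long outlier.
values = ["a"] * 999 + ["x" * 1000]

fixed = fixed_width_bytes(values)
payload = utf8_payload_bytes(values)
print(fixed)    # 4000000
print(payload)  # 1999
```

In this example the single 1000-character outlier forces every element to occupy 4000 bytes under fixed-width storage, which is the kind of case where falling back to object was previously tempting.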

Note that because StringDType exists only in NumPy >= 2.0, any NPZ or
Pickle files saved cannot be opened using Dataiter < 0.99 or NumPy < 2.0.
If you need that kind of interoperability, consider using the Parquet
file format.