lenskit.data.matrix.SparseRowType

class lenskit.data.matrix.SparseRowType(dimension, value_type=DataType(float), large=False)

Bases: ExtensionType

Data type for sparse rows stored in Arrow. Sparse rows are stored as lists of structs with index and column fields.

Stability: Internal

This API is at the internal or experimental stability level: it may change at any time, and breaking changes will not necessarily be described in the release notes. See Stability Levels for details.

Parameters:
  • dimension (int)

  • value_type (DataType)

  • large (bool)
__init__(dimension, value_type=DataType(float), large=False)

Initialize an extension type instance.

This should be called at the end of the subclass’ __init__ method.

Parameters:
  • dimension (int)

  • value_type (DataType)

  • large (bool)

Methods

__init__(dimension[, value_type, large])

Initialize an extension type instance.

equals(self, other, *[, check_metadata])

Return true if type is equivalent to passed value.

field(self, i)

from_type(data_type[, dimension])

Create a sparse row type from an Arrow data type, handling legacy struct layouts without the extension types.

to_pandas_dtype(self)

Return the equivalent NumPy / Pandas dtype.

wrap_array(self, storage)

Wrap the given storage array as an extension array.

Attributes

bit_width

The bit width of the extension type.

byte_width

The byte width of the extension type.

dimension

extension_name

The extension type name.

has_variadic_buffers

If True, the number of expected buffers is only lower-bounded by num_buffers.

id

num_buffers

Number of data buffers required to construct Array type excluding children.

num_fields

The number of child fields.

storage_type

The underlying storage type.

value_type

index_type

classmethod from_type(data_type, dimension=None)

Create a sparse row type from an Arrow data type, handling legacy struct layouts without the extension types.

Parameters:
  • data_type (DataType) – The Arrow data type to interpret as a row type.

  • dimension (int | None) – The row dimension, if known from an external source. If provided and the data type also includes the dimensionality, both dimensions must match.

Raises:
  • TypeError – If the data type is not a valid sparse row type.

  • ValueError – If there is another error, such as mismatched dimensions.

Return type:

SparseRowType
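The rule for the dimension argument can be sketched in plain Python. This is a hedged illustration of the documented contract only: reconcile_dimension is a hypothetical helper, not part of LensKit, and returning None when neither source supplies a dimension is an assumption of this sketch rather than documented behavior.

```python
def reconcile_dimension(embedded, external):
    """Hypothetical helper: combine the dimension carried by the data type
    (embedded) with one supplied by the caller (external)."""
    if embedded is not None and external is not None and embedded != external:
        # Both sources are known but disagree: the documented ValueError case.
        raise ValueError(
            f"dimension mismatch: type has {embedded}, caller gave {external}"
        )
    # Prefer whichever dimension is known; None if neither is (assumption).
    return embedded if embedded is not None else external


# A legacy layout carries no dimension, so the external value is used.
legacy_dim = reconcile_dimension(None, 100)

# Matching dimensions are accepted unchanged.
matched_dim = reconcile_dimension(100, 100)
```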