Data Abstractions#

The lenskit.data module provides the core data abstractions LensKit uses to represent recommender system inputs and outputs.

Data Sets#

Dataset

Representation of a data set for LensKit training, evaluation, etc.

EntitySet

Representation of a set of entities from the dataset.

AttributeSet

Base class for attributes associated with entities.

RelationshipSet

Representation for a set of relationship records.

MatrixRelationshipSet

Two-entity relationships without duplicates, accessible in matrix form.

CSRStructure

Representation of the compressed sparse row structure of a sparse matrix, without any data values.
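
As a rough illustration of what a compressed sparse row structure without data values looks like, the sketch below stores only row pointers and column indices. It uses the standard library alone; `CSRSketch`, `csr_from_pairs`, and their field names are hypothetical stand-ins for this example and are not the actual CSRStructure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CSRSketch:
    """Hypothetical stand-in: CSR layout with no data values."""

    nrows: int
    ncols: int
    rowptrs: list[int]  # length nrows + 1; row i spans rowptrs[i]:rowptrs[i + 1]
    colinds: list[int]  # column index of each stored entry, grouped by row

    @property
    def nnz(self) -> int:
        # Number of stored (row, column) positions.
        return len(self.colinds)

    def row_cols(self, row: int) -> list[int]:
        # Column indices present in the given row.
        start, end = self.rowptrs[row], self.rowptrs[row + 1]
        return self.colinds[start:end]


def csr_from_pairs(pairs, nrows: int, ncols: int) -> CSRSketch:
    """Build the structure from (row, column) pairs, dropping duplicates."""
    by_row = [[] for _ in range(nrows)]
    for r, c in sorted(set(pairs)):
        by_row[r].append(c)

    rowptrs = [0]
    colinds = []
    for cols in by_row:
        colinds.extend(cols)
        rowptrs.append(len(colinds))
    return CSRSketch(nrows, ncols, rowptrs, colinds)


# Three rows (e.g. users) and four columns (e.g. items); row 1 is empty.
structure = csr_from_pairs(
    [(0, 1), (0, 3), (2, 0), (2, 2), (2, 3)], nrows=3, ncols=4
)
print(structure.rowptrs)      # [0, 2, 2, 5]
print(structure.colinds)      # [1, 3, 0, 2, 3]
print(structure.row_cols(2))  # [0, 2, 3]
```

Because only the positions are stored, the same structure can be paired with different value arrays (ratings, counts, timestamps) or used on its own for implicit-feedback data.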

Building Data Sets#

DatasetBuilder

Construct data sets from data and tables.

from_interactions_df

Create a dataset from a data frame of ratings or other user-item interactions.

load_movielens

Load a MovieLens dataset.

load_movielens_df

Load the ratings from a MovieLens dataset as a raw data frame.

Item Data#

ItemList

Representation of a (usually ordered) list of items, possibly with scores and other associated data; many components take and return item lists.

ItemListCollection

A collection of item lists.

ItemListCollector

Collect item lists with associated keys, as in ItemListCollection.

ListILC

Mutable item list collection backed by a Python list.

UserIDKey

Key type for user IDs.

GenericKey

Generic key type; a plain Python tuple (a built-in immutable sequence).

MutableItemListCollection

Intersection type of ItemListCollection and ItemListCollector.

Recommendation Queries#

RecQuery

Representation of the data available for a recommendation query.

QueryInput

Type alias for recommendation query inputs, defined as a PEP 604 union type.

Schemas and Identifiers#

lenskit.data.schema

Pydantic models for LensKit data schemas.

Vocabulary

Vocabularies of entity identifiers for the LensKit data model.

See also:

  • lenskit.data.types.EntityId

Arrow Support#

These classes provide support for compressed sparse row matrices in Arrow.

SparseRowType

Data type for sparse rows stored in Arrow.

SparseIndexType

Data type for the index field of a sparse row. Indexes are stored as plain int32 values; the extension type attaches the row's dimensionality to the index field, which makes it easier to pass to and from Rust, since arrays are often passed rather than entire fields.

SparseRowArray

An array of sparse rows (a compressed sparse row matrix).

These types are also supported on the Rust side of LensKit.