Data Abstractions

The lenskit.data module provides the core data abstractions LensKit uses to represent recommender system inputs and outputs.

Data Sets

Dataset

Representation of a data set for LensKit training, evaluation, etc.

EntitySet

Representation of a set of entities from the dataset.

AttributeSet

Base class for attributes associated with entities.

RelationshipSet

Representation for a set of relationship records.

MatrixRelationshipSet

Two-entity relationships without duplicates, accessible in matrix form.

CSRStructure

Representation of the compressed sparse row structure of a sparse matrix, without any data values.
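To make "structure without data values" concrete, here is a minimal stdlib-only sketch of a compressed sparse row layout. The helper names `csr_structure` and `row_columns` are hypothetical and are not part of LensKit's API; the sketch only illustrates the general CSR idea of storing row pointers and column indices, with no value array.

```python
def csr_structure(n_rows, coords):
    """Build CSR row pointers and column indices from (row, col) pairs.

    Hypothetical helper for illustration only; not LensKit's CSRStructure.
    """
    # Sort by row, then column, and drop duplicate coordinates.
    pairs = sorted(set(coords))

    # Count the entries in each row, offset by one position...
    rowptrs = [0] * (n_rows + 1)
    for row, _col in pairs:
        rowptrs[row + 1] += 1
    # ...then accumulate so rowptrs[r] is where row r starts in colinds.
    for i in range(n_rows):
        rowptrs[i + 1] += rowptrs[i]

    colinds = [col for _row, col in pairs]
    return rowptrs, colinds


def row_columns(rowptrs, colinds, row):
    """Return the column indices stored for a single row."""
    return colinds[rowptrs[row]:rowptrs[row + 1]]


# A 3-row matrix with entries at (0, 1), (0, 2), (2, 0), and (2, 2).
rowptrs, colinds = csr_structure(3, [(0, 1), (2, 0), (0, 2), (2, 2)])
print(rowptrs)  # [0, 2, 2, 4]
print(colinds)  # [1, 2, 0, 2]
```

Row 1 is empty, which shows up as two equal consecutive row pointers. Because only positions are stored, the same structure can describe implicit-feedback data or be paired with a separate value array.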

Building Data Sets

DatasetBuilder

Construct data sets from data and tables.

from_interactions_df

Create a dataset from a data frame of ratings or other user-item interactions.

load_movielens

Load a MovieLens dataset.

load_movielens_df

Load the ratings from a MovieLens dataset as a raw data frame.

Item Data

ItemList

Representation of a (usually ordered) list of items, possibly with scores and other associated data; many components take and return item lists.

ItemListCollection

A collection of item lists.

UserIDKey

Key type for user IDs.

GenericKey

Generic key type: a plain tuple (Python's built-in immutable sequence).

Recommendation Queries

RecQuery

Representation of the data available for a recommendation query.

QueryInput

Type alias for recommendation query inputs, defined as a PEP 604 union type.

Schemas and Identifiers

lenskit.data.schema

Pydantic models for LensKit data schemas.

Vocabulary

Vocabularies of entity identifiers for the LensKit data model.
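As a rough illustration of what a vocabulary of entity identifiers does, the sketch below maps arbitrary IDs to contiguous integer positions and back. It assumes that a vocabulary is essentially a two-way ID/index mapping. The class `TinyVocabulary` and its methods `index_of` and `id_at` are hypothetical stand-ins, not LensKit's Vocabulary API.

```python
class TinyVocabulary:
    """Two-way mapping between entity IDs and 0-based positions.

    Hypothetical illustration only; not LensKit's Vocabulary class.
    """

    def __init__(self, ids):
        # De-duplicate and sort so positions are deterministic.
        self._ids = sorted(set(ids))
        self._index = {ident: pos for pos, ident in enumerate(self._ids)}

    def index_of(self, ident):
        """Look up the position of an ID (raises KeyError if unknown)."""
        return self._index[ident]

    def id_at(self, pos):
        """Look up the ID stored at a position."""
        return self._ids[pos]

    def __len__(self):
        return len(self._ids)


vocab = TinyVocabulary(["u42", "u7", "u42", "u13"])
print(len(vocab))             # 3
print(vocab.index_of("u42"))  # 1
print(vocab.id_at(2))         # u7
```

Positions like these are what allow entity IDs to serve as row and column indices in matrix-form data.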

See also:

  • lenskit.data.types.EntityId