Data Abstractions
The lenskit.data module provides the core data abstractions LensKit uses to represent recommender system inputs and outputs.
Data Sets
Representation of a data set for LensKit training, evaluation, etc.
Representation of a set of entities from the dataset.
Base class for attributes associated with entities.
Representation for a set of relationship records.
Two-entity relationships without duplicates, accessible in matrix form.
Representation of the compressed sparse row structure of a sparse matrix, without any data values.
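To make the last entry concrete, here is a minimal pure-Python sketch of what a compressed sparse row structure without data values holds: row pointers and column indices only. The names `CSRSketch` and `csr_from_pairs` are hypothetical and chosen for illustration; they are not part of the LensKit API, and the actual LensKit class may store and expose its fields differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CSRSketch:
    """Hypothetical stand-in for a CSR structure: layout only, no values."""

    nrows: int
    ncols: int
    # rowptrs has nrows + 1 entries; row r owns colinds[rowptrs[r]:rowptrs[r + 1]].
    rowptrs: list
    # Column indices of the stored entries, grouped by row.
    colinds: list

    def row_cols(self, row):
        """Return the column indices stored for one row."""
        start, end = self.rowptrs[row], self.rowptrs[row + 1]
        return self.colinds[start:end]


def csr_from_pairs(pairs, nrows, ncols):
    """Build the structure from (row, column) pairs, dropping duplicates."""
    by_row = [[] for _ in range(nrows)]
    for r, c in sorted(set(pairs)):
        by_row[r].append(c)

    rowptrs = [0]
    colinds = []
    for cols in by_row:
        colinds.extend(cols)
        rowptrs.append(len(colinds))
    return CSRSketch(nrows, ncols, rowptrs, colinds)


# Three rows (e.g. users) by four columns (e.g. items); row 1 is empty.
pairs = [(0, 1), (0, 3), (2, 0), (2, 1), (2, 3)]
csr = csr_from_pairs(pairs, nrows=3, ncols=4)
print(csr.rowptrs)
print(csr.colinds)
print(csr.row_cols(2))
```

Because no values are stored, a structure like this only records which (row, column) pairs are present, which is sufficient for implicit-feedback style interactions where presence is the signal.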
Building Data Sets
Construct data sets from data and tables.
Create a dataset from a data frame of ratings or other user-item interactions.
Load a MovieLens dataset.
Load the ratings from a MovieLens dataset as a raw data frame.
Item Data
Representation of a (usually ordered) list of items, possibly with scores and other associated data; many components take and return item lists.
A collection of item lists.
Collect item lists with associated keys, as in
Mutable item list collection backed by a Python list.
Key type for user IDs.
Built-in immutable sequence.
Intersection type of
Recommendation Queries
Representation of the data available for a recommendation query.
Represent a PEP 604 union type
Schemas and Identifiers
Pydantic models for LensKit data schemas.
Vocabularies of entity identifiers for the LensKit data model.
See also:
lenskit.data.types.EntityId
Arrow Support
These classes provide support for compressed sparse row matrices in Arrow.
Data type for sparse rows stored in Arrow.
Data type for the index field of a sparse row. Indexes are just stored as ``int32``s; the extension type attaches the row's dimensionality to the index field (making it easier to pass it to/from Rust, since we often pass arrays and not entire fields).
An array of sparse rows (a compressed sparse row matrix).
They are also supported on the Rust side of LensKit.