Data Utilities#

These are general-purpose data processing utilities.

Building Ratings Matrices#

(ratings, *, type='scipy', layout='csr', users=None, items=None)#

Convert a rating table to a sparse matrix of ratings.

Parameters:

  • ratings (pd.DataFrame) – A data table of (user, item, rating) triples.

  • type (Literal['scipy', 'spmatrix', 'torch', 'structure']) – The type of matrix to create: one of 'scipy', 'spmatrix', 'torch', or 'structure'.

  • layout (Literal['csr', 'coo']) – The matrix layout to use.

  • users (Optional[pd.Index[Any]]) – An index of user IDs.

  • items (Optional[pd.Index[Any]]) – An index of item IDs.


Returns:

A named tuple containing the sparse matrix, user index, and item index.

Return type:
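The conversion described above can be sketched with pandas and SciPy directly. This is a minimal illustration, not the library's implementation: the helper name `build_rating_matrix` and the column names `user`, `item`, and `rating` are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix


def build_rating_matrix(ratings: pd.DataFrame):
    """Hypothetical helper: build a CSR matrix plus user and item indices."""
    # Sorted unique IDs become the row and column labels.
    users = pd.Index(ratings["user"].unique(), name="user").sort_values()
    items = pd.Index(ratings["item"].unique(), name="item").sort_values()
    # Translate each ID to its integer position in the matching index.
    rows = users.get_indexer(ratings["user"])
    cols = items.get_indexer(ratings["item"])
    values = ratings["rating"].to_numpy(dtype=np.float64)
    matrix = csr_matrix((values, (rows, cols)), shape=(len(users), len(items)))
    return matrix, users, items


# Assumed column names: "user", "item", "rating".
ratings = pd.DataFrame(
    {
        "user": [10, 10, 20, 30],
        "item": ["a", "b", "b", "c"],
        "rating": [4.0, 3.5, 5.0, 2.0],
    }
)
matrix, users, items = build_rating_matrix(ratings)
```

Passing pre-built indices instead of deriving them from the table would keep row and column numbering stable across several tables, which is presumably the purpose of the `users` and `items` parameters.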


class (matrix, users, items)#

Bases: NamedTuple, Generic[M]

A rating matrix with associated indices.

  • matrix (M)

  • users (pd.Index[Any])

  • items (pd.Index[Any])

matrix: M#

The rating matrix, with users on rows and items on columns.

users: pd.Index[Any]#

Mapping from user IDs to row numbers.

items: pd.Index[Any]#

Mapping from item IDs to column numbers.
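To show how the two indices translate between IDs and matrix positions, here is a small sketch. `RatingMatrixSketch` is a hypothetical stand-in that only mirrors the three documented fields, and a dense NumPy array is used in place of a sparse matrix to keep the example short.

```python
from typing import Any, NamedTuple

import numpy as np
import pandas as pd


class RatingMatrixSketch(NamedTuple):
    """Hypothetical stand-in with the three documented fields."""

    matrix: Any
    users: pd.Index
    items: pd.Index


rm = RatingMatrixSketch(
    # Dense array used here only for brevity; users on rows, items on columns.
    matrix=np.array([[4.0, 3.5, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 2.0]]),
    users=pd.Index([10, 20, 30], name="user"),
    items=pd.Index(["a", "b", "c"], name="item"),
)

# ID -> position: get_loc returns the row or column number for one ID.
row = rm.users.get_loc(20)
col = rm.items.get_loc("b")
rating = rm.matrix[row, col]

# Position -> ID: indexing the Index recovers the original identifier.
user_id = rm.users[row]

# Many IDs at once; IDs missing from the index map to -1.
positions = rm.items.get_indexer(["c", "a", "zzz"])

# Because it is a named tuple, it also unpacks positionally.
matrix, users, items = rm
```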

class (rowptrs, colinds, shape)#

Bases: NamedTuple

Representation of the compressed sparse row structure of a sparse matrix, without any data values.

rowptrs: ndarray#

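The row pointers (field number 0): the entries of row i occupy positions rowptrs[i] up to, but not including, rowptrs[i + 1] in colinds.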

colinds: ndarray#

The column indices of the stored entries, ordered by row (field number 1).

shape: tuple[int, int]#

The shape of the matrix as (rows, columns) (field number 2).
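As an illustration of how a value-free CSR structure is read, the sketch below uses plain NumPy arrays named after the three fields. The helper `row_columns` is hypothetical and is not part of the documented class.

```python
import numpy as np

# Structure of a 3x3 matrix with entries at (0,0), (0,1), (1,1), (2,2).
rowptrs = np.array([0, 2, 3, 4])
colinds = np.array([0, 1, 1, 2])
shape = (3, 3)


def row_columns(rowptrs, colinds, row):
    """Hypothetical helper: return the column indices stored for one row."""
    start, end = rowptrs[row], rowptrs[row + 1]
    return colinds[start:end]


# The number of stored entries per row is the difference of adjacent pointers.
row_counts = np.diff(rowptrs)

# Expand the pointers into one explicit row index per entry (COO-style).
rows = np.repeat(np.arange(shape[0]), row_counts)
```

Storing only the structure is enough when the presence of an entry matters but its value does not, for example when each stored (user, item) pair simply marks an interaction.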