k-NN Collaborative Filtering

LKPY provides classical user-based and item-based k-NN collaborative filtering implementations. These lightly configurable implementations are intended to capture the behavior of the Java-based LensKit implementations, providing a good upgrade path and enabling basic experiments out of the box.

Item-based k-NN

class lenskit.algorithms.item_knn.ItemItem(nnbrs, min_nbrs=1, min_sim=1e-06, save_nbrs=None, center=True, aggregate='weighted-average')

Bases: lenskit.algorithms.Predictor

Item-item nearest-neighbor collaborative filtering with ratings. This item-item implementation is not terribly configurable; it hard-codes design decisions found to work well in the previous Java-based LensKit code.
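A minimal usage sketch follows. It assumes only the constructor signature and the fit and predict_for_user methods documented in this section. The helper name score_with_item_knn and its default values are hypothetical, chosen for illustration.

```python
def score_with_item_knn(ratings, user, items, nnbrs=20, min_nbrs=2):
    """Fit ItemItem on (user, item, rating) data and score `items` for `user`.

    `ratings` is expected to be a pandas.DataFrame, as described for fit().
    This helper is a hypothetical wrapper, not part of LensKit.
    """
    # Imported inside the function so the sketch can be loaded
    # even where LensKit is not installed.
    from lenskit.algorithms.item_knn import ItemItem

    algo = ItemItem(nnbrs, min_nbrs=min_nbrs)  # other parameters keep their defaults
    algo.fit(ratings)                          # builds the item similarity model
    return algo.predict_for_user(user, items)  # scores indexed by item id
```

Requiring at least two neighbors (min_nbrs=2) is a common way to avoid scores that rest on a single similar item; the documented default is 1.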

item_index_

the index of item IDs.

Type: pandas.Index

item_means_

the mean rating for each known item.

Type: numpy.ndarray

item_counts_

the number of saved neighbors for each item.

Type: numpy.ndarray

sim_matrix_

the similarity matrix.

Type: matrix.CSR

user_index_

the index of known user IDs for the rating matrix.

Type: pandas.Index

rating_matrix_

the user-item rating matrix for looking up users’ ratings.

Type: matrix.CSR

fit(ratings)

Train a model.

The model-training process depends on save_nbrs and min_sim, but not on other algorithm parameters.

Parameters: ratings (pandas.DataFrame) – (user,item,rating) data for computing item similarities.
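To illustrate why training depends on min_sim and save_nbrs, here is a pure-Python sketch of one plausible training step. It assumes mean-centered cosine similarity between item rating vectors, drops similarities below min_sim, and keeps at most save_nbrs neighbors per item. The function name and data layout are hypothetical; the library's actual implementation works on sparse matrices and may differ in detail.

```python
from collections import defaultdict
from math import sqrt


def build_item_neighbors(ratings, min_sim=1e-6, save_nbrs=None):
    """Return (item_means, neighbors) from (user, item, rating) triples.

    neighbors[i] is a list of (similarity, item) pairs, most similar first.
    """
    by_item = defaultdict(dict)
    for user, item, rating in ratings:
        by_item[item][user] = rating

    # Mean-center each item's rating vector and precompute its length.
    item_means = {i: sum(v.values()) / len(v) for i, v in by_item.items()}
    centered = {i: {u: r - item_means[i] for u, r in v.items()}
                for i, v in by_item.items()}
    norms = {i: sqrt(sum(x * x for x in v.values()))
             for i, v in centered.items()}

    neighbors = {}
    for i, vec_i in centered.items():
        sims = []
        for j, vec_j in centered.items():
            if i == j or norms[i] == 0 or norms[j] == 0:
                continue
            # Cosine similarity over the users who rated both items.
            dot = sum(x * vec_j[u] for u, x in vec_i.items() if u in vec_j)
            sim = dot / (norms[i] * norms[j])
            if sim >= min_sim:          # discard weak or negative similarities
                sims.append((sim, j))
        sims.sort(reverse=True)
        neighbors[i] = sims[:save_nbrs]  # save_nbrs=None keeps every neighbor
    return item_means, neighbors


demo = [
    ("u1", "A", 5), ("u2", "A", 3), ("u3", "A", 1),
    ("u1", "B", 4), ("u2", "B", 3), ("u3", "B", 2),
    ("u1", "C", 1), ("u2", "C", 3), ("u3", "C", 5),
]
item_means, neighbors = build_item_neighbors(demo, save_nbrs=2)
```

In the demo data, items A and B are rated in the same direction by every user, while C is rated in the opposite direction, so C ends up with no saved neighbors.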

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items.

Parameters

user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series
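The 'weighted-average' aggregate can be sketched as follows, assuming the score for a target item is its mean plus the similarity-weighted average of the user's mean-centered ratings on the nnbrs most similar items the user has rated. The function name, the dictionary layout, and the demo values are hypothetical. Returning None for an unscorable item is a choice made for this sketch.

```python
def predict_item_knn(user_ratings, item_means, neighbors, target,
                     nnbrs=20, min_nbrs=1):
    """Weighted-average score for one target item, or None if unscorable.

    user_ratings: dict item -> rating for the user being scored
    neighbors:    dict item -> [(similarity, item), ...], most similar first
    """
    # Keep only neighbors the user has rated, then the nnbrs most similar.
    usable = [(sim, j) for sim, j in neighbors.get(target, [])
              if j in user_ratings][:nnbrs]
    if not usable or len(usable) < min_nbrs:
        return None
    num = sum(sim * (user_ratings[j] - item_means[j]) for sim, j in usable)
    den = sum(abs(sim) for sim, _ in usable)
    if den == 0:
        return None
    return item_means[target] + num / den


# Hypothetical model state and user profile.
item_means = {"A": 3.0, "B": 3.5, "C": 2.0}
neighbors = {"A": [(0.8, "B"), (0.4, "C")]}
user_ratings = {"B": 4.5, "C": 1.0}

score_all = predict_item_knn(user_ratings, item_means, neighbors, "A")
score_top1 = predict_item_knn(user_ratings, item_means, neighbors, "A", nnbrs=1)
score_strict = predict_item_knn(user_ratings, item_means, neighbors, "A",
                                min_nbrs=3)
```

Passing user_ratings explicitly mirrors the optional ratings argument: the caller can supply a profile that differs from what was seen at training time.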

User-based k-NN

class lenskit.algorithms.user_knn.UserUser(nnbrs, min_nbrs=1, min_sim=0, center=True, aggregate='weighted-average')

Bases: lenskit.algorithms.Predictor

User-user nearest-neighbor collaborative filtering with ratings. This user-user implementation is not terribly configurable; it hard-codes design decisions found to work well in the previous Java-based LensKit code.
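The user-based approach can be sketched in pure Python as below. This assumes mean-centered cosine similarity between users and a weighted average of the neighbors' centered ratings added back to the target user's mean. Names and data layout are hypothetical, and the sketch is not the library's implementation.

```python
from math import sqrt


def predict_user_knn(ratings, user, item, nnbrs=20, min_nbrs=1, min_sim=0.0):
    """Score `item` for `user`, or return None if too few neighbors qualify.

    ratings: dict user -> dict item -> rating
    """
    means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    centered = {u: {i: x - means[u] for i, x in r.items()}
                for u, r in ratings.items()}

    def cosine(a, b):
        dot = sum(x * b[i] for i, x in a.items() if i in b)
        na = sqrt(sum(x * x for x in a.values()))
        nb = sqrt(sum(x * x for x in b.values()))
        return dot / (na * nb) if na > 0 and nb > 0 else 0.0

    # Candidate neighbors: other users who rated the item and are similar enough.
    cands = []
    for other, vec in centered.items():
        if other == user or item not in vec:
            continue
        sim = cosine(centered[user], vec)
        if sim > min_sim:
            cands.append((sim, other))
    cands.sort(reverse=True)
    cands = cands[:nnbrs]
    if not cands or len(cands) < min_nbrs:
        return None

    num = sum(sim * centered[v][item] for sim, v in cands)
    den = sum(abs(sim) for sim, _ in cands)
    if den == 0:
        return None
    return means[user] + num / den


demo = {
    "u1": {"A": 5, "B": 3, "C": 1},
    "u2": {"A": 4, "B": 3, "C": 2, "D": 5},
    "u3": {"A": 1, "B": 3, "C": 5, "D": 1},
}
score = predict_user_knn(demo, "u1", "D")
```

In the demo, u2 agrees with u1 and u3 disagrees, so only u2 passes the min_sim=0 threshold and the score is driven entirely by u2's above-average rating of D.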

user_index_

The index of known user IDs.

Type: pandas.Index

item_index_

The index of known item IDs.

Type: pandas.Index

user_means_

User mean ratings.

Type: numpy.ndarray

rating_matrix_

Normalized user-item rating matrix.

Type: matrix.CSR

transpose_matrix_

Transposed un-normalized rating matrix.

Type: matrix.CSR
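The two matrices can be pictured with a small pure-Python sketch. It assumes that "normalized" means each user's row is mean-centered and scaled to unit length (so a dot product of two rows is their cosine similarity), and that the transpose is kept so the raters of an item can be listed quickly. Both points are assumptions, and the CSR layout here (row pointers, column indices, values) is a generic one, not necessarily the exact structure of matrix.CSR.

```python
from math import sqrt


def normalize_rows(rows):
    """Mean-center each row and scale it to unit length.

    rows: list of dicts {column: value}, one dict per user.
    Returns (row_means, normalized_rows).
    """
    means, normed = [], []
    for row in rows:
        mean = sum(row.values()) / len(row) if row else 0.0
        cent = {c: v - mean for c, v in row.items()}
        norm = sqrt(sum(x * x for x in cent.values()))
        normed.append({c: (x / norm if norm > 0 else 0.0)
                       for c, x in cent.items()})
        means.append(mean)
    return means, normed


def transpose(rows, ncols):
    """Turn user rows into item rows without changing the values."""
    cols = [dict() for _ in range(ncols)]
    for r, row in enumerate(rows):
        for c, v in row.items():
            cols[c][r] = v
    return cols


def to_csr(rows):
    """Flatten a list of row dicts into (rowptrs, colinds, values)."""
    rowptrs, colinds, values = [0], [], []
    for row in rows:
        for col in sorted(row):
            colinds.append(col)
            values.append(row[col])
        rowptrs.append(len(colinds))
    return rowptrs, colinds, values


# Two users, three items; user 0 rated items 0 and 1, user 1 rated items 1 and 2.
raw = [{0: 5.0, 1: 3.0}, {1: 4.0, 2: 2.0}]
user_means, normed = normalize_rows(raw)
rating_csr = to_csr(normed)                # normalized, user-major
transpose_csr = to_csr(transpose(raw, 3))  # un-normalized, item-major
```

Row i of the item-major structure spans positions rowptrs[i] to rowptrs[i + 1], which gives the users who rated item i and their original ratings.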