kNN Collaborative Filtering
LKPY provides user- and item-based classical kNN collaborative filtering implementations. These lightly-configurable implementations are intended to capture the behavior of the Java-based LensKit implementations, providing a good upgrade path and enabling basic experiments out of the box.
Item-based kNN

class lenskit.algorithms.item_knn.ItemItem(nnbrs, min_nbrs=1, min_sim=1e-06, save_nbrs=None, center=True, aggregate='weighted-average')
Bases: lenskit.Predictor
Item-item nearest-neighbor collaborative filtering with ratings. This item-item implementation is not terribly configurable; it hard-codes design decisions found to work well in the previous Java-based LensKit code.
The kNN predictor supports several aggregate functions:
weighted-average
The weighted average of the user’s rating values, using item-item similarities as weights.
sum
The sum of the similarities between the target item and the user’s rated items, regardless of the rating the user gave the items.
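The two aggregates can be illustrated with a minimal pure-Python sketch. It is not LKPY's implementation: the function names `weighted_average` and `similarity_sum` are hypothetical, and the sketch assumes all similarities are positive, as they would be after thresholding with a positive `min_sim`.

```python
def weighted_average(sims, ratings):
    """Similarity-weighted mean of the user's ratings for the neighbor items.

    sims and ratings are parallel lists: sims[k] is the similarity between
    the target item and the k-th neighbor, ratings[k] is the user's rating
    of that neighbor. (Hypothetical helper, for illustration only.)
    """
    total_weight = sum(sims)
    if total_weight == 0:
        return None  # no usable neighbors, so no score
    return sum(s * r for s, r in zip(sims, ratings)) / total_weight


def similarity_sum(sims):
    """Sum of the similarities; the rating values are ignored entirely."""
    return sum(sims)


# Three neighbor items the user has rated.
sims = [0.5, 0.25, 0.25]
ratings = [4.0, 2.0, 5.0]

wa = weighted_average(sims, ratings)
ss = similarity_sum(sims)
print(wa, ss)
```

The `sum` aggregate yields a score rather than a predicted rating, which is why it suits unary data where only the presence of an interaction is known.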
 Parameters
nnbrs (int) – the maximum number of neighbors for scoring each item (None for unlimited)
min_nbrs (int) – the minimum number of neighbors for scoring each item
min_sim (double) – minimum similarity threshold for considering a neighbor
save_nbrs (int) – the number of neighbors to save per item in the trained model (None for unlimited)
center (bool) – whether to normalize (mean-center) rating vectors prior to computing similarities and aggregating user rating values. Turn this off when working with unary data and other data types that don’t respond well to centering.
aggregate – the type of aggregation to do. Can be weighted-average or sum.
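As a rough illustration of how the three neighborhood parameters could interact at scoring time, the sketch below filters by `min_sim`, keeps at most `nnbrs` of the most similar neighbors, and declines to score when fewer than `min_nbrs` remain. The helper `select_neighbors` is hypothetical, and the inclusive `>=` threshold is an assumption rather than documented LKPY behavior.

```python
def select_neighbors(candidates, nnbrs, min_nbrs=1, min_sim=1e-6):
    """Pick the neighbors used to score one target item.

    candidates is a list of (item_id, similarity) pairs for the items the
    user has rated. Returns the selected pairs, most similar first, or None
    when too few neighbors qualify. (Hypothetical helper, for illustration.)
    """
    # Drop neighbors below the similarity threshold (assumed inclusive).
    usable = [(item, sim) for item, sim in candidates if sim >= min_sim]
    # Most similar neighbors first.
    usable.sort(key=lambda pair: pair[1], reverse=True)
    # nnbrs=None means the neighborhood size is unlimited.
    if nnbrs is not None:
        usable = usable[:nnbrs]
    # Too few neighbors: the item cannot be scored.
    if len(usable) < min_nbrs:
        return None
    return usable


candidates = [("a", 0.9), ("b", 0.0), ("c", 0.4), ("d", 0.7)]
top_two = select_neighbors(candidates, nnbrs=2)
too_few = select_neighbors(candidates, nnbrs=None, min_nbrs=4)
print(top_two, too_few)
```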

item_index_
The index of item IDs.
 Type

item_means_
The mean rating for each known item.
 Type

item_counts_
The number of saved neighbors for each item.
 Type

sim_matrix_
The similarity matrix.
 Type
matrix.CSR

user_index_
The index of known user IDs for the rating matrix.
 Type

rating_matrix_
The user-item rating matrix for looking up users’ ratings.
 Type
matrix.CSR

fit(ratings, **kwargs)
Train a model.
The model-training process depends on save_nbrs and min_sim, but not on other algorithm parameters.
 Parameters
ratings (pandas.DataFrame) – (user,item,rating) data for computing item similarities.
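To make the role of `min_sim` and `save_nbrs` in training concrete, here is a small stdlib-only sketch of building an item neighborhood model from (user, item, rating) tuples. It is not LKPY's code: the helper names are hypothetical, and the use of cosine similarity over item-mean-centered vectors is an assumption made for illustration.

```python
from math import sqrt


def center_by_item_mean(ratings):
    """Group (user, item, rating) tuples by item and subtract each item's mean."""
    by_item = {}
    for user, item, rating in ratings:
        by_item.setdefault(item, {})[user] = rating
    centered = {}
    for item, vec in by_item.items():
        mean = sum(vec.values()) / len(vec)
        centered[item] = {u: r - mean for u, r in vec.items()}
    return centered


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(ratings, min_sim=1e-6, save_nbrs=None):
    """Return {item: [(neighbor, similarity), ...]}, most similar first."""
    vecs = center_by_item_mean(ratings)
    model = {}
    for i in vecs:
        nbrs = [(j, cosine(vecs[i], vecs[j])) for j in vecs if j != i]
        # Only min_sim and save_nbrs shape the stored model.
        nbrs = [(j, s) for j, s in nbrs if s >= min_sim]
        nbrs.sort(key=lambda pair: pair[1], reverse=True)
        model[i] = nbrs if save_nbrs is None else nbrs[:save_nbrs]
    return model


ratings = [
    ("u1", "i1", 5.0), ("u2", "i1", 3.0),
    ("u1", "i2", 4.0), ("u2", "i2", 2.0),
    ("u1", "i3", 1.0), ("u2", "i3", 3.0),
]
model = train(ratings)
print(model)
```

In this toy data, i1 and i2 are rated in the same pattern and become neighbors, while i3 is rated in the opposite pattern and falls below the threshold.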

predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
 Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
 Returns
scores for the items, indexed by item id.
 Return type
User-based kNN

class lenskit.algorithms.user_knn.UserUser(nnbrs, min_nbrs=1, min_sim=0, center=True, aggregate='weighted-average')
Bases: lenskit.Predictor
User-user nearest-neighbor collaborative filtering with ratings. This user-user implementation is not terribly configurable; it hard-codes design decisions found to work well in the previous Java-based LensKit code.
 Parameters
nnbrs (int) – the maximum number of neighbors for scoring each item (None for unlimited)
min_nbrs (int) – the minimum number of neighbors for scoring each item
min_sim (double) – minimum similarity threshold for considering a neighbor
center (bool) – whether to normalize (mean-center) rating vectors. Turn this off when working with unary data and other data types that don’t respond well to centering.
aggregate – the type of aggregation to do. Can be weighted-average or sum.
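The effect of `center` on a user-based prediction can be sketched as follows: each user's ratings are shifted by that user's mean, neighbors' centered ratings are combined with a similarity-weighted average, and the target user's mean is added back. This is a simplified illustration under those assumptions, not LKPY's implementation; `mean_center` and `predict` are hypothetical helpers.

```python
def mean_center(ratings):
    """Split one user's {item: rating} dict into (mean, centered ratings)."""
    mean = sum(ratings.values()) / len(ratings)
    return mean, {item: r - mean for item, r in ratings.items()}


def predict(user_mean, neighbors):
    """Score one item for the target user.

    neighbors is a list of (similarity, centered_rating) pairs, one per
    neighboring user who rated the item. (Hypothetical helper.)
    """
    weight = sum(sim for sim, _ in neighbors)
    if weight <= 0:
        return None  # no usable neighbors
    offset = sum(sim * r for sim, r in neighbors) / weight
    # Add the target user's own mean back to undo the centering.
    return user_mean + offset


user_mean, centered = mean_center({"a": 5.0, "b": 3.0, "c": 4.0})
score = predict(user_mean, [(0.5, 1.0), (0.5, -0.5)])
print(user_mean, centered, score)
```

With unary data every rating equals the user's mean, so centering would reduce all values to zero, which is why the parameter description recommends turning it off in that case.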

user_index_
User index.
 Type

item_index_
Item index.
 Type

user_means_
User mean ratings.
 Type

rating_matrix_
Normalized user-item rating matrix.
 Type
matrix.CSR

transpose_matrix_
Transposed unnormalized rating matrix.
 Type
matrix.CSR

fit(ratings, **kwargs)
“Train” a user-user CF model. This memorizes the rating data in a format that is usable for future computations.
 Parameters
ratings (pandas.DataFrame) – (user, item, rating) data for collaborative filtering.
 Returns
a memorized model for efficient user-based CF computation.
 Return type
UUModel

predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
 Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.
 Returns
scores for the items, indexed by item id.
 Return type