lenskit.knn.item#

Item-based k-NN collaborative filtering.

Classes

ItemKNNScorer(nnbrs[, min_nbrs, min_sim, ...])

Item-item nearest-neighbor collaborative filtering for explicit or implicit feedback.

class lenskit.knn.item.ItemKNNScorer(nnbrs, min_nbrs=1, min_sim=1e-06, save_nbrs=None, feedback='explicit', block_size=250)#

Bases: Component, Trainable

Item-item nearest-neighbor collaborative filtering for explicit or implicit feedback. This implementation follows the description of item-based CF by Deshpande and Karypis [DK04] and hard-codes several design decisions found to work well in the previous Java-based LensKit code [ELKR11]. In explicit-feedback mode, its output is equivalent to that of the Java version.
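To make the neighbor-selection parameters concrete, here is a minimal pure-Python sketch of item-based k-NN scoring in the explicit-feedback style: a similarity-weighted average of the user's ratings over the most similar items that user has rated. It is not LensKit's implementation. The function `score_item` and the dictionary-based data layout are hypothetical, the sketch uses raw rather than normalized ratings, and only the parameter names `nnbrs`, `min_nbrs`, and `min_sim` come from the class signature.

```python
def score_item(target, user_ratings, sims, nnbrs=20, min_nbrs=1, min_sim=1e-6):
    """Score `target` for one user from that user's ratings of similar items.

    user_ratings: dict item -> rating (hypothetical layout)
    sims: dict item -> dict neighbor -> similarity (hypothetical layout)
    Returns None when too few usable neighbors exist.
    """
    # Candidate neighbors: items the user rated that are similar enough.
    candidates = [
        (sim, user_ratings[nbr])
        for nbr, sim in sims.get(target, {}).items()
        if nbr in user_ratings and nbr != target and sim >= min_sim
    ]
    # Keep the most similar neighbors; nnbrs=None means no limit.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    if nnbrs is not None:
        candidates = candidates[:nnbrs]
    if not candidates or len(candidates) < min_nbrs:
        return None
    # Similarity-weighted average of the neighbors' ratings.
    total_sim = sum(sim for sim, _ in candidates)
    return sum(sim * rating for sim, rating in candidates) / total_sim


sims = {"A": {"B": 0.5, "C": 0.25, "D": 1e-9}}
ratings = {"B": 4.0, "C": 2.0, "D": 5.0}
prediction = score_item("A", ratings, sims, nnbrs=2)
```

In this toy data, item D falls below `min_sim` and is ignored, so the score for A is computed from B and C only. Raising `min_nbrs` above the number of usable neighbors makes the function return `None` instead of a score.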

Parameters:
  • nnbrs (int) – The maximum number of neighbors for scoring each item (None for unlimited)

  • min_nbrs (int) – The minimum number of neighbors for scoring each item

  • min_sim (float) – Minimum similarity threshold for considering a neighbor. Must be positive; values below the smallest positive normal 32-bit float (\(1.175 \times 10^{-38}\)) are clamped to that value.

  • save_nbrs (int | None) – The number of neighbors to save per item in the trained model (None for unlimited)

  • feedback (FeedbackType) – The type of input data to use (explicit or implicit). This affects data pre-processing and aggregation.

  • block_size (int)
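The clamping rule described for min_sim can be sketched as follows. The constant is the smallest positive normal IEEE 754 single-precision value, \(2^{-126}\). The helper name `clamp_min_sim` is hypothetical, and raising `ValueError` for non-positive input is an assumption of this sketch, since the parameter description says only that the value must be positive.

```python
# Smallest positive normal 32-bit float, 2**-126 (about 1.175e-38).
FLOAT32_MIN_NORMAL = 2.0 ** -126


def clamp_min_sim(min_sim):
    """Return a usable similarity threshold (hypothetical helper).

    Rejecting non-positive values with ValueError is an assumption of this
    sketch; the documentation only states that the value must be positive.
    """
    if min_sim <= 0:
        raise ValueError("min_sim must be positive")
    # Thresholds below the smallest normal float32 are raised to it.
    return max(min_sim, FLOAT32_MIN_NORMAL)
```

With this rule, the default of 1e-06 passes through unchanged, while a value such as 1e-45 is raised to the smallest normal 32-bit float.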

items_: Vocabulary#

Vocabulary of item IDs.

item_means_: torch.Tensor | None#

Mean rating for each known item.

item_counts_: torch.Tensor#

Number of saved neighbors for each item.

sim_matrix_: torch.Tensor#

Similarity matrix (sparse CSR tensor).
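For readers unfamiliar with compressed sparse row (CSR) storage, the sketch below lays out a small similarity matrix using plain Python lists. The names `crow_indices`, `col_indices`, and `values` follow PyTorch's sparse CSR terminology; the helper `neighbors_of` is hypothetical, and treating rows as items and columns as their neighbors is an assumption about how the matrix is organized.

```python
# A 3x3 similarity matrix in CSR form, written with plain Python lists:
#        item0  item1  item2
# item0    .     0.8    0.3
# item1   0.8     .      .
# item2   0.3     .      .
crow_indices = [0, 2, 3, 4]      # row i occupies positions crow[i]:crow[i + 1]
col_indices = [1, 2, 0, 0]       # column (neighbor) index of each stored value
values = [0.8, 0.3, 0.8, 0.3]    # stored similarities


def neighbors_of(item):
    """Return (neighbor index, similarity) pairs stored in one row."""
    start, end = crow_indices[item], crow_indices[item + 1]
    return list(zip(col_indices[start:end], values[start:end]))


# Stored entries per row, computed from consecutive row pointers.
row_counts = [end - start for start, end in zip(crow_indices, crow_indices[1:])]
```

The per-row counts computed at the end are the kind of quantity that item_counts_ describes: how many neighbors are stored for each item.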

users_: Vocabulary#

Vocabulary of user IDs.

rating_matrix_: torch.Tensor#

Normalized rating matrix to look up user ratings at prediction time.
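As an illustration of what normalization can mean here, the sketch below subtracts each item's mean rating from that item's ratings. That rating_matrix_ is normalized in exactly this way is an assumption, suggested by the presence of item_means_; the variable names and the toy data are hypothetical.

```python
from collections import defaultdict

# Toy (user, item, rating) triples.
ratings = [
    ("u1", "A", 5.0),
    ("u2", "A", 3.0),
    ("u1", "B", 2.0),
]

# Accumulate per-item sums and counts to compute item means.
sums = defaultdict(float)
counts = defaultdict(int)
for _user, item, rating in ratings:
    sums[item] += rating
    counts[item] += 1
item_means = {item: sums[item] / counts[item] for item in sums}

# Subtract each item's mean from its ratings.
centered = [
    (user, item, rating - item_means[item]) for user, item, rating in ratings
]
```

After centering, a positive value means the user rated the item above its average and a negative value means below; adding the item mean back recovers a prediction on the original rating scale.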

property is_trained: bool#

Check if this model has already been trained.

train(data)#

Train a model.

The model-training process depends on save_nbrs and min_sim, but not on other algorithm parameters.

Parameters:
  • data (Dataset) – (user, item, rating) data for computing item similarities.
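The following sketch illustrates how save_nbrs and min_sim could shape a trained neighborhood model: compute pairwise item similarities, discard pairs below min_sim, and keep only the save_nbrs most similar neighbors per item. It is a simplified pure-Python illustration, not the library's training code. Using cosine similarity is an assumption, and `build_neighbors` and its dictionary layout are hypothetical.

```python
import math
from itertools import combinations


def build_neighbors(item_vectors, min_sim=1e-6, save_nbrs=None):
    """item_vectors: dict item -> dict user -> (normalized) rating."""
    norms = {
        item: math.sqrt(sum(v * v for v in vec.values()))
        for item, vec in item_vectors.items()
    }
    neighbors = {item: [] for item in item_vectors}
    for a, b in combinations(item_vectors, 2):
        if norms[a] == 0 or norms[b] == 0:
            continue  # cosine similarity is undefined for all-zero vectors
        vec_a, vec_b = item_vectors[a], item_vectors[b]
        # Dot product over the users who rated both items.
        dot = sum(vec_a[u] * vec_b[u] for u in vec_a.keys() & vec_b.keys())
        sim = dot / (norms[a] * norms[b])
        if sim >= min_sim:
            neighbors[a].append((b, sim))
            neighbors[b].append((a, sim))
    for nbrs in neighbors.values():
        # Most similar first; truncate unless save_nbrs is None (unlimited).
        nbrs.sort(key=lambda pair: pair[1], reverse=True)
        if save_nbrs is not None:
            del nbrs[save_nbrs:]
    return neighbors


vectors = {
    "A": {"u1": 1.0, "u2": 1.0},
    "B": {"u1": 1.0, "u2": 1.0},
    "C": {"u1": 1.0, "u2": -1.0},
    "D": {"u1": 1.0},
}
model = build_neighbors(vectors, save_nbrs=1)
```

In this toy data, A and C have zero similarity and so never become neighbors, and with `save_nbrs=1` each item retains only its single most similar neighbor. Scoring-time parameters such as nnbrs and min_nbrs do not appear in the sketch, which mirrors the statement that training depends only on save_nbrs and min_sim.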