lenskit.knn
k-NN recommender models.
- class lenskit.knn.ItemKNNScorer(nnbrs, min_nbrs=1, min_sim=1e-06, save_nbrs=None, feedback='explicit', block_size=250)
Item-item nearest-neighbor collaborative filtering. This item-item implementation is based on the description of item-based CF by Deshpande and Karypis [DK04] and hard-codes several design decisions found to work well in the previous Java-based LensKit code [ELKR11]. In explicit-feedback mode, its output is equivalent to that of the Java version.
- Parameters:
nnbrs (int) – The maximum number of neighbors for scoring each item (None for unlimited).
min_nbrs (int) – The minimum number of neighbors for scoring each item.
min_sim (float) – Minimum similarity threshold for considering a neighbor. Must be positive; if it is less than the smallest 32-bit normal floating-point value (\(1.175 \times 10^{-38}\)), it is clamped to that value.
save_nbrs (int | None) – The number of neighbors to save per item in the trained model (None for unlimited).
feedback (Literal['explicit', 'implicit']) – The type of input data to use (explicit or implicit). This affects data pre-processing and aggregation.
block_size (int)
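To make the roles of nnbrs, min_nbrs, and min_sim concrete, here is a minimal NumPy sketch of neighbor selection for a single target item. It is an illustration only and not LensKit's implementation: `clamp_min_sim` and `select_neighbors` are hypothetical helpers, the similarity values are made up, and treating the threshold as inclusive is an assumption.

```python
import numpy as np

# Smallest positive normal 32-bit float, about 1.175e-38.
FLOAT32_TINY = float(np.finfo(np.float32).tiny)


def clamp_min_sim(min_sim):
    """Hypothetical helper: raise a too-small threshold to the float32 floor."""
    if min_sim <= 0:
        raise ValueError("min_sim must be positive")
    return max(min_sim, FLOAT32_TINY)


def select_neighbors(sims, nnbrs, min_nbrs=1, min_sim=1e-6):
    """Hypothetical helper: pick the neighbors used to score one item.

    sims holds the similarity between the target item and each item the
    user has interacted with. Returns neighbor indices, most similar
    first, or None when fewer than min_nbrs neighbors qualify.
    """
    min_sim = clamp_min_sim(min_sim)
    # Assumption: a neighbor qualifies when its similarity is >= min_sim.
    candidates = np.flatnonzero(sims >= min_sim)
    # Order the qualifying neighbors by descending similarity.
    order = candidates[np.argsort(-sims[candidates], kind="stable")]
    # nnbrs=None means no upper limit on the neighborhood size.
    if nnbrs is not None:
        order = order[:nnbrs]
    if len(order) < min_nbrs:
        return None
    return order


sims = np.array([0.9, 0.0, 0.4, 0.7, 1e-9])
top2 = select_neighbors(sims, nnbrs=2)
```

With these made-up similarities, the entries `0.0` and `1e-9` fall below the default threshold, so only three candidates remain before the nnbrs limit is applied.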
- items_: Vocabulary
Vocabulary of item IDs.
- item_means_: torch.Tensor | None
Mean rating for each known item.
- item_counts_: torch.Tensor
Number of saved neighbors for each item.
- sim_matrix_: torch.Tensor
Similarity matrix (sparse CSR tensor).
- users_: Vocabulary
Vocabulary of user IDs.
- rating_matrix_: torch.Tensor
Normalized rating matrix to look up user ratings at prediction time.
- class lenskit.knn.UserKNNScorer(nnbrs, min_nbrs=1, min_sim=1e-06, feedback='explicit')
User-user nearest-neighbor collaborative filtering with ratings. This user-user implementation is not terribly configurable; it hard-codes design decisions found to work well in the previous Java-based LensKit code.
- Parameters:
nnbrs (int) – The maximum number of neighbors for scoring each item (None for unlimited).
min_nbrs (int) – The minimum number of neighbors for scoring each item.
min_sim (float) – Minimum similarity threshold for considering a neighbor. Must be positive; if it is less than the smallest 32-bit normal floating-point value (\(1.175 \times 10^{-38}\)), it is clamped to that value.
feedback (Literal['explicit', 'implicit']) – Controls how feedback is interpreted. This specifies defaults for the other settings, which can be overridden individually. It can be one of the following values:
explicit
Configure for explicit-feedback mode: use rating values, and predict using weighted averages. This is the default setting.
implicit
Configure for implicit-feedback mode: ignore rating values, and predict using the sums of similarities.
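The difference between the two feedback modes can be sketched in a few lines of NumPy. This is a hedged illustration of the aggregation rules described above, not LensKit's code: `aggregate` is a hypothetical helper, the numbers are made up, and the sketch leaves out any mean-centering of ratings.

```python
import numpy as np


def aggregate(sims, ratings, feedback="explicit"):
    """Hypothetical helper: combine neighbor data into a single score."""
    sims = np.asarray(sims, dtype=float)
    if feedback == "explicit":
        # Similarity-weighted average of the neighbors' rating values.
        ratings = np.asarray(ratings, dtype=float)
        return float(np.sum(sims * ratings) / np.sum(sims))
    if feedback == "implicit":
        # Rating values are ignored; the score is the sum of similarities.
        return float(np.sum(sims))
    raise ValueError(f"unknown feedback mode: {feedback!r}")


sims = [0.5, 0.25, 0.25]
ratings = [4.0, 2.0, 5.0]
explicit_score = aggregate(sims, ratings, "explicit")
implicit_score = aggregate(sims, ratings, "implicit")
```

In explicit mode the result is on the rating scale, while in implicit mode it is an unbounded score that is only meaningful for ranking items against each other.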
- users_: Vocabulary
The index of user IDs.
- items_: Vocabulary
The index of item IDs.
- user_means_: torch.Tensor | None
Mean rating for each known user.
- user_vectors_: torch.Tensor
Normalized rating matrix (CSR) to find neighbors at prediction time.
- user_ratings_: csr_array
Centered but un-normalized rating matrix (COO) to find neighbor ratings.
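The distinction between a centered matrix and a normalized one can be shown with a small dense example. This sketch assumes explicit feedback, per-user mean-centering, and unit L2 normalization of each user's row; those choices are assumptions made for illustration, and the variable names are not LensKit attributes. The real matrices are sparse, while this toy version uses dense arrays with NaN for unrated items.

```python
import numpy as np

# Toy dense ratings: rows are users, columns are items, NaN means unrated.
ratings = np.array([
    [5.0, 3.0, np.nan],
    [4.0, np.nan, 2.0],
])
rated = ~np.isnan(ratings)

# Per-user mean over rated items only.
user_means = np.nanmean(ratings, axis=1)

# Centered but un-normalized: subtract each user's mean, zero where unrated.
centered = np.where(rated, ratings - user_means[:, None], 0.0)

# Normalized: scale each centered row to unit L2 norm (all-zero rows stay zero).
norms = np.linalg.norm(centered, axis=1, keepdims=True)
normalized = centered / np.where(norms > 0, norms, 1.0)

# With unit-length rows, cosine similarity between users is a dot product.
user_sims = normalized @ normalized.T
```

Keeping both forms is useful because the normalized rows make neighbor search a matrix product, while the centered rows preserve the rating offsets that get averaged once the neighbors are known.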
- train(data)
“Train” a user-user CF model. This memorizes the rating data in a format that is usable for future computations.
- Parameters:
data (Dataset) – The (user, item, rating) data for collaborative filtering.
- Return type:
Modules