Basic and Utility Algorithms¶

The lenskit.algorithms.basic module contains baseline and utility algorithms for nonpersonalized recommendation and testing.

Personalized Mean Rating Prediction¶

class lenskit.algorithms.basic.Bias(items=True, users=True, damping=0.0)¶

Bases: lenskit.algorithms.Predictor

A user-item bias rating prediction algorithm. This implements the following predictor algorithm:

\[s(u,i) = \mu + b_i + b_u\]

where \(\mu\) is the global mean rating, \(b_i\) is item bias, and \(b_u\) is the user bias. With the provided damping values \(\beta_{\mathrm{u}}\) and \(\beta_{\mathrm{i}}\), they are computed as follows:

\[\begin{align*} \mu & = \frac{\sum_{r_{ui} \in R} r_{ui}}{|R|} \\ b_i & = \frac{\sum_{r_{ui} \in R_i} (r_{ui} - \mu)}{|R_i| + \beta_{\mathrm{i}}} \\ b_u & = \frac{\sum_{r_{ui} \in R_u} (r_{ui} - \mu - b_i)}{|R_u| + \beta_{\mathrm{u}}} \end{align*}\]

The damping values can be interpreted as the number of default (mean) ratings to assume a priori for each user or item, damping low-information users and items towards a mean instead of permitting them to take on extreme values based on few ratings.
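The formulas above can be sketched in plain Python. This is a minimal illustration of the arithmetic, not LensKit's implementation; the function name `fit_bias`, its arguments, and the toy ratings are assumptions made for the example.

```python
from collections import defaultdict


def fit_bias(ratings, user_damping=0.0, item_damping=0.0):
    """Compute (mu, b_i, b_u) from a list of (user, item, rating) triples.

    Hypothetical helper for illustration; it mirrors the formulas above.
    """
    # Global mean: the sum of all ratings divided by |R|.
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Item bias: damped mean of (r_ui - mu) over each item's ratings.
    item_sum, item_n = defaultdict(float), defaultdict(int)
    for _, i, r in ratings:
        item_sum[i] += r - mu
        item_n[i] += 1
    b_i = {i: item_sum[i] / (item_n[i] + item_damping) for i in item_sum}

    # User bias: damped mean of (r_ui - mu - b_i) over each user's ratings.
    user_sum, user_n = defaultdict(float), defaultdict(int)
    for u, i, r in ratings:
        user_sum[u] += r - mu - b_i[i]
        user_n[u] += 1
    b_u = {u: user_sum[u] / (user_n[u] + user_damping) for u in user_sum}
    return mu, b_i, b_u


ratings = [("a", "x", 5.0), ("a", "y", 3.0), ("b", "x", 4.0), ("b", "z", 2.0)]
mu, b_i, b_u = fit_bias(ratings)
# Same data with damping: each bias behaves as if 5 extra mean ratings were seen.
mu_d, b_i_d, b_u_d = fit_bias(ratings, user_damping=5.0, item_damping=5.0)
```

In the damped fit, item `z` has a single rating, so its bias shrinks from -1.5 to -1.5 / 6 = -0.25, which is the pull towards the mean described above.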

Parameters

items – whether to compute item biases
users – whether to compute user biases
damping (number or tuple) – Bayesian damping to apply to computed biases. Either a number, to damp both user and item biases the same amount, or a (user,item) tuple providing separate damping values.

mean_

The global mean rating.

Type: double

item_offsets_

The item offsets (\(b_i\) values)

Type: pandas.Series

user_offsets_

The user offsets (\(b_u\) values)

Type: pandas.Series

fit(ratings, **kwargs)¶

Train the bias model on some rating data.

Parameters: ratings (DataFrame) – a data frame of ratings. Must have at least user, item, and rating columns.
Returns: the fit bias object.
Return type: Bias

predict_for_user(user, items, ratings=None)¶

Compute predictions for a user and items. Unknown users and items are assumed to have zero bias.

Parameters

user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series
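The prediction rule can be sketched as follows. This is a simplified stand-in that uses plain dictionaries instead of pandas objects; the `model` layout and the `user_damping` argument are assumptions for the example, and the way the user bias is recomputed from `ratings` simply reuses the \(b_u\) formula above rather than reproducing LensKit's code.

```python
def predict_for_user(model, user, items, ratings=None, user_damping=0.0):
    """Score items as mu + b_i + b_u; unknown users and items get zero bias."""
    mu = model["mean"]
    b_i = model["item_bias"]
    b_u = model["user_bias"]
    if ratings:
        # Recompute the user's bias from the supplied {item: rating} mapping.
        resid = sum(r - mu - b_i.get(i, 0.0) for i, r in ratings.items())
        bu = resid / (len(ratings) + user_damping)
    else:
        # Fall back to the stored bias, or zero for an unknown user.
        bu = b_u.get(user, 0.0)
    return {i: mu + b_i.get(i, 0.0) + bu for i in items}


# Hypothetical fitted values, chosen only to make the arithmetic easy to follow.
model = {
    "mean": 3.5,
    "item_bias": {"x": 1.0, "y": -0.5},
    "user_bias": {"a": 0.25},
}
known = predict_for_user(model, "a", ["x", "q"])      # "q" is an unknown item
unknown = predict_for_user(model, "nobody", ["y"])    # unknown user
fresh = predict_for_user(model, "nobody", ["x"], ratings={"y": 4.0})
```

For `fresh`, the supplied rating gives a residual of 4.0 - 3.5 - (-0.5) = 1.0, so the recomputed user bias is 1.0 and item `x` scores 3.5 + 1.0 + 1.0 = 5.5.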

Most Popular Item Recommendation¶

The Popular algorithm implements most-popular-item recommendation.

class lenskit.algorithms.basic.Popular(selector=None)¶

Bases: lenskit.algorithms.Recommender

Recommend the most popular items.

Parameters: selector (CandidateSelector) – The candidate selector to use. If None, uses a new UnratedItemCandidateSelector.

item_pop_

Item rating counts (popularity)

Type: pandas.Series

fit(ratings, **kwargs)¶

Train a model using the specified ratings (or similar) data.

Parameters

ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.

Returns

The algorithm object.

recommend(user, n=None, candidates=None, ratings=None)¶

Compute recommendations for a user.

Parameters

user – the user ID
n (int) – the number of recommendations to produce (None for unlimited)
candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, this is their CandidateSelector.
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

a frame with an item column; if the recommender also produces scores, they will be in a score column.

Return type

pandas.DataFrame
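As a rough sketch of the idea behind most-popular recommendation, the following counts ratings per item and returns the most-rated items the user has not rated. The helper names and the tie-breaking rule are assumptions for the example, not LensKit's behavior.

```python
from collections import Counter


def fit_popularity(ratings):
    """Count ratings per item from (user, item) pairs."""
    return Counter(item for _, item in ratings)


def recommend_popular(item_pop, rated, n=None):
    """Return (item, count) pairs for the most-rated items not in `rated`."""
    candidates = [(i, c) for i, c in item_pop.items() if i not in rated]
    # Sort by count, descending; break ties by item id for a stable order.
    candidates.sort(key=lambda ic: (-ic[1], ic[0]))
    return candidates if n is None else candidates[:n]


pairs = [("a", "x"), ("b", "x"), ("c", "x"), ("a", "y"), ("b", "y"), ("c", "z")]
item_pop = fit_popularity(pairs)
recs = recommend_popular(item_pop, rated={"x"}, n=2)
```

Here `x` is the most popular item overall, but it is excluded because the user has already rated it, so `recs` holds `y` and then `z`.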

Random Item Recommendation¶

The Random algorithm implements random-item recommendation.

class lenskit.algorithms.basic.Random(selector=None, rng_spec=None)¶

Bases: lenskit.algorithms.Recommender

A random-item recommender.

selector¶

Selects candidate items for recommendation. Default is UnratedItemCandidateSelector.

Type: CandidateSelector

rng_spec¶

Seed or random state for generating recommendations. Pass 'user' to deterministically derive per-user RNGs from the user IDs for reproducibility.

fit(ratings, **kwargs)¶

Train a model using the specified ratings (or similar) data.

Parameters

ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.

Returns

The algorithm object.

recommend(user, n=None, candidates=None, ratings=None)¶

Compute recommendations for a user.

Parameters

user – the user ID
n (int) – the number of recommendations to produce (None for unlimited)
candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, this is their CandidateSelector.
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

a frame with an item column; if the recommender also produces scores, they will be in a score column.

Return type

pandas.DataFrame
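The idea of deriving a per-user RNG from the user ID can be illustrated with the standard library. The seeding scheme below (a CRC32 of the user ID) is an assumption chosen for the example and is not how LensKit derives its seeds; the function name and `per_user` flag are likewise hypothetical.

```python
import random
import zlib


def recommend_random(user, candidates, n=None, seed=None, per_user=False):
    """Return up to n candidate items in random order."""
    items = sorted(candidates)  # fixed order, so a fixed seed gives fixed output
    if per_user:
        # Illustrative scheme: derive a stable seed from the user ID.
        # hash() is avoided because string hashes vary between processes.
        seed = zlib.crc32(repr(user).encode("utf-8"))
    rng = random.Random(seed)
    k = len(items) if n is None else min(n, len(items))
    return rng.sample(items, k)


cands = ["x", "y", "z", "w", "v"]
first = recommend_random("u1", cands, n=3, per_user=True)
again = recommend_random("u1", cands, n=3, per_user=True)
everything = recommend_random("u1", cands, seed=7)
```

With `per_user=True`, repeated calls for the same user yield the same list, which is the reproducibility property described for `rng_spec='user'`.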

Top-N Recommender¶

The TopN class implements a standard top-N recommender that wraps a Predictor and CandidateSelector and returns the top N candidate items by predicted rating. It is the type of recommender returned by Recommender.adapt() if the provided algorithm is not a recommender.

class lenskit.algorithms.basic.TopN(predictor, selector=None)¶

Bases: lenskit.algorithms.Recommender, lenskit.algorithms.Predictor

Basic recommender that implements top-N recommendation using a predictor.

Note

This class does not do anything of its own in fit(). If its predictor and candidate selector are both fit, the top-N recommender does not need to be fit.

Parameters

predictor (Predictor) – The underlying predictor.
selector (CandidateSelector) – The candidate selector. If None, uses UnratedItemCandidateSelector.

fit(ratings, **kwargs)¶

Fit the recommender.

Parameters

ratings (pandas.DataFrame) – The rating or interaction data. Passed unchanged to the predictor and candidate selector.
kwargs – Additional arguments for the predictor to use in its training process.

predict(pairs, ratings=None)¶

Compute predictions for user-item pairs. This method is designed to be compatible with the general scikit-learn paradigm; applications typically want to use predict_for_user().

Parameters

pairs (pandas.DataFrame) – The user-item pairs, as user and item columns.
ratings (pandas.DataFrame) – user-item rating data to replace memorized data.

Returns

The predicted scores for each user-item pair.

Return type

pandas.Series

predict_for_user(user, items, ratings=None)¶

Compute predictions for a user and items.

Parameters

user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series

recommend(user, n=None, candidates=None, ratings=None)¶

Compute recommendations for a user.

Parameters

user – the user ID
n (int) – the number of recommendations to produce (None for unlimited)
candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, this is their CandidateSelector.
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

a frame with an item column; if the recommender also produces scores, they will be in a score column.

Return type

pandas.DataFrame
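The top-N step itself is small: score the candidates, then keep the highest-scoring ones. The sketch below uses a toy predictor and plain dictionaries. Dropping items the predictor cannot score (NaN) is an assumption of this example, and `top_n` and `toy_predict` are hypothetical names.

```python
import math


def top_n(predict, user, candidates, n=None):
    """Rank candidates by predicted score and return the top n (item, score) pairs."""
    scores = predict(user, candidates)
    # Assumption: items without a usable score are left out of the ranking.
    scored = [(i, s) for i, s in scores.items() if not math.isnan(s)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if n is None else scored[:n]


def toy_predict(user, items):
    """Stand-in predictor: fixed scores, NaN for items it cannot score."""
    table = {"x": 4.5, "y": 3.0, "w": 4.9}
    return {i: table.get(i, float("nan")) for i in items}


recs = top_n(toy_predict, "a", ["x", "y", "z", "w"], n=2)
```

Item `z` has no score and is skipped, so `recs` contains `w` (4.9) followed by `x` (4.5).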

Unrated Item Candidate Selector¶

UnratedItemCandidateSelector is a candidate selector that remembers items users have rated, and returns a candidate set consisting of all unrated items. It is the default candidate selector for TopN.

class lenskit.algorithms.basic.UnratedItemCandidateSelector¶

Bases: lenskit.algorithms.CandidateSelector

CandidateSelector that selects items a user has not rated as candidates. When this selector is fit, it memorizes the rated items.

items_

All known items.

Type: pandas.Index

users_

All known users.

Type: pandas.Index

user_items_

Items rated by each known user, as positions in the items index.

Type: CSR

candidates(user, ratings=None)¶

Select candidates for the user.

Parameters

user – The user key or ID.
ratings (pandas.Series or array-like) – Ratings or items to use instead of whatever ratings were memorized for this user. If a pandas.Series, the series index is used; if it is another array-like it is assumed to be an array of items.

fit(ratings, **kwargs)¶

Train a model using the specified ratings (or similar) data.

Parameters

ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.

Returns

The algorithm object.
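The selector's behavior can be sketched with sets. This simplified class stores rated items per user in a dictionary rather than a CSR matrix; the class name `UnratedSelector` is an assumption for the example. A dict of ratings stands in for a pandas.Series (its keys play the role of the index), and a list stands in for an array of items.

```python
class UnratedSelector:
    """Illustrative stand-in: remembers rated items, returns the unrated ones."""

    def fit(self, ratings):
        # ratings: iterable of (user, item) pairs.
        ratings = list(ratings)
        self.items_ = sorted({i for _, i in ratings})
        self.user_items_ = {}
        for u, i in ratings:
            self.user_items_.setdefault(u, set()).add(i)
        return self

    def candidates(self, user, ratings=None):
        if ratings is None:
            # Use the memorized items; unknown users have rated nothing.
            rated = self.user_items_.get(user, set())
        else:
            # Iterating a dict yields its keys; iterating a list yields items.
            rated = set(ratings)
        return [i for i in self.items_ if i not in rated]


sel = UnratedSelector().fit([("a", "x"), ("a", "y"), ("b", "z")])
memorized = sel.candidates("a")
overridden = sel.candidates("a", ratings={"z": 4.0})
from_list = sel.candidates("a", ratings=["x", "z"])
newcomer = sel.candidates("nobody")
```

Supplying `ratings` replaces the memorized items for that call only, so `overridden` excludes `z` but includes `x` and `y` again.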

Fallback Predictor¶

The Fallback rating predictor is a simple hybrid that takes a list of component algorithms and, for each item, predicts the rating with the first component that returns a result.

A common case is to fill in with Bias when a primary predictor cannot score an item.

class lenskit.algorithms.basic.Fallback(algorithms, *others)¶

Bases: lenskit.algorithms.Predictor

The Fallback algorithm predicts with its first component, uses the second to fill in missing values, and so forth.

fit(ratings, **kwargs)¶

Train a model using the specified ratings (or similar) data.

Parameters

ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.

Returns

The algorithm object.

predict_for_user(user, items, ratings=None)¶

Compute predictions for a user and items.

Parameters

user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series
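The fill-in behavior can be sketched as a loop over component predictors, where each later component is asked only about items that are still missing. This is an illustration with toy predictors and dictionaries, in which NaN marks a missing score; the function names are assumptions for the example.

```python
import math


def fallback_predict(predictors, user, items):
    """Score items with the first predictor that returns a non-NaN value."""
    result = {i: float("nan") for i in items}
    for predict in predictors:
        missing = [i for i in items if math.isnan(result[i])]
        if not missing:
            break  # everything is scored; later components are not consulted
        for i, s in predict(user, missing).items():
            if not math.isnan(s):
                result[i] = s
    return result


def primary(user, items):
    """Toy primary predictor that can only score item "x"."""
    known = {"x": 4.8}
    return {i: known.get(i, float("nan")) for i in items}


def baseline(user, items):
    """Toy baseline that scores every item, standing in for a bias model."""
    return {i: 3.5 for i in items}


scores = fallback_predict([primary, baseline], "a", ["x", "y"])
```

Item `x` keeps the primary predictor's score, while `y` is filled in by the baseline.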

Memorized Predictor¶

The Memorized predictor is primarily useful for test cases. It memorizes a set of rating predictions and returns them.

class lenskit.algorithms.basic.Memorized(scores)¶

Bases: lenskit.algorithms.Predictor

The memorized algorithm memorizes scores provided at construction time.

fit(*args, **kwargs)¶

Train a model using the specified ratings (or similar) data.

Parameters

ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.

Returns

The algorithm object.

predict_for_user(user, items, ratings=None)¶

Compute predictions for a user and items.

Parameters

user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series
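A memorizing predictor amounts to a lookup table. The sketch below assumes the scores arrive as (user, item, score) triples and returns NaN for pairs that were not memorized; both choices, and the class name `MemorizedScores`, are assumptions for the example rather than a description of LensKit's data layout.

```python
import math


class MemorizedScores:
    """Illustrative stand-in: returns scores supplied at construction time."""

    def __init__(self, scores):
        # scores: iterable of (user, item, score) triples.
        self.scores = {(u, i): s for u, i, s in scores}

    def fit(self, *args, **kwargs):
        # Nothing to train; the scores were given to the constructor.
        return self

    def predict_for_user(self, user, items, ratings=None):
        # Assumption: pairs that were not memorized get NaN.
        return {i: self.scores.get((user, i), float("nan")) for i in items}


algo = MemorizedScores([("a", "x", 4.0), ("a", "y", 2.5), ("b", "x", 3.0)]).fit()
preds = algo.predict_for_user("a", ["x", "y", "z"])
```

This makes test expectations trivial to state: `preds` holds exactly the memorized values for `x` and `y`, and a missing value for `z`.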