Basic and Utility Algorithms

The lenskit.algorithms.basic module contains baseline and utility algorithms for nonpersonalized recommendation and testing.

Personalized Mean Rating Prediction

class lenskit.algorithms.basic.Bias(items=True, users=True, damping=0.0)

Bases: lenskit.algorithms.Predictor

A user-item bias rating prediction algorithm. This implements the following predictor algorithm:

\[s(u,i) = \mu + b_i + b_u\]

where \(\mu\) is the global mean rating, \(b_i\) is item bias, and \(b_u\) is the user bias. With the provided damping values \(\beta_{\mathrm{u}}\) and \(\beta_{\mathrm{i}}\), they are computed as follows:

\[\begin{align*} \mu & = \frac{\sum_{r_{ui} \in R} r_{ui}}{|R|} & b_i & = \frac{\sum_{r_{ui} \in R_i} (r_{ui} - \mu)}{|R_i| + \beta_{\mathrm{i}}} & b_u & = \frac{\sum_{r_{ui} \in R_u} (r_{ui} - \mu - b_i)}{|R_u| + \beta_{\mathrm{u}}} \end{align*}\]

The damping values can be interpreted as the number of default (mean) ratings to assume a priori for each user or item, damping low-information users and items towards a mean instead of permitting them to take on extreme values based on few ratings.
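The formulas above can be worked through by hand on a tiny data set. The following is a minimal pure-Python sketch, not the library's implementation; the function name `fit_bias` and the list-of-triples input are assumptions made for illustration.

```python
from collections import defaultdict


def fit_bias(ratings, user_damping=0.0, item_damping=0.0):
    """Compute (mu, item_bias, user_bias) from (user, item, rating) triples."""
    # Global mean rating.
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Item biases: damped mean of (rating - mu) over each item's ratings.
    item_sum, item_n = defaultdict(float), defaultdict(int)
    for _, i, r in ratings:
        item_sum[i] += r - mu
        item_n[i] += 1
    item_bias = {i: item_sum[i] / (item_n[i] + item_damping) for i in item_sum}

    # User biases: damped mean of the residual after removing mu and b_i.
    user_sum, user_n = defaultdict(float), defaultdict(int)
    for u, i, r in ratings:
        user_sum[u] += r - mu - item_bias[i]
        user_n[u] += 1
    user_bias = {u: user_sum[u] / (user_n[u] + user_damping) for u in user_sum}
    return mu, item_bias, user_bias


ratings = [("a", "x", 5.0), ("a", "y", 3.0), ("b", "x", 4.0), ("b", "z", 2.0)]
mu, b_i, b_u = fit_bias(ratings)
# Damping the item biases shrinks them towards zero.
_, b_i_damped, _ = fit_bias(ratings, item_damping=2.0)
```

With no damping, item `z` (one rating) gets the full deviation of that rating as its bias; with an item damping of 2 the same deviation is divided by 3 instead of 1.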

Parameters
  • items – whether to compute item biases

  • users – whether to compute user biases

  • damping (number or tuple) – Bayesian damping to apply to computed biases. Either a number, to damp both user and item biases the same amount, or a (user,item) tuple providing separate damping values.
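As an illustration of the two accepted forms of damping, the helper below normalizes either form to a (user, item) pair. The name `split_damping` is hypothetical and not part of the library.

```python
def split_damping(damping):
    """Return (user_damping, item_damping) from a number or a (user, item) tuple."""
    if isinstance(damping, tuple):
        # Separate values, given in (user, item) order.
        user_damping, item_damping = damping
    else:
        # A single number damps both kinds of bias by the same amount.
        user_damping = item_damping = damping
    return user_damping, item_damping


same = split_damping(5.0)
separate = split_damping((2, 10))
```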

mean_

The global mean rating.

Type

double

item_offsets_

The item offsets (\(b_i\) values)

Type

pandas.Series

user_offsets_

The user offsets (\(b_u\) values)

Type

pandas.Series

fit(data)

Train the bias model on some rating data.

Parameters

data (DataFrame) – a data frame of ratings. Must have at least user, item, and rating columns.

Returns

the fit bias object.

Return type

Bias

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items. Unknown users and items are assumed to have zero bias.

Parameters
  • user – the user ID

  • items (array-like) – the items to predict

  • ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series
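The prediction rule can be sketched in plain Python as follows. This is a sketch under assumptions: dictionaries stand in for the pandas.Series attributes, the function name `predict_bias` is hypothetical, and the way the user bias is recomputed from supplied ratings (a damped mean of the residuals) follows the \(b_u\) formula above rather than the library's source.

```python
def predict_bias(mu, item_bias, user_bias, user, items,
                 ratings=None, user_damping=0.0):
    """Score items as mu + b_i + b_u, using zero for unknown biases."""
    if ratings:
        # Recompute the user bias from the ratings supplied at prediction time.
        resid = sum(r - mu - item_bias.get(i, 0.0) for i, r in ratings.items())
        b_u = resid / (len(ratings) + user_damping)
    else:
        # Unknown users fall back to a zero bias.
        b_u = user_bias.get(user, 0.0)
    # Unknown items also fall back to a zero bias.
    return {i: mu + item_bias.get(i, 0.0) + b_u for i in items}


item_bias = {"x": 1.0, "y": -0.5}
user_bias = {"a": 0.25}
known = predict_bias(3.5, item_bias, user_bias, "a", ["x", "q"])
unknown_user = predict_bias(3.5, item_bias, user_bias, "zz", ["y"])
with_ratings = predict_bias(3.5, item_bias, user_bias, "zz", ["y"],
                            ratings={"x": 5.0})
```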

Top-N Recommender

The TopN class implements a standard top-N recommender that wraps a Predictor and CandidateSelector and returns the top N candidate items by predicted rating. It is the type of recommender returned by Recommender.adapt() if the provided algorithm is not a recommender.

class lenskit.algorithms.basic.TopN(predictor, selector=None)

Bases: lenskit.algorithms.Recommender, lenskit.algorithms.Predictor

Basic recommender that implements top-N recommendation using a predictor.

Note

This class does not do anything of its own in fit(). If its predictor and candidate selector are both fit, the top-N recommender does not need to be fit.

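Parameters
  • predictor (Predictor) – the underlying predictor.

  • selector (CandidateSelector) – the candidate selector to use. If None, UnratedItemCandidateSelector is used.

fit(ratings, *args, **kwargs)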

Fit the recommender.

Parameters
  • ratings (pandas.DataFrame) – The rating or interaction data. Passed unchanged to the predictor and candidate selector.

  • args, kwargs – Additional arguments for the predictor to use in its training process.

predict(pairs, ratings=None)

Compute predictions for user-item pairs. This method is designed to be compatible with the general scikit-learn paradigm; applications typically want to use predict_for_user().

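Parameters
  • pairs (pandas.DataFrame) – the user-item pairs to score, as user and item columns.

  • ratings (pandas.DataFrame) – user-item rating data to use in place of memorized data.

Returns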

The predicted scores for each user-item pair.

Return type

pandas.Series
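One way to relate pair-wise prediction to predict_for_user() is to group the pairs by user, score each user's items once, and read the results back in the original pair order. The sketch below assumes that relationship for illustration; `predict_pairs` and `toy_predict_for_user` are hypothetical names, and plain lists and dicts stand in for pandas objects.

```python
def predict_pairs(predict_for_user, pairs):
    """Score (user, item) pairs by grouping them per user."""
    by_user = {}
    for user, item in pairs:
        by_user.setdefault(user, []).append(item)
    # One call per user; each call returns a dict of item -> score.
    user_scores = {u: predict_for_user(u, items) for u, items in by_user.items()}
    # Return scores aligned with the input pairs (None if an item was not scored).
    return [user_scores[u].get(i) for u, i in pairs]


def toy_predict_for_user(user, items):
    # Toy scorer: a per-user base score plus a bonus for item "x".
    base = {"a": 4.0, "b": 3.0}[user]
    return {i: base + (0.5 if i == "x" else 0.0) for i in items}


pairs = [("a", "x"), ("b", "x"), ("a", "y")]
pair_scores = predict_pairs(toy_predict_for_user, pairs)
```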

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items.

Parameters
  • user – the user ID

  • items (array-like) – the items to predict

  • ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series

recommend(user, n=None, candidates=None, ratings=None)

Compute recommendations for a user.

Parameters
  • user – the user ID

  • n (int) – the number of recommendations to produce (None for unlimited)

  • candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, the default set comes from their CandidateSelector.

  • ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

a frame with an item column; if the recommender also produces scores, they will be in a score column.

Return type

pandas.DataFrame
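The top-N step itself is small: score the candidates, order them by score, and keep the first n. The following sketch assumes that items the predictor cannot score are dropped; `recommend_top_n` and `toy_scores` are hypothetical names, and a list of (item, score) tuples stands in for the returned data frame.

```python
def recommend_top_n(score_items, candidates, n=None):
    """Return (item, score) pairs for the n highest-scoring candidates."""
    scores = score_items(candidates)
    # Assumption: candidates without a score are not recommended.
    scored = [(i, s) for i, s in scores.items() if s is not None]
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    # n=None means unlimited.
    return ranked if n is None else ranked[:n]


def toy_scores(items):
    # Toy predictor that cannot score item "w".
    table = {"x": 4.0, "y": 4.5, "z": 2.0}
    return {i: table.get(i) for i in items}


top2 = recommend_top_n(toy_scores, ["x", "y", "z", "w"], n=2)
everything = recommend_top_n(toy_scores, ["x", "y", "z", "w"])
```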

Unrated Item Candidate Selector

UnratedItemCandidateSelector is a candidate selector that remembers items users have rated, and returns a candidate set consisting of all unrated items. It is the default candidate selector for TopN.

class lenskit.algorithms.basic.UnratedItemCandidateSelector

Bases: lenskit.algorithms.CandidateSelector

CandidateSelector that selects items a user has not rated as candidates. When this selector is fit, it memorizes the rated items.

items_

All known items.

Type

pandas.Index

users_

All known users.

Type

pandas.Index

user_items_

Items rated by each known user, as positions in the items index.

Type

CSR

candidates(user, ratings=None)

Select candidates for the user.

Parameters
  • user – The user key or ID.

  • ratings (pandas.Series or array-like) – Ratings or items to use instead of whatever ratings were memorized for this user. If a pandas.Series, the series index is used; if it is another array-like it is assumed to be an array of items.
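The selection logic can be illustrated without the library. In this sketch a dict of sets stands in for the memorized user_items_ matrix and a dict stands in for a pandas.Series (its keys play the role of the series index); the function name `unrated_candidates` is hypothetical.

```python
def unrated_candidates(all_items, user_items, user, ratings=None):
    """Return the known items that the user has not rated."""
    if ratings is None:
        # Use the memorized items; unknown users have rated nothing.
        rated = user_items.get(user, set())
    elif hasattr(ratings, "keys"):
        # Mapping-like input: the keys are the rated items.
        rated = set(ratings.keys())
    else:
        # Any other array-like is taken to be a collection of items.
        rated = set(ratings)
    return [i for i in all_items if i not in rated]


all_items = ["x", "y", "z"]
user_items = {"a": {"x"}}
memorized = unrated_candidates(all_items, user_items, "a")
newcomer = unrated_candidates(all_items, user_items, "b")
override_map = unrated_candidates(all_items, user_items, "a", ratings={"y": 4.0})
override_list = unrated_candidates(all_items, user_items, "a", ratings=["x", "y"])
```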

fit(ratings)

Train a model using the specified ratings (or similar) data.

Parameters
  • ratings (pandas.DataFrame) – The ratings data.

  • args – Additional training data the algorithm may require.

  • kwargs – Additional training data the algorithm may require.

Returns

The algorithm object.

Fallback Predictor

The Fallback rating predictor is a simple hybrid that takes a list of component algorithms and, for each item, predicts the rating with the first one that returns a result.

A common case is to fill in with Bias when a primary predictor cannot score an item.

class lenskit.algorithms.basic.Fallback(algorithms, *others)

Bases: lenskit.algorithms.Predictor

The Fallback algorithm predicts with its first component, uses the second to fill in missing values, and so forth.
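The fill-in behaviour can be sketched as a loop over the components that only asks later components about items still missing a score. This is an illustration under assumptions, not the library's code: `fallback_predict`, `primary`, and `baseline` are hypothetical names, and each component is modelled as a function returning a dict of item to score.

```python
def fallback_predict(predictors, user, items):
    """Score items with the first predictor that returns a value for each."""
    remaining = list(items)
    result = {}
    for predict in predictors:
        if not remaining:
            break
        scores = predict(user, remaining)
        for item in remaining:
            score = scores.get(item)
            if score is not None:
                result[item] = score
        # Only items without a score are passed to the next component.
        remaining = [i for i in remaining if i not in result]
    return result


def primary(user, items):
    # Toy primary predictor that only knows item "x".
    known = {"x": 4.8}
    return {i: known[i] for i in items if i in known}


def baseline(user, items):
    # Toy baseline that can score anything.
    return {i: 3.5 for i in items}


filled = fallback_predict([primary, baseline], "a", ["x", "y"])
```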

fit(ratings, *args, **kwargs)

Train a model using the specified ratings (or similar) data.

Parameters
  • ratings (pandas.DataFrame) – The ratings data.

  • args – Additional training data the algorithm may require.

  • kwargs – Additional training data the algorithm may require.

Returns

The algorithm object.

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items.

Parameters
  • user – the user ID

  • items (array-like) – the items to predict

  • ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series

Memorized Predictor

The Memorized predictor is primarily useful for test cases. It memorizes a set of rating predictions and returns them.

class lenskit.algorithms.basic.Memorized(scores)

Bases: lenskit.algorithms.Predictor

The memorized algorithm memorizes scores provided at construction time.
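A stand-in with the same shape is easy to write for tests. The layout of the stored scores is not specified here, so this sketch assumes a dict keyed by (user, item); the class name `MemorizedScores` is hypothetical.

```python
class MemorizedScores:
    """Hypothetical stand-in that returns scores stored at construction time."""

    def __init__(self, scores):
        # Assumed layout: dict mapping (user, item) -> score.
        self.scores = dict(scores)

    def fit(self, *args, **kwargs):
        # Nothing to learn; the scores are already stored.
        return self

    def predict_for_user(self, user, items, ratings=None):
        # Items without a stored score map to None.
        return {i: self.scores.get((user, i)) for i in items}


model = MemorizedScores({("a", "x"): 4.0}).fit()
looked_up = model.predict_for_user("a", ["x", "y"])
```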

fit(*args, **kwargs)

Train a model using the specified ratings (or similar) data.

Parameters
  • ratings (pandas.DataFrame) – The ratings data.

  • args – Additional training data the algorithm may require.

  • kwargs – Additional training data the algorithm may require.

Returns

The algorithm object.

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items.

Parameters
  • user – the user ID

  • items (array-like) – the items to predict

  • ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.

Returns

scores for the items, indexed by item id.

Return type

pandas.Series