Basic and Utility Algorithms

The lenskit.algorithms.basic module contains baseline and utility algorithms for nonpersonalized recommendation and testing.

Personalized Mean Rating Prediction

class lenskit.algorithms.basic.Bias(items=True, users=True, damping=0.0)

Bases: lenskit.algorithms.Predictor, lenskit.algorithms.Trainable

A user-item bias rating prediction algorithm. This implements the following predictor algorithm:

\[s(u,i) = \mu + b_i + b_u\]

where \(\mu\) is the global mean rating, \(b_i\) is the item bias, and \(b_u\) is the user bias. Given the user and item damping values \(\beta_{\mathrm{u}}\) and \(\beta_{\mathrm{i}}\), they are computed as follows:

\[\begin{align*}
\mu & = \frac{\sum_{r_{ui} \in R} r_{ui}}{|R|} \\
b_i & = \frac{\sum_{r_{ui} \in R_i} (r_{ui} - \mu)}{|R_i| + \beta_{\mathrm{i}}} \\
b_u & = \frac{\sum_{r_{ui} \in R_u} (r_{ui} - \mu - b_i)}{|R_u| + \beta_{\mathrm{u}}}
\end{align*}\]

The damping values can be interpreted as the number of default (mean) ratings to assume a priori for each user or item, damping low-information users and items towards a mean instead of permitting them to take on extreme values based on few ratings.
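The formulas above can be sketched directly in pandas. This is an illustrative reimplementation, not the LensKit source: the function name `damped_biases`, its parameter names, and the toy ratings are all assumptions made for the example.

```python
import pandas as pd


def damped_biases(ratings, damping_user=0.0, damping_item=0.0):
    """Return (mu, b_i, b_u) for a frame with user, item, rating columns.

    Hypothetical helper that mirrors the formulas above; it is not part
    of lenskit.algorithms.basic.
    """
    # Global mean rating.
    mu = ratings["rating"].mean()

    # Item bias: sum of (r_ui - mu) over the item's ratings,
    # divided by the rating count plus the item damping.
    centered = ratings["rating"] - mu
    by_item = centered.groupby(ratings["item"])
    b_i = by_item.sum() / (by_item.count() + damping_item)

    # User bias: sum of (r_ui - mu - b_i) over the user's ratings,
    # divided by the rating count plus the user damping.
    residual = centered - ratings["item"].map(b_i)
    by_user = residual.groupby(ratings["user"])
    b_u = by_user.sum() / (by_user.count() + damping_user)

    return mu, b_i, b_u


# Toy data, invented for illustration.
ratings = pd.DataFrame({
    "user": ["a", "a", "b", "b", "c"],
    "item": [1, 2, 1, 3, 3],
    "rating": [4.0, 2.0, 5.0, 3.0, 1.0],
})
mu, b_i, b_u = damped_biases(ratings, damping_user=1.0, damping_item=1.0)
```

With a damping of 1, item 2 (rated once, one point below the mean) gets a bias of -0.5 rather than -1.0, which shows how damping pulls sparsely rated items toward the global mean.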

Parameters:
	items – whether to compute item biases
	users – whether to compute user biases
	damping (number or tuple) – Bayesian damping to apply to computed biases. Either a number, to damp both user and item biases the same amount, or a (user, item) tuple providing separate damping values.

predict(model, user, items, ratings=None)

Compute predictions for a user and items. Unknown users and items are assumed to have zero bias.

Parameters:
	model (BiasModel) – the trained model to use.
	user – the user ID
	items (array-like) – the items to predict
	ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.
Returns:	scores for the items, indexed by item id.
Return type:	pandas.Series
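The prediction rule, including the zero-bias treatment of unknown users and items, might look like the following sketch. The function `predict_bias` and its arguments are hypothetical, and the way the user bias is recomputed from supplied ratings (the same damped formula used in training) is an assumption rather than something this page specifies.

```python
import pandas as pd


def predict_bias(mu, item_bias, user_bias, user, items,
                 ratings=None, user_damping=0.0):
    """Score items as mu + b_i + b_u (illustrative sketch only)."""
    items = pd.Index(items)

    # Items missing from the trained biases get a bias of zero.
    b_i = item_bias.reindex(items, fill_value=0.0)

    if ratings is not None and len(ratings) > 0:
        # Assumed behavior: recompute the user bias from the supplied
        # ratings with the damped-mean formula used for training.
        known = item_bias.reindex(ratings.index, fill_value=0.0)
        residual = ratings - mu - known
        b_u = residual.sum() / (len(residual) + user_damping)
    else:
        # Users missing from the trained biases get a bias of zero.
        b_u = user_bias.get(user, 0.0)

    return mu + b_i + b_u


# Toy model values, invented for illustration.
item_bias = pd.Series({1: 1.0, 2: -0.5})
user_bias = pd.Series({"a": 0.25})

known_user = predict_bias(3.0, item_bias, user_bias, "a", [1, 2, 99])
unknown_user = predict_bias(3.0, item_bias, user_bias, "z", [1, 2, 99])
fresh_ratings = predict_bias(3.0, item_bias, user_bias, "z", [1, 2, 99],
                             ratings=pd.Series({1: 5.0}))
```

Item 99 is not in the model, so its score is just the global mean plus whatever user bias applies.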

train(data)

Train the bias model on some rating data.

Parameters:	data (DataFrame) – a data frame of ratings. Must have at least user, item, and rating columns.
Returns:	a trained model with the desired biases computed.
Return type:	BiasModel

class lenskit.algorithms.basic.BiasModel

Trained model for the Bias algorithm.

mean

the global mean.

Type:	double

items

the item biases \(b_i\) (damped mean offsets from the global mean).

Type:	pandas.Series

users

the user biases \(b_u\) (damped mean offsets after removing the global mean and item biases).

Type:	pandas.Series

Fallback Predictor

The Fallback rating predictor is a simple hybrid that takes a list of component algorithms and, for each item, uses the prediction from the first component that returns a result.

A common case is to fill in with Bias when a primary predictor cannot score an item.

class lenskit.algorithms.basic.Fallback(*algorithms)

Bases: lenskit.algorithms.Predictor, lenskit.algorithms.Trainable

The Fallback algorithm predicts with its first component, uses the second to fill in missing values, and so forth.
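The fill-in behavior can be illustrated with a small pandas sketch. Here each component is modeled as a plain callable taking `(user, items)` and returning a Series of scores; that calling convention, the name `fallback_predict`, and the toy predictors are assumptions for the example and do not reflect the Fallback class internals.

```python
import pandas as pd


def fallback_predict(predictors, user, items):
    """Score items with the first predictor that can score each one."""
    items = pd.Index(items)
    scores = pd.Series(float("nan"), index=items)

    for predict in predictors:
        # Only ask later components about items that are still unscored.
        missing = scores.index[scores.isna()]
        if len(missing) == 0:
            break
        result = predict(user, missing).reindex(missing)
        scores.loc[missing] = result.to_numpy()

    return scores


# Hypothetical components: a primary predictor that only knows items 1 and 2,
# and a baseline that returns a constant score for every item.
def primary(user, items):
    return pd.Series({1: 4.5, 2: 3.0}).reindex(items).dropna()


def baseline(user, items):
    return pd.Series(3.2, index=items)


scores = fallback_predict([primary, baseline], "a", [1, 2, 3])
```

Items 1 and 2 keep the primary predictor's scores, and only item 3 falls through to the baseline.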

load_model(file)

Load a trained model from a file.

Parameters:	path (str) – the path to the file from which to load the model.
Returns:	the re-loaded model (of an implementation-defined type).

predict(model, user, items, ratings=None)

Compute predictions for a user and items.

Parameters:
	model – the trained model to use. Either `None` or the ratings matrix if the algorithm has no concept of training.
	user – the user ID
	items (array-like) – the items to predict
	ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
Returns:	scores for the items, indexed by item id.
Return type:	pandas.Series

save_model(model, path)

Save a trained model to a file or directory. The default implementation pickles the model.

Algorithms are allowed to use any format for saving their models, including directories.

Parameters:
	model – the trained model.
	path (str) – the path at which to save the model.
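Since the default implementation is described as pickling the model, a save and load round trip could be sketched as below. These free functions are stand-ins written for illustration; the actual methods live on the algorithm object and may differ in details such as error handling or file layout.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Pickle a model to a file (sketch of the described default)."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a model previously written by save_model."""
    with open(path, "rb") as f:
        return pickle.load(f)


# A stand-in "model"; real trained models are implementation-defined.
model = {"mean": 3.5, "items": {1: 0.2, 2: -0.1}}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    save_model(model, path)
    restored = load_model(path)
```

As with any pickle file, a saved model should only be loaded from a trusted source.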

train(ratings)

Train the model on rating/consumption data. Training methods that require additional data may accept it as additional parameters or via class members.

Parameters:	ratings (pandas.DataFrame) – rating data, as a data frame with columns ‘user’, ‘item’, and ‘rating’. The user and item identifiers may be of any type.
Returns:	the trained model (of an implementation-defined type).

Memorized Predictor

The Memorized recommender is primarily useful for test cases. It memorizes a set of rating predictions and returns them.

class lenskit.algorithms.basic.Memorized(scores)

Bases: object

The memorized algorithm memorizes scores and repeats them.
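A minimal sketch of the idea follows. This page does not state the format of `scores`, so the example assumes a data frame with user, item, and rating columns; the class `MemorizedSketch` and its `predict` signature are hypothetical and are not the Memorized class itself.

```python
import pandas as pd


class MemorizedSketch:
    """Return stored scores verbatim (illustrative stand-in)."""

    def __init__(self, scores):
        # Assumed layout: one row per (user, item) with a 'rating' column.
        self.scores = scores

    def predict(self, user, items):
        # Look up the stored scores for this user, indexed by item.
        mine = self.scores[self.scores["user"] == user]
        by_item = mine.set_index("item")["rating"]
        # Items with no memorized score come back as NaN.
        return by_item.reindex(items)


# Toy scores, invented for illustration.
stored = pd.DataFrame({
    "user": ["a", "a", "b"],
    "item": [1, 2, 1],
    "rating": [4.0, 3.5, 2.0],
})
algo = MemorizedSketch(stored)
preds = algo.predict("a", [1, 3])
```

Because the output is fully determined by the stored frame, a predictor like this is convenient for testing evaluation code against known values.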