Bias

The lenskit.algorithms.bias module provides the personalized mean rating predictor.

class lenskit.algorithms.bias.Bias(items=True, users=True, damping=0.0)

Bases: Predictor

A user-item bias rating prediction algorithm. This implements the following predictor algorithm:

\[s(u,i) = \mu + b_i + b_u\]

where \(\mu\) is the global mean rating, \(b_i\) is item bias, and \(b_u\) is the user bias. With the provided damping values \(\beta_{\mathrm{u}}\) and \(\beta_{\mathrm{i}}\), they are computed as follows:

\[\begin{align*}
\mu & = \frac{\sum_{r_{ui} \in R} r_{ui}}{|R|} \\
b_i & = \frac{\sum_{r_{ui} \in R_i} (r_{ui} - \mu)}{|R_i| + \beta_{\mathrm{i}}} \\
b_u & = \frac{\sum_{r_{ui} \in R_u} (r_{ui} - \mu - b_i)}{|R_u| + \beta_{\mathrm{u}}}
\end{align*}\]

The damping values can be interpreted as the number of default (mean) ratings to assume a priori for each user or item, damping low-information users and items towards a mean instead of permitting them to take on extreme values based on few ratings.
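The formulas above can be sketched in plain Python to see the effect of damping on a tiny data set. This is an illustrative sketch only, not LensKit's implementation: the function `fit_bias_sketch`, its arguments, and the toy ratings are hypothetical names chosen for this example.

```python
from collections import defaultdict


def fit_bias_sketch(ratings, item_damping=0.0, user_damping=0.0):
    """Hypothetical helper: compute (mean, item biases, user biases)
    from a list of (user, item, rating) triples, following the formulas above."""
    mean = sum(r for _, _, r in ratings) / len(ratings)

    # b_i: damped mean of (r_ui - mu) over each item's ratings
    item_sums, item_counts = defaultdict(float), defaultdict(int)
    for _, i, r in ratings:
        item_sums[i] += r - mean
        item_counts[i] += 1
    item_bias = {i: item_sums[i] / (item_counts[i] + item_damping)
                 for i in item_sums}

    # b_u: damped mean of (r_ui - mu - b_i) over each user's ratings
    user_sums, user_counts = defaultdict(float), defaultdict(int)
    for u, i, r in ratings:
        user_sums[u] += r - mean - item_bias[i]
        user_counts[u] += 1
    user_bias = {u: user_sums[u] / (user_counts[u] + user_damping)
                 for u in user_sums}
    return mean, item_bias, user_bias


# Toy data (made up for illustration): item "z" has a single low rating.
ratings = [("a", "x", 5.0), ("a", "y", 3.0), ("b", "x", 4.0), ("c", "z", 1.0)]
mean, undamped_items, undamped_users = fit_bias_sketch(ratings)
_, damped_items, _ = fit_bias_sketch(ratings, item_damping=5.0)
```

With no damping, the single rating for item "z" gives it a bias of -2.25; with an item damping of 5 the same rating yields -0.375, pulling the low-information item towards the mean.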

Parameters:
  • items – whether to compute item biases

  • users – whether to compute user biases

  • damping (number or tuple) – Bayesian damping to apply to computed biases. Either a number, to damp both user and item biases the same amount, or a (user,item) tuple providing separate damping values.
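As a rough illustration of the two accepted forms of damping, the sketch below normalizes either a single number or a (user, item) tuple into separate values. The helper `split_damping` is a hypothetical name for this example and is not part of LensKit.

```python
def split_damping(damping):
    """Hypothetical helper: return (user_damping, item_damping)."""
    if isinstance(damping, tuple):
        # Separate values, given in (user, item) order
        user_damping, item_damping = damping
    else:
        # A single number damps user and item biases by the same amount
        user_damping = item_damping = damping
    return user_damping, item_damping


same = split_damping(5)
separate = split_damping((2, 10))
```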

mean_

The global mean rating.

Type:

double

item_offsets_

The item offsets (\(b_i\) values)

Type:

pandas.Series

user_offsets_

The user offsets (\(b_u\) values)

Type:

pandas.Series

fit(ratings, **kwargs)

Train the bias model on some rating data.

Parameters:

ratings (DataFrame) – a data frame of ratings. Must have at least user, item, and rating columns.

Returns:

the fit bias object.

Return type:

Bias

transform(ratings, *, indexes=False)

Transform ratings by removing the bias term. This method does not recompute user (or item) biases based on these ratings, but rather uses the biases that were estimated with fit().

Parameters:
  • ratings (pandas.DataFrame) – The ratings to transform. Must contain at least user, item, and rating columns.

  • indexes (bool) – if True, the resulting frame will include uidx and iidx columns containing the 0-based user and item indexes for each rating.

Returns:

A data frame with rating transformed by subtracting user-item bias prediction.

Return type:

pandas.DataFrame

inverse_transform(ratings)

Transform ratings by adding the bias term back. This is the inverse of transform().
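The relationship between transform() and inverse_transform() can be sketched as a round trip: the first subtracts the bias prediction \(\mu + b_i + b_u\) from each rating, and the second adds it back. The sketch below uses plain Python with made-up bias values; the names `transform_sketch` and `inverse_transform_sketch` are hypothetical, and the real methods operate on data frames rather than lists of tuples.

```python
# Made-up fitted values, for illustration only
mean = 3.5
item_bias = {"x": 0.5, "y": -0.25}
user_bias = {"a": 0.25, "b": -0.5}


def transform_sketch(ratings):
    """Subtract mu + b_i + b_u from each (user, item, rating) triple."""
    return [(u, i, r - (mean + item_bias[i] + user_bias[u]))
            for u, i, r in ratings]


def inverse_transform_sketch(residuals):
    """Add mu + b_i + b_u back to each (user, item, residual) triple."""
    return [(u, i, d + (mean + item_bias[i] + user_bias[u]))
            for u, i, d in residuals]


ratings = [("a", "x", 5.0), ("b", "y", 2.0)]
residuals = transform_sketch(ratings)
restored = inverse_transform_sketch(residuals)
```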

transform_user(ratings)

Transform a user’s ratings by subtracting the bias model.

Parameters:

ratings (pandas.Series) – The user’s ratings, as a series indexed by item whose values are the ratings.

Returns:

The transformed ratings and the user bias.

Return type:

pandas.Series

inverse_transform_user(user, ratings, user_bias=None)

Un-transform a user’s ratings by adding in the bias model.

Parameters:
  • user – The user ID.

  • ratings (pandas.Series) – The user’s ratings, indexed by item.

  • user_bias (float or None) – If None, it looks up the user bias learned by fit.

Returns:

The user’s de-normalized ratings.

Return type:

pandas.Series
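A minimal sketch of the per-user round trip, assuming the user bias is estimated from the supplied ratings with the \(b_u\) formula and then passed back when de-normalizing. The function names, the dictionary-based data, and the choice to return the residuals and the bias as a pair are assumptions for this example, not LensKit's actual code.

```python
# Made-up fitted values, for illustration only
mean = 3.5
item_bias = {"x": 0.5, "y": -0.25}


def transform_user_sketch(ratings, user_damping=0.0):
    """ratings maps item -> rating; returns (residuals, user bias)."""
    # Remove the global mean and item bias; unknown items get zero bias
    delta = {i: r - mean - item_bias.get(i, 0.0) for i, r in ratings.items()}
    # Estimate the (damped) user bias from what remains
    b_u = sum(delta.values()) / (len(delta) + user_damping)
    return {i: d - b_u for i, d in delta.items()}, b_u


def inverse_transform_user_sketch(residuals, b_u):
    """Add the mean, item bias, and user bias back to each residual."""
    return {i: d + mean + item_bias.get(i, 0.0) + b_u
            for i, d in residuals.items()}


user_ratings = {"x": 5.0, "y": 3.0}
residuals, b_u = transform_user_sketch(user_ratings)
restored = inverse_transform_user_sketch(residuals, b_u)
```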

fit_transform(ratings, **kwargs)

Fit with ratings and return the training data transformed.

predict_for_user(user, items, ratings=None)

Compute predictions for a user and items. Unknown users and items are assumed to have zero bias.

Parameters:
  • user – the user ID

  • items (array-like) – the items to predict

  • ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.

Returns:

scores for the items, indexed by item id.

Return type:

pandas.Series
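The prediction rule described above can be sketched as follows: each score is \(\mu + b_i + b_u\), unknown users and items contribute zero bias, and supplying ratings replaces the stored user bias with one recomputed from those ratings. This is a plain-Python illustration under those assumptions; `predict_sketch` and the bias values are hypothetical, and the real method returns a pandas.Series.

```python
# Made-up fitted values, for illustration only
mean = 3.5
item_bias = {"x": 0.5, "y": -0.25}
user_bias = {"a": 0.25}


def predict_sketch(user, items, ratings=None, user_damping=0.0):
    """Return {item: score}; ratings optionally maps item -> rating."""
    if ratings:
        # Recompute the user's bias from the supplied ratings
        resid = [r - mean - item_bias.get(i, 0.0) for i, r in ratings.items()]
        b_u = sum(resid) / (len(resid) + user_damping)
    else:
        # Fall back to the stored bias, or zero for an unknown user
        b_u = user_bias.get(user, 0.0)
    # Unknown items contribute zero item bias
    return {i: mean + item_bias.get(i, 0.0) + b_u for i in items}


known = predict_sketch("a", ["x", "y"])
unknown = predict_sketch("nobody", ["q"])
recomputed = predict_sketch("nobody", ["y"], ratings={"x": 5.0})
```

For a user and item that were both absent from training, the score reduces to the global mean.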

property user_index

Get the user index from this (fit) bias.

property item_index

Get the item index from this (fit) bias.