Basic and Utility Algorithms
The lenskit.algorithms.basic module contains baseline and utility algorithms for nonpersonalized recommendation and testing.
Personalized Mean Rating Prediction
- class lenskit.algorithms.basic.Bias(items=True, users=True, damping=0.0)
Bases: lenskit.algorithms.Predictor
A user-item bias rating prediction algorithm. This implements the following predictor algorithm:
$$s(u,i) = \mu + b_i + b_u$$
where $\mu$ is the global mean rating, $b_i$ is the item bias, and $b_u$ is the user bias. With the provided damping values $\beta_u$ and $\beta_i$, they are computed as follows:
$$\mu = \frac{\sum_{r_{ui} \in R} r_{ui}}{|R|} \qquad b_i = \frac{\sum_{r_{ui} \in R_i} (r_{ui} - \mu)}{|R_i| + \beta_i} \qquad b_u = \frac{\sum_{r_{ui} \in R_u} (r_{ui} - \mu - b_i)}{|R_u| + \beta_u}$$
The damping values can be interpreted as the number of default (mean) ratings to assume a priori for each user or item, damping low-information users and items towards a mean instead of permitting them to take on extreme values based on few ratings.
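To make the formulas concrete, here is a minimal pure-Python sketch of the damped bias computation. It does not use LensKit itself: the names `fit_bias` and `score` and the list-of-tuples rating format are assumptions made for illustration, not part of the library.

```python
from collections import defaultdict


def fit_bias(ratings, damping=0.0):
    """Hypothetical helper: compute (mean, item_bias, user_bias).

    `ratings` is a list of (user, item, rating) triples.
    """
    # Damping is either one number or a (user, item) tuple, as described above.
    if isinstance(damping, tuple):
        user_damp, item_damp = damping
    else:
        user_damp = item_damp = damping

    # Global mean rating (mu).
    mean = sum(r for _, _, r in ratings) / len(ratings)

    # Item bias b_i: damped mean of (r_ui - mu) over each item's ratings.
    item_sum, item_n = defaultdict(float), defaultdict(int)
    for _, item, r in ratings:
        item_sum[item] += r - mean
        item_n[item] += 1
    item_bias = {i: item_sum[i] / (item_n[i] + item_damp) for i in item_sum}

    # User bias b_u: damped mean of (r_ui - mu - b_i) over each user's ratings.
    user_sum, user_n = defaultdict(float), defaultdict(int)
    for user, item, r in ratings:
        user_sum[user] += r - mean - item_bias[item]
        user_n[user] += 1
    user_bias = {u: user_sum[u] / (user_n[u] + user_damp) for u in user_sum}

    return mean, item_bias, user_bias


def score(model, user, item):
    """s(u, i) = mu + b_i + b_u; unknown users and items get zero bias."""
    mean, item_bias, user_bias = model
    return mean + item_bias.get(item, 0.0) + user_bias.get(user, 0.0)


ratings = [("u1", "a", 5.0), ("u1", "b", 3.0), ("u2", "a", 4.0), ("u2", "c", 2.0)]
model = fit_bias(ratings, damping=(0.0, 1.0))  # no user damping, item damping of 1
print(score(model, "u1", "a"))
```

With an item damping of 1, item "b" (a single rating 0.5 below the mean) gets a bias of -0.25 rather than -0.5, which is the pull towards the mean that the damping paragraph describes.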
- Parameters
items – whether to compute item biases
users – whether to compute user biases
damping (number or tuple) – Bayesian damping to apply to computed biases. Either a number, to damp both user and item biases the same amount, or a (user,item) tuple providing separate damping values.
- mean_
The global mean rating.
- Type
double
- item_offsets_
The item offsets ($b_i$ values).
- Type
- user_offsets_
The user offsets ($b_u$ values).
- Type
- fit(ratings, **kwargs)
Train the bias model on some rating data.
- Parameters
ratings (DataFrame) – a data frame of ratings. Must have at least user, item, and rating columns.
- Returns
the fit bias object.
- Return type
- predict_for_user(user, items, ratings=None)
Compute predictions for a user and items. Unknown users and items are assumed to have zero bias.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, will be used to recompute the user’s bias at prediction time.
- Returns
scores for the items, indexed by item id.
- Return type
Most Popular Item Recommendation
The Popular algorithm implements most-popular-item recommendation.
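As a rough illustration of most-popular-item recommendation, the sketch below counts ratings per item and returns the most-rated items the user has not already rated. It is plain Python rather than LensKit code; `fit_popularity` and `recommend_popular` are hypothetical names, and breaking ties by item name is an assumption made only to keep the output deterministic.

```python
from collections import Counter


def fit_popularity(ratings):
    """Hypothetical helper: count ratings per item from (user, item, rating) triples."""
    return Counter(item for _, item, _ in ratings)


def recommend_popular(item_pop, rated, n=None):
    """Return (item, count) pairs for unrated items, most popular first."""
    candidates = [(item, count) for item, count in item_pop.items() if item not in rated]
    # Sort by descending count; ties are broken by item name (an assumption).
    candidates.sort(key=lambda pair: (-pair[1], str(pair[0])))
    return candidates if n is None else candidates[:n]


ratings = [
    ("u1", "a", 5.0), ("u2", "a", 4.0), ("u3", "a", 3.0),
    ("u1", "b", 2.0), ("u2", "b", 4.0),
    ("u3", "c", 1.0),
]
item_pop = fit_popularity(ratings)
print(recommend_popular(item_pop, rated={"a", "c"}))   # what u3 has not rated
print(recommend_popular(item_pop, rated=set(), n=2))   # a user with no ratings
```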
- class lenskit.algorithms.basic.Popular(selector=None)
Bases: lenskit.algorithms.Recommender
Recommend the most popular items.
- Parameters
selector (CandidateSelector) – The candidate selector to use. If None, uses a new UnratedItemCandidateSelector.
- item_pop_
Item rating counts (popularity)
- Type
- fit(ratings, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns
The algorithm object.
- recommend(user, n=None, candidates=None, ratings=None)
Compute recommendations for a user.
- Parameters
user – the user ID
n (int) – the number of recommendations to produce (None for unlimited)
candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, this is their CandidateSelector.
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
a frame with an item column; if the recommender also produces scores, they will be in a score column.
- Return type
Random Item Recommendation
The Random algorithm implements random-item recommendation.
- class lenskit.algorithms.basic.Random(selector=None, rng_spec=None)
Bases: lenskit.algorithms.Recommender
A random-item recommender.
- selector
Selects candidate items for recommendation. Default is UnratedItemCandidateSelector.
- Type
- rng_spec
Seed or random state for generating recommendations. Pass 'user' to deterministically derive per-user RNGs from the user IDs for reproducibility.
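The idea of deriving a per-user random generator can be sketched with the standard library alone. This page does not say how LensKit combines the seed with the user ID, so the string seed below is purely an assumption, and `recommend_random` is a hypothetical function rather than the library's implementation.

```python
import random


def recommend_random(user, candidates, n=None, seed=42):
    """Hypothetical sketch: shuffle candidates with an RNG derived from the user ID."""
    # Assumption: combine a base seed with the user ID into a string seed.
    # Python seeds deterministically from a str, so the same user always
    # gets the same ordering across runs.
    rng = random.Random(f"{seed}:{user}")
    pool = sorted(candidates)  # fix the starting order before shuffling
    rng.shuffle(pool)
    return pool if n is None else pool[:n]


items = ["a", "b", "c", "d"]
first = recommend_random("u1", items, n=2)
again = recommend_random("u1", items, n=2)
print(first, again)
```

Because the generator depends only on the base seed and the user ID, repeated calls for the same user return the same list, which is the reproducibility property the `'user'` option is described as providing.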
- fit(ratings, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns
The algorithm object.
- recommend(user, n=None, candidates=None, ratings=None)
Compute recommendations for a user.
- Parameters
user – the user ID
n (int) – the number of recommendations to produce (None for unlimited)
candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, this is their CandidateSelector.
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
a frame with an item column; if the recommender also produces scores, they will be in a score column.
- Return type
Top-N Recommender
The TopN class implements a standard top-N recommender that wraps a Predictor and CandidateSelector and returns the top N candidate items by predicted rating. It is the type of recommender returned by Recommender.adapt() if the provided algorithm is not a recommender.
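The wrapping described above can be sketched in a few lines of plain Python: ask a selector for candidates, score them with a predictor, and keep the N highest. The function `top_n` and the two toy callables are hypothetical stand-ins, not LensKit classes, and using `None` for an unscorable item is an assumption of this sketch.

```python
import heapq


def top_n(predict_for_user, select_candidates, user, n=None):
    """Hypothetical sketch of top-N recommendation from a predictor and a selector."""
    candidates = select_candidates(user)
    scores = predict_for_user(user, candidates)
    # Drop items the predictor could not score (None in this sketch).
    scored = [(item, s) for item, s in scores.items() if s is not None]
    if n is None:
        return sorted(scored, key=lambda pair: pair[1], reverse=True)
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Toy stand-ins for a fitted predictor and candidate selector.
toy_scores = {"a": 3.2, "b": 4.5, "c": None, "d": 4.1}


def toy_predictor(user, items):
    return {item: toy_scores.get(item) for item in items}


def toy_selector(user):
    return ["a", "b", "c", "d"]


print(top_n(toy_predictor, toy_selector, "u1", n=2))
```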
- class lenskit.algorithms.basic.TopN(predictor, selector=None)
Bases: lenskit.algorithms.Recommender, lenskit.algorithms.Predictor
Basic recommender that implements top-N recommendation using a predictor.
Note
This class does not do anything of its own in fit(). If its predictor and candidate selector are both fit, the top-N recommender does not need to be fit.
- Parameters
predictor (Predictor) – The underlying predictor.
selector (CandidateSelector) – The candidate selector. If None, uses UnratedItemCandidateSelector.
- fit(ratings, **kwargs)
Fit the recommender.
- Parameters
ratings (pandas.DataFrame) – The rating or interaction data. Passed unchanged to the predictor and candidate selector.
args, kwargs – Additional arguments for the predictor to use in its training process.
- predict(pairs, ratings=None)
Compute predictions for user-item pairs. This method is designed to be compatible with the general SciKit paradigm; applications typically want to use predict_for_user().
- Parameters
pairs (pandas.DataFrame) – The user-item pairs, as user and item columns.
ratings (pandas.DataFrame) – user-item rating data to replace memorized data.
- Returns
The predicted scores for each user-item pair.
- Return type
- predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type
- recommend(user, n=None, candidates=None, ratings=None)
Compute recommendations for a user.
- Parameters
user – the user ID
n (int) – the number of recommendations to produce (None for unlimited)
candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, this is their CandidateSelector.
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
a frame with an item column; if the recommender also produces scores, they will be in a score column.
- Return type
Unrated Item Candidate Selector
UnratedItemCandidateSelector is a candidate selector that remembers items users have rated, and returns a candidate set consisting of all unrated items. It is the default candidate selector for TopN.
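The behavior described here, memorizing rated items at fit time and returning everything else as candidates, can be sketched as follows. `UnratedSelector` is a hypothetical pure-Python class, not the LensKit implementation; in particular, a `dict` keyed by item stands in for a `pandas.Series`, and items are stored as a sorted list rather than an index.

```python
class UnratedSelector:
    """Hypothetical sketch of an unrated-item candidate selector."""

    def fit(self, ratings):
        # Memorize all known items and the set of items each user has rated.
        self.items_ = sorted({item for _, item, _ in ratings})
        self.user_items_ = {}
        for user, item, _ in ratings:
            self.user_items_.setdefault(user, set()).add(item)
        return self

    def candidates(self, user, ratings=None):
        if ratings is None:
            # Fall back to whatever was memorized; unknown users have rated nothing.
            rated = self.user_items_.get(user, set())
        elif isinstance(ratings, dict):
            # A dict stands in for a pandas.Series here: its keys play the role of the index.
            rated = set(ratings.keys())
        else:
            # Any other iterable is treated as a collection of item IDs.
            rated = set(ratings)
        return [item for item in self.items_ if item not in rated]


ratings = [("u1", "a", 5.0), ("u1", "b", 3.0), ("u2", "c", 4.0)]
selector = UnratedSelector().fit(ratings)
print(selector.candidates("u1"))
print(selector.candidates("u1", ratings={"c": 4.0}))
```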
- class lenskit.algorithms.basic.UnratedItemCandidateSelector
Bases: lenskit.algorithms.CandidateSelector
CandidateSelector that selects items a user has not rated as candidates. When this selector is fit, it memorizes the rated items.
- items_
All known items.
- Type
- users_
All known users.
- Type
- user_items_
Items rated by each known user, as positions in the items index.
- Type
- candidates(user, ratings=None)
Select candidates for the user.
- Parameters
user – The user key or ID.
ratings (pandas.Series or array-like) – Ratings or items to use instead of whatever ratings were memorized for this user. If a pandas.Series, the series index is used; if it is another array-like, it is assumed to be an array of items.
- fit(ratings, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns
The algorithm object.
Fallback Predictor
The Fallback rating predictor is a simple hybrid that takes a list of component algorithms and, for each item, predicts the rating with the first component that returns a result. A common case is to fill in with Bias when a primary predictor cannot score an item.
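The fill-in behavior can be sketched in plain Python: try each component in order, and only ask later components about items that are still unscored. `fallback_predict` and the two toy predictors are hypothetical, and `None` is used for a missing score purely as a convention of this sketch.

```python
def fallback_predict(predictors, user, items):
    """Hypothetical sketch: score each item with the first predictor that can."""
    scores = {item: None for item in items}
    for predict in predictors:
        missing = [item for item in items if scores[item] is None]
        if not missing:
            break  # everything is scored; later components are not consulted
        for item, value in predict(user, missing).items():
            if value is not None and item in scores:
                scores[item] = value
    return scores


def primary(user, items):
    # Toy primary predictor that only knows about item "a".
    known = {"a": 4.8}
    return {item: known.get(item) for item in items}


def baseline(user, items):
    # Toy baseline that can score anything (standing in for a bias model).
    return {item: 3.5 for item in items}


print(fallback_predict([primary, baseline], "u1", ["a", "b"]))
```

Item "a" keeps the primary predictor's score, while "b" is filled in by the baseline.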
- class lenskit.algorithms.basic.Fallback(algorithms, *others)
Bases: lenskit.algorithms.Predictor
The Fallback algorithm predicts with its first component, uses the second to fill in missing values, and so forth.
- fit(ratings, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns
The algorithm object.
- predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type
Memorized Predictor
The Memorized predictor is primarily useful for test cases. It memorizes a set of rating predictions and returns them.
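A memorizing predictor is easy to picture as a lookup table, which is why it suits test cases. The class below is a hypothetical pure-Python sketch, not the LensKit class: it assumes scores are supplied as a mapping keyed by `(user, item)` and returns `None` for pairs it was never given.

```python
class MemorizedScores:
    """Hypothetical sketch of a predictor that replays scores given at construction."""

    def __init__(self, scores):
        # Assumption: scores is a mapping from (user, item) to a predicted rating.
        self.scores = dict(scores)

    def fit(self, *args, **kwargs):
        # Nothing to learn; the scores were provided up front.
        return self

    def predict_for_user(self, user, items, ratings=None):
        return {item: self.scores.get((user, item)) for item in items}


memo = MemorizedScores({("u1", "a"): 4.0, ("u1", "b"): 2.5}).fit()
print(memo.predict_for_user("u1", ["a", "b", "c"]))
```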
- class lenskit.algorithms.basic.Memorized(scores)
Bases: lenskit.algorithms.Predictor
The memorized algorithm memorizes scores provided at construction time.
- fit(*args, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns
The algorithm object.
- predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type