Algorithm Interfaces
LKPY’s batch routines and utility support for managing algorithms expect algorithms to implement consistent interfaces. This page describes those interfaces.
The interfaces are realized as abstract base classes with the Python abc module. Implementations must be registered with their interfaces, either by subclassing the interface or by calling abc.ABCMeta.register().
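The two registration routes can be sketched with the standard library alone. The names ScoringInterface, SubclassScorer, and DuckScorer are hypothetical and used only for illustration; they are not LensKit classes.

```python
from abc import ABC, abstractmethod


class ScoringInterface(ABC):
    """Hypothetical interface standing in for a LensKit algorithm interface."""

    @abstractmethod
    def score(self, item):
        """Return a score for an item."""


class SubclassScorer(ScoringInterface):
    """Route 1: subclass the interface directly."""

    def score(self, item):
        return 1.0


class DuckScorer:
    """Route 2: an unrelated class that happens to provide score()."""

    def score(self, item):
        return 0.5


# Register DuckScorer as a virtual subclass; no inheritance is involved.
ScoringInterface.register(DuckScorer)

subclass_ok = isinstance(SubclassScorer(), ScoringInterface)
registered_ok = isinstance(DuckScorer(), ScoringInterface)
```

Registration only affects isinstance() and issubclass() checks. It does not verify that the registered class actually implements the abstract methods, so that remains the implementer's responsibility.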
Serialization
Like SciKit models, all LensKit algorithms are pickleable, and this is how we recommend saving models to disk for later use. This can be done with pickle, but we recommend using binpickle for more automatically-optimized storage. For example, to save a fully-configured ALS model with fairly aggressive ZSTD compression:
algo = Recommender.adapt(ImplicitMF(50))
algo.fit(ratings)
# 'algo.bpk' is a placeholder output path; the codec is passed by keyword
binpickle.dump(algo, 'algo.bpk', codec=binpickle.codecs.Blosc('zstd', 9))
Base Algorithm
Algorithms follow the SciKit fit-predict paradigm for estimators, except they know natively how to work with Pandas objects.
The Algorithm interface defines common methods.
- class lenskit.Algorithm
Bases: object
Base class for LensKit algorithms. These algorithms follow the SciKit design pattern for estimators.
- Canonical: lenskit.Algorithm
- IGNORED_PARAMS = []
Names of parameters to ignore in get_params().
- EXTRA_PARAMS = []
Names of extra parameters to include in get_params(). Useful when the constructor takes **kwargs.
- abstract fit(ratings, **kwargs)
Train a model using the specified ratings (or similar) data.
- Parameters:
ratings (pandas.DataFrame) – The ratings data.
kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
- Returns:
The algorithm object.
- get_params(deep=True)
Get the parameters for this algorithm (as in scikit-learn). Algorithm parameters should match constructor argument names.
The default implementation returns all attributes that match a constructor parameter name. It should be compatible with the sklearn.base.BaseEstimator.get_params() method so that LensKit algorithms can be cloned with sklearn.base.clone() as well as lenskit.util.clone().
- Returns:
the algorithm parameters.
- Return type:
dict
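The fit() and get_params() contracts described above can be illustrated with a small dependency-free sketch. SketchAlgorithm and DampedMean are hypothetical stand-ins, not the LensKit implementation, and fit() takes a plain list of numbers here rather than a pandas.DataFrame to keep the example short.

```python
import inspect


class SketchAlgorithm:
    """Hypothetical base class; a simplified stand-in, not lenskit.Algorithm."""

    IGNORED_PARAMS = []
    EXTRA_PARAMS = []

    def fit(self, ratings, **kwargs):
        raise NotImplementedError

    def get_params(self, deep=True):
        # Collect attributes whose names match constructor parameters.
        # The deep flag is accepted for signature compatibility but unused here.
        sig = inspect.signature(type(self).__init__)
        names = [
            name for name, p in sig.parameters.items()
            if name != "self"
            and p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
            and name not in self.IGNORED_PARAMS
        ]
        names.extend(self.EXTRA_PARAMS)
        return {name: getattr(self, name) for name in names if hasattr(self, name)}


class DampedMean(SketchAlgorithm):
    """Toy algorithm: learns a damped global mean."""

    def __init__(self, damping=5.0):
        self.damping = damping

    def fit(self, ratings, **kwargs):
        # ratings is a plain sequence of numbers in this sketch
        self.mean_ = sum(ratings) / (len(ratings) + self.damping)
        return self  # fit returns the algorithm object


algo = DampedMean(damping=2.0).fit([4.0, 2.0])
params = algo.get_params()
clone = type(algo)(**params)  # an unfitted copy with the same configuration
```

Because the parameter names match the constructor arguments, the dictionary from get_params() is enough to rebuild an unfitted copy, which is the property that cloning utilities rely on.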
Recommendation
The Recommender interface provides an interface for generating recommendations. Not all algorithms implement it; call Recommender.adapt() on an algorithm to get a recommender for any algorithm that at least implements Predictor. For example:
pred = Bias(damping=5)
rec = Recommender.adapt(pred)
If the algorithm already implements Recommender, it is returned unchanged, so it is safe to always call Recommender.adapt() before fitting an algorithm you will need for top-N recommendations, to make sure it is suitable.
- class lenskit.Recommender
Bases: Algorithm
Recommends lists of items for users.
- abstract recommend(user, n=None, candidates=None, ratings=None)
Compute recommendations for a user.
- Parameters:
user – the user ID
n (int) – the number of recommendations to produce (None for unlimited)
candidates (array-like) – The set of valid candidate items; if None, a default set will be used. For many algorithms, this is their CandidateSelector.
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns:
a frame with an item column; if the recommender also produces scores, they will be in a score column.
- Return type:
pandas.DataFrame
- classmethod adapt(algo)
Ensure that an algorithm is a Recommender. If it is not a recommender, it is wrapped in a lenskit.basic.TopN with a default candidate selector.
Note
As of 0.6.0, algorithms are fit directly, so you should call this method before calling Algorithm.fit(), unless you will always be passing explicit candidate sets to recommend().
- Parameters:
algo (Predictor) – the underlying rating predictor.
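The adapt() behaviour described above (return recommenders unchanged, wrap anything else in a top-N ranker) can be sketched as follows. All class names are hypothetical, and for brevity recommend() returns a plain list of item IDs rather than the data frame the real interface returns.

```python
class SketchPredictor:
    """Hypothetical scorer exposing only predict_for_user()."""

    def __init__(self, scores):
        self.scores = scores  # maps item ID -> score, the same for every user

    def predict_for_user(self, user, items, ratings=None):
        return {item: self.scores[item] for item in items if item in self.scores}


class SketchRecommender:
    """Hypothetical recommender interface with an adapt() class method."""

    def recommend(self, user, n=None, candidates=None, ratings=None):
        raise NotImplementedError

    @classmethod
    def adapt(cls, algo):
        # Already a recommender: return it unchanged.
        if isinstance(algo, SketchRecommender):
            return algo
        # Otherwise wrap the predictor in a top-N recommender.
        return SketchTopN(algo)


class SketchTopN(SketchRecommender):
    """Ranks candidate items by predicted score, highest first."""

    def __init__(self, predictor):
        self.predictor = predictor

    def recommend(self, user, n=None, candidates=None, ratings=None):
        if candidates is None:
            # Default candidate set: every item the predictor can score.
            candidates = list(self.predictor.scores)
        scores = self.predictor.predict_for_user(user, candidates, ratings)
        ranked = sorted(scores, key=scores.get, reverse=True)
        return ranked if n is None else ranked[:n]


pred = SketchPredictor({"a": 3.5, "b": 4.2, "c": 1.0})
rec = SketchRecommender.adapt(pred)
same = SketchRecommender.adapt(rec)  # adapting a recommender returns it as-is
top2 = rec.recommend(user=1, n=2)
```

Adapting before fitting matters in the real library because the wrapper's default candidate selector also has to be trained; in this sketch the candidate set is simply read from the predictor.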
Candidate Selection
Some recommenders use a candidate selector to identify possible items to recommend. These are also treated as algorithms, mainly so that they can memorize users’ prior ratings to exclude them from recommendation.
- class lenskit.CandidateSelector
Bases: Algorithm
Select candidates for recommendation for a user, possibly with some additional ratings.
UnratedItemCandidateSelector is the default and most common implementation of this interface.
- abstract candidates(user, ratings=None)
Select candidates for the user.
- Parameters:
user – The user key or ID.
ratings (pandas.Series or array-like) – Ratings or items to use instead of whatever ratings were memorized for this user. If a pandas.Series, the series index is used; if it is another array-like, it is assumed to be an array of items.
- static rated_items(ratings)
Utility function for converting a series or array into an array of item IDs. Useful in implementations of candidates().
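A minimal sketch of a selector that memorizes rated items and excludes them may help make candidates() and rated_items() concrete. SketchUnratedSelector is a hypothetical class written for this example, not the library's UnratedItemCandidateSelector, and it assumes ratings arrive as a frame with user and item columns.

```python
import numpy as np
import pandas as pd


class SketchUnratedSelector:
    """Hypothetical selector: candidates are the items a user has not rated."""

    def fit(self, ratings, **kwargs):
        # Memorize the item universe and each user's rated items.
        self.items_ = np.sort(ratings["item"].unique())
        self.user_items_ = {
            user: items.to_numpy()
            for user, items in ratings.groupby("user")["item"]
        }
        return self

    @staticmethod
    def rated_items(ratings):
        # A Series contributes its index; anything else is treated as item IDs.
        if isinstance(ratings, pd.Series):
            return ratings.index.values
        return np.asarray(ratings)

    def candidates(self, user, ratings=None):
        if ratings is None:
            empty = np.array([], dtype=self.items_.dtype)
            rated = self.user_items_.get(user, empty)
        else:
            rated = self.rated_items(ratings)
        return np.setdiff1d(self.items_, rated)


ratings = pd.DataFrame({
    "user": [1, 1, 2],
    "item": [10, 20, 30],
    "rating": [4.0, 3.5, 5.0],
})
sel = SketchUnratedSelector().fit(ratings)
memorized = sel.candidates(1)
overridden = sel.candidates(1, ratings=[30])
from_series = sel.candidates(1, ratings=pd.Series([5.0], index=[10]))
```

Passing ratings replaces the memorized history for that call only, which is how a caller can ask for candidates on behalf of a user the selector never saw during fit().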
Rating Prediction
The Predictor class implements ‘rating prediction’, as well as any other personalized item scoring that may not be predictions of actual ratings. Most algorithms actually implement this interface.
- class lenskit.Predictor
Bases: Algorithm
Predicts user ratings of items. Predictions are really estimates of the user’s like or dislike, and the Predictor interface makes no guarantees about their scale or granularity.
- Canonical: lenskit.Predictor
- predict(pairs, ratings=None)
Compute predictions for user-item pairs. This method is designed to be compatible with the general SciKit paradigm; applications typically want to use predict_for_user().
- Parameters:
pairs (pandas.DataFrame) – The user-item pairs, as user and item columns.
ratings (pandas.DataFrame) – user-item rating data to replace memorized data.
- Returns:
The predicted scores for each user-item pair.
- Return type:
pandas.Series
- abstract predict_for_user(user, items, ratings=None)
Compute predictions for a user and items.
- Parameters:
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns:
scores for the items, indexed by item id.
- Return type:
pandas.Series
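The relationship between predict() and predict_for_user() can be sketched with a toy item-mean scorer. SketchItemMean is a hypothetical class; the predict() shown here is one possible way to build pair scoring on top of per-user scoring, not necessarily how the library does it, and it ignores the optional ratings argument.

```python
import pandas as pd


class SketchItemMean:
    """Hypothetical predictor: scores each item by its mean training rating."""

    def fit(self, ratings, **kwargs):
        self.item_means_ = ratings.groupby("item")["rating"].mean()
        return self

    def predict_for_user(self, user, items, ratings=None):
        # Items unseen in training come back as NaN from reindex.
        return self.item_means_.reindex(items).rename_axis("item")

    def predict(self, pairs, ratings=None):
        # Score each user's items, then align results with the input rows.
        parts = []
        for user, group in pairs.groupby("user"):
            scores = self.predict_for_user(user, group["item"].values)
            parts.append(pd.Series(scores.values, index=group.index))
        return pd.concat(parts).reindex(pairs.index)


ratings = pd.DataFrame({
    "user": [1, 1, 2, 2],
    "item": [10, 20, 10, 30],
    "rating": [4.0, 2.0, 5.0, 3.0],
})
algo = SketchItemMean().fit(ratings)

# Per-user scoring: a series indexed by item ID.
user_scores = algo.predict_for_user(1, [10, 30, 99])

# Pair scoring: one score per row of the pairs frame, in the same order.
pairs = pd.DataFrame({"user": [2, 1, 2], "item": [20, 30, 10]})
pair_scores = algo.predict(pairs)
```

Returning NaN for items the model cannot score keeps the output aligned with the requested items, so callers can tell a missing prediction apart from a low one.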