Classic Matrix Factorization
LKPY provides classical matrix factorization implementations.
Common Support
The mf_common module contains common support code for matrix factorization algorithms. These classes, MFPredictor and BiasMFPredictor, define the parameters that are estimated during the Algorithm.fit() process on common matrix factorization algorithms.
-
class
lenskit.algorithms.mf_common.
MFPredictor
¶ Common predictor for matrix factorization.
-
user_index_
¶ Users in the model (length \(m\)).
- Type
-
item_index_
¶ Items in the model (length \(n\)).
- Type
-
user_features_
¶ The \(m \times k\) user-feature matrix.
- Type
-
item_features_
¶ The \(n \times k\) item-feature matrix.
- Type
-
lookup_items
(items)¶ Look up the indices for a set of items.
- Parameters
items (array-like) – the item IDs to look up.
- Returns
the item indices. Unknown items will have negative indices.
- Return type
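As a rough illustration of the lookup contract described above, the following sketch maps item IDs to positions in the item index and marks unknown IDs with -1. It is not LensKit's implementation: the function body, the sample IDs, and the choice of -1 (the text only says "negative") are assumptions made for this example.

```python
def lookup_items(item_index, items):
    """Map item IDs to positions in item_index; unknown IDs map to -1 (assumed)."""
    positions = {item_id: pos for pos, item_id in enumerate(item_index)}
    return [positions.get(item_id, -1) for item_id in items]


# Hypothetical item IDs known to the model, in matrix row order.
item_index = [10, 20, 30]

indices = lookup_items(item_index, [20, 99, 10])

# Callers typically drop unknown items before scoring.
known = [i for i in indices if i >= 0]
```

Filtering on the sign of the index is what lets a caller separate scorable items from items the model never saw.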
-
lookup_user
(user)¶ Look up the index for a user.
- Parameters
user – the user ID to look up
- Returns
the user index.
- Return type
-
n_features
¶ The number of features.
-
n_items
¶ The number of items.
-
n_users
¶ The number of users.
-
score
(user, items)¶ Score a set of items for a user. User and item parameters must be indices into the matrices.
- Parameters
- Returns
the scores for the items.
- Return type
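The scoring step can be pictured as a dot product between one row of the user-feature matrix and the rows of the item-feature matrix for the requested items. The sketch below uses plain Python lists and assumes that this standard matrix factorization score is what the method computes; the helper name and the toy matrices are hypothetical.

```python
def mf_score(user_features, item_features, user, items):
    """Dot product of one user's feature row with each requested item row.

    `user` and `items` are row indices into the matrices, not raw IDs.
    """
    p_u = user_features[user]
    return [sum(p * q for p, q in zip(p_u, item_features[i])) for i in items]


# Hypothetical model with m=2 users, n=3 items, and k=2 features.
user_features = [[1.0, 0.5], [0.0, 2.0]]
item_features = [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]]

scores = mf_score(user_features, item_features, 0, [0, 2])
```

Because the inputs are indices, a caller would first translate IDs with lookup_user and lookup_items.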
-
-
class
lenskit.algorithms.mf_common.
BiasMFPredictor
¶ Common model for biased matrix factorization.
-
user_index_
¶ Users in the model (length \(m\)).
- Type
-
item_index_
¶ Items in the model (length \(n\)).
- Type
-
global_bias_
¶ The global bias term.
- Type
double
-
user_bias_
¶ The user bias terms.
- Type
-
item_bias_
¶ The item bias terms.
- Type
-
user_features_
¶ The \(m \times k\) user-feature matrix.
- Type
-
item_features_
¶ The \(n \times k\) item-feature matrix.
- Type
-
score
(user, items, raw=False)¶ Score a set of items for a user. User and item parameters must be indices into the matrices.
- Parameters
- Returns
the scores for the items.
- Return type
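A biased model adds the global, user, and item bias terms to the feature dot product. The sketch below shows one way this could look. It assumes that raw=True returns the dot product alone and that raw=False adds the three bias terms; the text above does not spell out what raw does, so treat that behavior, the dictionary layout, and the numbers as assumptions.

```python
def biased_score(model, user, items, raw=False):
    """Feature dot product, plus bias terms unless raw is True (assumed meaning)."""
    p_u = model["user_features"][user]
    scores = []
    for i in items:
        s = sum(p * q for p, q in zip(p_u, model["item_features"][i]))
        if not raw:
            s += model["global_bias"] + model["user_bias"][user] + model["item_bias"][i]
        scores.append(s)
    return scores


# Hypothetical fitted parameters, named after the attributes listed above.
model = {
    "global_bias": 3.5,
    "user_bias": [0.25, -0.5],
    "item_bias": [0.5, 0.0, -0.25],
    "user_features": [[1.0, 0.5], [0.0, 2.0]],
    "item_features": [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]],
}

raw_scores = biased_score(model, 0, [0, 2], raw=True)
full_scores = biased_score(model, 0, [0, 2])
```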
-
Alternating Least Squares
LensKit provides alternating least squares implementations of matrix factorization for explicit feedback data (BiasedMF) and implicit feedback data (ImplicitMF). These implementations are parallelized with Numba and perform best with the MKL from Conda.
-
class
lenskit.algorithms.als.
BiasedMF
(features, *, iterations=20, reg=0.1, damping=5, bias=True, progress=None)¶ Bases:
lenskit.algorithms.mf_common.BiasMFPredictor
Biased matrix factorization trained with alternating least squares [ZWSP2008]. This is a prediction-oriented algorithm suitable for explicit feedback data.
See the base class BiasMFPredictor for documentation on the estimated parameters you can extract from a trained model.
[ZWSP2008] Yunhong Zhou, Dennis Wilkinson, Robert Schreiber, and Rong Pan. 2008. Large-Scale Parallel Collaborative Filtering for the Netflix Prize. In Algorithmic Aspects in Information and Management, LNCS 5034, 337–348. DOI 10.1007/978-3-540-68880-8_32.
- Parameters
-
fit
(ratings)¶ Run ALS to train a model.
- Parameters
ratings – the ratings data frame.
- Returns
The algorithm (for chaining).
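To make the alternating idea concrete, here is a minimal rank-1 sketch in plain Python. With a single feature, each half-step has a closed-form ridge regression solution: hold the item features fixed and solve for every user, then swap. This is only an illustration under simplifying assumptions: it uses plain ridge regularization, omits the bias terms, and runs sequentially, and none of that should be read as how BiasedMF is implemented. The function names and the toy ratings are hypothetical.

```python
def als_rank1(ratings, n_users, n_items, reg=0.1, iterations=20):
    """Rank-1 ALS over (user, item, rating) triples with ridge regularization."""
    p = [1.0] * n_users  # one feature per user
    q = [1.0] * n_items  # one feature per item
    for _ in range(iterations):
        # Hold item features fixed; each user's feature has a closed form.
        for u in range(n_users):
            num = sum(q[i] * r for uu, i, r in ratings if uu == u)
            den = sum(q[i] ** 2 for uu, i, r in ratings if uu == u) + reg
            p[u] = num / den
        # Hold user features fixed; solve for each item the same way.
        for i in range(n_items):
            num = sum(p[u] * r for u, ii, r in ratings if ii == i)
            den = sum(p[u] ** 2 for u, ii, r in ratings if ii == i) + reg
            q[i] = num / den
    return p, q


def squared_error(ratings, p, q):
    return sum((r - p[u] * q[i]) ** 2 for u, i, r in ratings)


# Toy ratings that form an exactly rank-1 matrix.
ratings = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 2.0), (1, 1, 1.0)]
p, q = als_rank1(ratings, n_users=2, n_items=2, reg=0.01)
error = squared_error(ratings, p, q)
```

With more than one feature, each closed-form division becomes a small regularized linear solve per user or item, which is the part that parallelizes well.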
-
predict_for_user
(user, items, ratings=None)¶ Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type
-
class
lenskit.algorithms.als.
ImplicitMF
(features, *, iterations=20, reg=0.1, weight=40, progress=None)¶ Bases:
lenskit.algorithms.mf_common.MFPredictor
Implicit matrix factorization trained with alternating least squares [HKV2008]. This algorithm outputs ‘predictions’, but they are not on a meaningful scale. If its input data contains rating values, these will be used as the ‘confidence’ values; otherwise, confidence will be 1 for every rated item.
See the base class MFPredictor for documentation on the estimated parameters you can extract from a trained model.
[HKV2008] Y. Hu, Y. Koren, and C. Volinsky. 2008. Collaborative Filtering for Implicit Feedback Datasets. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, 263–272. DOI 10.1109/ICDM.2008.22.
- Parameters
-
fit
(ratings)¶ Train a model using the specified ratings (or similar) data.
- Parameters
ratings (pandas.DataFrame) – The ratings data.
args – Additional training data the algorithm may require.
kwargs – Additional training data the algorithm may require.
- Returns
The algorithm object.
-
predict_for_user
(user, items, ratings=None)¶ Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type
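The role of confidence in implicit-feedback factorization can be sketched with a single rank-1 user update. The example follows the formulation in [HKV2008], where every item contributes to the update, observed items have preference 1, and confidence grows with the feedback value as 1 + weight * r. How ImplicitMF combines its weight parameter with the input values is not stated above, so that formula, the helper names, and the toy data are assumptions.

```python
def confidence(r, weight=40.0):
    """Confidence in an observation, following the 1 + alpha * r form of HKV2008."""
    return 1.0 + weight * r


def implicit_user_update(observed, item_features, reg=0.1, weight=40.0):
    """Rank-1 closed-form update of one user's feature.

    `observed` maps item index -> feedback value; missing items count as 0,
    which gives them preference 0 and the baseline confidence of 1.
    """
    num = 0.0
    den = reg
    for i, y in enumerate(item_features):
        r = observed.get(i, 0.0)
        c = confidence(r, weight)
        preference = 1.0 if r > 0 else 0.0
        num += c * preference * y
        den += c * y * y
    return num / den


# Hypothetical single-feature item model; the user interacted with item 0 only.
item_features = [1.0, 0.5, -1.0]
x_u = implicit_user_update({0: 1.0}, item_features)
```

The resulting scores rank items by estimated preference, which is consistent with the note above that the outputs are not on a meaningful rating scale.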
FunkSVD
FunkSVD is an SVD-like matrix factorization that uses stochastic gradient descent, configured much like coordinate descent, to train the user-feature and item-feature matrices.
-
class
lenskit.algorithms.funksvd.
FunkSVD
(features, iterations=100, *, lrate=0.001, reg=0.015, damping=5, range=None, bias=True)¶ Bases:
lenskit.algorithms.mf_common.BiasMFPredictor
Algorithm class implementing FunkSVD matrix factorization. FunkSVD is a regularized biased matrix factorization technique trained with featurewise stochastic gradient descent.
See the base class BiasMFPredictor for documentation on the estimated parameters you can extract from a trained model.
- Parameters
features (int) – the number of features to train
iterations (int) – the number of iterations to train each feature
lrate (double) – the learning rate
reg (double) – the regularization factor
damping (double) – damping factor for the underlying mean
bias (Predictor) – the underlying bias model to fit. If True, then a basic.Bias model is fit with damping.
range (tuple) – the (min, max) rating values to clamp ratings, or None to leave predictions unclamped.
-
fit
(ratings)¶ Train a FunkSVD model.
- Parameters
ratings – the ratings data frame.
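Featurewise stochastic gradient descent can be sketched as follows: train one feature for a fixed number of epochs, then move on to the next while keeping the earlier ones. This is a simplified illustration, not the LensKit implementation. It skips the bias model, predicts from the features trained so far only, and uses a larger learning rate than the default so that the toy data converges quickly. The helper names, the `init` value, and the ratings are hypothetical, and `rng` stands in for the range parameter.

```python
def clamp(value, rng):
    """Clamp value to the (min, max) tuple rng; None leaves it unchanged."""
    if rng is None:
        return value
    low, high = rng
    return max(low, min(high, value))


def predict(P, Q, u, i, n_feats, rng=None):
    """Clamped dot product over the first n_feats features."""
    return clamp(sum(P[u][k] * Q[i][k] for k in range(n_feats)), rng)


def train(ratings, n_users, n_items, features=2, iterations=100,
          lrate=0.05, reg=0.015, rng=None, init=0.1):
    P = [[init] * features for _ in range(n_users)]
    Q = [[init] * features for _ in range(n_items)]
    for f in range(features):            # one feature at a time
        for _ in range(iterations):      # `iterations` epochs for this feature
            for u, i, r in ratings:
                err = r - predict(P, Q, u, i, f + 1, rng)
                pu, qi = P[u][f], Q[i][f]
                # Regularized gradient step on the current feature only.
                P[u][f] += lrate * (err * qi - reg * pu)
                Q[i][f] += lrate * (err * pu - reg * qi)
    return P, Q


def sse(ratings, P, Q, rng=None):
    k = len(P[0])
    return sum((r - predict(P, Q, u, i, k, rng)) ** 2 for u, i, r in ratings)


ratings = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 2.0), (1, 1, 1.0)]
untrained = ([[0.1, 0.1] for _ in range(2)], [[0.1, 0.1] for _ in range(2)])
before = sse(ratings, *untrained)
P, Q = train(ratings, n_users=2, n_items=2)
after = sse(ratings, P, Q)
```

Passing a tuple such as `(0.5, 5.0)` as `rng` keeps intermediate predictions inside the rating scale, which mirrors the purpose of the range parameter described above.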
-
predict_for_user
(user, items, ratings=None)¶ Compute predictions for a user and items.
- Parameters
user – the user ID
items (array-like) – the items to predict
ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
- Returns
scores for the items, indexed by item id.
- Return type