Classic Matrix Factorization
LKPY provides classical matrix factorization implementations.
Common Support

The mf_common module contains common support code for matrix factorization algorithms. These classes, MFPredictor and BiasMFPredictor, define the parameters that are estimated during the Algorithm.fit() process on common matrix factorization algorithms.
class lenskit.algorithms.mf_common.MFPredictor

    Common predictor for matrix factorization.

    user_index_
        Users in the model (length \(m\)).

    item_index_
        Items in the model (length \(n\)).

    user_features_
        The \(m \times k\) user-feature matrix.

    item_features_
        The \(n \times k\) item-feature matrix.
    lookup_items(items)
        Look up the indices for a set of items.

        Parameters
            items (array-like) – the item IDs to look up.
        Returns
            the item indices. Unknown items will have negative indices.

    lookup_user(user)
        Look up the index for a user.

        Parameters
            user – the user ID to look up.
        Returns
            the user index.
    property n_features
        The number of features.

    property n_items
        The number of items.

    property n_users
        The number of users.
    score(user, items)
        Score a set of items for a user. User and item parameters must be indices into the matrices.

        Parameters
            user – the user index.
            items (array-like) – the item indices to score.
        Returns
            the scores for the items.
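As a quick illustration of this low-level interface, here is a minimal sketch (not from the LensKit documentation) that fits one of the MF algorithms documented below on a small hypothetical ratings frame, using LensKit's usual user/item/rating column layout, and then scores items through the index-based API:

    import pandas as pd
    from lenskit.algorithms import als

    # Hypothetical ratings frame in the user/item/rating column layout
    # that LensKit algorithms expect.
    ratings = pd.DataFrame({
        'user':   [1, 1, 2, 2, 3, 3],
        'item':   [10, 20, 10, 30, 20, 30],
        'rating': [4.0, 3.5, 5.0, 2.0, 4.5, 3.0],
    })

    # ImplicitMF (documented below) is an MFPredictor subclass; any other
    # subclass exposes the same low-level interface.
    algo = als.ImplicitMF(5, iterations=5)
    algo.fit(ratings)

    print(algo.n_users, algo.n_items, algo.n_features)

    # Translate external IDs into matrix indices, then score with the
    # index-based API; unknown items come back with negative indices.
    uidx = algo.lookup_user(2)
    iidx = algo.lookup_items([10, 20, 999])
    known = iidx[iidx >= 0]
    print(algo.score(uidx, known))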
class lenskit.algorithms.mf_common.BiasMFPredictor

    Common model for biased matrix factorization.

    user_index_
        Users in the model (length \(m\)).

    item_index_
        Items in the model (length \(n\)).

    global_bias_
        The global bias term (double).

    user_bias_
        The user bias terms.

    item_bias_
        The item bias terms.

    user_features_
        The \(m \times k\) user-feature matrix.

    item_features_
        The \(n \times k\) item-feature matrix.
    score(user, items, raw=False)
        Score a set of items for a user. User and item parameters must be indices into the matrices.

        Parameters
            user – the user index.
            items (array-like) – the item indices to score.
            raw (bool) – if True, return the raw factor scores without applying the bias terms.
        Returns
            the scores for the items.
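The bias terms and feature matrices above are enough to reconstruct a biased score by hand. The following sketch (toy data, using the als.BiasedMF algorithm documented below) checks that the decomposition into global bias, user bias, item bias, and the feature dot product should match what score() returns:

    import pandas as pd
    from lenskit.algorithms import als

    # Hypothetical explicit-feedback ratings in the usual LensKit layout.
    ratings = pd.DataFrame({
        'user':   [1, 1, 2, 2, 3, 3],
        'item':   [10, 20, 10, 30, 20, 30],
        'rating': [4.0, 3.5, 5.0, 2.0, 4.5, 3.0],
    })

    # BiasedMF (documented below) is a BiasMFPredictor.
    algo = als.BiasedMF(5, iterations=10)
    algo.fit(ratings)

    uidx = algo.lookup_user(2)
    iidx = algo.lookup_items([10])

    # A biased score decomposes into the stored parameters: global bias
    # plus user bias plus item bias plus the user/item feature dot product.
    manual = (algo.global_bias_
              + algo.user_bias_[uidx]
              + algo.item_bias_[iidx[0]]
              + algo.user_features_[uidx] @ algo.item_features_[iidx[0]])

    print(manual)
    print(algo.score(uidx, iidx))            # should match the manual value
    print(algo.score(uidx, iidx, raw=True))  # factor term only, no bias terms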
Alternating Least Squares
LensKit provides alternating least squares implementations of matrix factorization suitable for explicit feedback data. These implementations are parallelized with Numba, and perform best with the MKL from Conda.
class lenskit.algorithms.als.BiasedMF(features, *, iterations=20, reg=0.1, damping=5, bias=True, method='cd', rand=numpy.random.randn, progress=None)

    Bases: lenskit.algorithms.mf_common.BiasMFPredictor

    Biased matrix factorization trained with alternating least squares [ZWSP2008]. This is a prediction-oriented algorithm suitable for explicit feedback data.

    It provides two solvers for the optimization step (the method parameter):

    'cd' (the default)
        Coordinate descent [TPT2011], adapted for a separately-trained bias model and to use weighted regularization as in the original ALS paper [ZWSP2008].

    'lu'
        A direct implementation of the original ALS concept [ZWSP2008] using LU-decomposition to solve for the optimized matrices.

    See the base class BiasMFPredictor for documentation on the estimated parameters you can extract from a trained model.

    ZWSP2008
        Yunhong Zhou, Dennis Wilkinson, Robert Schreiber, and Rong Pan. 2008. Large-Scale Parallel Collaborative Filtering for the Netflix Prize. In _Algorithmic Aspects in Information and Management_, LNCS 5034, 337–348. DOI 10.1007/978-3-540-68880-8_32.
    TPT2011
        Gábor Takács, István Pilászy, and Domonkos Tikk. 2011. Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering.

    Parameters
        features (int) – the number of features to train
        iterations (int) – the number of iterations to train
        reg (float) – the regularization factor; can also be a tuple (ureg, ireg) to specify separate user and item regularization terms.
        damping (float) – damping factor for the underlying mean
        bias (bool or Bias) – the bias model. If True, fits a Bias with damping damping.
        method (str) – the solver to use (see above).
        rng (function) – RNG function compatible with numpy.random.randn for initializing matrices.
        progress – a tqdm.tqdm()-compatible progress bar function.
    fit(ratings, **kwargs)
        Run ALS to train a model.

        Parameters
            ratings – the ratings data frame.
        Returns
            The algorithm (for chaining).
    predict_for_user(user, items, ratings=None)
        Compute predictions for a user and items.

        Parameters
            user – the user ID
            items (array-like) – the items to predict
            ratings (pandas.Series) – the user’s ratings (indexed by item ID); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
        Returns
            scores for the items, indexed by item ID.
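For example, the following sketch (with a toy ratings frame standing in for real data) trains a BiasedMF model with the 'lu' solver described above and requests predictions by user and item ID:

    import pandas as pd
    from lenskit.algorithms import als

    # Hypothetical ratings; in practice this would be something like a
    # MovieLens ratings frame with user, item, and rating columns.
    ratings = pd.DataFrame({
        'user':   [1, 1, 1, 2, 2, 3, 3],
        'item':   [10, 20, 30, 10, 40, 20, 40],
        'rating': [4.0, 3.0, 5.0, 4.5, 2.0, 3.5, 4.0],
    })

    # 'lu' selects the direct LU-based solver; 'cd' is the default.
    algo = als.BiasedMF(10, iterations=10, reg=0.1, method='lu')
    algo.fit(ratings)

    # predict_for_user works with external user and item IDs and returns a
    # pandas.Series indexed by item ID.
    preds = algo.predict_for_user(3, [10, 30, 40])
    print(preds)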
class lenskit.algorithms.als.ImplicitMF(features, *, iterations=20, reg=0.1, weight=40, method='cg', rand=numpy.random.randn, progress=None)

    Bases: lenskit.algorithms.mf_common.MFPredictor

    Implicit matrix factorization trained with alternating least squares [HKV2008]. This algorithm outputs ‘predictions’, but they are not on a meaningful scale. If its input data contains rating values, these will be used as the ‘confidence’ values; otherwise, confidence will be 1 for every rated item.

    It provides two solvers for the optimization step (the method parameter):

    'cg' (the default)
        Conjugate gradient method [TPT2011].

    'lu'
        A direct implementation of the original implicit-feedback ALS concept [HKV2008] using LU-decomposition to solve for the optimized matrices.

    See the base class MFPredictor for documentation on the estimated parameters you can extract from a trained model.

    HKV2008
        Y. Hu, Y. Koren, and C. Volinsky. 2008. Collaborative Filtering for Implicit Feedback Datasets. In _Proceedings of the 2008 Eighth IEEE International Conference on Data Mining_, 263–272. DOI 10.1109/ICDM.2008.22.

    TPT2011
        Gábor Takács, István Pilászy, and Domonkos Tikk. 2011. Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering.
    Parameters
        features (int) – the number of features to train
        iterations (int) – the number of iterations to train
        reg (float) – the regularization factor
        weight (float) – the confidence weight applied to observed interactions (see [HKV2008]).
        method (str) – the solver to use (see above).
        rand (function) – RNG function compatible with numpy.random.randn for initializing matrices.
        progress – a tqdm.tqdm()-compatible progress bar function.
    fit(ratings, **kwargs)
        Train a model using the specified ratings (or similar) data.

        Parameters
            ratings (pandas.DataFrame) – The ratings data.
            kwargs – Additional training data the algorithm may require. Algorithms should avoid using the same keyword arguments for different purposes, so that they can be more easily hybridized.
        Returns
            The algorithm object.
    predict_for_user(user, items, ratings=None)
        Compute predictions for a user and items.

        Parameters
            user – the user ID
            items (array-like) – the items to predict
            ratings (pandas.Series) – the user’s ratings (indexed by item ID); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
        Returns
            scores for the items, indexed by item ID.
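For example, this sketch (toy interaction data, not from the LensKit docs) fits ImplicitMF on a frame with no rating column, so every observed pair gets unit confidence per the class description above, and then ranks a few candidate items:

    import pandas as pd
    from lenskit.algorithms import als

    # Implicit-feedback data: just user/item pairs, no rating column.
    clicks = pd.DataFrame({
        'user': [1, 1, 2, 2, 3, 3, 3],
        'item': [10, 20, 10, 30, 20, 30, 40],
    })

    algo = als.ImplicitMF(16, iterations=10, reg=0.1, weight=40, method='cg')
    algo.fit(clicks)

    # The resulting scores are only meaningful for ranking items, not as
    # rating predictions.
    scores = algo.predict_for_user(1, [10, 30, 40])
    print(scores.sort_values(ascending=False))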
FunkSVD
FunkSVD is an SVD-like matrix factorization that uses stochastic gradient descent, configured much like coordinate descent, to train the user-feature and item-feature matrices.
class lenskit.algorithms.funksvd.FunkSVD(features, iterations=100, *, lrate=0.001, reg=0.015, damping=5, range=None, bias=True)

    Bases: lenskit.algorithms.mf_common.BiasMFPredictor

    Algorithm class implementing FunkSVD matrix factorization. FunkSVD is a regularized biased matrix factorization technique trained with featurewise stochastic gradient descent.

    See the base class BiasMFPredictor for documentation on the estimated parameters you can extract from a trained model.

    Parameters
        features (int) – the number of features to train
        iterations (int) – the number of iterations to train each feature
        lrate (double) – the learning rate
        reg (double) – the regularization factor
        damping (double) – damping factor for the underlying mean
        bias (Predictor) – the underlying bias model to fit. If True, then a basic.Bias model is fit with damping.
        range (tuple) – the (min, max) rating values to clamp ratings, or None to leave predictions unclamped.
    fit(ratings, **kwargs)
        Train a FunkSVD model.

        Parameters
            ratings – the ratings data frame.
    predict_for_user(user, items, ratings=None)
        Compute predictions for a user and items.

        Parameters
            user – the user ID
            items (array-like) – the items to predict
            ratings (pandas.Series) – the user’s ratings (indexed by item ID); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
        Returns
            scores for the items, indexed by item ID.
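A minimal usage sketch with toy data, using the range parameter documented above to give the (min, max) rating values for clamping:

    import pandas as pd
    from lenskit.algorithms import funksvd

    # Hypothetical explicit ratings in the usual LensKit layout.
    ratings = pd.DataFrame({
        'user':   [1, 1, 2, 2, 3, 3],
        'item':   [10, 20, 10, 30, 20, 30],
        'rating': [4.0, 3.0, 5.0, 2.5, 3.5, 4.5],
    })

    # range=(1, 5) supplies the rating bounds used for clamping; pass None
    # to leave predictions unclamped.
    algo = funksvd.FunkSVD(20, iterations=50, lrate=0.001, reg=0.015, range=(1, 5))
    algo.fit(ratings)

    print(algo.predict_for_user(2, [20, 30]))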