lenskit.topn

Classes

RecListAnalysis([group_cols, n_jobs])

Compute one or more top-N metrics over recommendation lists.

class lenskit.topn.RecListAnalysis(group_cols=None, n_jobs=None)

Bases: object

Compute one or more top-N metrics over recommendation lists.

The analysis groups the recommendations by the specified columns and computes each metric over each group. The default set of grouping columns is all columns except the following:

item
rank
score
rating

The truth frame, truth, is expected to match over (a subset of) the grouping columns, and contain at least an item column. If it also contains a rating column, that is used as the users’ rating for metrics that require it; otherwise, a rating value of 1 is assumed.

Parameters: group_cols (list) – The columns to group by, or None to use the default.
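The default grouping rule described above can be illustrated with a small sketch. It only mirrors the stated rule (every column except item, rank, score, and rating); the helper name `default_group_cols` and the example column names are hypothetical and are not part of lenskit.

```python
# Columns excluded from the default grouping, as listed above.
NON_GROUP_COLUMNS = {"item", "rank", "score", "rating"}


def default_group_cols(columns):
    """Return the columns grouped on by default (hypothetical helper)."""
    return [c for c in columns if c not in NON_GROUP_COLUMNS]


# Hypothetical columns of a recommendation frame.
rec_columns = ["algorithm", "user", "item", "score", "rank"]
group_cols = default_group_cols(rec_columns)
print(group_cols)  # ['algorithm', 'user']
```

Under this rule, a frame holding recommendations from several algorithms would be grouped per algorithm and per user unless group_cols is given explicitly.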

add_metric(metric, *, name=None, **kwargs)

Add a metric to the analysis.

A metric is a function of two arguments: a single group of the recommendation frame, and the corresponding truth frame. The truth frame will be indexed by item ID. The recommendation frame keeps the order in which its rows appear in the input data. Many metrics are defined in lenskit.metrics.topn; they are re-exported from lenskit.topn for convenience.

Parameters:

metric – The metric to compute.
name – The name to assign the metric. If not provided, the function name is used.
**kwargs – Additional arguments to pass to the metric.
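As an illustration of the metric contract described above, here is a minimal sketch of a custom metric written against plain pandas. The function `hit_fraction`, its `k` parameter, and the data are all made up for this example; the sketch assumes only what the text states, namely that the truth frame is indexed by item ID and that the recommendation group arrives in list order.

```python
import pandas as pd


def hit_fraction(recs, truth, k=None):
    """Fraction of recommended items that appear in the truth frame.

    Hypothetical metric: `recs` is one group of the recommendation frame,
    `truth` is the corresponding truth frame, indexed by item ID.
    """
    items = recs["item"]
    if k is not None:
        items = items.iloc[:k]  # rows are in list order, so this keeps the first k
    if len(items) == 0:
        return 0.0
    return float(items.isin(truth.index).mean())


# Made-up data for a single user.
recs = pd.DataFrame({"item": [20, 10, 40, 30], "rank": [1, 2, 3, 4]})
truth = pd.DataFrame(
    {"item": [20, 40, 50], "rating": [4.0, 5.0, 3.0]}
).set_index("item")

full = hit_fraction(recs, truth)       # 2 of 4 items are in truth
top1 = hit_fraction(recs, truth, k=1)  # the first item is in truth
```

Given the signature above, such a function would be registered with something like `rla.add_metric(hit_fraction, name="hit1", k=1)`, where `rla` is a RecListAnalysis instance and `k` is forwarded through **kwargs.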

compute(recs, truth, *, include_missing=False)

Run the analysis. Neither data frame should be meaningfully indexed.

Parameters:

recs (pandas.DataFrame) – A data frame of recommendations.
truth (pandas.DataFrame) – A data frame of ground truth (test) data.
include_missing (bool) – If True, include users that appear in truth but are missing from recs. Matches are done via group columns that appear in both recs and truth.

Returns:

The results of the analysis.

Return type:

pandas.DataFrame
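To make the group-and-match behavior concrete, the following sketch emulates it with plain pandas for a single grouping column. It is a simplified illustration under stated assumptions, not the library's implementation: the toy metric `count_hits`, the data, and the layout of the result frame are all assumptions made for the example.

```python
import pandas as pd


def count_hits(recs, truth):
    """Toy metric: number of recommended items present in the truth index."""
    return int(recs["item"].isin(truth.index).sum())


# Made-up recommendation and truth frames, neither meaningfully indexed.
recs = pd.DataFrame({
    "user": [1, 1, 2, 2],
    "item": [10, 20, 10, 30],
    "rank": [1, 2, 1, 2],
})
truth = pd.DataFrame({
    "user": [1, 2, 2, 3],
    "item": [20, 10, 30, 40],
})

rows = []
for user, group in recs.groupby("user"):
    # Match truth rows on the shared group column, then index by item ID.
    user_truth = truth[truth["user"] == user].set_index("item")
    rows.append({"user": user, "count_hits": count_hits(group, user_truth)})

# One row per group; user 3 has truth but no recommendations, so it is absent
# here, which loosely corresponds to include_missing=False.
results = pd.DataFrame(rows).set_index("user")
```

With the real class, the equivalent steps would be creating a RecListAnalysis, registering metrics with add_metric, and calling compute(recs, truth).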