lenskit.metrics.topn#
Top-N evaluation metrics.
Functions
- bulk_impl – Decorator to register a bulk implementation for a metric.
- dcg – Compute the unnormalized discounted cumulative gain [JarvelinKekalainen02].
- hit – Compute whether or not a list is a hit; any list with at least one relevant item in the first \(k\) positions (\(L_{\le k} \cap I_u^{\mathrm{test}} \ne \emptyset\)) is scored as 1, and lists with no relevant items as 0.
- ndcg – Compute the normalized discounted cumulative gain [JarvelinKekalainen02].
- precision – Compute recommendation precision.
- rbp – Evaluate recommendations with rank-biased precision [MZ08] with a patience parameter \(\gamma\).
- recall – Compute recommendation recall.
- recip_rank – Compute the reciprocal rank [KV97] of the first relevant item in a list of recommendations.
- lenskit.metrics.topn.bulk_impl(metric)#
Decorator to register a bulk implementation for a metric.
- lenskit.metrics.topn.precision(recs, truth, k=None)#
Compute recommendation precision. This is computed as:
\[\frac{|L \cap I_u^{\mathrm{test}}|}{|L|}\]
In the uncommon case that `k` is specified and `len(recs) < k`, this metric uses `len(recs)` as the denominator.
This metric has a bulk implementation.
- Parameters:
recs (DataFrame) – The recommendation list. This is expected to have a column `item` with the recommended item IDs; all other columns are ignored.
truth (DataFrame) – The user’s test data. It is expected to be indexed by item ID. If it has a `rating` column, that is used as the item gains; otherwise, each item has gain 1. All other columns are ignored.
k (int | None) – The maximum list length to consider.
- Return type:
float | None
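As an illustration of the formula above (not the lenskit API, which operates on pandas data frames), precision can be sketched on plain Python lists:

```python
# Sketch of the precision formula on plain Python lists; the real
# lenskit function takes pandas DataFrames for recs and truth.
def precision(recs, truth, k=None):
    if k is not None:
        recs = recs[:k]  # only consider the first k recommendations
    if not recs:
        return None  # nothing to score
    hits = sum(1 for item in recs if item in truth)
    # when len(recs) < k, the shorter length is the denominator
    return hits / len(recs)

print(precision([10, 20, 30, 40], {20, 40, 99}))  # 0.5
```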
- lenskit.metrics.topn.recall(recs, truth, k=None)#
Compute recommendation recall. This is computed as:
\[\frac{|L \cap I_u^{\mathrm{test}}|}{\operatorname{min}\{|I_u^{\mathrm{test}}|, k\}}\]
This metric has a bulk implementation.
- Parameters:
recs (DataFrame) – The recommendation list. This is expected to have a column `item` with the recommended item IDs; all other columns are ignored.
truth (DataFrame) – The user’s test data. It is expected to be indexed by item ID. If it has a `rating` column, that is used as the item gains; otherwise, each item has gain 1. All other columns are ignored.
k (int | None) – The maximum list length to consider.
- Return type:
float | None
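The formula can be illustrated on plain Python lists (a sketch, not the lenskit API, which operates on pandas data frames):

```python
# Sketch of the recall formula; note the truncated denominator
# min{|I_u^test|, k} when k is given.
def recall(recs, truth, k=None):
    if k is not None:
        recs = recs[:k]
        denom = min(len(truth), k)
    else:
        denom = len(truth)
    hits = sum(1 for item in recs if item in truth)
    return hits / denom

print(recall([1, 2, 3], {2, 3, 5, 7}, k=3))  # 2/3
```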
- lenskit.metrics.topn.hit(recs, truth, k=None)#
Compute whether or not a list is a hit; any list with at least one relevant item in the first \(k\) positions (\(L_{\le k} \cap I_u^{\mathrm{test}} \ne \emptyset\)) is scored as 1, and lists with no relevant items as 0. When averaged over the recommendation lists, this computes the hit rate [DK04].
This metric has a bulk implementation.
- Parameters:
recs (DataFrame) – The recommendation list. This is expected to have a column `item` with the recommended item IDs; all other columns are ignored.
truth (DataFrame) – The user’s test data. It is expected to be indexed by item ID. If it has a `rating` column, that is used as the item gains; otherwise, each item has gain 1. All other columns are ignored.
k (int | None) – The maximum list length to consider.
- Return type:
float | None
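The hit criterion is simple enough to sketch directly on plain Python lists (again, the real function takes pandas data frames):

```python
# Sketch of the hit metric: 1 if any of the first k recommended
# items appears in the test set, else 0.
def hit(recs, truth, k=None):
    if k is not None:
        recs = recs[:k]
    return 1 if any(item in truth for item in recs) else 0

print(hit([5, 9, 2], {2, 7}, k=2))  # 0: the relevant item is at rank 3
print(hit([5, 9, 2], {2, 7}, k=3))  # 1
```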
- lenskit.metrics.topn.recip_rank(recs, truth, k=None)#
Compute the reciprocal rank [KV97] of the first relevant item in a list of recommendations. Let \(\kappa\) denote the 1-based rank of the first relevant item in \(L\), with \(\kappa=\infty\) if none of the first \(k\) items in \(L\) are relevant; then the reciprocal rank is \(1 / \kappa\). If no elements are relevant, the reciprocal rank is therefore 0. Deshpande and Karypis [DK04] call this the “reciprocal hit rate”.
This metric has a bulk equivalent.
- Parameters:
recs (DataFrame) – The recommendation list. This is expected to have a column `item` with the recommended item IDs; all other columns are ignored.
truth (DataFrame) – The user’s test data. It is expected to be indexed by item ID. If it has a `rating` column, that is used as the item gains; otherwise, each item has gain 1. All other columns are ignored.
k (int | None) – The maximum list length to consider.
- Return type:
float | None
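The definition of \(\kappa\) above can be sketched on plain Python lists (not the lenskit API, which operates on pandas data frames):

```python
# Sketch of reciprocal rank: 1/kappa for the 1-based rank kappa of
# the first relevant item, 0 if no relevant item appears.
def recip_rank(recs, truth, k=None):
    if k is not None:
        recs = recs[:k]
    for rank, item in enumerate(recs, start=1):
        if item in truth:
            return 1.0 / rank
    return 0.0  # kappa = infinity: no relevant item in range

print(recip_rank([5, 2, 3], {2, 3}))  # 0.5
```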
- lenskit.metrics.topn.dcg(recs, truth, discount=<ufunc 'log2'>)#
Compute the unnormalized discounted cumulative gain [JarvelinKekalainen02].
Discounted cumulative gain is computed as:
\[\begin{align*} \mathrm{DCG}(L,u) & = \sum_{i=1}^{|L|} \frac{r_{ui}}{d(i)} \end{align*}\]
Unrated items are assumed to have a utility of 0; if no rating values are provided in the truth frame, item ratings are assumed to be 1.
- Parameters:
recs (DataFrame) – The recommendation list. This is expected to have a column `item` with the recommended item IDs; all other columns are ignored.
truth (DataFrame) – The user’s test data. It is expected to be indexed by item ID. If it has a `rating` column, that is used as the item gains; otherwise, each item has gain 1. All other columns are ignored.
discount – The rank discount function. Each item’s score will be divided by the discount of its rank, if the discount is greater than 1.
- Return type:
float | None
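The DCG sum, including the rule that the discount is only applied once it exceeds 1, can be sketched over a plain list of item gains (the real function takes pandas data frames):

```python
import math

# Sketch of unnormalized DCG over a list of gains (r_ui values in
# list order), with the default log2 rank discount.
def dcg(gains, discount=math.log2):
    total = 0.0
    for i, gain in enumerate(gains, start=1):
        d = discount(i)
        # discount applied only when greater than 1, so the first
        # ranks (log2(1) = 0, log2(2) = 1) contribute full gain
        total += gain / d if d > 1 else gain
    return total

print(dcg([3.0, 2.0, 0.0, 1.0]))  # 3 + 2 + 0 + 1/2 = 5.5
```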
- lenskit.metrics.topn.ndcg(recs, truth, discount=<ufunc 'log2'>, k=None)#
Compute the normalized discounted cumulative gain [JarvelinKekalainen02].
Discounted cumulative gain is computed as:
\[\begin{align*} \mathrm{DCG}(L,u) & = \sum_{i=1}^{|L|} \frac{r_{ui}}{d(i)} \end{align*}\]
Unrated items are assumed to have a utility of 0; if no rating values are provided in the truth frame, item ratings are assumed to be 1.
This is then normalized as follows:
\[\begin{align*} \mathrm{nDCG}(L, u) & = \frac{\mathrm{DCG}(L,u)}{\mathrm{DCG}(L_{\mathrm{ideal}}, u)} \end{align*}\]
This metric has a bulk implementation.
- Parameters:
recs (DataFrame) – The recommendation list. This is expected to have a column `item` with the recommended item IDs; all other columns are ignored.
truth (DataFrame) – The user’s test data. It is expected to be indexed by item ID. If it has a `rating` column, that is used as the item gains; otherwise, each item has gain 1. All other columns are ignored.
discount (Callable) – The rank discount function. Each item’s score will be divided by the discount of its rank, if the discount is greater than 1.
k (int | None) – The maximum list length.
- Return type:
float | None
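The normalization can be sketched over plain lists of gains: the list's DCG is divided by the DCG of the user's test-item gains in ideal (descending) order. This is a sketch of the formula, not the lenskit API; in particular, truncating the ideal list at `k` is an assumption here.

```python
import math

# Sketch of nDCG: DCG of the list divided by the DCG of an ideal
# ordering of the test-item gains (assumed truncated at k).
def dcg(gains, discount=math.log2):
    return sum(g / discount(i) if discount(i) > 1 else g
               for i, g in enumerate(gains, start=1))

def ndcg(gains, test_gains, discount=math.log2, k=None):
    if k is not None:
        gains = gains[:k]
    ideal = sorted(test_gains, reverse=True)
    if k is not None:
        ideal = ideal[:k]
    return dcg(gains, discount) / dcg(ideal, discount)

# a perfectly-ordered list scores 1.0
print(ndcg([3.0, 2.0, 1.0], [3.0, 2.0, 1.0]))  # 1.0
```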
- lenskit.metrics.topn.rbp(recs, truth, k=None, patience=0.5, normalize=False)#
Evaluate recommendations with rank-biased precision [MZ08] with a patience parameter \(\gamma\).
If \(r_{ui} \in \{0, 1\}\) is a binary relevance indicator, this is computed by:
\[\begin{align*} \operatorname{RBP}_\gamma(L, u) & = (1 - \gamma) \sum_i r_{ui} \gamma^{i-1} \end{align*}\]
The original RBP metric relies on the fact that the rank-biased sum of binary relevance scores in an infinitely-long, perfectly-precise list is \(1/(1 - \gamma)\). However, in recommender evaluation, we usually have a small test set, so the maximum achievable RBP is significantly less, and is a function of the number of test items. With `normalize=True`, the RBP metric will be normalized by the maximum achievable with the provided test data.
- Parameters:
recs (DataFrame) – The recommendation list. This is expected to have a column `item` with the recommended item IDs; all other columns are ignored.
truth (DataFrame) – The user’s test data. It is expected to be indexed by item ID. If it has a `rating` column, that is used as the item gains; otherwise, each item has gain 1. All other columns are ignored.
k (int | None) – The maximum recommendation list length.
patience (float) – The patience parameter \(\gamma\), the probability that the user continues browsing at each point.
normalize (bool) – Whether to normalize the RBP scores; if `True`, divides the RBP score by the maximum achievable with the test data (as in nDCG).
- Return type:
float | None
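The geometric weighting can be sketched on plain Python lists. This is an illustration of the formula, not the lenskit API; in particular, normalizing by the score of a list with all test items at the top is a plausible reading of the maximum achievable RBP, not necessarily lenskit's exact computation.

```python
# Sketch of rank-biased precision with patience gamma; the exponent
# i-1 makes an infinite perfect list sum to 1/(1 - gamma) before the
# (1 - gamma) factor, so its RBP is exactly 1.
def rbp(recs, truth, k=None, patience=0.5, normalize=False):
    if k is not None:
        recs = recs[:k]
    score = (1 - patience) * sum(
        patience ** (i - 1)
        for i, item in enumerate(recs, start=1)
        if item in truth)
    if normalize:
        # assumed maximum: every test item placed at the top ranks
        best = (1 - patience) * sum(
            patience ** (i - 1) for i in range(1, len(truth) + 1))
        score /= best
    return score

print(rbp([1, 2, 3], {1, 3}))  # 0.5 * (1 + 0.25) = 0.625
```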