lenskit.metrics.RBP

class lenskit.metrics.RBP(k=None, *, patience=0.85, normalize=False)

Bases: ListMetric, RankingMetricBase

Evaluate recommendations with rank-biased precision [MZ08] with a patience parameter \(\gamma\).

If \(r_{ui} \in \{0, 1\}\) are binary implicit ratings, this is computed by:

\[\begin{align*} \operatorname{RBP}_\gamma(L, u) & =(1 - \gamma) \sum_i r_{ui} \gamma^i \end{align*}\]
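As a rough illustration of the sum above, the following sketch computes RBP for a list of binary relevance flags in plain Python. The function name `rbp_score` is hypothetical and not part of LensKit, and the sketch assumes zero-based ranks (the first item gets weight \(\gamma^0 = 1\)).

```python
def rbp_score(relevant, patience=0.85):
    """Rank-biased precision for a ranked list of binary relevance flags.

    Hypothetical helper, not the LensKit implementation.
    Assumption: ranks are zero-based, so position i has weight patience**i.
    """
    if not 0 <= patience < 1:
        raise ValueError("patience must be in [0, 1)")
    # Sum the discounts of the positions that hold relevant items.
    total = sum(patience**i for i, rel in enumerate(relevant) if rel)
    return (1 - patience) * total


# Relevant items at positions 0 and 2 of a five-item list.
score = rbp_score([1, 0, 1, 0, 0], patience=0.5)
print(score)
```

With a patience of 0.5 the two hits contribute weights 1 and 0.25, so the example evaluates to \(0.5 \times 1.25\).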

The original RBP metric depends on the idea that the rank-biased sum of binary relevance scores in an infinitely-long, perfectly-precise list is \(1/(1 - \gamma)\). However, in recommender evaluation, we usually have a small test set, so the maximum achievable RBP is significantly less, and is a function of the number of test items. With normalize=True, the RBP metric will be normalized by the maximum achievable with the provided test data.
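One way to picture the normalization is the sketch below, which divides the raw score by the score of an ideal list that places every test item first. The names `max_rbp` and `normalized_rbp` are hypothetical, and two details are assumptions rather than documented behavior: ranks are zero-based, and the ideal list is capped at the length of the recommendation list.

```python
def max_rbp(n_test, list_len, patience=0.85):
    """Best achievable RBP when all test items are ranked first.

    Hypothetical helper. Assumption: at most `list_len` items can be hit.
    """
    n = min(n_test, list_len)
    return (1 - patience) * sum(patience**i for i in range(n))


def normalized_rbp(relevant, n_test, patience=0.85):
    """Raw RBP divided by the best achievable RBP for this test set."""
    if n_test <= 0:
        raise ValueError("normalization needs at least one test item")
    raw = (1 - patience) * sum(
        patience**i for i, rel in enumerate(relevant) if rel
    )
    return raw / max_rbp(n_test, len(relevant), patience)


# Two test items: one list finds them at positions 0 and 2,
# the other places both at the top.
partial = normalized_rbp([1, 0, 1, 0, 0], n_test=2, patience=0.5)
perfect = normalized_rbp([1, 1, 0, 0, 0], n_test=2, patience=0.5)
print(partial, perfect)
```

Under these assumptions a list that ranks all test items first scores 1, which is the same property nDCG has.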

Moffat and Zobel [MZ08] provide an extended discussion on choosing the patience parameter \(\gamma\). This metric defaults to \(\gamma=0.85\), to provide a relatively shallow curve and reward good items on the first few pages of results (in a 10-per-page setting). Recommender systems data has no pooling, so the variance of this estimator may be high, as they note in the paper; however, RBP with high patience should be no worse than nDCG (and perhaps even better) in this regard.
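To get a feel for how shallow the default curve is, the sketch below computes the share of total RBP weight that falls on the first few positions. It relies only on the geometric series \((1 - \gamma) \sum_{i<k} \gamma^i = 1 - \gamma^k\); the function name `weight_within` is hypothetical.

```python
def weight_within(depth, patience=0.85):
    """Fraction of the total RBP weight carried by the first `depth` ranks.

    Hypothetical helper based on the closed form 1 - patience**depth.
    """
    return 1 - patience**depth


# Weight covered after one, two, and three pages of ten results each.
coverage = {depth: weight_within(depth) for depth in (10, 20, 30)}
for depth, share in coverage.items():
    print(f"top {depth}: {share:.3f}")
```

With \(\gamma = 0.85\), roughly 80% of the weight sits on the first ten positions, and most of the remainder on the next two pages.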

Warning

The additional normalization is experimental, and should not yet be used for published research results.

Parameters:
  • k (int | None) – The maximum recommendation list length.

  • patience (float) – The patience parameter \(\gamma\), the probability that the user continues browsing at each point. The default is 0.85.

  • normalize (bool) – Whether to normalize the RBP scores; if True, divides the RBP score by the maximum achievable with the test data (as in nDCG).

Stability:
Caller (see Stability Levels).
__init__(k=None, *, patience=0.85, normalize=False)

Methods

  • __init__([k, patience, normalize])

  • measure_list(recs, test) – Compute the metric value for a single result list.

  • truncate(items) – Truncate an item list if it is longer than k.

Attributes

  • default – The default value to infer when computing statistics over missing values.

  • k – The maximum length of rankings to consider.

  • label – The metric's default label in output.

  • patience

  • normalize

property label

The metric’s default label in output.

The base implementation returns the class name by default.

measure_list(recs, test)

Compute the metric value for a single result list.

Individual metric classes need to implement this method.

Return type:
float