Evaluating Top-N Rankings

The lenskit.metrics.ranking module contains the core top-N ranking accuracy metrics (including rank-oblivious list metrics like precision, recall, and hit rate).

Ranking metrics extend the RankingMetricBase base class in addition to ListMetric and/or GlobalMetric. They return a score given a recommendation list and a test rating list, both as item lists; most metrics require the recommendation item list to be ordered.

All LensKit ranking metrics take k as a constructor argument to control the length of the list that is considered; this allows multiple measurements (e.g. HR@5 and HR@10) to be computed from a single set of rankings.

Changed in version 2025.1: The top-N accuracy metric interface has changed to use item lists, and to be simpler to implement.
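A minimal sketch of this pattern follows. The item_ids= and ordered= keyword arguments to ItemList, and the measure_list() method on list metrics, are assumptions for illustration; consult the API reference for the exact signatures.

```python
from lenskit.data import ItemList
from lenskit.metrics.ranking import Hit

# Ordered recommendations for one user, and that user's held-out test
# items. (The item_ids= and ordered= keywords are assumptions about the
# ItemList constructor; see the ItemList docs for the real signature.)
recs = ItemList(item_ids=[10, 3, 7, 19, 24, 50, 42, 8, 31, 99], ordered=True)
test = ItemList(item_ids=[42, 55])

# Two measurements from the same ranking, differing only in k.
hr5 = Hit(k=5)
hr10 = Hit(k=10)

# measure_list() is the assumed per-list entry point for ListMetric.
print(hr5.measure_list(recs, test))   # 0.0 -- the only relevant item (42) sits at rank 7
print(hr10.measure_list(recs, test))  # 1.0 -- rank 7 is within the top 10
```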

Included Effectiveness Metrics

List and Set Metrics

These metrics just look at the recommendation list and do not consider the rank positions of items within it.

Hit

Compute whether or not a list is a hit: any list with at least one relevant item in the first \(k\) positions (\(L_{\le k} \cap I_u^{\mathrm{test}} \ne \emptyset\)) is scored as 1, and a list with no relevant items in the top \(k\) is scored as 0.

Precision

Compute recommendation precision.

Recall

Compute recommendation recall.
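
For reference, using the notation from Hit above (the top-\(k\) prefix \(L_{\le k}\) and the user's test items \(I_u^{\mathrm{test}}\)), precision and recall are conventionally defined as:

\[
\mathrm{Prec@}k = \frac{|L_{\le k} \cap I_u^{\mathrm{test}}|}{k},
\qquad
\mathrm{Recall@}k = \frac{|L_{\le k} \cap I_u^{\mathrm{test}}|}{|I_u^{\mathrm{test}}|}
\]

LensKit's implementations may differ in edge cases (e.g. lists shorter than \(k\)); see the class documentation for details.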

Ranked List Metrics

These metrics treat the recommendation list as a ranked list of items that may or may not be relevant; some also support different item utilities (e.g. ratings or graded relevance scores).

RecipRank

Compute the reciprocal rank [KV97] of the first relevant item in a list of recommendations.
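
That is, writing \(L_i\) for the item at rank \(i\):

\[
\mathrm{RR}(L) = \frac{1}{\min\{\, i \le k : L_i \in I_u^{\mathrm{test}} \,\}}
\]

with \(\mathrm{RR}(L) = 0\) when no relevant item appears in the top \(k\).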

RBP

Evaluate recommendations with rank-biased precision [MZ08] with a patience parameter \(\gamma\).
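
In the formulation of Moffat and Zobel [MZ08], with binary relevance \(\mathrm{rel}(L_i)\) and patience \(\gamma\), truncated here at depth \(k\), rank-biased precision is:

\[
\mathrm{RBP} = (1 - \gamma) \sum_{i=1}^{k} \gamma^{i-1}\, \mathrm{rel}(L_i)
\]

Larger values of \(\gamma\) model a more patient user who is more likely to examine items deeper in the list.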

NDCG

Compute the normalized discounted cumulative gain [JarvelinKekalainen02].
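
One common formulation, with \(r_i\) denoting the relevance (gain) of the item at rank \(i\), divides the discounted cumulative gain of the list by that of an ideally-ordered list; the exact gain and discount functions LensKit uses are documented on the class itself:

\[
\mathrm{DCG}(L) = \sum_{i=1}^{k} \frac{r_i}{\log_2 \max(i, 2)},
\qquad
\mathrm{nDCG}(L) = \frac{\mathrm{DCG}(L)}{\mathrm{DCG}(L^{\mathrm{ideal}})}
\]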