Understanding Precision@K, AP@K, MAP@K

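I am currently evaluating a recommender system based on implicit feedback. I am a bit confused about the evaluation metrics for ranking tasks. Specifically, I want to evaluate it using both precision and recall.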

Precision@k has the advantage of not requiring any estimate of the size of the set of relevant documents, but the disadvantages that it is the least stable of the commonly used evaluation measures and that it does not average well, since the total number of relevant documents for a query has a strong influence on precision at k.

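I have noticed myself that it tends to be quite volatile, so I would like to average the results over multiple evaluation logs.

I was wondering: say I run an evaluation function which returns the following array: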

Numpy array containing precision@k scores for each user.

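Now I have an array holding all the precision@3 scores across my dataset.

If I take the mean of this array, and then average over, say, 20 different scores: is this equivalent to Mean Average Precision@K (MAP@K), or am I understanding this a little too literally?

I am writing a dissertation with an evaluation section, so the accuracy of the definitions is quite important to me.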

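There are two averages involved, which makes the concepts somewhat obscure, but they are pretty straightforward, at least in the recsys context. Let me clarify them: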

P@K

How many relevant items are present in the top-k recommendations of your system.

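For example, to calculate P@3: take the top 3 recommendations for a given user and check how many of them are good ones. That number divided by 3 gives you the P@3.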
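
A minimal sketch of that calculation, assuming the recommendations come as a ranked list and the "good" items as a set. The function name `precision_at_k` and the sample data are hypothetical, not taken from any library:

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Divide by k (not by len(top_k)), following the definition above.
    return hits / k


# Hypothetical data: ranked recommendations for one user and the items they liked.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "z"}

# "a" and "c" are among the top 3, so this is 2 divided by 3.
print(precision_at_k(recommended, relevant, 3))
```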

AP@K

The mean of P@i for i=1, ..., K.

For example, to calculate AP@3: sum P@1, P@2 and P@3, then divide that value by 3.

AP@K is typically calculated for one user.
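
The definition above can be sketched as follows. This is only an illustration under the same assumptions as before (a ranked list plus a set of relevant items); the names `precision_at_k` and `average_precision_at_k` are hypothetical:

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k


def average_precision_at_k(recommended, relevant, k):
    """Mean of P@1, ..., P@k for a single user, as defined above."""
    total = sum(precision_at_k(recommended, relevant, i) for i in range(1, k + 1))
    return total / k


# Hypothetical data for one user.
recommended = ["a", "b", "c"]
relevant = {"a", "c"}

# P@1 = 1/1, P@2 = 1/2, P@3 = 2/3, so AP@3 is their mean.
print(average_precision_at_k(recommended, relevant, 3))
```

Be aware that a common variant in the information-retrieval literature adds P@i only at the ranks i where the item is relevant, and normalizes by the number of relevant items instead of K. If you rely on a particular library, it is worth checking which definition it follows before citing the numbers.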

MAP@K

The mean of the AP@K for all the users.

For example, to calculate MAP@3: sum AP@3 over all users, then divide that value by the number of users.
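
Putting the pieces together, here is a rough sketch that assumes per-user recommendations and relevant items are stored in dictionaries keyed by user id. All names and data below are hypothetical:

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k


def average_precision_at_k(recommended, relevant, k):
    """Mean of P@1, ..., P@k for a single user."""
    return sum(precision_at_k(recommended, relevant, i) for i in range(1, k + 1)) / k


def mean_average_precision_at_k(recs_by_user, relevant_by_user, k):
    """Mean of AP@k over all users that received recommendations."""
    if not recs_by_user:
        raise ValueError("need at least one user")
    scores = [
        average_precision_at_k(recs, relevant_by_user.get(user, set()), k)
        for user, recs in recs_by_user.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical data for two users.
recs_by_user = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
relevant_by_user = {"u1": {"a", "c"}, "u2": {"z"}}

map_at_3 = mean_average_precision_at_k(recs_by_user, relevant_by_user, 3)
mean_p_at_3 = sum(
    precision_at_k(recs_by_user[u], relevant_by_user[u], 3) for u in recs_by_user
) / len(recs_by_user)
print(map_at_3, mean_p_at_3)
```

With this toy data MAP@3 works out to 5/12 while the plain mean of P@3 over the same users is 1/2, so averaging an array of P@K scores is not the same thing as MAP@K under these definitions.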

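If you are a programmer, you can check this code, which is the implementation of the functions apk and mapk of ml_metrics, a library maintained by the CTO of Kaggle.

Hope it helped!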
