Does only the number of questions in the ground truth affect the size of R&R training data?

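I created training data for R&R from a ground truth file and noticed that each question in the ground truth produced 10 records of training data, regardless of the number of candidate answers in the ground truth.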

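Does only the number of questions in the ground truth affect the size of the R&R training data? I ask because there is a size limit on training data.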

noticed that each question of ground truth made 10 records of training data without depending on the number of candidate answers of ground truth

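If you are using the python train.py utility to prepare training data for R&R, the number of candidate answers per question is determined by the optional -r (--rows) argument, which specifies the number of answer results the query returns. The default is 10, which is what you are seeing.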

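Similarly, if you generate the training data by calling the /fcselect API directly, you can use the optional rows parameter to specify the number of candidate answers to generate features for. Again, the default is 10.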
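As a rough sketch of the direct-call case, the snippet below builds only the query string with an explicit rows value. The helper name fcselect_query_string and the use of a q parameter for the question text are assumptions made for illustration. The service URL, credentials, and any other parameters the endpoint needs are deliberately left out, since they are not given here.

```python
from urllib.parse import urlencode


def fcselect_query_string(question, rows=10):
    """Build a query string that overrides the default number of rows.

    Hypothetical helper: only `rows` comes from the answer above; `q` is
    assumed to carry the question text, as in a regular Solr query.
    """
    if rows < 1:
        raise ValueError("rows must be a positive integer")
    # urlencode escapes the question text and keeps the given key order.
    return urlencode({"q": question, "rows": rows})


# Ask for 30 candidate answers per question instead of the default 10.
print(fcselect_query_string("what is ground truth", rows=30))
```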

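If you can afford it, it is generally better to override this default and try higher values, since that gives the ranker more room to learn and re-rank answers. The RnR web tooling uses a default of 30.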

Only the number of questions of ground truth affects the size of R&R training data?

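No, the size of the training data is proportional to all of the following: (1) the number of queries, (2) the number of candidate answers per query, and (3) the number of features (columns). The number of features is itself proportional to the number of fields in the schema that are marked for feature generation (i.e. in the default schema, those marked as type watson_text_en).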
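To check a planned configuration against the size limit, the proportionality can be turned into a quick back-of-the-envelope estimate. This is only a sketch: estimate_training_size is a hypothetical helper, the feature count has to be read from your own generated data, and the result is an upper bound because a query can return fewer candidates than the requested number of rows.

```python
def estimate_training_size(num_queries, rows_per_query=10, num_features=1):
    """Return (records, feature_values) as an upper-bound estimate.

    records        = queries x candidate answers per query
    feature_values = records x feature columns
    """
    records = num_queries * rows_per_query
    feature_values = records * num_features
    return records, feature_values


# 100 questions with the default of 10 rows give at most 1000 records.
print(estimate_training_size(100))
# Raising rows to 30 triples the record count for the same questions.
print(estimate_training_size(100, rows_per_query=30))
```

So with the number of questions fixed, the rows setting and the number of feature-generating fields in the schema are the two remaining levers on training data size.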