AWS Personalize: filtering all items already interacted with does not seem to persist
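We are using AWS Personalize to get a personalized ranking of the various items in our feed for a specific user.
We are also using a filter that looks like this:
EXCLUDE ItemID WHERE Interactions.event_type IN ("*")
The filter is taken from the AWS blog, which states: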
To remove all items that a user has previously interacted with, use the following filter expression:
EXCLUDE itemId WHERE INTERACTIONS.event_type in ("*")
I have been experimenting in the console at https://console.aws.amazon.com/personalize/home?region=us-east-1#arn:aws:personalize:us-east-1::dataset-group$/campaigns/campaignDetail/
I entered userId=5253ffbb-f5e3-4e71-9a33-91ee65365c7d
and a bunch of item IDs:
5829, 5480, 2275, 6706, 5438, 6444, 6444, 7461, 7599, 4384, 6747, 7499, 6491, 5453, 7605, 5985, 6663, 7174, 1094, 6474, 7357, 7220, 8370, 7445, 5721, 991, 5592, 9283, 7547, 8676, 8872, 8092, 9401, 8645, 2090, 7684, 3788, 5849, 6524, 8480, 7299, 5752, 8007, 9100, 7422, 8640, 7917, 9254, 10050, 9851, 1744, 4227, 6388, 9490, 6481, 5744, 6486, 9040, 4048, 8170, 9623, 7966, 8560, 5336, 3885, 4441, 10442, 6842, 4898, 567, 4214, 125, 9556, 10039, 5494, 9447, 10051, 8302, 9482, 6649, 9133, 4828, 8288, 62, 9680, 4792, 10785, 9727, 10777, 11366, 10252, 9728, 2450, 10463, 9578, 4246, 10154, 10793, 10299, 6733, 10597, vy7erddv, 9247, 9816, 8385, 9589, 10845, 10368, 11427, 11405, 10475, 11273, 11392, 11335, 5871, 10465, 10927, 9371, 9894, 10773, 10747, 11274, 11349, 10831, 9882, vaxq362m, m3g32ayv, 5wqa8r4v, km7kl7kv, 3wno92pm, 3m483l5v, pv9rallv, lmr4dn8v
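For reference, the same console test can be reproduced from code. This is only a minimal sketch using boto3's `personalize-runtime` client and its `get_personalized_ranking` call; the function names and the ARNs passed in are placeholders I made up, not values from this setup. The pasted list contains 6444 twice, so the sketch drops repeats before sending.

```python
def build_ranking_request(campaign_arn, filter_arn, user_id, item_ids):
    """Assemble keyword arguments for get_personalized_ranking."""
    # De-duplicate while preserving order (the pasted list repeats 6444).
    unique_ids = list(dict.fromkeys(item_ids))
    return {
        "campaignArn": campaign_arn,  # placeholder ARN supplied by the caller
        "filterArn": filter_arn,      # ARN of the EXCLUDE filter shown above
        "userId": user_id,
        "inputList": unique_ids,
    }


def fetch_ranking(request):
    """Call the campaign; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("personalize-runtime", region_name="us-east-1")
    response = client.get_personalized_ranking(**request)
    # Each entry in personalizedRanking carries the ranked item's ID.
    return [item["itemId"] for item in response["personalizedRanking"]]
```

Comparing the IDs returned by `fetch_ranking` against the input list shows which items the filter removed on a given call.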
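Now I record interactions between this user and some of the items and reload the recommendations in the console...
This seems to work as expected: items the user has already interacted with are filtered out of the list.
But to my surprise, the items do not stay filtered indefinitely. If I keep recording interactions with other items for this user, recommendations reloaded later may contain previously interacted items. Or, if enough time passes (say a day), all of the items seem to come back for this user!
I have no idea why this happens.
The interactions are tracked as: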
POST https://personalize-events.us-east-1.amazonaws.com/events
{
"eventList": [
{
"eventType": "list_view",
"ITEM_ID": "vaxq362m",
"properties": "{\"itemType\": \"artwork\", \"itemId\": \"vaxq362m\"}",
"sentAt": {{$timestamp}}
}
],
"sessionId": "xxx1234",
"trackingId": "<OUR_TRACKING_ID>",
"userId": "5253ffbb-f5e3-4e71-9a33-91ee65365c7d"
}
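The same event could also be sent through boto3 instead of a raw POST. This is a rough sketch, not the code we actually run: the function names are made up, and the tracking ID is whatever the event tracker issued. Note that boto3's `put_events` names the item field `itemId` and takes `sentAt` as a datetime.

```python
import json
from datetime import datetime, timezone


def build_put_events_request(tracking_id, user_id, session_id, item_id,
                             event_type="list_view", item_type="artwork"):
    """Assemble keyword arguments for put_events, mirroring the POST body above."""
    return {
        "trackingId": tracking_id,
        "userId": user_id,
        "sessionId": session_id,
        "eventList": [
            {
                "eventType": event_type,
                "itemId": item_id,
                # properties must be a JSON-encoded string, not a dict
                "properties": json.dumps({"itemType": item_type, "itemId": item_id}),
                "sentAt": datetime.now(timezone.utc),
            }
        ],
    }


def send_event(request):
    """Send the event; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("personalize-events", region_name="us-east-1")
    client.put_events(**request)
```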
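This seems to work, because:
- the response status is 200
- if I export the interactions dataset, the interactions show up in the CSV
- the items do get removed from the returned recommendations for a short time
Filtering on the Interactions dataset does not consider the user's complete history. From the docs: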
Amazon Personalize considers up to 200 historical interactions for a user, and up to 100 streamed interactions you record for the user with the PutEvents operation. Additionally, the number of historical interactions Amazon Personalize considers for a user depends on the max_user_history_length_percentile and min_user_history_length_percentile hyperparameters you defined before training.
For example, if you used .99 for the max_user_history_length_percentile, and 99% of your users have at most 4 interactions, Amazon Personalize will only filter based on the user's most recent 4 historical interactions. If a user has less than the number historical interactions at the min_user_history_length_percentile, Amazon Personalize doesn't consider the user's interactions when filtering.
To filter based on up to 200 historical interactions for a user, set the max_user_history_length_percentile to 1.0 and retrain the model.
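To make the quoted rule concrete, here is a small illustrative model of the historical window. It is my own simplification, not Personalize's actual implementation: it assumes a nearest-rank percentile over per-user interaction counts, and the function names are hypothetical. The 200 cap comes from the quoted docs.

```python
import math

MAX_HISTORICAL = 200  # documented upper bound on historical interactions


def history_cap(interaction_counts, max_percentile):
    """Nearest-rank percentile of per-user history length, capped at 200."""
    ordered = sorted(interaction_counts)
    # round() guards against float noise before taking the ceiling
    rank = max(1, math.ceil(round(max_percentile * len(ordered), 9)))
    return min(ordered[rank - 1], MAX_HISTORICAL)


def still_filtered(user_history, cap):
    """Items inside the window: the user's most recent `cap` interactions."""
    return set(user_history[-cap:]) if cap > 0 else set()


# 99 users with 4 interactions each and one heavy user with 500.
counts = [4] * 99 + [500]
cap_default = history_cap(counts, 0.99)  # only 4 interactions are considered
cap_full = history_cap(counts, 1.0)      # up to the 200 limit

history = ["a", "b", "c", "d", "e", "f"]  # oldest first
window = still_filtered(history, cap_default)
```

Under this model, with the .99 setting the two oldest items ("a" and "b") fall outside the window and can show up in recommendations again, which matches the behaviour described in the question.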