Data collection in Universal Recommender
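I am going to use prediction.io to build a recommendation service. I think the Universal Recommender (http://templates.prediction.io/PredictionIO/template-scala-parallel-universal-recommendation) is a good template for me. My website displays an EPG, and for a simple PoC I want to make recommendations based on users' page views. A page view is a broadcast with some properties: genre, actors, tags, channel, bouquet...
To begin with, I will send only one primary event, a user viewing a broadcast: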
{
"event" : "view",
"entityType" : "user",
"entityId" : "userId",
"targetEntityType" : "item",
"targetEntityId" : "broadcastId",
"properties" : {},
"eventTime" : "2015-10-05T21:02:49.228Z"
}
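If I understand the documentation correctly, I will have to send the new broadcasts every day with a cron task, to add their properties and to teach pio about the new items: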
{
"event" : "$set",
"entityType" : "item",
"entityId" : "broadcastId",
"properties" : {
"bouquet" : ["B1", "B2"],
"people": ["P1", "P2"],
"channel": ["C1"],
"availableDate" : "2015-11-23T21:02:49.228Z",
"expireDate": "2016-10-05T21:02:49.228Z"
},
"eventTime" : "2015-11-23T21:02:49.228Z"
}
Now, I don't know whether it is better to use properties on the broadcast entity or to send secondary events. For example:
{
"event" : "view-bouquet",
"entityType" : "user",
"entityId" : "userId",
"targetEntityType" : "item",
"targetEntityId" : "bouquetId",
"properties" : {},
"eventTime" : "2015-10-05T21:02:49.228Z"
}
{
"event" : "view-people",
"entityType" : "user",
"entityId" : "userId",
"targetEntityType" : "item",
"targetEntityId" : "peopleId",
"properties" : {},
"eventTime" : "2015-10-05T21:02:49.228Z"
}
...
{
"event" : "view-channel",
"entityType" : "user",
"entityId" : "userId",
"targetEntityType" : "item",
"targetEntityId" : "channelId",
"properties" : {},
"eventTime" : "2015-10-05T21:02:49.228Z"
}
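It depends on what you want to do. If you want to be able to filter and/or boost by them, add them as properties of the items; if you want them to influence/improve the model through the cross-cooccurrence calculation, send them as secondary events. [Or do both, if/when that makes sense for you.]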
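To make the "do both" option concrete, here is a minimal Python sketch that builds the two kinds of payload from the same broadcast metadata. It only constructs the JSON bodies shown in the question; it does not talk to the event server. The helper names `item_set_event` and `secondary_view_events` are hypothetical, invented for this illustration, and the "view-<field>" naming simply follows the examples in the question.

```python
import json


def item_set_event(broadcast_id, properties, event_time):
    """Build a $set event that attaches properties to a broadcast item."""
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": broadcast_id,
        "properties": properties,
        "eventTime": event_time,
    }


def secondary_view_events(user_id, properties, fields, event_time):
    """Build one "view-<field>" event per value of each chosen list property."""
    events = []
    for field in fields:
        for value in properties.get(field, []):
            events.append({
                "event": f"view-{field}",
                "entityType": "user",
                "entityId": user_id,
                "targetEntityType": "item",
                "targetEntityId": value,
                "properties": {},
                "eventTime": event_time,
            })
    return events


# Placeholder IDs and values taken from the question's examples.
broadcast_properties = {
    "bouquet": ["B1", "B2"],
    "people": ["P1", "P2"],
    "channel": ["C1"],
}
when = "2015-10-05T21:02:49.228Z"

# Properties on the item: usable for filtering and boosting.
set_event = item_set_event("broadcastId", broadcast_properties, when)

# Secondary events: sent alongside the primary "view" when the user views the broadcast.
view_events = secondary_view_events(
    "userId", broadcast_properties, ["bouquet", "people", "channel"], when
)

print(json.dumps(set_event, indent=2))
print(json.dumps(view_events, indent=2))
```

With the sample metadata this yields one `$set` event and five secondary events (two bouquets, two people, one channel). Whether every property deserves a secondary event is a judgment call for your data; the `fields` argument is there so you can send only the ones you expect to carry signal.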