Data collection in Universal Recommender

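I am going to use prediction.io to build a recommendation service, and I think the Universal Recommender (http://templates.prediction.io/PredictionIO/template-scala-parallel-universal-recommendation) is a good template for my case. My website displays an EPG, and for a simple PoC I want to make recommendations based on users' page views. A page view is a view of a broadcast, which has some properties: genre, actors, tags, channel, bouquet...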

To start with, I will send only one primary event, a user viewing a broadcast:

{
  "event" : "view",
  "entityType" : "user",
  "entityId" : "userId",
  "targetEntityType" : "item",
  "targetEntityId" : "broadcastId",
  "properties" : {},
  "eventTime" : "2015-10-05T21:02:49.228Z"
}

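If I understand the documentation correctly, I will have to send the new broadcasts every day from a cron task, both to add their properties and to make pio aware of the new items: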

{
  "event" : "$set",
  "entityType" : "item",
  "entityId" : "broadcastId",
  "properties" : {
    "bouquet" : ["B1", "B2"],
    "people" : ["P1", "P2"],
    "channel" : ["C1"],
    "availableDate" : "2015-11-23T21:02:49.228Z",
    "expireDate" : "2016-10-05T21:02:49.228Z"
  },
  "eventTime" : "2015-11-23T21:02:49.228Z"
}
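For reference, here is a minimal Python sketch of what such a daily cron job could build and send. The helper names (`build_set_event`, `send_event`, `iso_millis`), the Event Server address `http://localhost:7070`, and the `ACCESS_KEY` placeholder are assumptions for illustration; only the event shape is taken from the example above.

```python
import json
import urllib.request
from datetime import datetime, timezone

EVENT_SERVER = "http://localhost:7070"  # assumption: a locally running Event Server
ACCESS_KEY = "YOUR_ACCESS_KEY"          # placeholder, replace with your app's key


def iso_millis(dt):
    """Format an aware datetime like 2015-11-23T21:02:49.228Z (UTC, milliseconds)."""
    dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


def build_set_event(broadcast_id, properties, event_time):
    """Build a $set event that attaches properties to one broadcast item."""
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": broadcast_id,
        "properties": properties,
        "eventTime": iso_millis(event_time),
    }


def send_event(event):
    """POST a single event to the Event Server and return the parsed response."""
    request = urllib.request.Request(
        f"{EVENT_SERVER}/events.json?accessKey={ACCESS_KEY}",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    when = datetime(2015, 11, 23, 21, 2, 49, 228000, tzinfo=timezone.utc)
    event = build_set_event(
        "broadcastId",
        {"bouquet": ["B1", "B2"], "people": ["P1", "P2"], "channel": ["C1"]},
        when,
    )
    # Print instead of sending, so the sketch runs without a server.
    print(json.dumps(event, indent=2))
```

The cron job would loop over the day's new broadcasts and call `send_event` for each one; the model itself only picks up the new items after the next training run.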

Now, I don't know whether it is better to use properties on the broadcast entity or to send secondary events. For example:

{
  "event" : "view-bouquet",
  "entityType" : "user",
  "entityId" : "userId",
  "targetEntityType" : "item",
  "targetEntityId" : "bouquetId",
  "properties" : {},
  "eventTime" : "2015-10-05T21:02:49.228Z"
}

{
  "event" : "view-people",
  "entityType" : "user",
  "entityId" : "userId",
  "targetEntityType" : "item",
  "targetEntityId" : "peopleId",
  "properties" : {},
  "eventTime" : "2015-10-05T21:02:49.228Z"
}

...

{
  "event" : "view-channel",
  "entityType" : "user",
  "entityId" : "userId",
  "targetEntityType" : "item",
  "targetEntityId" : "channelId",
  "properties" : {},
  "eventTime" : "2015-10-05T21:02:49.228Z"
}
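To make the secondary-event option concrete, here is a rough Python sketch that fans one broadcast view out into the primary "view" event plus one secondary event per bouquet, person, and channel. The function name `build_view_events` and the shape of its `attributes` argument are hypothetical; the event names are the ones from the examples above.

```python
def build_view_events(user_id, broadcast_id, attributes, event_time):
    """Return the primary "view" event followed by the secondary events.

    attributes maps a secondary event name to the ids it targets,
    for example {"view-channel": ["C1"]}.
    """
    def event(name, target_id):
        return {
            "event": name,
            "entityType": "user",
            "entityId": user_id,
            "targetEntityType": "item",
            "targetEntityId": target_id,
            "properties": {},
            "eventTime": event_time,
        }

    events = [event("view", broadcast_id)]
    for name, target_ids in attributes.items():
        events.extend(event(name, target_id) for target_id in target_ids)
    return events


if __name__ == "__main__":
    for e in build_view_events(
        "userId",
        "broadcastId",
        {"view-bouquet": ["B1", "B2"], "view-people": ["P1"], "view-channel": ["C1"]},
        "2015-10-05T21:02:49.228Z",
    ):
        print(e["event"], e["targetEntityId"])
```

As far as I can tell, the template also expects the event names it should use to be listed in its engine configuration, so the new names would presumably have to be added there too; I have not verified this against a specific template version.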

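It depends on what you want to do. If you want to be able to filter and/or boost on these, add them as properties of the items; if you want them to influence/improve the model through the cross-cooccurrence calculation, send them as secondary events. [Or do both, if/when that makes sense for you.]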
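To illustrate the property route, here is a small Python sketch that builds a query which filters on one item property and boosts on another. It assumes the query shape described in the template's documentation as I understand it: a `fields` list whose entries carry `name`, `values`, and `bias`, where a negative bias acts as a filter and a positive bias as a boost. The helper name `build_query` and its arguments are made up for this example, so check the field semantics against the template version you deploy.

```python
import json


def build_query(user_id, filters=None, boosts=None, num=10):
    """Build a recommendation query with property filters and boosts.

    filters: {property_name: [values]}          -> sent with bias -1 (filter)
    boosts:  {property_name: ([values], bias)}  -> sent with a positive bias
    """
    fields = []
    for name, values in (filters or {}).items():
        fields.append({"name": name, "values": values, "bias": -1})
    for name, (values, bias) in (boosts or {}).items():
        fields.append({"name": name, "values": values, "bias": bias})
    return {"user": user_id, "num": num, "fields": fields}


if __name__ == "__main__":
    query = build_query(
        "userId",
        filters={"channel": ["C1"]},        # only broadcasts on channel C1
        boosts={"people": (["P1"], 2.0)},   # prefer broadcasts featuring P1
    )
    print(json.dumps(query, indent=2))
```

This only works for properties that were attached to the items with `$set`; values that exist only as secondary events are not available to filter or boost on in this way.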