Cloudant index: count number of unique users per time period

Question

There is a very similar post about this problem here. In Cloudant, I store a document like the following each time a user accesses the application:

{"username":"one","timestamp":"2015-10-07T15:04:46Z"}  ---|  same day
{"username":"one","timestamp":"2015-10-07T19:22:00Z"}  ---^
{"username":"one","timestamp":"2015-10-25T04:22:00Z"}
{"username":"two","timestamp":"2015-10-07T19:22:00Z"}

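What I want is to count the number of unique users in a given time period. For example:

2015-10-07 = {"count": 2} two different users accessed on 2015-10-07
2015-10-25 = {"count": 1} one different user accessed on 2015-10-25
2015 = {"count": 2} two different users accessed in 2015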

This gets tricky because, for example, on 2015-10-07 username: one has two access records, but should only add 1 to the total count of unique users.

I've tried:

function(doc) {
    var time = new Date(Date.parse(doc['timestamp'])); 
    emit([time.getUTCFullYear(),time.getUTCMonth(),time.getUTCDay(),doc.username], 1);
}

This has a few problems, which Jesus Alva highlighted in a comment on the post I linked above.

Thanks!

Answer 1

There may be a better way, but off the top of my head...

You could try emitting an index entry for each level of granularity:

function(doc) {
    var time = new Date(Date.parse(doc['timestamp'])); 
    var year = time.getUTCFullYear();
    var month = time.getUTCMonth()+1;
    var day = time.getUTCDate();

    // day granularity
    emit([year,month,day,doc.username], null);

    // year granularity
    emit([year,doc.username], null);
}

// reduce function - `_count`
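To see which keys this map function would put into the index, here is a minimal sketch that runs the same logic locally against the sample documents from the question. The `mapDoc` wrapper, the `emitted` array and the stub passed in as `emit` are assumptions for illustration only; in Cloudant, `emit` is supplied by the view server.

```javascript
// Local stand-in for the view server: collect emitted keys in an array.
// `mapDoc` and `emitted` are hypothetical names used only in this sketch.
function mapDoc(doc, emit) {
    var time = new Date(Date.parse(doc.timestamp));
    var year = time.getUTCFullYear();
    var month = time.getUTCMonth() + 1; // getUTCMonth() is zero-based
    var day = time.getUTCDate();        // day of month; getUTCDay() would be the weekday

    emit([year, month, day, doc.username], null); // day granularity
    emit([year, doc.username], null);             // year granularity
}

// Sample documents copied from the question.
var docs = [
    {"username": "one", "timestamp": "2015-10-07T15:04:46Z"},
    {"username": "one", "timestamp": "2015-10-07T19:22:00Z"},
    {"username": "one", "timestamp": "2015-10-25T04:22:00Z"},
    {"username": "two", "timestamp": "2015-10-07T19:22:00Z"}
];

var emitted = [];
docs.forEach(function (doc) {
    mapDoc(doc, function (key, value) {
        emitted.push(key);
    });
});

// Print one emitted key per line.
emitted.forEach(function (key) {
    console.log(JSON.stringify(key));
});
```

Each document yields two keys, one per granularity, so the same username can appear several times under the same day or year. The `_count` reduce with `group=true` then collapses those repeats into one row per distinct key.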

Day query (2015-10-07):

inclusive_end=true&
start_key=[2015, 10, 7, "\u0000"]&
end_key=[2015, 10, 7, "\uefff"]&
reduce=true&
group=true
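The day query above could be generated for any date with a small helper. This is only a sketch: `dayQueryParams` is a hypothetical name, the parameter names and the `"\u0000"` / `"\uefff"` bounds are copied from the query shown here, and the view URL the string would be appended to is not shown because it depends on your database and design document names.

```javascript
// Build the query string for a single day.
// `dayQueryParams` is a hypothetical helper, not part of any Cloudant library.
function dayQueryParams(year, month, day) {
    var params = {
        inclusive_end: true,
        // Keys must be JSON-encoded arrays; the bounds bracket every username.
        start_key: JSON.stringify([year, month, day, "\u0000"]),
        end_key: JSON.stringify([year, month, day, "\uefff"]),
        reduce: true,
        group: true
    };
    // URL-encode each value so brackets, quotes and non-ASCII characters survive.
    return Object.keys(params).map(function (name) {
        return name + "=" + encodeURIComponent(params[name]);
    }).join("&");
}

console.log(dayQueryParams(2015, 10, 7));
```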

Day query result - your application code would count the rows:

{"rows":[
  {"key":[2015,10,7,"one"],"value":2},
  {"key":[2015,10,7,"two"],"value":1}
]}

Year query:

inclusive_end=true&
start_key=[2015, "\u0000"]&
end_key=[2015, "\uefff"]&
reduce=true&
group=true

Year query result - your application code would count the rows:

{"rows":[
  {"key":[2015,"one"],"value":3},
  {"key":[2015,"two"],"value":1}
]}
