Error in running a spread because of unique 'key combinations'; combining rows of data

Question

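I want to do a group_by on Google Analytics data. I have a unique user identifier, the URL that person visited, and the number of times the user viewed that page.

The data comes out of Google Analytics like this: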

ID          Page                  Pageviews
abc123      example.com/pagea     2 
qwer123     example.com/pageb     3 
abc123      example.com/pageb     4
qwer123     example.com/pagec     5 
uiop123     example.com/pagea     6

I want to turn it into:

ID        example.com/pagea    example.com/pageb    example.com/pagec
abc123    2                    4                    0
qwer123   0                    3                    5
uiop123   6                    0                    0

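However, when I use spread, I get the error: Error: Each row of output must be identified by a unique combination of keys.

The command I ran is: df <- data %>% spread(Page, Pageviews, fill = 0)

Here is what I think is causing the problem: before spreading, I stripped some data from the URLs to normalize them (basically removing the query strings). So before I spread, I think I need to combine the rows that have the same ID and Page and add up their pageviews, so that I end up with one row instead of two.

Basically, I think I need to go to the first part of the data and turn instances like: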

ID          Page                  Pageviews
abc123      example.com/pagea     2 
abc123      example.com/pagea     3

into

ID          Page                  Pageviews
abc123      example.com/pagea     5

What is the least painful way to do this?

Answer 1

spread() needs each ID/Page pair to appear only once, so aggregate the duplicate rows first using dplyr:

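library(dplyr)
library(tidyr)

df <- data %>%
  group_by(ID, Page) %>%                                   # one group per ID/Page pair
  summarise(Pageviews = sum(Pageviews, na.rm = TRUE)) %>%  # add up the duplicate rows
  ungroup() %>%                                            # drop the leftover grouping by ID
  spread(Page, Pageviews, fill = 0)                        # one column per page, 0 where missing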
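
For readers on tidyr 1.0.0 or later, where spread() is superseded by pivot_wider(), the same idea can be sketched as below. This is only an illustration, not part of the original answer. The sample data frame is hypothetical: it is the question's table plus the duplicated abc123 / example.com/pagea row. The names dupes and wide are made up for the example. The count() step is an optional check that lists which ID/Page pairs occur more than once.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample data: the question's table plus one duplicated ID/Page pair
data <- tibble(
  ID = c("abc123", "abc123", "qwer123", "abc123", "qwer123", "uiop123"),
  Page = c("example.com/pagea", "example.com/pagea", "example.com/pageb",
           "example.com/pageb", "example.com/pagec", "example.com/pagea"),
  Pageviews = c(2, 3, 3, 4, 5, 6)
)

# Optional check: list the ID/Page pairs that appear more than once
dupes <- data %>%
  count(ID, Page) %>%
  filter(n > 1)

# Reshape and aggregate in one step: values_fn sums the duplicated cells,
# values_fill puts 0 where a user never visited a page
wide <- data %>%
  pivot_wider(
    names_from = Page,
    values_from = Pageviews,
    values_fn = list(Pageviews = sum),
    values_fill = list(Pageviews = 0)
  )

print(dupes)
print(wide)
```

Note that sum is applied here without na.rm = TRUE, so if Pageviews can contain NA, the group_by/summarise approach above may be the safer choice.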

Tags: pivot, pivot-table, r, spread, tidyr
