Calculate the sum in a data.frame (long format)
I want to calculate the sums for the years 2005, 2006, 2007 and the categories a, b, c of this data.frame.
year <- c(2005,2005,2005,2006,2006,2006,2007,2007,2007)
category <- c("a","a","a","b","b","b","c","c","c")
value <- c(3,6,8,9,7,4,5,8,9)
df <- data.frame(year, category, value, stringsAsFactors = FALSE)
The table should look like this:
year | category | value |
---|---|---|
2005 | a | 1 |
2005 | a | 1 |
2005 | a | 1 |
2006 | b | 2 |
2006 | b | 2 |
2006 | b | 2 |
2007 | c | 3 |
2007 | c | 3 |
2007 | c | 3 |
2006 | a | 3 |
2007 | b | 6 |
2008 | c | 9 |
Any idea how to achieve this? With add_row or cbind?
How about using the dplyr package:
library(dplyr)
df %>%
  group_by(year, category) %>%
  summarise(sum = sum(value))
# # A tibble: 3 × 3
# # Groups: year [3]
# year category sum
# <dbl> <chr> <dbl>
# 1 2005 a 17
# 2 2006 b 20
# 3 2007 c 22
If you would rather add the sum as a new column instead of collapsing the rows, replace summarise() with mutate():
df %>%
  group_by(year, category) %>%
  mutate(sum = sum(value))
# # A tibble: 9 × 4
# # Groups: year, category [3]
# year category value sum
# <dbl> <chr> <dbl> <dbl>
# 1 2005 a 3 17
# 2 2005 a 6 17
# 3 2005 a 8 17
# 4 2006 b 9 20
# 5 2006 b 7 20
# 6 2006 b 4 20
# 7 2007 c 5 22
# 8 2007 c 8 22
# 9 2007 c 9 22
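Neither snippet above appends the totals as extra rows, which is what the table in the question seems to ask for. A minimal sketch of one way to do that with dplyr follows. It assumes the totals should be stored under the original column name `value` so the columns line up. The names `totals` and `result` are only illustrative, and `.groups = "drop"` requires dplyr 1.0.0 or later.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  year = c(2005, 2005, 2005, 2006, 2006, 2006, 2007, 2007, 2007),
  category = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  value = c(3, 6, 8, 9, 7, 4, 5, 8, 9),
  stringsAsFactors = FALSE
)

# One sum per (year, category) group, kept under the name `value`
# (an assumption) so it can be stacked under the original rows
totals <- df %>%
  group_by(year, category) %>%
  summarise(value = sum(value), .groups = "drop")

# Append the three total rows below the nine original rows
result <- bind_rows(df, totals)
print(result)
```

This should give the same twelve rows as stacking with rbind(), so whether it is preferable is mostly a matter of staying inside a dplyr pipeline.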
A base R solution using aggregate:
rbind( df, aggregate( value ~ year + category, df, sum ) )
year category value
1 2005 a 3
2 2005 a 6
3 2005 a 8
4 2006 b 9
5 2006 b 7
6 2006 b 4
7 2007 c 5
8 2007 c 8
9 2007 c 9
10 2005 a 17
11 2006 b 20
12 2007 c 22
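For completeness, the mutate() variant (a sum column next to every original row) can also be sketched in base R with ave(). This is not part of the original answers, only an illustration under the assumption that the new column should be called `sum`, as in the dplyr version.

```r
# Same data as in the question
df <- data.frame(
  year = c(2005, 2005, 2005, 2006, 2006, 2006, 2007, 2007, 2007),
  category = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  value = c(3, 6, 8, 9, 7, 4, 5, 8, 9),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each combination of the grouping vectors
# and returns a vector of the same length as the input
df$sum <- ave(df$value, df$year, df$category, FUN = sum)
print(df)
```

Unlike aggregate(), ave() keeps the row count unchanged, so it fits when every original row should carry its group total.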