dplyr: summarize grouped data with another column
I have a data frame, pop.subset:
state location pop
WA Seattle 100
WA Kent 20
OR foo 30
CA foo2 80
I need the least-populated city in each state, stored in a data.frame.
I have:
result <- pop.subset %>%
  group_by(state) %>%
  summarise(min = min(pop))
This returns the following data.frame:
state min
WA 20
... .... etc
But I also need the city. I tried including location in group_by, as in group_by(state, location), but that gives the minimum for every state/city pair rather than one city per state, like this:
state location pop
WA Seattle 100
WA Kent 20
foo foo foo
Is there a simple solution I'm missing? I would like my result to look like this:
state location pop
WA Kent 20
... ... ... etc.
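To see why grouping by both columns returns every row, it can help to count the groups. The sketch below rebuilds the sample table from the question (the data.frame construction is an assumption about how pop.subset was created) and compares the two groupings with dplyr's n_groups().

```r
library(dplyr)

# Sample data rebuilt from the table in the question
pop.subset <- data.frame(
  state    = c("WA", "WA", "OR", "CA"),
  location = c("Seattle", "Kent", "foo", "foo2"),
  pop      = c(100, 20, 30, 80)
)

# One group per state: min(pop) is taken across all cities in that state
groups_by_state <- n_groups(group_by(pop.subset, state))

# One group per state/location pair: each group holds a single row,
# so min(pop) is simply that row's own pop
groups_by_pair <- n_groups(group_by(pop.subset, state, location))
```

With this data every state/location pair is unique, so the second grouping has as many groups as there are rows, and the summary cannot drop any city.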
Have you tried something like this?
result <- pop.subset %>%
group_by(state, location) %>%
summarise(min = min(both_sexes_2012))
I think you want to group by state and then filter for min(pop):
pop.subset %>%
  group_by(state) %>%
  filter(pop == min(pop)) %>%
  ungroup()
# A tibble: 3 x 3
state location pop
<chr> <chr> <int>
1 WA Kent 20
2 OR foo 30
3 CA foo2 80
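One thing to keep in mind with the filter approach: if two cities in a state tie for the minimum, both rows are kept. A minimal sketch of the difference, assuming dplyr 1.0.0 or later for slice_min(); the tied data frame (including the Tacoma row) is made up for illustration and is not from the question.

```r
library(dplyr)

# Hypothetical data: Kent and Tacoma tie for WA's minimum
tied <- data.frame(
  state    = c("WA", "WA", "WA", "OR"),
  location = c("Seattle", "Kent", "Tacoma", "foo"),
  pop      = c(100, 20, 20, 30)
)

# filter() keeps every row that equals the group minimum, ties included
all_ties <- tied %>%
  group_by(state) %>%
  filter(pop == min(pop)) %>%
  ungroup()

# slice_min() with with_ties = FALSE keeps exactly one row per group
one_each <- tied %>%
  group_by(state) %>%
  slice_min(pop, n = 1, with_ties = FALSE) %>%
  ungroup()
```

Here all_ties has two WA rows while one_each has one row per state. Which behavior is right depends on whether tied cities should all be reported.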
I figured it out; this solves it:
library(tibble)
data <- tribble(
  ~state, ~location, ~pop,
  "WA", "Seattle", 100,
  "WA", "Kent",     20,
  "OR", "foo",      30,
  "CA", "foo2",     80
)
library(dplyr)
data %>%
  group_by(state) %>%
  summarise(location = location[which.min(pop)],
            min = min(pop))
# A tibble: 3 x 3
state location min
<chr> <chr> <dbl>
1 CA foo2 80
2 OR foo 30
3 WA Kent 20
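If the summary column should be called pop, as in the desired output, the order of the expressions inside summarise() matters, because later expressions see the columns created by earlier ones. A sketch under that assumption, again rebuilding the sample data by hand:

```r
library(dplyr)

# Sample data rebuilt from the table in the question
pop.subset <- data.frame(
  state    = c("WA", "WA", "OR", "CA"),
  location = c("Seattle", "Kent", "foo", "foo2"),
  pop      = c(100, 20, 30, 80)
)

result <- pop.subset %>%
  group_by(state) %>%
  summarise(
    # Computed first, while pop still refers to the full per-group column
    location = location[which.min(pop)],
    # Overwrites pop with the group minimum; anything after this line
    # would see the single summarised value, not the original column
    pop = min(pop)
  )
```

Swapping the two lines would make which.min(pop) operate on the already-summarised single value and always pick the first location in each group. Note also that summarise() returns one row per group ordered by state, whereas the filter approach keeps the original row order, and which.min() returns only the first match when there is a tie.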