How to create new rows based on data from a different table (R)
So, if I have a data table like this:
stores <- read_csv("stores.csv")
stores
# A tibble: 6 x 3
state store num_locations
<chr> <chr> <dbl>
1 california target 20
2 california walmart 29
3 nevada target 10
4 nevada walmart 12
5 arizona target 15
6 arizona walmart 19
Then I create a new data frame without the location information:
stores_2 <- select(stores, store, num_locations)
# A tibble: 6 x 2
store num_locations
<chr> <dbl>
1 target 20
2 walmart 29
3 target 10
4 walmart 12
5 target 15
6 walmart 19
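Is there a way to create a third dataset that gives the average number of locations per store, like this (I'm not sure how to actually generate this tibble):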
# A tibble: 2 x 2
store avg_num_locations
<chr> <dbl>
1 target 15
2 walmart 20
One solution is to use the tidyverse functions group_by() and summarise():
library(tidyverse)
# Recreate the example data from the question
stores <- data.frame(
  stringsAsFactors = FALSE,
  state = c("california", "california", "nevada", "nevada", "arizona", "arizona"),
  store = c("target", "walmart", "target", "walmart", "target", "walmart"),
  num_locations = c(20L, 29L, 10L, 12L, 15L, 19L)
)
# Group the rows by store, then average num_locations within each group
stores_summary <- stores %>%
  group_by(store) %>%
  summarise(avg_num_locations = mean(num_locations))
stores_summary
# A tibble: 2 x 2
# store avg_num_locations
# <chr> <dbl>
#1 target 15
#2 walmart 20
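The same pipeline should also work on the state-less stores_2 data frame from the question, since the grouping only needs the store column. Below is a minimal sketch of that, assuming dplyr is installed. The data is retyped by hand rather than read from stores.csv, and the extra n_states column is a hypothetical name added only to show how many rows went into each average.

```r
library(dplyr)  # assumption: dplyr is installed

# Hand-typed copy of stores_2 from the question (store + num_locations only)
stores_2 <- data.frame(
  store = c("target", "walmart", "target", "walmart", "target", "walmart"),
  num_locations = c(20, 29, 10, 12, 15, 19)
)

# One output row per store: the mean, plus the number of rows averaged
stores_2_summary <- stores_2 %>%
  group_by(store) %>%
  summarise(
    avg_num_locations = mean(num_locations),
    n_states = n()  # hypothetical column name, not in the original answer
  )

stores_2_summary
```

Because summarise() returns one row per group, the result has two rows here rather than six.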
In base R, you can use aggregate():
aggregate(num_locations~store, stores, mean)
store num_locations
1 target 15
2 walmart 20
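Note that aggregate() keeps the original column name (num_locations), whereas the question asked for avg_num_locations. A minimal base R sketch of one way to rename it afterwards follows. The stores_avg name is only illustrative, and the data is retyped by hand rather than read from stores.csv.

```r
# Same example data as in the question, typed by hand
stores <- data.frame(
  state = c("california", "california", "nevada", "nevada", "arizona", "arizona"),
  store = c("target", "walmart", "target", "walmart", "target", "walmart"),
  num_locations = c(20L, 29L, 10L, 12L, 15L, 19L),
  stringsAsFactors = FALSE
)

# Average num_locations within each store
stores_avg <- aggregate(num_locations ~ store, data = stores, FUN = mean)

# Rename the value column to match the name requested in the question
names(stores_avg)[names(stores_avg) == "num_locations"] <- "avg_num_locations"

stores_avg
```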