How to create new rows based on data from a different table (R)

Suppose I have a data table like this:

stores <- read_csv("stores.csv")
stores

# A tibble: 6 x 3
  state      store   num_locations
  <chr>      <chr>           <dbl>
1 california target             20
2 california walmart            29
3 nevada     target             10
4 nevada     walmart            12
5 arizona    target             15
6 arizona    walmart            19

Then I create a new data frame that drops the location (state) column:

stores_2 <- select(stores, store, num_locations)

# A tibble: 6 x 2
  store   num_locations
  <chr>           <dbl>
1 target             20
2 walmart            29
3 target             10
4 walmart            12
5 target             15
6 walmart            19

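Is there a way to create a third dataset that gives the average number of locations per store, like this (I'm not sure how to actually generate this tibble):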

# A tibble: 2 x 2
  store   avg_num_locations
  <chr>               <dbl>
1 target                 15
2 walmart                20

One solution is to use the tidyverse functions group_by() and summarise():

library(tidyverse)

stores <- data.frame(
  state = c("california", "california", "nevada", "nevada", "arizona", "arizona"),
  store = c("target", "walmart", "target", "walmart", "target", "walmart"),
  num_locations = c(20L, 29L, 10L, 12L, 15L, 19L),
  stringsAsFactors = FALSE
)

stores_summary <- stores %>%
  group_by(store) %>%
  summarise(avg_num_locations = mean(num_locations))

stores_summary
# A tibble: 2 x 2
#  store   avg_num_locations
#  <chr>               <dbl>
#1 target                 15
#2 walmart                20
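
Since the grouping only uses the store column, the same pipeline should also work on the state-less data frame from the question. The following is a minimal sketch, assuming stores_2 holds the six rows shown above; the name stores_2_summary is made up for illustration, and only dplyr is loaded rather than the whole tidyverse.

```r
library(dplyr)

# Rebuild stores_2 by hand so the example is self-contained
# (values copied from the question's printed tibble).
stores_2 <- data.frame(
  store = c("target", "walmart", "target", "walmart", "target", "walmart"),
  num_locations = c(20, 29, 10, 12, 15, 19),
  stringsAsFactors = FALSE
)

# Collapse to one row per store, averaging num_locations within each group.
# stores_2_summary is a hypothetical name, not one used in the question.
stores_2_summary <- stores_2 %>%
  group_by(store) %>%
  summarise(avg_num_locations = mean(num_locations))

stores_2_summary
```

The result has one row per store, so the intermediate stores_2 step is optional: grouping the original stores table by store gives the same averages.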

In base R, you can use `aggregate()`:

aggregate(num_locations~store, stores, mean)
    store num_locations
1  target            15
2 walmart            20
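
Note that `aggregate()` keeps the original column name, num_locations, for the averaged values. If the avg_num_locations name from the question is wanted, one possible approach is to rename the column afterwards. This is a sketch under that assumption; the variable name avg is made up for illustration.

```r
# Same data as above, rebuilt so the example runs on its own.
stores <- data.frame(
  state = c("california", "california", "nevada", "nevada", "arizona", "arizona"),
  store = c("target", "walmart", "target", "walmart", "target", "walmart"),
  num_locations = c(20L, 29L, 10L, 12L, 15L, 19L),
  stringsAsFactors = FALSE
)

# Mean of num_locations for each store (formula interface of aggregate()).
avg <- aggregate(num_locations ~ store, data = stores, FUN = mean)

# Rename the value column to match the name used in the question.
names(avg)[names(avg) == "num_locations"] <- "avg_num_locations"

avg
# target averages 15 and walmart averages 20
```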