R: insert a row with the mean after each group of values
How can I insert a new row after each group of rows, holding the mean of the values grouped by date?
id date price
1 2022-01-01 4
2 2022-01-01 2
3 2022-01-01 2
1 2022-01-02 5
2 2022-01-02 3
3 2022-01-02 1
Desired output:
id date price
1 2022-01-01 4
2 2022-01-01 2
3 2022-01-01 2
mean 2022-01-01 2.66
1 2022-01-02 5
2 2022-01-02 3
3 2022-01-02 1
mean 2022-01-02 3
You can do it like this (although I don't really understand why you would want the data in this format):
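library(dplyr)

# dat is the data from the question
bind_rows(
  # id must be character so that it can share a column with "mean"
  dat %>% mutate(id = as.character(id)),
  # one summary row per date
  dat %>%
    group_by(date) %>%
    summarize(price = mean(price)) %>%
    mutate(id = "mean")
) %>%
  arrange(date, id)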
Output:
id date price
<char> <IDat> <num>
1: 1 2022-01-01 4.000000
2: 2 2022-01-01 2.000000
3: 3 2022-01-01 2.000000
4: mean 2022-01-01 2.666667
5: 1 2022-01-02 5.000000
6: 2 2022-01-02 3.000000
7: 3 2022-01-02 1.000000
8: mean 2022-01-02 3.000000
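One caveat with `arrange(date, id)` above: once `id` is character it is sorted as text, so an id such as "10" would land before "2", and the position of "mean" relative to the digits depends on the collation rules. The following is a minimal sketch of a variant that does not rely on text sorting. The helper columns `ord` and `is_mean` are hypothetical names introduced only for this illustration, and the data frame is rebuilt from the question so the block runs on its own.

```r
library(dplyr)

# Same values as in the question, rebuilt here so the sketch is self-contained
dat <- data.frame(
  id    = c(1L, 2L, 3L, 1L, 2L, 3L),
  date  = rep(c("2022-01-01", "2022-01-02"), each = 3),
  price = c(4, 2, 2, 5, 3, 1)
)

# One mean row per date, flagged so it can be sorted after the detail rows
means <- dat %>%
  group_by(date) %>%
  summarize(price = mean(price), .groups = "drop") %>%
  mutate(id = "mean", is_mean = TRUE)

res <- dat %>%
  # keep the numeric id in `ord` (hypothetical helper) before converting id to text
  mutate(ord = id, id = as.character(id), is_mean = FALSE) %>%
  bind_rows(means) %>%
  # within each date: detail rows in numeric id order, then the mean row
  arrange(date, is_mean, ord) %>%
  select(id, date, price)

print(res)
```

The explicit `is_mean` flag makes the "mean row goes last" rule visible in the code instead of leaving it to how strings happen to sort.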
Perhaps this would be better, since it keeps the mean in its own column instead of mixing it into the rows:
dat %>% group_by(date) %>% mutate(mean = mean(price))
Output:
id date price mean
<int> <date> <int> <dbl>
1 1 2022-01-01 4 2.67
2 2 2022-01-01 2 2.67
3 3 2022-01-01 2 2.67
4 1 2022-01-02 5 3
5 2 2022-01-02 3 3
6 3 2022-01-02 1 3
We can use add_row:
library(dplyr)
library(tibble)

df1 %>%
  mutate(id = as.character(id)) %>%
  group_by(date) %>%
  # append one "mean" row to each date group
  group_modify(~ .x %>%
    add_row(id = 'mean', price = mean(.x$price, na.rm = TRUE))) %>%
  ungroup() %>%
  # restore the original column order
  select(names(df1))
Output:
# A tibble: 8 × 3
id date price
<chr> <chr> <dbl>
1 1 2022-01-01 4
2 2 2022-01-01 2
3 3 2022-01-01 2
4 mean 2022-01-01 2.67
5 1 2022-01-02 5
6 2 2022-01-02 3
7 3 2022-01-02 1
8 mean 2022-01-02 3
Data:
df1 <- structure(list(id = c(1L, 2L, 3L, 1L, 2L, 3L), date = c("2022-01-01",
"2022-01-01", "2022-01-01", "2022-01-02", "2022-01-02", "2022-01-02"
), price = c(4L, 2L, 2L, 5L, 3L, 1L)), class = "data.frame", row.names = c(NA,
-6L))
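For readers who prefer not to depend on dplyr, the same split-then-append idea can be sketched in base R. This is an illustrative alternative under the assumption that the dates sort correctly as text (true for the ISO-formatted dates here), not something taken from the answers above. The names `pieces` and `out` are hypothetical, and `df1` is rebuilt with the same values so the block is self-contained.

```r
# Same values as df1 above, rebuilt so the sketch runs on its own
df1 <- data.frame(
  id    = c(1L, 2L, 3L, 1L, 2L, 3L),
  date  = rep(c("2022-01-01", "2022-01-02"), each = 3),
  price = c(4L, 2L, 2L, 5L, 3L, 1L)
)

# Split by date, append one summary row to each piece, then stack the pieces
pieces <- lapply(split(df1, df1$date), function(g) {
  g$id <- as.character(g$id)  # id must be text to sit alongside "mean"
  rbind(g, data.frame(id = "mean", date = g$date[1], price = mean(g$price)))
})

out <- do.call(rbind, pieces)
rownames(out) <- NULL  # drop the "date.index" row names created by rbind

print(out)
```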