Approx function with group_by and across in R
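I am currently interpolating time series and need to use the approx function on a dataframe with 4 columns and 172660 rows, split into 4 groups (so each group has 43165 rows). So far I have found two existing answers: one that interpolates just one column, and another one. The first approach does work, but it does not suit my purpose. I also noticed that, for example, mutate_at has been superseded by mutate(across()). So I tried the newer approach, but it does not work.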
library(tidyverse)

tabela_1 <- tibble(x1 = rnorm(4800, mean = 88.5, sd = 4),
                   x2 = rnorm(4800, mean = -38.526, sd = 2.758),
                   x3 = rnorm(4800, mean = -22.6852, sd = 1.8652),
                   x4 = rnorm(4800, mean = -38.526, sd = 2.758),
                   tmpts = rep(x = seq(from = 0, to = 863.28, by = 0.72),
                               times = 4),
                   category = rep(x = 1:4, each = 1200))

tabela <- tibble(tmpts = rep(x = seq(from = 0, to = 863.28, by = 0.02),
                             times = 4),
                 category = rep(x = 1:4, each = 43165))

tabela_joined <- tabela %>%
  left_join(tabela_1, by = c("tmpts", "category")) %>%
  arrange(category, tmpts) %>%
  janitor::clean_names()

tabela_interpolation <- tabela_joined %>%
  group_by(category) %>%
  summarize(across(.cols = x1:x4, approx(., n = 43165)))
When I run tabela_interpolation, I get:
Erro: Problem with `summarise()` input `..1`.
i `..1 = across(.cols = x1:x15, approx(., n = 43165))`.
x Can't convert an integer vector to function
i The error occurred in group 1: run = 1.
Run `rlang::last_error()` to see where the error occurred.
Além disso: Warning message:
In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
collapsing to unique 'x' values
How should I use summarise together with across to get the interpolated time series from the approx function for each column of the dataframe?
You can use the across syntax as follows:
library(tidyverse)

tabela_joined %>%
  group_by(category) %>%
  summarize(across(x1:x4, approx, n = 43165)) %>%
  ungroup()
or
tabela_joined %>%
  group_by(category) %>%
  summarize(across(x1:x4, ~approx(., n = 43165))) %>%
  ungroup()
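To see why these forms work where the original call failed, it may help to look at what across() receives and what approx() returns. The following is only a minimal sketch on made-up data: the names toy, g, a, res and toy_filled are hypothetical and not part of the question, and it assumes the first and last value of each group are not NA.

```r
library(dplyr)

# Hypothetical toy data: two groups, gaps marked as NA
toy <- tibble(
  g = rep(1:2, each = 5),
  a = c(10, NA, 30, NA, 50, 1, NA, NA, NA, 5)
)

# Called with a single vector, approx() uses the positions 1..length as x
# and returns a list with two elements, x and y
res <- approx(toy$a[toy$g == 1], n = 5)
str(res)

# across() expects a function or a lambda as its second argument.
# Written without `~`, approx(., n = 5) is evaluated right away and its
# result (not a function) is handed to across().
# Here the lambda keeps only the interpolated values ($y), so each group
# keeps its original number of rows.
toy_filled <- toy %>%
  group_by(g) %>%
  mutate(across(a, ~ approx(.x, n = length(.x))$y)) %>%
  ungroup()

toy_filled
```

Because only the y element is kept, mutate() can be used here instead of summarize(), and no unnest step is needed for this variant.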
Either summarize call can be followed by unnest to get the fully expanded dataframe.
tabela_joined %>%
  group_by(category) %>%
  summarize(across(x1:x4, approx, n = 43165)) %>%
  ungroup() %>%
  unnest(x1:x4)
# category x1 x2 x3 x4
# <int> <dbl> <dbl> <dbl> <dbl>
# 1 1 1 1 1 1
# 2 1 2 2 2 2
# 3 1 3 3 3 3
# 4 1 4 4 4 4
# 5 1 5 5 5 5
# 6 1 6 6 6 6
# 7 1 7 7 7 7
# 8 1 8 8 8 8
# 9 1 9 9 9 9
#10 1 10 10 10 10
# … with 345,310 more rows
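The values 1, 2, 3, ... in the output above appear to be the x element (the positions) of the list that approx() returns, with the y element further down; that would also explain why the result has twice as many rows as tabela_joined. If the goal is one interpolated value per original row, aligned with the time column, one possible variant is to interpolate against tmpts and keep only y. This is a sketch under assumptions, not the answer's method: toy_joined and toy_interp are hypothetical stand-ins, and each group is assumed to have known values at its first and last time point.

```r
library(dplyr)

# Hypothetical small stand-in for tabela_joined: unevenly spaced time
# points, measurements known only at some of them
toy_joined <- tibble(
  category = rep(1:2, each = 5),
  tmpts    = rep(c(0, 1, 2, 4, 8), times = 2),
  x1       = c(0, NA, 4, NA, 16, 10, NA, NA, NA, 2),
  x2       = c(1, NA, NA, NA, 5, 3, NA, 3, NA, 3)
)

# Within each category, interpolate every column linearly over tmpts.
# approx() drops the NA pairs and evaluates the line at every tmpts value.
toy_interp <- toy_joined %>%
  group_by(category) %>%
  mutate(across(x1:x2, ~ approx(x = tmpts, y = .x, xout = tmpts)$y)) %>%
  ungroup()

toy_interp
```

Using tmpts as x matters when the time points are not evenly spaced; with position-based interpolation the gap between 2 and 8 would be treated like any other step.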