How to transform this base R code to dplyr using across and the .names argument
I have this data frame:
df <- structure(list(A = c(2L, 3L, 4L, 5L, 5L), B = c(3L, 1L, 2L, 5L,
5L), C = c(4L, 5L, 2L, 1L, 1L), D = c(3L, 1L, 5L, 1L, 2L)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -5L))
      A     B     C     D
  <int> <int> <int> <int>
1     2     3     4     3
2     3     1     5     1
3     4     2     2     5
4     5     5     1     1
5     5     5     1     2
I want to subtract each column from the column that follows it!
I can do this with the following base R code:
df[-1] - df[-ncol(df)]
   B  C  D
1  1  1 -1
2 -2  4 -4
3 -2  0  3
4  0 -4  0
5  0 -4  1
I would like to use across, because of its .names argument, so I want to translate this code to dplyr.
Expected output:
  A B C D B-A C-B D-C
1 2 3 4 3   1   1  -1
2 3 1 5 1  -2   4  -4
3 4 2 2 5  -2   0   3
4 5 5 1 1   0  -4   0
5 5 5 1 2   0  -4   1
My first try:
library(dplyr)
df %>%
  mutate(across(everything(), ~df[-1] - df[-ncol(df)], .names = "{.col}-{.col}")) %>%
  select(contains("-"))
  `A-A`$B    $C    $D `B-B`$B    $C    $D `C-C`$B    $C    $D `D-D`$B    $C    $D
    <int> <int> <int>   <int> <int> <int>   <int> <int> <int>   <int> <int> <int>
1       1     1    -1       1     1    -1       1     1    -1       1     1    -1
2      -2     4    -4      -2     4    -4      -2     4    -4      -2     4    -4
3      -2     0     3      -2     0     3      -2     0     3      -2     0     3
4       0    -4     0       0    -4     0       0    -4     0       0    -4     0
5       0    -4     1       0    -4     1       0    -4     1       0    -4     1
My second try:
df %>%
  mutate(across(everything(), ~.[-1] - .[-ncol(.)], .names = "{.col}-{.col}"))
Error in `mutate()`:
! Problem while computing
`..1 = across(everything(), ~.[-1]
- .[-ncol(.)], .names =
"{.col}-{.col}")`.
Caused by error in `across()`:
! Problem while computing
column `A-A`.
Caused by error in `-ncol(A)`:
! invalid argument to unary operator
Run `rlang::last_error()` to see where the error occurred.
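A note on why both attempts go wrong: inside across(), the lambda is called once per selected column, and `.` (or `.x`) is that single column as a plain vector, not the whole data frame. The first attempt ignores the column and returns the full three-column difference every time, which is presumably why each new column prints as a nested data frame (`A-A`$B, $C, $D). The second attempt calls ncol() on a vector. The base R sketch below reproduces that second failure outside dplyr; the variable names are illustrative only.

```r
# Inside across(), the lambda sees one column at a time, i.e. a plain vector.
A <- c(2L, 3L, 4L, 5L, 5L)

# ncol() relies on the dim attribute; an ordinary vector has none, so it returns NULL.
n <- ncol(A)
is.null(n)  # TRUE

# Negating NULL is what raises "invalid argument to unary operator".
res <- tryCatch(-n, error = function(e) conditionMessage(e))
res
```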
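There are easier ways to do this, but if we want to use across:
library(dplyr)
df %>%
  mutate(across(-1, ~ {
    # name of the column just before the current one
    # (cur_data() is deprecated as of dplyr 1.1.0 in favour of pick())
    prevnm <- names(cur_data())[match(cur_column(), names(cur_data())) - 1]
    # subtract that previous column from the current column
    .x - df[[prevnm]]
  },
  .names = "{.col}-{names(.)[match(.col, names(.))-1]}"))
Output: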
# A tibble: 5 × 7
      A     B     C     D `B-A` `C-B` `D-C`
  <int> <int> <int> <int> <int> <int> <int>
1     2     3     4     3     1     1    -1
2     3     1     5     1    -2     4    -4
3     4     2     2     5    -2     0     3
4     5     5     1     1     0    -4     0
5     5     5     1     2     0    -4     1
Or use two across calls:
df %>%
  mutate(across(-1, .names = "{.col}-{names(.)[match(.col, names(.))-1]}") -
           across(-last_col()))
# A tibble: 5 × 7
      A     B     C     D `B-A` `C-B` `D-C`
  <int> <int> <int> <int> <int> <int> <int>
1     2     3     4     3     1     1    -1
2     3     1     5     1    -2     4    -4
3     4     2     2     5    -2     0     3
4     5     5     1     1     0    -4     0
5     5     5     1     2     0    -4     1
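The two-across version works because subtracting one data frame from another of the same dimensions goes column by column and, in base R at least, keeps the column names of the left operand. That seems to be why setting .names on the first across() is enough. A minimal base R sketch of the same idea follows, with a plain data.frame standing in for the tibble and the new names set by hand (both are assumptions made purely for illustration):

```r
df <- data.frame(
  A = c(2L, 3L, 4L, 5L, 5L),
  B = c(3L, 1L, 2L, 5L, 5L),
  C = c(4L, 5L, 2L, 1L, 1L),
  D = c(3L, 1L, 5L, 1L, 2L)
)

# Left operand: columns B, C, D, renamed the way .names would rename them.
left <- df[-1]
names(left) <- c("B-A", "C-B", "D-C")

# Right operand: columns A, B, C under their original names.
right <- df[-ncol(df)]

# Arithmetic pairs columns by position; the result takes its names from `left`.
res <- left - right
names(res)
```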
Also, across2 from dplyover offers a more compact option:
library(dplyover) # https://github.com/TimTeaFan/dplyover
df %>%
  mutate(across2(-1, -last_col(), ~ .x - .y, .names = "{xcol}-{ycol}"))
# A tibble: 5 × 7
      A     B     C     D `B-A` `C-B` `D-C`
  <int> <int> <int> <int> <int> <int> <int>
1     2     3     4     3     1     1    -1
2     3     1     5     1    -2     4    -4
3     4     2     2     5    -2     0     3
4     5     5     1     1     0    -4     0
5     5     5     1     2     0    -4     1
If the default underscore separator in .names is acceptable, it is even simpler:
df %>%
  mutate(across2(-1, -last_col(), `-`))
# A tibble: 5 × 7
      A     B     C     D   B_A   C_B   D_C
  <int> <int> <int> <int> <int> <int> <int>
1     2     3     4     3     1     1    -1
2     3     1     5     1    -2     4    -4
3     4     2     2     5    -2     0     3
4     5     5     1     1     0    -4     0
5     5     5     1     2     0    -4     1
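The answer mentions that there are easier ways without spelling one out. One possibility, offered here as an assumption rather than as what the answerer had in mind, is to skip across entirely: compute the differences with the base R expression from the question, build the names with paste(), and bind the result to the original columns. The sketch uses a plain data.frame and base R only; `nms`, `diffs` and `out` are illustrative names.

```r
df <- data.frame(
  A = c(2L, 3L, 4L, 5L, 5L),
  B = c(3L, 1L, 2L, 5L, 5L),
  C = c(4L, 5L, 2L, 1L, 1L),
  D = c(3L, 1L, 5L, 1L, 2L)
)

nms <- names(df)

# Each column minus the column before it (same expression as in the question).
diffs <- df[-1] - df[-ncol(df)]

# Build "B-A", "C-B", "D-C" from the column names themselves.
names(diffs) <- paste(nms[-1], nms[-length(nms)], sep = "-")

# Append the difference columns to the original data.
out <- cbind(df, diffs)
out
```

In a dplyr pipeline, dplyr::bind_cols(df, diffs) should be able to take the place of cbind().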