Problems with do, add_row and group_by: need the group_by variable name in add_row
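Question: I want to use add_row with dplyr/tibble. In my example I want to group the data by A and then add a row to each group that contains the group name in A, followed by a value for B.
The problem I am facing is getting the group_by variable A into column A of the added row. Whatever I try, it always returns either an error or NA as the value in that column.
Reproducible example: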
example <- data.frame(A = sample(letters[1:3], 10, replace = TRUE),
                      B = sample(letters[24:26], 10, replace = TRUE),
                      C = sample(1:3, 10, replace = TRUE))
Output of the example data:
A B C
1 c y 2
2 b x 3
3 c y 1
4 b y 1
5 c z 1
6 a x 1
7 b x 1
8 c z 2
9 a y 3
10 c y 1
The code I am trying to run:
answer <- example %>%
  mutate(A = as.character(A),
         B = as.character(B)) %>%
  group_by(A) %>%
  do(add_row(.,
             B = "ADDED",
             C = "ADDED"))
Output:
A B C
1 a x 1
2 a y 3
3 <NA> ADDED ADDED
4 b x 3
5 b y 1
6 b x 1
7 <NA> ADDED ADDED
8 c y 2
9 c y 1
10 c z 1
11 c z 2
12 c y 1
13 <NA> ADDED ADDED
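So in the output, wherever there is an NA I want the group name (a, b, or c) instead.
I have tried simply putting the grouping variable name there, but that does not work and throws an error.
Thanks!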
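A minimal sketch of where the NA comes from, assuming tibble's documented behaviour that add_row() fills every column not named in the call with NA. The small tibble `grp` is hypothetical and only stands in for the rows of one group as they are passed to do():

```r
library(tibble)

# Hypothetical stand-in for the rows of group "a" as seen inside do()
grp <- tibble(A = c("a", "a"), B = c("x", "y"), C = c("1", "3"))

# A is not named in the call, so the new row gets NA in column A
without_a <- add_row(grp, B = "ADDED", C = "ADDED")

# Naming A explicitly fills it with the group's own value
with_a <- add_row(grp, A = grp$A[1], B = "ADDED", C = "ADDED")
```

The grouping only decides which rows reach add_row(); it does not supply a value for A in the new row, so that value has to be given explicitly or filled in afterwards.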
library(zoo)
df=read.table(text='A B C
1 a x 1
2 a y 3
3 NA ADDED ADDED
4 b x 3
5 b y 1
6 b x 1
7 NA ADDED ADDED
8 c y 2
9 c y 1
10 c z 1
11 c z 2
12 c y 1
13 NA ADDED ADDED',header=TRUE,stringsAsFactors=FALSE)
df$A=na.locf(df$A)
> df
A B C
1 a x 1
2 a y 3
3 a ADDED ADDED
4 b x 3
5 b y 1
6 b x 1
7 b ADDED ADDED
8 c y 2
9 c y 1
10 c z 1
11 c z 2
12 c y 1
13 c ADDED ADDED
You can add it directly inside do.
example %>%
  mutate_if(is.factor, as.character) %>%
  group_by(A) %>%
  do(add_row(.,
             A = unique(.$A),
             B = "ADDED",
             C = "ADDED"))
Or use tidyr::fill at the end. Because it fills the grouping variable, you have to ungroup first.
library(tidyr)
example %>%
  mutate_if(is.factor, as.character) %>%
  group_by(A) %>%
  do(add_row(.,
             B = "ADDED",
             C = "ADDED")) %>%
  ungroup() %>%
  fill(A)
# A tibble: 13 x 3
A B C
<chr> <chr> <chr>
1 a z 2
2 a x 1
3 a y 2
4 a ADDED ADDED
5 b y 1
6 b z 1
7 b ADDED ADDED
8 c z 2
9 c y 2
10 c z 2
11 c y 2
12 c z 1
13 c ADDED ADDED
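The same per-group row addition can also be sketched with dplyr's group_modify() in place of do(). This is not one of the approaches given in the thread; it assumes a dplyr version that provides group_modify(), and it converts C to character up front so that "ADDED" fits the column. The fixed seed is an assumption added only to make the sketch repeatable.

```r
library(dplyr)
library(tibble)

set.seed(1)  # assumption: fixed seed, only for repeatability
example <- tibble(
  A = sample(letters[1:3], 10, replace = TRUE),
  B = sample(letters[24:26], 10, replace = TRUE),
  C = as.character(sample(1:3, 10, replace = TRUE))
)

result <- example %>%
  group_by(A) %>%
  # .x holds the rows of one group without the grouping column A;
  # group_modify() puts A back on every returned row, including the new one
  group_modify(~ add_row(.x, B = "ADDED", C = "ADDED")) %>%
  ungroup()
```

Since group_modify() reattaches the group key to every row the function returns, the added rows carry their group name without a separate fill step.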
library(tidyverse)
example <- tibble(A = sample(letters[1:3], 10, replace = TRUE),
                  B = sample(letters[24:26], 10, replace = TRUE),
                  C = sample(1:3, 10, replace = TRUE)) %>%
  mutate(C = as.character(C)) %>%
  arrange(A)
to_be_added <- example %>% distinct(A) %>% cbind(B = "ADDED", C = "ADDED")
bind_rows(example, to_be_added) %>% arrange(A)
#> # A tibble: 13 x 3
#> A B C
#> <chr> <chr> <chr>
#> 1 a z 2
#> 2 a y 1
#> 3 a ADDED ADDED
#> 4 b x 1
#> 5 b z 1
#> 6 b z 1
#> 7 b y 3
#> 8 b y 1
#> 9 b y 1
#> 10 b ADDED ADDED
#> 11 c y 1
#> 12 c z 1
#> 13 c ADDED ADDED