Tidyverse unnest_tokens does not work inside function
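I have an unnest_tokens call that works when I run it directly in a script, but as soon as I put it inside a function it fails. I don't understand why wrapping it in a function makes a difference.
Data: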
id words
1 why is this function not working
2 more text
3 help me
4 thank you
5 in advance
6 xx xx
The data was read with stringsAsFactors = FALSE, and I checked that the columns are vectors:
is.vector(data$words)
[1] TRUE
is.vector(data$id)
[1] TRUE
typeof(data$words)
[1] "character"
Here is the code outside a function, which gives the correct output:
df <- x %>%
  unnest_tokens(word, words) %>%
  group_by(id)
1 why
1 is
1 this
1 function
1 not
1 working
2 more
2 text
3 help
3 me
4 thank
4 you
5 in
5 advance
6 xx
6 xx
As soon as I put the code into a function, I get an error.
unnestDF <- function(df, col, groupbyCol) {
  x <- df %>%
    unnest_tokens(word, df[col]) %>%
    group_by(df[groupbyCol])
  return(x)
}
tidy_x <- unnestDF(data, "words", "id")
Error in check_input(x) :
Input must be a character vector of any length or a list of character
vectors, each of which has a length of 1.
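Thank you in advance.
Because the column names are passed as quoted arguments (strings), one option is to convert the string to a symbol and unquote it (!!) inside unnest_tokens. For the grouping step, use group_by_at instead of group_by, since group_by_at accepts strings.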
unnestDF <- function(df, col, groupbyCol) {
  df %>%
    unnest_tokens(word, !! rlang::sym(col)) %>%
    group_by_at(groupbyCol)
}
unnestDF(data, "words", "id")
# A tibble: 16 x 2
# Groups: id [6]
# id word
# * <int> <chr>
# 1 1 why
# 2 1 is
# 3 1 this
# 4 1 function
# 5 1 not
# 6 1 working
# 7 2 more
# 8 2 text
# 9 3 help
#10 3 me
#11 4 thank
#12 4 you
#13 5 in
#14 5 advance
#15 6 xx
#16 6 xx
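For context on why the original attempt failed: df[col] with single brackets returns a one-column data frame rather than a column name, so unnest_tokens does not receive the character column it expects. The sketch below is a variation on the function above, assuming dplyr 1.0.0 or later, where group_by_at is superseded by across(). The name unnestDF2 and the reconstructed data frame are illustrative only and are not part of the original post.

```r
library(dplyr)
library(tidytext)

# Sample data reconstructed from the question (illustrative)
data <- data.frame(
  id = 1:6,
  words = c("why is this function not working", "more text", "help me",
            "thank you", "in advance", "xx xx"),
  stringsAsFactors = FALSE
)

# Hypothetical variant of unnestDF: col and groupbyCol are column names
# passed as strings
unnestDF2 <- function(df, col, groupbyCol) {
  df %>%
    # turn the string into a symbol and unquote it so that
    # unnest_tokens sees a bare column name
    unnest_tokens(word, !!rlang::sym(col)) %>%
    # across(all_of()) selects the grouping column(s) by string
    group_by(across(all_of(groupbyCol)))
}

tidy_x <- unnestDF2(data, "words", "id")
```

Because all_of() takes a character vector, groupbyCol could also hold several column names if the data had more than one grouping column.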