tidyverse: splitting a string into data.frame rows
I want to split a string on \n into the rows of a data.frame. The code given below does not work as required. Any hints?
library(tidyverse)
Test <- "ASD 7\nDEF \n This"
library(stringr)
str_split(string = Test, pattern = "\n")
[[1]]
[1] "ASD 7" "DEF " " This"
tb <-
as_tibble(Test) %>%
set_names("Test")
tb %>%
str_split(string = Test, pattern = "\n")
[[1]]
[1] NA
Warning message:
In stri_split_regex(string, pattern, n = n, simplify = simplify, :
NAs introduced by coercion
Required output:
ASD 7
DEF
This
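str_split is designed to work on atomic vectors, not on data sets. It has no data argument, so the piped tibble is not used as the string; because string and pattern are already named, it is matched to the next positional argument (n) instead, which produces the NA and the coercion warning. It therefore only works like this: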
str_split(tb$Test, '\n')
[[1]]
[1] "ASD 7" "DEF " " This"
Or
tb %>%
  mutate(chr_list = str_split(Test, '\n'))
# A tibble: 1 x 2
Test chr_list
<chr> <list>
1 "ASD 7\nDEF \n This" <chr [3]>
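The list column above still holds all three pieces in a single row. A minimal sketch of expanding it into one row per piece, assuming tidyr's unnest_longer() is available (tidyr 1.0.0 or later); the name rows is only illustrative:

```r
library(dplyr)
library(tidyr)
library(stringr)

Test <- "ASD 7\nDEF \n This"
tb <- tibble(Test = Test)

# Split into a list column, then expand the list so each piece gets its own row.
rows <- tb %>%
  mutate(chr_list = str_split(Test, "\n")) %>%
  unnest_longer(chr_list) %>%
  select(chr_list)
```

The pieces keep their surrounding whitespace at this point; trimming is a separate step.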
Besides, if you prefer to do it inside the data frame, you can use tidyr::separate or tidyr::separate_rows() like this:
tb %>%
separate_rows(Test, sep = '\n')
# A tibble: 3 x 1
Test
<chr>
1 "ASD 7"
2 "DEF "
3 " This"
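In newer tidyr releases separate_rows() is marked as superseded. A sketch of the same split with separate_longer_delim(), assuming tidyr 1.3.0 or later is installed; note that delim is treated as a fixed string by default, not as a regular expression:

```r
library(tidyr)

tb <- tibble(Test = "ASD 7\nDEF \n This")

# Assumes tidyr >= 1.3.0. One output row per newline-delimited piece.
rows_longer <- separate_longer_delim(tb, Test, delim = "\n")
```

As with separate_rows(Test, sep = '\n'), the whitespace around each piece is preserved.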
Or
tb %>%
separate(Test, into = c('A', 'B', 'C'), sep = '\n')
# A tibble: 1 x 3
A B C
<chr> <chr> <chr>
1 ASD 7 "DEF " " This"
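PS: if you also want to strip the surrounding whitespace, you can use '\\s*\\n+\\s*' as the separator pattern (the backslashes must be doubled inside an R string), or trim each piece with str_trim() after splitting: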
tb %>%
transmute(text_data = map(str_split(Test, '\n'), ~ str_trim(.x))) %>%
unnest_longer(text_data)
# A tibble: 3 x 1
text_data
<chr>
1 ASD 7
2 DEF
3 This
Or
tb %>%
  separate_rows(Test, sep = "\\s*\\n+\\s*")
# A tibble: 3 x 1
Test
<chr>
1 ASD 7
2 DEF
3 This
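If this is needed in more than one place, the split and the trim can be wrapped in a small helper. This is only a sketch: split_lines_to_rows and the column name text are hypothetical names, not part of any package, and a string that ends in a newline would still yield a trailing empty row.

```r
library(stringr)
library(tibble)

# Hypothetical helper: split each string on newlines (swallowing the
# whitespace around them) and return one trimmed piece per row.
split_lines_to_rows <- function(x) {
  pieces <- unlist(str_split(x, "\\s*\\n+\\s*"))
  tibble(text = str_trim(pieces))
}

result <- split_lines_to_rows("ASD 7\nDEF \n This")
```

The final str_trim() only matters for whitespace at the very start or end of the input, which the separator pattern does not touch.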