Convert list of character vectors to tidy data frame
I have a list of character vectors that I want to convert into a tidy data frame. The character vectors have unequal lengths.
dput(data)
list(`ko03008 Ribosome biogenesis in eukaryotes` = c("G5382",
"G13330", "G4043", "G13255"), `ko03010 Ribosome` = c("G16823",
"G4822", "G11737", "G114", "G18144", "G6031", "G24182", "G9882",
"G14270", "G16903", "G2506", "G3550"), `ko03013 RNA transport` = c("G18058",
"G20817", "G6913", "G18004", "G4129", "G5382", "G5264", "G17529",
"G5114", "G21371", "G19351", "G15511", "G1049", "G14663"), `ko03015 mRNA surveillance pathway` = c("G20817",
"G6913", "G18004", "G4129", "G5382", "G19351", "G15511", "G1463"
), `ko03018 RNA degradation` = c("G11453", "G7437", "G11483",
"G12095"), `ko03020 RNA polymerase` = c("G13069", "G10917", "G6973",
"G7432"))
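I want to create a data frame with two columns: one with the name of each character vector in the list (e.g. 'ko03008 Ribosome biogenesis in eukaryotes') and the other with the gene IDs (e.g. 'G5382').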
I used enframe to create a tibble, as follows:
But I would like to format it like this (example for the first vector in the list):
Use unnest_longer:
library(tidyverse)
data %>%
enframe() %>%
unnest_longer(value)
# A tibble: 46 x 2
   name                                      value
   <chr>                                     <chr>
 1 ko03008 Ribosome biogenesis in eukaryotes G5382
 2 ko03008 Ribosome biogenesis in eukaryotes G13330
 3 ko03008 Ribosome biogenesis in eukaryotes G4043
 4 ko03008 Ribosome biogenesis in eukaryotes G13255
 5 ko03010 Ribosome                          G16823
 6 ko03010 Ribosome                          G4822
 7 ko03010 Ribosome                          G11737
 8 ko03010 Ribosome                          G114
 9 ko03010 Ribosome                          G18144
10 ko03010 Ribosome                          G6031
# ... with 36 more rows
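To make the two steps easier to follow, here is a minimal sketch on a reduced copy of the list, keeping only two of the pathways from the question. The object names `small`, `nested`, `long` and `long_base` are made up for illustration. It also shows `stack()` as a possible base R alternative, which is not part of the answer above and returns differently named columns (`values` and `ind`).

```r
library(tibble)
library(tidyr)

# Reduced copy of the list from the question (two pathways, unequal lengths).
# `small` is a hypothetical name used only for this sketch.
small <- list(
  `ko03008 Ribosome biogenesis in eukaryotes` =
    c("G5382", "G13330", "G4043", "G13255"),
  `ko03015 mRNA surveillance pathway` =
    c("G20817", "G6913", "G18004", "G4129", "G5382", "G19351", "G15511", "G1463")
)

# Step 1: enframe() returns one row per list element;
# `name` holds the element names and `value` is a list-column of vectors.
nested <- enframe(small)

# Step 2: unnest_longer() expands the list-column to one row per gene ID,
# repeating the pathway name for each ID.
long <- unnest_longer(nested, value)

# Base R alternative (an assumption, not from the answer above):
# stack() returns a data.frame with columns `values` (gene IDs)
# and `ind` (pathway names, stored as a factor).
long_base <- stack(small)
```

If the default column names are not wanted, `enframe()` accepts `name =` and `value =` arguments, in which case the column passed to `unnest_longer()` has to be renamed to match.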