Pairing columns and reshaping a data frame from wide to long in R
I have a data frame like this:
id institution name_a info_a bfullname idb
1 A Chet Baker 666 Clifford Brown 123
I need to reshape it, keeping id and institution and pairing up the remaining columns so that their values stay together, like this:
id institution role name id_name
1 A student Chet Baker 666
1 A teacher Clifford Brown 123
The role column is determined by the column name, and I have a lookup table that identifies each one, as follows:
value id
name_a student
bfullname teacher
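The problem is that I have many columns with different names, so I need a way to specify which columns belong together, or perhaps a solution where I rename the columns first and then reshape.
I have read many threads on reshape, dcast, melt and so on, but I still can't figure it out.
Is there a way to do this?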
Forget reshape and use tidyr:
library(dplyr)
library(tidyr)

# Example data in wide format
df <- tribble(
  ~id, ~institution, ~name_a, ~info_a, ~bfullname, ~idb,
  1, "A", "Chet Baker", 666, "Clifford Brown", 123,
  2, "B", "George Baker", 123, "Charlie Brown", 234,
  3, "C", "Banket Baker", 456, "James Brown", 647,
  4, "D", "Koeken Baker", 789, "Golden Brown", 967
)

# Lookup table: name column, its role, and the column holding the paired id
def <- tribble(
  ~value, ~roleid, ~info,
  "name_a", "student", "info_a",
  "bfullname", "teacher", "idb"
)
def

# Stack every column except id and institution into key/value pairs
# (the mixed column types are coerced to character)
dflong <- df %>%
  gather(key, value, -id, -institution)

dflong %>%
  filter(key %in% def$value) %>%                 # keep only the name columns
  rename(role = key, name = value) %>%
  inner_join(def, by = c('role' = 'value')) %>%  # attach the role label and paired column
  left_join(dflong %>% select(-institution),     # look up the paired value
            by = c('id' = 'id', 'info' = 'key'))
This results in:
# A tibble: 8 x 7
id institution role name roleid info value
<dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 A name_a Chet Baker student info_a 666
2 2 B name_a George Baker student info_a 123
3 3 C name_a Banket Baker student info_a 456
4 4 D name_a Koeken Baker student info_a 789
5 1 A bfullname Clifford Brown teacher idb 123
6 2 B bfullname Charlie Brown teacher idb 234
7 3 C bfullname James Brown teacher idb 647
8 4 D bfullname Golden Brown teacher idb 967
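If the goal is exactly the id / institution / role / name / id_name layout from the question, one possible variation is to rename the columns from the lookup table first and then let pivot_longer() split the new names. This is only a sketch under assumptions: it needs tidyr 1.0.0 or later (pivot_longer() is newer than gather()) and a dplyr version whose rename() accepts all_of(). The helper vector new_names and the "__" separator are hypothetical choices made for illustration.

```r
library(dplyr)
library(tidyr)

# One-row example from the question
df <- tribble(
  ~id, ~institution, ~name_a, ~info_a, ~bfullname, ~idb,
  1, "A", "Chet Baker", 666, "Clifford Brown", 123
)

# Lookup table: name column, role label, and the paired id column
def <- tribble(
  ~value, ~roleid, ~info,
  "name_a", "student", "info_a",
  "bfullname", "teacher", "idb"
)

# Hypothetical helper: new names of the form "<field>__<role>",
# stored as c(new_name = "old_name") so rename() can consume them
new_names <- c(
  setNames(def$value, paste0("name__", def$roleid)),
  setNames(def$info, paste0("id_name__", def$roleid))
)

result <- df %>%
  rename(all_of(new_names)) %>%
  pivot_longer(
    cols = -c(id, institution),
    names_to = c(".value", "role"),  # part before "__" becomes a column, part after becomes role
    names_sep = "__"
  )

result
```

Because the pairing lives entirely in def, adding another name/id pair should only require one more row in the lookup table.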
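library(data.table)
setDT(df)
# Alternative: data.table's melt() accepts a list in measure.vars, so several
# groups of columns are melted together (columns 3 and 5 into "name",
# columns 4 and 6 into "id_name")
melt(
  df,
  id.vars = 1:2,
  measure.vars = list(name = c(3, 5), id_name = c(4, 6)),
  variable.name = "role"
)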
#> id institution role name id_name
#> 1: 1 A 1 Chet Baker 666
#> 2: 1 A 2 Clifford Brown 123
where df is:
df <- read.table(text = '
id institution name_a info_a bfullname idb
1 A "Chet Baker" 666 "Clifford Brown" 123
', header = TRUE)
Created on 2019-02-14 by the reprex package (v0.2.1)
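The melt() call above leaves role as 1 and 2 rather than student and teacher. A minimal sketch of one way to map those numbers back to labels follows. It assumes the lookup table is stored as a data.table named def with columns value, roleid and info (borrowed from the tidyr answer, not from this one), and that the role number follows the order of the columns passed in measure.vars.

```r
library(data.table)

# One-row example from the question
df <- data.table(
  id = 1, institution = "A",
  name_a = "Chet Baker", info_a = 666,
  bfullname = "Clifford Brown", idb = 123
)

# Assumed lookup table: name column, role label, paired id column
def <- data.table(
  value  = c("name_a", "bfullname"),
  roleid = c("student", "teacher"),
  info   = c("info_a", "idb")
)

# Take the paired columns from def instead of hard-coding positions
long <- melt(
  df,
  id.vars = c("id", "institution"),
  measure.vars = list(name = def$value, id_name = def$info),
  variable.name = "role"
)

# role holds the position of each pair within measure.vars;
# use it to index the role labels in def
long[, role := def$roleid[as.integer(as.character(role))]]

long
```

Driving measure.vars from def keeps the pairing in one place, which may help when there are many differently named columns.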