Replacing column with another data frame based on name matching
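Hi, I'm fairly new to this, so I'm not sure I'm going about it the right way, but I've looked around and couldn't find code or advice that works for my case.
I have a data frame mainDF that looks like this: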
Person | ABG | SEP | CLC | XSP | APP | WED | GSH |
---|---|---|---|---|---|---|---|
SP-1 | 2.1 | 3.0 | 1.3 | 1.8 | 1.4 | 2.5 | 1.4 |
SP-2 | 2.5 | 2.1 | 2.0 | 1.9 | 1.2 | 1.2 | 2.1 |
SP-3 | 2.3 | 3.1 | 2.5 | 1.5 | 1.1 | 2.6 | 2.1 |
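I have another data frame, TranslateDF, which holds the translations for the abbreviated column names. I want to replace the abbreviated names with the full names given here:
Note that the translation data frame may contain extraneous entries, or it may be missing entries for some columns. Any column of mainDF that does not get a full name should be dropped from the data.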
Abbreviated | Full Naming |
---|---|
ABG | All barbecue grill |
SEP | shake eel peel |
CLC | cold loin cake |
XSP | xylophone spear pint |
APP | apple pot pie |
HUM | hall united meat |
LPL | lending porkloin |
Ideally, I would end up with this result data frame:
Person | All barbecue grill | shake eel peel | cold loin cake | xylophone spear pint | apple pot pie |
---|---|---|---|---|---|
SP-1 | 2.1 | 3.0 | 1.3 | 1.8 | 1.4 |
SP-2 | 2.5 | 2.1 | 2.0 | 1.9 | 1.2 |
SP-3 | 2.3 | 3.1 | 2.5 | 1.5 | 1.1 |
Any help would be greatly appreciated!
How about this:
mainDF <- structure(list(Person = c("SP-1", "SP-2", "SP-3"), ABG = c(2.1,
2.5, 2.3), SEP = c(3, 2.1, 3.1), CLC = c(1.3, 2, 2.5), XSP = c(1.8,
1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6), GSH = c(1.4,
2.1, 2.1)), row.names = c(NA, 3L), class = "data.frame")
translateDF <- structure(list(Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP",
"HUM", "LPL"), `Full Naming` = c("All barbecue grill", "shake eel peel",
"cold loin cake", "xylophone spear pint", "apple pot pie", "hall united meat",
"lending porkloin")), row.names = c(NA, 7L), class = "data.frame")
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
mainDF %>%
  pivot_longer(-Person,
               names_to = "Abbreviated",
               values_to = "vals") %>%
  left_join(translateDF) %>%
  select(-Abbreviated) %>%
  na.omit() %>%
  pivot_wider(names_from = `Full Naming`, values_from = "vals")
#> Joining, by = "Abbreviated"
#> # A tibble: 3 × 6
#>   Person `All barbecue grill` `shake eel peel` `cold loin cake` `xylophone spe…`
#>   <chr>                 <dbl>            <dbl>            <dbl>            <dbl>
#> 1 SP-1                    2.1              3                1.3              1.8
#> 2 SP-2                    2.5              2.1              2                1.9
#> 3 SP-3                    2.3              3.1              2.5              1.5
#> # … with 1 more variable: `apple pot pie` <dbl>
Created on 2022-04-24 by the reprex package (v2.0.1)
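One possible variation on the pipeline above, offered as a sketch rather than a correction: `na.omit()` drops every row that contains an `NA`, so it would also discard genuinely missing measurements, not only the untranslated columns. Using `inner_join()` keeps only the abbreviations that have a match in the lookup table, which makes the `na.omit()` step unnecessary. The data below is rebuilt from the question so the block runs on its own; `result` is just an illustrative name.

```r
library(dplyr)
library(tidyr)

# Data rebuilt from the question
mainDF <- data.frame(
  Person = c("SP-1", "SP-2", "SP-3"),
  ABG = c(2.1, 2.5, 2.3), SEP = c(3.0, 2.1, 3.1), CLC = c(1.3, 2.0, 2.5),
  XSP = c(1.8, 1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6),
  GSH = c(1.4, 2.1, 2.1)
)
translateDF <- data.frame(
  Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", "HUM", "LPL"),
  `Full Naming` = c("All barbecue grill", "shake eel peel", "cold loin cake",
                    "xylophone spear pint", "apple pot pie",
                    "hall united meat", "lending porkloin"),
  check.names = FALSE  # keep the space in "Full Naming"
)

result <- mainDF %>%
  pivot_longer(-Person, names_to = "Abbreviated", values_to = "vals") %>%
  # inner_join keeps only abbreviations present in translateDF,
  # so WED and GSH drop out here without touching NA values
  inner_join(translateDF, by = "Abbreviated") %>%
  select(-Abbreviated) %>%
  pivot_wider(names_from = `Full Naming`, values_from = vals)

names(result)
```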
library(tidyverse)
mainDF %>%
  rename_with(~str_replace_all(., set_names(TranslateDF[, 2], TranslateDF[, 1]))) %>%
  select(Person, which(!(names(.) %in% names(mainDF))))
  Person All barbecue grill shake eel peel cold loin cake xylophone spear pint apple pot pie
1   SP-1                2.1            3.0            1.3                  1.8           1.4
2   SP-2                2.5            2.1            2.0                  1.9           1.2
3   SP-3                2.3            3.1            2.5                  1.5           1.1
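A caveat worth keeping in mind with the approach above: `str_replace_all()` treats the abbreviations as regular-expression patterns and replaces them wherever they occur inside a name, so a column called, say, `APPX` would be partially rewritten. If exact matching matters, a base R sketch using `match()` is one alternative. The names `idx`, `keep`, and `result` are illustrative only, and the data is rebuilt from the question.

```r
# Data rebuilt from the question
mainDF <- data.frame(
  Person = c("SP-1", "SP-2", "SP-3"),
  ABG = c(2.1, 2.5, 2.3), SEP = c(3.0, 2.1, 3.1), CLC = c(1.3, 2.0, 2.5),
  XSP = c(1.8, 1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6),
  GSH = c(1.4, 2.1, 2.1),
  stringsAsFactors = FALSE
)
TranslateDF <- data.frame(
  Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", "HUM", "LPL"),
  `Full Naming` = c("All barbecue grill", "shake eel peel", "cold loin cake",
                    "xylophone spear pint", "apple pot pie",
                    "hall united meat", "lending porkloin"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Position of each mainDF column name in the lookup table (NA if absent)
idx <- match(names(mainDF), TranslateDF$Abbreviated)

# Keep columns that have a translation, plus the Person id column
keep <- !is.na(idx) | names(mainDF) == "Person"
result <- mainDF[, keep, drop = FALSE]

# Translated name where one exists, original name otherwise (i.e. Person)
names(result) <- ifelse(is.na(idx[keep]),
                        names(mainDF)[keep],
                        TranslateDF[["Full Naming"]][idx[keep]])

names(result)
```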
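You can pass a named vector to select(), which renames and selects in one step. Wrapping it in any_of() ensures the call does not fail if some of the columns are missing from the main data frame: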
library(dplyr)
df1 %>%
  select(Person, any_of(setNames(df2$Abbreviated, df2$Full_Naming)))
# A tibble: 3 x 6
  Person `All barbecue grill` `shake eel peel` `cold loin cake` `xylophone spear pint` `apple pot pie`
  <chr>                 <dbl>            <dbl>            <dbl>                  <dbl>           <dbl>
1 SP-1                    2.1              3                1.3                    1.8             1.4
2 SP-2                    2.5              2.1              2                      1.9             1.2
3 SP-3                    2.3              3.1              2.5                    1.5             1.1
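To see the two mechanisms separately, here is a small sketch on a cut-down data frame. The names `df`, `lookup`, `renamed`, and `strict` are made up for the illustration. The names of the vector become the new column names, the values are the existing columns to pick, and `any_of()` silently skips `HUM` because it is not in `df`, whereas the stricter `all_of()` raises an error for it.

```r
library(dplyr)

# Cut-down illustrative data: WED has no translation, HUM has no column
df <- data.frame(
  Person = c("SP-1", "SP-2"),
  ABG = c(2.1, 2.5),
  WED = c(2.5, 1.2)
)

# Named vector: names are the new column names, values are the old ones
lookup <- c(`All barbecue grill` = "ABG", `hall united meat` = "HUM")

# any_of() ignores lookup entries that are not columns of df
renamed <- df %>% select(Person, any_of(lookup))

# all_of() requires every entry to exist, so the missing HUM is an error
strict <- tryCatch(
  df %>% select(Person, all_of(lookup)),
  error = function(e) "error"
)

names(renamed)
```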
Data:
df1 <- structure(list(Person = c("SP-1", "SP-2", "SP-3"), ABG = c(2.1,
2.5, 2.3), SEP = c(3, 2.1, 3.1), CLC = c(1.3, 2, 2.5), XSP = c(1.8,
1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6), GSH = c(1.4,
2.1, 2.1)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), spec = structure(list(cols = list(
Person = structure(list(), class = c("collector_character",
"collector")), ABG = structure(list(), class = c("collector_double",
"collector")), SEP = structure(list(), class = c("collector_double",
"collector")), CLC = structure(list(), class = c("collector_double",
"collector")), XSP = structure(list(), class = c("collector_double",
"collector")), APP = structure(list(), class = c("collector_double",
"collector")), WED = structure(list(), class = c("collector_double",
"collector")), GSH = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
df2 <- structure(list(Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP",
"HUM", "LPL"), Full_Naming = c("All barbecue grill", "shake eel peel",
"cold loin cake", "xylophone spear pint", "apple pot pie", "hall united meat",
"lending porkloin")), class = "data.frame", row.names = c(NA,
-7L))