Grep and append a column in R
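I have an HR dataset of 2,000 rows and need to append a column after searching for a string pattern. I want to match the Edu column of df2 against df1 (the values sometimes do not match exactly) and print the corresponding Dep row.
Also, when there is no matching Edu pattern in df1, paste the same Dep string again instead of NA, and vice versa (as in the last two rows of the expected output). Any suggestions? Thanks.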
df2 <- data.frame(Dep = c("Research & Development", "Sales", "Research & Development", "Research & Development", "sales", "sales"), Edu = c("Life Sciences", "Marketing", "paramedical", "Other", "Technical studies","Business"))
df1 <- data.frame(Dep = c("Sales", "Sales", "Research & Development", "Research & Development", "Human Resources", "Research & Development", "legal section"), Edu = c("Life Sciences", "Marketing", "Medical", "Other", "Human Resources", "Technical", "Law"))
Expected output
Dep_df1                 Edu_df1_df2      Dep_df2
Sales                   Life Sciences    Research & Development
Sales                   Marketing        Sales
Research & Development  Medical          Research & Development
Research & Development  Other            Research & Development
Human Resources         Human Resources  Human Resources
Research & Development  Technical        Sales
legal section           Law              legal section
sales                   Business         sales
One possible way is to use a dplyr join. Columns that share a name in both data frames but are not join keys get the suffixes .x and .y.
library(dplyr)
df1 <- data.frame(Dep = c("S", "S", "R"), Edu = c("LS", "M", "O"))
df2 <- data.frame(Dep = c("G", "L", "Q"), Edu = c("LS", "M", "O"))
# Join on the shared Edu column; Dep from df2 becomes Dep.x and Dep from df1 becomes Dep.y
df2 %>% left_join(df1, by = "Edu")
After some attempts, this worked.
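# Left join df2's Dep onto df1 by exact Edu; the clashing Dep columns become Dep.x (df1) and Dep.y (df2)
dd <- merge(df1, df2[, c("Edu", "Dep")], by = "Edu", all.x = TRUE)
# Where Dep.y is NA, pmax(..., na.rm = TRUE) falls back to Dep.x; where both are present it returns the string that sorts later
transform(dd, dep.yfill = pmax(Dep.x, Dep.y, na.rm = TRUE))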
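Since the question also asks for partial matches (for example "Technical" against "Technical studies") and for unmatched rows from both data frames, here is a minimal base R sketch of one way that could be done. It assumes that "matching" means a case-insensitive substring test of each df1 Edu value inside the df2 Edu values, and that the first hit wins; the names match_idx, matched, left_over, extra and result are made up for illustration and do not come from the original posts.

```r
df1 <- data.frame(
  Dep = c("Sales", "Sales", "Research & Development", "Research & Development",
          "Human Resources", "Research & Development", "legal section"),
  Edu = c("Life Sciences", "Marketing", "Medical", "Other",
          "Human Resources", "Technical", "Law"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Dep = c("Research & Development", "Sales", "Research & Development",
          "Research & Development", "sales", "sales"),
  Edu = c("Life Sciences", "Marketing", "paramedical", "Other",
          "Technical studies", "Business"),
  stringsAsFactors = FALSE
)

# For each df1$Edu, find the first df2$Edu that contains it
# (assumption: case-insensitive substring match, first hit wins).
match_idx <- vapply(df1$Edu, function(e) {
  hit <- grep(tolower(e), tolower(df2$Edu), fixed = TRUE)
  if (length(hit) > 0) hit[1] else NA_integer_
}, integer(1), USE.NAMES = FALSE)

# Rows of df1: take Dep from the matched df2 row, or repeat df1's Dep if no match.
matched <- data.frame(
  Dep_df1     = df1$Dep,
  Edu_df1_df2 = df1$Edu,
  Dep_df2     = ifelse(is.na(match_idx), df1$Dep, df2$Dep[match_idx]),
  stringsAsFactors = FALSE
)

# Rows of df2 that no df1 row matched: repeat df2's Dep on both sides.
left_over <- setdiff(seq_len(nrow(df2)), match_idx)
extra <- data.frame(
  Dep_df1     = df2$Dep[left_over],
  Edu_df1_df2 = df2$Edu[left_over],
  Dep_df2     = df2$Dep[left_over],
  stringsAsFactors = FALSE
)

result <- rbind(matched, extra)
print(result)
```

Unlike pmax, the ifelse(is.na(...)) fill always keeps the df2 value when a match exists. Note that the Dep strings are copied as they are, so the "Technical" row gets "sales" in lower case, as spelled in df2, rather than "Sales".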