In R, how to replace values in a column with values of another column of another data set based on a condition?

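I have two data sets, samples of which are given below. I need to replace the project names in target_df$project_name whenever they appear in registry_df$to_change, using the corresponding value from registry_df$replacement. However, the code I tried does not produce the desired result. How should it be corrected, or is there another way to achieve this?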

Data sets:

target_df <- tibble::tribble(
  ~project_name,     ~sum,   
  "Mark",            "4307",     
  "Boat",            "9567",       
  "Delorean",        "5344",      
  "Parix",           "1043",
)

registry_df <- tibble::tribble(
  ~to_change,     ~replacement,   
  "Mark",            "Duck",     
  "Boat",            "Tank",       
  "Toloune",         "Bordeaux",      
  "Hunge",           "Juron",
)

Desired output for target_df:

project_name        sum   
  "Duck"            "4307"     
  "Tank"            "9567"       
  "Delorean"        "5344"      
  "Parix"           "1043"

Code:

library(data.table)

target_df <- transform(target_df, 
                       project_name = ifelse(target_df$project_name %in% registry_df$to_change),
                       registry_df$replacement,
                       project_name
)

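Base R solution: you can line up the two columns with the match function. Because not every value of target_df$project_name appears in registry_df$to_change, the matched vector will contain NAs. I therefore wrap it in ifelse, which keeps the original value wherever the match is NA.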

matching <- registry_df$replacement[match(target_df$project_name, registry_df$to_change)]
target_df$project_name <- ifelse(is.na(matching),
                                 target_df$project_name,
                                 matching)

target_df now gives the expected output:

  project_name sum  
  <chr>        <chr>
1 Duck         4307 
2 Tank         9567 
3 Delorean     5344 
4 Parix        1043 
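
To see why the NA handling above is needed, it can help to look at what match returns on its own. The following is a small illustrative sketch using plain vectors that mirror the sample data; the variable names idx and looked_up are introduced here only for the example.

```r
# Plain vectors mirroring the sample columns
project_name <- c("Mark", "Boat", "Delorean", "Parix")
to_change    <- c("Mark", "Boat", "Toloune", "Hunge")
replacement  <- c("Duck", "Tank", "Bordeaux", "Juron")

# For each project name, the position of its first match in to_change,
# or NA if it is not listed there
idx <- match(project_name, to_change)
idx
# 1 2 NA NA

# Indexing with NA yields NA, so unmatched names have no replacement yet
looked_up <- replacement[idx]
looked_up
# "Duck" "Tank" NA NA
```

The two NA entries are exactly the rows where the original project name has to be kept, which is what the ifelse step does.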

A dplyr solution. There may be a more elegant way with fewer steps.

library(dplyr)

target_df <- target_df %>% 
  left_join(registry_df,  
            by = c("project_name" = "to_change")) %>% 
  mutate(replacement = ifelse(is.na(replacement), project_name, replacement)) %>% 
  select(project_name = replacement, sum)

Result:

# A tibble: 4 × 2
  project_name sum  
  <chr>        <chr>
1 Duck         4307 
2 Tank         9567 
3 Delorean     5344 
4 Parix        1043
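
One possibly shorter variant, offered as a sketch rather than a definitive answer: dplyr's coalesce returns the first non-missing value at each position, so it can stand in for the ifelse(is.na(...)) step. The example rebuilds the sample data so it runs on its own, and stores the output in result, a name chosen here only for illustration.

```r
library(dplyr)

# Sample data, as in the question
target_df <- tibble::tribble(
  ~project_name, ~sum,
  "Mark",        "4307",
  "Boat",        "9567",
  "Delorean",    "5344",
  "Parix",       "1043"
)

registry_df <- tibble::tribble(
  ~to_change, ~replacement,
  "Mark",     "Duck",
  "Boat",     "Tank",
  "Toloune",  "Bordeaux",
  "Hunge",    "Juron"
)

result <- target_df %>%
  # Attach the replacement (NA where the name is not in the registry)
  left_join(registry_df, by = c("project_name" = "to_change")) %>%
  # Take the replacement when present, otherwise keep the original name
  mutate(project_name = coalesce(replacement, project_name)) %>%
  # Drop the helper column
  select(-replacement)

result
```

This assumes registry_df$to_change has no duplicate entries; with duplicates, the join would return more than one row per matching project.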