Creating new column based on values in preceding column

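I want to add a new column to a data.frame that converts the numeric value in the first column into the corresponding string from one of the later columns, if there is a match. A column matches when its name partially matches the value in the first column.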

In this example, I want to add a 'Highest_Earner' value that depends on the value in the Earner_Number column:

> df1 <- data.frame("Earner_Number" = c(1, 2, 1, 5),
                    "Earner5" = c("Max", "Alex", "Ben", "Mark"),
                    "Earner1" = c("John", "Dora", "Micelle", "Josh"))
> df1
  Earner_Number Earner5 Earner1
1             1     Max    John
2             2    Alex    Dora
3             1     Ben Micelle
4             5    Mark    Josh

The result should be:

> df1
  Earner_Number Earner5 Earner1 Highest_Earner
1             1     Max    John           John
2             2    Alex    Dora        Neither
3             1     Ben Micelle        Micelle
4             5    Mark    Josh           Mark

I have tried slicing the data.frame into various pieces, but I wonder whether anyone has a cleaner approach?

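    # Convert to character so the nested ifelse() returns names rather than
    # factor codes (a no-op on R >= 4.0, where stringsAsFactors defaults to FALSE).
    df1$Earner5 <- as.character(df1$Earner5)
    df1$Earner1 <- as.character(df1$Earner1)

    # Nested ifelse(): pick the column that matches Earner_Number, else "Neither".
    df1$Highest_Earner <- ifelse(df1$Earner_Number == 5, df1$Earner5,
                          ifelse(df1$Earner_Number == 1, df1$Earner1, "Neither"))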

dplyr approach

library(tidyverse)

df <- tibble(Earner_Number = c(1, 2, 1, 5),
             Earner5 = c("Max", "Alex", "Ben", "Mark"),
             Earner1 = c("John", "Dora", "Micelle", "Josh"))

df %>% 
  mutate(Highest_Earner = case_when(Earner_Number == 1 ~ Earner1,
                                    Earner_Number == 5 ~ Earner5,
                                    TRUE ~ 'Neither'))
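
Both answers above hard-code the two Earner columns. If there may be more of them, one possible generalization in base R is to build the target column name from Earner_Number and look it up by name. This is only a sketch: it assumes every candidate column is named "Earner" followed by the number, and the helper names `target`, `col_idx` and `picked` are made up for illustration.

```r
df1 <- data.frame(
  Earner_Number = c(1, 2, 1, 5),
  Earner5 = c("Max", "Alex", "Ben", "Mark"),
  Earner1 = c("John", "Dora", "Micelle", "Josh"),
  stringsAsFactors = FALSE
)

# Assumed naming scheme: the value 1 points at column "Earner1", 5 at "Earner5".
target <- paste0("Earner", df1$Earner_Number)

# Position of that column in df1, or NA when no such column exists.
col_idx <- match(target, names(df1))

# Pick one cell per row with (row, column) matrix indexing;
# rows whose column index is NA come back as NA.
picked <- as.matrix(df1)[cbind(seq_len(nrow(df1)), col_idx)]

# Fall back to "Neither" where no column matched.
df1$Highest_Earner <- ifelse(is.na(col_idx), "Neither", picked)
df1
```

With this lookup, adding an Earner2 column to the data would fill in row 2 without any change to the code.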