Creating new column based on values in preceding column
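I want to add a new column to a data.frame that converts the numeric value in the first column into the corresponding string from a later column whose name partially matches that value, if such a column exists.
In this example, I want to add a 'Highest_Earner' value that depends on the value in the Earner_Number column: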
> df1 <- data.frame("Earner_Number" = c(1, 2, 1, 5),
+                   "Earner5" = c("Max", "Alex", "Ben", "Mark"),
+                   "Earner1" = c("John", "Dora", "Micelle", "Josh"))
> df1
  Earner_Number Earner5 Earner1
1             1     Max    John
2             2    Alex    Dora
3             1     Ben Micelle
4             5    Mark    Josh
The result should be:
> df1
  Earner_Number Earner5 Earner1 Highest_Earner
1             1     Max    John           John
2             2    Alex    Dora        Neither
3             1     Ben Micelle        Micelle
4             5    Mark    Josh           Mark
I have tried cutting the data.frame into various small pieces, but I wonder whether anyone has a neater approach?
# Convert the name columns to character so that the nested ifelse() returns
# the names rather than integer factor codes (needed when they are factors).
df1$Earner5 <- as.character(df1$Earner5)
df1$Earner1 <- as.character(df1$Earner1)
# Use a nested ifelse() to build the new column.
df1$Highest_Earner <- ifelse(df1$Earner_Number == 5, df1$Earner5,
                      ifelse(df1$Earner_Number == 1, df1$Earner1, "Neither"))
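To illustrate why the answer above converts the name columns to character first, here is a minimal sketch. It assumes the name column is a factor, which is what data.frame() produced by default before R 4.0; the vectors earner_number and earner5 are made up for this illustration.

```r
# Assumption: the name column is stored as a factor.
earner_number <- c(5, 2)
earner5 <- factor(c("Max", "Alex"))

# ifelse() does not keep the factor attributes of its "yes" argument,
# so the names are not carried through.
from_factor <- ifelse(earner_number == 5, earner5, "Neither")

# After as.character(), the names come through as expected.
from_character <- ifelse(earner_number == 5, as.character(earner5), "Neither")

print(from_factor)
print(from_character)
```

If the columns are already character (the default from R 4.0 onward), the as.character() calls are harmless but no longer needed.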
dplyr approach
library(tidyverse)
df <- tibble(Earner_Number = c(1, 2, 1, 5),
             Earner5 = c("Max", "Alex", "Ben", "Mark"),
             Earner1 = c("John", "Dora", "Micelle", "Josh"))
df %>%
  mutate(Highest_Earner = case_when(Earner_Number == 1 ~ Earner1,
                                    Earner_Number == 5 ~ Earner5,
                                    TRUE ~ "Neither"))
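Both answers hard-code the values 1 and 5. If there are many EarnerN columns, one possible generalisation in base R is to build the column name from the number and pick the cell with matrix indexing. This is only a sketch: lookup_earner and its arguments are hypothetical names, and it assumes the match is an exact one on "Earner" followed by the number, rather than a looser partial match.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps names as character.
df1 <- data.frame(Earner_Number = c(1, 2, 1, 5),
                  Earner5 = c("Max", "Alex", "Ben", "Mark"),
                  Earner1 = c("John", "Dora", "Micelle", "Josh"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: for each row, look up the column named <prefix><number>.
lookup_earner <- function(data, number_col = "Earner_Number",
                          prefix = "Earner", default = "Neither") {
  # Build the target column name for every row, e.g. "Earner1", "Earner2".
  target <- paste0(prefix, data[[number_col]])
  # Position of each target among the column names (NA if no such column).
  col_idx <- match(target, names(data))
  found <- !is.na(col_idx)
  # Start with the default everywhere, then fill in the rows that matched.
  out <- rep(default, nrow(data))
  # Matrix indexing picks one cell per row: (row i, column col_idx[i]).
  out[found] <- as.matrix(data)[cbind(which(found), col_idx[found])]
  out
}

df1$Highest_Earner <- lookup_earner(df1)
df1
```

With the sample data this gives the same Highest_Earner values as the two answers above, and adding an Earner2 column would be picked up without changing the code.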