R: nested conditionals with two conditions
I've seen some similar questions, but none that quite solves this problem. I hope I haven't missed one.
I have a DF like this:
Invoices<-c(20171100247, 20171100408, 20171200376,20171201052, 21609000088)
Oustanding.days<-c(15,85,96,251,123)
Quantile.low<-c(25,21,22,23,24)
Quantile.Medium<-c(45,65,85,93,74)
Quantile.top<-c(74,89,101,175,125)
Remittances<-c(25,47,5,7,2)
df<-cbind(Invoices,Oustanding.days,Quantile.low,Quantile.Medium,Quantile.top,Remittances)
df
Invoices Oustanding.days Quantile.low Quantile.Medium Quantile.top Remittances
[1,] 20171100247 15 25 45 74 25
[2,] 20171100408 85 21 65 89 47
[3,] 20171200376 96 22 85 101 5
[4,] 20171201052 251 23 93 175 7
[5,] 21609000088 123 24 74 125 2
I want to create a "Payment accuracy" column based on conditions, as follows:
If Remittances is below 5, I want to assign the accuracy linearly:
1) df$Outstanding.days <60 -> print "too early"
2) df$Outstanding.days >60 & <90 -> print "early"
3) df$Outstanding.days >90 -> print "late"
If Remittances is above 5, I want to assign it using the quantiles:
1) df$Outstanding.days < Quantile.low -> print "too early"
2) df$Outstanding.days > Quantile.low & < Quantile.Medium -> print "early"
3) df$Outstanding.days > Quantile.Medium & < Quantile.top -> print "On date"
4) df$Outstanding.days > Quantile.top -> print "late"
I'm trying to use transform with nested conditions:
df.final<-transform(df,Payment.accuracy=(
if (df$OutStandingDays <= df$Quantile.low) {print
("too early")}
else (print ("NA"))))
But I'm doing something wrong.
Thanks.
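I converted df to a data frame and added a column to it (it will be easy to extract that column later if you want), using your last lines of code: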
Invoices<-c(20171100247, 20171100408, 20171200376,20171201052, 21609000088)
Oustanding_days<-c(15,85,96,251,123)
Quantile_low<-c(25,21,22,23,24)
Quantile_Medium<-c(45,65,85,93,74)
Quantile_top<-c(74,89,101,175,125)
Remittances<-c(25,47,5,7,2)
df<- cbind(Invoices,Oustanding_days,Quantile_low,Quantile_Medium,Quantile_top,Remittances)
df <- as.data.frame(df)
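# Create the new column first, then fill it row by row
df$final <- NA_character_
for (i in seq_len(nrow(df))) {
  if (df$Oustanding_days[i] <= df$Quantile_low[i]) {
    df$final[i] <- "too early"
  } else {
    df$final[i] <- "NA"  # placeholder string; the other conditions go here
  }
}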
With that example you should be able to reproduce all the conditions you need.
Good luck!
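For what it's worth, the same single condition can probably be written without a loop, because ifelse() tests a whole column at once. This is only a sketch on a cut-down copy of the data (two columns, using this answer's underscore spelling), and it stores a real missing value rather than the string "NA":

```r
# Cut-down copy of the data, only the two columns the condition needs
df <- data.frame(
  Oustanding_days = c(15, 85, 96, 251, 123),
  Quantile_low    = c(25, 21, 22, 23, 24)
)

# ifelse() evaluates the test for every row in one call
df$final <- ifelse(df$Oustanding_days <= df$Quantile_low,
                   "too early", NA_character_)
df$final
```

Only the first invoice satisfies the condition here, so the remaining rows stay NA until the other conditions are added.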
In this solution I split the data by the two conditions on Remittances, then bind the rows back together.
library(tidyverse)
# filter() and mutate() need a data frame; cbind() produced a matrix
df = as.data.frame(df)
# First condition
df_less5 = df %>% filter(Remittances < 5)
df_less5 = df_less5 %>%
  mutate(payment_accuracy = ifelse(Oustanding.days < 60, "too early",
                            ifelse(Oustanding.days > 60 & Oustanding.days < 90, "early", "late")))
# Second condition
df_more5 = df %>% filter(Remittances > 5)
df_more5 = df_more5 %>%
  mutate(payment_accuracy = ifelse(Oustanding.days < Quantile.low, "too early",
                            ifelse(Oustanding.days > Quantile.low & Oustanding.days < Quantile.Medium, "early",
                            ifelse(Oustanding.days > Quantile.Medium & Oustanding.days < Quantile.top, "on_date",
                            ifelse(Oustanding.days > Quantile.top, "late", "other")))))
# new dataset
df_new = bind_rows(df_less5, df_more5)
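This gives the following output. Note that the invoice with Remittances equal to 5 matches neither filter (< 5 or > 5), so it is missing from the result: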
> df_new
Invoices Oustanding.days Quantile.low Quantile.Medium Quantile.top Remittances payment_accuracy
1 21609000088 123 24 74 125 2 late
2 20171100247 15 25 45 74 25 too early
3 20171100408 85 21 65 89 47 on_date
4 20171201052 251 23 93 175 7 late
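If every invoice should be kept in its original order, one possible variation is to skip the split: compute both labellings for all rows and then choose between them per row. The sketch below uses base R only. It assumes that Remittances equal to 5 belongs to the quantile branch and that a value exactly on a threshold falls into the later class; the question does not specify either, so adjust as needed. The names fixed_rule and quantile_rule are made up for this example.

```r
df <- data.frame(
  Invoices        = c(20171100247, 20171100408, 20171200376, 20171201052, 21609000088),
  Oustanding.days = c(15, 85, 96, 251, 123),
  Quantile.low    = c(25, 21, 22, 23, 24),
  Quantile.Medium = c(45, 65, 85, 93, 74),
  Quantile.top    = c(74, 89, 101, 175, 125),
  Remittances     = c(25, 47, 5, 7, 2)
)

# Labels under the fixed 60/90-day rule, computed for every row
fixed_rule <- with(df,
  ifelse(Oustanding.days < 60, "too early",
  ifelse(Oustanding.days < 90, "early", "late")))

# Labels under the per-invoice quantile rule, computed for every row
quantile_rule <- with(df,
  ifelse(Oustanding.days < Quantile.low,    "too early",
  ifelse(Oustanding.days < Quantile.Medium, "early",
  ifelse(Oustanding.days < Quantile.top,    "on_date", "late"))))

# Choose one of the two per row; nothing is filtered out or reordered
df$payment_accuracy <- ifelse(df$Remittances < 5, fixed_rule, quantile_rule)
df
```

Each nested ifelse() only needs the upper bound of its class, because the earlier tests have already ruled out the smaller values.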
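You can use dplyr and nested ifelse statements for this.
Note that a condition like >Quantile.low & < Quantile.Medium excludes the case where the value equals one of the bounds, so one side should use <= or >=. That is, it should be either >=Quantile.low & < Quantile.Medium or >Quantile.low & <= Quantile.Medium. In the example below I assume the latter.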
df <- as.data.frame(df)
library(dplyr)
df %>%
  mutate(x = ifelse(Remittances < 5,
                    ifelse(Oustanding.days <= 60, 'too early',
                    ifelse(Oustanding.days > 60 & Oustanding.days <= 90, 'early', 'late')),
                    NA)) %>%
  mutate(x = ifelse(Remittances >= 5,
                    ifelse(Oustanding.days <= Quantile.low, 'too early',
                    ifelse(Oustanding.days > Quantile.low & Oustanding.days <= Quantile.Medium, 'early',
                    ifelse(Oustanding.days > Quantile.Medium & Oustanding.days <= Quantile.top, 'On date', 'late'))),
                    x))
Which returns:
Invoices Oustanding.days Quantile.low Quantile.Medium Quantile.top Remittances x
1 20171100247 15 25 45 74 25 too early
2 20171100408 85 21 65 89 47 On date
3 20171200376 96 22 85 101 5 On date
4 20171201052 251 23 93 175 7 late
5 21609000088 123 24 74 125 2 late
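To see what the choice of boundary means in practice, here is a small sketch for the fixed 60/90-day rule using base R's cut(), whose right argument decides which side of each interval is closed. The test values in days are invented for illustration and are not taken from the question.

```r
# Made-up values sitting on and around the two thresholds
days   <- c(59, 60, 61, 90, 91)
labels <- c("too early", "early", "late")
breaks <- c(-Inf, 60, 90, Inf)

# right = TRUE (the default): (-Inf, 60], (60, 90], (90, Inf)
upper_closed <- as.character(cut(days, breaks = breaks, labels = labels))

# right = FALSE: [-Inf, 60), [60, 90), [90, Inf)
lower_closed <- as.character(cut(days, breaks = breaks, labels = labels,
                                 right = FALSE))

data.frame(days, upper_closed, lower_closed)
```

Only the rows for 60 and 90 differ between the two columns, which is exactly the case that strict < and > on both sides would leave unlabelled.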
Hope this helps!
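We can use case_when from the dplyr package to assign values based on multiple conditions. Nested ifelse statements or for loops can sometimes be overly complex and hard to read.
The last line, TRUE ~ NA_character_, assigns NA to any row that meets none of the conditions above.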
library(dplyr)
df2 <- df %>%
mutate(`Payment accuracy` = case_when(
Remittances < 5 & Outstanding.days < 60 ~ "too early",
Remittances < 5 & Outstanding.days >= 60 & Outstanding.days < 90 ~ "early",
Remittances < 5 & Outstanding.days >= 90 ~ "late",
Remittances >= 5 & Outstanding.days < Quantile.low ~ "too early",
Remittances >= 5 & Outstanding.days >= Quantile.low &
Outstanding.days < Quantile.Medium ~ "early",
Remittances >= 5 & Outstanding.days >= Quantile.Medium &
Outstanding.days < Quantile.top ~ "On date",
Remittances >= 5 & Outstanding.days >= Quantile.top ~ "late",
TRUE ~ NA_character_
))
df2
# Invoices Outstanding.days Quantile.low Quantile.Medium Quantile.top Remittances Payment accuracy
# 1 20171100247 15 25 45 74 25 too early
# 2 20171100408 85 21 65 89 47 On date
# 3 20171200376 96 22 85 101 5 On date
# 4 20171201052 251 23 93 175 7 late
# 5 21609000088 123 24 74 125 2 late
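As a side note, case_when() checks its conditions in order and uses the first one that is TRUE, so the lower bounds can be left out if you prefer shorter conditions. A minimal sketch for the fixed-threshold branch only, on a made-up vector that includes a missing value:

```r
library(dplyr)

# Made-up values; the NA shows what the fallback line is for
days <- c(15, 85, 96, NA)

label <- case_when(
  days < 60  ~ "too early",
  days < 90  ~ "early",       # only reached when days >= 60
  days >= 90 ~ "late",
  TRUE       ~ NA_character_  # reached when days is NA
)
label
```

The explicit bounds in the answer above are more verbose but do not depend on the order of the lines, which may be easier to maintain.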
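Data
Note that your original code has typos in names such as Outstanding.days (spelled Oustanding.days in the data) and Remittances. Also, cbind does not create a data frame here; it returns a matrix. The function you need is data.frame. stringsAsFactors = FALSE ensures that text columns are stored as character rather than factor.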
Invoices<-c(20171100247, 20171100408, 20171200376,20171201052, 21609000088)
Outstanding.days<-c(15,85,96,251,123)
Quantile.low<-c(25,21,22,23,24)
Quantile.Medium<-c(45,65,85,93,74)
Quantile.top<-c(74,89,101,175,125)
Remittances<-c(25,47,5,7,2)
df <- data.frame(Invoices, Outstanding.days, Quantile.low,
Quantile.Medium, Quantile.top, Remittances,
stringsAsFactors = FALSE)
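A quick way to check the difference between the two constructors, sketched on two of the columns only. The names m and d are just for this example.

```r
Invoices    <- c(20171100247, 20171100408)
Remittances <- c(25, 47)

m <- cbind(Invoices, Remittances)       # numeric matrix
d <- data.frame(Invoices, Remittances)  # data frame

is.data.frame(m)  # FALSE
is.data.frame(d)  # TRUE

# `$` is not defined for a matrix, so column access by name fails
msg <- tryCatch(m$Invoices, error = function(e) conditionMessage(e))
msg

# On the data frame the same access works
d$Invoices
```

This is why expressions such as df$Quantile.low in the question fail before any of the conditions are evaluated.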