Timespan every year before birthday

Data:

 DB <- data.frame(orderID  = c(1,2,3,4,5,6,7,8,9,10),     
   orderDate = c("1.1.14","16.3.14","11.5.14","21.6.14","29.7.14", 
        "2.8.14","21.9.14","4.10.14","30.11.14","2.1.15"),  
   itemID = c(2,3,2,5,12,4,2,3,1,5),  
   price = c(29.90, 39.90, 29.90, 19.90, 49.90, 9.90, 29.90, 39.90, 
              14.90, 19.90),
   customerID = c(1, 2, 3, 1, 1, 3, 2, 2, 1, 1),
   dateofbirth = c("12.1.67","14.10.82","6.8.87","12.1.67","12.1.67",
           "6.8.87","14.10.82","14.10.82","12.1.67","12.1.67"))

Expected outcome:

orderedinatimespan2weeksbeforebirthday = c("Yes", "No", "No", "No",
     "No", "Yes", "No"  , "Yes", "No", "Yes")

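Hey guys, hope you had a good start into the new year ;) Unfortunately, the new year brought me some new problems that I can't solve on my own, so I'd be very glad if you could help me once again :) In the data set, every order has its own ID and every registered user has his/her unique customer ID. Every customer can order items (with an itemID) that have a specific price. The user entered his/her date of birth in the database (as you can see above :D). I want to mark orders placed in the 2 weeks before his/her birthday, or on the birthday itself, with "Yes", and orders placed during the rest of the year with "No". Furthermore, the "formula" should not only work for this year but also for orders in the following years (2016, etc.). I'd also like to add the result as a new column (orderedinatimespan2weeksbeforebirthday) to the existing data set...

I already tried it like this, but the span function doesn't work when I use only the day and month without the year...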

DB$dateOfBirth <- as.Date(DB$dateOfBirth) 
DB$Birthday1 <- format(as.Date(DB$dateofbirth), "%m-%d")
DB$Birthday2 <- DB$dateOfBirth-ddays(14)
DB$Birthday3 <- format(as.Date(DB$Birthday2), "%m-%d")
DB$Birthday3 <- format(as.Date(DB$Birthday3), "%y-%m-%d")
DB$spanBirthday <- new_interval (ymd(DB$Birthday2), ymd(mydata$Birthday1))

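Hopefully you can tell me what went wrong, or show me another way to solve the problem....

Cheers and thanks!

One approach is to change the "year" part of "dateofbirth" to that of "orderDate" and then check whether DOB1 falls within "2 weeks" of "orderDate". Use sub to remove the "day/month" from "orderDate", strsplit the "dateofbirth" column, and replace the 3rd element (the "year") with the "year" from "orderDate". This can be done with mapply. Convert to "Date" class and do the logical comparison, which returns TRUE/FALSE. If you need to convert that to "Yes/No", add 1 to the result to get the numeric index 1/2 and use it to pick from "No"/"Yes".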

toChange <- sub('.*\\.', '', DB$orderDate)
DOB <- mapply(function(x,y) {x[3]<-y; paste(x,collapse=".")}, 
           strsplit(as.character(DB$dateofbirth),'[.]'), toChange)
DOB1 <- as.Date(DOB, '%d.%m.%y')
orderDate <- as.Date(DB$orderDate, '%d.%m.%y')
c('No', 'Yes')[(orderDate <= DOB1 & DOB1 <= orderDate+14)+1]
#[1] "Yes" "No"  "No"  "No"  "No"  "Yes" "No"  "Yes" "No"  "Yes"
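
To make the individual steps easier to follow, here is a rough sketch of the same idea for a single row (order 6 from the data above). The names `yr`, `parts`, `birthday` and `flag` are only illustrative and do not appear in the answer's code.

```r
orderDate   <- "2.8.14"
dateofbirth <- "6.8.87"

# Keep only what follows the last dot of the order date: the two-digit year
yr <- sub('.*\\.', '', orderDate)              # "14"

# Split the date of birth into day, month, year and swap in the order's year
parts <- strsplit(dateofbirth, '[.]')[[1]]     # "6" "8" "87"
parts[3] <- yr
birthday <- as.Date(paste(parts, collapse = '.'), '%d.%m.%y')   # 2014-08-06

# FALSE + 1 is 1 and TRUE + 1 is 2, so a logical vector can index c('No', 'Yes')
flag <- c('No', 'Yes')[c(FALSE, TRUE) + 1]     # "No" "Yes"
```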

If you need to apply this to different "orderDate" columns, it is easier to wrap it in a function

ordertimeSpan <- function(data, orderCol, DOBCol){
 toChange <- sub('.*\\.', '', data[,orderCol])
 DOB <- mapply(function(x,y){x[3] <- y; paste(x,collapse='.')}, 
     strsplit(as.character(data[,DOBCol]),'[.]'), toChange)
 DOB1 <- as.Date(DOB, '%d.%m.%y')
 orderDate <- as.Date(data[,orderCol], '%d.%m.%y')
 c('No', 'Yes')[(orderDate <= DOB1 & DOB1 <= orderDate+14)+1]
 }

 ordertimeSpan(DB, 'orderDate', 'dateofbirth')
 #[1] "Yes" "No"  "No"  "No"  "No"  "Yes" "No"  "Yes" "No"  "Yes"
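
One case the year swap does not cover is an order placed in late December for a birthday in early January: the birthday gets the order's year and so lands almost a year before the order. A possible sketch for that case, assuming the same `%d.%m.%y` strings as above, checks the birthday in the order's year and in the following year. `inBirthdayWindow`, its `days` argument and the small `orders` data frame are hypothetical names made up for this illustration.

```r
# Hypothetical helper: "Yes" if the order falls in the `days` days up to and
# including the customer's birthday, also across the turn of the year.
inBirthdayWindow <- function(orderDate, dateofbirth, days = 14) {
  ord <- as.Date(as.character(orderDate), '%d.%m.%y')
  dm  <- sub('\\.[^.]*$', '', as.character(dateofbirth))   # keep "day.month"
  yr  <- as.integer(format(ord, '%Y'))

  # Birthday in the order's year and in the year after
  thisYear <- as.Date(paste(dm, yr,     sep = '.'), '%d.%m.%Y')
  nextYear <- as.Date(paste(dm, yr + 1, sep = '.'), '%d.%m.%Y')

  # A 29 February birthday is NA in non-leap years and is treated as "no match"
  inWindow <- function(bday) !is.na(bday) & ord <= bday & bday <= ord + days
  c('No', 'Yes')[(inWindow(thisYear) | inWindow(nextYear)) + 1]
}

# Made-up rows; the third one crosses New Year
orders <- data.frame(
  orderDate   = c("1.1.14", "16.3.14", "28.12.14"),
  dateofbirth = c("12.1.67", "14.10.82", "5.1.90")
)
orders$orderedinatimespan2weeksbeforebirthday <-
  inBirthdayWindow(orders$orderDate, orders$dateofbirth)
```

Assigning the result to a column, as in the last statement, is also how the output could be added to the existing data set.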

Update

If the "dates" are already in %Y-%m-%d format, i.e.

 DB$orderDate <- as.Date(DB$orderDate, '%d.%m.%y')
 DB$dateofbirth <- as.Date(DB$dateofbirth, '%d.%m.%y')
 # In the present dataset, converting to 'Date' class with '%y' puts some "dateofbirth" values in the wrong century (year 2067, etc.). That could be corrected, but it is not the main focus here.

toChange <- format(DB$orderDate, '%Y')
# paste inside mapply and convert afterwards: mapply would drop the Date class
DOB <- as.Date(mapply(function(x,y) {x[1]<-y; paste(x,collapse="-")}, 
          strsplit(as.character(DB$dateofbirth),'[-]'), toChange))
 orderDate <- DB$orderDate
 c('No', 'Yes')[(orderDate <= DOB & DOB <= orderDate+14)+1]
#[1] "Yes" "No"  "No"  "No"  "No"  "Yes" "No"  "Yes" "No"  "Yes"
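
The century issue mentioned in the comment above comes from `%y`, which R maps to 20xx for the values 00 to 68. It does not affect the "Yes"/"No" result, because the year is replaced anyway, but if the real birth dates matter, one possible correction is to move any birth date that lies in the future back by 100 years. This is only a sketch: `parseBirthDate`, `fmt` and `refDate` are hypothetical names, and it assumes no customer was born more than 100 years before `refDate`.

```r
# Hypothetical helper: parse two-digit-year birth dates and shift any date that
# lies after `refDate` back by one century.
parseBirthDate <- function(x, fmt = '%d.%m.%y', refDate = Sys.Date()) {
  d <- as.Date(as.character(x), fmt)
  future <- !is.na(d) & d > refDate
  # Rebuild the date from (year - 100) and the unchanged month and day
  fixedYear <- as.integer(format(d[future], '%Y')) - 100
  d[future] <- as.Date(paste(fixedYear, format(d[future], '%m-%d'), sep = '-'))
  d
}

birthDates <- parseBirthDate(c("12.1.67", "14.10.82", "6.8.87"),
                             refDate = as.Date("2015-01-02"))
```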