Change all date formats in a data frame
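I'm new to R programming and a bit stuck:
I have a data frame, and I want to check whether any values in its rows/columns are in date format and, if so, strip them down to just the time part.
For example, the date string "2015-01-02 10:15:44"
should be changed to "10:15:44"
I know this is a very newbie approach, but this is how I tried to take a substring of all the values.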
id<-c(1,2,3,4)
time1<-c("2015-01-02 10:15:44","NA","2015-11-12 00:15:44","2015-01-02 12:15:14")
time2<-c("NA", "2015-01-02 10:15:44","NA","2015-11-12 00:15:44")
..
..
timen ....
df<-data.frame(id,time1, time2,..., timen)
print(df)
df[1:4 ,2: ncol(df)] <- substring(df[1:4 ,2: ncol(df)], 12)
print(df)
Can anyone suggest a way forward?
Try this, where df1 contains your data. You can recombine the result with the original data after this operation.
target<-unlist(sapply(stringr::str_extract_all(names(df1),"^t.*"),"["))
Changed<-as.data.frame(sapply(target,function(x){ind=which(names(df1)==x)
unlist(sapply(stringr::str_split(df1[,ind]," "),"[",2))}))
cbind(id=df1$id,Changed)
Output:
id time1 time2
1 10:15:44 <NA>
2 <NA> 10:15:44
3 00:15:44 <NA>
4 12:15:14 00:15:44
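The same idea can also be sketched in base R, without stringr. This is only a minimal sketch: it assumes that every date-time column has a name starting with "time" and that the date and time are separated by a single space. The helper name strip_date is made up for this example.

```r
# Sample data copied from the question; note that "NA" is a literal string here.
df1 <- data.frame(
  id    = c(1, 2, 3, 4),
  time1 = c("2015-01-02 10:15:44", "NA", "2015-11-12 00:15:44", "2015-01-02 12:15:14"),
  time2 = c("NA", "2015-01-02 10:15:44", "NA", "2015-11-12 00:15:44"),
  stringsAsFactors = FALSE
)

# Assumption: the date-time columns are the ones whose names start with "time".
time_cols <- grep("^time", names(df1), value = TRUE)

# Hypothetical helper: drop everything up to and including the first space.
# Values that contain no space (such as the string "NA") become a real NA.
strip_date <- function(x) {
  x <- as.character(x)
  out <- sub("^[^ ]* ", "", x)
  out[!grepl(" ", x, fixed = TRUE)] <- NA_character_
  out
}

# Assigning a list back into df1[time_cols] keeps df1 a data frame,
# so the id column does not need to be re-attached afterwards.
df1[time_cols] <- lapply(df1[time_cols], strip_date)
df1
```

Because the columns are replaced in place, there is no separate cbind() step with this variant.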
Have you tried the lubridate package?
time_cols <- c("time1", "time2")
df[time_cols] <- apply(df[time_cols], 2,
function(col){
format(lubridate::ymd_hms(col), "%H:%M:%S")
})
df
# id time1 time2
# 1 1 10:15:44 <NA>
# 2 2 <NA> 10:15:44
# 3 3 00:15:44 <NA>
# 4 4 12:15:14 00:15:44
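If you would rather not add a dependency, roughly the same thing can be done with base R's as.POSIXct() and format(). This is a sketch under a few assumptions: all values use the "%Y-%m-%d %H:%M:%S" layout, and the time zone is fixed to "UTC" on both parsing and formatting so the clock time is not shifted. It uses lapply() instead of apply(), which avoids converting the selected columns to a matrix first.

```r
# Sample data copied from the question; "NA" is a literal string here.
df <- data.frame(
  id    = c(1, 2, 3, 4),
  time1 = c("2015-01-02 10:15:44", "NA", "2015-11-12 00:15:44", "2015-01-02 12:15:14"),
  time2 = c("NA", "2015-01-02 10:15:44", "NA", "2015-11-12 00:15:44"),
  stringsAsFactors = FALSE
)

time_cols <- c("time1", "time2")

df[time_cols] <- lapply(df[time_cols], function(col) {
  # Strings that do not match the format (including "NA") parse to NA.
  parsed <- as.POSIXct(as.character(col),
                       format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  # Format with the same time zone so only the time of day is kept.
  format(parsed, "%H:%M:%S", tz = "UTC")
})
df
```

Unlike a plain substring, this validates each value as a date-time, so malformed strings end up as NA rather than as truncated text.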
Loop over the columns and take the substring:
df[, 2:3] <- lapply(df[, 2:3], substring, first = 12)
df
#   id    time1    time2
# 1  1 10:15:44
# 2  2          10:15:44
# 3  3 00:15:44
# 4  4 12:15:14 00:15:44
# input data
df <- data.frame(id = c(1,2,3,4),
time1 = c("2015-01-02 10:15:44","NA","2015-11-12 00:15:44","2015-01-02 12:15:14"),
time2 = c("NA", "2015-01-02 10:15:44","NA","2015-11-12 00:15:44"))
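Two things are worth noting about the substring approach: the "NA" strings come out as empty strings rather than missing values, and the columns are picked by position (2:3). A possible refinement, sketched below, detects the date-time columns by their content and turns "NA" into a real NA first. The helper is_datetime_col and the regular expression are assumptions for this example, not part of the original answer.

```r
# Sample data copied from the question; "NA" is a literal string here.
df <- data.frame(
  id    = c(1, 2, 3, 4),
  time1 = c("2015-01-02 10:15:44", "NA", "2015-11-12 00:15:44", "2015-01-02 12:15:14"),
  time2 = c("NA", "2015-01-02 10:15:44", "NA", "2015-11-12 00:15:44"),
  stringsAsFactors = FALSE
)

# Assumed layout of a date-time value: "YYYY-MM-DD HH:MM:SS".
pattern <- "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$"

# Hypothetical helper: TRUE when a column has at least one non-missing value
# and every non-missing value matches the pattern.
is_datetime_col <- function(x) {
  x <- as.character(x)
  x <- x[!is.na(x) & x != "NA"]
  length(x) > 0 && all(grepl(pattern, x))
}

date_cols <- names(df)[vapply(df, is_datetime_col, logical(1))]

df[date_cols] <- lapply(df[date_cols], function(x) {
  x <- as.character(x)
  x[x == "NA"] <- NA_character_  # turn the "NA" string into a real missing value
  substring(x, 12)               # characters 12 onward are the time part
})
df
```

Here date_cols ends up as c("time1", "time2"), so the id column is left untouched without hard-coding column positions.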