How can I transform a specific time, to a
I want to convert a string column into a proper time format.
Normally I would do this:
print(df$Time)
> "00:00:01"
as.POSIXct(df$Time,format="%H:%M:%S")
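However, my data is odd. It looks like this:
print(df$Time)
"850a" "823a" NA "906a" "321a" "1154p"
My usual solution does not work here, because it first strips the letters (in this case "a" and "p"). After that, the time is missing an important piece of information: whether it is morning or afternoon.
Hence my question: how can I convert this data into the correct format?
Expected output: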
df$Time_Old
"850a" "823a" NA "906a" "321a" "1154p"
df$Time_New
08.50 08.23 NA 09.06 03.21 23.54
Some sample data:
vector_string <- as.vector(tv_Adds[["Time"]])
vector_string = vector_string[1:20]
> vector_string
[1] "850a" "823a" NA "906a" "321a" "1154p" "608p" "1012a" "354a" "1121p" "414p" "1241p" "721p" "223p" "316p"
[16] "345p" "1145a" "3p" "937a" "138p"
> dput(vector_string[1:20])
c("850a", "823a", NA, "906a", "321a", "1154p", "608p", "1012a",
"354a", "1121p", "414p", "1241p", "721p", "223p", "316p", "345p",
"1145a", "3p", "937a", "138p")
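You have to separate the hours from the minutes, because your input is ambiguous otherwise. Then append "m" to the end of every non-NA entry. I think you need this: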
tvec = c("850a", "823a", NA, "906a", "321a", "1154p")
notNA <- !is.na(tvec)
#separate hours from minutes with a dot and append m at the end:
tvec[notNA] <- paste0(strtrim(tvec[notNA], nchar(tvec[notNA]) - 3), ".",
substr(tvec[notNA], nchar(tvec[notNA])-2, nchar(tvec[notNA]))
, "m")
as.POSIXct(tvec, format = "%I.%M%p")
[1] "2019-10-25 08:50:00 CEST" "2019-10-25 08:23:00 CEST"
[3] NA "2019-10-25 09:06:00 CEST"
[5] "2019-10-25 03:21:00 CEST" "2019-10-25 23:54:00 CEST"
"%I.%M%p"
stands for: hour on the 12-hour clock (01-12, a leading zero is optional on input), followed by ".", followed by minutes (00-59), followed by "am" (or "pm").
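The question asked for values like 08.50 rather than full date-times. A minimal sketch of that last step, assuming an English or C locale (where %p matches "am"/"pm") and using tz = "UTC" purely as an illustrative choice: once the strings are parsed, format() with "%H.%M" gives the 24-hour time of day. The names parsed and time_new are made up for this example.

```r
# Strings already in "h.mmam" form, as produced by the paste0() step above
tvec <- c("8.50am", "8.23am", NA, "9.06am", "3.21am", "11.54pm")

# tz = "UTC" is an assumption for illustration; any fixed time zone works,
# because only the time of day is kept afterwards
parsed <- as.POSIXct(tvec, format = "%I.%M%p", tz = "UTC")

# Drop the (arbitrary) date and keep the 24-hour time in "HH.MM" form
time_new <- format(parsed, "%H.%M")
time_new
```

Note that as.POSIXct() fills in the current date, which is why the printed results above carry a date that has nothing to do with the data.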
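Based on the sample you shared, it looks like there are 3 different cases to handle:
- When you have 834a, it needs to become 8:34am
- When you have 1143p, it needs to become 11:43pm
- When you have 3a, it needs to become 3:00am
Once those are handled, here with a simple nested ifelse statement that counts the characters and modifies the string accordingly, we can convert to a datetime object by calling strptime with the matching format, i.e.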
v1[!is.na(v1)] <- paste0(v1[!is.na(v1)], 'm')
# insert ":" after the hour; hour-only entries such as "3pm" also get "00" minutes
v2 <- ifelse(nchar(v1) == 5, gsub('(^[0-9]{1})(.*$)', '\\1:\\2', v1),
      ifelse(nchar(v1) == 3, gsub('(^[0-9]{1})(.*$)', '\\1:00\\2', v1),
             gsub('(^[0-9]{2})(.*$)', '\\1:\\2', v1)))
v2
#[1] "8:50am" "8:23am" NA "9:06am" "3:21am" "11:54pm" "6:08pm" "10:12am" "3:54am" "11:21pm" "4:14pm" "12:41pm" "7:21pm" "2:23pm" "3:16pm" "3:45pm" "11:45am" "3:00pm" "9:37am" "1:38pm"
strptime(v2, format = '%I:%M%p')
#[1] "2019-10-29 08:50:00 +03" "2019-10-29 08:23:00 +03" NA "2019-10-29 09:06:00 +03" "2019-10-29 03:21:00 +03" "2019-10-29 23:54:00 +03" "2019-10-29 18:08:00 +03" "2019-10-29 10:12:00 +03" "2019-10-29 03:54:00 +03" "2019-10-29 23:21:00 +03"
#[11] "2019-10-29 16:14:00 +03" "2019-10-29 12:41:00 +03" "2019-10-29 19:21:00 +03" "2019-10-29 14:23:00 +03" "2019-10-29 15:16:00 +03" "2019-10-29 15:45:00 +03" "2019-10-29 11:45:00 +03" "2019-10-29 15:00:00 +03" "2019-10-29 09:37:00 +03" "2019-10-29 13:38:00 +03"
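Both approaches above rely on %p, whose AM/PM strings depend on the locale. As an alternative sketch (not taken from either answer), the conversion can be done with plain integer arithmetic, which also yields the "HH.MM" text the question asked for. The helper name to_24h is hypothetical, and the sketch assumes every valid entry is 1 to 4 digits followed by "a" or "p"; anything else becomes NA.

```r
# Hypothetical helper (not from the answers above): converts "850a"-style
# strings to "HH.MM" on the 24-hour clock without using strptime().
to_24h <- function(x) {
  out <- rep(NA_character_, length(x))
  # Accept 1-4 digits followed by "a" or "p"; everything else stays NA
  ok <- !is.na(x) & grepl("^[0-9]{1,4}[ap]$", x)

  digits <- sub("[ap]$", "", x[ok])
  is_pm  <- grepl("p$", x[ok])

  # One or two digits means hour only ("3p"); otherwise the last two are minutes
  hour_only <- nchar(digits) <= 2
  value  <- as.integer(digits)
  hour   <- ifelse(hour_only, value, value %/% 100L)
  minute <- ifelse(hour_only, 0L, value %% 100L)

  # 12-hour to 24-hour clock: 12am -> 0, 12pm -> 12, other pm hours get +12
  hour24 <- hour %% 12L + ifelse(is_pm, 12L, 0L)

  out[ok] <- sprintf("%02d.%02d", hour24, minute)
  out
}

to_24h(c("850a", "823a", NA, "906a", "321a", "1154p", "1241p", "3p"))
# [1] "08.50" "08.23" NA      "09.06" "03.21" "23.54" "12.41" "15.00"
```

The sketch does not check that hours are at most 12 or minutes at most 59, so out-of-range input such as "1399p" would need an extra validation step.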
Data used (v1 as it looks after the "m" suffix has been appended):
dput(v1)
c("850am", "823am", NA, "906am", "321am", "1154pm", "608pm",
"1012am", "354am", "1121pm", "414pm", "1241pm", "721pm", "223pm",
"316pm", "345pm", "1145am", "3pm", "937am", "138pm")