如何以正确的格式获取我的时间字符串?
How can I get my time strings in a right format?
我无法将字符串转换为时间。我知道 Stack 上有很多关于将字符串转换为时间的主题,但是我无法通过解决方案解决这个问题。
情况
我有一个这样的文件:
> dput(df$Time[1:50])
c("1744.3", "2327.54", "1718.51", "2312.3200000000002", "1414.16",
"2046.15", "1442.5", "1912.22", "2303.2199999999998", "2146.3200000000002",
"1459.02", "1930.15", "1856.23", "2319.15", "1451.05", "25.460000000000036",
"1453.25", "2309.02", "2342.48", "2322.5300000000002", "2101.5",
"2026.07", "1245.04", "1945.15", "5.4099999999998545", "1039.5",
"1731.37", "2058.41", "2030.36", "1814.31", "1338.18", "1858.33",
"1731.36", "2343.38", "1733.27", "2304.59", "1309.47", "1916.11",
"1958.3", "1929.54", "1756.4", "1744.23", "1731.26", "1844.47",
"1353.25", "1958.3", "1746.44", "1857.53", "2047.15", "2327.2199999999998", "1915"
)
在这个例子中,时间应该是这样的:
"1744.3" = 17:44:30
"2327.54" = 23:27:54
"1718.51" = 17:18:51
"2312.3200000000002" = 23:12:32
...
"25.460000000000036" = 00:25:46 # as you can see, the first two 00 are missing.
“1915”=19:15:00
但是,我尝试了多种方法(现在我什至被 str_replace() 困住了)。希望有人知道我如何改变它。
我尝试了什么?
format(df$Time, "%H%M.%S") # Yes I know...
# So therefore I thought, lets replace the strings to get them in a proper format
# like HH:MM:SS. First step was to replace the "." for a ":"
str_replace("." , ":", df$Time) # this was leading to "." (don't know why)
这就是我非常沮丧的原因,所以我将其发布在 Stack 上。希望大家能帮帮我。
非常感谢!
这是一种方法,将 dput
的输出存储在 x
中。
library(magrittr)
#Remove all the dots
gsub('\.', '', x) %>%
#Select only first 6 characters
substr(1, 6) %>%
#Pad 0's at the end
stringr::str_pad(6,pad = '0', side = 'right') %>%
#Add colon (:) separator
sub('(.{2})(.{2})', '\1:\2:', .)
# [1] "17:44:30" "23:27:54" "17:18:51" "23:12:32" "14:14:16" "20:46:15"
# [7] "14:42:50" "19:12:22" "23:03:21" "21:46:32" "14:59:02" "19:30:15"
#[13] "18:56:23" "23:19:15" "14:51:05" "25:46:00" "14:53:25" "23:09:02"
#...
请注意,这也可以在没有管道的情况下完成,但为了清楚起见,请使用它。如果需要,您可以从这里将时间转换为 POSIXct
格式。
一种迂回的方式
tmp=as.numeric(lapply(strsplit(as.character(df$Time),"\."),function(x){nchar(x[1])}))
ifelse(tmp>2,
substr(as.POSIXct(df$Time,format="%H%M.%S"),12,19),
substr(as.POSIXct(df$Time,format="%M.%S"),12,19))
主要问题是时间"25.460000000000036"
。但我想我找到了一个清晰但有点冗长的解决方案:
library(tidyverse)
df %>%
mutate(hours = formatC(as.numeric(Time), width = 4, format = "d", flag = "0"),
seconds = as.numeric(str_extract(Time, "[.].+")) * 100) %>%
mutate(Time_new = stringi::stri_datetime_parse(paste0(hours, seconds), format = "HHmm.ss"))
#> # A tibble: 51 x 4
#> Time hours seconds Time_new
#> <chr> <chr> <dbl> <dttm>
#> 1 25.460000000000036 0025 46. 2020-02-19 00:25:46 # I changed the order of the times so the weird format is on top
#> 2 1744.3 1744 30 2020-02-19 17:44:30
#> 3 2327.54 2327 54 2020-02-19 23:27:54
#> 4 1718.51 1718 51 2020-02-19 17:18:51
#> 5 2312.3200000000002 2312 32. 2020-02-19 23:12:32
#> 6 1414.16 1414 16 2020-02-19 14:14:16
#> 7 2046.15 2046 15 2020-02-19 20:46:15
#> 8 1442.5 1442 50 2020-02-19 14:42:50
#> 9 1912.22 1912 22 2020-02-19 19:12:22
#> 10 2303.2199999999998 2303 22.0 2020-02-19 23:03:21
#> # ... with 41 more rows
如果你也有没有分数的时间(即没有点),你可以使用这种方法:
normalize_time <- function(t) {
formatC(as.numeric(t) * 100, width = 6, format = "d", flag = "0")
}
df %>%
mutate(Time_new = as.POSIXct(normalize_time(Time), format = "%H%M%S"))
一条data.table路
首先,将向量中的字符串转换为数字,乘以 100(以获取小数点分隔符前 HMS 的相关部分)并设置为整数。然后使用 sprintf()
添加前导零以获得 6 位字符串。最后,转换为时间。
data.table::as.ITime( sprintf( "%06d",
as.integer( as.numeric(time) * 100 ) ),
format = "%H%M%S" )
# [1] "17:44:30" "23:27:54" "17:18:51" "23:12:32" "14:14:16" "20:46:15" "14:42:50" "19:12:22" "23:03:21" "21:46:32" "14:59:02" "19:30:15"
# [13] "18:56:23" "23:19:15" "14:51:05" "00:25:46" "14:53:25" "23:09:02" "23:42:48" "23:22:53" "21:01:50" "20:26:07" "12:45:04" "19:45:15"
# [25] "00:05:40" "10:39:50" "17:31:37" "20:58:41" "20:30:36" "18:14:31" "13:38:18" "18:58:33" "17:31:36" "23:43:38" "17:33:27" "23:04:59"
# [37] "13:09:47" "19:16:11" "19:58:30" "19:29:54" "17:56:40" "17:44:23" "17:31:26" "18:44:47" "13:53:25" "19:58:30" "17:46:44" "18:57:53"
# [49] "20:47:15" "23:27:21"
我无法将字符串转换为时间。我知道 Stack 上有很多关于将字符串转换为时间的主题,但是我无法通过解决方案解决这个问题。
情况 我有一个这样的文件:
> dput(df$Time[1:50])
c("1744.3", "2327.54", "1718.51", "2312.3200000000002", "1414.16",
"2046.15", "1442.5", "1912.22", "2303.2199999999998", "2146.3200000000002",
"1459.02", "1930.15", "1856.23", "2319.15", "1451.05", "25.460000000000036",
"1453.25", "2309.02", "2342.48", "2322.5300000000002", "2101.5",
"2026.07", "1245.04", "1945.15", "5.4099999999998545", "1039.5",
"1731.37", "2058.41", "2030.36", "1814.31", "1338.18", "1858.33",
"1731.36", "2343.38", "1733.27", "2304.59", "1309.47", "1916.11",
"1958.3", "1929.54", "1756.4", "1744.23", "1731.26", "1844.47",
"1353.25", "1958.3", "1746.44", "1857.53", "2047.15", "2327.2199999999998", "1915"
)
在这个例子中,时间应该是这样的:
"1744.3" = 17:44:30
"2327.54" = 23:27:54
"1718.51" = 17:18:51
"2312.3200000000002" = 23:12:32
...
"25.460000000000036" = 00:25:46 # as you can see, the first two 00 are missing.
“1915”=19:15:00
但是,我尝试了多种方法(现在我什至被 str_replace() 困住了)。希望有人知道我如何改变它。
我尝试了什么?
format(df$Time, "%H%M.%S") # Yes I know...
# So therefore I thought, lets replace the strings to get them in a proper format
# like HH:MM:SS. First step was to replace the "." for a ":"
str_replace("." , ":", df$Time) # this was leading to "." (don't know why)
这就是我非常沮丧的原因,所以我将其发布在 Stack 上。希望大家能帮帮我。
非常感谢!
这是一种方法,将 dput
的输出存储在 x
中。
library(magrittr)
#Remove all the dots
gsub('\.', '', x) %>%
#Select only first 6 characters
substr(1, 6) %>%
#Pad 0's at the end
stringr::str_pad(6,pad = '0', side = 'right') %>%
#Add colon (:) separator
sub('(.{2})(.{2})', '\1:\2:', .)
# [1] "17:44:30" "23:27:54" "17:18:51" "23:12:32" "14:14:16" "20:46:15"
# [7] "14:42:50" "19:12:22" "23:03:21" "21:46:32" "14:59:02" "19:30:15"
#[13] "18:56:23" "23:19:15" "14:51:05" "25:46:00" "14:53:25" "23:09:02"
#...
请注意,这也可以在没有管道的情况下完成,但为了清楚起见,请使用它。如果需要,您可以从这里将时间转换为 POSIXct
格式。
一种迂回的方式
tmp=as.numeric(lapply(strsplit(as.character(df$Time),"\."),function(x){nchar(x[1])}))
ifelse(tmp>2,
substr(as.POSIXct(df$Time,format="%H%M.%S"),12,19),
substr(as.POSIXct(df$Time,format="%M.%S"),12,19))
主要问题是时间"25.460000000000036"
。但我想我找到了一个清晰但有点冗长的解决方案:
library(tidyverse)
df %>%
mutate(hours = formatC(as.numeric(Time), width = 4, format = "d", flag = "0"),
seconds = as.numeric(str_extract(Time, "[.].+")) * 100) %>%
mutate(Time_new = stringi::stri_datetime_parse(paste0(hours, seconds), format = "HHmm.ss"))
#> # A tibble: 51 x 4
#> Time hours seconds Time_new
#> <chr> <chr> <dbl> <dttm>
#> 1 25.460000000000036 0025 46. 2020-02-19 00:25:46 # I changed the order of the times so the weird format is on top
#> 2 1744.3 1744 30 2020-02-19 17:44:30
#> 3 2327.54 2327 54 2020-02-19 23:27:54
#> 4 1718.51 1718 51 2020-02-19 17:18:51
#> 5 2312.3200000000002 2312 32. 2020-02-19 23:12:32
#> 6 1414.16 1414 16 2020-02-19 14:14:16
#> 7 2046.15 2046 15 2020-02-19 20:46:15
#> 8 1442.5 1442 50 2020-02-19 14:42:50
#> 9 1912.22 1912 22 2020-02-19 19:12:22
#> 10 2303.2199999999998 2303 22.0 2020-02-19 23:03:21
#> # ... with 41 more rows
如果你也有没有分数的时间(即没有点),你可以使用这种方法:
normalize_time <- function(t) {
formatC(as.numeric(t) * 100, width = 6, format = "d", flag = "0")
}
df %>%
mutate(Time_new = as.POSIXct(normalize_time(Time), format = "%H%M%S"))
一条data.table路
首先,将向量中的字符串转换为数字,乘以 100(以获取小数点分隔符前 HMS 的相关部分)并设置为整数。然后使用 sprintf()
添加前导零以获得 6 位字符串。最后,转换为时间。
data.table::as.ITime( sprintf( "%06d",
as.integer( as.numeric(time) * 100 ) ),
format = "%H%M%S" )
# [1] "17:44:30" "23:27:54" "17:18:51" "23:12:32" "14:14:16" "20:46:15" "14:42:50" "19:12:22" "23:03:21" "21:46:32" "14:59:02" "19:30:15"
# [13] "18:56:23" "23:19:15" "14:51:05" "00:25:46" "14:53:25" "23:09:02" "23:42:48" "23:22:53" "21:01:50" "20:26:07" "12:45:04" "19:45:15"
# [25] "00:05:40" "10:39:50" "17:31:37" "20:58:41" "20:30:36" "18:14:31" "13:38:18" "18:58:33" "17:31:36" "23:43:38" "17:33:27" "23:04:59"
# [37] "13:09:47" "19:16:11" "19:58:30" "19:29:54" "17:56:40" "17:44:23" "17:31:26" "18:44:47" "13:53:25" "19:58:30" "17:46:44" "18:57:53"
# [49] "20:47:15" "23:27:21"