Converting H:M:S character variable to numeric/continuous scale (R)

I am interested in visualizing hourly Twitter sentiment on a given topic. My variables are stored as follows:

sapply(valence_hour,class)
      Time          day mean_valence            n 
 "character"    "numeric"    "numeric"    "integer" 

Here is a sample of the data:

Time          day   mean_valence            n 
23:59:00     19     0.0909090909            3
23:58:00     19     0.0589743590            3
23:57:00     19     0.49743590             3

I then run the following plotting code:

ggplot(valence_hour, aes(x = Time, y = mean_valence)) +
  geom_point() +
  geom_line()+
  scale_x_continuous(breaks=seq(1,30,1)) +
  geom_smooth()

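However, I keep getting this error: "Error: Discrete value supplied to continuous scale"

I believe the problem is caused by the "Time" variable being stored as character, so I tried to implement a solution similar to one I found. The following runs without error, but it does not fix my "Time" variable, because I still get "Discrete value supplied to continuous scale":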

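library(magrittr)  # provides %>%

# Convert a single "HH:MM:SS(.fff)" string to seconds since midnight.
# Note: defining the function does not by itself change valence_hour$Time.
time_to_seconds <- function(time) {
  parts <- time %>%
    strsplit(":|\\.") %>%
    .[[1]] %>%
    as.numeric()

  # Any fractional part after "." becomes parts[4] and is ignored here.
  parts[1] * 60 * 60 + parts[2] * 60 + parts[3]
}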

time_to_seconds("00:01:53.910")

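Here is one way.
Paste the current system date and Time together, coerce the result to "POSIXct", and use this new temporary variable for the x-axis. Set the axis labels in the datetime scale layer.

The warnings are due to the small data set: loess complains that there are not enough data points. Don't worry, it will work with larger data.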

library(dplyr, quietly = TRUE)
library(ggplot2, quietly = TRUE)

x <- '
Time          day   mean_valence            n 
23:59:00     19     0.0909090909            3
23:58:00     19     0.0589743590            3
23:57:00     19     0.49743590             3'
valence_hour <- read.table(textConnection(x), header = TRUE)

valence_hour %>%
  mutate(Time = paste(Sys.Date(), Time),
         Time = as.POSIXct(Time)) %>%
  ggplot(aes(Time, mean_valence)) +
  geom_point() +
  geom_line()+
  scale_x_datetime(
    date_breaks = "1 mins",
    date_labels = "%H:%M:%S"
  ) +
  geom_smooth(formula = y ~ x, method = "loess")
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : span too small. fewer data values than degrees of freedom.
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : pseudoinverse used at 1.654e+09
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : neighborhood radius 60.6
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : reciprocal condition number 0
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : There are other near singularities as well. 3672.4
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : span too small. fewer
#> data values than degrees of freedom.
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used at
#> 1.654e+09
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius 60.6
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : reciprocal condition
#> number 0
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : There are other near
#> singularities as well. 3672.4
#> Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
#> -Inf

Created on 2022-05-30 by the reprex package (v2.0.1)
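
To see why the paste-and-coerce step removes the "discrete value" error, it can help to look at that step in isolation. The following is a minimal base R sketch, not part of the original answer; the vector names `Time` and `stamp` are only illustrative. It shows that the result is a "POSIXct" object, which ggplot2 treats as a datetime rather than as discrete text.

```r
# Illustrative data: the three Time values from the question.
Time <- c("23:59:00", "23:58:00", "23:57:00")

# Attach today's date so each string becomes a full date-time,
# then coerce to POSIXct (parsed in the local time zone by default).
stamp <- as.POSIXct(paste(Sys.Date(), Time))

class(stamp)               # includes "POSIXct"
format(stamp, "%H:%M:%S")  # the clock times are preserved
```

If the local time zone is a concern (for example around daylight saving changes), passing `tz = "UTC"` to `as.POSIXct()` is one option.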

With full credit to @Rui Barradas's solution, here is another approach:

library(tidyverse)
library(lubridate)

valence_hour %>% 
  mutate(Time = hms(Time)) %>% 
  ggplot(aes(x = Time, y = mean_valence)) +
  geom_point() +
  geom_line()+
  scale_x_time()+
  geom_smooth()
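
If a plain numeric axis is preferred, as the question title suggests, another option is to convert Time to a number before plotting. The sketch below is an assumption-laden illustration rather than something taken from the answers above: `time_to_seconds_vec` and `hour_of_day` are hypothetical names, and the helper assumes every value has the form "HH:MM:SS" (fractional seconds are allowed). Unlike the function in the question, it is vectorized and its result is actually stored in the data frame.

```r
# Hypothetical helper: vectorized "HH:MM:SS" -> seconds since midnight.
time_to_seconds_vec <- function(time) {
  parts <- strsplit(time, ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    p[1] * 3600 + p[2] * 60 + p[3]
  }, numeric(1))
}

# The sample rows from the question.
valence_hour <- data.frame(
  Time = c("23:59:00", "23:58:00", "23:57:00"),
  day = 19,
  mean_valence = c(0.0909090909, 0.0589743590, 0.49743590),
  n = 3L,
  stringsAsFactors = FALSE
)

# Store the numeric result as a new column, in hours since midnight.
valence_hour$hour_of_day <- time_to_seconds_vec(valence_hour$Time) / 3600

# Plot only if ggplot2 is installed; the x variable is now numeric,
# so a continuous scale is accepted.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(valence_hour, aes(x = hour_of_day, y = mean_valence)) +
    geom_point() +
    geom_line() +
    scale_x_continuous(name = "Hour of day")
  print(p)
}
```

The trade-off is that the axis labels become decimal hours instead of clock times, so the datetime or time scales shown above are usually easier to read.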