Converting H:M:S character variable to numeric/continuous scale (R)
I am interested in visualizing hourly Twitter sentiment on a given topic. My variables are stored as follows:
sapply(valence_hour,class)
Time day mean_valence n
"character" "numeric" "numeric" "integer"
Here is a sample of the data:
Time day mean_valence n
23:59:00 19 0.0909090909 3
23:58:00 19 0.0589743590 3
23:57:00 19 0.49743590 3
I then run the following plotting code:
ggplot(valence_hour, aes(x = Time, y = mean_valence)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = seq(1, 30, 1)) +
  geom_smooth()
However, I keep getting this error: "Error: Discrete value supplied to continuous scale"
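I believe the problem is caused by the "Time" variable being stored as character, so I tried to implement a similar solution.
The following runs without error, but it does not fix my "Time" variable, because I still get the error "Discrete value supplied to continuous scale":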
valence_hour <-
time_to_seconds <- function(time) {
  parts <- time %>%
    strsplit(":|\\.") %>%   # the dot must be escaped as "\\." inside an R string
    .[[1]] %>%
    as.numeric
  seconds <- parts[1] * 60 * 60 + parts[2] * 60 + parts[3]
  seconds
}
time_to_seconds("00:01:53.910")
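Here is one way.
Paste the current system date and Time together, coerce the result to "POSIXct", and use this new temporary variable for the x-axis. Set the axis labels in the datetime scale layer.
The warnings are due to the small data set: loess complains that there are not enough data points. Don't worry, it will handle larger data.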
library(dplyr, quietly = TRUE)
library(ggplot2, quietly = TRUE)

x <- '
Time day mean_valence n
23:59:00 19 0.0909090909 3
23:58:00 19 0.0589743590 3
23:57:00 19 0.49743590 3'
valence_hour <- read.table(textConnection(x), header = TRUE)

valence_hour %>%
  mutate(Time = paste(Sys.Date(), Time),
         Time = as.POSIXct(Time)) %>%
  ggplot(aes(Time, mean_valence)) +
  geom_point() +
  geom_line() +
  scale_x_datetime(
    date_breaks = "1 mins",
    date_labels = "%H:%M:%S"
  ) +
  geom_smooth(formula = y ~ x, method = "loess")
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : span too small. fewer data values than degrees of freedom.
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : pseudoinverse used at 1.654e+09
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : neighborhood radius 60.6
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : reciprocal condition number 0
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : There are other near singularities as well. 3672.4
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : span too small. fewer
#> data values than degrees of freedom.
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used at
#> 1.654e+09
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius 60.6
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : reciprocal condition
#> number 0
#> Warning in predLoess(object$y, object$x, newx = if
#> (is.null(newdata)) object$x else if (is.data.frame(newdata))
#> as.matrix(model.frame(delete.response(terms(object)), : There are other near
#> singularities as well. 3672.4
#> Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
#> -Inf
Created on 2022-05-30 by the reprex package (v2.0.1)
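The conversion step in the pipeline above can also be checked on its own. The following is a minimal sketch, assuming the three sample times from the question and a fixed placeholder date ("2022-05-30", chosen only for illustration) in place of Sys.Date(). The explicit format and tz = "UTC" arguments are additions for reproducibility, not part of the original answer.

```r
# Sample times copied from the question's data
time_chr <- c("23:59:00", "23:58:00", "23:57:00")

# Assumption: a fixed placeholder date instead of Sys.Date(), plus an
# explicit format and time zone so the result does not depend on the
# machine or the day the code is run.
time_dt <- as.POSIXct(
  paste("2022-05-30", time_chr),
  format = "%Y-%m-%d %H:%M:%S",
  tz = "UTC"
)

# The column is now a date-time, which ggplot2 treats as continuous
class(time_dt)

# Consecutive rows are one minute apart, as in the sample data
step_secs <- as.numeric(diff(time_dt), units = "secs")
step_secs

# The original H:M:S text can be recovered for labelling
format(time_dt, "%H:%M:%S")
```

Because the axis labels show only the time of day, the date that gets pasted on is effectively a throwaway value. Fixing it just keeps the result the same from one day to the next.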
With full credit to @Rui Barradas's solution, here is another approach:
library(tidyverse)
library(lubridate)

valence_hour %>%
  mutate(Time = hms(Time)) %>%
  ggplot(aes(x = Time, y = mean_valence)) +
  geom_point() +
  geom_line() +
  scale_x_time() +
  geom_smooth()
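If a plain numeric axis is preferred, as the question title suggests, the times can also be converted to seconds or hours since midnight with base R alone. This is a sketch under stated assumptions, not either answerer's method: the vectorised time_to_seconds() below is a hypothetical rewrite of the question's helper, and it assumes every value has exactly three colon-separated fields (hours, minutes, seconds, with optional fractional seconds).

```r
# Hypothetical vectorised helper: "H:M:S" strings -> seconds since midnight.
# Assumes each element has exactly three colon-separated fields.
time_to_seconds <- function(time) {
  parts <- strsplit(time, ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)            # fractional seconds such as "53.910" are kept
    p[1] * 3600 + p[2] * 60 + p[3]
  }, numeric(1))
}

# Sample rows copied from the question's data
valence_hour <- data.frame(
  Time = c("23:59:00", "23:58:00", "23:57:00"),
  mean_valence = c(0.0909090909, 0.0589743590, 0.49743590),
  stringsAsFactors = FALSE
)

# Numeric columns that a continuous scale can accept
valence_hour$secs  <- time_to_seconds(valence_hour$Time)
valence_hour$hours <- valence_hour$secs / 3600

sapply(valence_hour, class)
```

Since hours is numeric, mapping it to x should work with scale_x_continuous(). The breaks would then need to be chosen on the hour scale rather than with seq(1, 30, 1), which does not cover values close to 24.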