How can I convert character dates into numerics?
Given a time series of cinema data, the identifier "date" is of interest. I would like to convert it to the format "YYYY/MM/DD". However, when I run my code:
CINEMA.TICKET$DATE <- as.Date(CINEMA.TICKET$date , format = "%y/%m/%d")
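two problems arise:
First, the date is displayed at the far right of the table as, e.g., "0005-05-20". Many entries also disappear completely. Can anyone explain what I am doing wrong, and how to do it correctly?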
film_code cinema_code total_sales tickets_sold tickets_out show_time occu_perc ticket_price ticket_use capacity date month quarter day newdate DATE
1 1492 304 3900000 26 0 4 4.26 150000 26 610.3286 5/5/2018 5 2 5 0005-05-20 2005-05-20
2 1492 352 3360000 42 0 5 8.08 80000 42 519.8020 5/5/2018 5 2 5 0005-05-20 2005-05-20
3 1492 489 2560000 32 0 4 20.00 80000 32 160.0000 5/5/2018 5 2 5 0005-05-20 2005-05-20
4 1492 429 1200000 12 0 1 11.01 100000 12 108.9918 5/5/2018 5 2 5 0005-05-20 2005-05-20
5 1492 524 1200000 15 0 3 16.67 80000 15 89.9820 5/5/2018 5 2 5 0005-05-20 2005-05-20
6 1492 71 1050000 7 0 3 0.98 150000 7 714.2857 5/5/2018 5 2 5 0005-05-20 2005-05-20
> str(CINEMA.TICKET)
As @Dave2e pointed out, you are looking for:
CINEMA.TICKET[, date := as.Date(date , format = "%d/%m/%Y")]
assuming the input format is like "30/5/2018". The question is ambiguous here, because the example "5/5/2018" could be either "%d/%m/%Y" or "%m/%d/%Y".
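A minimal sketch of why the original call misbehaves, using base R only. The second input value, "5/13/2018", is a hypothetical entry added for illustration; it is not in the posted data. With format "%y/%m/%d", the leading "5" is read as a two-digit year (2005), the next "5" as the month, and the first two digits of "2018" as the day, which matches the "2005-05-20" shown in the question. An entry whose second field is larger than 12 cannot be a month, so it becomes NA, which may be why entries seem to vanish.

```r
# "5/13/2018" is a made-up value used only to show the NA case
x <- c("5/5/2018", "5/13/2018")

# Format from the question: fields are taken as year/month/day
wrong <- as.Date(x, format = "%y/%m/%d")
format(wrong[1])   # "2005-05-20", as in the question's DATE column
is.na(wrong[2])    # TRUE, because 13 is not a valid month

# Month/day/four-digit-year parses both values
mdy_dates <- as.Date(x, format = "%m/%d/%Y")

# Day/month/year parses the first value but not the second
dmy_dates <- as.Date(x, format = "%d/%m/%Y")
```

Checking for NA after parsing, e.g. with sum(is.na(mdy_dates)), is a quick way to see whether the chosen format fits every row.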
To reorder the columns, use:
setcolorder(CINEMA.TICKET, c("c", "b", "a"))
where c, b, a are the column names in the desired order.
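setcolorder() comes from data.table. If CINEMA.TICKET is a plain data.frame, a base R sketch along the following lines should give a similar result. The small tickets data frame here is a hypothetical stand-in, not the real data.

```r
# Hypothetical two-row stand-in for CINEMA.TICKET
tickets <- data.frame(
  film_code = c(1492L, 1492L),
  date = c("5/5/2018", "5/5/2018")
)

# A new column is always appended at the far right
tickets$DATE <- as.Date(tickets$date, format = "%m/%d/%Y")

# Move DATE to the front and keep the other columns in their current order
tickets <- tickets[, c("DATE", setdiff(names(tickets), "DATE"))]
names(tickets)
```

Unlike setcolorder(), which modifies the table by reference, this creates a reordered copy and assigns it back.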
lubridate may do the trick:
> lubridate::mdy("5/5/2018")
[1] "2018-05-05"
So you should use
library(lubridate)
library(tidyverse)
CINEMA.TICKET <- CINEMA.TICKET %>%
mutate(DATE=mdy(date))
Here is another option:
library(tidyverse)
output <- df %>%
mutate(date = as.Date(date, format="%m/%d/%Y"))
Output
film_code cinema_code total_sales tickets_sold tickets_out show_time occu_perc ticket_price ticket_use capacity date month quarter day
1 1492 304 3900000 26 0 4 4.26 150000 26 610.3286 2018-05-05 5 2 5
2 1492 352 3360000 42 0 5 8.08 80000 42 519.8020 2018-05-05 5 2 5
3 1492 489 2560000 32 0 4 20.00 80000 32 160.0000 2018-05-05 5 2 5
4 1492 429 1200000 12 0 1 11.01 100000 12 108.9918 2018-05-05 5 2 5
5 1492 524 1200000 15 0 3 16.67 80000 15 89.9820 2018-05-05 5 2 5
6 1492 71 1050000 7 0 3 0.98 150000 7 714.2857 2018-05-05 5 2 5
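A Date object always prints as YYYY-MM-DD, so date cannot be classed as a Date and be displayed with forward slashes at the same time. You can change the format with format(), but the result is then classed as character rather than Date.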
class(output$date)
# [1] "Date"
output2 <- df %>%
mutate(date = as.Date(date, format="%m/%d/%Y")) %>%
mutate(date = format(date, "%Y/%m/%d"))
class(output2$date)
# [1] "character"
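Since the title asks about numerics, here is a short base R sketch of what each class allows. It assumes the month/day/year reading of "5/5/2018". A Date is stored as the number of days since 1970-01-01, so it can be converted to a number and used in arithmetic. The slash-formatted string is only a display value and has to be parsed again before it can be used as a date.

```r
d <- as.Date("5/5/2018", format = "%m/%d/%Y")

# Numeric form: days since 1970-01-01
days_since_epoch <- as.numeric(d)

# Date arithmetic works directly on the Date object
next_week <- d + 7

# Slash-separated display string; class is character
s <- format(d, "%Y/%m/%d")

# Parsing the string with the matching format gives the Date back
round_trip <- as.Date(s, format = "%Y/%m/%d")
identical(round_trip, d)
```

One common design is to keep the Date column for sorting and calculations, and call format() only when printing or exporting.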
Data
df <-
structure(
list(
film_code = c(1492L, 1492L, 1492L, 1492L, 1492L,
1492L),
cinema_code = c(304L, 352L, 489L, 429L, 524L, 71L),
total_sales = c(3900000L,
3360000L, 2560000L, 1200000L, 1200000L, 1050000L),
tickets_sold = c(26L,
42L, 32L, 12L, 15L, 7L),
tickets_out = c(0L, 0L, 0L, 0L, 0L,
0L),
show_time = c(4L, 5L, 4L, 1L, 3L, 3L),
occu_perc = c(4.26,
8.08, 20, 11.01, 16.67, 0.98),
ticket_price = c(150000L, 80000L,
80000L, 100000L, 80000L, 150000L),
ticket_use = c(26L, 42L, 32L,
12L, 15L, 7L),
capacity = c(610.3286, 519.802, 160, 108.9918,
89.982, 714.2857),
date = c("5/5/2018", "5/5/2018", "5/5/2018", "5/5/2018",
"5/5/2018", "5/5/2018"),
month = c(5L, 5L, 5L, 5L, 5L, 5L),
quarter = c(2L,
2L, 2L, 2L, 2L, 2L),
day = c(5L, 5L, 5L, 5L, 5L, 5L)
),
class = "data.frame",
row.names = c(NA,-6L)
)