How can I convert character dates into numerics?

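Given a time series containing cinema data, the "date" identifier is the one of interest. I want to convert it to the format "YYYY/MM/DD". However, when I run my code: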

CINEMA.TICKET$DATE <- as.Date(CINEMA.TICKET$date , format = "%y/%m/%d")

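Two problems arise. First, the dates end up at the far right of the table and are wrong, e.g. "0005-05-20". Second, many entries disappear completely. Can anyone explain what I am doing wrong and how to do it correctly?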

film_code cinema_code total_sales tickets_sold tickets_out show_time occu_perc ticket_price ticket_use capacity     date month quarter day    newdate       DATE
1      1492         304     3900000           26           0         4      4.26       150000         26 610.3286 5/5/2018     5       2   5 0005-05-20 2005-05-20
2      1492         352     3360000           42           0         5      8.08        80000         42 519.8020 5/5/2018     5       2   5 0005-05-20 2005-05-20
3      1492         489     2560000           32           0         4     20.00        80000         32 160.0000 5/5/2018     5       2   5 0005-05-20 2005-05-20
4      1492         429     1200000           12           0         1     11.01       100000         12 108.9918 5/5/2018     5       2   5 0005-05-20 2005-05-20
5      1492         524     1200000           15           0         3     16.67        80000         15  89.9820 5/5/2018     5       2   5 0005-05-20 2005-05-20
6      1492          71     1050000            7           0         3      0.98       150000          7 714.2857 5/5/2018     5       2   5 0005-05-20 2005-05-20
> str(CINEMA.TICKET)

As @Dave2e pointed out, you are looking for:

CINEMA.TICKET[, date := as.Date(date , format = "%d/%m/%Y")]

This assumes the input is day-first, like "30/5/2018". The question is ambiguous on this point: the example "5/5/2018" could be either "%d/%m/%Y" or "%m/%d/%Y".
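One way to settle the ambiguity is to parse the column with both candidate formats and count how many values fail to parse (become NA). The following is a minimal base R sketch; the vector `x` is made up for illustration and is not taken from the question's data. It also shows why the original format "%y/%m/%d" produced a date in 2005.

```r
# Hypothetical sample values; "5/13/2018" can only be month/day/year.
x <- c("5/5/2018", "5/13/2018", "12/31/2018")

dmy_try <- as.Date(x, format = "%d/%m/%Y")  # day first
mdy_try <- as.Date(x, format = "%m/%d/%Y")  # month first

# The format that produces no NA values is the plausible one.
na_counts <- c(dmy = sum(is.na(dmy_try)), mdy = sum(is.na(mdy_try)))
print(na_counts)

# The original format "%y/%m/%d" reads "5" as a two-digit year (2005)
# and the leading "20" of "2018" as the day of the month.
bad <- as.Date("5/5/2018", format = "%y/%m/%d")
print(bad)
```

With the real data, the same check would be run on CINEMA.TICKET$date. Values that cannot be matched by the format are returned as NA, which would explain entries that seem to disappear.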

To reorder the columns, use:

setcolorder(CINEMA.TICKET, c("c", "b", "a"))

where c, b, a are the column names in the desired order.
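setcolorder works on a data.table. If the data is a plain data.frame instead, the same reordering can be sketched in base R by indexing the columns by name. The data frame `dat` and its column names below are hypothetical.

```r
# Hypothetical small data.frame with the date column at the far right.
dat <- data.frame(
  a = 1:2,
  b = 3:4,
  date = as.Date(c("2018-05-05", "2018-05-06"))
)

# Put "date" first and keep the remaining columns in their current order.
dat <- dat[, c("date", setdiff(names(dat), "date"))]
print(names(dat))
```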

lubridate might do the trick:

> lubridate::mdy("5/5/2018")
[1] "2018-05-05"

So you should use:

library(lubridate)
library(tidyverse)

CINEMA.TICKET <- CINEMA.TICKET %>% 
  mutate(DATE=mdy(date))

Here is another option:

library(tidyverse)

output <- df %>% 
  mutate(date = as.Date(date, format="%m/%d/%Y"))

Output

  film_code cinema_code total_sales tickets_sold tickets_out show_time occu_perc ticket_price ticket_use capacity       date month quarter day
1      1492         304     3900000           26           0         4      4.26       150000         26 610.3286 2018-05-05     5       2   5
2      1492         352     3360000           42           0         5      8.08        80000         42 519.8020 2018-05-05     5       2   5
3      1492         489     2560000           32           0         4     20.00        80000         32 160.0000 2018-05-05     5       2   5
4      1492         429     1200000           12           0         1     11.01       100000         12 108.9918 2018-05-05     5       2   5
5      1492         524     1200000           15           0         3     16.67        80000         15  89.9820 2018-05-05     5       2   5
6      1492          71     1050000            7           0         3      0.98       150000          7 714.2857 2018-05-05     5       2   5

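For date to have class Date, it cannot be displayed with forward slashes. You can change the format, but the result is then no longer of class Date; it becomes character instead.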

class(output$date)
# [1] "Date"

output2 <- df %>% 
  mutate(date = as.Date(date, format="%m/%d/%Y")) %>% 
  mutate(date = format(date, "%Y/%m/%d"))

class(output2$date)
# [1] "character"
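Since the title asks about numerics, it may help to see how a Date relates to a number. This is a small base R sketch, assuming "numeric" means either the internal day count or an integer of the form YYYYMMDD; the question does not say which one is wanted.

```r
d <- as.Date("5/5/2018", format = "%m/%d/%Y")

# A Date is stored as the number of days since 1970-01-01.
n <- as.numeric(d)
print(n)

# Date arithmetic works on the Date itself, not on a formatted string.
print(d + 1)

# Formatting with slashes gives a character value, for display only.
s <- format(d, "%Y/%m/%d")
print(class(s))

# An integer of the form YYYYMMDD, if that is what "numeric" should mean.
ymd_int <- as.integer(format(d, "%Y%m%d"))
print(ymd_int)
```

Keeping the column as Date for sorting and arithmetic, and calling format() only when printing or exporting, avoids losing the Date class.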

Data

df <-
  structure(
    list(
      film_code = c(1492L, 1492L, 1492L, 1492L, 1492L,
                    1492L),
      cinema_code = c(304L, 352L, 489L, 429L, 524L, 71L),
      total_sales = c(3900000L,
                      3360000L, 2560000L, 1200000L, 1200000L, 1050000L),
      tickets_sold = c(26L,
                       42L, 32L, 12L, 15L, 7L),
      tickets_out = c(0L, 0L, 0L, 0L, 0L,
                      0L),
      show_time = c(4L, 5L, 4L, 1L, 3L, 3L),
      occu_perc = c(4.26,
                    8.08, 20, 11.01, 16.67, 0.98),
      ticket_price = c(150000L, 80000L,
                       80000L, 100000L, 80000L, 150000L),
      ticket_use = c(26L, 42L, 32L,
                     12L, 15L, 7L),
      capacity = c(610.3286, 519.802, 160, 108.9918,
                   89.982, 714.2857),
      date = c("5/5/2018", "5/5/2018", "5/5/2018", "5/5/2018",
               "5/5/2018", "5/5/2018"),
      month = c(5L, 5L, 5L, 5L, 5L, 5L),
      quarter = c(2L,
                  2L, 2L, 2L, 2L, 2L),
      day = c(5L, 5L, 5L, 5L, 5L, 5L)
    ),
    class = "data.frame",
    row.names = c(NA,-6L)
  )