将每日数据系列重新排序为每周行,每周列为周一、周二、周三、周四和周五
Reordering a daily data series into weekly rows with Mon, Tue, Wed, Thur and Fri as the weekly columns
我有一个每日时间序列,其中包含工作日,不包括周末。
我想重新排序,使每一列代表一周,随后的五行显示该周周一、周二、周三、周四和周五的数据。
我尝试使用 cast
(包:reshape
),但在获取上述内容时遇到问题。
感谢您的帮助。
示例:
Date Day Value
06/01/2020 mon 15
07/01/2020 tue 16
08/01/2020 wed 17
09/01/2020 thu 18
10/01/2020 fri 19
13/01/2020 mon 20
14/01/2020 tue 21
15/01/2020 wed 22
16/01/2020 thu 23
17/01/2020 fri 24
重新格式化为:
Start of week mon tue wed thu fri
06/01/2020 15 16 17 18 19
13/01/2020 20 21 22 23 24
这里是一个使用 tidyverse
包的例子。我还使用 lubridate
将 Date
从字符转换为日期 class。
关键是使用 tidyr
包中的 pivot_wider
将数据转换为宽格式。
library(tidyverse)
library(lubridate)
dat2 <- dat %>%
# Convert W to factor for ordering
mutate(Day = factor(Day, levels = c("mon", "tue", "wed", "thu", "fri"))) %>%
# Create a goruping variable to show the week number
group_by(Day) %>%
mutate(Group = 1:n()) %>%
ungroup() %>%
# Change the Date based on Group
group_by(Group) %>%
mutate(Date = min(dmy(Date))) %>%
# Convert to wide format
pivot_wider(names_from = Day, values_from = Value) %>%
# Remove Group
ungroup() %>%
select(-Group)
dat2
# # A tibble: 2 x 6
# Date mon tue wed thu fri
# <date> <int> <int> <int> <int> <int>
# 1 2020-01-06 15 16 17 18 19
# 2 2020-01-13 20 21 22 23 24
数据
# Create example data frame
dat <- read.table(text = "Date Day Value
'06/01/2020' mon 15
'07/01/2020' tue 16
'08/01/2020' wed 17
'09/01/2020' thu 18
'10/01/2020' fri 19
'13/01/2020' mon 20
'14/01/2020' tue 21
'15/01/2020' wed 22
'16/01/2020' thu 23
'17/01/2020' fri 24",
header = TRUE, stringsAsFactors = FALSE)
data.table-包的另一个选项:
library(data.table)
# convert to a 'data.table'
# set the 'Date' and 'Day' columns in the right format
setDT(mydf)[, `:=` (Date = as.Date(Date, format = "%d/%m/%Y"),
Day = factor(Day, levels = c("mon","tue","wed","thu","fri")))]
# create a 'start_of_week' column
# transform from long to wide format
res <- mydf[, start_of_week := Date[1], by = cumsum(Day == "mon")
][, dcast(.SD, start_of_week ~ Day, value.var = "Value")]
给出:
> res
start_of_week mon tue wed thu fri
1: 06/01/2020 15 16 17 18 19
2: 13/01/2020 20 21 22 23 24
已用数据:
mydf <- read.table(text="Date Day Value
06/01/2020 mon 15
07/01/2020 tue 16
08/01/2020 wed 17
09/01/2020 thu 18
10/01/2020 fri 19
13/01/2020 mon 20
14/01/2020 tue 21
15/01/2020 wed 22
16/01/2020 thu 23
17/01/2020 fri 24", header=TRUE, stringsAsFactors=FALSE)
我有一个每日时间序列,其中包含工作日,不包括周末。 我想重新排序,使每一列代表一周,随后的五行显示该周周一、周二、周三、周四和周五的数据。
我尝试使用 cast
(包:reshape
),但在获取上述内容时遇到问题。
感谢您的帮助。
示例:
Date Day Value
06/01/2020 mon 15
07/01/2020 tue 16
08/01/2020 wed 17
09/01/2020 thu 18
10/01/2020 fri 19
13/01/2020 mon 20
14/01/2020 tue 21
15/01/2020 wed 22
16/01/2020 thu 23
17/01/2020 fri 24
重新格式化为:
Start of week mon tue wed thu fri
06/01/2020 15 16 17 18 19
13/01/2020 20 21 22 23 24
这里是一个使用 tidyverse
包的例子。我还使用 lubridate
将 Date
从字符转换为日期 class。
关键是使用 tidyr
包中的 pivot_wider
将数据转换为宽格式。
library(tidyverse)
library(lubridate)
dat2 <- dat %>%
# Convert W to factor for ordering
mutate(Day = factor(Day, levels = c("mon", "tue", "wed", "thu", "fri"))) %>%
# Create a goruping variable to show the week number
group_by(Day) %>%
mutate(Group = 1:n()) %>%
ungroup() %>%
# Change the Date based on Group
group_by(Group) %>%
mutate(Date = min(dmy(Date))) %>%
# Convert to wide format
pivot_wider(names_from = Day, values_from = Value) %>%
# Remove Group
ungroup() %>%
select(-Group)
dat2
# # A tibble: 2 x 6
# Date mon tue wed thu fri
# <date> <int> <int> <int> <int> <int>
# 1 2020-01-06 15 16 17 18 19
# 2 2020-01-13 20 21 22 23 24
数据
# Create example data frame
dat <- read.table(text = "Date Day Value
'06/01/2020' mon 15
'07/01/2020' tue 16
'08/01/2020' wed 17
'09/01/2020' thu 18
'10/01/2020' fri 19
'13/01/2020' mon 20
'14/01/2020' tue 21
'15/01/2020' wed 22
'16/01/2020' thu 23
'17/01/2020' fri 24",
header = TRUE, stringsAsFactors = FALSE)
data.table-包的另一个选项:
library(data.table)
# convert to a 'data.table'
# set the 'Date' and 'Day' columns in the right format
setDT(mydf)[, `:=` (Date = as.Date(Date, format = "%d/%m/%Y"),
Day = factor(Day, levels = c("mon","tue","wed","thu","fri")))]
# create a 'start_of_week' column
# transform from long to wide format
res <- mydf[, start_of_week := Date[1], by = cumsum(Day == "mon")
][, dcast(.SD, start_of_week ~ Day, value.var = "Value")]
给出:
> res start_of_week mon tue wed thu fri 1: 06/01/2020 15 16 17 18 19 2: 13/01/2020 20 21 22 23 24
已用数据:
mydf <- read.table(text="Date Day Value
06/01/2020 mon 15
07/01/2020 tue 16
08/01/2020 wed 17
09/01/2020 thu 18
10/01/2020 fri 19
13/01/2020 mon 20
14/01/2020 tue 21
15/01/2020 wed 22
16/01/2020 thu 23
17/01/2020 fri 24", header=TRUE, stringsAsFactors=FALSE)