How can I rearrange a data frame to create four continuous variables from two variables and a third categorical variable with two levels?
I have the following data frame:
df <- data.frame(Date= as.Date(c("2020-08-04","2020-08-04","2020-08-06","2020-08-06","2020-08-07","2020-08-07")),
Period= c("Day","Night","Day","Night","Day","Night"),
State.1= c(1, 0.45,0.48,0.32,0.29,0.87),
State.2= c(0, 0.55,0.28,0.62,0.79,0.17))
df
Date Period State.1 State.2
1 2020-08-04 Day 1.00 0.00
2 2020-08-04 Night 0.45 0.55
3 2020-08-06 Day 0.48 0.28
4 2020-08-06 Night 0.32 0.62
5 2020-08-07 Day 0.29 0.79
6 2020-08-07 Night 0.87 0.17
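In it I have recorded some values of State.1 and State.2 for each Date and Period. For illustration purposes, I would like to rearrange my data frame so that Period is folded into State.1 and State.2. I would like to get the following: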
df2
Date State.1_day State.1_night State.2_day State.2_night
1 2020-08-04 1.00 0.45 0.00 0.55
2 2020-08-05 NA NA NA NA
3 2020-08-06 0.48 0.32 0.28 0.62
4 2020-08-07 0.29 0.87 0.79 0.17
How can I do this? I tried melt(), but I could not get what I wanted.
Thanks in advance.
You can do this:
tidyr::pivot_wider(df, names_from = Period, values_from = c(State.1, State.2))
#> # A tibble: 3 x 5
#> Date State.1_Day State.1_Night State.2_Day State.2_Night
#> <date> <dbl> <dbl> <dbl> <dbl>
#> 1 2020-08-04 1 0.45 0 0.55
#> 2 2020-08-06 0.48 0.32 0.28 0.62
#> 3 2020-08-07 0.290 0.87 0.79 0.17
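Note that the column suffixes above are capitalised (`_Day`, `_Night`), while the output requested in the question uses lower case (`_day`, `_night`). A minimal sketch of one way to match that, assuming it is acceptable to lower-case the Period values before pivoting (the name `df_lower` is only illustrative):

```r
library(tidyr)

df <- data.frame(
  Date    = as.Date(c("2020-08-04", "2020-08-04", "2020-08-06",
                      "2020-08-06", "2020-08-07", "2020-08-07")),
  Period  = c("Day", "Night", "Day", "Night", "Day", "Night"),
  State.1 = c(1, 0.45, 0.48, 0.32, 0.29, 0.87),
  State.2 = c(0, 0.55, 0.28, 0.62, 0.79, 0.17)
)

# Lower-case the Period labels first, so the generated column
# suffixes become "_day" and "_night" (hypothetical helper name).
df_lower <- transform(df, Period = tolower(Period))

wide <- pivot_wider(df_lower, names_from = Period,
                    values_from = c(State.1, State.2))
names(wide)
```

The pivot itself is unchanged; only the labels that feed `names_from` differ.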
Or, if you want a row for every day, even for days that are not present in the original data frame, you can do this:
dplyr::left_join(data.frame(Date = seq(min(df$Date), max(df$Date), by = "day")),
                 tidyr::pivot_wider(df, names_from = Period,
                                    values_from = c(State.1, State.2)),
                 by = "Date")
#> Date State.1_Day State.1_Night State.2_Day State.2_Night
#> 1 2020-08-04 1.00 0.45 0.00 0.55
#> 2 2020-08-05 NA NA NA NA
#> 3 2020-08-06 0.48 0.32 0.28 0.62
#> 4 2020-08-07 0.29 0.87 0.79 0.17
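As a possible alternative to the join, the missing days can probably also be filled in with `tidyr::complete()` and `tidyr::full_seq()`. This is a sketch under the assumption that the dates are whole days, so a period of 1 is the right step; `wide` and `df2` are just illustrative names:

```r
library(tidyr)

df <- data.frame(
  Date    = as.Date(c("2020-08-04", "2020-08-04", "2020-08-06",
                      "2020-08-06", "2020-08-07", "2020-08-07")),
  Period  = c("Day", "Night", "Day", "Night", "Day", "Night"),
  State.1 = c(1, 0.45, 0.48, 0.32, 0.29, 0.87),
  State.2 = c(0, 0.55, 0.28, 0.62, 0.79, 0.17)
)

wide <- pivot_wider(df, names_from = Period,
                    values_from = c(State.1, State.2))

# complete() adds one row for every date in the full daily sequence
# between the first and last Date; the State columns of the added
# rows are filled with NA.
df2 <- complete(wide, Date = full_seq(Date, period = 1))
df2
```

This keeps everything inside tidyr and avoids building the date sequence by hand, at the cost of being a little less explicit than the join.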
We can also use data.table:
library(data.table)
dcast(setDT(df), Date ~ Period, value.var = c("State.1", "State.2"))
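The `dcast()` call above returns only the dates present in the data. If the missing day from the expected output is also needed, one way to add it within data.table might look like the sketch below. The names `dt`, `wide`, `all_days` and `df2` are illustrative, and the table is built with `data.table()` rather than `setDT()` so that the original `df` is not modified by reference:

```r
library(data.table)

dt <- data.table(
  Date    = as.Date(c("2020-08-04", "2020-08-04", "2020-08-06",
                      "2020-08-06", "2020-08-07", "2020-08-07")),
  Period  = c("Day", "Night", "Day", "Night", "Day", "Night"),
  State.1 = c(1, 0.45, 0.48, 0.32, 0.29, 0.87),
  State.2 = c(0, 0.55, 0.28, 0.62, 0.79, 0.17)
)

# Reshape to wide format: one column per (State, Period) pair.
wide <- dcast(dt, Date ~ Period, value.var = c("State.1", "State.2"))

# One row per calendar day between the first and last observed date.
all_days <- data.table(Date = seq(min(dt$Date), max(dt$Date), by = "day"))

# wide[all_days, on = "Date"] keeps every row of all_days, so days
# missing from the data appear with NA in the State columns.
df2 <- wide[all_days, on = "Date"]
df2
```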