How to expand with multiple variables or dimensions
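I want to expand along three dimensions in R. I want to combine daily county-level information for three years into one data frame that contains every county for every year, every month, and every day (e.g. 31). The problem is that not every county#day observation exists in the using data, because the event did not occur in that county on that day. For me, those are zero observations.
To build my master file, I created a list of all counties. I then want to expand it so that I have exactly one observation for each county#year#month#day combination.
My code is below. I have a data.frame with the counties, and I want to generate the year, month, and day. So far I have used expand from the tidyverse.
Edit: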
library(tidyverse)
# This is my list of all counties from an official source
counties <- data.frame("county" = c("A", "B" ,"c"))
# This is what I have, the data includes counties (not all),
# for year (not all),
# months (not all)
# and days (not all)
using <- data.frame("county" = c("A", "A", "A", "B", "B", "B", "B"),
"year" = c(2015,2016,2017,2015,2016,2017,2018),
"month" = c(1,2,7,2,3,2,4),
"day" = c(1,2,22,3,21,14,5))
# This is my attempt to get at least all county-month combinations
county.month <- expand(counties, county, month = 1:12)
# But I wish I could get all county#year#month#day combinations
Best,
Daniel
I'm not sure what you want as output, but I think you want tidyr's complete function rather than expand?
For example:
using %>%
complete(month, nesting(county, year))
# A tibble: 35 x 4
month county year day
<dbl> <fct> <dbl> <dbl>
1 1 A 2015 1
2 1 A 2016 NA
3 1 A 2017 NA
4 1 B 2015 NA
5 1 B 2016 NA
6 1 B 2017 NA
7 1 B 2018 NA
8 2 A 2015 NA
9 2 A 2016 2
10 2 A 2017 NA
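The call above only completes the month values that already occur in the data. A minimal sketch of how complete could instead be given explicit ranges, so that every county, year, month, and day appears and the missing combinations become zeros. The events column is a hypothetical count added purely for illustration; it is not part of the original data, and the 2015 to 2018 range is assumed from the sample.

```r
library(dplyr)
library(tidyr)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)
using <- data.frame(
  county = c("A", "A", "A", "B", "B", "B", "B"),
  year   = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
  month  = c(1, 2, 7, 2, 3, 2, 4),
  day    = c(1, 2, 22, 3, 21, 14, 5),
  stringsAsFactors = FALSE
)

full_grid <- using %>%
  mutate(events = 1) %>%              # hypothetical: one event per observed row
  complete(
    county = counties$county,         # all official counties, including "c"
    year   = as.numeric(2015:2018),   # assumed year range, kept as double
    month  = as.numeric(1:12),
    day    = as.numeric(1:31),        # still assumes 31 days in every month
    fill   = list(events = 0)         # combinations not in the data become 0
  )
```

Naming each argument tells complete which values to use for that column rather than relying on the values that happen to be present.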
This approach should do what you want: a tibble of all possible county/year/month/day combinations (assuming every month has 31 days... ;)). The key is to use factors.
library(tidyverse)
counties <- data.frame("county" = c("A", "B" ,"c"), stringsAsFactors = FALSE)
using <- tibble("county" = c("A", "A", "A", "B", "B", "B", "B"),
"year" = c(2015,2016,2017,2015,2016,2017,2018),
"month" = c(1,2,7,2,3,2,4),
"day" = c(1,2,22,3,21,14,5))
using %>%
mutate_if(is.character, as_factor) %>%
mutate_if(is.numeric, as.ordered) %>%
mutate(county = fct_expand(county, counties$county),
month = fct_expand(month, as.character(1:12)),
day = fct_expand(day, as.character(1:31))) %>%
expand(county, year, month, day) %>%
arrange(year, month, day)
# A tibble: 4,464 x 4
county year month day
<fct> <ord> <ord> <ord>
1 A 2015 1 1
2 B 2015 1 1
3 c 2015 1 1
4 A 2015 1 2
5 B 2015 1 2
6 c 2015 1 2
7 A 2015 1 3
8 B 2015 1 3
9 c 2015 1 3
10 A 2015 1 5
# … with 4,454 more rows
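The 31-days-per-month assumption leaves impossible dates such as February 30 in the grid. One possible way to drop them, sketched here with lubridate's days_in_month; the grid is rebuilt with crossing and plain integers rather than factors to keep the example short, and the 2015 to 2018 range is assumed from the sample data.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Full grid with 31 days in every month
grid <- crossing(
  county = c("A", "B", "c"),
  year   = 2015:2018,
  month  = 1:12,
  day    = 1:31
)

# Keep only days that exist in the given month and year
valid_grid <- grid %>%
  mutate(last_day = as.integer(days_in_month(make_date(year, month, 1L)))) %>%
  filter(day <= last_day) %>%
  select(-last_day)
```

Because days_in_month is computed from an actual date, leap years such as 2016 keep their February 29.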
Perhaps what you want is every date in the years covered by your data. If that is the case, use the seq() function with by="1 day".
library(tidyverse)
library(lubridate)
counties <- data.frame("county" = c("A", "B" ,"c"), stringsAsFactors = FALSE)
start_date<-as_date("2015-01-01")
end_date<-as_date("2018-12-31")
all_dates<-seq(start_date, end_date, by='1 day')
allcounties_alldates<-crossing(counties, all_dates)
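To get from the county-by-date grid to the master file described in the question, the observed rows still have to be joined on and the gaps filled with zeros. A rough sketch of that step, assuming a hypothetical events column that marks each observed row as one event (the original data has no such column):

```r
library(dplyr)
library(tidyr)
library(lubridate)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)
using <- data.frame(
  county = c("A", "A", "A", "B", "B", "B", "B"),
  year   = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
  month  = c(1, 2, 7, 2, 3, 2, 4),
  day    = c(1, 2, 22, 3, 21, 14, 5),
  stringsAsFactors = FALSE
)

all_dates <- seq(as_date("2015-01-01"), as_date("2018-12-31"), by = "1 day")

# Turn year/month/day into a single Date key
observed <- using %>%
  mutate(date = make_date(year, month, day),
         events = 1) %>%              # hypothetical: one event per observed row
  select(county, date, events)

master <- crossing(counties, date = all_dates) %>%
  left_join(observed, by = c("county", "date")) %>%
  mutate(events = replace_na(events, 0),   # no match means zero events
         year   = lubridate::year(date),
         month  = lubridate::month(date),
         day    = lubridate::day(date))
```

Keying on a real Date column avoids impossible dates altogether, and the year, month, and day columns can be recovered from it when needed.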