Nesting specific elements into another list in R

My dataset has 5 IDs and spans 01-01-2010 to 12-31-2013. I first split the data by ID, which gives me a list object. Then I create another list that holds 10-day intervals, arranged by ID.

I would like to nest these intervals into the first ID list, based on the ID recorded in each interval element.

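For example: the main list consists of the ID elements. [1], [2], [3] are the intervals nested within each ID, so the intervals in [A] all belong to ID A, those in [B] to B, those in [C] to C, and so on.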

[A]
   [1]
   [2]
   [3]
[B]
   [1]
   [2]
   [3]
[C]
   [1]
   [2]
   [3]
[D]
   [1]
   [2]
   [3]
[E]
   [1]
   [2]
   [3]

The code below nests the intervals into the ID list, but it nests all the IDs rather than only the specific intervals that belong to each one.

set.seed(12345)
library(lubridate)
library(tidyverse)

date <- rep_len(seq(dmy("01-01-2010"), dmy("31-12-2013"), by = "days"), 500)
ID <- rep(c("A","B","C","D", "E"), 100)

df <- data.frame(date = date,
                 x = runif(length(date), min = 60000, max = 80000),
                 y = runif(length(date), min = 800000, max = 900000),
                 ID)

df_ID <- split(df, df$ID)


df_nested <- lapply(df_ID, function(x){
  x %>%
    arrange(ID) %>% 
    # Creates a new column assigning the first day in the 10-day interval in which
    # the date falls under (e.g., 01-01-2010 would be in the first 10-day interval
    # so the `floor_date` assigned to it would be 01-01-2010)
    mutate(new = floor_date(date, "10 days")) %>%
    # For any month that has 31 days, the 31st day would normally be assigned its 
    # own interval. The code below takes the 31st day and joins it with the 
    # previous interval. 
    mutate(new = if_else(day(new) == 31, new - days(10), new)) %>% 
    group_by(new, .add = TRUE) %>%
    group_split()
})

I would do it this way:

set.seed(12345)
library(lubridate)
library(tidyverse)

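# Adds a column `new` holding the first day of the 10-day interval
# that each date falls into
f <- function(data) {
  data %>% mutate(
    new = floor_date(date, "10 days"),
    # Fold the 31st into the previous interval instead of leaving it on its own
    new = if_else(day(new) == 31, new - days(10), new)
  )
}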

tibble(
  ID = rep(c("A","B","C","D", "E"), 100),
  date = rep_len(seq(dmy("01-01-2010"), dmy("31-12-2013"), by = "days"), 500),
  x = runif(length(date), min = 60000, max = 80000),
  y = runif(length(date), min = 800000, max = 900000)
) %>% group_by(ID) %>% 
  nest() %>% 
  mutate(data = map(data, f)) %>% 
  unnest(data)

Output

# A tibble: 500 x 5
# Groups:   ID [5]
   ID    date            x       y new       
   <chr> <date>      <dbl>   <dbl> <date>    
 1 A     2010-01-01 74418. 820935. 2010-01-01
 2 A     2010-01-06 63327. 885896. 2010-01-01
 3 A     2010-01-11 60691. 873949. 2010-01-11
 4 A     2010-01-16 69250. 868411. 2010-01-11
 5 A     2010-01-21 69075. 876142. 2010-01-21
 6 A     2010-01-26 67797. 829892. 2010-01-21
 7 A     2010-01-31 75860. 843542. 2010-01-21
 8 A     2010-02-05 67233. 882318. 2010-02-01
 9 A     2010-02-10 75644. 826283. 2010-02-01
10 A     2010-02-15 66424. 853789. 2010-02-11

Simple and clear, isn't it?

Everything you want to do to the data is contained in the function f. You can extend it as needed.
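
As a rough sketch of what extending f could look like, the variant below also records how many rows fall into each 10-day interval. The names f2, n_in_interval and demo are made up for this illustration and are not part of the original code.

```r
library(dplyr)
library(lubridate)

# Hypothetical extension of f: same interval logic, plus a per-interval row count.
# `f2` and `n_in_interval` are illustrative names only.
f2 <- function(data) {
  data %>%
    mutate(
      new = floor_date(date, "10 days"),
      new = if_else(day(new) == 31, new - days(10), new)
    ) %>%
    # add_count() keeps every row and appends the group size as a new column
    add_count(new, name = "n_in_interval")
}

# Small made-up input: four dates in January 2010
demo <- tibble(date = ymd("2010-01-01") + c(0, 5, 10, 30))
demo_out <- f2(demo)
demo_out
```

Because f2 takes and returns a plain data frame, it should drop into the same map(data, f2) step without other changes.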

The rest follows one simple pattern: tibble %>% group_by %>% nest %>% mutate %>% unnest
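
The pipeline above returns one flat tibble. If the nested list from the question (ID at the top level, 10-day intervals inside each ID) is still needed, one possible sketch is to split twice with base R. The helper add_interval and the small df_small data set are assumptions made for this example; add_interval simply repeats the logic of f.

```r
library(dplyr)
library(lubridate)

# Same interval logic as f, repeated here so the example is self-contained
add_interval <- function(data) {
  data %>%
    mutate(
      new = floor_date(date, "10 days"),
      new = if_else(day(new) == 31, new - days(10), new)
    )
}

# Made-up data: two IDs with three dates each
df_small <- tibble(
  ID = rep(c("A", "B"), each = 3),
  date = ymd("2010-01-01") + c(0, 5, 10, 0, 12, 25)
)

with_interval <- add_interval(df_small)

# First level: one element per ID
by_id <- split(with_interval, with_interval$ID)

# Second level: inside each ID, one element per interval start date
nested <- lapply(by_id, function(d) split(d, d$new))

names(nested)    # the IDs
names(nested$A)  # interval start dates present for ID A
```

Each inner element then contains only the rows of that ID falling in that interval, which matches the [A] [1] [2] [3] layout described in the question.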