Piping over a list, subsetting, and calculating a function of my own

I have a dataset with these three columns, plus other additional columns:

structure(list(from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 
1), to = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8), time = c(1521823032, 
1521827196, 1521827196, 1522678358, 1522701516, 1522701993, 1522702123, 
1522769399, 1522780956, 1522794468, 1522794468, 1522794468, 1522794468, 
1522859524)), class = "data.frame", row.names = c(NA, -14L))

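I need code that takes all the indexes below a given number (for example 5) and, for each index, does the following: subset the dataset to the rows where the index appears in the "from" column or the "to" column, then calculate a function on that subset (for example, the difference between the maximum and the minimum of time). As a result, I expect a data frame containing each index and its computed value.

This is what I have, but it doesn't work.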

dur <- function(x) max(x) - min(x)  # The function to calculate the difference. In other cases I need to use other functions of my own

filternumber <- function(number, x) {  # A function to filter data x by the number in the two columns
  x <- x %>% subset(from == number | to == number)
  return(x)
}

lista <- unique(c(data$from, data$to))  # Creates a list with all the indexes in the data. I do this to avoid having non-existing indexes
lista <- lista[lista <= 5]  # Limit the list to 5. In my code this number would be an argument to a function

result <- lista %>% filternumber(., data) %>% select(time) %>% dur()  # I use select because I have many other columns in the data

The result in this case should be a data frame with 1036492 for index 1, 967272 for index 3, and 92475 for index 4.

I also tried putting filternumber(., data) %>% select(time) %>% dur() inside mutate, but that didn't work either.

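The filternumber function is built on ==, which compares elementwise. Piping the whole vector lista into it therefore compares lista against each column position by position (recycling the shorter vector) instead of filtering once per index. Here, we probably need to loop over the indexes: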

library(dplyr)
library(purrr)
map_dbl(lista, ~ filternumber(.x, data) %>%
  select(time) %>%
  dur())
[1] 1036492  967272   92475       0
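
The call above returns a bare numeric vector, while the question asks for a data frame of indexes and results. Here is a minimal sketch of one way to pair them, assuming the question's data, dur and lista. The names results and result_df are made up for illustration, and pull(time) is swapped in for select(time) so that dur receives a plain numeric vector:

```r
library(dplyr)
library(purrr)

# Data from the question
data <- structure(list(
  from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 1),
  to   = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8),
  time = c(1521823032, 1521827196, 1521827196, 1522678358, 1522701516,
           1522701993, 1522702123, 1522769399, 1522780956, 1522794468,
           1522794468, 1522794468, 1522794468, 1522859524)
), class = "data.frame", row.names = c(NA, -14L))

dur <- function(x) max(x) - min(x)

# Indexes present in either column, limited to 5 as in the question
lista <- unique(c(data$from, data$to))
lista <- lista[lista <= 5]

# One value per index: filter the rows for that index, then apply dur to time
results <- map_dbl(lista, function(i) {
  data %>%
    filter(from == i | to == i) %>%
    pull(time) %>%
    dur()
})

# Keep each result next to the index it belongs to (hypothetical name)
result_df <- data.frame(index = lista, result = results)
result_df
```

Index 5 appears in only one row of the sample data, which is why its duration comes out as 0.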

Perhaps you are looking for something like this:

library(purrr)
library(dplyr)

index <- c(1, 3, 4)
names(index) <- index

index %>%
  map_dfr(~ data %>%  # data is the data frame from the question
            filter(from == .x | to == .x) %>%
            summarize(result = dur(time)),
          .id = "index")

This returns

  index  result
1     1 1036492
2     3  967272
3     4   92475
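
The question mentions that the cutoff would be a function argument and that other summary functions will be needed. A rough base R sketch of how both could be parameterised is shown here. The function summarise_by_index and its arguments limit, f and column are hypothetical names, not part of either answer, and the cutoff uses <= as in the question's code:

```r
# Data and helper from the question
data <- structure(list(
  from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 1),
  to   = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8),
  time = c(1521823032, 1521827196, 1521827196, 1522678358, 1522701516,
           1522701993, 1522702123, 1522769399, 1522780956, 1522794468,
           1522794468, 1522794468, 1522794468, 1522859524)
), class = "data.frame", row.names = c(NA, -14L))

dur <- function(x) max(x) - min(x)

# Hypothetical helper: apply f to `column` for every index <= limit
# that occurs in the from or to column.
summarise_by_index <- function(data, limit, f, column = "time") {
  ids <- sort(unique(c(data$from, data$to)))
  ids <- ids[ids <= limit]
  values <- vapply(ids, function(i) {
    rows <- data$from == i | data$to == i  # rows that mention index i
    f(data[[column]][rows])                # f must return a single number
  }, numeric(1))
  data.frame(index = ids, result = values)
}

durations <- summarise_by_index(data, limit = 4, f = dur)
durations

# Any other single-number summary can be passed the same way
row_counts <- summarise_by_index(data, limit = 4, f = length)
```

Because vapply is told to expect numeric(1), a summary function that returns more than one value fails with an error instead of silently producing a malformed result.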