Piping over a list, subsetting, and calculating a function of my own
I have a dataset with these three columns plus other additional columns:
structure(list(from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4,
1), to = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8), time = c(1521823032,
1521827196, 1521827196, 1522678358, 1522701516, 1522701993, 1522702123,
1522769399, 1522780956, 1522794468, 1522794468, 1522794468, 1522794468,
1522859524)), class = "data.frame", row.names = c(NA, -14L))
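I need code that takes all the indexes below a given number (for example 5) and, for each index, subsets the dataset to the rows where that index appears in either the "from" or the "to" column, then computes a function on that subset (for example the difference between the maximum and minimum of time). The result I want is a data frame containing each index and its computed value.
This is what I have, but it does not work.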
dur<-function(x)max(x)-min(x) #The function to calculate the difference. In other cases I need to use other functions of my own
filternumber <- function(number,x){ #A function to filter data x by the number in the two columns
x <- x%>% subset(from == number | to == number)
return(x)
}
lista <- unique(c(data$from, data$to)) # Creates a list with all the indexes in the data. I do this to avoid having non-existing indexes
lista <-lista[lista <= 5] #Limit the list to 5. In my code this number would be an argument to a function
result<-lista%>%filteremployee(.,data) %>% select(time) %>% dur() #I use select because I have many other columns in the data
The result in this case should be a data frame with 1036492 for 1, 967272 for 3, and 92475 for 4.
I also tried putting filteremployee(.,data) %>% select(time) %>% dur() inside mutate, but that did not work either.
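The function is built with ==, which compares elementwise, so it cannot take the whole vector of indexes at once. Here we probably need to loop over the indexes: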
library(dplyr)
library(purrr)
map_dbl(lista, ~ filternumber(.x, data) %>%
select(time) %>%
dur)
[1] 1036492 967272 92475 0
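To illustrate the elementwise point, here is a small base R sketch. The vectors `from_col` and `wanted` are made up for the illustration and are not taken from the question's data. It shows that == pairs values up position by position (recycling the shorter vector), whereas %in% tests membership, which is why the filter has to be applied to one index at a time.

```r
# Toy vectors, assumed only for this illustration
from_col <- c(1, 8, 3, 3)
wanted   <- c(1, 3)

# == recycles `wanted` to c(1, 3, 1, 3) and compares position by position
elementwise <- from_col == wanted

# %in% asks, for each element of from_col, whether it occurs anywhere in `wanted`
membership <- from_col %in% wanted

print(elementwise)
print(membership)
```

The two results differ at the third position: the value 3 is in `wanted`, but == compared it against the recycled 1.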
Perhaps you are looking for something like this:
library(purrr)
library(dplyr)
index <- c(1, 3, 4)
names(index) <- index
index %>%
map_dfr(~ df %>%  # df is the data frame from the question (named data there)
filter(from == .x | to == .x) %>%
summarize(result = dur(time)),
.id = "index")
This returns:
index result
1 1 1036492
2 3 967272
3 4 92475
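Since the question mentions that the cutoff would be a function argument and that other summary functions may be swapped in, the same idea can also be sketched in base R without extra packages. The helper name `summarise_by_index` and its arguments `limit` and `fun` are hypothetical, introduced only for this example. It uses `<=` as in the question's code, so index 5 is included.

```r
# Sample data from the question
data <- structure(list(
  from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 1),
  to   = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8),
  time = c(1521823032, 1521827196, 1521827196, 1522678358, 1522701516,
           1522701993, 1522702123, 1522769399, 1522780956, 1522794468,
           1522794468, 1522794468, 1522794468, 1522859524)),
  class = "data.frame", row.names = c(NA, -14L))

# Difference between the largest and smallest value, as in the question
dur <- function(x) max(x) - min(x)

# Hypothetical helper: for every index <= limit that occurs in from/to,
# apply `fun` to the time values of the rows that mention that index
summarise_by_index <- function(x, limit, fun = dur) {
  idx <- sort(unique(c(x$from, x$to)))
  idx <- idx[idx <= limit]
  res <- vapply(idx,
                function(i) fun(x$time[x$from == i | x$to == i]),
                numeric(1))
  data.frame(index = idx, result = res)
}

out <- summarise_by_index(data, limit = 5)
print(out)
```

Passing a different function through `fun` (anything that returns a single number) reuses the same loop. `vapply` with `numeric(1)` will raise an error if the function returns something else.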