Apply a user-defined function that takes a function as an argument

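How do I use "apply" to call a user-defined function on every row of a data frame when that user-defined function itself takes a function as an argument?

Here is an example. Suppose I have a data frame with three columns, each containing integers. For each row, I want to take the smallest integer and convert it to the corresponding letter using a lookup dataset, and then do the same with the largest integer. The result should look like this: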

Col1 | Col2 | Col3 | MaxVal | MinVal
-----+------+------+--------+-------
   1 |    2 |    1 | B      | A
   4 |    6 |    1 | F      | A
   5 |    6 |    2 | F      | B

The code below fails with: Error in $<-.data.frame(*tmp*, "MaxVal", value = integer(0)) : replacement has 0 rows, data has 3

myData <- data.frame("Col1" = c(1, 4, 5), "Col2" = c(2, 6, 6), "Col3" = c(1, 1, 2))
numberToLetterData <- data.frame("Number" = 1:6, "Letter" = c("A", "B","C","D","E","F"))

GetMinOrMaxForRow <- function(x, refData, functionToUse){
    refData$Letter[refData$Number ==  functionToUse(x)]
}

myData$MinVal <- apply(myData, 1, FUN = function(x) GetMinOrMaxForRow(x = x, refData = numberToLetterData, functionToUse = min))
myData$MaxVal <- apply(myData, 1, FUN = function(x) GetMinOrMaxForRow(x = x, refData = numberToLetterData, functionToUse = max))

...but the following code (with the last two lines swapped) works fine:

myData <- data.frame("Col1" = c(1, 4, 5), "Col2" = c(2, 6, 6), "Col3" = c(1, 1, 2))
numberToLetterData <- data.frame("Number" = 1:6, "Letter" = c("A", "B","C","D","E","F"))

GetMinOrMaxForRow <- function(x, refData, functionToUse){
    refData$Letter[refData$Number ==  functionToUse(x)]
}

myData$MaxVal <- apply(myData, 1, FUN = function(x) GetMinOrMaxForRow(x = x, refData = numberToLetterData, functionToUse = max))
myData$MinVal <- apply(myData, 1, FUN = function(x) GetMinOrMaxForRow(x = x, refData = numberToLetterData, functionToUse = min))

...does anyone know why?

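After the first line runs, myData$MinVal has been assigned. The next line then computes the maximum over the complete row of the data frame, including the new MinVal column. Because that column holds letters, apply() coerces each row to character, so max() returns a letter rather than a number, nothing in refData$Number matches it, and every row yields a zero-length result; hence "replacement has 0 rows". In the swapped order, min() over the row still returns a digit, since digits typically sort before letters, which is why that version happens to work.

So apply the function only to the original columns, i.e. myData[1:3], rather than to every column.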

myData <- data.frame("Col1" = c(1, 4, 5), "Col2" = c(2, 6, 6), "Col3" = c(1, 1, 2))
numberToLetterData <- data.frame("Number" = 1:6, "Letter" = c("A", "B","C","D","E","F"))

GetMinOrMaxForRow <- function(x, refData, functionToUse){
    refData$Letter[refData$Number ==  functionToUse(x)]
}

myData$MinVal <- apply(myData[,1:3], 1, FUN = function(x) GetMinOrMaxForRow(x = x, refData = numberToLetterData, functionToUse = min))
myData$MaxVal <- apply(myData[,1:3], 1, FUN = function(x) GetMinOrMaxForRow(x = x, refData = numberToLetterData, functionToUse = max))
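To see why the original ordering fails, it can help to look at what apply() actually receives once a letter column exists. The following is a minimal sketch, not part of the original code: the MinVal values are filled in by hand to mimic the first apply() call, and `m` and `row1` are names introduced only for illustration.

```r
myData <- data.frame(Col1 = c(1, 4, 5), Col2 = c(2, 6, 6), Col3 = c(1, 1, 2))
numberToLetterData <- data.frame(Number = 1:6, Letter = LETTERS[1:6])

# Mimic the state after the first apply() call has added the letter column.
myData$MinVal <- c("A", "A", "B")

# apply() converts the data frame to a matrix first; a single non-numeric
# column turns the whole matrix into character.
m <- as.matrix(myData)
is.character(m)

# max() on a character row compares strings, so it picks the letter
# (in typical collations, where digits sort before letters).
row1 <- m[1, ]
max(row1)

# No Number equals that letter, so the lookup comes back empty.
numberToLetterData$Letter[numberToLetterData$Number == max(row1)]
```

Restricting the call to myData[, 1:3] keeps the matrix numeric, so min and max behave as intended regardless of which result column is added first.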

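With dplyr you can do:

library(dplyr)
lookup <- setNames(LETTERS[1:6], 1:6)  # named vector: "1" -> "A", ..., "6" -> "F"

myData %>% 
  rowwise() %>% 
  mutate(minVal = lookup[min(Col1, Col2, Col3)],
         maxVal = lookup[max(Col1, Col2, Col3)])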

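Or in two steps, computing the function first and then doing the lookup:

myData %>% 
  rowwise() %>% 
  mutate(minVal = min(Col1, Col2, Col3),
         maxVal = max(Col1, Col2, Col3)) %>% 
  # mutate_at() is superseded by across() in dplyr >= 1.0.0 but still works
  mutate_at(vars(minVal, maxVal), function(x) lookup[x])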

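With purrr's by_row() (moved to the purrrlyr package in later releases) you can do:

library(purrrlyr)  # by_row() lives here in current releases
library(magrittr)  # provides %>%
lookup <- setNames(LETTERS[1:6], 1:6)
myData %>% 
  by_row(~lookup[min(.[1:3])], .collate = "cols", .to = "minVal") %>% 
  by_row(~lookup[max(.[1:3])], .collate = "cols", .to = "maxVal")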
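For comparison, here is a base R sketch that keeps the "function as an argument" design from the question but selects the value columns by name, so adding result columns can no longer change the input. The helper `rowLetter` and the vector `valueCols` are hypothetical names introduced for this illustration; `lookup` is the same kind of named vector used above.

```r
myData <- data.frame(Col1 = c(1, 4, 5), Col2 = c(2, 6, 6), Col3 = c(1, 1, 2))
lookup <- setNames(LETTERS[1:6], 1:6)

# Hypothetical helper: summarise each row of the chosen columns with `f`,
# then translate the number to a letter by name via `lookup`.
rowLetter <- function(data, cols, f, lookup) {
  values <- apply(as.matrix(data[cols]), 1, f)
  unname(lookup[as.character(values)])
}

valueCols <- c("Col1", "Col2", "Col3")
myData$MinVal <- rowLetter(myData, valueCols, min, lookup)
myData$MaxVal <- rowLetter(myData, valueCols, max, lookup)
myData
```

Looking values up by name (as.character(values)) rather than by position means the lookup would still work if the numbers did not start at 1 or had gaps.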