Using tapply(dataframe, index, function) in R with a function that takes 2 columns as its argument

Question

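I want to use tapply() on a data frame, grouping the rows by an index. My problem is that the argument I want to pass to the function is not a single column but a pair of columns, because 2 columns of the data frame represent x-y points and belong together. Running tapply(dataframe, indexes, function) gives an error saying that the length of indexes does not match the length of the data passed to tapply. How can I solve this? Thanks!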

Answer 1

If there is more than one column to summarise, use aggregate instead of tapply (tapply works on a single column).

aggregate(.~ indexes, transform(df1, indexes = indexes), FUN = yourfun)

Another option is by

by(df1, list(indexes), FUN = yourfun)
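Because by() hands each group's rows to FUN as a data frame, the function can read both columns of a group together, whereas aggregate() applies FUN to each column separately. A minimal sketch, assuming hypothetical x-y data and a hypothetical function path_length that stands in for yourfun:

```r
# Hypothetical example data: five x-y points and a grouping vector
df1 <- data.frame(x = c(0, 3, 6, 1, 1), y = c(0, 4, 8, 1, 5))
indexes <- rep(1:2, c(3, 2))

# Hypothetical paired-column function: length of the path through the points.
# It receives the rows of one group as a data frame, so x and y stay paired.
path_length <- function(d) sum(sqrt(diff(d$x)^2 + diff(d$y)^2))

res <- by(df1, indexes, FUN = path_length)

# Drop the "by" class to get a plain numeric vector, one value per group
path_by_group <- as.numeric(res)
```

Here path_by_group holds one path length per group, in the order of the sorted group labels.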

Or with the tidyverse, which may be more flexible

library(dplyr)
df1 %>%
    group_by(indexes) %>%
    summarise(across(c(x, y), yourfun), .groups = 'drop')
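Note that across() still calls the function once per column. If the function needs x and y together, one way is to pass both columns to it directly inside summarise(). A sketch under the assumption that the grouping vector is stored as a column named indexes and that pair_fun is a hypothetical two-argument function:

```r
library(dplyr)

# Hypothetical example data with the grouping vector stored as a column
df1 <- data.frame(x = c(0, 3, 6, 1, 1),
                  y = c(0, 4, 8, 1, 5),
                  indexes = rep(1:2, c(3, 2)))

# Hypothetical function of two paired vectors: length of the path through the points
pair_fun <- function(x, y) sum(sqrt(diff(x)^2 + diff(y)^2))

# Within each group, x and y are the group's own values, so they stay paired
out <- df1 %>%
    group_by(indexes) %>%
    summarise(path = pair_fun(x, y), .groups = 'drop')
```

The result has one row per group, with the value returned by pair_fun in the path column.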

Using a small reproducible example

indexes = rep(1:2, c(3, 2))
by(mtcars[1:5, 1:5], indexes, FUN = sum)
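The error in the question arises because the length of a data frame is its number of columns, not its number of rows, so it cannot match the length of indexes. If tapply() itself is preferred, one workaround is to run it over the row numbers and index the columns inside the function. The sketch below assumes the wt and mpg columns of mtcars as the paired columns and uses weighted.mean purely as an illustrative two-column function:

```r
# Two paired columns from the first five rows of mtcars
df1 <- mtcars[1:5, c("wt", "mpg")]
indexes <- rep(1:2, c(3, 2))

# length() of a data frame counts columns, which is why tapply(df1, indexes, ...) fails
n_cols <- length(df1)   # 2 columns
n_rows <- nrow(df1)     # 5 rows, the same as length(indexes)

# tapply() over row numbers: i holds the row numbers belonging to one group
res <- tapply(seq_len(nrow(df1)), indexes,
              function(i) weighted.mean(df1$mpg[i], df1$wt[i]))
```

The same pattern works for any function of two vectors: replace weighted.mean with the function that should receive the paired values.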

r

tapply