How to return the row from a data frame based on the maximum value of the data frame in R?

Question

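Suppose I have a data frame like the one shown below. Most of the suggestions I found on Stack Overflow get the maximum value from one column and then return the row index. I would like to know whether there is a way to return the row index of the data frame by scanning two or more columns for the maximum value.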

To sum up, from the example below I would like to get the row:

11 building_footprint_sum 0.003 0.470

which holds the maximum value of the data frame.

+----+-------------------------+--------------------+-------------------+
| id |        plot_name        | rsquare_allotments | rsquare_block_dev |
+----+-------------------------+--------------------+-------------------+
|  6 | building_footprint_max  | 0.002              | 0.421             |
|  7 | building_footprint_mean | 0.002              | 0.354             |
|  8 | building_footprint_med  | 0.002              | 0.350             |
|  9 | building_footprint_min  | 0.002              | 0.278             |
| 10 | building_footprint_sd   | 0.003              | 0.052             |
| 11 | building_footprint_sum  | 0.003              | 0.470             |
+----+-------------------------+--------------------+-------------------+

Is there a fairly simple way to achieve this?

Answer 1

Try using pmax:

?pmax    
pmax and pmin take one or more vectors (or matrices) as arguments and
return a single vector giving the ‘parallel’ maxima (or minima) of the vectors.

I would suggest doing it in two steps:

# make a new column that compares column 3 and column 4 and returns the larger value
> df$new <- pmax(df$rsquare_allotments, df$rsquare_block_dev)

# look for the row, where the new variable has the largest value
> df[(df$new == max(df$new)), ][3:4]

Keep in mind that if the maximum value occurs more than once, the result will contain more than one row.
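
For reference, here is a minimal self-contained sketch of the two steps, using the values from the question. The names df and new follow the answer; building the data with data.frame() and storing the result in best are assumptions made only for illustration, and the whole row is kept rather than just columns 3 and 4.

```r
# Rebuild the example data from the question (assumed layout).
df <- data.frame(
  id = 6:11,
  plot_name = c("building_footprint_max", "building_footprint_mean",
                "building_footprint_med", "building_footprint_min",
                "building_footprint_sd", "building_footprint_sum"),
  rsquare_allotments = c(0.002, 0.002, 0.002, 0.002, 0.003, 0.003),
  rsquare_block_dev = c(0.421, 0.354, 0.350, 0.278, 0.052, 0.470),
  stringsAsFactors = FALSE
)

# Step 1: row-wise maximum of the two R-squared columns.
df$new <- pmax(df$rsquare_allotments, df$rsquare_block_dev)

# Step 2: keep every row whose row-wise maximum equals the overall maximum.
# Ties produce several rows, as noted in the answer.
best <- df[df$new == max(df$new), ]
print(best)
```

If the columns may contain missing values, both pmax() and max() accept na.rm = TRUE, which is probably what you want in that case.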

Answer 2

You are looking for the row index at which a matrix attains its maximum. You can do this using which() with the arr.ind=TRUE option:

> set.seed(1)
> foo <- matrix(rnorm(6),3,2)
> which(foo==max(foo),arr.ind=TRUE)
     row col
[1,]   1   2

So in this case you want row 1. (You can discard the col output.)

If you go this route, be careful with floating-point arithmetic and == (see R FAQ 7.31). It is safer to do this:

> which(foo>max(foo)-0.01,arr.ind=TRUE)
     row col
[1,]   1   2

Replace 0.01 with a suitably small value for your data.
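
Applied to the data frame from the question, the same idea might look like the sketch below. The names num_cols, m, tol, hits and result are hypothetical, and the tolerance 1e-8 is only an assumed value that happens to be far smaller than the gaps between the numbers in this example.

```r
# Rebuild the example data from the question (assumed layout).
df <- data.frame(
  id = 6:11,
  plot_name = c("building_footprint_max", "building_footprint_mean",
                "building_footprint_med", "building_footprint_min",
                "building_footprint_sd", "building_footprint_sum"),
  rsquare_allotments = c(0.002, 0.002, 0.002, 0.002, 0.003, 0.003),
  rsquare_block_dev = c(0.421, 0.354, 0.350, 0.278, 0.052, 0.470),
  stringsAsFactors = FALSE
)

# Treat only the numeric columns of interest as a matrix.
num_cols <- c("rsquare_allotments", "rsquare_block_dev")
m <- as.matrix(df[, num_cols])

# Locate every cell within a small tolerance of the overall maximum.
tol <- 1e-8  # assumed tolerance; choose one that suits your data
hits <- which(m > max(m) - tol, arr.ind = TRUE)

# Keep the row positions, drop the column positions, and index the data frame.
result <- df[unique(hits[, "row"]), ]
print(result)
```

Unlike the pmax approach, this does not add a helper column to df, and it extends to more than two columns by adding names to num_cols.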

