Get another column's value by another column's maximum in R

Question

I have a data frame that looks like this.

+---+------+------+------+------+
|   | Name | col1 | col2 | col3 |
+---+------+------+------+------+
| 1 |   A  |  10  |  0   |  0   |
| 2 |   B  |  5   |  20  |  5   |
| 3 |   C  |  15  |  15  |  20  |
| 4 |   D  |  20  |  5   |  15  |
| 5 |   F  |  0   |  10  |  15  |
+---+------+------+------+------+

For each column, I want the Name of the row that holds that column's maximum value. The expected output looks like this:

+---+------+------+
|   |  col |  MAX |
+---+------+------+
| 1 | col1 |   D  |
| 2 | col2 |   B  |
| 3 | col3 |   C  |
+---+------+------+

How can I code this?

Answer 1

data.table

library(data.table)
setDT(df)

df2 = melt(df, id.vars="Name", variable.name="col")
df2 = df2[, .SD[which.max(value)], by = col][, c("col", "Name")]
names(df2)[2] = "MAX"

Output:

df2

    col MAX
1: col1   D
2: col2   B
3: col3   C

dplyr

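library(dplyr)
library(tidyr)  # gather() comes from tidyr, not dplyr

df2 = df %>% 
  gather(key="col", value="Value", 2:4) %>% 
  group_by(col) %>%        # without grouping, top_n() only keeps the overall maximum
  top_n(1, Value) %>%
  ungroup() %>%
  rename_at(1, ~"MAX") %>% 
  select(c("col", "MAX")) %>%
  as.data.frame()          # print as a plain data frame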

Output:

df2

  col MAX
1 col1   D
2 col2   B
3 col3   C
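
As a side note, gather() and top_n() are superseded in current tidyr/dplyr releases in favor of pivot_longer() and slice_max(). The following is a minimal sketch of the same idea with the newer verbs, assuming dplyr >= 1.0.0 and tidyr >= 1.0.0. The data frame is rebuilt inline (under the name df1, an assumption for this snippet) so that the example is self-contained, and `res` is just an illustrative name.

```r
library(dplyr)
library(tidyr)

# Same values as the data frame in the question
df1 <- data.frame(
  Name = c("A", "B", "C", "D", "F"),
  col1 = c(10L, 5L, 15L, 20L, 0L),
  col2 = c(0L, 20L, 15L, 5L, 10L),
  col3 = c(0L, 5L, 20L, 15L, 15L)
)

res <- df1 %>%
  # Long format: one row per (Name, col) pair
  pivot_longer(-Name, names_to = "col", values_to = "Value") %>%
  group_by(col) %>%
  # Keep exactly one row per column: the one with the largest Value
  slice_max(Value, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(col, MAX = Name)

print(res)
```

With `with_ties = FALSE` only the first maximum per column is kept; dropping that argument would return every tied row instead.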

Base R

This can probably be done more simply or more elegantly (update: see akrun's base R solution, which is better)...

df2 = reshape(df, direction="long", varying=2:4, v.names="value")
df2 = df2[order(-df2$value), ]
df2 = df2[!duplicated(df2$time), c("time", "Name")]
names(df2) = c("col", "MAX")
df2$col = paste0("col", df2$col)
rownames(df2) = NULL

Output:

df2

   col MAX
1 col1   D
2 col2   B
3 col3   C

Answer 2

In base R, we can do

stack(sapply(df1[-1], \(x) df1$Name[which.max(x)]))[2:1]
   ind values
1 col1      D
2 col2      B
3 col3      C
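
The one-liner above packs several steps together. A rough unpacked sketch of the same approach follows; the intermediate names `max_names` and `res` are illustrative only, the final renaming to col/MAX is added here to match the output requested in the question, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Same values as df1 in the Data section below
df1 <- data.frame(
  Name = c("A", "B", "C", "D", "F"),
  col1 = c(10L, 5L, 15L, 20L, 0L),
  col2 = c(0L, 20L, 15L, 5L, 10L),
  col3 = c(0L, 5L, 20L, 15L, 15L)
)

# Step 1: for every column except Name, find the row index of the maximum
# and look up the Name in that row. The result is a named character vector.
max_names <- sapply(df1[-1], function(x) df1$Name[which.max(x)])

# Step 2: stack() converts the named vector into a two-column data frame
# (values, ind); [2:1] swaps the columns so the column label comes first.
res <- stack(max_names)[2:1]

# Step 3: rename to the headers requested in the question
names(res) <- c("col", "MAX")

print(res)
```

Note that which.max() returns only the first maximum, so a column with tied maxima yields a single Name. If all tied rows are wanted, `which(x == max(x))` could be used instead, at the cost of a result that is no longer one value per column.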

Data

df1 <- structure(list(Name = c("A", "B", "C", "D", "F"), col1 = c(10L, 
5L, 15L, 20L, 0L), col2 = c(0L, 20L, 15L, 5L, 10L), col3 = c(0L, 
5L, 20L, 15L, 15L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))
