How can I find the 2 highest values for each row, then add them together?

In the table below, how can I find the two largest values in each row and then add them together?

I have the attached table in RStudio. Is there a one-liner I can use to add up the two largest numbers in each row, so that I can apply it to a larger dataset?

You can do a row-wise computation, sorting the values in the specified columns and summing the two largest:

library(dplyr)

df <- data.frame(Mon = c(12,15,42,43,56,73,23),
                 Tues = c(15,14,12,75,98,79,68),
                 Wed = c(13,42,35,64,35,95,56),
                 Thur = c(23,46,32,94,78,68,35),
                 Friday = c(25,23,64,35,27,54,32))



df %>% 
  rowwise() %>% 
  mutate(two_max = sum(sort(c(Mon, Tues, Wed, Thur, Friday), decreasing = TRUE)[1:2])) %>% 
  ungroup()

If you don't want to specify the column names by hand, you can also select all numeric columns at once:


df %>% 
  rowwise() %>% 
  mutate(two_max = sum(sort(c_across(where(is.numeric)), decreasing = TRUE)[1:2])) %>% 
  ungroup()

Both approaches give the same result:


# A tibble: 7 x 6
    Mon  Tues   Wed  Thur Friday two_max
  <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>
1    12    15    13    23     25      48
2    15    14    42    46     23      88
3    42    12    35    32     64     106
4    43    75    64    94     35     169
5    56    98    35    78     27     176
6    73    79    95    68     54     174
7    23    68    56    35     32     124
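
For a larger dataset it can help to wrap the sort-and-sum step in a small function. The sketch below is base R only; `sum_top_n` and its `n` argument are hypothetical names, not part of dplyr. It assumes that missing values should be ignored and that a row with fewer than `n` usable values should return the sum of whatever is left, which differs from `[1:2]` indexing (that yields `NA` in this case).

```r
# Hypothetical helper: sum of the n largest values in a numeric vector.
sum_top_n <- function(x, n = 2) {
  x <- sort(x, decreasing = TRUE)  # sort() drops NA values by default
  sum(head(x, n))                  # head() returns fewer than n if x is short
}

first_row <- c(Mon = 12, Tues = 15, Wed = 13, Thur = 23, Friday = 25)
sum_top_n(first_row)         # 48, matches row 1 above
sum_top_n(first_row, n = 3)  # 63 = 25 + 23 + 15

# Behaviour when only one value is usable:
sum_top_n(c(5, NA, NA))                          # 5
sum(sort(c(5, NA, NA), decreasing = TRUE)[1:2])  # NA
```

Inside the `rowwise()` pipeline this would be called as `mutate(two_max = sum_top_n(c_across(where(is.numeric))))`.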

An example using apply() with mtcars:

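# apply() returns a 2 x nrow(mtcars) matrix: one column per car
top2 <- apply(mtcars, 1, function(x) sort(x, decreasing = TRUE)[1:2])
# reshape so that each row holds the two largest values of one car
top2 <- matrix(top2, ncol = 2, byrow = TRUE)
addem <- rowSums(top2)
top2plus <- cbind(mtcars, addem)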
head(top2plus, 5)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb addem
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   270
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   270
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   201
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   368
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   535
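
The reshaping step is only needed because `apply()` returns the two values per row as the columns of a 2 x 32 matrix. As a possible shortcut (a sketch, not the answer's original code), the sum can be taken inside the function so that `apply()` returns one number per row:

```r
# Sum the two largest values of each row directly; mtcars ships with R.
addem <- apply(mtcars, 1, function(x) sum(sort(x, decreasing = TRUE)[1:2]))
top2plus <- cbind(mtcars, addem)
head(top2plus[, c("disp", "hp", "addem")], 5)
```

For the first five cars the two largest values are `disp` and `hp`, so `addem` is their sum.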

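One way to do this is to reshape the data frame into tidy form (i.e., with the day as one variable and the value as another) and then use dplyr verbs to reduce the data to a summary: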

df <- data.frame(ID = LETTERS[1:7],
                 Mon = c(12, 15, 42, 43, 56, 73, 23),
                 Tues = c(15, 14, 12, 75, 98, 79, 68),
                 Wed = c(13, 42, 35, 64, 35, 95, 56),
                 Thur = c(23, 46, 32, 94, 78, 68, 35),
                 Friday = c(25, 23, 64, 35, 27, 54, 32))

library(dplyr)
library(tidyr)
df %>% 
  pivot_longer(-ID) %>%                            # reshape to long form: one row per ID and day
  group_by(ID) %>%                                 # work within each ID
  slice_max(value, n = 2, with_ties = FALSE) %>%   # keep exactly the two largest values
  summarise(max2 = sum(value))                     # sum them
#> # A tibble: 7 x 2
#>   ID     max2
#>   <chr> <dbl>
#> 1 A        48
#> 2 B        88
#> 3 C       106
#> 4 D       169
#> 5 E       176
#> 6 F       174
#> 7 G       124

Created on 2020-12-01 by the reprex package (v0.3.0)
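
One caveat with `slice_max()`: by default it keeps ties (`with_ties = TRUE`), so a group whose second-largest value occurs more than once keeps more than two rows. The sketch below uses made-up data (`scores` is a hypothetical data frame) to show the difference; it assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Made-up data: the second-largest value (7) appears twice.
scores <- data.frame(ID = c("A", "A", "A"), value = c(10, 7, 7))

with_default <- scores %>%
  group_by(ID) %>%
  slice_max(value, n = 2) %>%                      # keeps 10, 7, 7
  summarise(max2 = sum(value))

exactly_two <- scores %>%
  group_by(ID) %>%
  slice_max(value, n = 2, with_ties = FALSE) %>%   # keeps 10, 7
  summarise(max2 = sum(value))

with_default$max2  # 24
exactly_two$max2   # 17
```

The sample data in this thread has no such ties, so both settings give the same sums there.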

If you prefer a one-liner, use this:

df <- data.frame(Mon = c(12,15,42,43,56,73,23),
                 Tues = c(15,14,12,75,98,79,68),
                 Wed = c(13,42,35,64,35,95,56),
                 Thur = c(23,46,32,94,78,68,35),
                 Friday = c(25,23,64,35,27,54,32))

sol <- apply(df, 1, function(x) sum(max(x), max(x[-which(x == max(x))[1]])))

Note that there are more efficient approaches. Let me know if you'd like to hear about them.
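
The answer does not say which approaches it has in mind. One candidate, offered here as an assumption rather than the author's method, avoids calling an R function once per row by using `max.col()` and matrix indexing. It assumes all columns are numeric, there are at least two of them, and there are no missing values.

```r
df <- data.frame(Mon = c(12, 15, 42, 43, 56, 73, 23),
                 Tues = c(15, 14, 12, 75, 98, 79, 68),
                 Wed = c(13, 42, 35, 64, 35, 95, 56),
                 Thur = c(23, 46, 32, 94, 78, 68, 35),
                 Friday = c(25, 23, 64, 35, 27, 54, 32))

m <- as.matrix(df)
rows <- seq_len(nrow(m))

# Position of the (first) largest value in every row
top_idx <- cbind(rows, max.col(m, ties.method = "first"))
largest <- m[top_idx]

# Mask that single cell, then take the row maximum again
m[top_idx] <- -Inf
second <- m[cbind(rows, max.col(m, ties.method = "first"))]

two_max <- unname(largest + second)
two_max
```

Because only the first occurrence of the maximum is masked, a row whose largest value appears twice still counts it twice, as the one-liner above does.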