How to select the first and last test without NA in R

Question

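I asked a similar question yesterday.

But the problem turned out to be harder, because there are NAs in the first column.

My data frame looks like this: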

Person  W.1   W.2   W.3   W.4   W.5   
1       NA    57    52    59    NA
2       49    NA    60    61    NA
3       34    79    NA    58    NA

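Is there a way to select the first and last tests that are not NA? I have 300 data entries; W.1 is the first test, W.2 the second, and W.n the nth. I want to compare the first test score with the last test score, and also the first test score with the highest score. The desired output should look like this: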

1    57 59
2    49 61
3    34 58

The other one should look like this:

1    52 59
2    49 61
3    34 79

But the NAs are in different places for different people. Can anyone help me?

Thanks!

Answer 1

I would convert the data to long format and then use dplyr:

library(tidyr)
library(dplyr)
dat.long = gather(dat, key = test, value = score, W.1:W.5)
head(dat.long)
#  Person test score
# 1      1  W.1    NA
# 2      2  W.1    49
# 3      3  W.1    34
# 4      1  W.2    57
# 5      2  W.2    NA
# 6      3  W.2    79

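# Drop missing scores, turn "W.3" into the test number 3,
# then take the first, last and highest score per person.
# (grep() would return match positions, not the test numbers.)
dat.long %>% na.omit %>%
    mutate(test = as.integer(gsub("[^0-9]", "", test))) %>%
    group_by(Person) %>%
    summarize(first = score[which.min(test)],
              last = score[which.max(test)],
              max = max(score))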

#    Person first last max
#  1      1    57   59  59
#  2      2    49   61  61
#  3      3    34   58  79

This combines both of your desired outputs.
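
In recent tidyr releases `gather()` is superseded by `pivot_longer()`. The following is a minimal sketch of the same idea with that function, assuming tidyr 1.1 or later (for `names_transform`) and a data frame named `dat` rebuilt here from the question's sample; it is one possible variant, not the answer's original code.

```r
library(tidyr)
library(dplyr)

# Sample data from the question; W.5 is entirely NA
dat <- data.frame(
  Person = 1:3,
  W.1 = c(NA, 49, 34),
  W.2 = c(57, NA, 79),
  W.3 = c(52, 60, NA),
  W.4 = c(59, 61, 58),
  W.5 = c(NA_real_, NA_real_, NA_real_)
)

res <- dat %>%
  pivot_longer(
    cols = starts_with("W."),
    names_to = "test",
    names_prefix = "W\\.",                      # strip the "W." prefix
    names_transform = list(test = as.integer),  # "3" -> 3L
    values_to = "score",
    values_drop_na = TRUE                       # drop missing scores while reshaping
  ) %>%
  group_by(Person) %>%
  summarize(first = score[which.min(test)],
            last  = score[which.max(test)],
            max   = max(score))
res
```

Because the test number is parsed into an integer during the reshape, `which.min(test)` and `which.max(test)` pick the earliest and latest test that actually has a score, regardless of row order.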

Answer 2

Here is a vectorized approach in base R:

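# (row, column) index pairs of the first and last non-NA score in each row;
# + 1L shifts the column index past the Person column
start <- cbind(seq_len(nrow(df)), max.col(!is.na(df[-1L]), ties.method = "first") + 1L)
end <- cbind(seq_len(nrow(df)), max.col(!is.na(df[-1L]), ties.method = "last") + 1L)
# Row-wise maximum across the test columns, ignoring NAs
maxval <- do.call(pmax, c(df[-1L], na.rm = TRUE))
cbind(df[1L], start = df[start], end = df[end], maxvalue = maxval)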
#   Person start end maxvalue
# 1      1    57  59       59
# 2      2    49  61       61
# 3      3    34  58       79
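
To see why `max.col()` finds the right columns, it may help to look at the intermediate values. The sketch below rebuilds the question's sample as `df` (the name used above) and inspects the logical matrix; the variable names `present`, `first_col` and `last_col` are only illustrative.

```r
# Sample data from the question; W.5 is entirely NA
df <- data.frame(
  Person = 1:3,
  W.1 = c(NA, 49, 34),
  W.2 = c(57, NA, 79),
  W.3 = c(52, 60, NA),
  W.4 = c(59, 61, 58),
  W.5 = c(NA_real_, NA_real_, NA_real_)
)

# TRUE where a score is present (Person column dropped)
present <- !is.na(df[-1L])

# Every TRUE is a tied row maximum, so the tie-breaking rule decides
# whether the first or the last non-NA column is returned
first_col <- max.col(present, ties.method = "first")
last_col  <- max.col(present, ties.method = "last")
first_col  # 2 1 1
last_col   # 4 4 4

# Matrix indexing with (row, column) pairs pulls out the scores
scores <- as.matrix(df[-1L])
first_score <- scores[cbind(seq_len(nrow(df)), first_col)]
last_score  <- scores[cbind(seq_len(nrow(df)), last_col)]
```

One thing to keep in mind: for a row with no scores at all, every column ties at FALSE, so the selected cell is itself NA.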

Or you could adapt @Marat's solution from your previous question, like this:

t(apply(df[-1], 1, function(x) c(x[range(which(!is.na(x)))], max(x, na.rm = TRUE))))
#      [,1] [,2] [,3]
# [1,]   57   59   59
# [2,]   49   61   61
# [3,]   34   58   79
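
The `apply()` one-liner returns an unnamed matrix, and `range()`/`max()` warn on a row that has no scores at all. A possible variant, sketched here with a hypothetical helper `first_last_max()` (not part of the original answer), names the columns and returns NA for such rows:

```r
# Hypothetical helper: first, last and maximum non-NA value of one row
first_last_max <- function(x) {
  idx <- which(!is.na(x))
  if (length(idx) == 0L) {
    # No scores at all: return NA instead of -Inf plus a warning
    return(c(first = NA_real_, last = NA_real_, max = NA_real_))
  }
  # [[ ]] drops the element names so the result names stay first/last/max
  c(first = x[[idx[1L]]], last = x[[idx[length(idx)]]], max = max(x[idx]))
}

# Sample data from the question; W.5 is entirely NA
df <- data.frame(
  Person = 1:3,
  W.1 = c(NA, 49, 34),
  W.2 = c(57, NA, 79),
  W.3 = c(52, 60, NA),
  W.4 = c(59, 61, 58),
  W.5 = c(NA_real_, NA_real_, NA_real_)
)

res <- t(apply(df[-1L], 1, first_last_max))
cbind(df[1L], res)
```

This is still a row-by-row loop under the hood, so for 300 rows it is fine, while the `max.col()` version is likely to scale better on much larger data.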

Answer 3

Here is a solution using data.table:

require(data.table)
require(reshape2)
melt(dt, id=1L, na.rm=TRUE)[, .(first=value[1L], 
         last=value[.N], max=max(value)), keyby=Person]
#    Person first last max
# 1:      1    57   59  59
# 2:      2    49   61  61
# 3:      3    34   58  79

You can safely ignore the warning. You only need reshape2 if you are using data.table version <= 1.9.4.
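
For completeness, here is a self-contained sketch that relies on data.table alone, assuming a version newer than 1.9.4 where `melt()` has a data.table method. The object name `dt` follows the answer; the data is rebuilt from the question's sample, and the intermediate name `long` is only illustrative.

```r
library(data.table)

# Sample data from the question; W.5 is entirely NA
dt <- data.table(
  Person = 1:3,
  W.1 = c(NA, 49, 34),
  W.2 = c(57, NA, 79),
  W.3 = c(52, 60, NA),
  W.4 = c(59, 61, 58),
  W.5 = c(NA_real_, NA_real_, NA_real_)
)

# Long format, one row per (Person, test), missing scores dropped
long <- melt(dt, id.vars = "Person", na.rm = TRUE)

# melt() stacks the test columns in their original order, so within each
# Person the first and last rows are the first and last available tests
res <- long[, .(first = value[1L], last = value[.N], max = max(value)),
            keyby = Person]
res
```

The first/last logic depends on the test columns already being in chronological order in `dt`; if they are not, the rows would need to be ordered by test number before grouping.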
