Is there an R function to extract only numbers from a comma-separated string with many NA values to create a column with only the numbers?

I have a dataset that looks like this:

 before = data.frame(diag1 = c(1,NA, 1, NA, NA, 1), diag2 = c(NA, NA, NA, 2, NA, NA), diag3 = c(3, NA, NA, NA, 3, 3), diag4 = c(4, 4, NA, NA, 4, NA))

  diag1 diag2 diag3 diag4
1     1    NA     3     4
2    NA    NA    NA     4
3     1    NA    NA    NA
4    NA     2    NA    NA
5    NA    NA     3     4
6     1    NA     3    NA

I have been trying to find a solution whose end result is a new column named "diagnoses" that looks like this:

  diagnoses
1     1,3,4
2         4
3         1
4         2
5       3,4
6       1,3

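This is a much smaller example of my actual problem. The dataset I am working with has more than 70 diagnosis columns, with no more than 3 values per row. I have tried the strsplit, separate, and unite functions, but I have not found an elegant solution yet.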

I have used the paste function:

dat$diagnoses<- apply( (dat[ , cols]), 1, function(x) paste(na.omit(x),collapse=", ") )

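However, it produces a string containing many commas.

I tried using gsub to replace the commas, but I still did not get the result I was hoping for.

This is the output I was able to get: "1,,3,4,," ",,,4,," " 1,,,,," ",2,,,," ",,3,4,," "1,,3,,,"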

One option is to loop over the rows with apply, remove the NA elements, and paste the remaining values together:

before$new <- apply(before, 1, function(x) toString(x[!is.na(x)]))
before$new
#[1] "1, 3, 4" "4"       "1"       "2"       "3, 4"    "1, 3"   
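
The apply approach above can be adapted to reproduce the question's format exactly (commas without spaces) and to touch only the diagnosis columns. The following is a minimal sketch, not a definitive implementation: the variable `diag_cols` and the column-name pattern `^diag[0-9]+$` are assumptions made for illustration, so adjust the pattern to whatever the real 70+ columns are called.

```r
# Example data from the question
before <- data.frame(diag1 = c(1, NA, 1, NA, NA, 1),
                     diag2 = c(NA, NA, NA, 2, NA, NA),
                     diag3 = c(3, NA, NA, NA, 3, 3),
                     diag4 = c(4, 4, NA, NA, 4, NA))

# Assumption: every diagnosis column is named "diag" followed by digits
diag_cols <- grep("^diag[0-9]+$", names(before), value = TRUE)

# For each row, keep the non-NA values and join them with "," (no space)
before$diagnoses <- apply(before[diag_cols], 1, function(x) {
  paste(x[!is.na(x)], collapse = ",")
})

before$diagnoses
#[1] "1,3,4" "4"     "1"     "2"     "3,4"   "1,3"
```

Selecting the columns by name first means that other columns in the data frame (IDs, dates, and so on) are not pulled into the string. A row with no diagnoses at all ends up as an empty string "".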

Another possibility is:

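# Add a row identifier, reshape the diagnosis columns to long format with stack(),
# then paste the non-NA values back together per row
before$rowid <- 1:nrow(before)
aggregate(values ~ rowid, 
          FUN = paste0, collapse = ",",
          data = data.frame(before[5], stack(before[-5])))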

  rowid values
1     1  1,3,4
2     2      4
3     3      1
4     4      2
5     5    3,4
6     6    1,3

A third option is to paste the columns together and then strip the NA entries with gsub:

foo = function(..., sep = ","){
    paste(..., sep = sep)
}

gsub(",?NA|NA,?", "", do.call(foo, before))
#[1] "1,3,4" "4"     "1"     "2"     "3,4"   "1,3" 

I didn't know about toString, but borrowing it from @akrun and using the purrr package:

purrr::pmap_chr(before, ~toString(na.omit(c(...))))
# [1] "1, 3, 4" "4"       "1"       "2"       "3, 4"    "1, 3"
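
The output quoted in the question ("1,,3,4,," and so on) might mean that the missing diagnoses in the real data are stored as empty strings rather than as NA, in which case na.omit has nothing to remove. That is only a guess based on that output. Under that assumption, a minimal base R sketch is to convert blank cells to NA first and then collapse each row; the data frame `raw` below is hypothetical and stands in for the real data.

```r
# Hypothetical data: missing diagnoses stored as "" instead of NA
raw <- data.frame(diag1 = c("1", "", "1"),
                  diag2 = c("", "", ""),
                  diag3 = c("3", "", ""),
                  diag4 = c("4", "4", ""),
                  stringsAsFactors = FALSE)

# Turn empty or whitespace-only cells into NA, leaving existing NAs alone
raw[] <- lapply(raw, function(col) {
  col[!is.na(col) & trimws(col) == ""] <- NA
  col
})

# Collapse the remaining values of each row into one comma-separated string
raw$diagnoses <- apply(raw, 1, function(x) paste(x[!is.na(x)], collapse = ","))

raw$diagnoses
#[1] "1,3,4" "4"     "1"
```

Once the blanks are real NA values, any of the approaches above should work unchanged.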