Why does melt return an NA column in R?

I have the following data frame df in R:

df <- structure(list(disease = structure(c(1L, 1L), .Label = "Barcelona", class = "factor"), 
    `<18` = structure(list(0.193103448275862, 
        0.0445344129554656), .Names = c(NA_character_, NA_character_
    )), `19-25` = structure(list(0.0413793103448276, 
        0.345748987854251), .Names = c(NA_character_, NA_character_
    )), `26-64` = structure(list(0.448275862068966, 0.167611336032389), .Names = c(NA_character_, 
    NA_character_)), `46-64` = structure(list(0.0344827586206897, 
        0.00647773279352227), .Names = c(NA_character_, NA_character_
    )), `>65` = structure(list(0.282758620689655, 
        0.435627530364373), .Names = c(NA_character_, NA_character_
    )), type = structure(1:2, .Label = c("Clinical Trial", "Real-World"
    ), class = "factor")), class = "data.frame", row.names = c(NA, 
-2L))

I want to reshape the data frame with melt so that I get one value per city, flat type, and age group. However, the output contains an extra column:

melt(df)
           city           type           variable      value          NA
1  Barcelona       flat                  <18           0.19310345 0.044534413
2  Barcelona       house                 <18           0.19310345 0.044534413
3  Barcelona       flat                  19 - 25       0.04137931 0.345748988
4  Barcelona       house                 19 - 25       0.04137931 0.345748988
5  Barcelona       flat                  26 - 45       0.44827586 0.167611336
6  Barcelona       house                 26 - 45       0.44827586 0.167611336
7  Barcelona       flat                  46 - 64       0.03448276 0.006477733
8  Barcelona       house                 46 - 64       0.03448276 0.006477733
9  Barcelona       flat                  > 65          0.28275862 0.435627530
10 Barcelona       house                 > 65          0.28275862 0.435627530

Is there a way to avoid the NA column and get each individual value in the value column instead?

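The problem is that your measure columns are of class list, not numeric. If we convert them to numeric, melt works as expected. (I show one way to do this below, but it would be better to do it earlier in your workflow and prevent the columns from being created as lists in the first place. That is definitely what you should do if my code works on your sample data but runs into problems on the larger data. tidyr::unnest may be helpful in that case.)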

sapply(df, class)
#  disease      <18    19-25    26-64    46-64      >65     type 
# "factor"   "list"   "list"   "list"   "list"   "list" "factor" 

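# Identify which columns are stored as lists
list_cols = sapply(df, is.list)

# Flatten each list column into an atomic vector (numeric here)
df[list_cols] = lapply(df[list_cols], unlist)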

reshape2::melt(df, id.vars = c("disease", "type"))
#      disease           type variable       value
# 1  Barcelona Clinical Trial      <18 0.193103448
# 2  Barcelona     Real-World      <18 0.044534413
# 3  Barcelona Clinical Trial    19-25 0.041379310
# 4  Barcelona     Real-World    19-25 0.345748988
# 5  Barcelona Clinical Trial    26-64 0.448275862
# 6  Barcelona     Real-World    26-64 0.167611336
# 7  Barcelona Clinical Trial    46-64 0.034482759
# 8  Barcelona     Real-World    46-64 0.006477733
# 9  Barcelona Clinical Trial      >65 0.282758621
# 10 Barcelona     Real-World      >65 0.435627530
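
As a hedged follow-up to the caveat about larger data: unlist only gives a clean one-to-one replacement when every cell of a list column holds exactly one value. The sketch below uses base R only and a small made-up data frame (the values and the two-column layout are illustrative assumptions, not the asker's real data) to check that condition before flattening.

```r
# Illustrative data only: one id column plus one list column,
# mimicking the structure of the data frame in the question.
df <- data.frame(type = c("Clinical Trial", "Real-World"))
df$`<18` <- list(0.193, 0.045)

# Find the list columns; vapply() guarantees a logical vector back.
list_cols <- vapply(df, is.list, logical(1))

# unlist() is only a safe drop-in when each cell holds exactly one value;
# otherwise the flattened vector would not match the number of rows.
all_scalar <- vapply(
  df[list_cols],
  function(col) all(lengths(col) == 1L),
  logical(1)
)
stopifnot(all(all_scalar))

# Flatten, dropping any element names carried by the list cells.
df[list_cols] <- lapply(df[list_cols], unlist, use.names = FALSE)
```

If the stopifnot check fails on the full data, some cells hold zero or several values, and that is the point at which fixing the upstream step (or something like tidyr::unnest) is likely the better route.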