Counting the number of zeros within a "melted" data frame

Question

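Hi, I'm learning R and I'm trying to count how many zeros there are in a melted data frame. Specifically, I want to know how many zeros belong to each of the original columns (b and c in the example below) and print both results. Here is a reproducible example: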

library(reshape)
library(plyr)
library(dplyr)
id = c(1,2,3,4,5,6,7,8,9,10)
b = c(0,0,5,6,3,7,2,8,1,8)
c = c(0,4,9,87,0,87,0,4,5,0)
test = data.frame(id,b,c)
test_melt = melt(test, id.vars = "id")
test_melt

I think I should write an if statement for this, something like if (test_melt$value == 0){print()}, but how do I tell R to count the zeros for each of the melted columns?

Answer 1

sum(test_melt$value==0)

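That should do it. The comparison returns a logical vector and sum() counts the TRUE values, so this gives the total number of zeros across both variables.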
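To make that concrete, here is a minimal sketch in base R. It rebuilds the melted data by hand so it runs without reshape (an assumption: the layout mirrors what melt() returns for the question's data), and the names c_col, is_zero, total_zeros and zeros_by_variable are only illustrative. It also shows why an if statement is not the right tool here: if() expects a single TRUE/FALSE, whereas value == 0 yields one result per row.

```r
# Rebuild the melted data from the question without reshape
# (assumption: same columns as the melt() output: id, variable, value)
b     <- c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8)
c_col <- c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
test_melt <- data.frame(
  id       = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value    = c(b, c_col)
)

# One TRUE/FALSE per row, not a single condition
is_zero <- test_melt$value == 0

# TRUE counts as 1 and FALSE as 0, so sum() counts the zeros
total_zeros <- sum(is_zero)

# The same idea, split by the variable column
zeros_by_variable <- tapply(is_zero, test_melt$variable, sum)

print(total_zeros)        # 6
print(zeros_by_variable)  # b = 2, c = 4
```

tapply() is just one way to split the count by group; the other answers here show dplyr and aggregate() versions of the same thing.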

Answer 2

With your data, using dplyr:

test_melt %>%
  group_by(variable) %>%
  summarize(zeroes = sum(value == 0))
# # A tibble: 2 x 2
#   variable zeroes
#     <fctr>  <int>
# 1        b      2
# 2        c      4

Base R:

aggregate(test_melt$value, by = list(variable = test_melt$variable),
          FUN = function(x) sum(x == 0))
#   variable x
# 1        b 2
# 2        c 4
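Another base R option, offered as a sketch rather than a recommendation, is to cross-tabulate variable against the zero test with table(). The data frame is rebuilt by hand here so the snippet is self-contained (assumed to match the melt() output), and zero_table and zeros_per_variable are illustrative names. Wrapping the test in factor() with both levels is a defensive choice so that the "TRUE" column exists even when the data contain no zeros at all.

```r
# Hand-built copy of the melted data (assumed layout: id, variable, value)
test_melt <- data.frame(
  id       = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value    = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8,
               0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
)

# Cross-tabulate each variable against "is the value zero?"
zero_table <- table(
  variable = test_melt$variable,
  is_zero  = factor(test_melt$value == 0, levels = c(FALSE, TRUE))
)

# The "TRUE" column holds the zero counts per variable
zeros_per_variable <- zero_table[, "TRUE"]
print(zeros_per_variable)  # b = 2, c = 4
```

The "FALSE" column gives the non-zero counts for free, which can be handy as a sanity check.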

... and, out of curiosity, a quick benchmark:

library(microbenchmark)
microbenchmark(
  dplyr = group_by(test_melt, variable) %>% summarize(zeroes = sum(value == 0)),
  base1 = aggregate(test_melt$value, by = list(variable = test_melt$variable), FUN = function(x) sum(x == 0)),
  # @PankajKaundal's suggested "formula" notation reads easier
  base2 = aggregate(value ~ variable, test_melt, function(x) sum(x == 0))
)
# Unit: microseconds
#   expr     min      lq      mean    median        uq      max neval
#  dplyr 916.421 986.985 1069.7000 1022.1760 1094.7460 2272.636   100
#  base1 647.658 682.302  783.2065  715.3045  765.9940 1905.411   100
#  base2 813.219 867.737  950.3247  897.0930  959.8175 2017.001   100

Answer 3

This may help. Is this what you are looking for?

    > test_melt[4] <- 1
    > test_melt2 <- aggregate(V4 ~ value + variable, test_melt, sum)
    > test_melt2
       value variable V4
    1      0        b  2
    2      1        b  1
    3      2        b  1
    4      3        b  1
    5      5        b  1
    6      6        b  1
    7      7        b  1
    8      8        b  2
    9      0        c  4
    10     4        c  2
    11     5        c  1
    12     9        c  1
    13    87        c  2

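V4 is the number of rows for each value/variable combination, so the rows where value is 0 give the number of zeros per variable.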
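If only the zero counts are needed, the full count table can be filtered afterwards. The following is a sketch of that idea with a hand-built copy of the data (assumed to match the melt() output); the column name n and the objects counts and zero_counts are illustrative, used here in place of the automatically named V4.

```r
# Hand-built copy of the melted data (assumed layout: id, variable, value)
test_melt <- data.frame(
  id       = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value    = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8,
               0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
)

# Helper column of ones; summing it counts the rows in each group
test_melt$n <- 1

# Count rows for every value/variable combination
counts <- aggregate(n ~ value + variable, data = test_melt, FUN = sum)

# Keep only the rows that describe zeros
zero_counts <- counts[counts$value == 0, c("variable", "n")]
print(zero_counts)  # one row per variable: b has 2 zeros, c has 4
```

One caveat: a variable with no zeros simply does not appear in zero_counts, whereas the sum(value == 0) approaches report it with a count of 0.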
