Function for a logical table

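I have this function for logical (Venn diagram) calculations, but I cannot turn it into a generic function that works for a data frame of any size.

The function only works for the data frame provided here (exactly four columns).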


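library(data.table)  # data.table(), inrange()
library(dplyr)       # %>%, as_tibble(), mutate(), count()

how_much <- 5000000
# Draw how_much values from 1:5 (with replacement) for each column
A <- sample(1:5, size = how_much, replace = TRUE)
B <- sample(1:5, size = how_much, replace = TRUE)
C <- sample(1:5, size = how_much, replace = TRUE)
D <- sample(1:5, size = how_much, replace = TRUE)

VennData <- data.table(A, B, C, D)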


Venn_Counts <- function(dataset, unique_number, operator) {
  message("Operator argument must be one of: `==`, `<`, `<=`, `>`, `>=`")
  # The data only contains values 1 to 5, so the threshold must be in that range
  if (inrange(unique_number, 1, 5)) {
    dataset %>% as_tibble() %>%
      # Hard-coded column names: this is what prevents reuse on other data
      mutate(A = operator(A, unique_number),
             B = operator(B, unique_number),
             C = operator(C, unique_number),
             D = operator(D, unique_number)) %>%
      count(A, B, C, D)
  } else {
    print("Unique number must be in range from 1 to 5")
  }
}


Venn_Counts(VennData, 2, operator = `<=`)

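How can we make the function above generic, so that it works for data frames with more columns?

For a smaller object we would get something like the following, with the arguments set to unique_number = 3 and operator = `==`: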

count    A      B
 24     TRUE   TRUE
 20     TRUE   FALSE
 13     FALSE  TRUE
 43     FALSE  FALSE

Here we see 24 observations where both A and B equal 3, 20 observations where A equals 3 and B does not, 13 observations where A does not equal 3 and B does, and so on.
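To make the target shape concrete, here is a minimal sketch for two columns. The toy values and the names `small` and `small_counts` are made up for illustration (they are not the data behind the counts above), and the sketch assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data: two columns with values in 1:5
small <- tibble(
  A = c(3, 3, 1, 3, 5, 2),
  B = c(3, 1, 3, 3, 4, 2)
)

# Turn each column into a logical flag (value == 3),
# then count every TRUE/FALSE combination
small_counts <- small %>%
  mutate(A = A == 3, B = B == 3) %>%
  count(A, B)

small_counts
```

Each row of `small_counts` is one region of the Venn diagram, and the `n` column sums to the number of input rows.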

How about using the scoped verbs from dplyr:

library(data.table)
library(dplyr)

how_much = 5000000
A <- sample(how_much, replace = TRUE, x = 1:5) 
B <- sample(how_much, replace = TRUE, x = 1:5)
C <- sample(how_much, replace = TRUE, x = 1:5)
D <- sample(how_much, replace = TRUE, x = 1:5)

VennData = data.table(A, B, C, D)


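Venn_Counts <- function(dataset, unique_number, operator) {
  message("Operator argument must be one of: `==`, `<`, `<=`, `>`, `>=`")
  # The data only contains values 1 to 5, so the threshold must be in that range
  if (inrange(unique_number, 1, 5)) {
    dataset %>%
      as_tibble() %>%
      # Apply the comparison to every column, whatever the column names are
      mutate_all(~ operator(.x, unique_number)) %>%
      # Group by every (now logical) column and count each combination
      group_by_all() %>%
      count()
  } else {
    print("Unique number must be in range from 1 to 5")
  }
}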


Venn_Counts(VennData, 2, operator = `<=`)
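In dplyr 1.0.0 and later the scoped verbs (`mutate_all()`, `group_by_all()`) are superseded by `across()`. The following is a minimal sketch of the same idea written with `across()`, assuming dplyr >= 1.0.0. The function name `venn_counts_across` and the toy data `toy` are hypothetical, and the range check is left out to keep the sketch short.

```r
library(dplyr)

# Hypothetical across()-based variant; works for any number of columns
venn_counts_across <- function(dataset, unique_number, operator = `==`) {
  dataset %>%
    as_tibble() %>%
    # Replace every column by the logical result of the comparison
    mutate(across(everything(), ~ operator(.x, unique_number))) %>%
    # Count each combination of TRUE/FALSE across all columns
    count(across(everything()))
}

# Toy data with three columns instead of four (assumption for illustration)
set.seed(1)
toy <- tibble(
  A = sample(1:5, size = 20, replace = TRUE),
  B = sample(1:5, size = 20, replace = TRUE),
  C = sample(1:5, size = 20, replace = TRUE)
)

res <- venn_counts_across(toy, 2, operator = `<=`)
res
```

Because the columns are selected with `everything()`, nothing in the function depends on the column names or on how many columns there are.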

We can apply the operator directly to the whole dataset, then group by all columns and count.

Venn_Counts <- function(dataset, unique_number, operator) {
  message("Operator argument must be one of: `==`, `<`, `<=`, `>`, `>=`")
  # The data only contains values 1 to 5, so the threshold must be in that range
  if (inrange(unique_number, 1, 5)) {
    # Comparing the whole dataset at once yields one logical value per cell
    operator(dataset, unique_number) %>%
      as_tibble() %>%
      group_by_all() %>%
      summarise(n = n())
  } else {
    print("Unique number must be in range from 1 to 5")
  }
}
Venn_Counts(VennData, 2, operator = `<=`)

#   A     B     C     D         n
#  <lgl> <lgl> <lgl> <lgl>  <int>
#1 FALSE FALSE FALSE FALSE     2
#2 FALSE FALSE FALSE TRUE      3
#3 FALSE TRUE  TRUE  FALSE     1
#4 TRUE  FALSE FALSE TRUE      2
#5 TRUE  FALSE TRUE  FALSE     1
#6 TRUE  TRUE  TRUE  TRUE      1
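
The direct comparison works because comparing a data frame with a scalar is applied cell by cell and returns a logical matrix that keeps the column names. A small sketch of just that step, using a plain `data.frame` and made-up values (the names `toy` and `flags` are hypothetical):

```r
# Hypothetical toy data frame with two columns
toy <- data.frame(A = c(1, 4, 2), B = c(5, 2, 2))

# Element-wise comparison of every cell with the threshold
flags <- toy <= 2

is.matrix(flags)   # the result is a matrix, not a data frame
colnames(flags)    # column names are preserved
flags
```

This is why the function pipes the result through `as_tibble()` before grouping: the matrix has to become a table with one logical column per original column.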

Data

library(data.table)
library(tidyverse)
set.seed(1234)
how_much = 10
A <- sample(how_much, replace = TRUE, x = 1:5) 
B <- sample(how_much, replace = TRUE, x = 1:5)
C <- sample(how_much, replace = TRUE, x = 1:5)
D <- sample(how_much, replace = TRUE, x = 1:5)
VennData = data.table(A, B, C, D)