How to select rows with exactly 2 unique values with tidyverse?

Question

I have:

library(magrittr)
set.seed(1234)
what_i_have <- tibble::tibble(
    A = c(0, 1) |> sample(5, replace = TRUE),
    B = c(0, 1) |> sample(5, replace = TRUE),
    C = c(0, 1) |> sample(5, replace = TRUE)
)

which looks like this:

> what_i_have
# A tibble: 5 x 3
      A     B     C
  <dbl> <dbl> <dbl>
1     1     1     1
2     1     0     1
3     1     0     1
4     1     0     0
5     0     1     1

What I want:

what_i_want <- what_i_have %>% .[apply(., 1, function(row) row |> unique() |> length() == 2),]

which looks like this:

# A tibble: 4 x 3
      A     B     C
  <dbl> <dbl> <dbl>
1     1     0     1
2     1     0     1
3     1     0     0
4     0     1     1

My question is: is there a tidyverse way to do the above?

I tried this:

what_i_have |> 
    dplyr::rowwise() |> 
    dplyr::filter_all(function(row) row |> unique() |> length() == 2)

But it returns the empty tibble below, and I don't know why:

# A tibble: 0 x 3
# Rowwise: 
# … with 3 variables: A <dbl>, B <dbl>, C <dbl>

Thanks.

Answer 1

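Here is a tidyverse option. I treat each row as a vector (via c_across), then use n_distinct to count the distinct values in that vector. The condition is TRUE for rows with exactly 2 unique values, so filter keeps only those rows.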

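library(tidyverse)

what_i_have %>%
  rowwise() %>%  # evaluate the filter condition one row at a time
  # c_across() gathers the row's values into one vector; keep rows with 2 distinct values
  filter(n_distinct(c_across(everything())) == 2)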

Output

      A     B     C
  <dbl> <dbl> <dbl>
1     0     1     1
2     1     0     1
3     1     0     0
4     1     1     0
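
To see what the filter condition evaluates to for each row, the count can be stored in a column first. The following is a minimal sketch, assuming dplyr 1.0.0 or later (needed for c_across) and the values from the Data section of this answer. The column name n_unique and the objects counted and result are illustrative names of mine, not part of the original answer.

```r
library(dplyr)

# Data copied from the "Data" section of this answer
what_i_have <- tibble::tibble(
  A = c(0, 1, 1, 1, 1),
  B = c(1, 0, 0, 1, 1),
  C = c(1, 1, 0, 1, 0)
)

# Store the per-row count of distinct values in a helper column
# (n_unique is an illustrative name, not part of the original answer)
counted <- what_i_have %>%
  rowwise() %>%
  mutate(n_unique = n_distinct(c_across(A:C))) %>%
  ungroup()  # drop the rowwise grouping so later verbs work column-wise again

# Keep rows with exactly 2 distinct values, then drop the helper column
result <- counted %>%
  filter(n_unique == 2) %>%
  select(-n_unique)

print(counted)
print(result)
```

Calling ungroup() at the end is optional, but without it the result stays a rowwise tibble, which is why the output in the question carries a "# Rowwise:" header.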

A hybrid approach with apply could be:

what_i_have %>%
  filter(apply(., 1, \(x) length(unique(x))) == 2)
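
The hybrid works because apply with MARGIN = 1 hands each row to the function as a plain vector and returns one count per row, which is then compared to 2. Below is a base R sketch of those intermediate steps, using the values from the Data section. The names n_unique and keep are illustrative, and a plain data.frame is assumed here only to keep the example free of package dependencies.

```r
# Data copied from the "Data" section of this answer
what_i_have <- data.frame(
  A = c(0, 1, 1, 1, 1),
  B = c(1, 0, 0, 1, 1),
  C = c(1, 1, 0, 1, 0)
)

# apply() over MARGIN = 1 passes each row to the function as a plain vector
n_unique <- apply(what_i_have, 1, function(x) length(unique(x)))

# Logical vector with one entry per row: TRUE where the row has exactly 2 unique values
keep <- n_unique == 2

# Subset the rows with the logical vector
what_i_want <- what_i_have[keep, ]

print(n_unique)
print(what_i_want)
```

One caveat: apply converts the data frame to a matrix first, so columns of mixed types would be coerced to a common type before the comparison. With all-numeric columns, as here, that makes no difference.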

Data

what_i_have  <-
  structure(
    list(
      A = c(0, 1, 1, 1, 1),
      B = c(1, 0, 0, 1, 1),
      C = c(1, 1, 0, 1, 0)
    ),
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA,-5L)
  )
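
The question also asks why the filter_all attempt returns an empty tibble. A likely explanation (my reading, not stated in the original answer) is that filter_all applies the predicate to each column separately rather than to the row as a whole. After rowwise(), each column contributes a single value per row, so the number of unique values is always 1 and never 2. The base R sketch below mimics that cell-by-cell evaluation; cell_counts is an illustrative name.

```r
# Data copied from the "Data" section of this answer
what_i_have <- data.frame(
  A = c(0, 1, 1, 1, 1),
  B = c(1, 0, 0, 1, 1),
  C = c(1, 1, 0, 1, 0)
)

# Evaluate the predicate's count on every single cell, which is what a
# per-column predicate sees when it is run one row at a time
cell_counts <- sapply(what_i_have, function(col) {
  sapply(col, function(cell) length(unique(cell)))
})

print(cell_counts)            # a 5 x 3 matrix of counts
print(any(cell_counts == 2))  # no cell can ever reach 2 unique values
```

Since a single cell can never hold 2 unique values, the condition fails for every column in every row, and no row survives the filter.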

Tags: r, dplyr, tidyverse, rowwise