Pipeline for Vector Manipulations in R as Replacement of dplyr data.frame Manipulations

There is a common, well-known dplyr pipeline for manipulating a character column in a data frame:

library("dplyr")
data.frame(someVector = c("fff", "aaa", "bbb", "ccc", "ddd", "ccc")) %>% 
  distinct(someVector) %>% 
  arrange(someVector) %>% 
  filter(someVector != "bbb") %>% 
  pull(someVector)

This pipeline returns the desired vector result:

[1] "aaa" "ccc" "ddd" "fff"

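However, converting the character vector to a data frame just to process it seems suboptimal. Using the same sequence of functions with a vector as the argument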

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% 
   distinct() %>%
   arrange() %>% 
   filter(.data != "bbb")

results in errors:

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% distinct() 
Error in UseMethod("distinct") : 
  no applicable method for 'distinct' applied to an object of class "character"
   
c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% arrange() 
Error in UseMethod("arrange") : 
  no applicable method for 'arrange' applied to an object of class "character"
 
c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% filter(.data != "bbb")
Error in UseMethod("filter") : 
  no applicable method for 'filter' applied to an object of class "character"
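
These errors come from S3 dispatch: the verbs are generics called through UseMethod(), and no method is registered for class "character". As a minimal base-R sketch of the same mechanism (my_distinct is a hypothetical toy generic made up for illustration, not dplyr's actual code):

```r
# Hypothetical toy generic that dispatches on the class of its first argument
my_distinct <- function(x, ...) UseMethod("my_distinct")

# Only a data.frame method is defined, so only data frames are handled
my_distinct.data.frame <- function(x, ...) unique(x)

df <- data.frame(v = c("a", "a", "b"))
deduped <- my_distinct(df)  # dispatches to the data.frame method

# A character vector has no matching method, so the call fails;
# capture the error message instead of stopping the script
res <- tryCatch(
  my_distinct(c("a", "a", "b")),
  error = function(e) conditionMessage(e)
)
print(res)
```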

That is why the functions used need to be "translated" into (replaced by) their vector analogues:

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>%
  unique() %>% 
  sort() 

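I do not know of an equivalent of filter() for vector manipulation in a pipeline. The question is: what is the proper (best "R-style") way to pipeline vector manipulations to get the desired output?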

You have many options. For example,

this

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% unique() %>% `[`(. != "bbb") %>% sort() 

this

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% unique() %>% .[. != "bbb"] %>% sort() 

and this

c("fff", "aaa", "bbb", "ccc", "ddd", "ccc") %>% unique() %>% magrittr::extract(. != "bbb") %>% sort() 

all give

[1] "aaa" "ccc" "ddd" "fff"
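
Two further base-R variants, offered as a minimal sketch rather than the definitive "R-style" answer: setdiff() drops the unwanted value and de-duplicates in one step, and Filter() takes a predicate function. The sketch uses the native pipe |>, which assumes R 4.1 or later; with magrittr loaded, %>% should work the same way here. The names x, res_setdiff and res_filter are only for illustration.

```r
x <- c("fff", "aaa", "bbb", "ccc", "ddd", "ccc")

# setdiff() removes "bbb" and also returns only the unique values
res_setdiff <- x |> setdiff("bbb") |> sort()

# Filter() keeps the elements for which the predicate returns TRUE;
# naming the argument f lets the piped vector fill the remaining slot
res_filter <- x |>
  unique() |>
  Filter(f = function(s) s != "bbb") |>
  sort()

print(res_setdiff)
print(res_filter)
```

Both results should match the output above. setdiff() is the shortest form when the condition is "not equal to some fixed values", while Filter() generalizes to arbitrary predicates.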