dplyr::select - Including All Other Columns at End of New Data Frame (or Beginning or Middle)

Question

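When interacting with data, I find the dplyr library's select() function a great way to organize my data frame columns.

One great use: if I happen to be working with a df that has many columns, I often find myself putting two variables next to each other for easy comparison. When doing so, I then need to attach all the other columns either before or after. I found the matches(".") function to be a very convenient way to do this.

For example: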

library(nycflights13)
library(dplyr)

# just have the five columns:
select(flights, carrier, tailnum, year, month, day) 

# new order for all column:
select(flights, carrier, tailnum, year, month, day, matches(".")) 
# matches(".")  attached all other columns to end of new data frame

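Question: I'm curious whether there is a better way to do this, better in the sense of being more flexible.

For instance, one issue: is there a way to include "all other" columns at the beginning or in the middle of the new data.frame? (Note that select(flights, matches("."), year, month, day) doesn't produce the desired result, since matches(".") attaches all columns and year, month, day are ignored because they repeat existing column names.)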

Answer 1

It is not a very elegant solution, but it works.

select(flights, carrier, tailnum,
       one_of(setdiff(colnames(flights), c("carrier", "tailnum", "year"))),
       year)

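I used the setdiff function to work out which columns are left over. Since select does not accept string arguments, I used the one_of function. For a list of the many utility functions available for select's arguments, you can refer to this post.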
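The setdiff idea above can be sketched on a small data frame. This is only an illustration: the data frame `df` and the vectors `front`, `middle`, and `back` are hypothetical names, and all_of() is swapped in for one_of(), which is superseded in dplyr 1.0.0 and later.

```r
library(dplyr)

# Hypothetical toy data frame, used only for illustration
df <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)

front <- c("a", "b")  # columns to place first
back  <- "c"          # column to place last

# Every column that is in neither group, in its original order
middle <- setdiff(colnames(df), c(front, back))

# all_of() selects the columns named in a character vector
result <- select(df, all_of(front), all_of(middle), all_of(back))
names(result)
```

Because the leftover columns are computed explicitly, they can be placed anywhere in the call, not only at the end.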

Answer 2

Update: using dplyr::relocate()

Selected columns **at the beginning**:

flights %>%  
  relocate(carrier, tailnum, year, month, day)

Selected columns **at the end**:

flights %>%  
  relocate(carrier, tailnum, year, month, day, .after = last_col())
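For the "middle" case from the question, relocate() also takes a column as the target of .before or .after. A minimal sketch, assuming dplyr 1.0.0 or later and a hypothetical toy data frame `df`:

```r
library(dplyr)

# Hypothetical toy data frame, used only for illustration
df <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8, e = 9:10)

# Move d and e so that they sit directly after a
after_a <- relocate(df, d, e, .after = a)
names(after_a)

# Move a so that it sits directly before d
before_d <- relocate(df, a, .before = d)
names(before_d)
```

All columns that are not named keep their relative order, so nothing has to be listed twice.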

Old answer

>If you want to **reorder columns**

All other columns **at the end**:

select(flights, carrier, tailnum, year, month, day, everything())

Or in two steps, to select variables provided in a character vector, using one_of("x", "y", "z"):

col <- c("carrier", "tailnum", "year", "month", "day")
select(flights, one_of(col), everything())

All other columns **at the beginning**:

select(flights, -one_of(col), one_of(col))

If you want to add the whole data frame again using dplyr:

The whole data frame at the end:

bind_cols(select(flights, one_of(col)), flights)

The whole data frame at the beginning:

bind_cols(flights, select(flights, one_of(col)))
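One thing to keep in mind with bind_cols() is that the selected columns then appear twice. A small sketch on a hypothetical toy data frame `df`; as far as I know, recent dplyr versions rename the duplicated columns and print a "New names:" message, but the exact names may depend on the installed version:

```r
library(dplyr)

# Hypothetical toy data frame, used only for illustration
df  <- data.frame(a = 1:2, b = 3:4, c = 5:6)
col <- c("a", "b")

# The selected columns first, followed by the whole data frame again
doubled <- bind_cols(select(df, one_of(col)), df)

ncol(df)       # number of columns in the original
ncol(doubled)  # original columns plus the two repeated ones
```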

Answer 3

It seems to me that ! is set up to take the complement of a set of variables.

 mtcars %>% select(c(vs,am), !c(vs,am))
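The same complement can be placed between two groups to put "all other" columns in the middle. A minimal sketch, assuming a dplyr version whose select() supports ! (1.0.0 or later); the choice of mpg, vs, and am is only for illustration:

```r
library(dplyr)

# mpg first, vs and am last, every other column in between
reordered <- mtcars %>% select(mpg, !c(mpg, vs, am), vs, am)
names(reordered)
```

The columns matched by the complement keep their original relative order.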
