Drop column range starting with first column matching regex

Question

I have the following data frame, which is the output of read_excel for a spreadsheet with missing column names:

t <- tibble(A=rnorm(3), B=rnorm(3), "x"=rnorm(3), "y"=rnorm(3), Z=rnorm(3))
colnames(t)[3:4] <-  c("..3", "..4")

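How can I select columns ..3 through Z, in order to drop them, in a flexible, dynamic way (not depending on column numbers or table width)? I was thinking along these lines: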

t %>% select(-starts_with(".."):-last_col())

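But this gives a warning, because starts_with("..") returns two values (it matches both ..3 and ..4), so the left-hand side of the range is not a single column.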

Answer 1

You can use base R:

t[cumsum(startsWith(names(t), "..")) == 0]

# # A tibble: 3 x 2
#       A       B
#   <dbl>   <dbl>
# 1 -1.56 -0.0747
# 2 -1.68 -0.847 
# 3 -1.23 -1.20
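
To make the cumsum() trick easier to follow, here is a minimal sketch of the intermediate steps. It uses a plain character vector of column names (nms, a hypothetical stand-in for names(t)) rather than the data frame itself:

```r
# Hypothetical column names mirroring the question's data frame
nms <- c("A", "B", "..3", "..4", "Z")

is_auto <- startsWith(nms, "..")  # FALSE FALSE  TRUE  TRUE FALSE
seen    <- cumsum(is_auto)        # 0 0 1 2 2: matches seen so far
keep    <- seen == 0              # TRUE only before the first match

nms[keep]
# [1] "A" "B"
```

Once the running count leaves zero it never returns, so every column from the first match onward is dropped, including Z, which does not match the prefix itself.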

You can also use it with select():

t %>% 
  select(which(cumsum(startsWith(names(t), "..")) == 0))

PS: Don't use t as a variable name in R, because it is already the name of a function (the matrix transpose).

Answer 2

We can force select to use only the first match:

t %>% select(-c(starts_with("..")[ 1 ]:last_col()))
# # A tibble: 3 x 2
#       A      B
#   <dbl>  <dbl>
# 1 0.889  0.505
# 2 0.655 -2.15 
# 3 1.34  -0.290
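
As a rough illustration of what the [1] is doing, the same position arithmetic can be sketched in base R on a character vector of names (nms is a hypothetical stand-in for names(t); this is not how tidyselect is implemented internally):

```r
# Hypothetical column names mirroring the question's data frame
nms <- c("A", "B", "..3", "..4", "Z")

matches     <- which(startsWith(nms, ".."))  # 3 4: every match, not just one
first_match <- matches[1]                    # 3: a single start position
drop        <- first_match:length(nms)       # 3 4 5

nms[-drop]
# [1] "A" "B"
```

Note that this sketch assumes at least one column matches. If none does, matches[1] is NA and the range expression fails, so a guard would be needed in that case.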

Or, in a more "tidy" way, using first():

t %>% select(-first(starts_with("..")):-last_col())
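
Since the title asks about a regex rather than a fixed prefix, here is one possible base R sketch that takes a pattern. The helper name drop_from_first_match and the pattern "^\\.\\.[0-9]+$" are assumptions made for illustration; they do not come from the answers above:

```r
# Hypothetical helper: drop every column from the first regex match onward.
drop_from_first_match <- function(data, pattern) {
  hit <- grepl(pattern, names(data))  # TRUE where a column name matches
  data[cumsum(hit) == 0]              # keep only columns before the first hit
}

# Small stand-in for the question's data frame
df <- data.frame(A = 1:3, B = 4:6, x = 7:9, y = 10:12, Z = 13:15)
names(df)[3:4] <- c("..3", "..4")

names(drop_from_first_match(df, "^\\.\\.[0-9]+$"))
# [1] "A" "B"
```

If no column matches the pattern, all columns are kept, which avoids the NA problem of indexing the first match directly.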

Tags: r, dplyr, tidyselect