Select columns that contain a particular string/ID

Question

I have a huge data frame with almost 300 columns. Each column name starts with a code.

For example, one column name is:

GSUP_02X Which supplier do you use for each of the following services? Telephone line rental

and another column name is:

GSUP_03X Which supplier do you use for each of the following services? Fixed broadband

Here GSUP_02X and GSUP_03X are the codes.

So I want to select all the columns that match a list of codes stored in a vector.

I tried:

columns <- c("GSUP_02X","GSUP_03X")
consumer_brand_nps %>%
  select(contains(columns))

but I get the following error:

Error: is.string(match) is not TRUE

Is there any other tidyr or dplyr solution?

Answer 1

We can use matches after pasting the elements of 'columns' together with "|" into a single regular expression

library(dplyr)

consumer_brand_nps %>%
  select(matches(paste(columns, collapse = "|")))
#     GSUP_02X   GSUP_03X
#1  -0.545880758 -1.3169081
#2   0.536585304  0.5982691
#3   0.419623149 -0.7622144
#4  -0.583627199 -1.4290903
#5   0.847460017  0.3322444
#6   0.266021979 -0.4690607
#7   0.444585270 -0.3349868
#8  -0.466495124  1.5362522
#9  -0.848370044  0.6099945
#10  0.002311942  0.5163357

Data

set.seed(24)
consumer_brand_nps <- as.data.frame(matrix(rnorm(10*5), ncol=5, nrow=10,
              dimnames = list(NULL, c(columns, LETTERS[1:3]))))
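
In the question the codes sit at the start of much longer column names, so it may help to anchor the pattern there. The following is only a sketch: the data frame `df` and its column labels are made up for illustration, and the `starts_with()` variant assumes tidyselect 1.0.0 or later, where the helper accepts a character vector of prefixes.

```r
library(dplyr)

columns <- c("GSUP_02X", "GSUP_03X")

# Hypothetical data frame whose column names begin with a code,
# mimicking the long names described in the question.
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
names(df) <- c(
  "GSUP_02X Telephone line rental",
  "GSUP_03X Fixed broadband",
  "OTHER_01 Something else"
)

# Anchor the regex at the start of the name so that a code appearing
# later in a column label does not produce a match.
pattern <- paste0("^(", paste(columns, collapse = "|"), ")")
by_regex <- df %>% select(matches(pattern))

# Assumes tidyselect >= 1.0.0: starts_with() takes the union of all prefixes.
by_prefix <- df %>% select(starts_with(columns))
```

Both calls should keep the two GSUP columns and drop the third. The regex form also works on older versions where the helpers only accept a single string.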

Answer 2

Tested using the data from @akrun, where the column names are exactly the codes.

consumer_brand_nps %>%
  select_(.dots = columns ) ## note the underscore
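
Note that `select_()` has been deprecated since dplyr 0.7.0. A possible replacement, assuming tidyselect 1.0.0 or later is installed, is `all_of()`. Like `select_()`, it only works when the vector holds the full column names, as in the test data, and not just a prefix.

```r
library(dplyr)

columns <- c("GSUP_02X", "GSUP_03X")

# Same test data as in Answer 1: column names are exactly the codes.
set.seed(24)
consumer_brand_nps <- as.data.frame(matrix(rnorm(10 * 5), ncol = 5, nrow = 10,
              dimnames = list(NULL, c(columns, LETTERS[1:3]))))

# all_of() selects by exact name and errors if a name is missing.
selected <- consumer_brand_nps %>%
  select(all_of(columns))

# Base R equivalent for comparison.
selected_base <- consumer_brand_nps[, columns]
```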
