How do you extract data frame columns whose names contain certain text?

I have this data frame:

dput(df)

structure(list(Time = structure(1:4, .Label = c("1/29/2015 2:00", 
"1/29/2015 2:10", "1/29/2015 2:20", "1/29/2015 2:30"), class = "factor"), 
    WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Content.Match = structure(c(1L, 
    1L, 1L, 1L), .Label = "n/a", class = "factor"), WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Status = structure(c(1L, 
    1L, 1L, 1L), .Label = "n/a", class = "factor"), WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Value = c(12L, 
    12L, 12L, 12L), WTAD..SNMP..AppTier.BIGIP.SNMP.Memory.on.Web01.Content.Match = structure(c(1L, 
    1L, 1L, 1L), .Label = "n/a", class = "factor")), .Names = c("Time", 
"WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Content.Match", 
"WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Status", 
"WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Value", 
"WTAD..SNMP..AppTier.BIGIP.SNMP.Memory.on.Web01.Content.Match"
), class = "data.frame", row.names = c(NA, -4L))

I am trying to select the columns whose names contain the following: "CPU.5min.avg.on.*.Value"

library(dplyr)
df<-select(df, Time, contains("CPU.5min.avg.on.*.Value"))

This works in R on Windows, but not on Linux. Any idea what I am doing wrong?

Base R solution:

df[, c("Time", grep("CPU.5min.avg.on.*.Value", colnames(df), value = TRUE))]
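
One caveat with the pattern above: an unescaped `.` in a regular expression matches any character, not only a literal dot. A stricter variant, sketched here on the column names from the question, escapes the dots and anchors the match at the end of the name. The `strict` pattern is an assumption about the intent (keep only names ending in ".Value"), not something the question states:

```r
# Column names copied from the data frame in the question.
cols <- c(
  "Time",
  "WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Content.Match",
  "WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Status",
  "WTAD..SNMP..AppTier.BIGIP.SNMP.CPU.5min.avg.on.Web01.Value",
  "WTAD..SNMP..AppTier.BIGIP.SNMP.Memory.on.Web01.Content.Match"
)

# Escape the literal dots and anchor at the end of the name, so that only
# names ending in ".Value" are kept (hypothetical stricter pattern).
strict <- "CPU\\.5min\\.avg\\.on\\..*\\.Value$"
value_cols <- grep(strict, cols, value = TRUE)
print(value_cols)
```

The same pattern string should also be usable inside `matches()` in the dplyr version.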

dplyr solution:

select(df, Time, matches('CPU.5min.avg.on.*.Value'))

Actually, I am puzzled that your solution works on Windows. The ?select documentation says:

contains(x, ignore.case = TRUE): selects all variables whose name contains x

matches(x, ignore.case = TRUE): selects all variables whose name matches the regular expression x

and you are trying to match a regular expression in your code, so it should not work with contains() under any OS.
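
The literal-versus-regex distinction can be reproduced in base R with the `fixed` argument of `grepl()`. This is only an analogy for how `contains()` and `matches()` differ, and the shortened column names are made up for illustration:

```r
# Hypothetical, shortened column names for illustration.
cols <- c("Time", "CPU.on.Web01.Value", "CPU.on.Web01.Status")
pattern <- "CPU.on.*.Value"

# Literal substring search: the "*" would have to appear verbatim in the name.
literal_hits <- cols[grepl(pattern, cols, fixed = TRUE)]

# Regular-expression search: ".*" matches any run of characters.
regex_hits <- cols[grepl(pattern, cols)]

print(literal_hits)
print(regex_hits)
```

The literal search finds nothing because no column name contains an actual asterisk, while the regex search picks up the ".Value" column.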