Keep rows with a specific string value in R
First, I have a list of strings:
/index.php/abc/def
/link/view/id/123
/subject/view/id/456
Then I have a dataset like this:
Date and Time Request
2016-01-17 05:46:26 aladdine.com/view/id/786
2016-01-17 05:46:30 aladdine.com/subject/view/id/456
2016-01-17 05:46:31 aladdine.com/pub/link/view/id/123
2016-01-17 05:46:44 aladdine.com/index.php/abc/def/ghi
2016-01-17 05:46:58 aladdine.com/brs/view/id.266
How can I keep only the rows whose Request contains one of the strings from the list above?
Expected output:
Date and Time Request
2016-01-17 05:46:30 aladdine.com/subject/view/id/456
2016-01-17 05:46:31 aladdine.com/pub/link/view/id/123
2016-01-17 05:46:44 aladdine.com/index.php/abc/def/ghi
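I used the CO2 example dataset that ships with R. Assign your dataset to dataSet and your list to iList, and change every occurrence of dataSet$Plant to the column you are interested in (probably dataSet$Request).
The resulting dataset is stored in results.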
dataSet <- CO2
varsToCheck <- as.character(dataSet$Plant)
iList <- list("Qn1", "Mn1", "Mc1")
# Start empty; rbind(NULL, row) simply returns the row
results <- NULL
# Iterate over all rows
for (i in seq_along(varsToCheck)) {
  # Extract string for checking
  validateString <- varsToCheck[i]
  # Iterate over all match criteria
  for (j in seq_along(iList)) {
    # Extract the match criterion
    matchString <- iList[[j]]
    # Check whether the string contains the criterion as literal text
    if (grepl(matchString, validateString, fixed = TRUE)) {
      results <- rbind(results, dataSet[i, ])
      # Stop at the first hit so a row is never added twice
      break
    }
  }
}
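The double loop above can also be written as a few vectorised calls. This is only a minimal sketch applied to the data from the question: the data frame layout and the column name DateTime are assumptions made for illustration, and fixed = TRUE is used so that each list entry is matched as literal text rather than as a regular expression.

```r
# Data reconstructed from the question (the column names are assumed)
iList <- c("/index.php/abc/def", "/link/view/id/123", "/subject/view/id/456")
dataSet <- data.frame(
  DateTime = c("2016-01-17 05:46:26", "2016-01-17 05:46:30",
               "2016-01-17 05:46:31", "2016-01-17 05:46:44",
               "2016-01-17 05:46:58"),
  Request = c("aladdine.com/view/id/786",
              "aladdine.com/subject/view/id/456",
              "aladdine.com/pub/link/view/id/123",
              "aladdine.com/index.php/abc/def/ghi",
              "aladdine.com/brs/view/id.266"),
  stringsAsFactors = FALSE
)

# One logical vector per list entry, TRUE where Request contains that entry
hitsPerString <- lapply(iList, function(p) grepl(p, dataSet$Request, fixed = TRUE))

# Combine with OR: a row is kept if any entry matched
keep <- Reduce(`|`, hitsPerString)

results <- dataSet[keep, ]
print(results)
```

Because the matches are combined with a logical OR, a row that contains several list entries still appears only once in results.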
Using the same dataset as @Cinnamon Star, you can:
dataSet <- CO2;
iList <- list("Qn1", "Mn1", "Mc1");
Join all the strings into a single regular expression pattern of the form (str1|str2|str3):
pat <- paste(unlist(iList), collapse = "|")
pat <- paste0("(", pat, ")")
Then run grepl to find which rows contain that text in the Plant column:
dataSet[grepl(pattern = pat, x = dataSet$Plant), ]
Result:
Plant Type Treatment conc uptake
1 Qn1 Quebec nonchilled 95 16.0
2 Qn1 Quebec nonchilled 175 30.4
3 Qn1 Quebec nonchilled 250 34.8
4 Qn1 Quebec nonchilled 350 37.2
5 Qn1 Quebec nonchilled 500 35.3
6 Qn1 Quebec nonchilled 675 39.2
7 Qn1 Quebec nonchilled 1000 39.7
43 Mn1 Mississippi nonchilled 95 10.6
44 Mn1 Mississippi nonchilled 175 19.2
45 Mn1 Mississippi nonchilled 250 26.2
46 Mn1 Mississippi nonchilled 350 30.0
47 Mn1 Mississippi nonchilled 500 30.9
48 Mn1 Mississippi nonchilled 675 32.4
49 Mn1 Mississippi nonchilled 1000 35.5
64 Mc1 Mississippi chilled 95 10.5
65 Mc1 Mississippi chilled 175 14.9
66 Mc1 Mississippi chilled 250 18.1
67 Mc1 Mississippi chilled 350 18.9
68 Mc1 Mississippi chilled 500 19.5
69 Mc1 Mississippi chilled 675 22.2
70 Mc1 Mississippi chilled 1000 21.9
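One caveat when applying this to the strings from the question: they contain ".", which a regular expression treats as "any character", so index.php would also match index_php. A possible way around this, sketched here under the assumption that no list entry contains the sequence \E, is to wrap each entry in \Q...\E and pass perl = TRUE so the text is matched literally. The data frame layout and the DateTime column name are again assumptions.

```r
# Data reconstructed from the question (the column names are assumed)
iList <- c("/index.php/abc/def", "/link/view/id/123", "/subject/view/id/456")
dataSet <- data.frame(
  DateTime = c("2016-01-17 05:46:26", "2016-01-17 05:46:30",
               "2016-01-17 05:46:31", "2016-01-17 05:46:44",
               "2016-01-17 05:46:58"),
  Request = c("aladdine.com/view/id/786",
              "aladdine.com/subject/view/id/456",
              "aladdine.com/pub/link/view/id/123",
              "aladdine.com/index.php/abc/def/ghi",
              "aladdine.com/brs/view/id.266"),
  stringsAsFactors = FALSE
)

# Quote every entry with \Q...\E (PCRE literal quoting), then join with "|"
pat <- paste0("\\Q", iList, "\\E", collapse = "|")

# perl = TRUE is required for \Q...\E to be understood
results <- dataSet[grepl(pat, dataSet$Request, perl = TRUE), ]
print(results)
```

For the Plant codes in the CO2 example the quoting makes no difference, since Qn1, Mn1 and Mc1 contain no regular expression metacharacters.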