Subset rows that contain some sequence of characters in one variable

Question

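I want to subset the rows of a data frame in which one variable contains a specific sequence of characters.

For example, I want to keep the rows where the variable history contains at least three consecutive 1s ("111"; e.g. "01110", "11111", "01111").

Here is some sample data: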

id <- c(1,2,3,4,5,6,7,8,9,10)
history <- c("01110", "00001", "11111", "01111", "11011", "11100",
             "00001", "10101", "11011", "10111")
(df <- data.frame(id, history))
#    id history
# 1   1   01110
# 2   2   00001
# 3   3   11111
# 4   4   01111
# 5   5   11011
# 6   6   11100
# 7   7   00001
# 8   8   10101
# 9   9   11011
# 10 10   10111

In this case, I want to select rows 1, 3, 4, 6, and 10.

Answer 1

Try

df[grep('1{3,}', df$history),]
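
The pattern 1{3,} matches a run of three or more consecutive 1s, and grep() returns the positions of the matching elements, which are then used as row indices. As a minimal sketch of a variant (not part of the original answer), grepl() returns a logical vector instead, which is easier to negate with ! or to combine with other conditions. The names has_run, result, and result_fixed are illustrative only, and stringsAsFactors = FALSE is an assumption added so that history stays a character column on older R versions.

```r
# Sample data from the question
df <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  history = c("01110", "00001", "11111", "01111", "11011", "11100",
              "00001", "10101", "11011", "10111"),
  stringsAsFactors = FALSE
)

# grepl() gives one TRUE/FALSE per row: TRUE where history contains
# at least three consecutive 1s
has_run <- grepl("1{3,}", df$history)
result <- df[has_run, ]

# Any run of three or more 1s necessarily contains "111", so a literal
# (non-regex) match selects the same rows
result_fixed <- df[grepl("111", df$history, fixed = TRUE), ]

print(result)
```

To get the complement (rows without such a run), df[!has_run, ] works even when no row matches, whereas negative indexing with df[-grep(...), ] would return zero rows in that case.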

r

subset