How to find a sequence within a column using R

Question

I am new to R. I want to find a way to verify a sequence within a column. I tried using seq(), but it did not get me very far.

Here is a sample of the df:

    gp <- data.frame(
      Id = c(1503960366, 1503960366, 1503960366, 4319703577, 4319703577, 4319703577, 5553957443, 5553957443, 5553957443),
      date = c("2016-04-27", "2016-04-12", "2016-04-27", "2016-04-12", "2016-04-27", "2016-04-27", "2016-5-16", "2016-4-16", "2016-5-16"),
      # one Cal value per row (nine rows)
      Cal = c(1347, 1347, 1348, 1496, 1497, 1496, 1688, 1688, 1688))

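The sequence is in the Cal column: within each Id's group of rows, Cal increases by 1. What I want to do is verify or search for that sequence, and then create a new column that is True or False depending on whether Cal increased by 1 for that Id.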

##This is the printed-out version of the df.
        Id date         Cal
      <dbl> <chr>      <dbl>
1 1503960366 2016-04-27  1347
2 1503960366 2016-04-12  1347
3 1503960366 2016-04-27  1348
4 4319703577 2016-04-12  1496
5 4319703577 2016-04-27  1497
6 4319703577 2016-04-27  1496
7 5553957443 2016-5-16   1688
8 5553957443 2016-4-16   1688
9 5553957443 2016-5-16   1688

##This is the outcome I am looking for

         Id date         Cal  Verify
      <dbl> <chr>      <dbl>   <dbl>
1 1503960366 2016-04-27  1347   False
2 1503960366 2016-04-12  1347   False
3 1503960366 2016-04-27  1348   True
4 4319703577 2016-04-12  1496   False
5 4319703577 2016-04-27  1497   True
6 4319703577 2016-04-27  1496   False 
7 5553957443 2016-5-16   1688   False
8 5553957443 2016-4-16   1688   False
9 5553957443 2016-5-16   1688   False

Any help or a pointer in the right direction would be greatly appreciated. Thanks in advance.

Update

gph <- data.frame(
  Id = c(1503960366, 1503960366, 1503960366, 4319703577, 4319703577, 4319703577, 5553957443, 5553957443, 5553957443, 7503962366, 6950855005, 1893815059, 4020332650, 8583815059, 4319703577, 1927972279),
  date = c("2016-04-27", "2016-04-12", "2016-04-27", "2016-04-12", "2016-05-30", "2016-04-16", "2016-05-16", "2016-04-27", "2016-04-27", "2016-5-16", "2016-4-16", "2016-05-16", "2016-5-20", "2016-05-22", "2016-05-18", "2016-04-05"),
  Cal = c(1347, 1347, 1348, 1496, 1497, 1496, 1688, 1688, 1688, 2063, 2063, 2064, 0, 0, 0, 2022))

Answer 1

Subtract the previous Cal value from the current one and check whether the difference equals 1.

library(dplyr)

df %>%
  mutate(Verify = Cal - lag(Cal, default = 0) == 1)

#          Id       date  Cal Verify
#1 1503960366 2016-04-27 1347  FALSE
#2 1503960366 2016-04-12 1347  FALSE
#3 1503960366 2016-04-27 1348   TRUE
#4 4319703577 2016-04-12 1496  FALSE
#5 4319703577 2016-04-27 1497   TRUE
#6 4319703577 2016-04-27 1496  FALSE
#7 5553957443  2016-5-16 1688  FALSE
#8 5553957443  2016-4-16 1688  FALSE
#9 5553957443  2016-5-16 1688  FALSE
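
The comparison above runs down the whole column, so the first row of one Id is compared with the last row of the previous Id. Since the question asks about increases within each Id, here is a minimal sketch of a per-Id variant, assuming dplyr's group_by() is acceptable. The toy data frame and the names toy, ungrouped, and grouped are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Hypothetical toy data: the last Cal of Id 1 and the first Cal of Id 2 differ by 1
toy <- data.frame(
  Id  = c(1, 1, 2, 2),
  Cal = c(10, 11, 12, 12)
)

# Ungrouped: row 3 is compared with row 2, which belongs to a different Id
ungrouped <- toy %>%
  mutate(Verify = Cal - lag(Cal, default = 0) == 1)

# Grouped: lag() restarts for every Id; the first row of each Id is compared
# with itself (difference 0), so it is always FALSE
grouped <- toy %>%
  group_by(Id) %>%
  mutate(Verify = Cal - lag(Cal, default = first(Cal)) == 1) %>%
  ungroup()

ungrouped$Verify  # FALSE TRUE TRUE FALSE
grouped$Verify    # FALSE TRUE FALSE FALSE
```

On the sample data in the question both versions give the same result, because no Id starts exactly 1 above the last value of the previous Id.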

In base R:

df$Verify <- c(FALSE, df$Cal[-1] - df$Cal[-nrow(df)] == 1)
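
If an Id can reappear further down the column, as Id 4319703577 does in the question's update, the base R line can be applied per Id with ave(). This is only a sketch under that assumption; the toy data and the names toy and step are hypothetical.

```r
# Hypothetical toy data; Id 1 reappears in the last row
toy <- data.frame(
  Id  = c(1, 1, 2, 2, 1),
  Cal = c(10, 11, 12, 12, 12)
)

# ave() applies the function to the Cal values of each Id separately and
# puts the results back in the original row order
step <- ave(toy$Cal, toy$Id, FUN = function(x) c(0, diff(x)))

# TRUE where Cal is exactly 1 higher than the previous row of the same Id
toy$Verify <- step == 1

toy$Verify  # FALSE TRUE FALSE FALSE TRUE
```

The last row is TRUE because it is compared with the previous row of Id 1 (Cal 11), not with the row directly above it.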

Data

df <- structure(list(Id = c(1503960366, 1503960366, 1503960366, 4319703577, 
4319703577, 4319703577, 5553957443, 5553957443, 5553957443), 
    date = c("2016-04-27", "2016-04-12", "2016-04-27", "2016-04-12", 
    "2016-04-27", "2016-04-27", "2016-5-16", "2016-4-16", "2016-5-16"
    ), Cal = c(1347L, 1347L, 1348L, 1496L, 1497L, 1496L, 1688L, 
    1688L, 1688L)), class = "data.frame", row.names = c(NA, -9L))

Answer 2

Using diff:

df <- transform(df, Verify=c(0, diff(Cal)) == 1)
df
#           Id       date  Cal Verify
# 1 1503960366 2016-04-27 1347  FALSE
# 2 1503960366 2016-04-12 1347  FALSE
# 3 1503960366 2016-04-27 1348   TRUE
# 4 4319703577 2016-04-12 1496  FALSE
# 5 4319703577 2016-04-27 1497   TRUE
# 6 4319703577 2016-04-27 1496  FALSE
# 7 5553957443  2016-5-16 1688  FALSE
# 8 5553957443  2016-4-16 1688  FALSE
# 9 5553957443  2016-5-16 1688  FALSE
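
To see why the 0 is prepended, it may help to look at diff() on its own. The short illustration below uses the first four Cal values from the data; the vector names cal, steps, and verify are chosen here for the example.

```r
cal <- c(1347, 1347, 1348, 1496)

# diff() returns the difference between each element and the one before it,
# so the result is one element shorter than the input
steps <- diff(cal)          # 0 1 148

# Prepending 0 restores the original length; the first row has no
# predecessor, so it can never be flagged
verify <- c(0, steps) == 1  # FALSE FALSE TRUE FALSE
```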
