How to find a sequence within a column using R
I am new to R. I am trying to find a way to verify a sequence within a column. I tried using seq(), but that did not get me very far.
Here is a sample of the df:
gp <- data.frame(Id = c(1503960366, 1503960366, 1503960366, 4319703577, 4319703577, 4319703577, 5553957443, 5553957443, 5553957443),
date = c("2016-04-27", "2016-04-12", "2016-04-27", "2016-04-12", "2016-04-27", "2016-04-27", "2016-5-16", "2016-4-16", "2016-5-16"),
Cal = c(1347, 1347, 1348, 1496, 1497, 1496, 1688, 1688, 1688))
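The sequence is in the Cal column. Within each Id, Cal increases by 1. What I want to do is verify or search for that sequence, then create a new column that records TRUE or FALSE for whether Cal increased by 1 for that Id.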
##This is the printed-out version of the df.
Id date Cal
<dbl> <chr> <dbl>
1 1503960366 2016-04-27 1347
2 1503960366 2016-04-12 1347
3 1503960366 2016-04-27 1348
4 4319703577 2016-04-12 1496
5 4319703577 2016-04-27 1497
6 4319703577 2016-04-27 1496
7 5553957443 2016-5-16 1688
8 5553957443 2016-4-16 1688
9 5553957443 2016-5-16 1688
##This is the outcome I am looking for
Id date Cal Verify
<dbl> <chr> <dbl> <dbl>
1 1503960366 2016-04-27 1347 False
2 1503960366 2016-04-12 1347 False
3 1503960366 2016-04-27 1348 True
4 4319703577 2016-04-12 1496 False
5 4319703577 2016-04-27 1497 True
6 4319703577 2016-04-27 1496 False
7 5553957443 2016-5-16 1688 False
8 5553957443 2016-4-16 1688 False
9 5553957443 2016-5-16 1688 False
Any help or a pointer in the right direction would be greatly appreciated. Thanks in advance.
Update
gph <- data.frame(Id = c(1503960366, 1503960366, 1503960366, 4319703577, 4319703577, 4319703577, 5553957443, 5553957443, 5553957443, 7503962366, 6950855005, 1893815059, 4020332650, 8583815059, 4319703577, 1927972279),
date = c("2016-04-27", "2016-04-12", "2016-04-27", "2016-04-12", "2016-05-30", "2016-04-16", "2016-05-16", "2016-04-27", "2016-04-27", "2016-5-16", "2016-4-16", "2016-05-16", "2016-5-20", "2016-05-22", "2016-05-18", "2016-04-05"),
Cal = c(1347, 1347, 1348, 1496, 1497, 1496, 1688, 1688, 1688, 2063, 2063, 2064, 0, 0, 0, 2022))
Subtract the previous Cal value from the current one and check whether the difference equals 1.
library(dplyr)
df %>%
mutate(Verify = Cal - lag(Cal, default = 0) == 1)
# Id date Cal Verify
#1 1503960366 2016-04-27 1347 FALSE
#2 1503960366 2016-04-12 1347 FALSE
#3 1503960366 2016-04-27 1348 TRUE
#4 4319703577 2016-04-12 1496 FALSE
#5 4319703577 2016-04-27 1497 TRUE
#6 4319703577 2016-04-27 1496 FALSE
#7 5553957443 2016-5-16 1688 FALSE
#8 5553957443 2016-4-16 1688 FALSE
#9 5553957443 2016-5-16 1688 FALSE
In base R:
df$Verify <- c(FALSE, df$Cal[-1] - df$Cal[-nrow(df)] == 1)
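To see what the two negative indices in the base R line are doing, here is a small sketch on a plain vector. The names `cal`, `step` and `verify` are only for illustration and are not part of the answer above.

```r
# First six Cal values from the question
cal <- c(1347, 1347, 1348, 1496, 1497, 1496)

cal[-1]            # every value except the first: the "current" values
cal[-length(cal)]  # every value except the last: the "previous" values

# Element-wise difference between each value and the one before it
step <- cal[-1] - cal[-length(cal)]
step
# [1]   0   1 148   1  -1

# The same result as diff()
identical(step, diff(cal))
# [1] TRUE

# The first row has no previous value, so it is padded with FALSE
verify <- c(FALSE, step == 1)
verify
# [1] FALSE FALSE  TRUE FALSE  TRUE FALSE
```

Note that the fourth element is TRUE only because 1497 follows 1496; the jump of 148 between the two Ids is correctly rejected here, but that is a property of this data, not of the method.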
Data
df <- structure(list(Id = c(1503960366, 1503960366, 1503960366, 4319703577,
4319703577, 4319703577, 5553957443, 5553957443, 5553957443),
date = c("2016-04-27", "2016-04-12", "2016-04-27", "2016-04-12",
"2016-04-27", "2016-04-27", "2016-5-16", "2016-4-16", "2016-5-16"
), Cal = c(1347L, 1347L, 1348L, 1496L, 1497L, 1496L, 1688L,
1688L, 1688L)), class = "data.frame", row.names = c(NA, -9L))
Using diff:
df <- transform(df, Verify=c(0, diff(Cal)) == 1)
df
# Id date Cal Verify
# 1 1503960366 2016-04-27 1347 FALSE
# 2 1503960366 2016-04-12 1347 FALSE
# 3 1503960366 2016-04-27 1348 TRUE
# 4 4319703577 2016-04-12 1496 FALSE
# 5 4319703577 2016-04-27 1497 TRUE
# 6 4319703577 2016-04-27 1496 FALSE
# 7 5553957443 2016-5-16 1688 FALSE
# 8 5553957443 2016-4-16 1688 FALSE
# 9 5553957443 2016-5-16 1688 FALSE
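The approaches above compare each row with the row directly before it, regardless of Id. The question asks about increases within each Id, so a per-Id variant may be closer to what is wanted. The following is a minimal base R sketch, assuming rows are already in the intended order within each Id; `verify_step` and the `toy` data are hypothetical names introduced here, not something from the answers.

```r
# Hypothetical helper: TRUE where a value is exactly 1 greater than the
# previous value in the same group. The first row of each group is FALSE.
verify_step <- function(values, group) {
  # ave() applies FUN to each group and returns results in the original row order
  step <- ave(values, group, FUN = function(x) c(0, diff(x)))
  step == 1
}

# Id and Cal columns from the question (date omitted for brevity)
gp <- data.frame(
  Id  = c(1503960366, 1503960366, 1503960366,
          4319703577, 4319703577, 4319703577,
          5553957443, 5553957443, 5553957443),
  Cal = c(1347, 1347, 1348, 1496, 1497, 1496, 1688, 1688, 1688)
)
gp$Verify <- verify_step(gp$Cal, gp$Id)
gp$Verify
# [1] FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE

# Made-up data where the first value of group "b" happens to be
# one more than the last value of group "a"
toy <- data.frame(Id = c("a", "a", "b", "b"), Cal = c(10, 11, 12, 12))

c(0, diff(toy$Cal)) == 1       # ungrouped: flags the a -> b boundary
# [1] FALSE  TRUE  TRUE FALSE
verify_step(toy$Cal, toy$Id)   # grouped: the boundary row is not flagged
# [1] FALSE  TRUE FALSE FALSE
```

On the question's sample data both versions give the same answer, because no Id starts exactly 1 above where the previous Id ended. The grouped form only matters when that coincidence can occur.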