How can I speed up a loop that conditionally defines a vector?

Question

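I have the following code block, in which I build a vector inside a for loop, conditional on other vectors and a data frame. The loop iterates about 15,000 times, and I need to run this code many times (about 100). Right now it is very slow, so I am trying to make it faster. I realize that calling which() in every iteration of the loop is probably inefficient, but I am not sure how to change it. I have considered the apply() family of functions, but I am not sure whether they would help speed things up. I have also been thinking about vectorizing the computation instead of running a for loop. Thank you very much in advance for your time and help!

Here is an example data frame, temp_dat: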

   MONTH             ID  E
1      9 19951100023401 32
2      7 19951100023401 32
3      9 19951100023402 34
4      7 19951100023402 34
5      9 19951100023403 32
6      7 19951100023403 32
7      9 19951100023903 90
8      7 19951100023903 79
9      9 19951100024403 34
10     7 19951100024403 34

The code I am running is:

vector1 <- c()
x<- unique(temp_dat$ID)
for (a in 1:length(x)) {
  b = x[a]
  
  vector1[a] <- as.numeric(((temp_dat[which(temp_dat$ID == b & temp_dat$MONTH == 9),]$E %in% c(90,97)) & (temp_dat[which(temp_dat$ID == b & temp_dat$MONTH == 7),]$E %in% c(79,77))))
}

The resulting vector1 has the values:

0 0 0 1 0

Answer 1

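library(data.table)
temp_dat <- as.data.table(temp_dat)
# For each ID: 1 if some month-9 row has E in (90, 97)
# and some month-7 row has E in (79, 77), otherwise 0.
temp_dat[,
         as.integer(
           any(MONTH == 9 & (E %in% c(90,97))) &&
             any(MONTH == 7 & (E %in% c(79,77)))
           ),
         by = ID]$V1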
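
For comparison, the same grouped test can be sketched in base R without a loop or an extra package. This is only a minimal sketch: the data frame is rebuilt from the example in the question so the block runs on its own, and the helper names hit9, hit7, and ids are made up for illustration. It assumes the same "at least one matching row per ID" reading used above, which matches the loop in the question when each ID has exactly one row per month.

```r
# Example data from the question, rebuilt so the sketch is self-contained.
temp_dat <- data.frame(
  MONTH = rep(c(9, 7), times = 5),
  ID    = rep(c(19951100023401, 19951100023402, 19951100023403,
                19951100023903, 19951100024403), each = 2),
  E     = c(32, 32, 34, 34, 32, 32, 90, 79, 34, 34)
)

# Flag the qualifying rows once, over the whole data frame
# (hit9 and hit7 are illustrative names, not from the original post).
hit9 <- temp_dat$MONTH == 9 & temp_dat$E %in% c(90, 97)
hit7 <- temp_dat$MONTH == 7 & temp_dat$E %in% c(79, 77)

# IDs in order of first appearance, as unique(temp_dat$ID) gives in the loop.
ids <- unique(temp_dat$ID)

# An ID qualifies when it has at least one flagged row of each kind.
vector1 <- as.integer(ids %in% temp_dat$ID[hit9] & ids %in% temp_dat$ID[hit7])
vector1
```

On the example data this gives 0 0 0 1 0, the same as the loop. The row-level comparisons run once over all rows instead of once per ID, which is where the loop in the question spends most of its time.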

Tags: performance, for-loop, r, vectorization