Selecting a random row from a data.frame and assigning it to one of two other data.frames based on three conditions in R

Question

I have a data.frame (a) as shown below:

   V1 V2
1   a  b
2   a  e
3   a  f
4   b  c
5   b  e
6   b  f
7   c  d
8   c  g
9   c  h
10  d  g
11  d  h
12  e  f
13  f  g
14  g  h

Suppose each row represents an edge of a graph, and the values in the row are its vertices.
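For anyone who wants to reproduce the setup, the table above can be rebuilt as follows. This is only a convenience sketch; the name `vertices` is introduced here for illustration and is not part of the original question.

```r
# Edge list from the question: each row is one edge, V1 and V2 are its vertices.
a <- data.frame(
  V1 = c("a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "e", "f", "g"),
  V2 = c("b", "e", "f", "c", "e", "f", "d", "g", "h", "g", "h", "f", "g", "h"),
  stringsAsFactors = FALSE
)

# The vertex set is every distinct value that appears in either column.
vertices <- sort(unique(c(a$V1, a$V2)))
```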

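What I want is to select a random row (that is, an edge) from data.frame (a) and assign it to either data.frame (b) or data.frame (c) based on the following three conditions. To clarify, data.frames (b) and (c) are empty at the beginning. The conditions are: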

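Condition 1: When a random row (edge) is picked from data.frame (a) and neither of its two vertices has been assigned yet, assign the edge to the data.frame with the fewest rows.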

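To clarify this condition: suppose I pick random row (edge) #2 from data.frame (a), which has the two vertices "a" and "e". I should check whether "a" or "e" appears in any row of data.frame (b) or data.frame (c). If either of them contains "a" or "e", this rule should not be applied and the next rule should be checked instead. If neither data.frame has "a" or "e" in any row, then nrow() should be compared for the two data.frames, and the row should be assigned to the one with the smaller nrow(). If both have the same nrow(), the row can be assigned to either of the two data.frames.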
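The check described for the first condition could be sketched roughly like this. The helper `has_vertex`, the names `dfb`, `dfc` and `edge`, and the sample contents are all made up for illustration; breaking ties in favour of `dfb` is an arbitrary choice, since the question allows either.

```r
# Hypothetical helper: TRUE if vertex `v` appears in either column of `df`.
has_vertex <- function(df, v) any(df$V1 == v) || any(df$V2 == v)

# Illustrative state: dfb already holds two edges, dfc holds one.
dfb  <- data.frame(V1 = c("b", "c"), V2 = c("c", "d"), stringsAsFactors = FALSE)
dfc  <- data.frame(V1 = "g", V2 = "h", stringsAsFactors = FALSE)
edge <- data.frame(V1 = "a", V2 = "e", stringsAsFactors = FALSE)  # row #2 of a

# Condition 1 applies only if neither vertex occurs anywhere in dfb or dfc.
unassigned <- !has_vertex(dfb, edge$V1) && !has_vertex(dfb, edge$V2) &&
              !has_vertex(dfc, edge$V1) && !has_vertex(dfc, edge$V2)

if (unassigned) {
  # Assign to the data.frame with fewer rows (ties go to dfb in this sketch).
  if (nrow(dfb) <= nrow(dfc)) dfb <- rbind(dfb, edge) else dfc <- rbind(dfc, edge)
}
```

With this sample state the edge lands in `dfc`, because it has fewer rows than `dfb`.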

Condition 2: When a random row (edge) is picked from data.frame (a) and one of its vertices is present in either data.frame (b) or (c), assign the row (edge) to that data.frame.

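Suppose the random row is, for example, #3, which has "a" and "f". data.frames (b) and (c) should then be checked to see whether any row contains "a" or "f". Suppose data.frame (b) contains neither "a" nor "f", but data.frame (c) contains "f". Then the row should be assigned to data.frame (c). It is also possible that data.frame (b) contains "a" while data.frame (c) contains "f". In that case, all instances of "a" in data.frame (b) and of "f" in data.frame (c) should be counted. If "a" occurs 3 times and "f" occurs 4 times, the row should be assigned to (b); that is, the row goes to the data.frame in which its vertex has fewer instances.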
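The counting step for the second condition might look something like the sketch below. The helper `count_vertex` and the contents of `dfb` and `dfc` are invented so that "a" occurs 3 times in one and "f" 4 times in the other; they are not taken from the question's data. Sending ties to `dfb` is again an assumption.

```r
# Hypothetical helper: number of times vertex `v` occurs in either column of `df`.
count_vertex <- function(df, v) sum(df$V1 == v) + sum(df$V2 == v)

# Made-up state: "a" occurs 3 times in dfb, "f" occurs 4 times in dfc.
dfb <- data.frame(V1 = c("a", "a", "a"), V2 = c("b", "e", "d"),
                  stringsAsFactors = FALSE)
dfc <- data.frame(V1 = c("f", "f", "c", "f"), V2 = c("g", "h", "f", "i"),
                  stringsAsFactors = FALSE)
edge <- data.frame(V1 = "a", V2 = "f", stringsAsFactors = FALSE)

# Occurrences of the edge's vertices in each candidate data.frame.
in_b <- count_vertex(dfb, edge$V1) + count_vertex(dfb, edge$V2)
in_c <- count_vertex(dfc, edge$V1) + count_vertex(dfc, edge$V2)

target <- if (in_c == 0) {
  "dfb"                                  # only dfb knows one of the vertices
} else if (in_b == 0) {
  "dfc"                                  # only dfc knows one of the vertices
} else if (in_b <= in_c) {
  "dfb"                                  # both do: fewer occurrences wins
} else {
  "dfc"
}
```

Here `target` ends up as "dfb", matching the 3-versus-4 example in the text.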

Condition 3: When a random row (edge) is picked from data.frame (a) and both of its vertices are present in one of the data.frames, assign the row to that data.frame.

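To summarize: a row should be picked at random from data.frame (a), checked against the conditions above, and then assigned to data.frame (b) or (c) accordingly. The conditions must therefore be checked for every row of data.frame (a).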

Answer 1

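This should help you get started. As you have found, you cannot keep selecting rows at random, because that leads to duplicates. Instead, randomly assign the rows to a vector that gives the order in which to process them. If you don't think that is the right approach, you could also select a row at random, remove it from a, and then select at random from what is left. If you still need a, remove the row from a copy of a.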

set.seed(1)
dfa <- data.frame(V1 = sample(letters[1:9], replace = TRUE), V2 = sample(letters[1:9], replace = TRUE))

todo <- sample(1:nrow(dfa), nrow(dfa), replace = FALSE)

dfb <- dfa[todo[1],]
dfc <- dfa[todo[2],]

Now work through 'todo' in order, applying your conditions and using rbind to add rows to dfb and dfc:

for (i in 3:length(todo)) {

    # apply your logic
    # if a row belongs in dfb, do
    dfb <- rbind(dfb, dfa[todo[i],])
    # etc
}
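As one possible way to fill in the "apply your logic" placeholder, the sketch below puts the three conditions from the question into a single loop. It is an interpretation, not the answerer's code: `assign_edges` and `count_vertex` are hypothetical names, both target data.frames start empty (rather than being seeded with the first two rows), condition 3 is checked before the others, and ties go to `dfb`.

```r
# Hypothetical helper: number of times vertex `v` occurs in either column of `df`.
count_vertex <- function(df, v) sum(df$V1 == v) + sum(df$V2 == v)

assign_edges <- function(dfa) {
  todo <- sample(nrow(dfa))   # random processing order, each row exactly once
  dfb <- dfa[0, ]             # both targets start empty, as in the question
  dfc <- dfa[0, ]
  for (i in todo) {
    edge <- dfa[i, ]
    v1 <- as.character(edge$V1)
    v2 <- as.character(edge$V2)
    b1 <- count_vertex(dfb, v1); b2 <- count_vertex(dfb, v2)
    c1 <- count_vertex(dfc, v1); c2 <- count_vertex(dfc, v2)
    to_b <- if (b1 > 0 && b2 > 0) {
      TRUE                            # condition 3: both vertices already in dfb
    } else if (c1 > 0 && c2 > 0) {
      FALSE                           # condition 3: both vertices already in dfc
    } else if (b1 + b2 + c1 + c2 == 0) {
      nrow(dfb) <= nrow(dfc)          # condition 1: neither vertex assigned yet
    } else if (c1 + c2 == 0) {
      TRUE                            # condition 2: only dfb has a vertex
    } else if (b1 + b2 == 0) {
      FALSE                           # condition 2: only dfc has a vertex
    } else {
      (b1 + b2) <= (c1 + c2)          # condition 2: fewer occurrences wins
    }
    if (to_b) dfb <- rbind(dfb, edge) else dfc <- rbind(dfc, edge)
  }
  list(dfb = dfb, dfc = dfc)
}

# Usage with the edge list from the question.
set.seed(1)
dfa <- data.frame(
  V1 = c("a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "e", "f", "g"),
  V2 = c("b", "e", "f", "c", "e", "f", "d", "g", "h", "g", "h", "f", "g", "h"),
  stringsAsFactors = FALSE
)
parts <- assign_edges(dfa)
```

Every edge ends up in exactly one of `parts$dfb` and `parts$dfc`; which one depends on the random order, so a different seed generally gives a different split.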

Answer 2

aCopy<-read.table("isnodes.txt")
p1<-aCopy[-c(1:nrow(aCopy)),]
p2<-aCopy[-c(1:nrow(aCopy)),]
currentRowHistory<-aCopy[-c(1:nrow(aCopy)),]

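# 'a' is never defined in this script; iterate once per row of aCopy instead.
# aCopy shrinks inside the loop, but the iteration count is fixed here, before the first pass.
for (i in seq_len(nrow(aCopy))) {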
currentRow <- aCopy[sample(nrow(aCopy), 1), ]
currentRowHistory <- rbind(currentRow,currentRowHistory)
currentRowV1 <- as.character(currentRow$V1[1])
currentRowV2 <- as.character(currentRow$V2[1])
aCopy <- aCopy[!(aCopy$V1 == currentRowV1 & aCopy$V2 == currentRowV2),]

if(length(which(currentRowV1 == p1$V1)) | length(which(currentRowV1 == p1$V2))){
    if(length(which(currentRowV2 == p1$V1)) | length(which(currentRowV2 == p1$V2))){
 p1<-rbind(currentRow,p1)
        result <- "case 1 assign it to p1"
    }
    else if(length(which(currentRowV2 == p2$V1)) | length(which(currentRowV2 == p2$V2))){
 V1occurances <- length(which(p1$V1 == currentRowV1))+length(which(p1$V2==currentRowV1))
 V2occurances <- length(which(p2$V1 == currentRowV2))+length(which(p2$V2==currentRowV2))
 if (V1occurances < V2occurances) p1 <- rbind(currentRow, p1) else p2 <- rbind(currentRow, p2)
 result <- "case 2"
    }
    else {
 p1<-rbind(currentRow,p1)
        result <- "case 3 assign it to p1"
    }
} else if(length(which(currentRowV1 == p2$V1)) | length(which(currentRowV1 == p2$V2))){
    if(length(which(currentRowV2 == p2$V1)) | length(which(currentRowV2 == p2$V2))){
 p2<-rbind(currentRow,p2)
        result <- "case 1 assign it to p2"
    }
    else if(length(which(currentRowV2 == p1$V1)) | length(which(currentRowV2 == p1$V2))){
 V1occurancesInP2 <- length(which(p2$V1 == currentRowV1))+length(which(p2$V2==currentRowV1))
 V2occurancesInP1 <- length(which(p1$V1 == currentRowV2))+length(which(p1$V2==currentRowV2))
 if (V1occurancesInP2 < V2occurancesInP1) p2 <- rbind(currentRow, p2) else p1 <- rbind(currentRow, p1)
        result <- "case 2"
    }
    else {
 p2<-rbind(currentRow,p2)
        result <- "case 3 assign it to p2"
    }
} else if(length(which(currentRowV2 == p1$V1)) | length(which(currentRowV2 == p1$V2))){
    p1<-rbind(currentRow,p1)
    result <- "Assign it to p1 case 3"
} else if(length(which(currentRowV2 == p2$V1)) | length(which(currentRowV2 == p2$V2))){
 p2<-rbind(currentRow,p2)
    result <- "Assign it to p2 case 3"
} else {
    if (nrow(p1) < nrow(p2)) p1 <- rbind(currentRow, p1) else p2 <- rbind(currentRow, p2)

}
}

