Moving a factor variable up one row relative to another
I'm not quite sure how to go about this.
Here is a sample dataset:
Bob <- sample("Bob", 6, replace = TRUE)
Jeff <- sample("Jeff", 6, replace = TRUE)
Carl <- sample("Carl", 6, replace = TRUE)
Name <- array(c(Bob, Jeff, Carl), dim = c(18,1))
Week <- c("Week 1", "Week 2", "Week 3", "Week 4", "Week 5", "Week 6",
"Week 1", "Week 2", "Week 3", "Week 4", "Week 5", "Week 6",
"Week 1", "Week 2", "Week 3", "Week 4", "Week 5", "Week 6")
variable.1 <- c("No", "No", "No", "Yes", "No", "No",
"Yes", "No", "No", "No", "Yes", "No",
"No", "Yes", "No", "No", "No", "Yes")
df <- data.frame(Name, Week, variable.1)
df
Name Week variable.1
1 Bob Week 1 No
2 Bob Week 2 No
3 Bob Week 3 No
4 Bob Week 4 Yes
5 Bob Week 5 No
6 Bob Week 6 No
7 Jeff Week 1 Yes
8 Jeff Week 2 No
9 Jeff Week 3 No
10 Jeff Week 4 No
11 Jeff Week 5 Yes
12 Jeff Week 6 No
13 Carl Week 1 No
14 Carl Week 2 Yes
15 Carl Week 3 No
16 Carl Week 4 No
17 Carl Week 5 No
18 Carl Week 6 Yes
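What I'd like to do is move any "Yes" in the variable.1 column up one row, so that it reflects, as a factor variable, information about the previous week. I'm trying to do this per individual (not across the whole dataset). With both variables being factors, I can't figure out the best way to approach it. Ideally, I'd like NAs to appear. I don't want everything to simply shift up. I only want an NA to appear where the "Yes" was, and for the "Yes" to overwrite the "No" above it.
So I'd like the finished product to look like "New.Col" below: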
Name Week variable.1 New.Col
1 Bob Week 1 No No
2 Bob Week 2 No No
3 Bob Week 3 No Yes
4 Bob Week 4 Yes NA
5 Bob Week 5 No No
6 Bob Week 6 No No
7 Jeff Week 1 Yes NA
8 Jeff Week 2 No No
9 Jeff Week 3 No No
10 Jeff Week 4 No Yes
11 Jeff Week 5 Yes NA
12 Jeff Week 6 No No
13 Carl Week 1 No Yes
14 Carl Week 2 Yes NA
15 Carl Week 3 No No
16 Carl Week 4 No No
17 Carl Week 5 No Yes
18 Carl Week 6 Yes NA
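Let's give this a try.
I'll go ahead and sort df by Name and Week in case any of the data is out of order. (This does nothing about missing weeks!) I'll also copy variable.1 into newcol as a character vector to work with.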
df <- df[order(df$Name, df$Week),]
df$newcol <- as.character(df$variable.1)
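One caveat about the sort: Week is text, so it is ordered alphabetically, and "Week 10" would land before "Week 2". The example only has six weeks, so it is not affected. A minimal sketch of sorting on the numeric part instead, assuming every label has the form "Week <number>" (the `toy` data frame and the `week_num` column are hypothetical, not part of the original answer):

```r
# Toy data with a two-digit week, deliberately out of order
toy <- data.frame(Name = "Bob",
                  Week = c("Week 10", "Week 2", "Week 1"),
                  stringsAsFactors = FALSE)

# Pull out the number after "Week " and sort on that instead of the text
toy$week_num <- as.integer(sub("^Week ", "", toy$Week))
toy <- toy[order(toy$Name, toy$week_num), ]
toy
```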
To make this easier to follow I'll write a loop, although computationally there are better ways to do it. The loop goes through each unique person in df$Name.
for (person in unique(df$Name)) {
  # the steps described below go inside this loop
}
Inside the loop, I want to select all of that person's entries in newcol.
oldvalues <- df[df$Name == person, ]$newcol
Then I shift every value up by one entry and make the last entry NA.
newvalues <- c(oldvalues[-1], NA)  # drop the first entry, pad the end with NA
I also want to account for every week where the old value was "Yes" by setting that week to NA.
newvalues[oldvalues == "Yes"] <- NA
Then I can put it back into df.
df[df$Name == person,]$newcol <- newvalues
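Putting the steps above together, the whole loop might look like the following sketch. It rebuilds the example data so that it runs on its own. The shift is written as `oldvalues[-1]` (everything except the first entry), which also copes with a person who has only one row, and the `rows` variable is a hypothetical helper name.

```r
# Rebuild the example data from the question
df <- data.frame(
  Name = rep(c("Bob", "Jeff", "Carl"), each = 6),
  Week = rep(paste("Week", 1:6), times = 3),
  variable.1 = factor(c("No", "No", "No", "Yes", "No", "No",
                        "Yes", "No", "No", "No", "Yes", "No",
                        "No", "Yes", "No", "No", "No", "Yes")),
  stringsAsFactors = FALSE
)

# Sort, then work on a character copy of the factor
df <- df[order(df$Name, df$Week), ]
df$newcol <- as.character(df$variable.1)

for (person in unique(df$Name)) {
  rows <- df$Name == person              # this person's rows
  oldvalues <- df$newcol[rows]
  newvalues <- c(oldvalues[-1], NA)      # shift up one, pad with NA
  newvalues[oldvalues == "Yes"] <- NA    # blank the week that held "Yes"
  df$newcol[rows] <- newvalues
}
df
```

Run on the example, the last week of every person comes out as NA because there is no following week to pull a value from, so Bob's and Jeff's Week 6 differ from the "No" shown in New.Col in the question. The sort also reorders the rows to Bob, Carl, Jeff.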
Now that the loop is done, you can turn df$newcol back into a factor, which by default does not include NA as a level
df$newcol <- factor(df$newcol)
or make NA a third factor level
df$newcol <- factor(df$newcol, exclude = NULL)
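The answer notes that a loop is not the most efficient route. One loop-free sketch, which is my own assumption rather than the answerer's method, uses base R's `ave()` to apply a per-person helper (`shift_yes_up`, a hypothetical name). It moves only the "Yes" values instead of shifting everything, and it assumes the rows of each person are already in week order.

```r
# Rebuild the example data from the question
df <- data.frame(
  Name = rep(c("Bob", "Jeff", "Carl"), each = 6),
  Week = rep(paste("Week", 1:6), times = 3),
  variable.1 = factor(c("No", "No", "No", "Yes", "No", "No",
                        "Yes", "No", "No", "No", "Yes", "No",
                        "No", "Yes", "No", "No", "No", "Yes")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: within one person's values, blank out each "Yes"
# and write "Yes" into the row directly above it (if there is one).
shift_yes_up <- function(x) {
  yes <- which(x == "Yes")
  out <- x
  out[yes] <- NA
  above <- yes[yes > 1] - 1   # a "Yes" in the first row has nowhere to go
  out[above] <- "Yes"
  out
}

# ave() applies the helper to each Name group and keeps the row order
df$New.Col <- ave(as.character(df$variable.1), df$Name, FUN = shift_yes_up)
df$New.Col <- factor(df$New.Col)
df
```

With this variant the last week of each person keeps its "No", so the result should line up with the New.Col column requested in the question.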