How to iteratively populate values in a dataframe using conditional tests
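I'm struggling with what I originally thought would be a simple task, but my R coding skills are evidently rusty. Put simply, I have a data frame set up in R called "test". X1 is a factor and X2 is a column of null values: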
    X1   X2
    F1   .
    F1   .
    F2   .
    F3   .
    F3   .
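My ultimate goal is a function or procedure that iterates over each level of the factor, asks the user for a value with which to fill X2 over the current level of X1, and then moves on to the next level of X1.
How would you program this?
My problem lies in the loop itself. Either the loop does not overwrite the values of X2 (I assume it is being treated as a local variable), or I get a "condition has length >1" error. Here are a couple of versions I have tried: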
someValue <- 0
for (i in levels(test$X1)) {
  if (identical(test$X1, i)) {
    test$X2 <- someValue
  }
  someValue + 1
}
# This doesn't seem to overwrite X2

someValue <- 0
for (i in levels(test$X1)) {
  if (test$X1 == i) {
    test$X2 <- someValue
  }
  someValue + 1
}
# This throws the 'condition has length >1' warning. I understand why this is happening.
# However, ifelse isn't an option because I want it to do nothing
# and iterate to the next level of i if false.
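I would rather not use a lookup table or a join for this, since that would waste the time I am trying to save by writing this code. But clearly I am not competent enough with loops in R!
This function does what you describe in your question: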
fillfac <- function(vec) {
  fill <- character(length(vec))
  # "iterate over each level of the factor"
  for (i in levels(vec)) {
    # "ask the user for a value with which to fill X2
    #  over the current level of X1"
    print(paste("What should be the fill for", i, "?"))
    value <- scan(what = "character", n = 1)
    # Compare the level labels themselves; labels(vec) returns element
    # names or indices, which only match when the levels are "1", "2", ...
    fill[as.character(vec) == i] <- value
  }
  return(fill)
}
Example:
> X1 = factor(sample(1:5, size = 20, replace = TRUE))
> X2 <- fillfac(X1)
[1] "What should be the fill for 1 ?"
1: "one"
Read 1 item
[1] "What should be the fill for 2 ?"
1: "two"
Read 1 item
[1] "What should be the fill for 3 ?"
1: "three"
Read 1 item
[1] "What should be the fill for 4 ?"
1: "four"
Read 1 item
[1] "What should be the fill for 5 ?"
1: "five"
Read 1 item
> (df <- data.frame(X1, X2))
   X1    X2
1   1   one
2   3 three
3   1   one
4   2   two
5   5  five
6   3 three
7   3 three
8   4  four
9   2   two
10  3 three
11  2   two
12  3 three
13  4  four
14  5  five
15  2   two
16  1   one
17  2   two
18  2   two
19  5  five
20  4  four
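As for why the two loops in the question did not work: `identical(test$X1, i)` compares the entire factor with a single string, so it is always FALSE; `if (test$X1 == i)` hands `if` a whole logical vector rather than one TRUE/FALSE; `test$X2 <- someValue` would overwrite every row of the column; and `someValue+1` computes a result without storing it. A minimal sketch of the same loop using logical indexing follows. The helper name `fill_by_level` and its `ask` argument are hypothetical (not part of the original post), and NA is assumed to stand in for the empty X2 values:

```r
# Rebuild the data frame from the question; NA stands in for the empty X2 values.
test <- data.frame(X1 = factor(c("F1", "F1", "F2", "F3", "F3")), X2 = NA)

# Hypothetical helper: fills X2 one factor level at a time.
# `ask` is a function that returns the fill value for a given level.
# The default prompts at the console; a non-interactive function can be passed instead.
fill_by_level <- function(df,
                          ask = function(lev) readline(paste0("Fill for ", lev, ": "))) {
  for (lev in levels(df$X1)) {
    rows <- which(df$X1 == lev)  # rows that belong to the current level
    df$X2[rows] <- ask(lev)      # assign only to those rows, not the whole column
  }
  df                             # return the modified copy of the data frame
}

# Non-interactive demonstration with a counter, mirroring someValue in the question.
someValue <- 0
test <- fill_by_level(test, ask = function(lev) {
  someValue <<- someValue + 1    # store the incremented value this time
  someValue
})
print(test)
```

Because R copies data frames passed to a function, the result has to be assigned back (`test <- fill_by_level(test, ...)`). Calling `fill_by_level(test)` with the default `ask` prompts once per level instead of using the counter.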