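mapply giving non-numeric argument to binary operator error in R
I am trying to generate a variable that flags implausible age differences between respondents and their parents in a household roster. mapply() fails with "non-numeric argument to binary operator", although I do not get this error when I apply the function to just two columns. Any help is appreciated. Below is my attempt at a reproducible example.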
# Variables
respbirth <- c(1974, 1950, 1990, 1980 )
B1010_1 <- c(1950, 1960, 1960, 1979 )
B1040_1 <- c(3,3,3,3)
B1010_2 <- c(1974, NA, NA, 1975 )
B1040_2 <- c(3,1,3,3)
# Data frame
df <- data.frame(respbirth, B1010_1, B1040_1, B1010_2, B1040_2 )
df
# Generate empty variable for flagging cases
df$flag_parent <- FALSE
## Generate a function flagging implausible differences using year of birth
attach(df) # the function doesn't work without this for some reason
imp.parent <- function(data=df,parentAge=B1010_1,relationship=B1040_1) {
df$flag_parent <- with(df, ((respbirth-parentAge)<18) & (relationship==3))
return(df)
}
# Test
df <- imp.parent(parentAge=B1010_1,relationship=B1040_1)
# Apply this function to all columns
parentAge <- c(paste0("B1010_",1:19, sep=""))
relationship <- c(paste0("B1040_",1:19, sep=""))
mapply(imp.parent, parentAge, relationship )
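Your current mapply attempt has a number of problems, involving the argument types, the function call, and the return value. The immediate cause of the error is that mapply() passes the column names as character strings and matches them by position, so the first vector lands in data, the second in parentAge, and respbirth - parentAge tries to subtract a string from a number. Your two-column test worked because it passed the attached vectors themselves rather than their names. Rather than list every remaining problem, consider the following refactored code.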
# Generate a function flagging implausible differences using year of birth
imp.parent <- function(parentAge, relationship, data=df) {
  # parentAge and relationship are column names; look the columns up in `data`
  ((data$respbirth - data[[parentAge]]) < 18) & (data[[relationship]] == 3)
}
# Apply this function to all columns
parentAge <- c(paste0("B1010_", 1:2))
relationship <- c(paste0("B1040_", 1:2))
# Assign columns True/False
df[paste0("false_flag_", 1:2)] <- mapply(imp.parent, parentAge, relationship )
df
# respbirth B1010_1 B1040_1 B1010_2 B1040_2 false_flag_1 false_flag_2
# 1 1974 1950 3 1974 3 FALSE TRUE
# 2 1950 1960 3 NA 1 TRUE FALSE
# 3 1990 1960 3 NA 3 FALSE NA
# 4 1980 1979 3 1975 3 TRUE TRUE
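To see where the original error comes from, here is a minimal sketch that reuses two of the question's columns. The names col_name, err and age_gap are only illustrative and not part of the original code. It contrasts doing arithmetic on a column name with doing arithmetic on the column that the name refers to.

```r
# Toy data copied from the question
respbirth <- c(1974, 1950, 1990, 1980)
B1010_1 <- c(1950, 1960, 1960, 1979)
df <- data.frame(respbirth, B1010_1)

# A column *name* is a character string, not the column itself
col_name <- "B1010_1"

# Subtracting the name fails; capture the error message instead of stopping
err <- tryCatch(df$respbirth - col_name, error = function(e) conditionMessage(e))
print(err)

# Looking the column up by name with [[ yields numbers, so arithmetic works
age_gap <- df$respbirth - df[[col_name]]
print(age_gap)
```

In an English locale err should hold the same "non-numeric argument to binary operator" message as in the question, while age_gap is an ordinary numeric vector. This is why the refactored function indexes the data frame with [[ instead of using the strings directly.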
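In fact, you do not need mapply (a hidden loop) at all! R can evaluate logical conditions across equal-length blocks of columns, which allows a vectorized assignment to a whole block of columns: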
# Build the column names to compare
parentAge <- c(paste0("B1010_", 1:2))
relationship <- c(paste0("B1040_", 1:2))
# NOTICE USE OF `[` (NOT `[[`)
df[paste0("false_flag_", 1:2)] <- ((df$respbirth - df[parentAge]) < 18) & (df[relationship] == 3)
df
# respbirth B1010_1 B1040_1 B1010_2 B1040_2 false_flag_1 false_flag_2
# 1 1974 1950 3 1974 3 FALSE TRUE
# 2 1950 1960 3 NA 1 TRUE FALSE
# 3 1990 1960 3 NA 3 FALSE NA
# 4 1980 1979 3 1975 3 TRUE TRUE
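The question builds names for 19 roster slots and starts from a single flag_parent column, so the vectorized approach could be extended roughly as follows. This is only a sketch under assumptions: that some of the 19 slots may be missing from the data, and that a missing parent birth year should count as "not flagged". The names slots, has_pair and flags are illustrative.

```r
# Toy data copied from the question
df <- data.frame(
  respbirth = c(1974, 1950, 1990, 1980),
  B1010_1 = c(1950, 1960, 1960, 1979),
  B1040_1 = c(3, 3, 3, 3),
  B1010_2 = c(1974, NA, NA, 1975),
  B1040_2 = c(3, 1, 3, 3)
)

# The question uses slots 1:19; keep only slots whose two columns both exist
# (assumption: absent slots should simply be skipped)
slots <- 1:19
has_pair <- paste0("B1010_", slots) %in% names(df) &
  paste0("B1040_", slots) %in% names(df)
slots <- slots[has_pair]

parentAge <- paste0("B1010_", slots)
relationship <- paste0("B1040_", slots)

# Logical matrix with one column per roster slot
flags <- ((df$respbirth - df[parentAge]) < 18) & (df[relationship] == 3)

# Single summary flag: TRUE if any slot is flagged
# (assumption: NA is treated as "not flagged" via na.rm = TRUE)
df$flag_parent <- unname(rowSums(flags, na.rm = TRUE) > 0)
print(df$flag_parent)
```

For the toy data this flags rows 1, 2 and 4. Whether NA should count as unflagged or be propagated is a design choice that depends on how missing birth years are handled in the survey.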