Error with oaxaca package in R: non-conformable arguments
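I'm trying to run an Oaxaca decomposition using the oaxaca package, but including certain variables seems to trigger the error "non-conformable arguments". As far as I can tell so far, the error only occurs when certain factor/categorical variables are included, but not all of them.
Here is a minimal reproducible example of my dataset, wvs_reduc: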
wvs_reduc <- structure(list(emp = c(1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0,
1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
0, 0, 0, 0, 0, 0), education = structure(c(4L, 3L, 2L, 2L, 3L,
3L, 2L, 6L, 4L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 4L, 4L, 1L, 2L, 4L,
4L, 4L, 4L, 4L, 4L, 3L, 4L, 4L, 4L, 4L, 3L, 2L, 4L, 4L, 4L, 3L,
2L, 4L, 3L), .Label = c("No Formal Education", "Primary or Less",
"Incomplete Secondary", "Secondary", "Incomplete University",
"University or More"), class = "factor"), marital = structure(c(1L,
1L, 3L, 3L, 1L, 3L, 3L, 1L, 1L, 3L, 3L, 1L, 3L, 4L, 3L, 1L, 1L,
4L, 3L, 1L, 3L, 4L, 1L, 3L, 3L, 3L, 3L, 1L, 3L, 4L, 4L, 4L, 4L,
3L, 3L, 4L, 3L, 3L, 4L, 3L), .Label = c("single", "cohabiting",
"married", "previously married"), class = "factor"), Arab = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-40L), class = c("tbl_df", "tbl", "data.frame"))
When I run the command:
library(oaxaca)
oaxaca(emp ~ education + marital | Arab,
data = wvs_reduc, group.weights = 0, R = 10)
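I get the error message: Error in t(x.mean.A) %*% delta.A : non-conformable arguments
In case it is relevant, when I run the command on my larger dataset, I instead get a similar but not identical error, and it appears when I include the variable "marital" but not "education" or the other factor variables:
Error in t(x.mean.A - x.mean.B) %*% beta.B : non-conformable arguments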
If you look at the underlying code, oaxaca:::.oaxaca.wrap, the part that goes wrong is these lines:
E <- as.numeric(t(x.mean.A - x.mean.B) %*% beta.B)
C <- as.numeric(t(x.mean.B) %*% (beta.A - beta.B))
I <- as.numeric(t(x.mean.A - x.mean.B) %*% (beta.A - beta.B))
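These products throw the error whenever the x.mean vectors and the beta vectors do not have matching lengths, which can happen when a factor level is missing from one of the groups. Looking at your design in this example dataset: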
table(wvs_reduc$education,wvs_reduc$Arab)
                          0  1
  No Formal Education     0  2
  Primary or Less         2 10
  Incomplete Secondary    4  3
  Secondary              14  4
  Incomplete University   0  0
  University or More      0  1
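To make the table above easier to act on, here is a minimal sketch that rebuilds the education-by-Arab counts shown in the table and lists the levels that are empty in at least one group. The `toy` data frame and variable names are illustrative, not part of oaxaca. It also shows that, if unused levels are dropped within each group (as lm() does by default; whether oaxaca does exactly this internally is an assumption), the two groups end up with design matrices of different widths, which is the kind of mismatch %*% rejects.

```r
# Rebuild the education x Arab counts from the table above (toy data, illustrative only).
edu_levels <- c("No Formal Education", "Primary or Less", "Incomplete Secondary",
                "Secondary", "Incomplete University", "University or More")
counts_arab0 <- c(0, 2, 4, 14, 0, 0)
counts_arab1 <- c(2, 10, 3, 4, 0, 1)

toy <- data.frame(
  education = factor(c(rep(edu_levels, times = counts_arab0),
                       rep(edu_levels, times = counts_arab1)),
                     levels = edu_levels),
  Arab = rep(c(0, 1), times = c(sum(counts_arab0), sum(counts_arab1)))
)

# Levels with a zero count in at least one group.
tab <- table(toy$education, toy$Arab)
empty_in_some_group <- rownames(tab)[rowSums(tab == 0) > 0]
print(empty_in_some_group)

# Width of the design matrix per group once unused levels are dropped.
n_cols <- sapply(split(toy, toy$Arab), function(d) {
  ncol(model.matrix(~ education, data = droplevels(d)))
})
print(n_cols)

# Vectors of different lengths cannot be multiplied with %*%.
res <- tryCatch(t(rep(1, n_cols[["1"]])) %*% rep(1, n_cols[["0"]]),
                error = function(e) "error")
print(res)
```

The two groups produce design matrices with different numbers of columns, so any step that pairs one group's means with the other group's coefficients has nothing conformable to multiply.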
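So the levels with zero counts get dropped, and I would say you need to make sure every level is represented in each of your grouping categories. We can confirm this by simulating the variables: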
set.seed(111)
wvs_reduc$test_education <- sample(levels(wvs_reduc$education), nrow(wvs_reduc), replace = TRUE)
wvs_reduc$test_marital <- sample(levels(wvs_reduc$marital), nrow(wvs_reduc), replace = TRUE)
We run this with the bootstrap turned off:
oaxaca(emp ~ test_education + test_marital | Arab, data=wvs_reduc,R=NULL)
If we turn the bootstrap on, it crashes, because the resampling runs into the same error:
oaxaca(emp ~ test_education + test_marital | Arab, data=wvs_reduc,R=2)
oaxaca: oaxaca() performing analysis. Please wait.
Bootstrapping standard errors:
1 / 2 (50%)
Error in t(x.mean.A) %*% delta.A : non-conformable arguments
In addition: There were 11 warnings (use warnings() to see them)
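One way to see why resampling is fragile here: a bootstrap draw samples rows with replacement, so a level that is rare in one group can vanish from that group in a given replicate. The sketch below estimates how often that happens for a simulated factor like the one above. It resamples within each group, which is an assumption about the resampling scheme rather than a description of what oaxaca does internally, and the `sim` data frame and the `has_empty_cell()` and `one_replicate()` helpers are hypothetical.

```r
set.seed(111)
edu_levels <- c("No Formal Education", "Primary or Less", "Incomplete Secondary",
                "Secondary", "Incomplete University", "University or More")
n <- 40

# Simulated factor with levels assigned at random, 20 rows per group.
sim <- data.frame(
  test_education = factor(sample(edu_levels, n, replace = TRUE), levels = edu_levels),
  Arab = rep(c(1, 0), each = n / 2)
)

# TRUE if any level has no observations in at least one group.
has_empty_cell <- function(d) {
  any(table(d$test_education, d$Arab) == 0)
}

# Draw one bootstrap replicate (rows resampled within each group) and check it.
one_replicate <- function(d) {
  idx <- unlist(lapply(split(seq_len(nrow(d)), d$Arab), function(i) {
    i[sample.int(length(i), replace = TRUE)]
  }))
  has_empty_cell(d[idx, , drop = FALSE])
}

# Share of replicates in which some level is missing from some group.
frac_empty <- mean(replicate(500, one_replicate(sim)))
print(frac_empty)
```

If this share is well above zero, a run with even a few bootstrap replicates is likely to hit a replicate with a missing level.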
So to get it working on your full data frame, you need to check whether any levels have n = 1 (taking the groups into account).
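A minimal sketch of that check, using only base R. `sparse_levels()` and `collapse_levels()` are hypothetical helpers, not part of oaxaca, the toy `marital` and `group` vectors are made up for illustration, and the threshold of 2 observations per group is an assumption you may want to raise. Whether merging rare levels into an "Other" category is appropriate depends on your analysis.

```r
# Return the levels of x that have fewer than min_n observations in any group.
sparse_levels <- function(x, group, min_n = 2) {
  tab <- table(x, group)
  rownames(tab)[rowSums(tab < min_n) > 0]
}

# Merge the given levels into a single new level.
collapse_levels <- function(x, to_merge, new_level = "Other") {
  x <- factor(x)
  levels(x)[levels(x) %in% to_merge] <- new_level
  x
}

# Toy input (illustrative only): two levels occur once per group.
marital <- factor(rep(c("single", "married", "cohabiting", "widowed"),
                      times = c(6, 6, 2, 2)))
group <- rep(c(0, 1), length.out = length(marital))

rare <- sparse_levels(marital, group)
print(rare)

# After merging, re-check that no sparse levels remain.
marital2 <- collapse_levels(marital, rare)
print(table(marital2, group))
print(sparse_levels(marital2, group))
```

On your own data you would call `sparse_levels(wvs_reduc$marital, wvs_reduc$Arab)` and likewise for each factor in the formula before calling oaxaca().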