PLM regression with log variables returning non-finite values error when there are no null or NA values in the data

Question

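I am using the plm package to analyze my panel data, which covers a set of states over 14 years. While running plm regressions I repeatedly hit the error "model matrix or response contains non-finite values", but I eventually resolved those cases by removing the observations with null or NA values. However, I am now running this regression: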


mod_3.1_within_log_b <- plm(log(PIB) ~ txinad + prod + op + emp + log(RT) + log (DC) + log(DK) + Gini + I(log(DC)*Gini) + I(log(DK)*Gini), data = dd, effect = 'individual')

summary (mod_3.1_within_log_b)

which returns

Error in model.matrix.pdata.frame(data, rhs=1, model=model, effect=effect,
model matrix or response contains non-finite values (NA/NaN/inf/-inf)

However, as I said, my data no longer contains null or NA values. Just to test this, I ran two separate regressions:


mod_3.1_within_log_b <- plm(log(PIB) ~ txinad + prod + op + emp + log(RT) + log (DC) + Gini + I(log(DC)*Gini) + I(log(DK)*Gini), data = dd, effect = 'individual')

and


mod_3.1_within_log_b <- plm(log(PIB) ~ txinad + prod + op + emp + log(RT) + log(DK) + Gini + I(log(DC)*Gini) + I(log(DK)*Gini), data = dd, effect = 'individual')

summary (mod_3.1_within_log_b)

Both work, which suggests that I only get the error when log(DK) and log(DC) are included together.

Thanks in advance!

Answer 1

As @StupidWolf suggested in the comments, your model matrix probably contains zero or negative values (log(-1) returns NaN and log(0) returns -Inf).
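A quick base R illustration of why removing NA rows is not enough: the log of zero is -Inf, which is.na() (and therefore complete.cases()) does not flag, whereas is.finite() does. The vector x is made up for this example.

```r
# Made-up values: one positive, one zero, one negative, one missing
x <- c(2, 0, -1, NA)
lx <- suppressWarnings(log(x))  # log(-1) would warn "NaNs produced"

lx             # 0.6931472, -Inf, NaN, NA
is.na(lx)      # FALSE FALSE  TRUE  TRUE   (-Inf slips through)
is.finite(lx)  #  TRUE FALSE FALSE FALSE   (flags all three problem values)
```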

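plm does not handle this by dropping the incomplete observations for you, but we can do it manually by inspecting the model matrix that would be used (or by looking at the raw data). Without the full data, this is only a suggestion for checking the model matrix for a few simple problems.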

Note that I have shortened the formula for readability.

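## Build the model frame without dropping rows
## (the default na.action would silently omit NA/NaN rows)
mf <- model.frame(~ txinad + prod + op + emp + log(RT) +
                    (log(DC) + log(DK)) * Gini,
                  data = dd, na.action = na.pass)
mm <- model.matrix(attr(mf, "terms"), data = mf)
## Flag rows with any non-finite entry (NA, NaN, Inf or -Inf);
## complete.cases() alone would miss Inf and -Inf
if(any(bad <- rowSums(!is.finite(mm)) > 0)){
    cat('Rows in dd causing trouble:\n')
    print(dd[bad, ])
}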

This prints any rows of dd that cause problems in the model matrix.
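The raw columns can also be checked directly. The sketch below assumes, as in the question's formula, that PIB, RT, DC and DK are the columns being logged. The small dd here is a made-up stand-in so the snippet runs on its own, and logged, vals, bad and dd_ok are hypothetical names.

```r
# Made-up stand-in for the real data (only the logged columns)
dd <- data.frame(PIB = c(10, 12, 9), RT = c(3, 5, 4),
                 DC  = c(1, 0, 2),   DK = c(2, 3, -1))

logged <- c("PIB", "RT", "DC", "DK")
vals   <- as.matrix(dd[logged])

# TRUE for rows where any logged column is missing, infinite, zero or negative
bad <- rowSums(!is.finite(vals) | vals <= 0) > 0

dd[bad, ]            # rows to inspect
dd_ok <- dd[!bad, ]  # rows on which every log() above is finite
```

Dropping the flagged rows changes the sample, so it is worth finding out why those values are zero or negative before passing dd_ok to plm.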


r

plm