Problem with GLM when regressing on the product of one numeric variable and one categorical variable
I want to fit a logistic regression for the following model:
regression <- Y ~
    netSales + size + CashAssetRatio + FRNG +
    I(insolvency * countryCode)
using the following code:
tbmodel <- glm(regression, data=trainSplit,
               weights=NULL, binomial(link = "logit"),
               na.action=na.omit)
###### RESUME HERE AFTER BREAK
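However, when I run the regression I get the following error (the French message means "contrasts can be applied only to factors with 2 or more levels"):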
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  les contrastes ne peuvent être appliqués qu'aux facteurs ayant au moins deux niveaux
In addition: Warning message:
In Ops.factor(insolvency, countryIsoCode) : ‘*’ not meaningful for factors
I have no idea where this comes from, since my variable countryCode is a factor with more than 2 levels and I have no NAs. Here is some of the data:
      countryCode insolvency    netSales Y size CashAssetRatio         FRNG
47091          FR     0.0491 -0.04042249 0    2       1.123095 -0.001679786
24460          IT     0.0115 -0.04343820 0    1       1.078720 -0.001130815
11921          FR     0.0029 -0.04227984 0    2       1.076595 -0.001097954
1657           FR     0.0016 -0.04242885 0    2       1.075237 -0.001075071
37572          IT     0.0006 -0.04355702 0    1       1.077884 -0.001122143
8155           FR     0.0270 -0.04058710 0    2       1.076638 -0.001067854
Do you have any ideas? Thanks.
According to ?formula:
While formulae usually involve just variable and factor names, they
can also involve arithmetic expressions. The formula log(y) ~ a +
log(x) is quite legal. When such arithmetic expressions involve
operators which are also used symbolically in model formulae, there
can be confusion between arithmetic and symbolic operator use.
To avoid this confusion, the function I() can be used to bracket those
portions of a model formula where the operators are used in their
arithmetic sense. For example, in the formula y ~ a + I(b+c), the term
b+c is to be interpreted as the sum of b and c.
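To make the distinction in the quoted passage concrete, here is a minimal sketch on made-up data. The data frame d and its columns a, b, c and y are hypothetical and only serve to show which columns each formula produces in the design matrix.

```r
# Toy data; the names d, a, b, c and y are made up for illustration.
set.seed(1)
d <- data.frame(a = rnorm(5), b = rnorm(5), c = rnorm(5), y = rnorm(5))

# Symbolic use: "+" separates terms, so b and c each get their own column.
cols_symbolic <- colnames(model.matrix(y ~ a + b + c, data = d))

# Arithmetic use: I(b + c) yields a single column holding the sum b + c.
mm_arith <- model.matrix(y ~ a + I(b + c), data = d)
cols_arith <- colnames(mm_arith)

print(cols_symbolic)
print(cols_arith)
```

The same rule applies to "*": inside I() it is ordinary multiplication, outside I() it is the formula operator for main effects plus interaction.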
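So the formula you wrote is actually asking for arithmetic multiplication, which is not defined for a factor (hence the warning). Since what you want is the interaction, remove the I().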
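As a sketch of what the corrected call could look like, the following fits the model without I(). The data are simulated stand-ins for trainSplit (every value, and the third country level "DE", is invented here), so only the structure of the formula and the resulting terms are meaningful, not the estimates.

```r
# Simulated stand-in for trainSplit; all values are made up.
set.seed(42)
n <- 200
trainSplit <- data.frame(
  countryCode    = factor(sample(c("FR", "IT", "DE"), n, replace = TRUE)),
  insolvency     = runif(n, 0, 0.05),
  netSales       = rnorm(n),
  size           = sample(1:2, n, replace = TRUE),
  CashAssetRatio = rnorm(n, mean = 1, sd = 0.1),
  FRNG           = rnorm(n),
  Y              = rbinom(n, 1, 0.3)
)

# Without I(), "*" is the formula operator:
# insolvency * countryCode expands to
# insolvency + countryCode + insolvency:countryCode.
regression <- Y ~ netSales + size + CashAssetRatio + FRNG +
  insolvency * countryCode

tbmodel <- glm(regression, data = trainSplit,
               family = binomial(link = "logit"),
               na.action = na.omit)

# Inspect which terms the formula expanded to and which coefficients were fit.
term_labels <- attr(terms(regression), "term.labels")
print(term_labels)
print(names(coef(tbmodel)))
```

If only the interaction is wanted, without the main effects of insolvency and countryCode, the term can be written as insolvency:countryCode instead.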