R 中的多项逻辑回归

Multinomial Logistic Regression in R

我试图找到很多关于多项式回归的资源。但是,我很难找到可以显示给定 multiclass 因变量的 multiclass 响应变量概率的可视化。我在下面有一个示例,我想根据 WEEKDAY 和 COUNT 预测 EQUIPMENT。我如何使用 nnet::multinom() 执行此操作并提供有意义的可视化以帮助理解预测的 classes.

  dummy_data <- function(size){
  LOCATION <- sample(c("LOC_A", "LOC_B", "LOC_B"), size, replace = T, prob = c(0.4, 0.4, 0.2))
  EQUIPMENT <- sample(c("EQUIP_A", "EQUIP_B", "EQUIP_C", "EQUIP_D"), size, replace = TRUE)
  df <- data.frame(LOCATION, EQUIPMENT)
  df$COUNT <- sample(c(1:10), size, replace = TRUE)
  startTime <- as.POSIXct("2016-01-01")
  endTime <- as.POSIXct("2019-01-31")
  df$DATE <- as.Date(sample(seq(startTime, endTime, 1), size))
  df$WEEKDAY <- weekdays(as.Date(df$DATE))

  return(df)
}

df<- dummy_data(100)
library(nnet)
model <- nnet::multinom(EQUIPMENT ~ WEEKDAY + COUNT, data = df)

我应该如何从这里继续?我试着离开这个例子,想要一个类似的可视化来表示按工作日和设备细分的 class 的概率。我很难理解这个例子。谁能帮忙? https://stats.idre.ucla.edu/r/dae/multinomial-logistic-regression/

我认为您在加州大学洛杉矶分校网站示例中寻找的是“查看连续预测变量写入的不同值的平均预测概率(数据中的 COUNT)在 ses 的每个级别(WEEKDAY in你的数据)。"

为此,我将向您在 multinom 之前所做的添加以下行:

    df$EQUIPMENT <- as.factor(df$EQUIPMENT)
    df$EQUIPMENT <- relevel(df$EQUIPMENT, ref = "EQUIP_A")

    df$WEEKDAY <- as.factor(df$WEEKDAY)
    df$WEEKDAY <- relevel(df$WEEKDAY, ref = "Sunday")
   
    model <- nnet::multinom(EQUIPMENT ~ WEEKDAY + COUNT, data = df)

然后按照

的例子
    dmodel <- data.frame(WEEKDAY = rep(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"), each = 10), COUNT = rep(c(1:10),7))
    pp.model <- cbind(dmodel, predict(model, newdata = dmodel, type = "probs", se = TRUE))


    lmodel <- melt(pp.model, id.vars = c("WEEKDAY", "COUNT"), value.name = "probability")

    ggplot(lmodel, aes(x = COUNT, y = probability, colour = WEEKDAY)) + geom_line() + facet_grid(variable ~., scales = "free")

但是,您可以看到该图看起来与 UCLA 示例有很大不同。为什么?在绘制图之前,我们需要先检查预测变量是否显着。为了简单地做到这一点(而不是遵循示例 link 中的步骤),我将使用 epiDisplay 包:

    epiDisplay::mlogit.display(model, decimal=3, alpha=0.05)

    Outcome =EQUIPMENT; Referent group = EQUIP_A 
                      EQUIP_B                           EQUIP_C                           EQUIP_D                          
                      Coeff./SE     RRR(95%CI)          Coeff./SE     RRR(95%CI)          Coeff./SE     RRR(95%CI)         
    (Intercept)      -0.083/1.0419 -                   0.036/0.9569  -                   -0.124/1.0174 -                  
    WEEKDAYMonday    -15.513/0NA   0(0,0)              -0.246/0.9629 0.782(0.118,5.162)  -0.672/1.0851 0.511(0.061,4.283) 
    WEEKDAYSaturday  -1.082/1.4201 0.339(0.021,5.482)  0.454/1.0477  1.575(0.202,12.275) 0.536/1.1023  1.708(0.197,14.822)
    WEEKDAYSunday    -0.269/1.1276 0.764(0.084,6.967)  -0.234/1.0488 0.791(0.101,6.182)  0.433/1.05    1.542(0.197,12.076)
    WEEKDAYThursday  0.516/1.0964  1.676(0.195,14.372) -1.373/1.3855 0.253(0.017,3.83)   0.296/1.119   1.344(0.15,12.047) 
    WEEKDAYTuesday   0.01/1.083    1.01(0.121,8.438)   -0.548/1.0827 0.578(0.069,4.827)  -0.678/1.1925 0.507(0.049,5.253) 
    WEEKDAYWednesday -1.1/1.1549   0.333(0.035,3.201)  -0.477/0.9758 0.621(0.092,4.201)  -1.794/1.3542 0.166(0.012,2.364) 
    COUNT            0.014/0.109   1.014(0.819,1.256)  0.042/0.0961  1.043(0.864,1.259)  0.021/0.1019  1.021(0.836,1.247) 

    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1  

并且您的模型中没有重要的预测变量。相反,使用示例数据的多项式回归显示预测变量显着:

    ml <- read.dta("https://stats.idre.ucla.edu/stat/data/hsbdemo.dta")
    ml$prog2 <- relevel(ml$prog, ref = "academic")

    test <- multinom(prog2 ~ ses + write, data = ml)
    epiDisplay::mlogit.display(test, decimal=3, alpha=0.05)

    Outcome =prog2; Referent group = academic 
               general                            vocation                           
               Coeff./SE       RRR(95%CI)         Coeff./SE        RRR(95%CI)        
   (Intercept) 2.852/1.1664*   -                  5.218/1.1636***  -                 
   sesmiddle   -0.533/0.4437   0.587(0.246,1.4)   0.291/0.4764     1.338(0.526,3.404)
   seshigh     -1.163/0.5142*  0.313(0.114,0.856) -0.983/0.5956    0.374(0.116,1.203)
   write       -0.058/0.0214** 0.944(0.905,0.984) -0.114/0.0222*** 0.893(0.855,0.932)

   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1