R Shiny 数据分类和汇总统计错误

Rshiny data classification and summary statistics errors

我是 Rshiny 的新手。我的任务是:

编写一个闪亮的应用程序,它使用导航栏,标题为“数据探索”和 “分类工具”,以便在 数据探索选项卡 中,用户可以:

  1. 选择任何变量并查看它的汇总统计数据,

  2. 使用 select 输入按比率类别查看变量的汇总统计数据。

  3. 在首次打开应用程序时,按费率组查看 beertax 变量图。

  4. 选择任何变量并可视化它与 Rate 变量的关系。不同的 应根据 selected 变量是连续变量还是分类变量显示图。

分类工具选项卡中,用户可以:

  1. 使用滑块输入到select从(0.4,0.5,0.6,0.7,0.8),数据使用比例 对于训练数据集(我们不会在这里使用测试数据,所以你应该使用 验证数据集的补充比例。)

  2. 查看训练数据的分类树,并使用单选按钮 “查看 p运行ed 树”或“查看 unp运行ed 树”。其中,对于 p运行ed 树,p运行ing 应该是 使用与最小 xerror 对应的 cp 值完成。

  3. 查看正确分类率和错误分类率(使用验证数据) p运行ed 分类树和 LDA 或 QDA 之一,这些结果应该在一个 table。 “最佳”分类方法,即误分类率最低的方法, 应该突出显示,并且应该有一个注释来通知用户 突出显示方式。

  4. 预测平均利率状态(即高于或低于美国平均水平) 看不见的状态,使用“最佳”分类方法,具有用户定义的观察集 变量值(即需要输入选项以允许用户输入他们想要的 观测值)。打开应用程序时出现的默认用户定义值, 应该是连续变量的均值和分类变量的众数。 还应该有一个警告,在用户进行推断时提醒他们。

目前,我已经基本完成了“分类工具”选项卡的第 3 步。但是,我遇到了各种需要帮助的错误。我会按顺序讲:

  1. 当我 运行 打开 Rstudio 后的应用程序时,我的第一个选项卡 'Data Exploration' 运行 没问题。但是,每当我重新加载应用程序时,摘要 table 都会失败,并且我会收到错误消息“unused argument (input$variable)”。我不确定为什么会这样。

  2. 当我 运行 应用程序时,第二个选项卡 'Classification tools' 上出现一个错误,它显示为“non-numeric argument to binary operator '。我调查了这意味着什么,我想我明白了,但我只是不确定这个错误如何适用于我的代码。而不是这个错误,我的目标是生成一个 table,其中包括 CART 模型的分类率和错误分类率,最终也包括 LDA 模型(取决于训练数据比例输入)。

  3. 在问题 2 之后,我调查了我的代码并意识到我认为之前的错误是由于行 lda.pred <- predict(lda.model, newdata = valid.data[-6]) 所以我删除了代码以查看会发生什么,然后我得到一个新的错误 'unused argument (pred == valid.label)'。同样,我不明白为什么会这样。

这是我的代码:

data <- read.csv("Fatality-task2.csv")

data$Rate <- as.factor(data$Rate)

library(shiny)
library(dplyr)
library(ggplot2)
library(markdown)
library(gtsummary)
library(ggdendro)
library(factoextra)
library(mclust)
library(cluster)
library(rpart)
library(rpart.plot)
#library(MASS)

#################################################################


ui <- fluidPage(
  navbarPage("",
             tabPanel("Data Exploration",
                      sidebarLayout(
                        sidebarPanel(
                          selectInput("variable",
                                      "Variable",
                                      colnames(data)),
                          selectInput("rate",
                                      "Rate",
                                      levels(data$Rate))
                        ),
                        mainPanel(
                          tableOutput("table"),
                          plotOutput("plot")
                        )
                      )    
             ),
             tabPanel("Classification tools",
                      sidebarLayout( 
                        sidebarPanel(
                          sliderInput("train.prop",
                                      "Training data proportion",
                                      min = 0.4,
                                      max = 0.8,
                                      step = 0.1,
                                      value = 0.6),
                          radioButtons("prune",
                                       "Pruning option",
                                       choices = c("view pruned tree",
                                                   "view unpruned tree"))
                        ),
                        mainPanel(
                          tableOutput("table2"),
                          plotOutput("plot2")
                          
                        )
                      )
             )
  )
)




#################################################################

server <- function(input, output) {
  sum <- reactive({
    data <- data %>%
      filter(Rate == input$rate) %>%
      select(input$variable) %>%
      summary() %>%
      as.data.frame() %>%
      tidyr::separate(Freq, c("Stat", "Value"), sep=":") %>%
      tidyr::pivot_wider(names_from =Stat, values_from = Value)
    data <- data[, -c(1,2)]
    
  })
  
  output$table <- renderTable({
    sum()
  })
  
  output$plot <- renderPlot({
    if (input$variable == "jaild" | input$variable == "Rate"){
      ggplot(data, aes(x = Rate, fill = data[[input$variable]])) +
        geom_bar(position = "dodge", width = 0.7) +
        if (input$variable == "Rate"){
          theme(legend.position = "none")
        }
    } else {
      ggplot(data, aes(x = Rate, y = data[[input$variable]], fill = Rate)) +
        geom_boxplot() +
        theme(legend.position = "none")
    }
    
    
  })
  
  output$plot2 <- renderPlot({
    
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    if (input$prune == "view pruned tree"){
      rpart.plot(ptree, uniform =TRUE)
    } else {
      rpart.plot(fit.tree)
    }
    
    
    
    
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  #######################
  table <- reactive({
    #################################
    library(MASS)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    #ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    ind2 <- setdiff(c(1:n), ind1)
    
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    train.label <- data[ind1, 6]
    valid.label <- data[ind2, 6]
    
    #################################
    
    ### fit cart model
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ### prune the tree
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    ### predict using the validation data on the pruned tree
    pred <- predict(ptree, newdata = valid.data, type = "class")
    
    ### lda
    
    lda.model <- lda(train.data[,-6], train.label)
    
    #lda.pred <- predict(lda.model, newdata = valid.data[,-6])
    
    
    ### create a classification table
    
    
    
    CCR <- sum(pred == valid.label)/nrow(valid.data)
    MCR <- 1 - CCR
    
    CR <- c(CCR, MCR)
    
    CR <- as.data.frame(CR)
    colnames(CR) <- "CART"
    rownames(CR) <- c("CCR", "MCR")
    
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  #############################
  output$table2 <- renderTable({
    table()
    
  }) 
  
}




#################################################################
shinyApp(ui, server)

在上述 3 种情况下,我的应用程序输出如下所示(按顺序)

我非常需要帮助,因此非常感谢您的帮助。如果您需要对代码进行任何说明,请告诉我。

~编辑~

这是我解决第二个和第三个错误后更新的代码(我想?):

data <- read.csv("Fatality-task2.csv")

data$Rate <- as.factor(data$Rate)

library(shiny)
library(dplyr)
library(ggplot2)
library(markdown)
library(gtsummary)
library(ggdendro)
library(factoextra)
library(mclust)
library(cluster)
library(rpart)
library(rpart.plot)
#library(MASS)

dput(head(data))

#################################################################


ui <- fluidPage(
  navbarPage("",
             tabPanel("Data Exploration",
                      sidebarLayout(
                        sidebarPanel(
                          selectInput("variable",
                                      "Variable",
                                      colnames(data)),
                          selectInput("rate",
                                      "Rate",
                                      levels(data$Rate))
                        ),
                        mainPanel(
                          tableOutput("table"),
                          plotOutput("plot")
                        )
                      )
             ),
             tabPanel("Classification tools",
                      sidebarLayout(
                        sidebarPanel(
                          sliderInput("train.prop",
                                      "Training data proportion",
                                      min = 0.4,
                                      max = 0.8,
                                      step = 0.1,
                                      value = 0.6),
                          radioButtons("prune",
                                       "Pruning option",
                                       choices = c("view pruned tree",
                                                   "view unpruned tree"))
                        ),
                        mainPanel(
                          tableOutput("table2"),
                          plotOutput("plot2")
                          
                        )
                      )
             )
  )
)




#################################################################

server <- function(input, output) {
  sum <- reactive({
    req(input$variable,input$rate)
    data <- data %>%
      filter(Rate == input$rate) %>%
      select(input$variable) %>%
      summary() %>%
      as.data.frame() %>%
      tidyr::separate(Freq, c("Stat", "Value"), sep=":") %>%
      tidyr::pivot_wider(names_from =Stat, values_from = Value)
    data <- data[, -c(1,2)]
    
  })
  
  output$table <- renderTable({
    sum()
  })
  
  output$plot <- renderPlot({
    req(input$variable)
    if (input$variable == "jaild" | input$variable == "Rate"){
      ggplot(data, aes(x = Rate, fill = .data[[as.name(input$variable)]])) +
        geom_bar(position = "dodge", width = 0.7) +
        if (input$variable == "Rate"){
          theme(legend.position = "none")
        }
    } else {
      ggplot(data, aes(x = Rate, y = .data[[as.name(input$variable)]], fill = Rate)) +
        geom_boxplot() +
        theme(legend.position = "none")
    }
    
    
  })
  
  output$plot2 <- renderPlot({
    req(input$train.prop,input$prune)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*as.numeric(input$train.prop)))
    ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    if (input$prune == "view pruned tree"){
      rpart.plot(ptree, uniform =TRUE)
    } else {
      rpart.plot(fit.tree)
    }
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  #######################
  table <- reactive({
    #################################
    library(MASS)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    #ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    ind2 <- setdiff(c(1:n), ind1)
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    #################################
    
    ### fit cart model
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ### prune the tree
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    ### predict using the validation data on the pruned tree
    pred <- predict(ptree, newdata = valid.data[,-6], type = "class")
    
    ### lda
    
     #lda.model <- lda(train.data[,-6], train.data[,6])
     
     lda.model <- lda(Rate~., data = train.data)
     
    
     lda.pred <- predict(lda.model, newdata = valid.data)
    
    
    ### create a classification table
    
    length(lda.model)
    
    
    x <- pred == valid.data[,6]
    
    CCR <- length(x[x == TRUE])/nrow(valid.data)
    MCR <- 1 - CCR
    
    CR <- c(CCR, MCR)
    
    z <- lda.pred$class == valid.data[,6]
    
    lda.CCR <- length(z[z == TRUE])/nrow(valid.data)
    lda.MCR <- 1 - CCR
    
    lda.CR <- c(lda.CCR, lda.MCR)
    
    y <- cbind(CR, lda.CR)
    
    y <- as.data.frame(y)
    colnames(y) <- c("CART", "LDA")
    rownames(y) <- c("CCR", "MCR")
    y
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  #############################
  output$table2 <- renderTable({
    table()
    
  },
  rownames = TRUE) 
  
}




#################################################################
shinyApp(ui, server)

但是我的应用程序在上次解决问题时没有保存,所以我不得不尝试从记忆中再次解决错误。我想我做对了。

这也是我的数据片段,

"beertax","jaild","vmiles","unrate","perinc","Rate" 1.53937947750092,"无",7.23388720703125,14.3999996185303,10544.15234375,1 1.78899073600769,"无",7.83634765625,13.6999998092651,10732.7978515625,1 1.71428561210632,"无",8.262990234375,11.1000003814697,11108.791015625,1 1.65254235267639,"无",8.7269169921875,8.89999961853027,11332.626953125,1 1.60990703105927,"无",8.952853515625,9.80000019073486,11661.5068359375,1 1.55999994277954,"无",9.1663017578125,7.8000001903486,11944,1 1.50144362449646,"无",9.6743232421875,7.19999980926514,12368.6240234375,1 0.214797139167786,"是",6.81015673828125,9.89999961853027,12309.0693359375,1 0.206422030925751,"是",6.58749462890625,9.10000038146973,12693.8076171875,1 0.296703308820724,"是",6.70997021484375,5,13265.93359375,1

使用 req()as.numeric() 应该可以解决前两个问题。之后你应该可以修复最后一个。

编辑

select(input$variable) 更改为 dplyr::select(input$variable) 应该可以消除您的第一个错误。在您的软件包列表中还有 4 个具有相同功能的其他软件包 select();因此你需要指定你打算从哪个包中使用它或者加载 dplyr last.

df1 <- read.table(text='"beertax","jaild","vmiles","unrate","perinc","Rate"
1.53937947750092,"no",7.23388720703125,14.3999996185303,10544.15234375,1
1.78899073600769,"no",7.83634765625,13.6999998092651,10732.7978515625,1
1.71428561210632,"no",8.262990234375,11.1000003814697,11108.791015625,1
1.65254235267639,"no",8.7269169921875,8.89999961853027,11332.626953125,1
1.60990703105927,"no",8.952853515625,9.80000019073486,11661.5068359375,1
1.55999994277954,"no",9.1663017578125,7.80000019073486,11944,1
1.50144362449646,"no",9.6743232421875,7.19999980926514,12368.6240234375,1
0.214797139167786,"yes",6.81015673828125,9.89999961853027,12309.0693359375,1
0.206422030925751,"yes",6.58749462890625,9.10000038146973,12693.8076171875,1
0.296703308820724,"yes",6.70997021484375,5,13265.93359375,1', header=TRUE, sep=",")

df2 <- df1 %>% transform(Rate=2)
data<- rbind(df1,df2)
data$Rate <- as.factor(data$Rate)

library(shiny)
library(dplyr)
library(ggplot2)
library(markdown)
library(gtsummary)
library(ggdendro)
library(factoextra)
library(mclust)
library(cluster)
library(rpart)
library(rpart.plot)
library(MASS)

ui <- fluidPage(
  navbarPage("Testing Data Exploration",
             tabPanel("Data Exploration",
                      sidebarLayout(
                        sidebarPanel(
                          selectInput("variable",
                                      "Variable",
                                      colnames(data)),
                          selectInput("rate",
                                      "Rate",
                                      levels(data$Rate))
                        ),
                        mainPanel(
                          DTOutput("table1"),
                          plotOutput("plot")
                        )
                      )
             ),
             tabPanel("Classification tools",
                      sidebarLayout(
                        sidebarPanel(
                          sliderInput("train.prop",
                                      "Training data proportion",
                                      min = 0.4,
                                      max = 0.8,
                                      step = 0.1,
                                      value = 0.6),
                          radioButtons("prune",
                                       "Pruning option",
                                       choices = c("view pruned tree",
                                                   "view unpruned tree"))
                        ),
                        mainPanel(
                          DTOutput("table2"),
                          plotOutput("plot2")

                        )
                      )
             )
  )
)


server <- function(input, output) {
  summ <- reactive({
    req(input$variable,input$rate)

    data1 <- data %>%
      filter(Rate == input$rate) %>%
      dplyr::select(input$variable) %>%
      summary() %>%
      as.data.frame() %>%
      tidyr::separate(Freq, c("Stat", "Value"), sep=":") %>%
      tidyr::pivot_wider(names_from = Stat, values_from = Value)
    data2 <- data1[, -c(1,2)]
    data2
  })

  output$table1 <- renderDT({
    summ()
  })

  output$plot <- renderPlot({
    req(input$variable)
    if (input$variable == "jaild" | input$variable == "Rate"){
      ggplot(data, aes(x = Rate, fill = .data[[as.name(input$variable)]])) +
        geom_bar(position = "dodge", width = 0.7) +
        if (input$variable == "Rate"){
          theme(legend.position = "none")
        }
    } else {
      ggplot(data, aes(x = Rate, y = .data[[as.name(input$variable)]], fill = Rate)) +
        geom_boxplot() +
        theme(legend.position = "none")
    }
  })

  output$plot2 <- renderPlot({
    req(input$train.prop,input$prune)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*as.numeric(input$train.prop)))
    ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))


    train.data <- data[ind1,]
    valid.data <- data[ind2,]


    fit.tree <- rpart(Rate~., data = train.data, method = "class")

    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])

    if (input$prune == "view pruned tree"){
      rpart.plot(ptree, uniform =TRUE)
    } else {
      rpart.plot(fit.tree)
    }
  })
  #######################
  table <- reactive({

    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    #ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    ind2 <- setdiff(c(1:n), ind1)

    train.data <- data[ind1,]
    valid.data <- data[ind2,]

    #################################

    ### fit cart model
    fit.tree <- rpart(Rate~., data = train.data, method = "class")

    ### prune the tree
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])

    ### predict using the validation data on the pruned tree
    pred <- predict(ptree, newdata = valid.data[,-6], type = "class")

    ### lda

    #lda.model <- lda(train.data[,-6], train.data[,6])

    lda.model <- lda(Rate~., data = train.data)


    lda.pred <- predict(lda.model, newdata = valid.data)

    ### create a classification table

    length(lda.model)


    x <- pred == valid.data[,6]

    CCR <- length(x[x == TRUE])/nrow(valid.data)
    MCR <- 1 - CCR

    CR <- c(CCR, MCR)

    z <- lda.pred$class == valid.data[,6]

    lda.CCR <- length(z[z == TRUE])/nrow(valid.data)
    lda.MCR <- 1 - CCR

    lda.CR <- c(lda.CCR, lda.MCR)

    y <- cbind(CR, lda.CR)

    y <- as.data.frame(y)
    colnames(y) <- c("CART", "LDA")
    rownames(y) <- c("CCR", "MCR")
    y
  })

  output$table2 <- renderDT({
    table()
  }, rownames = TRUE)

}

shinyApp(ui, server)