Statsample-glm gem IndexError: Specified vector y does not exist

I am trying to build a Poisson regression for some school performance data, and this seems to be the best gem for it so far.

While working through the practice analysis in this post, I ran into this error:

irb(main):001:0> require 'daru'
=> false
irb(main):002:0> require 'statsample-glm'
=> false
irb(main):003:0> data_set = Daru::DataFrame.from_csv "logistic_mle.csv"
=> #<Daru::DataFrame(200x4)>
                    a          b          c          y
          0 0.75171213 -3.2683591 1.70092606          0
          1 0.55421406 -2.9565972 2.66368360          0
          2 -1.8533164 -2.8293733 3.34679611          0
          3 -2.8861015 -0.7389824 4.74970154          0
          4 -2.6055309 0.56102031 5.48308397          0
          5 -4.2735321 1.62383436 5.35813425          0
          6 -4.7701259 1.22025583 6.41070111          0
          7 -6.9231483 2.86547174 8.73185919          0
          8 -7.5641950 4.94028695 8.94193466          0
          9 -8.6309366 4.27420502 9.27002100          0
         10 -8.9911114 5.10389362 11.7669513          0
         11 -9.9905763 7.87484596 12.4794035          0
         12 -10.381878 8.84300238 13.7498993          0
         13 -11.047682 9.44613324 13.5025027          0
         14 -12.434424 9.70515870 15.1221173          0
         15 -13.627294 10.4190343 16.3289942          0
         16 -15.620222 11.3788332 17.7367653          0
         17 -16.292239 13.1516565 18.6939344          0
         18 -16.715913 14.9076297 18.0246863          0
         19 -17.950125 15.8533651 20.6826094          0
         20 -18.989884 15.4331557 20.9101142          0
         21 -19.908508 16.8542366 22.0721145          0
         22 -21.146652 18.6785324 23.4977598          0
         23 -21.367574 18.3208056 23.9121114          0
         24 -22.131396 20.7616214 24.1683442          0
         25 -23.163631 21.1293492 25.2695476          0
         26 -24.136076 21.7035705 27.9161820          0
         27 -25.386072 23.3588003 27.8755285          0
         28 -27.254627 24.9201403 28.9810564          0
         29 -28.845061 25.1681854 29.6749936          0
        ...        ...        ...        ...        ...
irb(main):004:0> glm = Statsample::GLM.compute data_set, :y, :logistic, {constant: 1, algorithm: :mle} 
Traceback (most recent call last):
        1: from (irb):4
IndexError (Specified vector y does not exist)

Inspecting the error further shows:

Caused by:
IndexError: Specified index :y does not exist

Following the comments in this Stack Overflow post, I tried reformatting the headers as "date" instead of "string", but the error did not change.

Any ideas from the SO community?

Sorry, I posted too quickly. I found a solution that works:

Instead of

data_set = Daru::DataFrame.from_csv "logistic_mle.csv"

this line works:

data_set = Daru::DataFrame.from_csv("logistic_mle.csv", headers: true, header_converters: :symbol)
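My best guess at why this helps: the GLM call looks up the column with the symbol :y, while the headers read from the CSV were plain strings, so the lookup for :y failed. I am assuming here that from_csv hands these extra options on to Ruby's CSV library; I have not checked the daru source. The sketch below uses only the standard csv library and a made-up two-row sample (not the real logistic_mle.csv) to show what header_converters: :symbol does to the header names:

```ruby
require 'csv'

# Hypothetical sample standing in for logistic_mle.csv (values are made up)
csv_text = "a,b,c,y\n0.75,-3.27,1.70,0\n0.55,-2.96,2.66,0\n"

# Default behaviour: header names are kept as Strings
string_headers = CSV.parse(csv_text, headers: true).headers

# With header_converters: :symbol the header names become Symbols
symbol_headers = CSV.parse(csv_text, headers: true,
                           header_converters: :symbol).headers

p string_headers                # ["a", "b", "c", "y"]
p symbol_headers                # [:a, :b, :c, :y]

# A lookup by the symbol :y only succeeds in the second case
p string_headers.include?(:y)   # false
p symbol_headers.include?(:y)   # true
```

If that assumption holds, passing "y" as a string to the original data frame might also work, but I have not tried it.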