Why are the decile values incorrect when using the cut function?

I am trying to attach a decile value to each observation using the code below. However, the values do not seem to be correct. What could be the reason?

     df<-read.table(text="pregnant glucose blood skin INSULIN MASS  DIAB AGE CLASS  predict_probability 
                                  1     106    70   28     135 34.2 0.142  22     0       0.15316285       
                                  1      91    54   25     100 25.2 0.234  23     0       0.05613959       
                                  4     136    70    0       0 31.2 1.182  22     1       0.54034794       
                                  9     164    78    0       0 32.8 0.148  45     1       0.64361578       
                                  3     173    78   39     185 33.8 0.970  31     1       0.79185196       
                                 11     136    84   35     130 28.3 0.260  42     1       0.31927737       
                                  0     141    84   26       0 32.4 0.433  22     0       0.41609308       
                                  3     106    72    0       0 25.8 0.207  27     0       0.10460090       
                                  9     145    80   46     130 37.9 0.637  40     1       0.67061324       
                                 10     111    70   27       0 27.5 0.141  40     1       0.16152296       
                       ",header=T)

        deciles <- cut(df$predict_probability, breaks=c(quantile(df$predict_probability, probs = seq(0, 1, by = 0.10))),labels = 1:10, include.lowest=TRUE)
        df1 <- cbind(df,deciles)
        head(df1,10)
           pregnant glucose blood skin INSULIN MASS  DIAB AGE CLASS predict_probability deciles
        1         1     106    70   28     135 34.2 0.142  22     0          0.15316285       3
        2         1      91    54   25     100 25.2 0.234  23     0          0.05613959       1
        3         4     136    70    0       0 31.2 1.182  22     1          0.54034794       7
        4         9     164    78    0       0 32.8 0.148  45     1          0.64361578       8
        5         3     173    78   39     185 33.8 0.970  31     1          0.79185196      10
        6        11     136    84   35     130 28.3 0.260  42     1          0.31927737       5
        7         0     141    84   26       0 32.4 0.433  22     0          0.41609308       6
        8         3     106    72    0       0 25.8 0.207  27     0          0.10460090       2
        9         9     145    80   46     130 37.9 0.637  40     1          0.67061324       9
        10       10     111    70   27       0 27.5 0.141  40     1          0.16152296       4

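Following Dason's suggestion, here is the complete answer to the question. The quantile call should be removed from the code, and seq(0,1,by=0.1) should be passed directly to cut as the breaks. With quantile, the breaks are computed from the sample itself, so each of the ten observations ends up in its own bin according to its rank. With seq(0,1,by=0.1), the bins are fixed intervals of width 0.1 on the probability scale, which is what was wanted here.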

    deciles <- cut(df$predict_probability, seq(0,1,by=0.1) ,labels = 1:10, include.lowest=TRUE)
    df1 <- cbind(df,deciles)
    head(df1,10)
     pregnant glucose blood skin INSULIN MASS  DIAB AGE CLASS predict_probability deciles
    1         1     106    70   28     135 34.2 0.142  22     0          0.15316285       2
    2         1      91    54   25     100 25.2 0.234  23     0          0.05613959       1
    3         4     136    70    0       0 31.2 1.182  22     1          0.54034794       6
    4         9     164    78    0       0 32.8 0.148  45     1          0.64361578       7
    5         3     173    78   39     185 33.8 0.970  31     1          0.79185196       8
    6        11     136    84   35     130 28.3 0.260  42     1          0.31927737       4
    7         0     141    84   26       0 32.4 0.433  22     0          0.41609308       5
    8         3     106    72    0       0 25.8 0.207  27     0          0.10460090       2
    9         9     145    80   46     130 37.9 0.637  40     1          0.67061324       7
    10       10     111    70   27       0 27.5 0.141  40     1          0.16152296       2
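
To see the difference between the two sets of breaks side by side, here is a minimal sketch that uses only the predict_probability column from the question. The names p, sample_breaks, fixed_breaks, by_rank and by_value are made up for this illustration and do not appear in the original code.

```r
# The ten predicted probabilities from the question, in row order
p <- c(0.15316285, 0.05613959, 0.54034794, 0.64361578, 0.79185196,
       0.31927737, 0.41609308, 0.10460090, 0.67061324, 0.16152296)

# Breaks computed from the sample: bins follow the ranks of the observations
sample_breaks <- quantile(p, probs = seq(0, 1, by = 0.1))
by_rank <- cut(p, breaks = sample_breaks, labels = 1:10, include.lowest = TRUE)

# Fixed breaks: bins are intervals of width 0.1 on the probability scale
fixed_breaks <- seq(0, 1, by = 0.1)
by_value <- cut(p, breaks = fixed_breaks, labels = 1:10, include.lowest = TRUE)

# Compare both labelings for each observation
comparison <- data.frame(p, by_rank, by_value)
print(comparison)
```

The by_rank column should reproduce the labels from the question and by_value the labels from the answer. Which one is "correct" depends on whether the bins are meant to hold equal numbers of observations or to cover equal ranges of probability.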