Assigning states of Hidden Markov Models by idealized intensity values

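I am running a pomegranate HMM (http://pomegranate.readthedocs.io/en/latest/HiddenMarkovModel.html) on my data. I load the results into a Pandas DataFrame and define the idealized intensity as the median of all points in that state: `df["hmm_idealized"] = df.groupby(["hmm_state"],as_index = False)["Raw"].transform("median")`. Sample data: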

    +-----+-----------------+-------------+------------+
    |     |   hmm_idealized |   hmm_state |   hmm_diff |
    |-----+-----------------+-------------+------------|
    |   0 |           99862 |           3 |        nan |
    |   1 |           99862 |           3 |          0 |
    |   2 |           99862 |           3 |          0 |
    |   3 |           99862 |           3 |          0 |
    |   4 |           99862 |           3 |          0 |
    |   5 |           99862 |           3 |          0 |
    |   6 |          117759 |           4 |          1 |
    |   7 |          117759 |           4 |          0 |
    |   8 |          117759 |           4 |          0 |
    |   9 |          117759 |           4 |          0 |
    |  10 |          117759 |           4 |          0 |
    |  11 |          117759 |           4 |          0 |
    |  12 |          117759 |           4 |          0 |
    |  13 |          117759 |           4 |          0 |
    |  14 |          124934 |           2 |         -2 |
    |  15 |          124934 |           2 |          0 |
    |  16 |          124934 |           2 |          0 |
    |  17 |          124934 |           2 |          0 |
    |  18 |          124934 |           2 |          0 |
    |  19 |          117759 |           4 |          2 |
    |  20 |          117759 |           4 |          0 |
    |  21 |          117759 |           4 |          0 |
    |  22 |          117759 |           4 |          0 |
    |  23 |          117759 |           4 |          0 |
    |  24 |          117759 |           4 |          0 |
    |  25 |          117759 |           4 |          0 |
    |  26 |          117759 |           4 |          0 |
    |  27 |          117759 |           4 |          0 |
    |  28 |          117759 |           4 |          0 |
    |  29 |          117759 |           4 |          0 |
    |  30 |          117759 |           4 |          0 |
    |  31 |          117759 |           4 |          0 |
    |  32 |          117759 |           4 |          0 |
    |  33 |          117759 |           4 |          0 |
    |  34 |          117759 |           4 |          0 |
    |  35 |          117759 |           4 |          0 |
    |  36 |          117759 |           4 |          0 |
    |  37 |          117759 |           4 |          0 |
    |  38 |          117759 |           4 |          0 |
    |  39 |          117759 |           4 |          0 |
    |  40 |          106169 |           1 |         -3 |
    |  41 |          106169 |           1 |          0 |
    |  42 |          106169 |           1 |          0 |
    |  43 |          106169 |           1 |          0 |
    |  44 |          106169 |           1 |          0 |
    |  45 |          106169 |           1 |          0 |
    |  46 |          106169 |           1 |          0 |
    |  47 |          106169 |           1 |          0 |
    |  48 |          106169 |           1 |          0 |
    |  49 |          106169 |           1 |          0 |
    |  50 |          106169 |           1 |          0 |
    |  51 |          106169 |           1 |          0 |
    |  52 |          106169 |           1 |          0 |
    |  53 |          106169 |           1 |          0 |
    |  54 |          106169 |           1 |          0 |
    |  55 |          106169 |           1 |          0 |
    |  56 |          106169 |           1 |          0 |
    |  57 |          106169 |           1 |          0 |
    |  58 |          106169 |           1 |          0 |
    |  59 |          106169 |           1 |          0 |
    |  60 |          106169 |           1 |          0 |
    |  61 |          106169 |           1 |          0 |
    |  62 |          106169 |           1 |          0 |
    |  63 |          106169 |           1 |          0 |
    |  64 |          106169 |           1 |          0 |
    |  65 |          106169 |           1 |          0 |
    |  66 |          106169 |           1 |          0 |
    |  67 |          106169 |           1 |          0 |
    |  68 |          106169 |           1 |          0 |
    |  69 |          106169 |           1 |          0 |
    |  70 |          106169 |           1 |          0 |
    |  71 |          106169 |           1 |          0 |
    |  72 |          106169 |           1 |          0 |
    |  73 |          106169 |           1 |          0 |
    |  74 |          106169 |           1 |          0 |
    |  75 |           99862 |           3 |          2 |
    |  76 |           99862 |           3 |          0 |
    |  77 |           99862 |           3 |          0 |
    |  78 |           99862 |           3 |          0 |
    |  79 |           99862 |           3 |          0 |
    |  80 |           99862 |           3 |          0 |
    |  81 |           99862 |           3 |          0 |
    |  82 |           99862 |           3 |          0 |
    |  83 |           99862 |           3 |          0 |
    |  84 |           99862 |           3 |          0 |
    |  85 |           99862 |           3 |          0 |
    |  86 |           99862 |           3 |          0 |
    |  87 |           99862 |           3 |          0 |
    |  88 |           99862 |           3 |          0 |
    |  89 |           99862 |           3 |          0 |
    |  90 |           99862 |           3 |          0 |
    |  91 |           99862 |           3 |          0 |
    |  92 |           99862 |           3 |          0 |
    |  93 |           99862 |           3 |          0 |
    |  94 |           99862 |           3 |          0 |
    |  95 |           99862 |           3 |          0 |
    |  96 |           99862 |           3 |          0 |
    |  97 |           99862 |           3 |          0 |
    |  98 |           99862 |           3 |          0 |
    |  99 |           99862 |           3 |          0 |
    | 100 |           99862 |           3 |          0 |
    | 101 |           99862 |           3 |          0 |
    | 102 |           99862 |           3 |          0 |
    | 103 |           99862 |           3 |          0 |
    | 104 |           99862 |           3 |          0 |
    | 105 |           99862 |           3 |          0 |
    | 106 |           99862 |           3 |          0 |
    | 107 |           99862 |           3 |          0 |
    | 108 |           94127 |           0 |         -3 |
    | 109 |           94127 |           0 |          0 |
    | 110 |           94127 |           0 |          0 |
    | 111 |           94127 |           0 |          0 |
    | 112 |           94127 |           0 |          0 |
    | 113 |           94127 |           0 |          0 |
    | 114 |           94127 |           0 |          0 |
    | 115 |           94127 |           0 |          0 |
    | 116 |           94127 |           0 |          0 |
    | 117 |           94127 |           0 |          0 |
    | 118 |           94127 |           0 |          0 |
    | 119 |           94127 |           0 |          0 |
    | 120 |           94127 |           0 |          0 |
    | 121 |           94127 |           0 |          0 |
    | 122 |           94127 |           0 |          0 |
    | 123 |           94127 |           0 |          0 |
    | 124 |           94127 |           0 |          0 |
    | 125 |           94127 |           0 |          0 |
    | 126 |           94127 |           0 |          0 |
    | 127 |           94127 |           0 |          0 |
    | 128 |           94127 |           0 |          0 |
    | 129 |           94127 |           0 |          0 |
    | 130 |           94127 |           0 |          0 |
    | 131 |           94127 |           0 |          0 |
    | 132 |           94127 |           0 |          0 |
    | 133 |           94127 |           0 |          0 |
    | 134 |           94127 |           0 |          0 |
    | 135 |           94127 |           0 |          0 |
    | 136 |           94127 |           0 |          0 |
    | 137 |           94127 |           0 |          0 |
    | 138 |           94127 |           0 |          0 |
    | 139 |           94127 |           0 |          0 |
    | 140 |           94127 |           0 |          0 |
    | 141 |           94127 |           0 |          0 |
    | 142 |           94127 |           0 |          0 |
    | 143 |           94127 |           0 |          0 |
    | 144 |           94127 |           0 |          0 |
    | 145 |           94127 |           0 |          0 |
    | 146 |           94127 |           0 |          0 |
    | 147 |           94127 |           0 |          0 |
    | 148 |           94127 |           0 |          0 |
    | 149 |           94127 |           0 |          0 |
    | 150 |           94127 |           0 |          0 |
    | 151 |           94127 |           0 |          0 |
    | 152 |           94127 |           0 |          0 |
    | 153 |           94127 |           0 |          0 |
    | 154 |           94127 |           0 |          0 |
    | 155 |           94127 |           0 |          0 |
    | 156 |           94127 |           0 |          0 |
    | 157 |           94127 |           0 |          0 |
    | 158 |           94127 |           0 |          0 |
    | 159 |           94127 |           0 |          0 |
    | 160 |           94127 |           0 |          0 |
    | 161 |           94127 |           0 |          0 |
    | 162 |           94127 |           0 |          0 |
    | 163 |           94127 |           0 |          0 |
    | 164 |           94127 |           0 |          0 |
    | 165 |           94127 |           0 |          0 |
    | 166 |           94127 |           0 |          0 |
    | 167 |           94127 |           0 |          0 |
    | 168 |           94127 |           0 |          0 |
    | 169 |           94127 |           0 |          0 |
    | 170 |           94127 |           0 |          0 |
    | 171 |           94127 |           0 |          0 |
    | 172 |           94127 |           0 |          0 |
    | 173 |           94127 |           0 |          0 |
    | 174 |           94127 |           0 |          0 |
    | 175 |           94127 |           0 |          0 |
    | 176 |           94127 |           0 |          0 |
    | 177 |           94127 |           0 |          0 |
    | 178 |           94127 |           0 |          0 |
    | 179 |           94127 |           0 |          0 |
    | 180 |           94127 |           0 |          0 |
    | 181 |           94127 |           0 |          0 |
    | 182 |           94127 |           0 |          0 |
    | 183 |           94127 |           0 |          0 |
    | 184 |           94127 |           0 |          0 |
    | 185 |           94127 |           0 |          0 |
    | 186 |           94127 |           0 |          0 |
    | 187 |           94127 |           0 |          0 |
    | 188 |           94127 |           0 |          0 |
    | 189 |           94127 |           0 |          0 |
    | 190 |           94127 |           0 |          0 |
    | 191 |           94127 |           0 |          0 |
    | 192 |           94127 |           0 |          0 |
    | 193 |           94127 |           0 |          0 |
    | 194 |           94127 |           0 |          0 |
    | 195 |           94127 |           0 |          0 |
    | 196 |           94127 |           0 |          0 |
    | 197 |           94127 |           0 |          0 |
    | 198 |           94127 |           0 |          0 |
    | 199 |           94127 |           0 |          0 |
    | 200 |           94127 |           0 |          0 |
    +-----+-----------------+-------------+------------+
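For reference, columns like those in the table can be built along these lines. This is only a sketch: the `Raw` values and state sequence below are made-up toy data, and computing `hmm_diff` as `hmm_state.diff()` is an assumption inferred from the table (each jump there equals the difference between consecutive state labels).

```python
import pandas as pd

# Hypothetical toy data: raw intensities and arbitrary HMM state labels.
df = pd.DataFrame({
    "Raw":       [100, 102, 98, 110, 111, 109, 120, 118, 122],
    "hmm_state": [3,   3,   3,  1,   1,   1,   4,   4,   4],
})

# Idealized intensity: median of all raw points assigned to the same state.
df["hmm_idealized"] = df.groupby("hmm_state")["Raw"].transform("median")

# Assumed definition of hmm_diff: change in the state label between rows.
df["hmm_diff"] = df["hmm_state"].diff()

# Counting positive label changes; the labels are arbitrary, so this
# does not necessarily match the number of intensity increases.
n_positive_label_steps = int((df["hmm_diff"] > 0).sum())
print(df)
print(n_positive_label_steps)
```

In this toy trace the intensity rises twice (100 to 110, then 110 to 120), but the label goes 3, 1, 4, so only one of the two steps shows up as a positive `hmm_diff`. That mismatch is the reason for wanting labels ordered by intensity.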

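When analyzing the results, I want to count the number of times the model steps up. I would like to use `df.where(data.hmm_diff > 0).count()` to read off how many times I increase by one state. However, an increase sometimes spans two states (that is, it skips the intermediate state), so I need to reassign the HMM state labels by sorting on the idealized values, so that the lowest state is 0, the highest is 4, and so on. Is there a way to reassign the hmm_state labels from arbitrary to ordered by idealized intensity?
For example, the hmm_state labeled "1" lies between hmm_state 3 and 4.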

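It looks like you just need to define a sorted HMM state like this:

    state_orders = {v: i for i, v in enumerate(sorted(df.hmm_idealized.unique()))}
    df['sorted_state'] = df.hmm_idealized.map(state_orders)

Then you can proceed as you did in the question, but take the diff on this new column and count its jumps.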
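Put together, the relabeling and the counting step might look like the sketch below. The idealized values and state labels are taken from the sample table, condensed to two rows per segment for brevity; the column name `sorted_diff` and the variables `n_increases`, `n_levels_up`, and `n_skips` are illustrative names, not part of the original question.

```python
import pandas as pd

# Condensed version of the sample table: two rows per segment.
df = pd.DataFrame({
    "hmm_idealized": [99862] * 2 + [117759] * 2 + [124934] * 2 + [117759] * 2
                     + [106169] * 2 + [99862] * 2 + [94127] * 2,
    "hmm_state":     [3] * 2 + [4] * 2 + [2] * 2 + [4] * 2
                     + [1] * 2 + [3] * 2 + [0] * 2,
})

# Rank each distinct idealized value: lowest intensity -> 0, highest -> 4.
state_orders = {v: i for i, v in enumerate(sorted(df["hmm_idealized"].unique()))}
df["sorted_state"] = df["hmm_idealized"].map(state_orders)

# Difference of the ordered labels: the sign gives the direction of the
# step, the magnitude gives how many intensity levels it spans.
df["sorted_diff"] = df["sorted_state"].diff()

n_increases = int((df["sorted_diff"] > 0).sum())           # upward transitions
n_levels_up = int(df["sorted_diff"].clip(lower=0).sum())   # levels climbed in total
n_skips = int((df["sorted_diff"] > 1).sum())               # upward jumps that skip a level

print(df)
print(n_increases, n_levels_up, n_skips)
```

Whether a jump across two levels should count as one increase or two depends on the analysis, which is why the sketch reports both the number of upward transitions and the total number of levels climbed.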