How to calculate the Precedence Matrix in Python?

Precedence diagrams

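A precedence matrix shows the flows from one activity to another in a rectangular format. A precedence diagram is a two-dimensional matrix showing the flows between activities. It can contain different kinds of values, selected through the type parameter.

In R, it can be calculated with the bupaR package.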

#Example

# Absolute Frequencies
patients %>%
    precedence_matrix(type = "absolute") 

Output

## # A tibble: 13 x 3
##    antecedent            consequent                n
##    <fct>                 <fct>                 <int>
##  1 Triage and Assessment End                       2
##  2 Blood test            End                       1
##  3 Start                 Registration            500
##  4 Registration          Triage and Assessment   500
##  5 MRI SCAN              Discuss Results         236
##  6 Triage and Assessment Blood test              237
##  7 Blood test            MRI SCAN                236
##  8 Discuss Results       Check-out               492
##  9 X-Ray                 End                       2
## 10 Check-out             End                     491
## 11 X-Ray                 Discuss Results         259
## 12 Triage and Assessment X-Ray                   261
## 13 Discuss Results       End                       3

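How can I get a precedence matrix using Python? Is there a Python package that computes one?

As requested by the OP in the comments here, this is a minimal proof of concept of a precedence matrix in Python, using a graph structure.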

class PrecedenceDiagram():
    def __init__(self, mode = 'relative'):
        # {<str> origin : { <str> destination: <int> frequency, ... } }
        self.graph = dict()
        self.flow_amount = 0
        self.mode = mode
    
    def update(self, origin, destination):
        '''
        increment frequency if origin exists and destination is in
        origin else instantiate origin/destination appropriately
        '''
        if origin in self.graph:
            if destination in self.graph[origin]:
                self.graph[origin][destination] += 1
            else:
                self.graph[origin][destination] = 1
        else:
            self.graph[origin] = dict()
            self.graph[origin][destination] = 1
        self.flow_amount += 1

    def display_precedence(self):
        '''
        display flow frequency
        '''
        print('O','D','f')
        for node, edges in self.graph.items():
            for edge, weight in edges.items():
                if self.mode == 'absolute':
                    print(node, edge, weight)
                elif self.mode == 'relative':
                    print(node, edge, weight/ self.flow_amount)
        print('-'*16)


pm = PrecedenceDiagram(mode='relative')
pm.update('a', 'b')
pm.update('b', 'c')
pm.update('a', 'b')
pm.update('a', 'b')
pm.update('a', 'd')
pm.update('a', 'e')
pm.update('e', 'a')
pm.update('a', 'n')
pm.update('a', 'b')
pm.update('a', 'b')
pm.update('a', 'b')
pm.display_precedence()
pm.mode = 'absolute'
pm.display_precedence()
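
If the data comes as traces (one ordered list of activities per case) rather than as individual update calls, the same counts can be obtained with collections.Counter. This is only a minimal sketch, not a library API: the function names precedence_counts and to_relative, the shape of the traces input, and the artificial "Start"/"End" markers (mimicking the bupaR output in the question) are assumptions made for illustration.

```python
from collections import Counter


def precedence_counts(traces, add_start_end=True):
    """Count how often each activity is directly followed by another.

    traces: iterable of activity sequences, one per case (assumed input shape).
    Returns a Counter keyed by (antecedent, consequent).
    """
    counts = Counter()
    for trace in traces:
        steps = list(trace)
        if add_start_end:
            # Artificial markers, mimicking the Start/End rows bupaR prints.
            steps = ["Start", *steps, "End"]
        # zip(steps, steps[1:]) yields each pair of consecutive activities.
        counts.update(zip(steps, steps[1:]))
    return counts


def to_relative(counts):
    """Divide every count by the total number of flows."""
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Made-up traces, for illustration only.
traces = [
    ["Registration", "Triage", "X-Ray"],
    ["Registration", "Triage", "Blood test"],
]
absolute = precedence_counts(traces)
relative = to_relative(absolute)
for (antecedent, consequent), n in absolute.items():
    print(antecedent, consequent, n, relative[(antecedent, consequent)])
```

Keeping the absolute counts in one structure and deriving the relative values from it avoids having to track a separate running total.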

Using pm4py discover_dfg

Sample dataset

import pandas as pd
import pm4py

df = pm4py.format_dataframe(pd.read_csv('https://raw.githubusercontent.com/pm4py/pm4py-core/release/notebooks/data/running_example.csv', sep=';'), case_id='case_id',activity_key='activity', timestamp_key='timestamp')

Converting the data to a log

from pm4py.objects.conversion.log import converter as log_converter
log = log_converter.apply(df)

discover_dfg(log) returns such a matrix (as a dictionary), along with counters that track the start and end activities.

d = pm4py.discover_dfg(log)[0]

Data wrangling

df = pd.DataFrame.from_dict(d, orient='index').reset_index()
df.rename(columns={"index" : "Antecedent,Consequent", 0 : "Count"}, inplace=True)
df['Antecedent'], df['Consequent'] = zip(*df["Antecedent,Consequent"])

Final output

Antecedent          Consequent          Count
register request    examine thoroughly      1
examine thoroughly  check ticket            2
check ticket        decide                  6
decide              reject request          3
register request    check ticket            2
check ticket        examine casually        2
examine casually    decide                  2
decide              pay compensation        3
register request    examine casually        3
examine casually    check ticket            4
decide              reinitiate request      3
reinitiate request  examine thoroughly      1
check ticket        examine thoroughly      1
examine thoroughly  decide                  1
reinitiate request  check ticket            1
reinitiate request  examine casually        1
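
The output above is in long format, one row per pair. If an actual two-dimensional matrix is wanted, the pairs can be pivoted with plain Python. This is a minimal sketch, assuming the input is a dictionary keyed by (antecedent, consequent) tuples, which is the shape described for the discover_dfg result; the helper name to_matrix is hypothetical, and the sample dictionary holds only a few of the counts listed above.

```python
def to_matrix(dfg):
    """Pivot {(antecedent, consequent): count} into a square matrix.

    Returns (labels, matrix), where matrix[i][j] is the number of times
    labels[i] is directly followed by labels[j]; missing pairs are 0.
    """
    # Every activity that appears on either side of a pair gets a row and a column.
    labels = sorted({activity for pair in dfg for activity in pair})
    position = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for (antecedent, consequent), count in dfg.items():
        matrix[position[antecedent]][position[consequent]] = count
    return labels, matrix


# A few of the pairs from the output above, not the full result.
dfg = {
    ("register request", "check ticket"): 2,
    ("check ticket", "decide"): 6,
    ("decide", "reject request"): 3,
}
labels, matrix = to_matrix(dfg)
for label, row in zip(labels, matrix):
    print(f"{label:<18}", row)
```

Rows are antecedents and columns are consequents, both in the sorted order of labels.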