networkx: how to set custom cost function?
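I am following the networkx documentation (1), and I would like to set different penalties for the cost functions (e.g. node_del_cost and node_ins_cost). Let's say I want to penalize the deletion/insertion of a node by three points.
So far, I have created two undirected graphs that differ only in the label of node C (UPDATED CODE).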
import networkx as nx

G = nx.Graph()
G.add_nodes_from([("A", {'label': 'CDKN1A'}), ("B", {'label': 'CUL4A'}),
                  ("C", {'label': 'RB1'})])
G.add_edges_from([("A", "B"), ("A", "C")])

H = nx.Graph()
H.add_nodes_from([("A", {'label': 'CDKN1A'}), ("B", {'label': 'CUL4A'}),
                  ("C", {'label': 'AKT'})])
H.add_edges_from([("A", "B"), ("A", "C")])

# arguments
# node_match - a function that returns True if node n1 in G1 and n2 in G2 should be considered equal during matching.
# ignored if node_subst_cost is specified
def node_match(node1, node2):
    return node1['label'] == node2['label']

# node_subst_cost - a function that returns the costs of node substitution
# overrides node_match if specified.
def node_subst_cost(node1, node2):
    return node1['label'] == node2['label']

# node_del_cost - a function that returns the costs of node deletion
# if node_del_cost is not specified then default node deletion cost of 1 is used.
def node_del_cost(node1):
    return node1['label'] == 3

# node_ins_cost - a function that returns the costs of node insertion
# if node_ins_cost is not specified then default node insertion cost of 1 is used.
def node_ins_cost(node2):
    return node2['label'] == 3

paths, cost = nx.optimal_edit_paths(G, H, node_match=None, edge_match=None,
                                    node_subst_cost=node_subst_cost, node_del_cost=node_del_cost, node_ins_cost=node_ins_cost,
                                    edge_subst_cost=None, edge_del_cost=None, edge_ins_cost=None,
                                    upper_bound=None)

# length of the path
print(len(paths))
# optimal edit path cost (graph edit distance).
print(cost)
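This gives me 2.0 as the optimal path cost and 7.0 as the length of the path. However, I don't fully understand why: I set the penalty to 3.0, so I expected the edit distance to be 3.
Thank you for your suggestions!
Olha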
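As stated in the documentation, when you pass a node_subst_cost function as an argument, node_match is ignored and the cost function is applied to every substitution operation, even when the nodes are equal. So I suggest checking node equality inside node_subst_cost first and then returning the cost accordingly: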
def node_subst_cost(node1, node2):
    # check if the nodes are equal, if yes then apply no cost, else apply 3
    if node1['label'] == node2['label']:
        return 0
    return 3

def node_del_cost(node):
    return 3  # here you apply the cost for node deletion

def node_ins_cost(node):
    return 3  # here you apply the cost for node insertion

paths, cost = nx.optimal_edit_paths(
    G,
    H,
    node_subst_cost=node_subst_cost,
    node_del_cost=node_del_cost,
    node_ins_cost=node_ins_cost
)

print(cost)  # which will return 3.0
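To see why the functions in the question never applied a penalty of 3, it may help to call them directly. The sketch below copies two of them unchanged; the `same` and `other` dictionaries are made-up inputs for illustration. Each function returns the result of a comparison, i.e. a boolean, and Python counts True as 1 and False as 0 in arithmetic, so these functions can only ever yield costs of 1 or 0.

```python
# The question's cost functions, copied unchanged for illustration.
def node_subst_cost(node1, node2):
    return node1['label'] == node2['label']

def node_del_cost(node1):
    return node1['label'] == 3

# Hypothetical node attribute dictionaries used only for this check.
same = {'label': 'CDKN1A'}
other = {'label': 'AKT'}

print(node_subst_cost(same, same))   # True: identical labels are charged 1
print(node_subst_cost(same, other))  # False: different labels are charged 0
print(node_del_cost(same))           # False: a string label never equals the integer 3
```

In other words, the original substitution cost was inverted (equal nodes cost more than unequal ones), and the deletion and insertion functions compared the label with 3 instead of returning 3.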
You can do the same for the edge operations too.
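As a rough sketch of the edge case, the same pattern can be passed through the edge_subst_cost, edge_del_cost and edge_ins_cost arguments. The graphs here and the 'weight' edge attribute are assumptions made for this example and are not part of the question; the edge cost functions receive the edge attribute dictionaries.

```python
import networkx as nx

# Hypothetical graphs: the 'weight' attribute and its values are
# assumptions for this sketch, not taken from the question.
G = nx.Graph()
G.add_edge("A", "B", weight=1)
G.add_edge("A", "C", weight=1)

H = nx.Graph()
H.add_edge("A", "B", weight=1)
H.add_edge("A", "C", weight=2)

def edge_subst_cost(edge1, edge2):
    # No cost when the edge attributes agree, otherwise a penalty of 3.
    if edge1['weight'] == edge2['weight']:
        return 0
    return 3

def edge_del_cost(edge):
    return 3  # cost for deleting an edge

def edge_ins_cost(edge):
    return 3  # cost for inserting an edge

paths, cost = nx.optimal_edit_paths(
    G,
    H,
    edge_subst_cost=edge_subst_cost,
    edge_del_cost=edge_del_cost,
    edge_ins_cost=edge_ins_cost,
)

print(cost)
```

Node costs are left at their defaults here, so only the mismatched edge weight contributes to the result.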