大小为 K 的有色集团的算法

Algorithm for Colored Clique of Size K

我将现实世界的问题抽象为图论问题。目前,我试图通过使用 networkx 库中的 clique-algorithm 来解决这个问题。我需要了解如何调整我的算法以提高 运行 时间性能。

问题:

G 是一个无向的彩色图,设 R 是一个映射,其中每种颜色都映射到给定的数量,例如

R = {"red": 3, "blue": 1, "green": 1, "magenta": 1, "orange": 1}

我需要找到一组节点 V,具有以下属性(或确定不存在这样的集合):

  1. 对于R中的每个(颜色,数量),我们至少需要数量 V.

    中的彩色节点

    (以上面的R为例,我们在V中至少需要3个红色节点,1个蓝色节点等)

  2. V中的节点不允许成为G中的邻居。

简单示例:

有了这个彩色图表和上面给定的要求 R, 一种可能的解决方案是:V = {0, 1, 2, 8, 13, 15, 16},因为 R 所需的所有颜色量都包含在 V 中,并且没有节点与之相邻另一个。

当前方法:

import networkx as nx

def find_result(G, R):
    CG = nx.complement(G)
    for V in nx.find_cliques(CG):
        color_count = get_color_count(G, V)
        if all([(color_count.get(color, 0) >= R[color]) for color in R.keys()]):
            return V
        
# Helper method to count colors
def get_color_count(G, V):
    colors = [G.nodes[v]["color"] for v in V]
    count = dict()
    for color in colors:
        count[color] = count.get(color, 0) + 1
    return count

性能问题:

通过这种简单的方法,我 运行 遇到了性能问题(即对于 R 的特定配置,算法花费的时间太长了)。我的图表可能会变得更糟,如下所示,有 282 个节点、4640 条边和 7 种不同的颜色。

所以这是我的问题:如何自定义 networkx-clique-algorithm 以利用颜色信息、团大小等?我很高兴 ideas/hints 有关改进或新方法的任何信息。

编辑 1

这是上面大图中的图内容(sparse6 格式)。内容可以粘贴到文件中,然后用networkx读取。

>>sparse6<<:~?CY_A??@_??OA_??OA?M??@?G?oC_??OA?K@?D_??OA?K@?D?Y??@?G?oC?S@_F_??OA?K@?D?W@oG_??OA?K@?D?W@oG?e??@?G?oC?S@_F?_AOI`CCGO@ED?R`WBOT`_B_V`gEW[@mEo[`qF?]`}DWV`eGwYAKHG[@wFw\AYI_??C?_B?O@OE?[A?H?gAwk???OA?K@?D?W@oG?cA_J_??OA?K@?D?W@oG?cA_Ja{FOf_??OA?K@?D?W@oG?cA_J_??OA?K@?D?W@oG?cA_JbO??@?G?oC?S@_F?_AOI?mM?__??OA?K@?D?W@oG?cA_JbkBw|AQB_Wc?Nw~CANp?CE??@?G?oC?UPOK?sLwK?sLpD_oBOvB{OPDCYB?LB[O@ACSP`F_oBPDCWPpG_oBPDCWPpGCeB?LB{OPDCWPpGCcQgK?sO@ACSP`FC_QPICmB?LB{O@DCWPpGCcQ`JCqB?LB{O@DCWPpGCcQ`JCoRWK?sNp?CCPPEC[Q@HCgQpKCsRgK?sNp?CGPPEC[Q@HCgQpKCsR`N_oBP@CGPPEC[Q@HCgQpKCsR`NDAB?LCCO`DCWPpGCcQ`JCoRPMC{S@P_oBO~CCO`DCWPpGCcQ`JCoRPMC{S@PDIB?LC?OPACSP`FC_QPICkR@LCwRpODCS`RdWTW~C?RPMC{S@TDYNp@C[QpNDKTPUD]TPUD[UHTDWTpWDeNp?CsR`ND?TPUD[U@XDiNp@C[QpNDKTPUD[U@XDgUwq_??OA?K@?D?W@oG?cA_JBCK`\bGVP]dsV`^_??OA?K@?D?W@oG?cA_JBCVP]D{WH\DwVp_EE??@?G?oC?SOw??C?_B?O@PBEM??@?G?oC?SOpbEQOpbEOXXbEOXPeeKX@dEWXxBEKX@dEWXpgeKX@dEWXpgEeWpcESX`fE_YPi_??OA?K@?D?W@oG?cA_JBKLG??C?_B?O@OE?[A?H?gAosEq??@?G?oC?S@_F?_AOI?kL@dE_YpkEu??@?G?oC?S@_F?_AOI?kL@kEsZg??C?_B?O@OE?[A?H?gAosC?O`GCoS@SEoZPmE}KpkEsZ`nFAZ@lEwZpoFEXPgEkZ@lEwZpoFC[hkEsZ`nF?[PqFMO@AC_R@ODOZ@lEwZpoFC[`rFQKphEgYpkEsZ`nF?[PqFK\@tecY`jEoZPmE{[@pFG[psFS\hdE_YPiEkZ@lEwZpoFC[`rFO\PuF]YPiEkZ@lEwZpoFC[`rFO\PuF[]H?CGQ@KD?T@hEgYpkEsZ`nF?[PqFK\@tFW\pwFeKpkEsZ`nF?[PqFK\@tFW\pwFc]hkEsZ`nF?[PqFK\@tFW\pwFc]`zeSY@jEoZPmE{[@pFG[psFS\`vF_]PyFk^HkEsZ`nF?[PqFK\@tFW\pwFc]`zFo^X?CGQ@KD?T@kEsZ`nF?[PqFK\@tFW\pwFc]`zFo^P}bKOPADCS`RDOZ@lEwZpoFC[`rFO\PuF[]@xFg]p{Fs^`~cCO`PDGSpSEoZPmE{[@pFG[psFS\`vF_]PyFk^@|Fw^q?cCO`PDGSpSESY@jEoZPmE{[@pFG[psFS\`vF_]PyFk^@|Fw^q?GEOPADCS`RDOZ@lEwZpoFC[`rFO\PuF[]@xFg]p{Fs^`~G?_QAc?OPAC_R@ODCS`RDOZ@lEwZpoFC[`rFO\PuF[]@xFg]p{Fs^`~G?_QAGM??@?G?oE?[A?H_??OA?W@oGGU??@?G@_F?_Np?CsR`ND?TpZGS`g??K@_HGS`aFgS`aFGaNp?CsR`ND?TpZGS`aFG_aW??K@_HGS`aFG_aQIgS`aFG_aQIGmNp?CsR`ND?TpZGS`aFG_aQIGkbIN?G@OG?kbiMG}?_D?_AqMG{cGM?{E?zByB_WBwcgM@_NaQHMB_WBwcaRHQBozHGcqSHUcaRHOdQUhGcqSHSdaVhGcqSHSdaVHaBozCOcaRHOdQUH[eAXcOcaRHOdQUH[eAXHiPAQHKdATHWdqWHceaZcOcaRHOdQUH[eAXHgeq[_{MqQHKdATHWdqWHceaZHofYQHKdATHWdqWHceaZHofQ]hGcqSHSdaVH_eQYHkfA\HwfyQHKdATHWdqWHceaZHofQ]H{gIbEs[PqFK\@tF[^A@e{[PqFK\@tFc^aBIMO@AC_R@ODO[@pFG[psFS]`~GOgqces[`vFk^@|Fw^q@IKhAde{\@xFk^@|Fw^qBIKhAdIYO@AC_R@ODO[@tFg]p{Fs^`~GOgqcIShafcCO`PDGSpSEs[`vFo_A@GG_qCIKhAdIWhqgcCO`PDGSpSE{\@xFw_A@GG_qCIKhAdIWhqgIeO@@CGQ@KD?SPQDKT@oFS]`~G?_QAGK`AbIOhQeI[iAhIi??@?G?oE?[A?HGS`aFG_aw??C?_E?[A?~C?RPMC{S@VDk`QEG[aaLIq??B?WAQDG_aqKGsjAlb{O@LCwRpOD[UqFGgaqKGsjAlIy??@?G?oC?S@_F?_AOI?kIOkBY??@?G?oC?S@_F?_AOI?kIOkJA??@?G?oC?S@_F?_AOI?kIOkHwfq_ICkApacLaoJCkgE?[A?H?gAohJ?kQqJMIQ]H{gA`J?kQqJKlGhBWaqKGsjanJ?kQqJKlAt_W@oG?cA_JAcaqKGsjanJ?kQqJKlAtJYIQJGobQ]H{gA`IwjqoJCkarJOlQuJ]??@?G?oC?S@_F?_AOI?kJ?uH?cQoJCkarJOlQuJ[mG??C?_B?O@OE?[A?H?gAokH?cQoJCkarJOlQuJ[mAx_??OA?K@?D?W@oG?cA_JAocAPHwfq_ICkApJGkqsJSlavJ_mQybWcAPJ?kQqJKlAtJWlqwJcmaz_W@oG?cA_JH?cQoJCkarJOlQuJ[mAxJgmq{h?cQ]H{gA`J?kQqJKlAtJWlqwJcmazJonWuGkbALH?cQmI{kApJGkqsJSlavJ_mQyJknA|Jy@_F?_AOI?kaqKGscAPIwjqoJCkarJOlQuJ[mAxJgmq{Jsna~gkbALH?cQ]H{gA`IwjqoJCkarJOlQuJ[mAxJgmq{Jsna~KA??@?G?oC?S@_F?_AOI?kJ?uHSeQ\ICkApJGkqsJSlavJ_mQyJknA|Jwnr?KE??@?G?oC?S@_F?_AOI?kJATHcfQ`J?kQqJKlAtJWlqwJcmazJonQ}J{oB@KI??@?G?oC?S@_F?_AOI?kJATHcfQ]H{gA`J?kQqJKlAtJWlqwJcmazJonQ}J{oB@KGowuHSeQ\ICkApJGkqsJSlavJ_mQyJknA|Jwnr?KCobBKQ@_F?_AOI?kdQXHsgQoJCkarJOlQuJ[mAxJgmq{Jsna~K?oRAKKpBDhSeQ\Hwfq_ICkApJGkqsJSlavJ_mQyJknA|Jwnr?KCobBKOpREbWaqKGsdQXHsgQmI{kApJGkqsJSlavJ_mQyJknA|Jwnr?KCobBKOpREK]@_F?_AOI?kaqKGsdQXHsgQmI{kApJGkqsJSlavJ_mQyJknA|Jwnr?KCobBKOpREK[qIJGobQTHcfQ]H{gA`IwjqoJCkarJOlQuJ[mAxJgmq{Jsna~K?oRAKKpBDKWprGKe??@?G?oC?S@_F?_AOI?kG?oB_OpbEWYW??C?_B?O@OE?[A?H?gAo_B?MADG_aqkIwqw??C?_B?O@OE?[A?H?gAo_B?M@UDgqrK_??OA?K@?D?W@oG?cA_JA?K?wEs[`vFo_QbIWiRJKorW??C?_B?O@OE?[A?H?gAo_B?MBJKorRM_??OA?K@?D?W@oG?cA_JA?M?xCKWpeEcqrKKsrbN_??OA?K@?D?W@oG?cA_JA?M?xGSaAJIojbJKorRMK{sG??C?_B?O@OE?[A?H?gAo_B_MPUDgqrKKsrbNL?sW??C?_B?O@OE?[A?H?gAo_B_MPlFG\p{GCgqeIcqrKKsrbNL?sRQ_??OA?K@?D?W@oG?cA_JA?M?xKkrBLKwrrOLCsbRa?M@BDST`VD_WpeEcqrKKsrbNL?sRQLKtG??K@_HA?M@TDWTpWGSaAJIojbJKorRMK{sBPLGsrSLUG?wDST`VD_UbJKorRMK{sBPLGsrSLStg_B_TPUD[U@lFG\p{GCgqeIcqrKKsrbNL?sRQLKtBTLWtw_B_TPUD[UBJKorRMK{sBPLGsrSLStbVLa??@?G?oC?S@_F?_AOI?kK@BDSUPbEWYRJKorRMK{sBPLGsrSLStbVL_uW??C?_B?O@OE?[A?H?gAooDSUQDG_aqkIwqrKKsrbNL?sRQLKtBTLWtrWLcug??C?_B?O@OE?[A?H?gAooDST`XDgqrKKsrbNL?sRQLKtBTLWtrWLcubZ_??OA?K@?D?W@oG?cA_JB?TPXEs[`vFo_QbIWiRJKorRMK{sBPLGsrSLStbVL_uRYLkvG??C?_B?O@OE?[A?H?gAooDSURJKorRMK{sBPLGsrSLStbVL_uRYLkvB\_??OA?K@?D?W@oG?cA_JBcOpTDcWpeEcqrKKsrbNL?sRQLKtBTLWtrWLcubZLovR]_??OA?K@?D?W@oG?cA_JBcTPXGSaAJIojbJKorRMK{sBPLGsrSLStbVL_uRYLkvB\Lwvw??C?_B?O@OE?[A?H?gAoxDST`XDgqrKKsrbNL?sRQLKtBTLWtrWLcubZLovR]L{wG??C?_B?O@OE?[A?H?gAoxDSUPlFG\p{GCgqeIcqrKKsrbNL?sRQLKtBTLWtrWLcubZLovR]L{wB`_??OA?K@?D?W@oG?cA_JBcTPXKkrBLKwrrOLCsbRLOtRUL[uBXLgur[Lsvb^M?wRacKTPUD[U@XEKX`hKkrBLKwrrOLCsbRLOtRUL[uBXLgur[Lsvb^M?wRaMM??B?WAPTDWTpWDc`QGGkjAmKkrBLKwrrOLCsbRLOtRUL[uBXLgur[Lsvb^M?wRaMKxHTDWTpWDcUbJKorRMK{sBPLGsrSLStbVL_uRYLkvB\Lwvr_MCwbbMOxXTDWTpWDcZPqF[^A@IKhahKkrBLKwrrOLCsbRLOtRUL[uBXLgur[Lsvb^M?wRaMKxBdMYTPUD[U@XKkrBLKwrrOLCsbRLOtRUL[uBXLgur[Lsvb^M?wRaMKxBdMWxxdE_YPiEkZ`rFW\pwFc]`|GIYPiEk\`vF_]PyHwfq_ICkatJ_mq}KCpBFKgyXdE_YPiEkZ`rFW\pwFc]`|GG`aHGoyRiecY`jFW\pwFc]aEGcbA]H{gA`JGlQwJknb@KOprIMcybjeSY@hEgYpmFK\`vF_]PyFs_aJGobQmI{lavJ_nr?KCqBHKgyRiMkzHhEgYpuF[]@xFgaqKGsfa^I?gQmI{katJWlqwJkna~K?oRCK[qBHKgyRiMkzBleSY@jEw[pwFs_aTHcfQ`KGorCKSpbFK_qRIMcybjMozRmhSeQ\Hwfq_ICkatJ_mq}KCobBKOpREK[qBHKgyRiMkzBlMwzxdE_YpmFK]@|GG`aHGodQXHsgRAKKpBDKWprGKcqbhMgyrkMszbnNA`aHGodQXHsfa^I?gQqJSmAzJwoRAKKpBDKWprGKcqbhMgyrkMszbnN?{XdE_YpmFK]@|GGaqKGsdQXHsgQmI{lavJ_nr?KCobBKOpREK[qBHKgyRiMkzBlMwzroNC{iJGobQTHcfQ]H{gA`IwjqqJSlavJ_mq}J{oB@KGorCKSpbFK_qRIMcybjMozRmM{{BpNG{x\EA|X\E?|RunS|bvdsWBtNW|rwnS|bvN_}X\E?|RuN[}BxNi|RuN[}BxNg}wNAOM_zBscaUHgfgNAOI?zBscaUHgfb|_??OA?K@?D?W@oG?cA_JAcJ?uJ?kQqJKlAtJWlqwJcmazJonrAKKpBDKa??@?G?oC?S@_F?_AOI?kIOkJ?kQqJKlAtJWlqwJcmazKGorCN}??@?G?oC?S@_F?_AOI?kIOkHwfq_ICkApJGkqsJSlavJ_mQyJknb@KGorCK[qbiMozboNG|B~OA?_D?_AohBWbqPJ?kQqJKlAtJWlqwJcnA~KGpRGN|?C@_G?oC?SA?H?gAohG{cQoJCkarJOlQuJ[mB~O@?SA_G@OG?kIQNHCfa^I?gQoJCkarJOlQuJ[mAzJwoRCK[qbiMozboNG|B~O@?SAOMIOuGkbALIwjqoJCkarJOlQuJ[mAxJonr?KCobDK_qRIMszbrNO~s?OD?cBOQ?oC?SAOI?kIQJGobQmI{kApJGkqsJSlavJ_nr?KCqBHKgzRmNK|B~O@?SAOL@CDacaqKGsfa^I?gQmI{kApJGkqsJSlavJ_mq}J{oB@KOprGKcqbiMozRmN?{brNO~s?OD?cBOP@SE_??OA?K@?D?W@oG?cA_JAoLaoJCkarJWmQyJknA~KGorCKSqB~O@?SAOL@CDOX@w??C?_B?O@OE?[A?H?gAokJ?kQqJcmazKGorCN|?C@OH?sCOT@cFOa??@?G?oC?S@_F?_AOI?kJA]H{gA`J?kQqJSmAxJgmq}KCobBKOprIMgzBmN?{bsN|?C@OH?sCOT@cFO`AW@?G@?D?[A?I?kLaNHCkArJWmQ{J{obDK_~s?OD?cBOP@SEO\ACHOi?OA?K@?D?[A?H?gAqNHC~s?OD?cBOP@SEO\ACHOhAw@?G@?D?[A?I?kbqPHwfq_ICkatJ_mq}KCpBFKgybkMw{BqNO~s?OD?cBOP@SEO\ACHOhAsK_C@?F?gLaJGobQmI{kArJWlqwJcnA~K?oRAKSqBHKgzRmNK|B~O@?SAOL@CDOX@sGOdAcJOpBW@?K@?D?[AOI?kaqKGsjanJWlqwJ{oB@K_qRIMszbrNO~s?OD?cBOP@SEO\ACHOhAsKOtBg@?O@oIGkbALHwfq_ICjanJGlQuJ[mAzJwnr?KCpBFK_qRIMgzBlMw{BqNK|B~O@?SAOL@CDOX@sGOdAcJOpBSMO}??@?G?oC?S@_F?_AOI?kJ?uHSeQ\ICkApJGkquJcmazJonrAKKpBDKWprGKcqbnN?{RqNK|B~O@?SAOL@CDOX@sGOdAcJOpBSMO|CG??C?_B?O@OE?[A?H?gAokHSeQ\ICkApJGmQyJkobBKOpREK[qBHKgzroNC{brNO~s?OD?cBOP@SEO\ACHOhAsKOtBcNP@CW??C?_B?O@OE?[A?H?gAokHSeQ\Hwfq_ICkApJGlQwJcmazJwoRAKKpBDKWprGKcqbiMozbnN?{RqNK|B~O@?SAOL@CDOX@sGOdAcJOpBSMO|CCPPI?_D?_AouG{cQTHcfQ`J?kquJcnA~KGorCKSpbFK_qRIM{{BpNG{rsN|?C@OH?sCOT@cFO`ASIOlBCLOxBsOPDCcR_G?oC?SA?H?gAqNHCdQXHsgRAKKpBDKWprGKcqbnN?{RqNK|B~O@?SAOL@CDOX@sGOdAcJOpBSMO|CCPPHCsS_G@OG?kbqPHSeQ\Hwfq_ICkatJ_mq}KCobBKOpREK[qBHKgybkMwzroNC{brNO~s?OD?cBOP@SEO\ACHOhAsKOtBcNP@CSQPLDCTbWaqKGsdQXHsgQmI{kArJWlqwJcnA~K?oRAKKpBDKWprGKcqblMwzroNC{brNO~s?OD?cBOP@SEO\ACHOhAsKOtBcNP@CSQPLDCTPY?oC?SAOI?kaqKGsdQXHsgQmI{lavJ_nr?KCobBKOpREK[qBHKgzRmM{{BpNG{rsN|?C@OH?sCOT@cFO`ASIOlBCLOxBsOPDCcRPPDSUP]aqKGsdQXHsfa^I?gQmI{katJWlqwJkna~K?oRAKKpBDKWprGKcqbiMozRmM{{BpNG{rsN|?C@OH?sCOT@cFO`ASIOlBCLOxBsOPDCcRPPDSUP\EN

要为图形着色,请使用以下实现:

    G = nx.read_sparse6("graph_file")
    
    for i in range(0,12):
        G.nodes[i]["color"] = "yellow"
    for i in range(12,40):
        G.nodes[i]["color"] = "red"
    for i in range(40,63):
        G.nodes[i]["color"] = "turquoise"
    for i in range(63,85):
        G.nodes[i]["color"] = "green"
    for i in range(85,163):
        G.nodes[i]["color"] = "blue"
    for i in range(163, 176):
        G.nodes[i]["color"] = "magenta"
    for i in range(176, 282):
        G.nodes[i]["color"] = "orange"  

正式地,我们有一个 独立的 设置 侧面有一些覆盖约束的问题。整数规划 似乎很适合,虽然我不知道 R 具体是哪个 给你添麻烦

独立集的简单整数规划公式 每个节点都有一个 0–1 变量,每个边都有一个约束。这 然而,线性松弛很差,因为(例如)三角图 对于“1.5 节点”,有一个分数解,其中每个节点都是 0.5 分数独立集。因为 G 最多只有几百个 然而,我们可以使用一个更好的公式来约束 每个最大团(对于每个团,独立集包含在 大多数一个派系节点)。

使用 NetworkX 和 OR-Tools 以下:

import networkx as nx

G = nx.read_sparse6("graph_file")

for i in range(0, 12):
    G.nodes[i]["color"] = "yellow"
for i in range(12, 40):
    G.nodes[i]["color"] = "red"
for i in range(40, 63):
    G.nodes[i]["color"] = "turquoise"
for i in range(63, 85):
    G.nodes[i]["color"] = "green"
for i in range(85, 163):
    G.nodes[i]["color"] = "blue"
for i in range(163, 176):
    G.nodes[i]["color"] = "magenta"
for i in range(176, 282):
    G.nodes[i]["color"] = "orange"


R = {
    "red": 11,
    "turquoise": 11,
    "blue": 8,
    "orange": 6,
    "green": 4,
    "magenta": 2,
    "yellow": 1,
}


import collections
from ortools.linear_solver import pywraplp

solver = pywraplp.Solver.CreateSolver("SCIP")
x = {v: solver.BoolVar(str(v)) for v in G.nodes}
for K in nx.find_cliques(G):
    solver.Add(sum(x[v] for v in K) <= 1)
nodes_with_color = collections.defaultdict(list)
for v in G.nodes:
    nodes_with_color[G.nodes[v]["color"]].append(v)
for color, bound in R.items():
    solver.Add(sum(x[v] for v in nodes_with_color[color]) >= bound)
solver.Maximize(sum(x_v for x_v in x.values()))
status = solver.Solve()
if status == pywraplp.Solver.OPTIMAL:
    solution = [v for (v, x_v) in x.items() if x_v.SolutionValue()]
    print(*solution)
    print(collections.Counter(G.nodes[v]["color"] for v in solution))
elif status == pywraplp.Solver.INFEASIBLE:
    print("no solution")
else:
    print("unknown status", status)