Controlling Sankey diagram connections

Question

I am trying to control which flows connect to each other in a Matplotlib Sankey diagram. I am modifying the basic two-systems example.

I think my confusion comes down to misunderstanding what this actually means:

Notice that only one connection is specified, but the systems form a circuit since: (1) the lengths of the paths are justified and (2) the orientation and ordering of the flows is mirrored.

I made a toy example that uses a single data set and then modifies it for the second system, to make sure all the numbers match.

import numpy as np
import matplotlib.pyplot as plt

from matplotlib.sankey import Sankey

plt.rcParams["figure.figsize"] = (15,10)


system_1 = [
    {"label": "1st",  "value":  2.00, "orientation":  0},
    {"label": "2nd",  "value":  0.15, "orientation": -1},
    {"label": "3rd",  "value":  0.60, "orientation": -1},
    {"label": "4th",  "value": -0.10, "orientation": -1},
    {"label": "5th",  "value":  0.25, "orientation": -1},
    {"label": "6th",  "value":  0.25, "orientation": -1},
    {"label": "7th",  "value":  0.25, "orientation": -1},
    {"label": "8th",  "value":  0.25, "orientation": -1},
    {"label": "9th",  "value":  0.25, "orientation": -1}
]

system_2 = system_1[:4]
system_2.append({"label": "new",  "value":  -0.25, "orientation": 1})


fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, xticks=[], yticks=[], title="Where are all my cows?")
flows  = [x["value"] for x in system_1]
labels = [x["label"] for x in system_1]
orientations=[x["orientation"] for x in system_1]
sankey = Sankey(ax=ax, unit="cow")
sankey.add(flows=flows, 
           labels=labels,
           label='one',
           orientations=orientations)

sankey.add(flows=[-x["value"] for x in system_2], 
           labels=[x["label"] for x in system_2],
           label='two',
           orientations=[-x["orientation"] for x in system_2], 
           prior=0, 
           connect= (0,0)
          )

diagrams = sankey.finish()
diagrams[-1].patch.set_hatch('/')
plt.legend(loc='best')


plt.show()

This gives me:

It should join the flows with matching labels.

I've read this and this, but they didn't help me understand what is actually happening.

Answer 1

First, clearing up the confusion

I think my confusion comes down to misunderstanding what this actually means:

Notice that only one connection is specified, but the systems form a circuit since: (1) the lengths of the paths are justified and (2) the orientation and ordering of the flows is mirrored.

(2) the orientation and ordering of the flows is mirrored.

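You may have misunderstood what "mirrored" means here, and it is genuinely confusing in this context. One might think that mirrored means inverted, but that is only partly true:
The flows (or, as you call them in your code, values) must be inverted, and you got that right. The sign of a value marks it as an input (value > 0) or an output (value < 0), and only an output can connect to an input, and vice versa.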
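The sign rule can be checked before calling Sankey.add. The helper below is a minimal sketch, not part of matplotlib: can_pair and its tol parameter are names made up for illustration. It only tests that two flow values are opposite in sign and equal in size, which is the condition described above.

```python
def can_pair(flow_a, flow_b, tol=1e-6):
    """Return True if one flow is an output and the other an equal-sized input.

    Hypothetical helper for illustration; it is not a matplotlib function.
    """
    opposite_signs = flow_a * flow_b < 0      # one input (> 0), one output (< 0)
    same_size = abs(flow_a + flow_b) < tol    # the two values cancel out
    return opposite_signs and same_size


# The "1st" flow of the question's first system is an input of 2.00.
print(can_pair(2.00, -2.00))  # input against a matching output
print(can_pair(2.00, 2.00))   # two inputs
print(can_pair(0.60, -0.25))  # opposite signs, different sizes
```

As far as I know, Sankey.add applies a comparable equal-and-opposite test to the pair named in connect and raises a ValueError when it fails, so a check like this mostly helps when preparing the flow lists.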

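But the orientation of the two flows you are trying to connect must be the same. It is not inverted, yet it still has to be "mirrored". What does that mean? If an I/O looks in the direction of the arrow it belongs to, it needs to see the other I/O (as if looking into a mirror); only then can the two connect. As a non-native speaker I find this hard to explain, so I'll try to illustrate the idea: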

Able to connect:         Not able to connect:        Not able to connect:
I/O  Mirror  I/O         I/O  Mirror  I/O            I/O  Mirror  I/O
╚══>   |    >══╝          ╗     |      ╔                    |      ║
                          ║     |      ║             ══>    |      ║
                          v     |      ^                    |      ^

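In your code you inverted the orientation. That is why the third flow of the orange system is in the top-left corner while the third flow of the blue system is in the bottom-right corner. These I/Os will never be able to "see" each other.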
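To see which labelled flows satisfy the two conditions described in this answer (opposite sign, same orientation), the data can be compared before plotting. This is a rough sketch of that rule only, not of matplotlib's own layout logic; compare_by_label is a hypothetical helper, and the two-entry system_1 is a shortened copy of the question's data.

```python
def compare_by_label(first, second):
    """Pair flows that share a label and report sign and orientation agreement.

    Hypothetical helper; each system is a list of dicts with the keys
    "label", "value" and "orientation", as in the question.
    """
    second_by_label = {flow["label"]: flow for flow in second}
    report = {}
    for flow in first:
        other = second_by_label.get(flow["label"])
        if other is None:
            continue  # no flow with this label in the second system
        report[flow["label"]] = {
            "opposite_sign": flow["value"] * other["value"] < 0,
            "same_orientation": flow["orientation"] == other["orientation"],
        }
    return report


system_1 = [
    {"label": "1st", "value": 2.00, "orientation": 0},
    {"label": "3rd", "value": 0.60, "orientation": -1},
]
# What the question's second add() call passes: values AND orientations negated.
system_2_as_drawn = [
    {"label": f["label"], "value": -f["value"], "orientation": -f["orientation"]}
    for f in system_1
]

report = compare_by_label(system_1, system_2_as_drawn)
for label, checks in report.items():
    print(label, checks)
```

With the question's data, "1st" passes both checks because its orientation is 0, while "3rd" has opposite signs but orientations -1 and 1.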

You can undo the inversion for the second system by removing the - in front of x in the orientations:

orientations=[x["orientation"] for x in system_2]

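You will see that the flows are now close to each other, but you are in the situation shown in the "Not able to connect" illustration (no. 2). This means the structure of your diagram cannot work this way. A single flow can only be bent in one of three directions: -90°, 0° or 90°, which correspond to orientations = -1, 0 or 1. The only way to connect these flows directly is to set their orientation=0, but I don't think that is what you are after.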
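For reference, here is a minimal sketch of two systems joined through flows with orientation=0. The flow values and the labels "in", "link" and "out" are invented for illustration and do not come from the question; the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.sankey import Sankey

fig, ax = plt.subplots()
sankey = Sankey(ax=ax, unit=None)

# First system: one input and one output, both horizontal (orientation 0).
sankey.add(flows=[1.0, -1.0],
           labels=["in", "link"],
           orientations=[0, 0],
           label="one")

# Second system: its input (index 0) attaches to the output (index 1)
# of the first system, which is diagram 0.
sankey.add(flows=[1.0, -1.0],
           labels=["link", "out"],
           orientations=[0, 0],
           label="two",
           prior=0,
           connect=(1, 0))

diagrams = sankey.finish()
```

Note that connect is given as (index in the prior system, index in this system), so the pair here is the output -1.0 of "one" and the input 1.0 of "two". Calling plt.show() with an interactive backend displays the result.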

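You need a different approach to this task, one that does not leave you stuck with flows that cannot be connected. I have modified your code to (perhaps?) achieve what you want. It no longer looks the same, but I think it is a good starting point for understanding orientation, mirroring and everything else.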

(1) the lengths of the paths are justified

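You will see in the code below that I have set values for the pathlengths argument (in the second system). In my experience, if you have too many flows that need to connect, matplotlib can no longer work this out automatically.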
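When pathlengths is given as a list, it needs one entry per flow, in the same order as flows. The sketch below pairs the second system's labels with the path lengths used in this answer and fails early if the counts differ; pair_pathlengths is a hypothetical helper, not a matplotlib function.

```python
def pair_pathlengths(labels, pathlengths):
    """Return (label, pathlength) pairs, one per flow.

    Hypothetical helper: raises ValueError when the two lists differ in length.
    """
    if len(labels) != len(pathlengths):
        raise ValueError(
            f"{len(labels)} flows but {len(pathlengths)} path lengths"
        )
    return list(zip(labels, pathlengths))


labels_2 = ["1st", "4th", "2nd", "3rd", "new"]
pathlengths_2 = [0, 0.4, 0.5, 0.65, 1.25]

for label, length in pair_pathlengths(labels_2, pathlengths_2):
    print(f"{label}: {length}")
```

Listing the pairs this way makes it easier to see which flow each length belongs to when tuning them by hand. To my knowledge, pathlengths also accepts a single number that is applied to every flow.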

Code and output

import numpy as np
import matplotlib.pyplot as plt

from matplotlib.sankey import Sankey

plt.rcParams["figure.figsize"] = (15,10)


system_1 = [
    {"label": "1st",  "value": -2.00, "orientation":  1},
    {"label": "4th",  "value":  0.10, "orientation":  1},
    {"label": "2nd",  "value":  0.15, "orientation":  1},
    {"label": "3rd",  "value":  0.60, "orientation":  1},
    {"label": "5th",  "value":  0.25, "orientation": -1},
    {"label": "6th",  "value":  0.25, "orientation": -1},
    {"label": "7th",  "value":  0.25, "orientation":  1},
    {"label": "8th",  "value":  0.25, "orientation":  1},
    {"label": "9th",  "value":  0.25, "orientation":  0}
]

system_2 = [
    {"label": "1st",  "value":  2.00, "orientation":  1},
    {"label": "4th",  "value": -0.10, "orientation":  1},
    {"label": "2nd",  "value": -0.15, "orientation":  1},
    {"label": "3rd",  "value": -0.60, "orientation":  1},
    {"label": "new",  "value": -0.25, "orientation":  1}
]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, xticks=[], yticks=[], title="Where are all my cows?")

flows_1  = [x["value"] for x in system_1]
labels_1 = [x["label"] for x in system_1]
orientations_1=[x["orientation"] for x in system_1]

flows_2  = [x["value"] for x in system_2]
labels_2 = [x["label"] for x in system_2]
orientations_2=[x["orientation"] for x in system_2]

sankey = Sankey(ax=ax, unit=None)
sankey.add(flows=flows_1, 
           labels=labels_1,
           label='one',
           orientations=orientations_1)

sankey.add(flows=flows_2, 
           labels=labels_2,
           label='two',
           orientations=orientations_2,
           pathlengths=[0, 0.4, 0.5, 0.65, 1.25],
           prior=0,
           connect=(0,0))

diagrams = sankey.finish()
diagrams[-1].patch.set_hatch('|')
diagrams[0].patch.set_hatch('-')
plt.legend(loc='best')


plt.show()
