Plotly: How to draw a Sankey diagram from a dataframe?
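I have a dataframe:
Vendor Name      Category               Count
AKJ Education    Books                  846888
AKJ Education    Computers & Tablets    1045
Amazon           Books                  1294423
Amazon           Computers & Tablets    42165
Amazon           Other                  415
Flipkart         Books                  1023
I am trying to draw a Sankey diagram from the dataframe above, with Vendor Name as the source, Category as the target, and Count as the flow (link width). I tried using Plotly but had no success. Does anyone have a solution for building a Sankey diagram with Plotly?
Thanks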
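As the answers in the post show, forcing the Sankey data source into a single dataframe can quickly get confusing. You're better off keeping the nodes separate from the links, since the two are constructed differently.
So your nodes dataframe should look like this:
ID  Label                  Color
0   AKJ Education          #4994CE
1   Amazon                 #8A5988
2   Flipkart               #449E9E
3   Books                  #7FC241
4   Computers & tablets    #D3D3D3
5   Other                  #4994CE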
And your links dataframe should look like this, where Source and Target hold node IDs from the nodes dataframe:
Source  Target  Value     Link Color
0       3       846888    rgba(127, 194, 65, 0.2)
0       4       1045      rgba(127, 194, 65, 0.2)
1       3       1294423   rgba(211, 211, 211, 0.5)
1       4       42165     rgba(211, 211, 211, 0.5)
1       5       415       rgba(211, 211, 211, 0.5)
2       3       1023      rgba(253, 227, 212, 1)
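The two tables above do not have to be typed by hand. Here is a minimal sketch of deriving them from the dataframe in the question with pandas. It assumes that no label is used both as a vendor and as a category, and it leaves the color columns out, since colors are a styling choice rather than something the data provides. The variable names (`labels`, `label_to_id`) are illustrative only.

```python
import pandas as pd

# The question's data in long form: one row per (vendor, category) flow.
df = pd.DataFrame({
    "Vendor Name": ["AKJ Education", "AKJ Education", "Amazon",
                    "Amazon", "Amazon", "Flipkart"],
    "Category": ["Books", "Computers & Tablets", "Books",
                 "Computers & Tablets", "Other", "Books"],
    "Count": [846888, 1045, 1294423, 42165, 415, 1023],
})

# Node labels: every vendor first, then every category, in order of appearance.
# Assumption: no label appears in both columns.
labels = df["Vendor Name"].unique().tolist() + df["Category"].unique().tolist()
df_nodes = pd.DataFrame({"ID": range(len(labels)), "Label": labels})

# Sankey links refer to nodes by integer position, so map labels to IDs.
label_to_id = {label: i for i, label in enumerate(labels)}

df_links = pd.DataFrame({
    "Source": df["Vendor Name"].map(label_to_id),
    "Target": df["Category"].map(label_to_id),
    "Value": df["Count"],
})

print(df_nodes)
print(df_links)
```

Listing the vendors before the categories reproduces the same node IDs as the hand-written nodes table, so the resulting Source and Target columns line up with it.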
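Now, with a setup similar to the Scottish referendum chart on plot.ly, you can build the Sankey diagram from these two dataframes.
Because the numbers differ so widely, that particular chart looks a bit odd. For illustration purposes, the code below replaces all of your numbers with 1.
Here's the whole thing for an easy copy and paste into a Jupyter Notebook: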
# imports
import pandas as pd
from plotly.offline import init_notebook_mode, iplot

# Load plotly.js so the figure renders inside the notebook
init_notebook_mode(connected=True)

# Nodes & links (the first row of each list holds the column headers)
nodes = [['ID', 'Label', 'Color'],
         [0, 'AKJ Education', '#4994CE'],
         [1, 'Amazon', '#8A5988'],
         [2, 'Flipkart', '#449E9E'],
         [3, 'Books', '#7FC241'],
         [4, 'Computers & tablets', '#D3D3D3'],
         [5, 'Other', '#4994CE']]

# Links with every value set to 1, for illustrative purposes
links = [['Source', 'Target', 'Value', 'Link Color'],
         # AKJ Education
         [0, 3, 1, 'rgba(127, 194, 65, 0.2)'],
         [0, 4, 1, 'rgba(127, 194, 65, 0.2)'],
         # Amazon
         [1, 3, 1, 'rgba(211, 211, 211, 0.5)'],
         [1, 4, 1, 'rgba(211, 211, 211, 0.5)'],
         [1, 5, 1, 'rgba(211, 211, 211, 0.5)'],
         # Flipkart
         [2, 3, 1, 'rgba(253, 227, 212, 1)']]

# Links with your actual data (uncomment to use instead) ########
# links = [['Source', 'Target', 'Value', 'Link Color'],
#          # AKJ Education
#          [0, 3, 846888, 'rgba(127, 194, 65, 0.2)'],
#          [0, 4, 1045, 'rgba(127, 194, 65, 0.2)'],
#          # Amazon
#          [1, 3, 1294423, 'rgba(211, 211, 211, 0.5)'],
#          [1, 4, 42165, 'rgba(211, 211, 211, 0.5)'],
#          [1, 5, 415, 'rgba(211, 211, 211, 0.5)'],
#          # Flipkart
#          [2, 3, 1023, 'rgba(253, 227, 212, 1)']]
#################################################################

# Retrieve headers and build dataframes
nodes_headers = nodes.pop(0)
links_headers = links.pop(0)
df_nodes = pd.DataFrame(nodes, columns=nodes_headers)
df_links = pd.DataFrame(links, columns=links_headers)

# Sankey plot setup
data_trace = dict(
    type='sankey',
    domain=dict(
        x=[0, 1],
        y=[0, 1]
    ),
    orientation="h",
    valueformat=".0f",
    node=dict(
        pad=10,
        # thickness=30,
        line=dict(
            color="black",
            width=0
        ),
        label=df_nodes['Label'].dropna(),
        color=df_nodes['Color']
    ),
    link=dict(
        # Source and target are node IDs, i.e. positions in the label list
        source=df_links['Source'].dropna(),
        target=df_links['Target'].dropna(),
        value=df_links['Value'].dropna(),
        color=df_links['Link Color'].dropna(),
    )
)

layout = dict(
    title="Draw Sankey Diagram from dataframes",
    height=772,
    font=dict(size=10),
)

fig = dict(data=[data_trace], layout=layout)
iplot(fig, validate=False)
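The code above sets every link value to 1 because the real counts span several orders of magnitude. As an alternative, here is a minimal sketch that compresses the counts with a base-10 logarithm instead of flattening them, assuming you are willing to give up link widths that are proportional to the counts. The helper name `log_scaled` is hypothetical and not part of Plotly.

```python
import math

# Counts from the question, in the same order as its rows.
counts = [846888, 1045, 1294423, 42165, 415, 1023]


def log_scaled(values):
    """Return log10 of each count; assumes every count is greater than 1."""
    return [math.log10(v) for v in values]


widths = log_scaled(counts)

# Show each original count next to its compressed width.
for count, width in zip(counts, widths):
    print(f"{count:>8} -> {width:.2f}")
```

The compressed widths keep the ordering of the counts but not their ratios, so this only suits charts where showing that a flow exists matters more than its exact size. The result could be written to the Value column of the links dataframe in place of the 1s.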