How to convert a dataframe into a list of 3-tuples
I want to create a directed graph using the networkx library in Python.
I have a pandas dataframe that looks like this:
                             Head Mounted Display  Marker  Smartphone
2D data extrusion                               3       0           1
AgiSoft PhotoScan 3D design                     1       2           2
AuGeo Esri AR template                          1       1           2
BIM                                             1       1           0
Blender 3D design                               0       2           4
Bluetooth localization                          1       1           0
CityEngine                                      3       1           2
GIS data processing                             3       1           2
GNSS localization                               1       2           4
Google ARCore                                   0       1           5
Google SketchUp 3D design                       1       2           0
Image Stitching                                 1       1           4
Java Development Kit                            0       1           0
SLAM                                            1       2           2
Unity 3D                                        8      12          10
Unreal Engine                                   1       1           0
Vuforia                                         2       7           3
As input for the networkx.DiGraph.add_weighted_edges_from function, I need to format it as a list of 3-tuples like this:
('Head Mounted Display', '2D data extrusion', 3),
('Head Mounted Display', 'AgiSoft PhotoScan 3D design', 1),
('Head Mounted Display', 'AuGeo Esri AR template', 1),
etc...
In addition, tuples with a weight of 0, such as:
('Marker', '2D data extrusion', 0)
need to be removed from the list.
Does anyone know how to do this?
Thanks in advance!
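You can use the code below:
lstOfTuples = []
for i in range(df.shape[0]):
    for j in range(df.shape[1]):
        index = df.index[i]
        col = df.columns[j]
        value = float(df.loc[index, col])
        # keep only edges with a positive weight
        if value > 0:
            lstOfTuples.append((col, index, value))
lstOfTuples
Then create a directed graph like this:
import networkx as nx
G = nx.DiGraph()  # DiGraph rather than Graph, so that edge direction is kept
G.add_weighted_edges_from(ebunch_to_add=lstOfTuples)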
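The choice of graph class matters for this step, since an undirected nx.Graph accepts the same tuples but forgets which node was the source. A minimal sketch of the difference, using two of the tuples from the question (the variable names undirected and directed are only illustrative):

```python
import networkx as nx

# Two of the tuples from the question, in (source, target, weight) order.
edges = [
    ("Head Mounted Display", "2D data extrusion", 3),
    ("Head Mounted Display", "AgiSoft PhotoScan 3D design", 1),
]

undirected = nx.Graph()
undirected.add_weighted_edges_from(edges)

directed = nx.DiGraph()
directed.add_weighted_edges_from(edges)

# An undirected graph reports the edge in both directions...
print(undirected.has_edge("2D data extrusion", "Head Mounted Display"))  # True
# ...while a directed graph keeps only source -> target.
print(directed.has_edge("2D data extrusion", "Head Mounted Display"))  # False
print(directed["Head Mounted Display"]["2D data extrusion"]["weight"])  # 3
```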
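You can create the required list of tuples as follows:
def createTuples(df, onColumn=0):
    sze = df.shape[0]
    colName = list(df.columns)[onColumn]
    rslt = []
    for r in range(sze):
        # iloc[r, onColumn] selects the cell by position
        if df.iloc[r, onColumn] > 0:
            rslt.append((colName, df.index[r], df.iloc[r, onColumn]))
    return rslt
This approach lets you specify, by position, which column header to use in the first tuple position.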
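Since the function handles one column per call, covering the whole table means calling it once per column position. The sketch below shows one way to do that; it restates the function in condensed form so the snippet runs on its own, and the three-row sample frame and the name all_edges are assumptions made for illustration.

```python
import pandas as pd


def createTuples(df, onColumn=0):
    """Return (column label, row label, value) for the positive cells of one column."""
    colName = df.columns[onColumn]
    return [
        (colName, df.index[r], df.iloc[r, onColumn])
        for r in range(df.shape[0])
        if df.iloc[r, onColumn] > 0
    ]


# Three rows taken from the question's table.
df = pd.DataFrame(
    {"Head Mounted Display": [3, 1, 0], "Marker": [0, 2, 2], "Smartphone": [1, 2, 4]},
    index=["2D data extrusion", "AgiSoft PhotoScan 3D design", "Blender 3D design"],
)

# Call the function once per column position and concatenate the results.
all_edges = []
for pos in range(df.shape[1]):
    all_edges.extend(createTuples(df, onColumn=pos))

print(len(all_edges))  # 7 non-zero cells in the sample
```

The resulting list can be passed to add_weighted_edges_from unchanged.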
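Use df.columns[0] to get 'Head Mounted Display' and df.index[i] to get the row name. Note that df stands for the name of your own dataframe.
Then build a tuple with a condition, comparing against zero by value (!=) rather than by identity:
tuple((df.columns[0], df.index[i], df.iloc[i, 0]) for i in range(len(df)) if df.iloc[i, 0] != 0)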
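The comparison against zero deserves a closer look, because pandas hands back NumPy integers rather than Python ints, and an identity test (`is not`) behaves differently from a value test (`!=`) on them. A small sketch, assuming a two-row frame made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Head Mounted Display": [3, 0]},
    index=["2D data extrusion", "Blender 3D design"],
)

value = df.iloc[1, 0]  # the zero cell, returned as a NumPy integer
zero = 0

print(isinstance(value, np.integer))  # True
# Identity: two distinct objects, so this test would never filter the zero out.
print(value is not zero)  # True
# Value comparison: recognises the zero as expected.
print(bool(value != zero))  # False

kept = tuple(
    (df.columns[0], df.index[i], int(df.iloc[i, 0]))
    for i in range(len(df))
    if df.iloc[i, 0] != 0
)
print(kept)  # (('Head Mounted Display', '2D data extrusion', 3),)
```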
Using .melt will help get the data into the shape you are interested in. Here is a reproducible example:
import networkx as nx
import pandas as pd
# create a dummy dataframe with a similar structure
df = pd.DataFrame(zip(range(6), range(5, -1, -1)))
df.columns = list("ab")
df.index = list("qwerty")
# flatten the dataframe for easier processing
flat = df.melt(ignore_index=False).reset_index()
# ignore 0
mask = flat["value"] > 0
flat = flat.loc[mask]
# create a directed graph
G = nx.DiGraph()
# fill in the edges (row label -> column label)
for start, end, weight in flat.values:
    G.add_edge(start, end, weight=weight)
print(G.nodes())  # ['w', 'a', 'e', 'r', 't', 'y', 'q', 'b']
print(G.edges())  # [('w', 'a'), ('w', 'b'), ('e', 'a'), ('e', 'b'), ('r', 'a'), ('r', 'b'), ('t', 'a'), ('t', 'b'), ('y', 'a'), ('q', 'b')]
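If the goal is the literal list of 3-tuples from the question, the melted frame can also be turned into one directly. The sketch below is one possible way, assuming a three-row sample of the question's table; it relies on the default column names that reset_index and melt produce ("index", "variable", "value") and reorders them so each tuple reads (column label, row label, weight).

```python
import networkx as nx
import pandas as pd

# Three rows taken from the question's table.
df = pd.DataFrame(
    {"Head Mounted Display": [3, 1, 0], "Marker": [0, 2, 2], "Smartphone": [1, 2, 4]},
    index=["2D data extrusion", "AgiSoft PhotoScan 3D design", "Blender 3D design"],
)

# Long format: one row per (row label, column label, value) combination.
flat = df.melt(ignore_index=False).reset_index()
flat = flat.loc[flat["value"] > 0]  # drop zero-weight rows

# Pick the columns in (column label, row label, weight) order and emit plain tuples.
edges = list(flat[["variable", "index", "value"]].itertuples(index=False, name=None))
print(edges[0])  # ('Head Mounted Display', '2D data extrusion', 3)

G = nx.DiGraph()
G.add_weighted_edges_from(edges)
```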
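Similar to @SultanOrazbayev's answer, you can melt the dataframe, but you can then use the nx.from_pandas_edgelist function to work with the melted dataframe directly, without creating a list of tuples.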
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
# Sample df
df = pd.DataFrame({'Head Mounted Display': [3, 1, 1, 1, 0], 'Marker': [0, 2, 1, 1, 2], 'Smartphone': [1, 2, 2, 0, 4]})
# melt the dataframe and filter out the rows with a weight of zero
df_long_temp = df.reset_index().melt(id_vars='index', var_name='to', value_name='weight')
df_long = df_long_temp[df_long_temp['weight'] != 0]
# create the graph with edge weights
g = nx.from_pandas_edgelist(df_long, source='index', target='to',
                            edge_attr='weight', create_using=nx.DiGraph)
# draw the graph with the weights as edge labels
pos = nx.spring_layout(g)
nx.draw_networkx(g, pos=pos)
weight_dict = {(u, v): 'w={}'.format(w) for u, v, w in g.edges(data='weight')}
nx.draw_networkx_edge_labels(g, pos=pos, edge_labels=weight_dict)
plt.show()
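The sample above uses a default integer index, so its edges run from row numbers to column names. A sketch of the same idea with the question's row labels kept and the edges pointing from column to row, as in the requested tuples, might look like this (the axis and column names "tool" and "device" are hypothetical, chosen only for readability):

```python
import networkx as nx
import pandas as pd

# Three rows taken from the question's table.
df = pd.DataFrame(
    {"Head Mounted Display": [3, 1, 0], "Marker": [0, 2, 2], "Smartphone": [1, 2, 4]},
    index=["2D data extrusion", "AgiSoft PhotoScan 3D design", "Blender 3D design"],
)

# Name the index so that reset_index() yields a meaningful column, then melt.
df_long = (
    df.rename_axis("tool")
    .reset_index()
    .melt(id_vars="tool", var_name="device", value_name="weight")
)
df_long = df_long[df_long["weight"] != 0]

# source/target decide the direction: device -> tool.
g = nx.from_pandas_edgelist(
    df_long, source="device", target="tool",
    edge_attr="weight", create_using=nx.DiGraph,
)

print(g["Marker"]["Blender 3D design"]["weight"])  # 2
print(g.has_edge("Marker", "2D data extrusion"))  # False, the weight was 0
```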