How can I turn a nested dictionary into a graph?
I have a nested dictionary that looks like this:
{'Track_108': {'Track_3994': [(1, 6)],
               'Track_4118': [(8, 9)],
               'Track_4306': [(25, 26), (28, 30)]},
 'Track_112': {'Track_4007': [(19, 20)]},
 'Track_121': {'Track_4478': [(102, 104)]},
 'Track_130': {'Track_4068': [(132, 134)]},
 'Track_141': {'Track_5088': [(93, 95)],
               'Track_5195': [(103, 104), (106, 107)]}}
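Each list holds the intervals (durations) of particular events. The first number is the "start frame" and the second number is the "last frame", both inclusive. So "Track_3994" has one event with a duration of 6 frames.
I would like to plot a histogram with the event duration on the x axis and the count on the y axis. I need one histplot for the whole dictionary and, ideally, one for each track in the first column (the top-level keys).
For the whole dictionary, the y axis shows how many times each duration occurs. In the data I provided there is only one event with a duration of 6, so that bar has a height of 1. The bar at 2 on the x axis has a height of 5, because there are 5 events with a duration of 2 frames.
The per-track plots would show only the distribution of durations for that track, so they will be much smaller. For example, Track_108 would get a plot with a bar of height 2 at x=2, a bar of height 1 at x=3, and a bar of height 1 at x=6.
To handle the calculation and counting, you could use something like this: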
from typing import Dict, List, Tuple  # only type hints for the expected argument types; could be left out


def calculate_track_event_data(data_dict: Dict[str, List[Tuple[int, int]]]) -> Dict[int, int]:
    """
    Counts the durations for a single track sub-dict (a dict of other tracks,
    each with a list of (start, last) frame intervals as specified in the question).
    Returns a dict with duration to count as key-value pairs.
    """
    hist_plot_data = {}
    for track, track_data in data_dict.items():
        for duration_info in track_data:
            # both frames are inclusive, hence the + 1
            duration = duration_info[1] - duration_info[0] + 1
            try:
                hist_plot_data[duration] += 1  # count up for calculated duration
            except KeyError:
                hist_plot_data[duration] = 1  # add duration if not added yet
    return hist_plot_data


def calculate_top_layer_event_data(data_dict: Dict[str, Dict[str, List[Tuple[int, int]]]]) -> Dict[int, int]:
    """
    Counts the durations across the entire dict.
    Returns a dict with duration to count as key-value pairs.
    """
    hist_plot_data = {}
    for top_level_track, top_level_track_data in data_dict.items():
        hist_for_track = calculate_track_event_data(top_level_track_data)
        for duration, count in hist_for_track.items():
            try:
                hist_plot_data[duration] += count  # sum up collected count for calculated duration
            except KeyError:
                hist_plot_data[duration] = count  # add duration if not added yet
    return hist_plot_data
For the given dictionary, this results in:
# Data definition
data = {'Track_108': {'Track_3994': [(1, 6)],
                      'Track_4118': [(8, 9)],
                      'Track_4306': [(25, 26), (28, 30)]},
        'Track_112': {'Track_4007': [(19, 20)]},
        'Track_121': {'Track_4478': [(102, 104)]},
        'Track_130': {'Track_4068': [(132, 134)]},
        'Track_141': {'Track_5088': [(93, 95)],
                      'Track_5195': [(103, 104), (106, 107)]}}

# Call in code:
print(calculate_track_event_data(data['Track_108']))
print(calculate_top_layer_event_data(data))

# Result on output:
# {6: 1, 2: 2, 3: 1}   <-- result for Track_108
# {6: 1, 2: 5, 3: 4}   <-- result for the complete dictionary
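As a possible shortening of the two functions above, the same counting can be sketched with collections.Counter from the standard library. The names count_durations, count_all_durations and Intervals are made up for this illustration and do not appear in the original answer; the logic (last frame minus start frame plus one, then tally) is meant to be the same.

```python
from collections import Counter
from typing import Dict, List, Tuple

Intervals = List[Tuple[int, int]]  # hypothetical alias: list of (start, last) frames


def count_durations(track_dict: Dict[str, Intervals]) -> Counter:
    """Tally inclusive durations (last - start + 1) for one top-level track."""
    return Counter(
        last - start + 1
        for intervals in track_dict.values()
        for start, last in intervals
    )


def count_all_durations(data: Dict[str, Dict[str, Intervals]]) -> Counter:
    """Add up the per-track tallies over the whole nested dictionary."""
    total = Counter()
    for track_dict in data.values():
        total += count_durations(track_dict)
    return total


data = {'Track_108': {'Track_3994': [(1, 6)],
                      'Track_4118': [(8, 9)],
                      'Track_4306': [(25, 26), (28, 30)]},
        'Track_112': {'Track_4007': [(19, 20)]},
        'Track_121': {'Track_4478': [(102, 104)]},
        'Track_130': {'Track_4068': [(132, 134)]},
        'Track_141': {'Track_5088': [(93, 95)],
                      'Track_5195': [(103, 104), (106, 107)]}}

print(count_durations(data['Track_108']))
print(count_all_durations(data))
```

A Counter is a dict subclass, so the result can be passed on wherever the plain dicts from the functions above would be used.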
To visualize the results, you can use one of the Python plotting libraries, such as matplotlib (see e.g. "How to plot a histogram using Matplotlib in Python with a list of data?" or https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html).
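One way the plotting step might look is sketched below. Since the durations are already counted, it draws a bar chart with ax.bar instead of calling plt.hist, which expects the raw list of durations. The function name plot_duration_counts, the Agg backend line and the output file name are assumptions made for this sketch; the counts are copied from the output shown earlier.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display; drop this line for interactive windows
import matplotlib.pyplot as plt


def plot_duration_counts(counts, title, ax):
    """Draw one bar per duration; the bar height is the number of events."""
    durations = sorted(counts)
    ax.bar(durations, [counts[d] for d in durations], width=0.8)
    ax.set_xticks(durations)
    ax.set_xlabel("Duration (frames)")
    ax.set_ylabel("Count")
    ax.set_title(title)


# Counts copied from the printed results for the example dictionary.
total_counts = {6: 1, 2: 5, 3: 4}
track_108_counts = {6: 1, 2: 2, 3: 1}

# One axis for the whole dictionary, one for a single top-level track.
fig, (ax_all, ax_track) = plt.subplots(1, 2, figsize=(9, 4))
plot_duration_counts(total_counts, "All tracks", ax_all)
plot_duration_counts(track_108_counts, "Track_108", ax_track)
fig.tight_layout()
fig.savefig("duration_histograms.png")  # hypothetical output file name
```

To get one plot per top-level track, the same function could be called in a loop over the dictionary keys, with one axis per track.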