How to access keys in a dictionary?
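As a beginner in Python, I am working with a CSV file and have created an edge list in which each value is paired one-to-one with the other values in the same CSV row, for example:
Output: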
value1 value2
value1 value3
value2 value3
value4 value5
.
.
.
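Then I assigned a unique number to every value in the CSV file, so that the unique number acts as the key and the CSV item acts as the value in a dictionary. If a value is repeated in the CSV file, I do not want to assign it another key.
Output: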
dictionary=
{
1: "value1",
2: "value2",
3: "value3",
.
.
.
}
Now I want the edge list (the one I created earlier) as output, but with every value in it replaced by its key in the dictionary, for example:
1 2
1 3
2 3
.
.
.
Thanks!
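I think the main problem here is that you build the dictionary from unique numbers to names; I feel you should build it the other way round (from names to unique numbers). Also, once that map is built, you are missing the code that converts the edge list into the final mapping with unique numbers.
Find my suggested code below: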
#!/usr/bin/python
# -*- coding: utf-8 -*-

# For better print formatting
from __future__ import print_function

# Imports
import sys


#
# HELPER METHODS
#
def mapping(csv_filename, mapping_filename):
    if __debug__:
        print("CSV File: " + str(csv_filename))
        print("Mapping File: " + str(mapping_filename))

    # Retrieve data from CSV file
    with open(csv_filename, "r") as csv_file:
        data_raw = csv_file.readlines()
    data = []
    for line in data_raw:
        line = line.strip()
        elements = line.split(",")
        elements = [e.strip() for e in elements]
        data.append(elements)

    # Create mapping list and file
    mapping_list = []
    with open(mapping_filename, "w") as mapping_file:
        for elements in data:
            j = 0
            while j != len(elements) - 1:
                for k in range(j + 1, len(elements)):
                    # Add to mapping
                    temp = [elements[j], elements[k]]
                    mapping_list.append(temp)
                    # Write to file
                    mapping_file.write(elements[j] + " " + elements[k] + "\n")
                j += 1

    # Return the mapping
    return mapping_list


def build_key_map(mapping_list):
    if __debug__:
        print("Mapping List: " + str(mapping_list))

    key_dict = {}
    i = 1
    # Check each parsed node inside each edge
    for edge in mapping_list:
        for node in edge:
            # Add node to keys if it has not been registered yet
            if node not in key_dict:
                key_dict[node] = i
                i = i + 1
    return key_dict


def build_graph(mapping_list, key_dict):
    if __debug__:
        print("Mapping List: " + str(mapping_list))
        print("Key Dict: " + str(key_dict))

    # Copy the existing mapping, replacing each node (inside each edge) by its unique number
    new_mapping_list = []
    for edge in mapping_list:
        new_edge = []
        for node in edge:
            new_edge.append(key_dict[node])
        new_mapping_list.append(new_edge)
    return new_mapping_list


#
# MAIN
#
def main():
    csv_file = sys.argv[1]
    mapping_file = sys.argv[2]

    mapping_list = mapping(csv_file, mapping_file)
    key_dict = build_key_map(mapping_list)
    new_mapping_list = build_graph(mapping_list, key_dict)

    print("FINAL MAPPING: ")
    for edge in new_mapping_list:
        print(edge)


#
# ENTRY POINT
#
if __name__ == "__main__":
    main()
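Note that, although putting the whole process in a single function might improve performance, I have tried to keep your code in separate functions:
mapping
Parses the CSV file, generates a mapping (a list of node-to-node edges) and writes it to the given file. Here the CSV lines are split on "," (as in the example you provided), although your original code split on ":".
build_key_map
Creates a dictionary from node names to unique numbers.
build_graph
Converts the mapping by replacing each node name with its unique number.
With your input, the expected output is: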
> python parser.py "csv.txt" "map.txt"
CSV File: csv.txt
Mapping File: map.txt
Mapping List: [['man', 'nut'], ['man', 'bag'], ['nut', 'bag'], ['rat', 'cat'], ['dog', 'fog'], ['dog', 'cat'], ['dog', 'man'], ['fog', 'cat'], ['fog', 'man'], ['cat', 'man']]
Mapping List: [['man', 'nut'], ['man', 'bag'], ['nut', 'bag'], ['rat', 'cat'], ['dog', 'fog'], ['dog', 'cat'], ['dog', 'man'], ['fog', 'cat'], ['fog', 'man'], ['cat', 'man']]
Key Dict: {'nut': 2, 'dog': 6, 'cat': 5, 'bag': 3, 'rat': 4, 'fog': 7, 'man': 1}
FINAL MAPPING:
[1, 2]
[1, 3]
[2, 3]
[4, 5]
[6, 7]
[6, 5]
[6, 1]
[7, 5]
[7, 1]
[5, 1]
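I have also added some comments, but let me know if you need me to clarify any part.
EDIT:
By the way, if you really need a mapping from unique numbers to values, you can always invert the dictionary and store it, while the algorithm keeps using the dictionary from names to unique numbers. To invert the dictionary you only need: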
inverted_dict = dict([[v,k] for k,v in key_dict.items()])
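The same inversion can also be written as a dict comprehension. The following is only a small sketch; the three-entry key_dict is sample data borrowed from the expected output above, not the full dictionary. Inverting is lossless here only because every name was given a distinct number.

```python
# Sample data (an assumption for illustration): a few name -> number entries.
key_dict = {"man": 1, "nut": 2, "bag": 3}

# Swap each (name, number) pair to get number -> name.
inverted_dict = {number: name for name, number in key_dict.items()}

print(inverted_dict[2])  # nut
```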
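EDIT 2:
Here is another version of the mapping function that produces the mapping with unique numbers directly (instead of using several functions and intermediate structures).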
def mapping(csv_filename, mapping_filename):
    if __debug__:
        print("CSV File: " + str(csv_filename))
        print("Mapping File: " + str(mapping_filename))

    # Retrieve data from CSV file
    with open(csv_filename, "r") as csv_file:
        data_raw = csv_file.readlines()
    data = []
    for line in data_raw:
        line = line.strip()
        elements = line.split(",")
        elements = [e.strip() for e in elements]
        data.append(elements)

    # Create mapping list and file
    mapping_list = []
    key_dict = {}
    unique_num = 1
    with open(mapping_filename, "w") as mapping_file:
        for elements in data:
            j = 0
            while j != len(elements) - 1:
                for k in range(j + 1, len(elements)):
                    if __debug__:
                        print("Converting: " + elements[j] + " -> " + elements[k])

                    # Transform elements to keys
                    if elements[j] in key_dict:
                        key_j = key_dict[elements[j]]
                    else:
                        key_dict[elements[j]] = unique_num
                        key_j = unique_num
                        unique_num = unique_num + 1
                    if elements[k] in key_dict:
                        key_k = key_dict[elements[k]]
                    else:
                        key_dict[elements[k]] = unique_num
                        key_k = unique_num
                        unique_num = unique_num + 1

                    # Add to mapping
                    if __debug__:
                        print("Adding: " + str(key_j) + " -> " + str(key_k))
                    mapping_list.append([key_j, key_k])
                    # Write to file
                    mapping_file.write(str(key_j) + " " + str(key_k) + "\n")
                j += 1

    # Return the mapping
    return mapping_list
Its expected output is:
> python parser.py "csv.txt" "map.txt"
CSV File: csv.txt
Mapping File: map.txt
Converting: man -> nut
Adding: 1 -> 2
Converting: man -> bag
Adding: 1 -> 3
Converting: nut -> bag
Adding: 2 -> 3
Converting: rat -> cat
Adding: 4 -> 5
Converting: dog -> fog
Adding: 6 -> 7
Converting: dog -> cat
Adding: 6 -> 5
Converting: dog -> man
Adding: 6 -> 1
Converting: fog -> cat
Adding: 7 -> 5
Converting: fog -> man
Adding: 7 -> 1
Converting: cat -> man
Adding: 5 -> 1
FINAL MAPPING:
[1, 2]
[1, 3]
[2, 3]
[4, 5]
[6, 7]
[6, 5]
[6, 1]
[7, 5]
[7, 1]
[5, 1]
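The "look the name up, or assign the next free number" step in this second version can be condensed with dict.setdefault. The sketch below is not the code above; it works on in-memory rows instead of files, and the function name number_edges and the sample rows are assumptions made for illustration.

```python
def number_edges(rows):
    """Return (edges, key_dict), with each name replaced by a 1-based number."""
    key_dict = {}
    edges = []
    for row in rows:
        for j, a in enumerate(row):
            for b in row[j + 1:]:
                # setdefault returns the stored number if the name is known;
                # otherwise it stores and returns the next free number.
                key_a = key_dict.setdefault(a, len(key_dict) + 1)
                key_b = key_dict.setdefault(b, len(key_dict) + 1)
                edges.append([key_a, key_b])
    return edges, key_dict


# Sample rows (an assumption), matching the input used in the outputs above.
rows = [["man", "nut", "bag"], ["rat", "cat"], ["dog", "fog", "cat", "man"]]
edges, key_dict = number_edges(rows)
for edge in edges:
    print(edge)
```

Because numbers are handed out contiguously starting at 1, len(key_dict) + 1 is always the next unused number, so no separate counter is needed.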
Unless you have no other choice, you should not manage indices by hand in a loop (while(j!=len(i)-1)).
For combinations, you can use itertools:
>>> import itertools
>>> list(itertools.combinations(["man", "nut", "bag"], 2))
[('man', 'nut'), ('man', 'bag'), ('nut', 'bag')]
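To check that itertools.combinations really yields the same pairs, in the same order, as an index-based loop, a quick comparison can be run. This is only a sketch; pairs_with_indices is a hypothetical helper written for the comparison, not code from the question.

```python
import itertools


def pairs_with_indices(elements):
    """Build all pairs with hand-managed indices, for comparison only."""
    pairs = []
    j = 0
    while j < len(elements) - 1:
        for k in range(j + 1, len(elements)):
            pairs.append((elements[j], elements[k]))
        j += 1
    return pairs


row = ["dog", "fog", "cat", "man"]
same = pairs_with_indices(row) == list(itertools.combinations(row, 2))
print(same)  # True
```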
It is then easy to get the edges.
I created a reader for the example you gave:
>>> data = """man,nut,bag
... rat,cat
... dog,fog,cat,man"""
...
>>> import io
>>> import csv
>>> reader = csv.reader(io.StringIO(data))
The edges are the concatenation of the combinations of each row:
>>> edges = [(v1, v2) for row in reader for v1, v2 in itertools.combinations(row, 2)]
>>> edges
[('man', 'nut'), ('man', 'bag'), ('nut', 'bag'), ('rat', 'cat'), ('dog', 'fog'), ('dog', 'cat'), ('dog', 'man'), ('fog', 'cat'), ('fog', 'man'), ('cat', 'man')]
Now you can extract the unique elements from edges:
>>> vs = sorted(set(a for e in edges for a in e))
>>> vs
['bag', 'cat', 'dog', 'fog', 'man', 'nut', 'rat']
(I use sorted here to get a reproducible result, but you don't need it.) To number each vertex, just use its index in the list:
>>> list(enumerate(vs))
[(0, 'bag'), (1, 'cat'), (2, 'dog'), (3, 'fog'), (4, 'man'), (5, 'nut'), (6, 'rat')]
>>> i_by_v = {v: i for i, v in enumerate(vs)}
>>> i_by_v
{'bag': 0, 'cat': 1, 'dog': 2, 'fog': 3, 'man': 4, 'nut': 5, 'rat': 6}
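If the numbers should instead follow the order of first appearance and start at 1, as in the output the question asks for, one possible variant is sketched below. It assumes Python 3.7 or later, where dict preserves insertion order; the names sample_edges, order and number_of are made up for this sketch.

```python
# Sample edges (an assumption): the first four edges of the example.
sample_edges = [("man", "nut"), ("man", "bag"), ("nut", "bag"), ("rat", "cat")]

# dict.fromkeys keeps only the first occurrence of each name, in order.
order = list(dict.fromkeys(v for edge in sample_edges for v in edge))

# Number the names starting from 1.
number_of = {v: i for i, v in enumerate(order, start=1)}

print(order)  # ['man', 'nut', 'bag', 'rat', 'cat']
print([(number_of[a], number_of[b]) for a, b in sample_edges])
# [(1, 2), (1, 3), (2, 3), (4, 5)]
```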
Let's replace the vertices with their numbers:
>>> [(i_by_v[v1], i_by_v[v2]) for v1, v2 in edges]
[(4, 5), (4, 0), (5, 0), (6, 1), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4), (1, 4)]
Now you can use any graph algorithm you want.
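Put together, the steps of this answer fit in one small function. This is a sketch under assumptions: the function name numbered_edge_lines is invented here, and it takes the CSV content as a string rather than a file name so that it can be run as is.

```python
import csv
import io
import itertools


def numbered_edge_lines(csv_text):
    """Return 'a b' lines in which each name is replaced by its number."""
    reader = csv.reader(io.StringIO(csv_text))
    # All pairs of values that share a row.
    edges = [pair for row in reader for pair in itertools.combinations(row, 2)]
    # Sorted unique names, numbered by their index in the list.
    vs = sorted(set(v for edge in edges for v in edge))
    i_by_v = {v: i for i, v in enumerate(vs)}
    return ["{} {}".format(i_by_v[a], i_by_v[b]) for a, b in edges]


lines = numbered_edge_lines("man,nut,bag\nrat,cat\ndog,fog,cat,man")
print("\n".join(lines))
```

To read from a real file, the io.StringIO object could be replaced by a file opened with open(path, newline=""), which is the mode the csv module documentation recommends.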