Python: map each word to the notes that contain it
I have a list of words like this:
word_list = [{"word": "python", "repeted": 4},
             {"word": "awsome", "repeted": 3},
             {"word": "frameworks", "repeted": 2},
             {"word": "programing", "repeted": 2},
             {"word": "Whosebug", "repeted": 2},
             {"word": "work", "repeted": 1},
             {"word": "error", "repeted": 1},
             {"word": "teach", "repeted": 1}]
It was built from this list of notes:
note_list = [{"note_id": 1, "note_txt": "A curated list of awesome Python frameworks"},
             {"note_id": 2, "note_txt": "what is awesome Python frameworks"},
             {"note_id": 3, "note_txt": "awesome Python is good to wok with it"},
             {"note_id": 4, "note_txt": "use Whosebug to lern programing with python is awsome"},
             {"note_id": 5, "note_txt": "error in programing is good to learn"},
             {"note_id": 6, "note_txt": "Whosebug is very useful to share our knoloedge"},
             {"note_id": 7, "note_txt": "teach, work"}]
I want to know how to map each word to the notes it appears in:
maped_list = [{"word": "python", "notes_ids": [1, 2, 3, 4]},
              {"word": "awsome", "notes_ids": [1, 2, 3]},
              {"word": "frameworks", "notes_ids": [1, 2]},
              {"word": "programing", "notes_ids": [4, 5]},
              {"word": "Whosebug", "notes_ids": [4, 6]},
              {"word": "work", "notes_ids": [7]},
              {"word": "error", "notes_ids": [5]},
              {"word": "teach", "notes_ids": [7]}]
My attempt:
import re

# I started by collecting all the note texts into one list
notes_test = []
for note in note_list:
    notes_test.append(note['note_txt'])

# count the repetitions of each word
dict = {}
for sentence in notes_test:
    for word in re.split(r'\s', sentence):  # split on whitespace
        try:
            dict[word] += 1
        except KeyError:
            dict[word] = 1

word_list = []
for key in dict.keys():
    word = {}
    word['word'] = key
    word['repeted'] = dict[key]
    word_list.append(word)
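My questions:
- How can I map the word list against the note list to get the mapped list?
- What do you think of the quality of my code? Any comments are welcome.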
You can use a list comprehension:
mapped_list = [{"word": w_dict["word"],
                "notes_ids": [n_dict["note_id"] for n_dict in note_list
                              if w_dict["word"].lower() in n_dict["note_txt"].lower()]
                } for w_dict in word_list]
The result will be:
[{'word': 'python', 'notes_ids': [1, 2, 3, 4]},
{'word': 'awsome', 'notes_ids': [4]},
{'word': 'frameworks', 'notes_ids': [1, 2]},
{'word': 'programing', 'notes_ids': [4, 5]},
{'word': 'Whosebug', 'notes_ids': [4, 6]},
{'word': 'work', 'notes_ids': [1, 2, 7]},
{'word': 'error', 'notes_ids': [5]},
{'word': 'teach', 'notes_ids': [7]}]
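The result above differs from the expected output in two places. "awsome" only matches note 4 because notes 1 to 3 spell it "awesome", and "work" also matches notes 1 and 2 because `in` is a substring test and "frameworks" contains "work". If whole-word matching is what you want, here is a minimal sketch. It assumes a word is any run of letters, digits, or underscores, and the helper name `map_words_to_notes` is made up for illustration:

```python
import re

def map_words_to_notes(word_list, note_list):
    """Map each word to the ids of the notes that contain it as a whole word."""
    # Tokenize every note once into a set of lowercase words.
    note_tokens = [
        (note["note_id"], set(re.findall(r"\w+", note["note_txt"].lower())))
        for note in note_list
    ]
    return [
        {
            "word": entry["word"],
            "notes_ids": [note_id for note_id, tokens in note_tokens
                          if entry["word"].lower() in tokens],
        }
        for entry in word_list
    ]

# Small sample taken from the question's data.
word_list = [{"word": "work", "repeted": 1}, {"word": "python", "repeted": 4}]
note_list = [
    {"note_id": 1, "note_txt": "A curated list of awesome Python frameworks"},
    {"note_id": 7, "note_txt": "teach, work"},
]
print(map_words_to_notes(word_list, note_list))
# [{'word': 'work', 'notes_ids': [7]}, {'word': 'python', 'notes_ids': [1]}]
```

Tokenizing each note into a set up front also avoids re-scanning every note text for every word.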
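- Try building maped_list at the same time as you build the dictionary, recording the id of each note a word appears in as you iterate.
- Don't use dict as a variable name. It is not a reserved word, but it is the name of Python's built-in dictionary type (as in dict()), and assigning to it shadows the built-in for the rest of that scope. Also, since your input contains no whitespace other than spaces, you can simply use sentence.split(). Another thing you can do is convert all words to lowercase, so that capitalization makes no difference.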
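Putting those suggestions together, one pass over the notes can produce both the counts and the note ids. This is only a sketch: the function name `build_word_index` is hypothetical, and stripping surrounding punctuation (so that "teach," counts as "teach") is an assumption that goes beyond what the question does.

```python
from collections import defaultdict
import string

def build_word_index(note_list):
    """Count words and collect note ids in a single pass over the notes."""
    counts = defaultdict(int)      # word -> number of occurrences
    note_ids = defaultdict(list)   # word -> ids of the notes containing it
    for note in note_list:
        for raw in note["note_txt"].lower().split():
            # Assumption: drop punctuation attached to the word.
            word = raw.strip(string.punctuation)
            if not word:
                continue
            counts[word] += 1
            if note["note_id"] not in note_ids[word]:
                note_ids[word].append(note["note_id"])
    word_list = [{"word": w, "repeted": c} for w, c in counts.items()]
    maped_list = [{"word": w, "notes_ids": ids} for w, ids in note_ids.items()]
    return word_list, maped_list

# Small sample taken from the question's data.
notes = [
    {"note_id": 5, "note_txt": "error in programing is good to learn"},
    {"note_id": 7, "note_txt": "teach, work"},
]
word_list, maped_list = build_word_index(notes)
```

Using `counts` and `note_ids` as names leaves the built-in `dict` untouched, and `defaultdict` replaces the try/except KeyError pattern.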