Offline dictionary program: find similar words as well as words that begin the same way
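I wrote this offline dictionary program. When the user presses a key, I want the program to look in the database and find a word close to what the user has typed so far. Or, when the user has typed a complete word that is in the database, the program should display it together with its meanings.
This part works fine. Next, I want the program to show all words in the database that begin with "a" when, for example, the user types "a".
Here is an example of my problem: when we type "a", all words beginning with "a" and their meanings should be displayed. Instead, the program shows the following:
Here is part of my database in JSON format: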
{"apple": ["Apple", "apple", "Sib", "Apfel", "Des pommes"], "average": ["Average", "average", "Miangin", "Durchschnitt", "Des pommes"], "acknowledge": ["Acknowledge", "acknowledge", "Tasdigh Kardan", "Zu bestatigen", "Pour reconnaître"], "book": ["Book", "book", "Ketab", "Buch", "Livre"], "banana": ["Banana", "banana", "Mouz", "Bananen", "Bananes"], "beach grass": ["Beach Grass", "beach grass", "Chamane Sahel", "Strandhafer", "herbe de plage"], "cat": ["Cat", "cat", "Gorbe", "Katzen", "chatte"], "certificate": ["Certificate", "certificate", "Govahi Name", "Zertifikat", "certificat"], "declaration of conformity": ["Declaration Of Conformity", "declaration of conformity", "Elamie Entebagh", "Konformitatserklarung", "déclaration de conformité"], "database": ["Database", "database", "Paygah Dade", "Datenbank", "base de données"], "dear colleagues": ["Dear Colleagues", "dear colleagues", "Hamkarane Aziz", "Liebe Mitarbeiterinnen und Mitarbeiter", "Chers collègues"]}
In this dictionary, every word has its meanings in English, Persian, French, and German.
You can see my code below:
import json
import msvcrt
import os
from difflib import get_close_matches

DataBase = json.load(open("DataBase.json"))

def getMeaning(w):
    w = w.lower()
    n = len(w)
    if w in DataBase:
        return DataBase[w]
    elif len(get_close_matches(w,DataBase.keys(),1,0.3)) > 0:
        close_match = get_close_matches(w,DataBase.keys(),1,0.3)[0]
        print("Not Found!\nCheck The Close Match:\n")
        return DataBase[close_match]
    else:
        print ("Not Found!\n")
        res = [value for key, value in DataBase.items()]
        for i in res:
            for j in i:
                if w in j[0:n].lower():
                    print(j)
        return ''

word = ''
while True:
    if msvcrt.kbhit():
        temp = msvcrt.getwch()
        word += temp
        os.system('cls')
        print(word)
        print("\n")
        meaning = getMeaning(word)
        for item in meaning:
            print(item)
Note that because of msvcrt.kbhit(), you must run this program in CMD for it to work properly.
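If someone types "a", you call getMeaning, which in turn calls get_close_matches. You then check whether that call returned a non-empty result, and if it did, you execute return DataBase[close_match]. getMeaning ends there.
As long as get_close_matches produces a result, you will never reach the else part of getMeaning. The screenshot in your question shows the result for the input "a", and it makes sense, because get_close_matches considers "cat" to be similar to "a".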
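To make this concrete, here is a minimal sketch that reproduces the lookup for the input "a" using only the keys of the database from the question. The names keys and similarity are introduced here for illustration; the score printed is SequenceMatcher.ratio(), which is the measure get_close_matches ranks candidates by.

```python
from difflib import SequenceMatcher, get_close_matches

# Keys copied from the database in the question (values omitted for brevity).
keys = [
    "apple", "average", "acknowledge", "book", "banana", "beach grass",
    "cat", "certificate", "declaration of conformity", "database",
    "dear colleagues",
]

def similarity(candidate, word):
    """Similarity score in [0, 1] as computed by difflib.SequenceMatcher."""
    return SequenceMatcher(None, candidate, word).ratio()

# Only candidates scoring at least the cutoff (0.3) are considered at all.
scores = {key: similarity(key, "a") for key in keys}
above_cutoff = {key: score for key, score in scores.items() if score >= 0.3}
print(above_cutoff)

# With n=1, only the single best-scoring candidate is returned.
best = get_close_matches("a", keys, 1, 0.3)
print(best)
```

Only "apple" and "cat" clear the cutoff, and "cat" scores higher because it is shorter, so it is the one match returned.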
That said, you should use startswith if you want to test whether a string begins with another string. Also, you don't need elif or else after a previous if or elif that ends in a return. I have also changed the names according to the PEP 8 section "Descriptive: Naming Styles".
Here is a possible solution. It uses a filter that accepts a close match only if it consists of the same letters as word:
from difflib import get_close_matches
database = {"apple": ["Apple", "apple", "Sib", "Apfel", "Des pommes"], "average": ["Average", "average", "Miangin", "Durchschnitt", "Des pommes"], "acknowledge": ["Acknowledge", "acknowledge", "Tasdigh Kardan", "Zu bestatigen", "Pour reconnaître"], "book": ["Book", "book", "Ketab", "Buch", "Livre"], "banana": ["Banana", "banana", "Mouz", "Bananen", "Bananes"], "beach grass": ["Beach Grass", "beach grass", "Chamane Sahel", "Strandhafer", "herbe de plage"], "cat": ["Cat", "cat", "Gorbe", "Katzen", "chatte"], "certificate": ["Certificate", "certificate", "Govahi Name", "Zertifikat", "certificat"], "declaration of conformity": ["Declaration Of Conformity", "declaration of conformity", "Elamie Entebagh", "Konformitatserklarung", "déclaration de conformité"], "database": ["Database", "database", "Paygah Dade", "Datenbank", "base de données"], "dear colleagues": ["Dear Colleagues", "dear colleagues", "Hamkarane Aziz", "Liebe Mitarbeiterinnen und Mitarbeiter", "Chers collègues"]}

def get_meaning(word):
    # Make word case-insensitive
    word = word.lower()
    # Check if word already in database
    if word in database:
        return {word: database[word]}
    # Find possible close matches
    close_matches = get_close_matches(word, database.keys(), 1, 0.3)
    # Filter matches: keep only those which contain the same letters
    close_matches = [
        close_match
        for close_match in close_matches
        if set(close_match) == set(word)
    ]
    # Return close matches if any left
    if close_matches:
        return {
            close_match: database[close_match]
            for close_match in close_matches
        }
    # Return all dictionary entries which start with the word
    return {
        entry: database[entry]
        for entry in database
        if entry.startswith(word)
    }
Now "a" no longer produces "cat":
>>> get_meaning("a")
{'apple': ['Apple', 'apple', 'Sib', 'Apfel', 'Des pommes'], 'average': ['Average', 'average', 'Miangin', 'Durchschnitt', 'Des pommes'], 'acknowledge': ['Acknowledge', 'acknowledge', 'Tasdigh Kardan', 'Zu bestatigen', 'Pour reconnaître']}
But "applle" is still recognized as "apple":
>>> get_meaning("applle")
{'apple': ['Apple', 'apple', 'Sib', 'Apfel', 'Des pommes']}
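Note that get_meaning returns a dict mapping each entry to its list of translations, whereas getMeaning in the question returned a bare list, so the printing loop in the question would need a small adjustment. The helper below is a hypothetical sketch of one way to do that; format_results and the sample value are not part of the original code.

```python
def format_results(results):
    """Render a {entry: [translation, ...]} mapping as printable text."""
    if not results:
        return "Not Found!"
    lines = []
    for entry, translations in results.items():
        lines.append(entry)
        # Indent each translation under the entry it belongs to.
        lines.extend("  " + translation for translation in translations)
    return "\n".join(lines)

# Sample value in the shape returned by get_meaning("applle").
sample = {"apple": ["Apple", "apple", "Sib", "Apfel", "Des pommes"]}
print(format_results(sample))
```

In the keystroke loop, print(format_results(get_meaning(word))) would then replace the for item in meaning loop.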
Alternatively, you can change the cutoff argument in the call to get_close_matches to get different results.
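In get_close_matches, the optional argument cutoff is a float in the range [0, 1].
Possibilities that don't score at least that similar to the word are ignored.
So I only needed to change the cutoff of get_close_matches from 0.3 to 0.8.
That solved my problem.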
    elif len(get_close_matches(w,DataBase.keys(),1,0.8)) > 0:
        close_match = get_close_matches(w,DataBase.keys(),1,0.8)[0]
        print("Not Found!\nCheck The Close Match:\n")
        return DataBase[close_match]
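The effect of the two cutoff values can be checked in isolation. This is a small sketch using only the keys of the database from the question; keys and closest are illustrative names, not part of the original program.

```python
from difflib import get_close_matches

# Keys copied from the database in the question (values omitted for brevity).
keys = [
    "apple", "average", "acknowledge", "book", "banana", "beach grass",
    "cat", "certificate", "declaration of conformity", "database",
    "dear colleagues",
]

def closest(word, cutoff):
    """Best close match for word at the given cutoff, or None if there is none."""
    matches = get_close_matches(word, keys, n=1, cutoff=cutoff)
    return matches[0] if matches else None

for word in ("a", "applle"):
    print(word, closest(word, 0.3), closest(word, 0.8))
```

At 0.8, "a" has no close match, so getMeaning falls through to the else branch and lists the entries that begin with "a", while a near-miss such as "applle" still resolves to "apple".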