How to extract full text from a list with fuzzywuzzy?
Here is my code:
from fuzzywuzzy import fuzz

MIN_MATCH_SCORE = 30
heard_word = 'i5-1135G7 '
# Open for reading ("r"); a file opened with "a" cannot be iterated
with open("text.txt", "r") as check:
    possible_words = [line.strip() for line in check]
guessed_word = [word for word in possible_words
                if fuzz.ratio(heard_word, word) >= MIN_MATCH_SCORE]
print('this one - ', guessed_word)
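Expected output:
11th Generation Intel® Core™ i5-1135G7 Processor
Is it possible to get the whole sentence in the expected output by giving 'i5-1135G7 ' alone? Is there any other solution that would do what I want? Thanks in advance.
Below is the link for text.txt: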
https://drive.google.com/file/d/1Mo3qFmeOAqa3WPPyg8SpeFVSjDx7AQBj/view
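To offset the longer sentences and make sure the overlap happens at the word level, you should use token_set_ratio. Also, if you want complete word overlap, raise MIN_MATCH_SCORE to close to 100.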
from fuzzywuzzy import fuzz

MIN_MATCH_SCORE = 90
heard_word = 'i5-1135G7'
possible_words = ['11th Generation Intel® Core™ i5-1135G7 Processor (2.40 GHz,up to 4.20 GHz with Turbo Boost, 4 Cores, 8 Threads, 8 MB Cache)',
                  'windows 10 64 bit', 'intel i7']
print([word for word in possible_words
       if fuzz.token_set_ratio(heard_word, word) >= MIN_MATCH_SCORE])
Output:
['11th Generation Intel® Core™ i5-1135G7 Processor (2.40 GHz,up to 4.20 GHz with Turbo Boost, 4 Cores, 8 Threads, 8 MB Cache)']
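To see why a short query can score 100 against a much longer sentence, here is a rough standard-library sketch of the token-set idea. It is not fuzzywuzzy's code: the function names (tokens, ratio, token_set_score) are made up for illustration, and difflib.SequenceMatcher stands in for the library's string ratio, so exact scores may differ from fuzz.token_set_ratio.

```python
import re
from difflib import SequenceMatcher

def tokens(text):
    # Lowercase, then split on anything that is not a letter or digit
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def ratio(a, b):
    # Similarity of two strings on a 0-100 scale
    return round(100 * SequenceMatcher(None, a, b).ratio())

def token_set_score(query, candidate):
    # Hypothetical helper: compare the shared tokens against each side
    q, c = tokens(query), tokens(candidate)
    common = " ".join(sorted(q & c))
    with_q = (common + " " + " ".join(sorted(q - c))).strip()
    with_c = (common + " " + " ".join(sorted(c - q))).strip()
    return max(ratio(common, with_q),
               ratio(common, with_c),
               ratio(with_q, with_c))

long_line = ('11th Generation Intel® Core™ i5-1135G7 Processor (2.40 GHz,'
             'up to 4.20 GHz with Turbo Boost, 4 Cores, 8 Threads, 8 MB Cache)')
print(token_set_score('i5-1135G7', long_line))
print(token_set_score('i5-1135G7', 'windows 10 64 bit'))
```

Every token of the query ('i5' and '1135g7') also appears in the long line, so the "shared tokens" string is identical to the query's own token string and the extra words in the long line do not lower the score.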
# token_set_ratio works fine!
from fuzzywuzzy import fuzz

# Note: df1 and g are not defined in this snippet; they come from code not shown here
s = []
for l in df1.values:
    l = ', '.join(l)
    s.append(l)
s = ', '.join(s)
main = [x for x in g if x]
MIN_MATCH_SCORE = 60
heard_word = 'i5-11th gen'
guessed_word = [word for word in main
                if fuzz.token_set_ratio(heard_word, word) >= MIN_MATCH_SCORE]
print('this one - ', guessed_word)
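For the original text.txt case, the same filtering can be wrapped in a small helper that strips each line, skips blanks, and returns the best matches first. This is only a sketch: matching_lines and toy_scorer are hypothetical names, an in-memory io.StringIO stands in for the real file, and the toy scorer is a placeholder for whatever scorer you actually pass in (for example fuzz.token_set_ratio).

```python
import io

def matching_lines(query, lines, scorer, min_score):
    """Return (score, line) pairs at or above min_score, best first."""
    scored = []
    for raw in lines:
        line = raw.strip()      # drop the trailing newline
        if not line:
            continue            # skip blank lines
        score = scorer(query, line)
        if score >= min_score:
            scored.append((score, line))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

def toy_scorer(query, line):
    # Hypothetical stand-in: 100 if the query occurs in the line, else 0
    return 100 if query.lower() in line.lower() else 0

# In-memory stand-in for open("text.txt", "r")
fake_file = io.StringIO(
    "11th Generation Intel® Core™ i5-1135G7 Processor\n"
    "\n"
    "windows 10 64 bit\n"
    "intel i7\n"
)
result = matching_lines("i5-1135G7", fake_file, toy_scorer, 90)
print(result)
```

With a real file you would pass the open file object (opened in "r" mode) as lines, since iterating it yields one line at a time.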