Python 3.6 Pandas Difflib Get_Close_Matches to filter a dataframe with user input

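Using a csv imported into a pandas DataFrame, I am trying to search one column of the df for entries similar to user-generated input. I have never used difflib before, and my attempts end with either TypeError: object of type 'float' has no len() or an empty [] list.
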
import difflib
import pandas as pd

df = pd.read_csv("Vendorlist.csv", encoding= "ISO-8859-1")
word = input ("Enter a vendor: ")

def find_it(w):
    w = w.lower()
    return difflib.get_close_matches(w, df.vendorname, n=50, cutoff=.6)

alternatives = find_it(word)
print (alternatives)

The error seems to occur at "return difflib.get_close_matches(w, df.vendorname, n=50, cutoff=.6)".

I am trying to get results similar to "word" from the column named 'vendorname'.

Any help is greatly appreciated.

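Your vendorname column is not of the right type. The TypeError suggests that at least one value in it is a float (for example NaN from an empty cell), and get_close_matches needs every candidate to have a length.

Try this in your return statement: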

return difflib.get_close_matches(w, df.vendorname.astype(str), n=50, cutoff=.6)

import difflib
import pandas as pd

df = pd.read_csv("Vendorlist.csv", encoding="ISO-8859-1")
word = input("Enter a vendor: ")

def find_it(w):
    w = w.lower()
    # Cast the column to str so every candidate has a length
    return difflib.get_close_matches(w, df.vendorname.astype(str), n=50, cutoff=.6)

alternatives = find_it(word)
print(alternatives)
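
To see why the cast matters, here is a minimal sketch that assumes a small in-memory DataFrame in place of Vendorlist.csv. The vendor names are made up for illustration, and the names parameter on find_it is my own addition so the function does not depend on a global df.

```python
import difflib
import pandas as pd

# Hypothetical stand-in for Vendorlist.csv; the NaN mimics an empty cell.
df = pd.DataFrame(
    {"vendorname": ["acme supply", "apex tools", float("nan"), "acme supplies"]},
    dtype=object,
)

# Without any cleanup, difflib calls len() on the NaN and fails.
try:
    difflib.get_close_matches("acme", df.vendorname)
    raised = False
except TypeError:
    raised = True

def find_it(w, names):
    # Drop missing values, then cast the rest to str so each one has a length.
    candidates = names.dropna().astype(str)
    return difflib.get_close_matches(w.lower(), candidates, n=50, cutoff=0.6)

matches = find_it("Acme Supply", df.vendorname)
print(raised, matches)
```

Note that astype(str) on its own turns a missing cell into the literal string "nan", which then becomes a candidate like any other. The sketch drops missing values first to avoid that; whether you want this depends on your data.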

As @johnchase noted in a comment:

The question also mentions the return of an empty list. get_close_matches returns a list of matches; if no item matches within the cutoff, an empty list is returned. – johnchase
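
One possible cause of an empty list, which is my assumption and not something the question confirms, is letter case: the code lowercases the input but not the column, and difflib compares characters exactly. A rough sketch with made-up vendor names, where the lowered dictionary is a hypothetical helper that maps lowercased names back to the originals:

```python
import difflib

names = ["ACME SUPPLY", "Apex Tools"]  # hypothetical vendor names

# Lowercase input against uppercase names: almost no characters match,
# so nothing reaches the cutoff and the result is empty.
strict = difflib.get_close_matches("acme supply", names, cutoff=0.6)

# Compare in lowercase, then map each hit back to its original spelling.
lowered = {name.lower(): name for name in names}
hits = difflib.get_close_matches("acme supply", list(lowered), cutoff=0.6)
result = [lowered[h] for h in hits]

print(strict, result)
```

If two names differ only by case, the dictionary keeps just one of them, so this is only suitable when that is acceptable. Lowering the cutoff is another option when legitimate matches score below 0.6.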

I skipped the astype(str) in:

return difflib.get_close_matches(w, df.vendorname.astype(str), n=50, cutoff=.6)

and instead passed dtype='string' to read_csv:

df = pd.read_csv("Vendorlist.csv", encoding="ISO-8859-1", dtype='string')
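
A small sketch of the read_csv approach, assuming a pandas version that supports the 'string' dtype (1.0 or later). It uses an in-memory CSV with invented rows in place of Vendorlist.csv. As far as I can tell, empty cells still come through as missing values and not as strings, so the sketch drops them before matching.

```python
import io
import difflib
import pandas as pd

# Hypothetical CSV contents; the second row has an empty vendorname.
csv_text = "vendorname,city\nacme supply,Austin\n,Boston\n12345,Denver\n"

# Read vendorname as text so numeric-looking names stay strings.
df = pd.read_csv(io.StringIO(csv_text), dtype={"vendorname": "string"})

# Drop missing values and hand difflib a plain list of str.
candidates = df["vendorname"].dropna().tolist()
matches = difflib.get_close_matches("acme suply", candidates, n=50, cutoff=0.6)

print(candidates, matches)
```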