Count matching values in a DataFrame based on a list
I have a DataFrame with some item titles, for example:
ratings_dict = {
"TYPE": ["Testing","Headphone","Iphone","AC","Laptop","Monitor"],
}
df = pd.DataFrame(ratings_dict)
I want to count, for each value, how many entries in a given list match it:
Search_history=['test','phone','lap','testing','tes','iphone','Headphone','head','Monitor','ac']
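Expected output:
Note: in this case the word "phone" matches 2 values in the DataFrame, "Headphone" and "Iphone", so the count for both is incremented.
Any suggestions or code snippets would be helpful.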
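It is up to you to define which matching condition makes sense; your question is a bit too general. You can check whether the values match exactly, or you can convert the list values to a common form before checking.
You need to convert everything to lower case, then count how many times a TYPE is a substring of a search history item, or vice versa.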
import pandas as pd
ratings_dict = {
"TYPE": ["Testing","Headphone","Iphone","AC","Laptop","Monitor"],
}
df = pd.DataFrame(ratings_dict)
Search_history=['test','phone','lap','testing','tes','iphone','Headphone','head','Monitor','ac']
# convert everything to lower case
Search_history = [x.lower() for x in Search_history]
df['TYPE'] = df['TYPE'].str.lower()
# for each TYPE, count the search history items that contain it
# or are contained in it (substring match in either direction)
df['count'] = [sum(x in y or y in x for y in Search_history) for x in df['TYPE']]
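If you would rather keep the original casing in the TYPE column, the same two-way substring check can be wrapped in a small helper. This is only a sketch of the approach above: the name `count_matches` is made up for illustration, and it assumes every value is a non-empty string (an empty string is a substring of everything and would match every row).

```python
import pandas as pd


def count_matches(types, history):
    """Hypothetical helper: for each type, count the history items that
    contain it or are contained in it, ignoring case."""
    history_lower = [h.lower() for h in history]
    counts = []
    for t in types:
        t_lower = t.lower()
        # substring match in either direction, as in the answer above
        counts.append(sum(t_lower in h or h in t_lower for h in history_lower))
    return counts


df = pd.DataFrame(
    {"TYPE": ["Testing", "Headphone", "Iphone", "AC", "Laptop", "Monitor"]}
)
search_history = ["test", "phone", "lap", "testing", "tes", "iphone",
                  "Headphone", "head", "Monitor", "ac"]

# TYPE keeps its original casing; only the comparison is lower-cased
df["count"] = count_matches(df["TYPE"], search_history)
print(df)
```

With the sample data this gives counts of 3, 3, 2, 1, 1, 1 for Testing, Headphone, Iphone, AC, Laptop and Monitor. Because the match runs in both directions, "Testing" is counted for "test", "tes" and "testing"; drop one side of the `or` if you only want one direction.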