Compare a list of data with a CSV file and sort the matches

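I have a dataset of product names and a list of brands. I need to find out how many products in my dataset belong to each brand in the list.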

**Brands sample:** ['HM International', 'Sara', 'Wildcraft', 'Nike']
**Product name sample:** [Attache backpack11Green Waterproof Backpack
Simba BTSPOKEMON POKÈMON POKÈ BALLS 18 BP Waterproof S...
HM International HMHTPB 24304MK Waterproof Multipurpos...
Chris & Kate CKB_122SS Waterproof School Bag
Simba BTSPRINCESS FOLLOW YOUR DREAMS 16 BP Waterproof ...
Kuber Industries School Bag, Backpack Waterproof School...
Minnie Trio School Bag Waterproof School Bag
Thomas School Bag Waterproof School Bag
Sara Green 002 Shoulder Bag
Disney Frozen Anna & Elsa Pink Sequins 16' ' Backpack
Disney Princess Pink Flap 18' ' Backpack
My Baby Excel Peppa Side Sling Bag Sling Bag
Ranger Black School Bag with laptop compartment Waterpr...
HM International HMHTPB 73279AV Waterproof Multipurpos...
Peppa Peppa Pig Pink Plush Toy Wallet Round Shape Plush...
Disney Frozen Anna & Elsa Pink Sequins 14' ' Backpack
Disney Frozen Magic Blue 16' ' School Bag
Good Friends stylish Waterproof School Bag
ZEVORA Pink 3D Design Children Travel & School Bag, 1 L...
Gleam A103 School Bag
SARA BAGS TG15 Waterproof Backpack
Despicable Me Favourite Subject School Bag 16 inches Tr...
AARIP LTB037 Waterproof School Bag
Simba BTSSMURFS FOOTBALL 18 BP Waterproof School Bag
Gleam JB0402C Waterproof School Bag
Simba BTSSMURFS SMURFETTE SINGING STAR 18 BP Waterproo... ]

I suggest using str.findall with a word-boundary regex to search for multiple values at once, then flattening the nested lists and counting with Counter:

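import re
from collections import Counter

Brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']
# Group the alternatives so \b applies to every brand, and escape regex metacharacters
pat = r'\b(?:{})\b'.format('|'.join(map(re.escape, Brands)))

# findall returns one list of matches per row; flatten the lists and count
d = Counter([y for x in df['Product'].str.findall(pat) for y in x])
print(d)

Counter({'HM International': 2, 'Sara': 1})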
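One detail worth checking in a pattern like this is how `\b` interacts with `|`. Alternation has the lowest precedence, so without a group the leading `\b` belongs only to the first brand and the trailing `\b` only to the last one. The sketch below uses only the standard `re` module; the `text` string is made up for illustration and is not from the dataset.

```python
import re

brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']
alternation = '|'.join(re.escape(b) for b in brands)

# Without a group, \b binds only to the first and last alternatives.
ungrouped = r'\b{}\b'.format(alternation)
# With a non-capturing group, \b applies to every alternative.
grouped = r'\b(?:{})\b'.format(alternation)

# Hypothetical product text, invented for this illustration.
text = 'Saracen Nikes Wildcraft Duffel'

print(re.findall(ungrouped, text))  # 'Sara' is found inside 'Saracen'
print(re.findall(grouped, text))    # only the whole word 'Wildcraft' is found
```

A non-capturing group `(?:...)` is used rather than `(...)` so that `findall` still returns the full match.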

Or, if you want a Series as output, use Series.value_counts:

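import numpy as np
import pandas as pd

# Concatenate the per-row match lists into one array, then count each brand
s = pd.Series(np.concatenate(df['Product'].str.findall(pat))).value_counts()
print(s)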
HM International    2
Sara                1
dtype: int64
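If brands with no matches should appear with a count of 0, one possible variant is to flatten with `Series.explode` and then `reindex` on the brand list. This is a sketch, assuming a pandas version that provides `Series.explode` (0.25 or later); the small `df` here is a stand-in built from a few rows of the question's sample.

```python
import re
import pandas as pd

brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']
pat = r'\b(?:{})\b'.format('|'.join(map(re.escape, brands)))

# Stand-in frame with a few rows copied from the question's sample.
df = pd.DataFrame({'Product': [
    'HM International HMHTPB 24304MK Waterproof Multipurpos...',
    'Sara Green 002 Shoulder Bag',
    'Gleam A103 School Bag',
    'HM International HMHTPB 73279AV Waterproof Multipurpos...',
]})

matches = df['Product'].str.findall(pat)   # one list of matches per row
# explode turns empty lists into NaN, which value_counts drops by default
counts = matches.explode().value_counts()
# Add the brands that never matched, with a count of 0
counts = counts.reindex(brands, fill_value=0)
print(counts)
```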

Setup:

import pandas as pd

d = {'Product': ['Attache backpack11Green Waterproof Backpack', 'Simba BTSPOKEMON POKÈMON POKÈ BALLS 18 BP Waterproof S...', 'HM International HMHTPB 24304MK Waterproof Multipurpos...', 'Chris & Kate CKB_122SS Waterproof School Bag', 'Simba BTSPRINCESS FOLLOW YOUR DREAMS 16 BP Waterproof ...', 'Kuber Industries School Bag, Backpack Waterproof School...', 'Minnie Trio School Bag Waterproof School Bag', 'Thomas School Bag Waterproof School Bag', 'Sara Green 002 Shoulder Bag', "Disney Frozen Anna & Elsa Pink Sequins 16' ' Backpack", "Disney Princess Pink Flap 18' ' Backpack", 'My Baby Excel Peppa Side Sling Bag Sling Bag', 'Ranger Black School Bag with laptop compartment Waterpr...', 'HM International HMHTPB 73279AV Waterproof Multipurpos...', 'Peppa Peppa Pig Pink Plush Toy Wallet Round Shape Plush...', "Disney Frozen Anna & Elsa Pink Sequins 14' ' Backpack", "Disney Frozen Magic Blue 16' ' School Bag", 'Good Friends stylish Waterproof School Bag', 'ZEVORA Pink 3D Design Children Travel & School Bag, 1 L...', 'Gleam A103 School Bag', 'SARA BAGS TG15 Waterproof Backpack', 'Despicable Me Favourite Subject School Bag 16 inches Tr...', 'AARIP LTB037 Waterproof School Bag', 'Simba BTSSMURFS FOOTBALL 18 BP Waterproof School Bag', 'Gleam JB0402C Waterproof School Bag', 'Simba BTSSMURFS SMURFETTE SINGING STAR 18 BP Waterproo']}
df = pd.DataFrame(d)
print(df.head())
                                             Product
0        Attache backpack11Green Waterproof Backpack
1  Simba BTSPOKEMON POKÈMON POKÈ BALLS 18 BP Wate...
2  HM International HMHTPB 24304MK Waterproof Mul...
3       Chris & Kate CKB_122SS Waterproof School Bag
4  Simba BTSPRINCESS FOLLOW YOUR DREAMS 16 BP Wat...
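
The sample data also contains 'SARA BAGS TG15 Waterproof Backpack', which a case-sensitive pattern does not count as 'Sara'. If such rows should count toward the brand (an assumption about the data, not something the question states), a case-insensitive match that maps each hit back to the brand's canonical spelling is one option. The sketch below uses only the standard library; `canonical` and `products` are names introduced here for illustration.

```python
import re
from collections import Counter

brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']
pat = re.compile(
    r'\b(?:{})\b'.format('|'.join(map(re.escape, brands))),
    flags=re.IGNORECASE,
)
# Map lower-cased matches back to the spelling used in the brand list.
canonical = {b.lower(): b for b in brands}

# A few rows copied from the question's sample.
products = [
    'HM International HMHTPB 24304MK Waterproof Multipurpos...',
    'Sara Green 002 Shoulder Bag',
    'SARA BAGS TG15 Waterproof Backpack',
    'Gleam A103 School Bag',
]

counts = Counter(canonical[m.lower()] for p in products for m in pat.findall(p))
print(counts)
```

With pandas, the same idea should carry over by passing `flags=re.IGNORECASE` to `str.findall`.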