Pandas value lookup but with duplicate values
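I have a list of lists containing item prices, and the order of the elements matters. I also have a DataFrame containing the items in those lists and their associated prices. I am trying to iterate over each list and essentially replace the price elements in the list of lists with the corresponding items. The problem I'm running into is that two items share the same price. Currently my code just adds both of the identically priced items to the same list, but I would like it to create a separate list for each of the two items.
Current code: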
import pandas as pd

data = {'Item': ['Apples', 'Cereal', 'Corn', 'Pasta', 'Detergent', 'Coffee', 'Ketchup', 'Oats', 'Olive Oil'],
        'Price': [4, 2, 6, 5, 10, 9, 2, 3, 1]}
df = pd.DataFrame(data)
combos = [[4, 2, 3, 6], [2, 10, 2, 4], [6, 1, 10, 2]]
testing = []
# The loop variable is named `combo` rather than `list` so it does not shadow the built-in.
for combo in combos:
    output = df.set_index('Price').loc[combo, 'Item'].to_numpy().tolist()
    testing.append(output)
print(testing)
Output:
[['Apples', 'Cereal', 'Ketchup', 'Oats', 'Corn'], ['Cereal', 'Ketchup', 'Detergent', 'Cereal', 'Ketchup', 'Apples'], ['Corn', 'Olive Oil', 'Detergent', 'Cereal', 'Ketchup']]
Desired result:
[['Apples', 'Cereal', 'Oats', 'Corn'], ['Cereal', 'Detergent', 'Cereal', 'Apples'], ['Cereal', 'Detergent', 'Ketchup', 'Apples'], ['Ketchup', 'Detergent', 'Cereal', 'Apples'], ['Ketchup', 'Detergent', 'Ketchup', 'Apples'], ['Corn', 'Olive Oil', 'Detergent', 'Cereal'], ['Corn', 'Olive Oil', 'Detergent', 'Ketchup']]
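The extra items in the current output appear to come from how `.loc` behaves on a non-unique index: a label that occurs more than once returns every matching row, so a single price of 2 expands into two items within the same list. A minimal sketch of that behavior, using a reduced three-row table (an assumption made for brevity, not the full data from the question):

```python
import pandas as pd

# Reduced version of the question's data: two items share the price 2.
df = pd.DataFrame({"Item": ["Apples", "Cereal", "Ketchup"],
                   "Price": [4, 2, 2]})

by_price = df.set_index("Price")["Item"]

# The index contains a repeated label, so it is not unique.
print(by_price.index.is_unique)

# Two prices are requested, but three items come back,
# because label 2 matches both 'Cereal' and 'Ketchup'.
looked_up = by_price.loc[[4, 2]].tolist()
print(looked_up)
```

This is why a plain lookup cannot produce one list per alternative: the choice between equally priced items has to be enumerated explicitly.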
One way, using itertools.product and chain:
from itertools import product, chain
prices = df.groupby("Price")["Item"].apply(list)
list(chain.from_iterable(product(*prices.loc[c]) for c in combos))
Output:
[('Apples', 'Cereal', 'Oats', 'Corn'),
('Apples', 'Ketchup', 'Oats', 'Corn'),
('Cereal', 'Detergent', 'Cereal', 'Apples'),
('Cereal', 'Detergent', 'Ketchup', 'Apples'),
('Ketchup', 'Detergent', 'Cereal', 'Apples'),
('Ketchup', 'Detergent', 'Ketchup', 'Apples'),
('Corn', 'Olive Oil', 'Detergent', 'Cereal'),
('Corn', 'Olive Oil', 'Detergent', 'Ketchup')]
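To make the mechanism more explicit, here is a rough standard-library sketch of the same idea: group the item names by price, then take the Cartesian product of the candidate lists for each combo. The helper name `expand_combos` and the plain-dict grouping are assumptions for illustration, not part of the answer above. Unlike the tuples shown in the output, this variant returns lists, which matches the format asked for in the question.

```python
from itertools import product

# Same sample data as in the question, as plain Python lists.
items = ['Apples', 'Cereal', 'Corn', 'Pasta', 'Detergent',
         'Coffee', 'Ketchup', 'Oats', 'Olive Oil']
item_prices = [4, 2, 6, 5, 10, 9, 2, 3, 1]
combos = [[4, 2, 3, 6], [2, 10, 2, 4], [6, 1, 10, 2]]

# Group item names by price, keeping the original row order
# (roughly what groupby("Price")["Item"].apply(list) produces).
items_by_price = {}
for item, price in zip(items, item_prices):
    items_by_price.setdefault(price, []).append(item)


def expand_combos(combos, items_by_price):
    """Hypothetical helper: one list per way of picking an item for each price."""
    expanded = []
    for combo in combos:
        # One list of candidate items per position in the combo;
        # a price with no items raises KeyError here.
        choices = [items_by_price[price] for price in combo]
        # product() yields tuples; convert each one to a list.
        expanded.extend(list(row) for row in product(*choices))
    return expanded


result = expand_combos(combos, items_by_price)
for row in result:
    print(row)
```

Each combo contributes as many rows as the product of its group sizes, so the first and third combos yield two rows each and the second, which contains the shared price 2 twice, yields four.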
You can also use pd.MultiIndex.from_product to generate the Cartesian product:
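prices = df.groupby("Price")["Item"].apply(list)
out = []
for combo in combos:
    # Named `rows` rather than `product` so it does not shadow itertools.product.
    rows = pd.MultiIndex.from_product(prices.loc[combo]).tolist()
    # from_product yields tuples; convert each one to a list.
    out.extend(map(list, rows))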
Output:
[['Apples', 'Cereal', 'Oats', 'Corn'],
['Apples', 'Ketchup', 'Oats', 'Corn'],
['Cereal', 'Detergent', 'Cereal', 'Apples'],
['Cereal', 'Detergent', 'Ketchup', 'Apples'],
['Ketchup', 'Detergent', 'Cereal', 'Apples'],
['Ketchup', 'Detergent', 'Ketchup', 'Apples'],
['Corn', 'Olive Oil', 'Detergent', 'Cereal'],
['Corn', 'Olive Oil', 'Detergent', 'Ketchup']]