Create a product bundle by recursively matching user input to product features

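I am working on a product bundle creation and recommendation project. Bundling and recommendation must happen in real time, based on user input.

The conditions are: 1. The product bundle should cover as much of the user input as possible. 2. Repetition of user input across the recommended items should be kept low.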

user_input=['a','b','c']

d1=['a','c','d']
d2=['a','b','e','f']
d3=['a','c','b','f']
d4=['b']
d5=['g','e','a']
d6=['g']

Expected output: d1+d4, d3, d1+d2, d5+d4+d1

Below is my code. It produces results, but it shows duplicates and does not show all of the combinations. Any help is appreciated.

dlist=[d1,d2,d3,d4,d5,d6]
diff_match=[]
# find match n diff of each product based on user input
for i in range(len(dlist)):
    match=set(user_input).intersection(set(dlist[i]))
    #print("match is",match)
    diff=set(user_input).difference(set(dlist[i]))
    #print("diff is",diff)
    temp={'match':match,'diff':diff}
    diff_match.append(temp)
    
for i in range(len(diff_match)):
    # if match is found, recommend the product alone
    diff_match_temp=diff_match[i]['match']
    print("diff match temp is",diff_match_temp)
    if diff_match_temp==user_input:
        print ("absolute match")
    # scenario where the user input is a subset of the product features, separate from a partial match
    elif (all(x in list(diff_match_temp) for x in list(user_input))):
        print("User input subset of product features")
        print("The parent list is",diff_match[i]['match'])
        print("the product is", dlist[i])
    else:
        '''else check if the difference between user input and the current product is fulfilled by other product, 
        if yes, these products are bundled together'''
        for j in range(len(diff_match)):
            temp_diff=diff_match[i]['diff']
            print("temp_diff is",temp_diff)
            # empty set should be explicitly checked to avoid wrong match
            if (temp_diff.intersection(diff_match[j]['match'])==temp_diff and len(temp_diff)!=0 and list(temp_diff) != user_input) :
            #if temp_diff==diff_match[j]['match'] and len(temp_diff)!=0 and list(temp_diff) != user_input :
                print("match found with another product")
                print("parent is",dlist[i])
                print("the combination is",dlist[j] )

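To do this recursively, you can create a function that yields the products with the largest component coverage first and recurses to complete each bundle with the remaining components: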

def getBundles(C,P):
    cSet     = set(C) # use set operations,  largest to smallest coverage
    coverage = sorted(P.items(),key=lambda pc:-len(cSet.intersection(pc[1])))
    for i,(p,cs) in enumerate(coverage):
        if cSet.isdisjoint(cs):continue          # no coverage
        if cSet.issubset(cs): yield [p];continue # complete (stop recursion)
        remaining = cSet.difference(cs)   # remaining components to bundle
        unused    = dict(coverage[i+1:])  # products not already bundled
        yield from ([p]+rest for rest in getBundles(remaining,unused))

Output:

prods = {"d1":['a','c','d'],
         "d2":['a','b','e','f'],
         "d3":['a','c','b','f'],
         "d4":['b'],
         "d5":['g','e','a'],
         "d6":['g']}

user_input=['a','b','c']

for bundle in getBundles(user_input,prods):
    print(bundle)

['d3']
['d1', 'd2']
['d1', 'd4']

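Note that redundant combinations such as ['d5', 'd1', 'd2'] are excluded: d1+d2 already covers everything that d5 covers, so d5 is superfluous. Permutations of the same products are also excluded (e.g. d1+d2 is the same as d2+d1).

[EDIT]

If you need to offer redundant combinations (perhaps as extended options to choose from), you can write a slightly different recursive function that does not exclude them. You should also sort the results from closest fit to least close when presenting them: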
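
To make the idea of a redundant product concrete, here is a minimal sketch of a check for it. The helper name `is_minimal` is hypothetical and not part of the function above; it assumes a bundle counts as redundant when dropping any single product still leaves every requested component covered.

```python
prods = {"d1": ['a', 'c', 'd'],
         "d2": ['a', 'b', 'e', 'f'],
         "d3": ['a', 'c', 'b', 'f'],
         "d4": ['b'],
         "d5": ['g', 'e', 'a'],
         "d6": ['g']}
user_input = ['a', 'b', 'c']

def is_minimal(bundle, products, wanted):
    """Return True if the bundle covers `wanted` and no product can be dropped.

    Hypothetical helper: `products` maps a product name to its feature list.
    """
    wanted = set(wanted)

    def covered(names):
        # Union of the features of the given products (empty set for no products)
        return set().union(*(products[n] for n in names))

    if not wanted <= covered(bundle):
        return False  # the bundle does not cover the request at all
    # Redundant if removing any one product still leaves full coverage
    return all(not wanted <= covered([n for n in bundle if n != drop])
               for drop in bundle)

print(is_minimal(['d1', 'd2'], prods, user_input))        # True
print(is_minimal(['d5', 'd1', 'd2'], prods, user_input))  # False
```

A check like this could also be used to filter the output of a function that returns every covering combination, if only the non-redundant ones are wanted.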

def getBundles(C,P,remain=None,bundle=None):
    cSet  = set(C)                               # use set operations
    if remain is None: remain,bundle = cSet,[]   # track coverage & bundle
    prods = list(P.items())                      # remaining products
    for i,(p,cs) in enumerate(prods):
        if cSet.isdisjoint(cs):continue         # no coverage
        newBundle = bundle+[p]                  # add product to bundle
        if remain.issubset(cs): yield newBundle # full coverage bundle 
        toCover = remain.difference(cs)         # not yet covered 
        unused  = dict(prods[i+1:])             # remaining products
        yield from getBundles(C,unused,toCover,newBundle) # recurse for rest

Output:

prods = {"d1":['a','c','d'],
         "d2":['a','b','e','f'],
         "d3":['a','c','b','f'],
         "d4":['b'],
         "d5":['g','e','a'],
         "d6":['g']}    
user_input=['a','b','c']

for bundle in sorted(getBundles(user_input,prods),
                     key=lambda b:sum(map(len,(prods[p] for p in b)))):
    print(bundle)

['d1', 'd4']
['d3']
['d3', 'd4']
['d1', 'd2']
['d1', 'd3']
['d1', 'd4', 'd5']
['d3', 'd5']
['d1', 'd2', 'd4']
['d1', 'd3', 'd4']
['d2', 'd3']
['d3', 'd4', 'd5']
['d2', 'd3', 'd4']
['d1', 'd2', 'd5']
['d1', 'd3', 'd5']
['d1', 'd2', 'd3']
['d1', 'd2', 'd4', 'd5']
['d1', 'd3', 'd4', 'd5']
['d2', 'd3', 'd5']
['d1', 'd2', 'd3', 'd4']
['d2', 'd3', 'd4', 'd5']
['d1', 'd2', 'd3', 'd5']
['d1', 'd2', 'd3', 'd4', 'd5']
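
The sort key above uses the total number of features as a proxy for closeness. As an alternative sketch tied to the second condition in the question (less repetition of user input), one could rank bundles by how often requested components are covered more than once. The function name `repetition` and the tie-break on bundle size are assumptions for illustration, and each product's feature list is assumed to contain no duplicates.

```python
from collections import Counter

prods = {"d1": ['a', 'c', 'd'],
         "d2": ['a', 'b', 'e', 'f'],
         "d3": ['a', 'c', 'b', 'f'],
         "d4": ['b'],
         "d5": ['g', 'e', 'a'],
         "d6": ['g']}
user_input = ['a', 'b', 'c']

def repetition(bundle, products, wanted):
    """Count how many times requested components are covered beyond the first time."""
    wanted = set(wanted)
    counts = Counter(f for name in bundle
                       for f in products[name] if f in wanted)
    return sum(n - 1 for n in counts.values())

# A few bundles taken from the output above
candidates = [['d1', 'd2'], ['d3'], ['d1', 'd4'], ['d1', 'd4', 'd5']]

# Least repetition first, then fewer products
ranked = sorted(candidates,
                key=lambda b: (repetition(b, prods, user_input), len(b)))
for bundle in ranked:
    print(bundle, repetition(bundle, prods, user_input))
```

With this key, ['d3'] and ['d1', 'd4'] come first because neither repeats a requested component, while ['d1', 'd2'] and ['d1', 'd4', 'd5'] each cover 'a' twice.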