Match people into groups according to food preference
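I am looking for an algorithm that helps me distribute people into 3 groups (a, b, c). The people in a group should fit together, meaning that their food preferences should match in such a way that they can all agree on the same kind of food. Each cluster (subgroup) within a group consists of 6 people.
Let's say there are 4 kinds of food preference:
- the person prefers to eat meat
- the person prefers to eat vegetarian
- the person prefers to eat vegan
- the person has no food preference and basically eats anything
I want to distribute the people into 3 logical groups:
- group a: meat and no_food_preference
- group b: vegetarian, vegan and no_food_preference
- group c: vegetarian and no_food_preference
I use the people with no_food_preference to fill up the clusters, so that every cluster contains 6 people.
After distributing all people into groups, each group will consist of a multiple of 6 people.
My problem: I tried hard, but couldn't find an algorithm that fits. I find it difficult to deal with the fact that the algorithm should work for any number of participants.
Example: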
import pandas as pd

df = pd.DataFrame(
    {
        "user_id": [i for i in range(1, 55)],
        "Master_FoodPreference": [
            "meat", "vegetarian", "meat", "vegan", "meat", "vegetarian", "meat", "vegetarian", "no_food_preference",
            "meat", "no_food_preference", "vegetarian", "meat", "meat",
            "vegetarian", "vegetarian", "vegan", "vegetarian", "vegetarian", "no_food_preference", "vegan",
            "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian",
            "meat", "vegetarian", "meat", "vegetarian", "no_food_preference", "vegetarian", "vegetarian", "vegetarian",
            "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian", "no_food_preference",
            "no_food_preference", "no_food_preference", "meat", "no_food_preference", "meat", "meat",
            "vegan", "no_food_preference", "no_food_preference", "vegan", "no_food_preference", "vegan", "vegan",
        ],
    }
)
>>> df.head()
   user_id  Master_FoodPreference
0        1                   meat
1        2             vegetarian
2        3                   meat
3        4                  vegan
4        5                   meat
How would you distribute these people into group_a, group_b and group_c?
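Edit - group composition:
Each group (a, b, c) gets a specific label:
- group a: people will cook with meat
- group b: people will cook a vegan meal
- group c: people will cook a vegetarian meal
That means we should try to get most of the vegetarians into group_c. If group_c is complete, we put them into group_b. Note: we cannot put vegans into group_c, because vegans do not eat vegetarian food.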
Create 4 dataframes: 3 for your groups (dfA, dfB, dfC) and 1 for the people without a food preference (dfX). Then, if needed, fill up each of the groups A, B and C with people from X:
# People without a preference, used as fillers
dfX = df[df['Master_FoodPreference'].eq('no_food_preference')]

# -len(dfA) % 6 is the number of people missing to reach a multiple of 6.
# pd.concat is used because DataFrame.append was removed in pandas 2.0.
# The three samples are drawn independently, so the same filler can be picked twice.
dfA = df[df['Master_FoodPreference'].eq('meat')]
dfA = pd.concat([dfA, dfX.sample(-len(dfA) % 6)])

dfB = df[df['Master_FoodPreference'].eq('vegan')
         | df['Master_FoodPreference'].eq('vegetarian')]
dfB = pd.concat([dfB, dfX.sample(-len(dfB) % 6)])

dfC = df[df['Master_FoodPreference'].eq('vegetarian')]
dfC = pd.concat([dfC, dfX.sample(-len(dfC) % 6)])
Output:
>>> dfB
    user_id  Master_FoodPreference
1         2             vegetarian
3         4                  vegan
5         6             vegetarian
7         8             vegetarian
11       12             vegetarian
14       15             vegetarian
15       16             vegetarian
16       17                  vegan
17       18             vegetarian
18       19             vegetarian
20       21                  vegan
21       22             vegetarian
22       23             vegetarian
23       24             vegetarian
24       25             vegetarian
25       26             vegetarian
26       27             vegetarian
28       29             vegetarian
30       31             vegetarian
32       33             vegetarian
33       34             vegetarian
34       35             vegetarian
35       36             vegetarian
36       37             vegetarian
37       38             vegetarian
38       39             vegetarian
39       40             vegetarian
47       48                  vegan
50       51                  vegan
52       53                  vegan
53       54                  vegan
These 31 rows are followed by 5 rows drawn at random from dfX (all no_food_preference), which differ from run to run.
"After distributing all people into groups, each group will consist of a multiple of 6 people."
This is possible for your sample:
# Group sizes before filling up
>>> len(dfA), len(dfB), len(dfC), len(dfX)
(12, 31, 24, 11)
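As a quick check of these numbers, here is a minimal sketch of how many no_food_preference people each group needs to reach the next multiple of 6. The helper name `fill_needed` is made up for illustration and is not part of the answer above.

```python
def fill_needed(size, cluster=6):
    """Number of extra people needed to reach the next multiple of `cluster`."""
    return -size % cluster


# Group sizes taken from the tuple above; dfX holds 11 possible fillers.
sizes = {"dfA": 12, "dfB": 31, "dfC": 24}
needed = {name: fill_needed(n) for name, n in sizes.items()}

print(needed)                      # {'dfA': 0, 'dfB': 5, 'dfC': 0}
print(sum(needed.values()) <= 11)  # True: there are enough fillers
```

Keep in mind that dfB and dfC both contain the vegetarians, so this only checks the group sizes, not whether every person ends up in exactly one group.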
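That doesn't seem too hard: put the items into groups, then fill the other groups up to a multiple of 6 with items from "no_food_preference". If some items are still left in "no_food_preference", move them to another group: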
pref = ["meat", "vegetarian", "meat", "vegan", "meat", "vegetarian", "meat", "vegetarian", "no_food_preference",
        "meat", "no_food_preference", "vegetarian", "meat", "meat",
        "vegetarian", "vegetarian", "vegan", "vegetarian", "vegetarian", "no_food_preference", "vegan",
        "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian",
        "meat", "vegetarian", "meat", "vegetarian", "no_food_preference", "vegetarian", "vegetarian", "vegetarian",
        "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian", "no_food_preference",
        "no_food_preference", "no_food_preference", "meat", "no_food_preference", "meat", "meat",
        "vegan", "no_food_preference", "no_food_preference", "vegan", "no_food_preference", "vegan", "vegan"]

def assign_groups(pref):
    # Collect the list positions of the people per preference
    groups = {}
    for i, p in enumerate(pref):
        groups.setdefault(p, []).append(i)
    # Fill each group up to a multiple of 6 with no-preference people
    for p in ['meat', 'vegetarian', 'vegan']:
        need = len(groups[p]) % 6
        if need:
            for i in range(6 - need):
                groups[p].append(groups["no_food_preference"].pop())
    # Whoever is left without a preference joins the meat group
    if len(groups["no_food_preference"]):
        groups["meat"] += groups["no_food_preference"]
    del groups["no_food_preference"]
    return groups

assign_groups(pref)
{'meat': [0, 2, 4, 6, 9, 12, 13, 27, 29, 43, 45, 46, 8, 10, 19, 31, 40, 41], 'vegetarian': [1, 5, 7, 11, 14, 15, 17, 18, 21, 22, 23, 24, 25, 26, 28, 30, 32, 33, 34, 35, 36, 37, 38, 39], 'vegan': [3, 16, 20, 47, 50, 52, 53, 51, 49, 48, 44, 42]}
Of course, this only works if the total number of items is a multiple of 6.
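A rough way to test this up front is sketched below. It checks whether the no-preference people alone can bring every group to a multiple of the cluster size. The function name `is_fillable` is hypothetical, and the check only mirrors the simple version above, where no-preference people are the only fillers.

```python
from collections import Counter


def is_fillable(pref, cluster=6):
    """True if no-preference people alone can complete every group."""
    counts = Counter(pref)
    spare = counts.pop("no_food_preference", 0)
    # People missing per group to reach the next multiple of `cluster`
    deficit = sum(-n % cluster for n in counts.values())
    # Enough fillers, and the leftover fillers form whole clusters themselves
    return deficit <= spare and (spare - deficit) % cluster == 0


# Same counts as the sample data: 12 meat, 24 vegetarian, 7 vegan, 11 no preference
sample = (["meat"] * 12 + ["vegetarian"] * 24 + ["vegan"] * 7
          + ["no_food_preference"] * 11)
print(is_fillable(sample))  # True
```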
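Edit
I updated the code to be closer to the original request and to handle some special cases. A few observations:
- The total number of people has to be a multiple of 6 (or of whatever value we choose as the "cluster" size).
- If we want to be sure that all cases can be handled, we need to assume that meat eaters can also eat vegetarian food, i.e. they can be used to fill up vegetarian or even vegan clusters. Otherwise some cases cannot be solved: for example, with a cluster size of 6 there is no solution for 7 x meat, 7 x vegetarian, 7 x vegan, 3 x no-pref.
- So we first process the vegan group, filling it with no-preference people, then with vegetarians if needed, then with meat eaters if still needed. Next we process the remaining vegetarians, adding no-preference people and then meat eaters to their group. Then comes the meat group, which can only be filled with no-preference people. Finally, if some no-preference people are still left, we add them to one group (meat).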
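To see how the 7/7/7/3 case works out once meat eaters and vegetarians may act as fillers, here is a small sketch that applies the fill order described above to head counts only. The function name `fill_counts` is made up for illustration, and it tracks numbers rather than individual people.

```python
def fill_counts(counts, cluster=6):
    """Apply the fill order to head counts and return the final group sizes."""
    counts = dict(counts)  # work on a copy
    fillers = {
        "vegan": ["no_food_preference", "vegetarian", "meat"],
        "vegetarian": ["no_food_preference", "meat"],
        "meat": ["no_food_preference"],
    }
    for group in ("vegan", "vegetarian", "meat"):
        need = -counts[group] % cluster
        for source in fillers[group]:
            moved = min(need, counts[source])
            counts[source] -= moved
            counts[group] += moved
            need -= moved
        if need:
            raise ValueError(f"cannot complete group {group!r}")
    # Leftover no-preference people join the meat group
    counts["meat"] += counts.pop("no_food_preference")
    return counts


print(fill_counts({"meat": 7, "vegetarian": 7, "vegan": 7, "no_food_preference": 3}))
# {'meat': 6, 'vegetarian': 6, 'vegan': 12}
```

The vegan group takes the 3 no-preference people and 2 vegetarians, the 5 remaining vegetarians take 1 meat eater, and the 6 remaining meat eaters already form a full cluster.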
The modified code looks like this (I added a helper function that moves people from one group to another):
pref = ["meat", "vegetarian", "meat", "vegan", "meat", "vegetarian", "meat", "vegetarian", "no_food_preference", "meat",
        "no_food_preference", "vegetarian", "meat", "meat", "vegetarian", "vegetarian", "vegan", "vegetarian", "vegetarian",
        "no_food_preference", "vegan", "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian",
        "meat", "vegetarian", "meat", "vegetarian", "no_food_preference", "vegetarian", "vegetarian", "vegetarian",
        "vegetarian", "vegetarian", "vegetarian", "vegetarian", "vegetarian", "no_food_preference",
        "no_food_preference", "no_food_preference", "meat", "no_food_preference", "meat", "meat", "vegan",
        "no_food_preference", "no_food_preference", "vegan", "no_food_preference", "vegan", "vegan"]

def from_to(groups, p, f, n):
    # Move n people from group f to group p
    for i in range(n):
        groups[p].append(groups[f].pop())

def assign_groups(pref, pergroup):
    # groups is a local dict now; declaring the parameter pref as global is a SyntaxError
    groups = {'meat': [], 'vegetarian': [], 'vegan': [], 'no_food_preference': []}
    # Groups that may be used to fill up each group, in order of priority
    fillers = {'meat': ['no_food_preference'],
               'vegetarian': ['no_food_preference', 'meat'],
               'vegan': ['no_food_preference', 'vegetarian', 'meat']}
    for i, p in enumerate(pref):
        groups[p].append(i)
    for p in ['vegan', 'vegetarian', 'meat']:
        need = len(groups[p]) % pergroup
        if need:
            fill_idx = 0
            need = pergroup - need
            while need:
                f = fillers[p][fill_idx]
                avail = len(groups[f])
                if need > avail:
                    # Take everybody from this filler group, then try the next one
                    from_to(groups, p, f, avail)
                    need -= avail
                    fill_idx += 1
                else:
                    from_to(groups, p, f, need)
                    need = 0
    # Whoever is left without a preference joins the meat group
    if len(groups["no_food_preference"]):
        from_to(groups, "meat", "no_food_preference", len(groups["no_food_preference"]))
    return groups

assign_groups(pref, 6)
{'meat': [0, 2, 4, 6, 9, 12, 13, 27, 29, 43, 45, 46, 41, 40, 31, 19, 10, 8], 'vegetarian': [1, 5, 7, 11, 14, 15, 17, 18, 21, 22, 23, 24, 25, 26, 28, 30, 32, 33, 34, 35, 36, 37, 38, 39], 'vegan': [3, 16, 20, 47, 50, 52, 53, 51, 49, 48, 44, 42], 'no_food_preference': []}
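The question also asks for clusters of 6 inside each group. One possible way to get there from a result like the one above is to cut every group list into consecutive chunks. This is only a sketch: `split_into_clusters` is a hypothetical helper, and the tiny `groups` dict below is made-up demo data, not the real output.

```python
def split_into_clusters(groups, cluster=6):
    """Cut every non-empty group into consecutive chunks of `cluster` people."""
    return {
        name: [members[i:i + cluster] for i in range(0, len(members), cluster)]
        for name, members in groups.items()
        if members
    }


# Made-up demo data in the same shape as the result of assign_groups
groups = {"meat": list(range(12)), "vegan": list(range(12, 18)), "no_food_preference": []}
clusters = split_into_clusters(groups)

print(clusters["meat"])   # [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
print(clusters["vegan"])  # [[12, 13, 14, 15, 16, 17]]
```

The numbers are positions in pref, so with the DataFrame from the question the matching user_id would be the position plus 1.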