Randomly reassign participants to groups such that participants originally from same group don't end up in same group

Question

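I'm basically trying to do a Monte Carlo-type analysis in which I randomly reassign the participants in my experiment to new groups and then reanalyze the data given the random new groups. So here is what I want to do:

Participants are originally grouped into eight groups of four participants each. I want to randomly reassign each participant to a new group, but I don't want any participant to end up in a new group with another participant from their same original group.

Here is how far I got with this: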

import random
import pandas as pd
import itertools as it

data = list(it.product(range(8),range(4)))
test_df = pd.DataFrame(data=data,columns=['group','partid'])
test_df['new_group'] = None

for idx, row in test_df.iterrows():
    start_group = row['group']
    takens      = test_df.query('group == @start_group')['new_group'].values
    fulls       = test_df.groupby('new_group').count().query('partid >= 4').index.values
    possibles   = [x for x in test_df['group'].unique() if (x not in takens)
                                                      and (x not in fulls)]
    test_df.loc[idx,'new_group'] = random.choice(possibles)

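The basic idea here is that I randomly reassign a participant to a new group, with the constraints that (a) the new group doesn't already contain one of their original group partners, and (b) the new group doesn't already have 4 or more participants reassigned to it.

The problem with this approach is that, many times, by the time we try to reassign the last group, the only remaining group slots are all in the same new group. I could also just re-randomize whenever it fails until it succeeds, but that feels silly. Also, I want to make 100 random reassignments, so that approach could get very slow.

So there must be a smarter way to do this. I also feel like there should be a simpler way of solving this, given how simple the goal feels (but I realize that might be misleading...)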
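For reference, the two constraints can be written as a small check that any proposed reassignment has to pass. This is only a minimal sketch: the function name `is_valid_reassignment` and its parameters are hypothetical, and it assumes every new group should end up with exactly `size` members.

```python
import pandas as pd

def is_valid_reassignment(df, old="group", new="new_group", size=4):
    """Hypothetical helper: True if `new` satisfies both constraints."""
    # (a) no (original group, new group) pair occurs twice, i.e. no two
    #     former group partners share a new group
    no_partners = not df.duplicated(subset=[old, new]).any()
    # (b) assumption: every new group holds exactly `size` participants
    balanced = bool((df[new].value_counts() == size).all())
    return no_partners and balanced
```

Any candidate method, including the ones in the answer, can be run through this check before the reanalysis step.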

Answer 1

Edit: a better solution

After sleeping on it, I found a significantly better solution that is ~ Big O of numGroups.

Sample data

import random
import numpy as np
import pandas as pd
import itertools as it

np.random.seed(0)
numGroups=4
numMembers=4

data = list(it.product(range(numGroups),range(numMembers)))
df = pd.DataFrame(data=data,columns=['group','partid'])

Solution

g = np.repeat(range(numGroups),numMembers).reshape((numGroups,numMembers))
In [95]: g
Out[95]: 
array([[0, 0, 0, 0],
       [1, 1, 1, 1],
       [2, 2, 2, 2],
       [3, 3, 3, 3]])

g = np.random.permutation(g)
In [102]: g
Out[102]: 
array([[2, 2, 2, 2],
       [3, 3, 3, 3],
       [1, 1, 1, 1],
       [0, 0, 0, 0]])

g = np.tile(g,(2,1))
In [104]: g
Out[104]: 
array([[2, 2, 2, 2],
       [3, 3, 3, 3],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [2, 2, 2, 2],
       [3, 3, 3, 3],
       [1, 1, 1, 1],
       [0, 0, 0, 0]])

Notice the diagonals.

array([[2, -, -, -],
       [3, 3, -, -],
       [1, 1, 1, -],
       [0, 0, 0, 0],
       [-, 2, 2, 2],
       [-, -, 3, 3],
       [-, -, -, 1],
       [-, -, -, -]])

Take the diagonals from top to bottom.

newGroups = []
for i in range(numGroups):
    newGroups.append(np.diagonal(g[i:i+numMembers]))

In [106]: newGroups
Out[106]: 
[array([2, 3, 1, 0]),
 array([3, 1, 0, 2]),
 array([1, 0, 2, 3]),
 array([0, 2, 3, 1])]

newGroups = np.ravel(newGroups)
df["newGroups"] = newGroups

In [110]: df
Out[110]: 
    group  partid  newGroups
0       0       0          2
1       0       1          3
2       0       2          1
3       0       3          0
4       1       0          3
5       1       1          1
6       1       2          0
7       1       3          2
8       2       0          1
9       2       1          0
10      2       2          2
11      2       3          3
12      3       0          0
13      3       1          2
14      3       2          3
15      3       3          1
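The tile-and-diagonal steps above amount to giving member j of original group i the label `perm[(i + j) % numGroups]`, where `perm` is the shuffled group order. The sketch below writes that directly. The name `diagonal_reassign` and the use of `np.random.default_rng` are my own choices, not part of the steps above, and the guard reflects the fact that the labels within a row are only distinct when `numMembers <= numGroups`.

```python
import numpy as np

def diagonal_reassign(num_groups, num_members, rng=None):
    """Hypothetical wrapper: row i holds the new groups for original group i."""
    if num_members > num_groups:
        # with more members than groups, two partners must share a new group
        raise ValueError("need num_members <= num_groups")
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(num_groups)      # shuffled group labels
    i = np.arange(num_groups)[:, None]      # original group index (rows)
    j = np.arange(num_members)[None, :]     # member index (columns)
    # same values as np.diagonal(g[i:i+num_members]) on the tiled array
    return perm[(i + j) % num_groups]
```

With the sample data ordered by `group` and then `partid`, `df["newGroups"] = diagonal_reassign(numGroups, numMembers).ravel()` fills the column. One caveat for Monte Carlo use: only the group order is random, so this can produce at most `numGroups!` different assignments and does not sample uniformly from all valid ones.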

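Old solution: brute force

This turned out to be a lot harder than I thought...

I have a brute-force approach that basically guesses different permutations of the groups until it finally lands on one where no two members of the same original group share a new group. The benefit of this approach compared with what you showed is that it doesn't suffer from "running out of groups at the end".

It can get slow, but for 8 groups with 4 members each it is fast.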

Sample data

import random
import numpy as np
import pandas as pd
import itertools as it

np.random.seed(0)  # seed NumPy's generator: the shuffling below uses np.random, not random
numGroups=4
numMembers=4

data = list(it.product(range(numGroups),range(numMembers)))
df = pd.DataFrame(data=data,columns=['group','partid'])

Solution

g = np.repeat(range(numGroups),numMembers).reshape((numGroups,numMembers))

In [4]: g
Out[4]: 
array([[0, 0, 0, 0],
       [1, 1, 1, 1],
       [2, 2, 2, 2],
       [3, 3, 3, 3]])

def reArrange(g):
    g = np.transpose(g)
    g = [np.random.permutation(x) for x in g]
    return np.transpose(g)

# check to see if any members in each old group have duplicate new groups
# if so repeat
while np.any(np.apply_along_axis(lambda x: len(np.unique(x))<numMembers,1,g)):
    g = reArrange(g)

df["newGroup"] = g.ravel()

In [7]: df
Out[7]: 
    group  partid  newGroup
0       0       0         2
1       0       1         3
2       0       2         1
3       0       3         0
4       1       0         0
5       1       1         1
6       1       2         2
7       1       3         3
8       2       0         1
9       2       1         0
10      2       2         3
11      2       3         2
12      3       0         3
13      3       1         2
14      3       2         0
15      3       3         1
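Since the question asks for repeated reassignments, the brute-force loop above can be wrapped in a function. This is a rough sketch under a few assumptions: the names `brute_force_reassign` and `max_tries` are hypothetical, it uses `np.random.default_rng` instead of the global seed, and it draws fresh column shuffles on every attempt rather than reshuffling the previous `g`.

```python
import numpy as np

def brute_force_reassign(num_groups, num_members, rng, max_tries=1_000_000):
    """Hypothetical wrapper: retry random column shuffles until valid."""
    if num_members > num_groups:
        raise ValueError("need num_members <= num_groups")
    for _ in range(max_tries):
        # each column is an independent shuffle of the group labels, so
        # every new group receives exactly one member per column
        cols = [rng.permutation(num_groups) for _ in range(num_members)]
        g = np.column_stack(cols)           # shape (num_groups, num_members)
        # valid when every row (one original group) has distinct labels
        if (np.diff(np.sort(g, axis=1), axis=1) != 0).all():
            return g
    raise RuntimeError("no valid reassignment found within max_tries")

rng = np.random.default_rng(0)
g = brute_force_reassign(8, 4, rng)         # one draw for 8 groups of 4
```

For the 100 reassignments in the question, the call can be repeated, for example `[brute_force_reassign(8, 4, rng) for _ in range(100)]`. Each draw may need many attempts, so it is worth timing on the real group sizes first.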
