Why is my (newbie) code so slow?

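I'm learning Python and came across this example of a simulation of a model I'd seen before. One of the functions looked unnecessarily long, so I thought it would be good practice to try to make it more efficient. My attempt needs less code, but it runs at about 1/60 of the original's speed. Yes, I made it 60 times worse.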

My question is: where did I go wrong? I've tried timing individual parts of the function, but I can't see where the bottleneck is.

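Here is the original function. It belongs to a model in which people live on a grid and their happiness depends on whether they are the same race as most of their neighbors. (It's Schelling's segregation model.) So we give it the x,y coordinates of a person, and it determines their happiness by checking the race of each of their neighbors.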

def is_unhappy(self, x, y):

    race = self.agents[(x,y)]
    count_similar = 0
    count_different = 0

    if x > 0 and y > 0 and (x-1, y-1) not in self.empty_houses:
        if self.agents[(x-1, y-1)] == race:
            count_similar += 1
        else:
            count_different += 1
    if y > 0 and (x,y-1) not in self.empty_houses:
        if self.agents[(x,y-1)] == race:
            count_similar += 1
        else:
            count_different += 1
    if x < (self.width-1) and y > 0 and (x+1,y-1) not in self.empty_houses:
        if self.agents[(x+1,y-1)] == race:
            count_similar += 1
        else:
            count_different += 1
    if x > 0 and (x-1,y) not in self.empty_houses:
        if self.agents[(x-1,y)] == race:
            count_similar += 1
        else:
            count_different += 1        
    if x < (self.width-1) and (x+1,y) not in self.empty_houses:
        if self.agents[(x+1,y)] == race:
            count_similar += 1
        else:
            count_different += 1
    if x > 0 and y < (self.height-1) and (x-1,y+1) not in self.empty_houses:
        if self.agents[(x-1,y+1)] == race:
            count_similar += 1
        else:
            count_different += 1        
    if y < (self.height-1) and (x,y+1) not in self.empty_houses:
        if self.agents[(x,y+1)] == race:
            count_similar += 1
        else:
            count_different += 1        
    if x < (self.width-1) and y < (self.height-1) and (x+1,y+1) not in self.empty_houses:
        if self.agents[(x+1,y+1)] == race:
            count_similar += 1
        else:
            count_different += 1

    if (count_similar+count_different) == 0:
        return False
    else:
        return float(count_similar)/(count_similar+count_different) < self.similarity_threshold 

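Here is my code which, as I said, is much slower. I wanted to avoid all the if statements above by creating a list of "offsets" to add to each person's coordinates to get the positions of possible neighbors, then checking whether each is a valid position, and then checking the neighbor's race.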

def is_unhappy2(self, x, y):
    thisRace = self.agents[(x,y)]
    count_same = 0
    count_other = 0

    for xo, yo in list(itertools.product([-1,0,1],[-1,0,1])):
        if xo==0 and yo==0:
            # do nothing for case of no offset
            next
        else:
            # check if there's a neighbor at the offset of (xo, yo)
            neighbor = tuple(np.add( (x,y), (xo,yo) ))
            if neighbor in self.agents.keys():
                if self.agents[neighbor] == thisRace:
                    count_same += 1
                else:
                    count_other += 1
    if count_same+count_other == 0:
        return False
    else:
        return float(count_same) / (count_same + count_other) < self.similarity_threshold

(The rest of the code, which creates the class, is on the site where the example comes from.)

Here are the timing results:

%timeit s.is_unhappy2(49,42)
100 loops, best of 3: 5.99 ms per loop

%timeit s.is_unhappy(49,42)
10000 loops, best of 3: 103 µs per loop

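I'm hoping someone who knows Python can see right away what I'm doing wrong, without having to dig into the details of the rest of the code. Can you see why my code is so much worse than the original?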

The culprit appears to be this line:

neighbor = tuple(np.add( (x,y), (xo,yo) ))

Changing it to this gives a huge speedup:

neighbor = (x + xo, y + yo)

Don't use np.add; just use neighbor = (x+xo, y+yo). That should make it much faster (10x faster in my small test).
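To see what that one line costs in isolation, it can be timed on its own with the standard timeit module. This is only a minimal sketch: the helper names with_numpy and with_plain_tuple are made up for illustration, the offset is arbitrary, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

try:
    import numpy as np
except ImportError:  # lets the sketch still run without NumPy installed
    np = None

x, y = 49, 42   # the cell used in the question's timings
xo, yo = -1, 1  # one arbitrary neighbor offset


def with_numpy():
    # Builds arrays from the two tuples, adds them, converts back to a tuple.
    return tuple(np.add((x, y), (xo, yo)))


def with_plain_tuple():
    # Two integer additions and one tuple construction.
    return (x + xo, y + yo)


n = 20000
print("plain tuple: %.4f s" % timeit.timeit(with_plain_tuple, number=n))
if np is not None:
    # Both forms produce tuples that compare equal.
    assert with_numpy() == with_plain_tuple()
    print("np.add:      %.4f s" % timeit.timeit(with_numpy, number=n))
```

NumPy pays a fixed per-call cost for building arrays, which only pays off on large arrays; for a pair of integers, plain arithmetic avoids that overhead entirely.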

You can also...

  • use if neighbor in self.agents: without the .keys()
  • leave out the list
  • check xo or yo and not have an empty if block
  • avoid the double lookup of neighbor in self.agents

Result:

for xo, yo in itertools.product([-1,0,1],[-1,0,1]):
    if xo or yo:
        neighbor = self.agents.get((x+xo, y+yo))
        if neighbor is not None:
            if neighbor == thisRace:
                count_same += 1
            else:
                count_other += 1
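On the dictionary-related suggestions: membership tests and .get work directly on the dict, so .keys() adds nothing. If the code runs under Python 2 (the float(...) division in the question suggests it might), .keys() builds a fresh list on every call and in becomes a linear scan over it; under Python 3 it returns a view and the extra cost is small. The sketch below uses a made-up agents dict, not the model's real data, to show the equivalent forms.

```python
# Hypothetical grid contents, for illustration only.
agents = {(0, 0): "red", (0, 1): "blue"}

present = (0, 1)
absent = (5, 5)

# `in` on the dict itself is a hash lookup; .keys() is not needed.
assert (present in agents) == (present in agents.keys())
assert (absent in agents) == (absent in agents.keys())

# Two lookups: one for the membership test, one to fetch the value.
if present in agents:
    race_two_step = agents[present]

# One lookup: .get returns None by default when the key is missing.
# This relies on None never being stored as a race.
race_one_step = agents.get(present)
missing = agents.get(absent)

print(race_two_step, race_one_step, missing)
```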

You could also add

self.neighbor_deltas = tuple(set(itertools.product([-1,0,1],[-1,0,1])) - {(0, 0)})

to the class initializer, and then your function can just use those precomputed deltas:

for xo, yo in self.neighbor_deltas:
    neighbor = self.agents.get((x+xo, y+yo))
    if neighbor is not None:
        if neighbor == thisRace:
            count_same += 1
        else:
            count_other += 1
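For context, here is a minimal sketch of how the precomputed deltas could sit in a class. TinySchelling and its constructor arguments are hypothetical stand-ins for the site's class, which is not reproduced here; only agents, similarity_threshold, and neighbor_deltas mirror names used above. It assumes empty and off-grid cells are simply missing from agents.

```python
import itertools


class TinySchelling(object):
    """Hypothetical stand-in for the example's class, just enough to run."""

    def __init__(self, agents, similarity_threshold):
        self.agents = agents  # dict mapping (x, y) -> race
        self.similarity_threshold = similarity_threshold
        # The eight neighbor offsets, computed once instead of on every call.
        # Set ordering is arbitrary, which is fine because we only count.
        self.neighbor_deltas = tuple(
            set(itertools.product([-1, 0, 1], [-1, 0, 1])) - {(0, 0)}
        )

    def is_unhappy(self, x, y):
        this_race = self.agents[(x, y)]
        count_same = 0
        count_other = 0
        for xo, yo in self.neighbor_deltas:
            neighbor = self.agents.get((x + xo, y + yo))
            if neighbor is not None:
                if neighbor == this_race:
                    count_same += 1
                else:
                    count_other += 1
        total = count_same + count_other
        if total == 0:
            return False  # no neighbors at all: treated as not unhappy
        return float(count_same) / total < self.similarity_threshold


model = TinySchelling({(0, 0): "a", (0, 1): "b", (1, 1): "b"}, 0.5)
print(len(model.neighbor_deltas))  # number of precomputed offsets
print(model.is_unhappy(0, 0))      # both existing neighbors differ
```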

Congrats on deciding to improve that author's ridiculously repetitive code, by the way.