python 2 中的“范围内”构造 --- 工作速度太慢

Question

我想检查给定的 x 是否位于区间 [0,a-1] 内。作为一个懒惰的编码员，我写了

x in range(a)

并且（因为那段代码在 4.5 嵌套循环中）很快运行陷入了性能问题。我测试了它，结果确实运行time of n in range(n) 在于 O(n)，给予或接受。我实际上认为我的代码将被优化为 x >= 0 and x < a 但似乎并非如此。即使我提前修复了 range(a)，时间也不会变得恒定（尽管它改进了很多）- 请参阅旁注。

所以，我的问题是：

我应该使用 x >= 0 and x < a 并且 永远不会 写 x in range(a) 吗？还有更好的写法吗？

旁注：

我试着在 SO 中搜索 range, python-2.7, performance tags put together, found nothing (same with python-2.x)。
如果我尝试以下操作：
```
i = range(a)
...
x in i
```
所以范围是固定的，我只测量 x in i 的运行时间，我仍然在 O(x) 中得到运行时间（假设 a 是足够大）。
运行n in xrange(n)的时间也是O(n)。
我发现，它对 python 3 提出了类似的问题。我决定在 python 3 上测试相同的东西，它通过了测试，就像它什么都不是。我为 python 2.

Answer 1

您可以获得与 python3 中相同的性能，方法是使用自定义范围 class 并覆盖 in 运算符。对于琐碎的情况，它的性能不如简单比较，但是您将避免使用内置 range() 或 xrange().

请注意，测试 value in range(low, high) 不同于 low < value <= high，因为范围将仅包含整数。所以 7.2 in range(10) == False.

但更重要的是 range() 可以采用可选的第三个 step 参数，因此如果您需要测试 value in range(low, high, step)，您可以考虑使用自定义 class.

编辑：@mike239x 发现 future package 包含一个 range 对象，类似于我的回答中的对象（除了其他帮助您编写代码的函数 python2/3交叉兼容）。使用它应该是安全的，因为它可能已经过良好的测试和稳定。

这个 class 的一个对象包装了一个 xrange 对象，并且只覆盖了非常昂贵的 in 操作。对于常规迭代，它的工作方式与 xrange.

一样

class range(object):
  """Python 2 range class that emulates the constant time `in` operation from python 3"""

  def __init__(self, *args):
    self.start, self.end = (0, args[0]) if len(args) == 1 else args[:2]
    self.step = 1 if len(args) < 3 else args[2]
    self.xrange = xrange(*args)

  def __contains__(self, other):
    # implements the `in` operator as O(1) instead of xrange's O(n)
    try:
      assert int(other) == other
    except Exception:
      return False  # other is not an integer
    if self.step > 0:
      if not self.start <= other < self.end:
        return False  # other is out of range
    else:
      if not self.start >= other > self.end:
        return False  # other is out of range
    # other is in range. Check if it's a valid step
    return (self.start - other) % self.step == 0

  def __iter__(self):
    # returns an iterator used in for loops
    return iter(self.xrange)

  def __getattr__(self, attr):
    # delegate failed attribute lookups to the encapsulated xrange
    return getattr(self.xrange, attr)

内置的xrange对象是用C实现的，所以我们不能使用class继承。相反，我们可以使用组合并将除 __contains__ 之外的所有内容委托给封装的 xrange 对象。

contains 的实现可以与 cpython rangeobject 实现中的 range_contains_long 进行比较。 Here's the python 3.6 source code for that function.

编辑：要获得更全面的 python 实施，请查看 future.builtins.range from the future library.

Answer 2

Call `x in range( a )` slow?^{（注意 py2 隐藏风险如果使用 range() ...）}

       23[us] spent in [py2] to process ( x in range( 10E+0000 ) )
        4[us] spent in [py2] to process ( x in range( 10E+0001 ) )
        3[us] spent in [py2] to process ( x in range( 10E+0002 ) )
       37[us] spent in [py2] to process ( x in range( 10E+0003 ) )
      404[us] spent in [py2] to process ( x in range( 10E+0004 ) )
     4433[us] spent in [py2] to process ( x in range( 10E+0005 ) )
    45972[us] spent in [py2] to process ( x in range( 10E+0006 ) )
   490026[us] spent in [py2] to process ( x in range( 10E+0007 ) )
  2735056[us] spent in [py2] to process ( x in range( 10E+0008 ) )

MemoryError

in range( a ) 构造函数的语法不仅 slow in [TIME]-domain，如果做得更聪明，则最多具有 O(log N)，而不是通过 list-ed 值的枚举域进行纯顺序搜索，但是
在
py2，本机 range() 总是也有复合 add-on O( N ) 两者的成本 [TIME]-域成本（构建时间）以及 [SPACE]-域成本（分配 space 存储 + 花费更多时间将所有这些数据通过... ) 这种基于 range 的 memory-representation 构造。

让我们对一种安全的 `O( 1 )` 缩放方法进行基准测试 ^{（+总是做基准测试）}

>>> from zmq import Stopwatch
>>> aClk = Stopwatch()
>>> a = 123456789; x = 123456; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
4L
>>> a = 123456789; x = 123456; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
3L

需要 3 ~ 4 [us] 来评估 condition-based 公式，具有 O( 1 ) 缩放比例，对 [=27= 不变] 震级。

接下来，使用 `x in range( a )` 公式进行测试：

>>> a = 123456789; x = 123456; aClk.start(); _ = ( x in range( a ) );aClk.stop()

并且您的机器几乎会在 memory-throughput 范围内冻结 CPU-starvations（更不用说一些 ~ 100 [ns] 成本范围内令人讨厌的掉期溢出效应~ 15.000.000 [ns] IO data-flows 的 ~ 15.000.000 [ns] 成本高出几个数量级。

不，不，不。从来没有办法测试 x 在有界范围内。

创建其他一些基于 class 的评估器的想法，仍然通过枚举 ( set ) 解决问题将永远无法满足基准 3 ~ 4 [us] _{（如果不使用一些超出我对 class 物理和量子物理学中的 cause-effect 定律的理解的外星魔法）}

Python 3 has changed the way, how the range()-constructor works，但这不是原作的核心优点post：

3 [us] spent in [py3] to process ( x in range( 10E+0000 ) ) 2 [us] spent in [py3] to process ( x in range( 10E+0001 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0002 ) ) 2 [us] spent in [py3] to process ( x in range( 10E+0003 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0004 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0005 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0006 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0007 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0008 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0009 ) ) 2 [us] spent in [py3] to process ( x in range( 10E+0010 ) ) 1 [us] spent in [py3] to process ( x in range( 10E+0011 ) )

在Python2中，range()和xrange()都没有逃出O( N )缩放的陷阱，其中xrange()-generator 似乎只运行 2 倍慢

>>> from zmq import Stopwatch >>> aClk = Stopwatch() >>> for expo in xrange( 8 ): ... a = int( 10**expo); x = a-2; aClk.start(); _ = ( x in range( a ) );aClk.stop() ... 3L 8L 5L 40L 337L 3787L 40466L 401572L >>> for expo in xrange( 8 ): ... a = int( 10**expo); x = a-2; aClk.start(); _ = ( x in xrange( a ) );aClk.stop() ... 3L 10L 7L 77L 271L 2772L 28338L 280464L

range-bounds 语法享有 O( 1 ) 常数时间 ~ < 1 [us]，如前所述上面，所以再次比较的标准是：

>>> for expo in xrange( 8 ): ... a = int( 10**expo); x = a-2; aClk.start(); _ = ( 0 <= x < a );aClk.stop() ... 2L 0L 1L 0L 0L 1L 0L 1L

Answer 3

range in Python 2 is that it creates a list of values, so x in range(a) will create a list and linearly scan that list. xrange的问题应该是生成器的问题，但是并没有快多少；可能仍然只是线性扫描值，只是没有先创建整个列表。

In [2]: %timeit 5*10**5 in range(10**6 + 1)  # Python 2
10 loops, best of 3: 18.1 ms per loop

In [3]: %timeit 5*10**5 in xrange(10**6 + 1) # Python 2
100 loops, best of 3: 6.21 ms per loop

In , range 更聪明，不仅不创建整个列表，而且还提供了 contains 检查的快速实现。

In [1]: %timeit 5*10**5 in range(10**6 + 1)  # Python 3
1000000 loops, best of 3: 324 ns per loop

更快，恕我直言，可读性更强：使用比较链：

In [2]: %timeit 0 <= 5*10**5 < 10**6 + 1     # Python 2 or 3
10000000 loops, best of 3: 46.6 ns per loop

Should I use x >= 0 and x < a and never write x in range(a) ever again? Is there an even better way of writing it?

"No"、"it depends" 和 "yes"。您不应使用 x >= 0 and x < a，因为 0 <= x < a 更短且更易于解析（对于弱小的人），并被解释为 (0 <= x) and (x < a)。而且你不应该在Python2中使用in range，但是在Python3中，你可以使用如果你喜欢的话。

不过，我更喜欢比较链，因为 a <= x < b 比 x in range(a, b) 更明确地说明边界（如果 x == b 会怎么样？），这可以防止许多 off-by-one 错误或 +1 填充范围。

此外，请注意 0 <= x < a 与 x in range(0, a) 并不严格相同，因为 range 将只包含整数值，即 1.5 in range(0, 5) 是 False，而 0 <= 1.5 < 5 是 True，这可能不是您想要的。此外，使用 range 您可以使用 1 以外的步骤，例如5 in range(4, 10, 2) 是 False，但也可以使用纯数学来实现，例如作为 (4 <= x < 10) and (x - 4 % 2 == 0).

Answer 4

所以，是的，基本上，在 Python 2 中使用 range（如上所述）是一个坏主意 - python 实际上创建了一个包含范围 + 的所有值的列表之后它以最直接的方式搜索整个列表。

其中一个选项如下：使用 Python 3 中的 range，它可以更好地处理这种情况 for various reasons。 "Well"，你问，"how do I use range from Python 3 in Python 2"？答案是：使用futurelibrary。安装那个，记下来

from future.builtins import range

在您的代码 header 和 wuolah！- 您的范围现在表现为 Python 3 中的范围，现在您可以再次使用 x in range(a)，没有任何性能问题。

python 2 中的“范围内”构造 --- 工作速度太慢

`in range` construction in python 2 --- working too slow

python

performance

range

overhead

python-2.7

所以，我的问题是：

Call `x in range( a )` slow?^{（注意 py2 隐藏风险如果使用 range() ...）}

让我们对一种安全的 `O( 1 )` 缩放方法进行基准测试 ^{（+总是做基准测试）}

接下来，使用 `x in range( a )` 公式进行测试：

python 2 中的“范围内”构造 --- 工作速度太慢

`in range` construction in python 2 --- working too slow

python

performance

range

overhead

python-2.7

所以，我的问题是：

Call x in range( a ) slow?（注意 py2 隐藏风险如果使用 range() ...）

让我们对一种安全的 O( 1 ) 缩放方法进行基准测试 （+总是做基准测试）

接下来，使用 x in range( a ) 公式进行测试：

Call `x in range( a )` slow?^{（注意 py2 隐藏风险如果使用 range() ...）}

让我们对一种安全的 `O( 1 )` 缩放方法进行基准测试 ^{（+总是做基准测试）}

接下来，使用 `x in range( a )` 公式进行测试：