How can we efficiently check if a string is hexadecimal in Python

Question

I need to check whether a string is hexadecimal. I have learned two approaches:

1.) Iterate over every character

import string
all(c in string.hexdigits for c in s)  # straightforward, no optimizations

2.) Use the int() function and check for an error

def is_hex(s):
    # Let int() parse the string; a ValueError means it is not valid base 16.
    try:
        int(s, 16)
        return True
    except ValueError:
        return False

In the first case I know the complexity is O(n). But what about the second one? What is its time complexity?

Answer 1

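int(s, 16) still has a complexity of O(n), where n == len(s), but the two aren't directly comparable. int iterates over the data at a lower level than all, which is faster, but int also does more work (it actually has to compute the integer value of s).

So which is faster? You have to profile both.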
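Speed aside, the two checks also don't accept exactly the same inputs, which is worth knowing before picking one. Here is a minimal sketch of where they disagree; the helper names is_hex_all and is_hex_int are hypothetical, made up for this illustration, and the "1_f" case assumes Python 3.6 or later (where int() accepts underscores between digits).

```python
import string

def is_hex_all(s):
    # Hypothetical helper: character-by-character membership test.
    return all(c in string.hexdigits for c in s)

def is_hex_int(s):
    # Hypothetical helper: let int() parse and treat ValueError as "not hex".
    try:
        int(s, 16)
        return True
    except ValueError:
        return False

# Inputs chosen to show where the two checks differ.
samples = ["783c", "", "0x1f", " 1f ", "-1f", "1_f", "g"]
results = {s: (is_hex_all(s), is_hex_int(s)) for s in samples}

for s, (by_all, by_int) in results.items():
    print(f"{s!r:8} all={by_all!s:5} int={by_int}")
```

The empty string passes the all() check (all of nothing is True) but fails int(), while a "0x" prefix, surrounding whitespace, a sign, or an underscore between digits pass int() but fail the all() check. Which behavior is "right" depends on what you mean by a hex string.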

In [1]: s = "783c"

In [2]: import string

In [3]: %timeit all(c in string.hexdigits for c in s)
800 ns ± 3.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %%timeit
   ...: try:
   ...:   int(s, 16)
   ...: except ValueError:
   ...:   pass
   ...:
223 ns ± 1.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

It looks like the internal iteration wins. I also tested on a 9-digit string, and int was still about 4x faster.

But what about invalid strings?

In [8]: s = 'g'

In [9]: %%timeit
   ...: try:
   ...:   int(s, 16)
   ...: except ValueError:
   ...:   pass
   ...:
1.09 µs ± 2.62 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [10]: %timeit all(c in string.hexdigits for c in s)
580 ns ± 6.55 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

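Now we are essentially testing the benefit of short-circuiting against the cost of catching an exception. What happens if the error occurs later in the string?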

In [11]: s = "738ab89ffg"

In [12]: %timeit all(c in string.hexdigits for c in s)
1.59 µs ± 19.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [13]: %%timeit
    ...: try:
    ...:   int(s, 16)
    ...: except ValueError:
    ...:   pass
    ...:
1.25 µs ± 19.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Now we see the benefit of the internal iteration again.
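If you want to repeat these measurements outside IPython, something along these lines with the standard timeit module should work. It is only a sketch: the helper per_call_seconds and the loop counts are my own choices, not what %timeit does internally, and the absolute numbers will vary by machine and Python version.

```python
import string
import timeit

# The two statements timed in the session above, as source strings for timeit.
STMT_ALL = "all(c in string.hexdigits for c in s)"
STMT_INT = "try:\n    int(s, 16)\nexcept ValueError:\n    pass"

def per_call_seconds(stmt, s, number=20_000, repeat=5):
    # Hypothetical helper: best of `repeat` runs, divided by the loop count.
    env = {"string": string, "s": s}
    best = min(timeit.repeat(stmt, globals=env, number=number, repeat=repeat))
    return best / number

timings = {}
# Same three inputs as above: valid, invalid at once, invalid at the end.
for s in ("783c", "g", "738ab89ffg"):
    timings[s] = (per_call_seconds(STMT_ALL, s), per_call_seconds(STMT_INT, s))
    t_all, t_int = timings[s]
    print(f"{s!r:14} all: {t_all * 1e9:8.1f} ns   int: {t_int * 1e9:8.1f} ns")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes; it reports a best case rather than the mean ± std. dev. that %timeit prints.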

