是否有快速、实用的素数生成器？

Question

假设我有一个自然数 n 并且我想要一个包含不超过 n.

的所有素数的列表（或其他）

经典的素数筛算法在 O(n log n) 时间和 O(n) space 内运行——它适用于更多命令式语言，但需要对列表进行就地修改和随机访问，从根本上来说。

有一个涉及优先级队列的功能版本，非常漂亮——你可以看看here。这在 O(n / log(n)) 处具有更好的 space 复杂度（渐近更好，但在实际规模上值得商榷）。不幸的是，时间分析很糟糕，但它非常接近 O(n^2)（实际上，我认为它大约是 O(n log(n) Li(n))，但 log(n) Li(n) 大约是 n）。

渐近地说，实际上最好在生成每个数字时使用连续的试验除法来检查它的素数，因为这只需要 O(1) space 和 O(n^{3/2}) 时间。有没有更好的方法？

编辑： 结果证明我的计算完全不正确。文中的算法是O(n (log n) (log log n))，文中对此进行了解释和证明（见下面的答案），而不是我上面放的复杂的乱七八糟的。如果有一个真正的 O(n log log n) 纯算法，我仍然很乐意看到。

Answer 1

这是 Melissa O'Neill 算法的 Haskell 实现（来自链接文章）。与 Gassa 链接到的实现不同，我很少使用惰性，因此性能分析很清楚——O(n log n log log n)，即线性算术n log log n，命令式埃拉托色尼筛法的写入次数。

堆实现只是一个锦标赛树。平衡逻辑在push；通过每次交换子树，我们确保对于每个分支，左子树的大小与右子树的大小相同或大一倍，从而确保深度 O(log n).

module Sieve where

type Nat = Int

data Heap = Leaf !Nat !Nat
          | Branch !Nat !Heap !Heap
          deriving Show

top :: Heap -> Nat
top (Leaf n _) = n
top (Branch n _ _) = n

leaf :: Nat -> Heap
leaf p = Leaf (3 * p) p

branch :: Heap -> Heap -> Heap
branch h1 h2 = Branch (min (top h1) (top h2)) h1 h2

pop :: Heap -> Heap
pop (Leaf n p) = Leaf (n + 2 * p) p
pop (Branch _ h1 h2)
  = case compare (top h1) (top h2) of
        LT -> branch (pop h1) h2
        EQ -> branch (pop h1) (pop h2)
        GT -> branch h1 (pop h2)

push :: Nat -> Heap -> Heap
push p h@(Leaf _ _) = branch (leaf p) h
push p (Branch _ h1 h2) = branch (push p h2) h1

primes :: [Nat]
primes
  = let helper n h
          = case compare n (top h) of
                LT -> n : helper (n + 2) (push n h)
                EQ -> helper (n + 2) (pop h)
                GT -> helper n (pop h)
      in 2 : 3 : helper 5 (leaf 3)

Answer 2

在这里，如果 (Haskell's) 纯数组算作纯数组（他们应该，IMO）。复杂度显然是 O(n log (log n))，前提是 accumArray 确实为每个条目花费了 O(1) 时间它被给予，因为它应该：

import Data.Array.Unboxed 
import Data.List (tails, inits)

primes = 2 : [ n |
   (r:q:_, px) <- zip (tails (2 : [p^2 | p <- primes]))
                      (inits primes),
   (n,True)    <- assocs ( accumArray (\_ _ -> False) True
                             (r+1,q-1)
                             [ (m,()) | p <- px
                                      , let s = div (r+p) p * p
                                      , m <- [s,s+p..q-1]]
                             :: UArray Int Bool ) ]

通过段计算素数连续平方之间的素数，通过枚举素数列表相应前缀的倍数生成合成（使用 inits），就像任何适当的埃拉托色尼筛法一样，通过重复添加。

因此，素数 {2,3} 用于筛选从 10 到 24; {2,3,5} 从 26 到 48；等等。 See also.

此外，Python generator-based sieve might be considered functional as well. Python's dicts are extremely well-performing, empirically，虽然我不确定那里使用的倍数 over-producing 方案的确切成本以避免重复合成。

更新： testing it 确实如预期的那样产生了有利的结果：

{-     original heap       tweaked           nested-feed         array-based
          (3*p,p)         (p*p,2*p)            JBwoVL              abPSOx
          6Uv0cL          2x speed-up     another 3x+ speed-up
                n^                n^                  n^                  n^
100K:  0.78s             0.38s               0.13s              0.065s    
200K:  2.02s   1.37      0.97s   1.35        0.29s   1.16       0.13s    1.00
400K:  5.05s   1.32      2.40s   1.31        0.70s   1.27       0.29s    1.16
800K: 12.37s   1.29                     1M:  2.10s   1.20       0.82s    1.13
 2M:                                                            1.71s    1.06
 4M:                                                            3.72s    1.12
10M:                                                            9.84s    1.06
    overall in the tested range:
               1.33                                  1.21                1.09
-}

with empirical orders of growth 计算产生 n 个素数，其中 O(n log log n) 通常被视为 n^1.05...1.10 和 O(n log n log log n) as n^1.20...1.25.

"nested-feed" variant implements the postponement technique (as also seen in the above linked Python answer) that achieves quadratic reduction of the heap size which evidently has a noticeable impact on the empirical complexity, even if not quite reaching the still better results for the array-based code of this answer, which is able to produce 10 秒内在 ideone.com 上有 1000 万个质数（总体增长率仅为 n^1.09在测试范围内）。

（"original heap" is of course the code from 这里）。

Answer 3

前阵子推导了一个素数生成函数（按顺序生成所有素数）。还为它创建了一个 6 页的校样。我认为这实际上是历史上第一个素数生成函数（至少我找不到其他例子）。

这里是:

(-1)^((4*gamma(x)+4)/x)-1

不确定它的计算速度有多快。它 returns 0 对于所有素数（或者可能是 1，不记得了）。 Gamma 函数本质上是阶乘的，因此可以在早期很快。将负 1 提高到小数指数是另一回事，我相信它可能使用 base_e 中的积分，或者可能使用一些三角函数；不记得了。

我不懂 LaTeX，所以如果有人想编辑我的 post 并包含一个 LaTeX 版本，那就太棒了！

是否有快速、实用的素数生成器？

Is there a fast, functional prime generator?

algorithm

primes

functional-programming

sieve-of-eratosthenes