分解 99 位长的大素数的问题

Question

号码是112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869 我的代码如下

def getfactors(number):
    factors = []
    for potentialFactor in range(1 , int(math.sqrt(number)) + 1):
        if number % potentialFactor == 0:
            factors.append(potentialFactor)
    return factors

并且输入是

getfactors(112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869)

该程序已经运行至少 3 小时了，我仍然没有结果。该代码也适用于其他数字。我可以使用任何算法或方法来加快速度吗？

Answer 1

你对这个号码了解多少？如果它是一个 RSA public 密钥，那么它只有两个大素数。如果它是一个随机数，那么它很可能有小的质因数。数字的类型将决定您希望如何对其进行因式分解。

两个辅助功能也很有用。首先是 Eratosthenes 筛法来构建一个素数列表，例如 50,000 或某个方便的限制。其次是大数素数检验，例如Miller-Rabin，检查余数是否为素数。

使用 Eratosthenes 筛法为您提供所有小素数，直到一个方便的极限。依次测试每个素数直到目标数的平方根。当找到能整除测试数的质数时，将测试数除以使其变小。一个质数可以分裂不止一次。完成所有除法后，将素数限制重置为较小数字的平方根。

if (numToTest MOD trialFactor = 0)
  repeat
    listOfFactors.add(trialFactor)
    numToTest <- numToTest/trialFactor
  until (numToTest MOD trialFactor != 0)
  primeLimit <- sqrt(numTotest)
endif

一旦您测试的数字减少到 1，您就已经完全分解了它。

如果在完全因式分解目标之前运行出素数，则值得运行进行 Miller-Rabin 测试 64 次以查看余数本身是否为素数；这可能会为您节省大量寻找大质数的不存在因子的工作。如果余数是复合的，那么您可以使用更大的筛再试一次，或者使用一种重型因式分解方法：二次筛、椭圆曲线等。

Answer 2

由于 RSA 素数彼此接近，因此您的方法将花费大量时间来考虑给定数字。即使 sieving with Sieve of Eratosthenes won't help, since you have a 326-bit number. Can you sieving to 163-bit, there is no way. This is slightly larger than the first RSA challenge RSA-100 有 300 位。

使用现有的库，例如

CADO-NFS； http://cado-nfs.gforge.inria.fr/
NFS 分解：http://gilchrist.ca/jeff/factoring/nfs_beginners_guide.html
保理即服务https://seclab.upenn.edu/projects/faas/

实验

我试过 Pollard's p-1 algorithm, still running for one and a half-day and did not produce a result, yet. This is what expected due to the B bound must be around 2^55 with success probability 1/27. I've stopped the experiment after the CADO-NFS succeeds. This is self-implemented Pollard's p-1, one can find an optimized in GMP-ECM
尝试了 CADO-NFS。稳定版本可能不容易为新机器编译，所以更喜欢 GitLab.
中的活动版本
使用 4 个内核大约 6 小时后，CADO-NFS 生成了结果。正如预期的那样，这是一个 RSA CTF/Challange。因为我不想破坏乐趣；这里是 SHA-512 的哈希承诺，它是用 OpenSSL 执行的；

echo -n "prime x" | openssl sha512

27c64b5b944146aa1e40b35bd09307d04afa8d5fa2a93df9c5e13dc19ab032980ad6d564ab23bfe9484f64c4c43a993c09360f62f6d70a5759dfeabf59f18386

faebc6b3645d45f76c1944c6bd0c51f4e0d276ca750b6b5bc82c162e1e9364e01aab42a85245658d0053af526ba718ec006774b7084235d166e93015fac7733d

实验详情

CPU : Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
RAM : 32GB - 不需要太多内存，至少在多项式选择和筛选过程中是这样。
专用内核：4
测试机Ubuntu20.04.1 LTS
CUDA - 否
gcc 版本 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
cmake 版本 3.16.3
外部图书馆：Ubuntu 的规范
没有任何内容
CODA-NFS 版本：GitLab 开发版本克隆于 2021 年 1 月 23 日
位大小；
- n 有 326 位（RSA-100 挑战有 330 位并在 1991 年被 Lenstra 打破）
- p 有 165 位
- q 有 162 位

cado-nfs-2.3.0 没有编译并给出关于 HWLOC-HWLOC_TOPOLOGY_FLAG_IO_DEVICES 的错误。请一位朋友测试编译，它对他们有用。这是一个较旧的 Linux 版本。所以我决定使用 GitLab version.

Answer 3

我已经为 R 编程语言编写了一个名为 RcppBigIntAlgos 的库，它可以在合理的时间内分解这些类型的数字（不如使用出色的 CADO-NFS @kelalaka 的回答中使用的库。

正如其他人指出的那样，除非您投入大量时间，否则您自己编写的任何程序都很难计算出您的数字。就我个人而言，我已经投入了数千小时。您的里程可能会有所不同。

这是我在普通 R 控制台中进行的测试运行（没有特殊 IDE）：

numDig99 <- "112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869"
## install.packages("RcppBigIntAlgos") if necessary
library(RcppBigIntAlgos)

prime_fac <- quadraticSieve(numDig99, showStats=TRUE, nThreads=8)

Summary Statistics for Factoring:
    112887987371630998240814603336195913423482111436696007401429072377238341647882152698281999652360869

|      MPQS Time     | Complete | Polynomials |   Smooths  |  Partials  |
|--------------------|----------|-------------|------------|------------|
|  11h 13m 25s 121ms |   100%   |   11591331  |    8768    |    15707   |

|  Mat Algebra Time  |    Mat Dimension   |
|--------------------|--------------------|
|    1m 39s 519ms    |    24393 x 24475   |

|     Total Time     |
|--------------------|
|  11h 15m 12s 573ms |

就像@kelalaka 一样，我们获得了相同的散列值（同样，运行就在 R 控制台中）：

system(sprintf("printf %s | openssl sha512", prime_fac[1]))
faebc6b3645d45f76c1944c6bd0c51f4e0d276ca750b6b5bc82c162e1e9364e01aab42a85245658d0053af526ba718ec006774b7084235d166e93015fac7733d

system(sprintf("printf %s | openssl sha512", prime_fac[2]))
27c64b5b944146aa1e40b35bd09307d04afa8d5fa2a93df9c5e13dc19ab032980ad6d564ab23bfe9484f64c4c43a993c09360f62f6d70a5759dfeabf59f18386

RcppBigIntAlgos::quadraticSieve 中实现的算法是 Multiple Polynomial Quadratic Sieve. There is a more efficient version of the quadratic sieve known as the Self Initializing Quadratic Sieve，但是，没有那么多可用的文献。

这是我的机器规格：

MacBook Pro (15-inch, 2017)
Processor: 2.8 GHz Quad-Core Intel Core i7
Memory; 16 GB 2133 MHz LPDDR3

这是我的 R 信息：

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RcppBigIntAlgos_1.0.1 gmp_0.6-0            

loaded via a namespace (and not attached):
[1] compiler_4.0.3 tools_4.0.3    Rcpp_1.0.5

分解 99 位长的大素数的问题

Issues factoring large prime that is 99 digits long

encryption

math

cryptography

rsa

prime-factoring