大数整数类型的性能

Question

我正在尝试解决 AdventOfCode 2018 day 14。任务大致是通过在两个已经存在的数字的基础上迭代地附加一个或两个数字来创建一个有很多数字的数字。对于 Haskell，我认为 Integer 可能很适合代表巨大的数字。我认为我的程序正确，至少它似乎适用于 AoC 提供的样本。但是我注意到，当数字包含超过 10^4 个数字（附加程序中的recipeCount）时，程序的性能会急剧下降。当将数字增加到以下位数时，我观察到以下执行时间：

10000位：0.314s
20000位：1.596s
30000位：4.306s
40000位：8.954s

看起来像 O(n^2) 或更糟，不是吗？

这是为什么？据我所知，该程序只进行基本计算。

import Data.Bool (bool)

main :: IO ()
main = print solve

recipeCount :: Int
recipeCount = 10000

solve :: Integer
solve = loop 0 1 37 2
    where
        loop recipeA recipeB recipes recipesLength
            | recipesLength >= recipeCount + 10 = recipes `rem` (10 ^ 10)
            | otherwise =
                let recipeAScore = digitAt (recipesLength - 1 - recipeA) recipes
                    recipeBScore = digitAt (recipesLength - 1 - recipeB) recipes
                    recipeSum = fromIntegral $ recipeAScore + recipeBScore
                    recipeSumDigitCount = bool 2 1 $ recipeSum < 10
                    recipes' = recipes * (10 ^ recipeSumDigitCount) + recipeSum
                    recipesLength' = recipesLength + recipeSumDigitCount
                    recipeA' = (recipeA + recipeAScore + 1) `rem` recipesLength'
                    recipeB' = (recipeB + recipeBScore + 1) `rem` recipesLength'
                in loop recipeA' recipeB' recipes' recipesLength'

digitAt :: Int -> Integer -> Int
digitAt i number = fromIntegral $ number `quot` (10 ^ i) `rem` 10

P.S.: 因为我是 Haskell 的新手，我也非常感谢对程序本身（风格、算法等）的反馈。

编辑：我找到了对我的程序的两个版本进行概要分析的选项。两个版本均使用 ghc -O2 -rtsopts ./Program.hs 和运行以及 ./Program +RTS -sstderr.

编译

第一个整数版本在生成 50,000 个食谱时产生以下输出：

   2,435,108,280 bytes allocated in the heap
         886,656 bytes copied during GC
          44,672 bytes maximum residency (2 sample(s))
          29,056 bytes maximum slop
               0 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1925 colls,     0 par    0.018s   0.017s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time   15.208s  ( 15.225s elapsed)
  GC      time    0.018s  (  0.017s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time   15.227s  ( 15.242s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    160,115,875 bytes per MUT second

  Productivity  99.9% of total user, 99.9% of total elapsed

具有可变数组的第二个版本在生成约 500,000 个食谱时产生以下输出：

      93,437,744 bytes allocated in the heap
          16,120 bytes copied during GC
         538,408 bytes maximum residency (2 sample(s))
          29,056 bytes maximum slop
               0 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0        88 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.021s  (  0.020s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.021s  (  0.021s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    4,552,375,284 bytes per MUT second

  Productivity  97.0% of total user, 97.2% of total elapsed

Answer 1

我认为首先使用 Integer 作为您的食谱列表是一个大危险信号。 Integers 存储号码，但您的问题不要求号码。它需要一个数字列表。一个 Integer，其优先级是一个数字，基本上是 "compressed"：它是二进制的，而不是十进制的，并且试图从中提取一个十进制数字意味着你必须做时髦的，不平凡的数学，正如其他人所说。此外，纯度对您不利，因为每次向列表添加新数字时，您最终都会复制整个列表。问题大小约为 100,000-1,000,000 位数字（我得到的问题输入约为 800,000），即每次复制 Integer 大小约为 log_(2^8)(10^(10^5)) = ~41000 字节。这部分似乎也是二次方的。

我会推荐 "decompressing" 你的数字列表。您可以用 1 个字节表示一个数字（这确实浪费了很多 space！）

import Data.Word
type Digit = <a href="https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Word.html#t:Word8" rel="nofollow noreferrer">Word8</a>
addDigit :: Digit -> Digit -> (Digit, Digit)
addDigit = _yourJob

您可以使用数组将逻辑的主体实现为函数。是的，Haskell 确实有数组，从具有实际 O(1) 索引的连续内存块的意义上来说。只是我们喜欢找到 "more functional" 的方式来表达问题而不是数组。但是，如果您需要它们，它们总是在那里。

import Data.Array.Unboxed -- from the <a href="https://hackage.haskell.org/package/array-0.5.3.0" rel="nofollow noreferrer">array</a> package, which is a core library
makeRecipes ::
  -- | Elf 1's starting score
  Digit ->
  -- | Elf 2's starting score
  Digit ->
  -- | Number of recipes to make
  Int ->
  -- | Scores of the recipes made, indices running from 0 upwards
  <a href="https://hackage.haskell.org/package/array-0.5.3.0/docs/Data-Array-Unboxed.html#t:UArray" rel="nofollow noreferrer">UArray</a> Int Digit

数组最酷的地方在于你可以在 ST monad 中改变它们，同时得到一个纯粹的结果。因此，这个数组不会受到任何复制，索引它所涉及的数学运算也很少。

import Control.Monad.ST
import Data.Array.ST
makeRecipes elf1 elf2 need = <a href="https://hackage.haskell.org/package/array-0.5.3.0/docs/Data-Array-ST.html#v:runSTUArray" rel="nofollow noreferrer">runSTUArray</a> $ do
  arr <- <a href="https://hackage.haskell.org/package/array-0.5.3.0/docs/Data-Array-MArray.html#v:newArray_" rel="nofollow noreferrer">newArray_</a> (0, need)
  <a href="https://hackage.haskell.org/package/array-0.5.3.0/docs/Data-Array-MArray.html#v:writeArray" rel="nofollow noreferrer">writeArray</a> arr 0 elf1
  writeArray arr 1 elf2
  loop arr 0 1 2
  return arr
  where
    loop :: STUArray s Int Digit -> Int -> Int -> Int -> ST s ()
    loop arr loc1 loc2 done = _yourJob

loop 给出了数组，其中部分填充了 done 配方分数，以及两个精灵的位置 loc1, loc2 < done。它应该使用 addDigit 和 readArray 计算新食谱的分数，并使用 writeArray 将它们添加到数组的正确位置。如果数组已满，它应该终止（它没有 return 任何有用的东西）。否则，它应该继续计算精灵的新位置，然后递归。

然后您可以在 makeRecipes 之上编写一个小适配器来实际提取最后十个食谱，提供正确的输入等。当我填写程序中的所有空白时，我得到了一个运行时-O2 在我的输入（大约 800,000）上有 .07s，-O0 有大约 0.8s。输入似乎需要 O(n) 时间。

大数整数类型的性能

Performance of Integer type with large numbers

haskell

biginteger