Why is sum slower than foldl' in haskell?

Why is the default sum in ghc ~10 times slower than its foldl' (the stricter equivalent of foldl) counterpart? And if that is the case, why isn't it implemented using foldl'?

import Data.List
> foldl' (+) 0 [1..10^7]
50000005000000
(0.39 secs, 963,528,816 bytes)

> sum [1..10^7]
50000005000000
(4.13 secs, 1,695,569,176 bytes)

For completeness, here are the stats for foldl and foldr as well.

> foldl (+) 0 [1..10^7]
50000005000000
(4.02 secs, 1,695,828,752 bytes)

> foldr (+) 0 [1..10^7]
50000005000000
(3.78 secs, 1,698,386,648 bytes)

It looks like sum is implemented using foldl, since their running times are similar. Tested on ghc 7.10.2.

The sum function is implemented using foldl in GHC:

-- | The 'sum' function computes the sum of a finite list of numbers.
sum                     :: (Num a) => [a] -> a
{-# INLINE sum #-}
sum                     =  foldl (+) 0

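as can be seen in the source.

It has to be this way, because that is the specification in the Haskell report.

The rationale was most likely that for certain lazy element types of the list, foldl is the right thing to do. (I personally think that foldl is almost always wrong and that only foldl' should be used.)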
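A minimal sketch of how the two folds can differ when the combining function is lazy in its accumulator. The names keepLast, lazyResult and strictResult are hypothetical, and the example uses a plain function rather than a real Num instance, so it only illustrates the general point:

```haskell
import Data.List (foldl')

-- Hypothetical combining function: it returns the newest element and
-- never inspects the accumulator.
keepLast :: a -> a -> a
keepLast _ x = x

-- Lazy fold: the intermediate accumulator (undefined) is never demanded,
-- so this evaluates to 1.
lazyResult :: Int
lazyResult = foldl keepLast 0 [undefined, 1]

-- Strict fold: foldl' forces each intermediate accumulator, including the
-- undefined one, so evaluating this value raises an exception.
strictResult :: Int
strictResult = foldl' keepLast 0 [undefined, 1]
```

Loading this in GHCi and evaluating lazyResult gives 1, while evaluating strictResult fails. With the usual strict (+) on Int or Integer this difference does not arise.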

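With sufficient optimization, GHC will inline that definition, specialize it to the element type at hand, notice that the loop is strict, and force the evaluation of the accumulator in every iteration, effectively turning it into a foldl', as observed by @AndrásKovács.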
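The strict loop described there can be sketched by hand. This is only an illustration of the idea, not GHC's actual output; the name sumStrict is hypothetical, and the bang patterns stand in for the accumulator forcing that foldl' performs:

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical hand-written strict sum: the accumulator is forced on every
-- step, so no chain of unevaluated (+) thunks builds up.
sumStrict :: Num a => [a] -> a
sumStrict = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs
```

In GHCi, sumStrict [1..10^7] should behave much like foldl' (+) 0 [1..10^7] from the question, since the interpreter does not get the chance to optimize the lazy version.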

Since GHC-7.10, sum itself is a method of the Foldable type class, and the default definition goes via foldMap. The instance Foldable [], however, overrides it with the above definition of sum.
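A sketch of what a definition "via foldMap" looks like, using the Sum monoid from Data.Monoid. The name sumViaFoldMap is hypothetical, and this is a simplified version rather than the exact text in base:

```haskell
import Data.Monoid (Sum(..))

-- Hypothetical name: wrap every element in the Sum monoid, combine the
-- wrapped values with foldMap, then unwrap the result.
sumViaFoldMap :: (Foldable t, Num a) => t a -> a
sumViaFoldMap = getSum . foldMap Sum
```

Because it only needs Foldable, this works on any container, for example sumViaFoldMap (Just 3) as well as sumViaFoldMap [1..100].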

To complement @Joachim Breitner's answer, I found this blog post, which is a very interesting read (taken from the reddit discussion, thanks to @ZhekaKozlov for the link).

When Haskell 1.0 was published on this day 24 years ago there was no seq function at all, so there was no choice but to define foldl in the “classic” way.

Eventually, six years later after much discussion, we got the seq function in Haskell 1.3. Though actually in Haskell 1.3 seq was part of an Eval class, so you couldn’t use it just anywhere, such as in foldl. In Haskell 1.3 you would have had to define foldl' with the type:

foldl' :: Eval b => (b -> a -> b) -> b -> [a] -> b

Haskell 1.4 and Haskell 98 got rid of the Eval class constraint for seq but foldl was not changed. Hugs and GHC and other implementations added the non-standard foldl'.

I suspect that people then considered it a compatibility and inertia issue. It was easy enough to add a non-standard foldl' but you can’t so easily change the standard.

I suspect that if we had had seq from the beginning then we would have defined foldl using it.

Miranda, one of Haskell’s predecessor languages, already had seq 5 years before Haskell 1.0.

By the way, I managed to shave off 20 milliseconds by using

foldl1' (+) [1..10^7]

So, I guess foldl1' should be the default for sum and product (with special handling of empty lists).
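A minimal sketch of that suggestion, assuming the empty list should map to 0 as it does for the standard sum. The name sum1' is hypothetical:

```haskell
import Data.List (foldl1')

-- Hypothetical sum built on foldl1': the empty list is handled separately,
-- because foldl1' itself is undefined on an empty list.
sum1' :: Num a => [a] -> a
sum1' [] = 0
sum1' xs = foldl1' (+) xs
```

A product variant would look the same, with 1 for the empty case and (*) as the operator.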