Configuring Snap for performance

Question

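I've just been playing with the Snap framework and wanted to see how it performs against other frameworks (under completely artificial circumstances).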

What I have found is that my Snap application tops out at about 1500 requests/second (the app is simply snap init; snap build; ./dist/app/app, i.e. no code changes to the default app created by snap):

$ ab -n 20000 -c 500 http://127.0.0.1:8000/
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
Completed 18000 requests
Completed 20000 requests
Finished 20000 requests


Server Software:        Snap/0.9.5.1
Server Hostname:        127.0.0.1
Server Port:            8000

Document Path:          /
Document Length:        721 bytes

Concurrency Level:      500
Time taken for tests:   12.845 seconds
Complete requests:      20000
Failed requests:        0
Total transferred:      17140000 bytes
HTML transferred:       14420000 bytes
Requests per second:    1557.00 [#/sec] (mean)
Time per request:       321.131 [ms] (mean)
Time per request:       0.642 [ms] (mean, across all concurrent requests)
Transfer rate:          1303.07 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   44 287.6      0    3010
Processing:     6  274 153.6    317    1802
Waiting:        5  274 153.6    317    1802
Total:         20  318 346.2    317    3511

Percentage of the requests served within a certain time (ms)
  50%    317
  66%    325
  75%    334
  80%    341
  90%    352
  95%    372
  98%   1252
  99%   2770
 100%   3511 (longest request)

I then fired up a Grails application, and it looks like Tomcat (once the JVM warms up) can take quite a bit more load:

$ ab -n 20000 -c 500 http://127.0.0.1:8080/test-0.1/book
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
Completed 18000 requests
Completed 20000 requests
Finished 20000 requests


Server Software:        Apache-Coyote/1.1
Server Hostname:        127.0.0.1
Server Port:            8080

Document Path:          /test-0.1/book
Document Length:        722 bytes

Concurrency Level:      500
Time taken for tests:   4.366 seconds
Complete requests:      20000
Failed requests:        0
Total transferred:      18700000 bytes
HTML transferred:       14440000 bytes
Requests per second:    4581.15 [#/sec] (mean)
Time per request:       109.143 [ms] (mean)
Time per request:       0.218 [ms] (mean, across all concurrent requests)
Transfer rate:          4182.99 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   67 347.4      0    3010
Processing:     1   30  31.4     21     374
Waiting:        0   26  24.4     20     346
Total:          1   97 352.5     21    3325

Percentage of the requests served within a certain time (ms)
  50%     21
  66%     28
  75%     35
  80%     42
  90%     84
  95%    230
  98%   1043
  99%   1258
 100%   3325 (longest request)
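
As a quick sanity check, the headline throughput figures in the two reports can be recomputed from the request count and the "Time taken for tests" values. This is only a sketch using the rounded numbers printed by ab; the variable names are illustrative.

```shell
# Counts and wall-clock times copied from the two ab reports above.
requests=20000
snap_seconds=12.845
grails_seconds=4.366

# Throughput = completed requests / time taken for the test.
snap_rps=$(awk -v n="$requests" -v t="$snap_seconds" 'BEGIN { printf "%.0f", n / t }')
grails_rps=$(awk -v n="$requests" -v t="$grails_seconds" 'BEGIN { printf "%.0f", n / t }')

# Ratio of the two throughputs (same request count, so it equals the time ratio).
ratio=$(awk -v a="$snap_seconds" -v b="$grails_seconds" 'BEGIN { printf "%.1f", a / b }')

echo "Snap:   $snap_rps requests/second"
echo "Grails: $grails_rps requests/second"
echo "Grails/Snap: ${ratio}x"
```

The results land within a request per second of what ab reports; the small difference comes from ab dividing by the unrounded elapsed time.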

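I'm guessing that part of this could be that Tomcat seems to reserve a lot of RAM and can keep/cache some methods. During this experiment Tomcat was using in excess of 700 MB of RAM, while Snap barely got near 70 MB.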

Questions I have:

Am I comparing apples and oranges here?
What steps would it take to optimise Snap for throughput/speed?

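Further experiments:

Then, as suggested by mightybyte, I started experimenting with the +RTS -A4M -N4 options. The app was able to serve over 2000 requests per second (roughly a 25% increase).

I also removed the nested templating and served a document (the same size as before) from the top-level tpl file. This increased performance to over 7000 requests per second. Memory usage went up to about 700 MB.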

Answer 1

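I'm by no means an expert on the subject, so I can only really answer your first question: yes, you are comparing apples and oranges (and also bananas, without realizing it).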

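First off, it looks like you are attempting to benchmark different things, so naturally your results will be inconsistent. One of these is a sample Snap application and the other is just "a Grails application". What exactly is each of them doing? Are you serving pages? Handling requests? The differences between the applications will explain the differences in performance.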

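Secondly, the difference in RAM usage also shows the difference in what these applications are doing. Haskell web frameworks are very good at handling large instances without much RAM, whereas other frameworks, like Tomcat as you saw, will be limited in their performance when RAM is limited. Try limiting both applications to 100 MB and see what happens to your performance difference.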
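
One way to try that memory cap, sketched under some assumptions: the Snap binary is the one from the question and was built with -rtsopts, and Tomcat is a standalone install started through catalina.sh (the CATALINA_HOME location is a placeholder for wherever Tomcat lives on your machine).

```shell
# GHC side: the -M RTS flag caps the maximum heap size.
# Assumption: the binary was built with -rtsopts so it accepts this flag.
./dist/app/app +RTS -M100m -RTS

# JVM side: -Xmx caps the maximum Java heap; Tomcat picks up extra JVM
# flags from the CATALINA_OPTS environment variable.
# Assumption: a standalone Tomcat started via catalina.sh in $CATALINA_HOME.
CATALINA_OPTS="-Xmx100m" "$CATALINA_HOME/bin/catalina.sh" run
```

Both flags limit the managed heap rather than the total process footprint, so the resident memory reported by the operating system can still end up somewhat above 100 MB.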

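If you want to compare the different frameworks, you really need to run a standard application to do that. Snap did this with a Pong benchmark. The results of an old test (from 2011 and Snap 0.3) can be seen here. This paragraph is extremely relevant to your situation: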

If you’re comparing this with our previous results you will notice that we left out Grails. We discovered that our previous results for Grails may have been too low because the JVM had not been given time to warm up. The problem is that after the JVM warms up for some reason httperf isn’t able to get any samples from which to generate a replies/sec measurement, so it outputs 0.0 replies/sec. There are also 1000 connreset errors, so we decided the Grails numbers were not reliable enough to use.

For comparison, the Yesod blog has a Pong benchmark from around the same time that shows similar results. You can find that here. They also link to their benchmark code if you would like to try running a more similar benchmark; it is available on GitHub.

Answer 2

jkeuhlen's answer makes good observations relevant to your first question. As to your second question, there are definitely things you can play with to tune performance. If you look at Snap's old raw result data, you can see that we were running the application with +RTS -A4M -N4. The -N4 option tells the GHC runtime to use 4 threads. (Note that you have to build the application with -threaded to do this.) The -A4M option sets the size of the garbage collector's allocation area. Our experiments showed that these two seemed to have the biggest impact on performance. But that was done a long time ago and GHC has changed a lot since then, so you probably want to play around with them and find what works best for you. This page has in-depth information about other command line options available for controlling GHC's runtime, if you wish to experiment more.
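
As a rough sketch of how those flags are passed in practice, assuming a cabal-based project and the executable path from the question (./dist/app/app); adjust both for your own setup, and check whether your project's .cabal file already sets -threaded:

```shell
# Assumption: the project builds with cabal; adjust for your own build tool.
# -threaded links the multi-core runtime; -rtsopts lets the binary accept
# the full set of +RTS flags on the command line.
cabal build --ghc-options="-threaded -rtsopts"

# -N4: use 4 runtime threads; -A4M: 4 MB allocation area for the GC.
# -s additionally prints GC and memory statistics when the process exits.
./dist/app/app +RTS -N4 -A4M -s -RTS
```

The -s summary makes it easier to judge whether a change to -A or -N actually reduced time spent in garbage collection, rather than relying on the ab numbers alone.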

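A little work was done last year towards updating the benchmarks. If you're interested in that, look around the different branches in the snap-benchmarks repository. It would be great to get more help towards a new set of benchmarks.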

haskell

haskell-snap-framework