Best clock or number generator function for concurrency/scalability on Erlang OTP 18?

Question

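I am currently converting an Erlang application from version 17 to 18. Scalability and performance are prime directives in the design. The program needs a way to differentiate and sort new inputs, either with lots of unique, monotonically increasing numbers (a continuous stream of them) or with some other mechanism. The current version (17) does not use now() for this because it is a scalability bottleneck (global lock), so it makes do by reading the clock and doing other things to generate tags for the incoming data. I am trying to find out the best way to do this in 18, and I have some interesting results from the tests I have run.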

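I expected erlang:unique_integer([monotonic]) to perform poorly, because I assumed it would have a global lock like now(). Assuming the clock can be read in parallel, I expected one of the clock functions to give the best results. Instead, erlang:unique_integer([monotonic]) got the best results of all the functions I benchmarked, and the clock functions did worse.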

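Could someone explain the results, tell me which Erlang functions should give the best results, and which things (clocks, number generators, etc.) are or are not globally locked in 18? Also, if you see any problems with my testing methodology, please point them out.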

TEST PLATFORM/METHODOLOGY

windows 7 64 bit
erlang otp 18 (x64)
2 intel cores (celeron 1.8GHz)
2 erlang processes spawned to run each test function concurrently 500000 times
    for a total of 1000000 times, timed with timer:tc
each test run 10 times in succession and all results recorded
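
A minimal sketch of a harness that follows the methodology above, for anyone who wants to reproduce the numbers. This is not the code that produced the results in this post; the module name bench_sketch and the functions bench/3 and run/2 are hypothetical. It assumes wall-clock time over all workers is what gets measured, using timer:tc/1 (which reports microseconds).

```erlang
-module(bench_sketch).
-export([bench/3, run/2]).

%% Run Fun Iterations times in each of Procs concurrent processes and
%% return the elapsed wall-clock time in microseconds.
bench(Fun, Procs, Iterations) ->
    Parent = self(),
    {Micros, ok} = timer:tc(fun() ->
        Pids = [spawn_link(fun() ->
                    loop(Fun, Iterations),
                    Parent ! {done, self()}
                end) || _ <- lists:seq(1, Procs)],
        %% Wait for every worker before the clock stops.
        lists:foreach(fun(Pid) -> receive {done, Pid} -> ok end end, Pids)
    end),
    Micros.

loop(_Fun, 0) -> ok;
loop(Fun, N) -> Fun(), loop(Fun, N - 1).

%% Benchmark a few of the candidates from this post (hypothetical selection).
run(Procs, Iterations) ->
    Candidates = [
        {unique_monotonic_integer, fun() -> erlang:unique_integer([monotonic]) end},
        {monotonic_time, fun() -> erlang:monotonic_time() end},
        {system_time, fun() -> erlang:system_time() end},
        {os_system_time, fun() -> os:system_time() end}
    ],
    [{Name, bench(F, Procs, Iterations)} || {Name, F} <- Candidates].
```

Calling bench_sketch:run(2, 500000) would correspond to the two-process, one-million-call setup described above; the process count can be raised to see how each function behaves under more contention.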

BASELINE TEST, SERIAL (microseconds)

erlang:unique_integer([monotonic])
47000-94000

PARALLEL TIMES (microseconds)

erlang:unique_integer([monotonic])
~94000

ets:update_counter
450000-480000

erlang:monotonic_time
202000-218000

erlang:system_time
218000-234000

os:system_time
124000-141000

calendar:universal_time
453000-530000

Answer 1

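If you are asking about testing methodology, I would expect you to include your code as well, because a small mistake in the benchmark code can ruin the results. So I wrote a Gist, so that we can compare results using the same code. YMMV, especially because I use Linux and the timers depend strongly on the underlying OS. Here are my results: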

$ uname -a
Linux hynek-notebook 4.1.0-1-amd64 #1 SMP Debian 4.1.3-1 (2015-08-03) x86_64 GNU/Linux
$ grep 'model name' /proc/cpuinfo 
model name      : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
model name      : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
model name      : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
model name      : Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
$ erl
Erlang/OTP 18 [erts-7.0] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V7.0  (abort with ^G)
1> c(test).
{ok,test}
2> test:bench_all(1).
[{unique_monotonic_integer,{38341,39804}},
 {update_counter,{158248,159319}},
 {monotonic_time,{217531,218272}},
 {system_time,{224630,226960}},
 {os_system_time,{53489,53691}},
 {universal_time,{114125,116324}}]
3> test:bench_all(2).
[{unique_monotonic_integer,{40109,40238}},
 {update_counter,{307393,338993}},
 {monotonic_time,{120024,121612}},
 {system_time,{123634,124928}},
 {os_system_time,{29606,29992}},
 {universal_time,{177544,178820}}]
4> test:bench_all(20).
[{unique_monotonic_integer,{23796,26364}},
 {update_counter,{514835,527087}},
 {monotonic_time,{91916,93662}},
 {system_time,{94615,96249}},
 {os_system_time,{27194,27598}},
 {universal_time,{317353,340187}}]
5>

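The first thing to note is that only erlang:unique_integer/0,1 and ets:update_counter/3,4 generate unique values. Even erlang:monotonic_time/0 can return two identical timestamps! So if you want unique numbers, you have no option other than erlang:unique_integer/0,1. If you want unique monotonic timestamps, you can use {erlang:monotonic_time(), erlang:unique_integer()} (pass [monotonic] to unique_integer if tags with equal timestamps must also sort in creation order), or, if you don't need the time part, erlang:unique_integer([monotonic]). If you don't need the values to be monotonic and unique, the other options are available. So if you need unique monotonic numbers, there is only one good option, and that is erlang:unique_integer([monotonic]).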
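
A minimal sketch of the tagging approach described above. The module name tag_sketch and the functions new_tag/0 and demo/0 are hypothetical, and the sketch assumes the [monotonic] option is wanted on the integer part so that tags with equal timestamps still sort in creation order.

```erlang
-module(tag_sketch).
-export([new_tag/0, demo/0]).

%% Build a tag that is unique on this node. The time component alone may
%% repeat; the unique integer breaks ties, and [monotonic] makes ties
%% sort in creation order.
new_tag() ->
    {erlang:monotonic_time(), erlang:unique_integer([monotonic])}.

%% Create 1000 tags in one process and check that the list is already
%% strictly increasing (sorted, with no duplicates).
demo() ->
    Tags = [new_tag() || _ <- lists:seq(1, 1000)],
    Tags =:= lists:usort(Tags).
```

Tags built this way compare with the ordinary term order, so they can be used directly as keys in an ordered_set ETS table or sorted with lists:sort/1.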

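The second thing to note is that spawning two processes is not enough to test scalability. As you can see, with 20 processes the os_system_time results start to catch up with erlang:unique_integer/0,1. There is another problem: we both use hardware with only two CPU cores, which is too few to test scalability. Imagine how the results would look on hardware with 64 or more cores.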

Edit: Using {write_concurrency, true} improves ets:update_counter with 2 and 20 processes, but it is still far behind erlang:unique_integer/0,1.

2> test:bench(test:update_counter(),1).
{203830,213657}
3> test:bench(test:update_counter(),2).
{129148,140627}
4> test:bench(test:update_counter(),20).
{471858,501198}
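
For reference, a minimal sketch of an ETS-backed counter created with {write_concurrency, true}. This is not the code from the Gist; the module name ets_counter_sketch, the table name counter_sketch, and the key seq are hypothetical.

```erlang
-module(ets_counter_sketch).
-export([new/0, next/1]).

%% Create a public table tuned for concurrent writers and seed one counter.
new() ->
    Tab = ets:new(counter_sketch, [set, public, {write_concurrency, true}]),
    true = ets:insert(Tab, {seq, 0}),
    Tab.

%% Atomically add 1 to the counter and return the new value.
next(Tab) ->
    ets:update_counter(Tab, seq, 1).
```

Every caller updates the same key here, so the writers presumably still contend on that single entry, which may be part of why this approach trails erlang:unique_integer/0,1 as the process count grows.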

Answer 2

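According to the Erlang code base, erlang:unique_integer([monotonic]) is just an atomic integer that gets incremented, which is efficient and fast. It still incurs a memory barrier, but atomic operations are cheap compared to the traditional global-lock approach.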
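
The atomic counter itself lives inside the runtime, but its observable effect can be checked from Erlang. The following is a minimal sketch, with a hypothetical module unique_sketch and hypothetical functions collect/2 and all_distinct/2, that draws integers from several processes at once and verifies that no value is handed out twice.

```erlang
-module(unique_sketch).
-export([collect/2, all_distinct/2]).

%% Spawn Procs workers; each draws PerProc integers and sends them back.
collect(Procs, PerProc) ->
    Parent = self(),
    Pids = [spawn_link(fun() ->
                Drawn = [erlang:unique_integer([monotonic])
                         || _ <- lists:seq(1, PerProc)],
                Parent ! {ints, self(), Drawn}
            end) || _ <- lists:seq(1, Procs)],
    %% Gather one reply per worker and flatten into a single list.
    lists:append([receive {ints, Pid, Got} -> Got end || Pid <- Pids]).

%% True when every integer drawn across all workers is distinct.
all_distinct(Procs, PerProc) ->
    All = collect(Procs, PerProc),
    length(lists:usort(All)) =:= Procs * PerProc.
```

For example, unique_sketch:all_distinct(4, 1000) draws 4000 integers across four processes and compares the number of distinct values with the number drawn.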
