Golang lock-free array with swap

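I have a two-slot array that has to swap between slots whenever the producer sets it, and always return a valid slot to the consumer. As far as the atomic-operation logic goes, I cannot imagine a situation in which two goroutines write to the same array slot, but the race detector thinks otherwise. Can anyone explain where the mistake is?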

type checkConfig struct {
    timeout   time.Time
}

type checkConfigVersions struct {
    config [2]*checkConfig
    reader uint32
    writer uint32
}

func (c *checkConfigVersions) get() *checkConfig {
    return c.config[atomic.LoadUint32(&c.reader)]
}

func (c *checkConfigVersions) set(new *checkConfig) {
    for {
        reader := atomic.LoadUint32(&c.reader)
        writer := atomic.LoadUint32(&c.writer)
        switch diff := reader ^ writer; {
        case diff == 0:
            runtime.Gosched()
        case diff == 1:
            if atomic.CompareAndSwapUint32(&c.writer, writer, (writer+1)&1) {
                c.config[writer] = new
                atomic.StoreUint32(&c.reader, writer)
                return
            }
        }
    }
}

The data race is reported at c.config[writer] = new, but as far as I can see that should be impossible.

func main() {
    runtime.GOMAXPROCS(runtime.NumCPU())

    var wg sync.WaitGroup

    ccv := &checkConfigVersions{reader: 0, writer: 1}

    for i := 0; i < runtime.NumCPU(); i++ {
        wg.Add(100)
        go func(i int) {
            for j := 0; j < 100; j++ {
                ccv.set(&checkConfig{})
                wg.Done()
            }
        }(i)
    }

    wg.Wait()
    fmt.Println(ccv.get())
}

Data race detector output:

==================
WARNING: DATA RACE
Write at 0x00c42009a020 by goroutine 12:
  main.(*checkConfigVersions).set()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:118 +0xd9
  main.main.func1()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:42 +0x60

Previous write at 0x00c42009a020 by goroutine 11:
  main.(*checkConfigVersions).set()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:118 +0xd9
  main.main.func1()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:42 +0x60

Goroutine 12 (running) created at:
  main.main()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:40 +0x159

Goroutine 11 (running) created at:
  main.main()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:40 +0x159
==================

If you also try to read it with ccv.get(), you hit another race, this time between a read and a write on the same array slot...

I can change your code to check c.writer right after it has been written:

if atomic.CompareAndSwapUint32(&c.writer, writer, (writer+1)&1) {
  newWriter := atomic.LoadUint32(&c.writer)
  if newWriter != (writer+1)&1 {
    panic(fmt.Errorf("wrote %d, but writer is %d", (writer+1)&1, newWriter))
  }
  //c.config[writer] = new
  atomic.StoreUint32(&c.reader, writer)
  return
}

With GOMAXPROCS (and the number of goroutines) manually set to 3, I get the following after a few runs:

$ go run main.go  
panic: wrote 1, but writer is 0

goroutine 6 [running]:
main.(*checkConfigVersions).set(0xc42000a060, 0x1, 0xc4200367a0)
    main.go:36 +0x16d
main.main.func1(0xc42000a060, 0xc420018110, 0x1)
    main.go:58 +0x63
created by main.main
    main.go:56 +0xd6
exit status 2

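The main reason is that GOMAXPROCS sets the number of OS threads that can execute Go code at the same time. Those threads are not controlled by Go (Go only schedules goroutines onto OS threads); the OS is free to schedule them however it sees fit.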

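This means that one of the goroutines can do CAS(1, 1, 0) and then be suspended by the OS. During that time, another goroutine can get through CAS(0, 0, 1), which is possible because reader and writer are loaded in two separate steps, so a goroutine may act on a stale reader value. That allows a third goroutine to do CAS(1, 1, 0) and carry on at the same moment the original goroutine is scheduled again. Bang! Both of them try to write to the same config[writer].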
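To make that sequence concrete, here is a single-threaded replay of one interleaving that leads to the collision. It is only a sketch: the state type, tryClaim, and replay are hypothetical helpers that imitate the loads and the CAS in set() with plain variables, and the goroutine names A, B and C are made up for illustration.

```go
package main

import "fmt"

// state mirrors the reader and writer fields of checkConfigVersions.
// Plain fields are enough here because the replay is single-threaded.
type state struct {
	reader, writer uint32
}

// tryClaim mirrors the "diff == 1" branch of set(): given the reader and
// writer values a goroutine loaded earlier, it performs the CAS on writer
// and reports whether the goroutine now believes it owns slot `writer`.
func (s *state) tryClaim(reader, writer uint32) bool {
	if reader^writer != 1 {
		return false // set() would call runtime.Gosched() and retry
	}
	if s.writer != writer {
		return false // the CAS would fail
	}
	s.writer = (writer + 1) & 1
	return true
}

// replay steps goroutines A, B and C through one hypothetical interleaving
// and returns the slots that A and C are about to write at the same time.
func replay() (slotA, slotC uint32, ok bool) {
	s := &state{reader: 0, writer: 1}

	// 1. A runs a complete set(): claims slot 1, then publishes it.
	if !s.tryClaim(s.reader, s.writer) {
		return 0, 0, false
	}
	s.reader = 1 // state: reader=1, writer=0

	// 2. B loads reader (1) and is descheduled before loading writer.
	bReader := s.reader

	// 3. C runs a complete set(): claims slot 0, then publishes it.
	if !s.tryClaim(s.reader, s.writer) {
		return 0, 0, false
	}
	s.reader = 0 // state: reader=0, writer=1

	// 4. A starts a second set(): CAS(1 -> 0) succeeds, and A is
	//    descheduled just before writing c.config[1].
	slotA = s.writer
	if !s.tryClaim(s.reader, slotA) {
		return 0, 0, false
	}

	// 5. B wakes up and loads writer (0). Its stale reader (1) makes
	//    diff == 1, so CAS(0 -> 1) succeeds although A is still mid-update.
	if !s.tryClaim(bReader, s.writer) {
		return 0, 0, false
	}

	// 6. C starts another set(): CAS(1 -> 0) succeeds, so C is also about
	//    to write c.config[1].
	slotC = s.writer
	if !s.tryClaim(s.reader, slotC) {
		return 0, 0, false
	}
	return slotA, slotC, true
}

func main() {
	a, c, ok := replay()
	fmt.Println(a, c, ok) // prints "1 1 true": A and C hold the same slot
}
```

In step 5, B is about to write config[0] while reader is still 0, which is also how a concurrent get() can end up racing with a write.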

Atomics are great when you want to avoid a mutex, but they should not be used to perform synchronization in a case like this.
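For contrast, this is the kind of case where an atomic is enough on its own. The sketch assumes the consumer only ever needs the most recently published *checkConfig, which may or may not match your real requirements. It also assumes Go 1.19 or later for atomic.Pointer, and configHolder is a hypothetical name, not something from your code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

type checkConfig struct {
	timeout time.Time
}

// configHolder (hypothetical) publishes a single pointer atomically, so
// there is no second variable that has to stay in step with it.
type configHolder struct {
	current atomic.Pointer[checkConfig]
}

// get returns the most recently stored config, or nil if none was stored.
func (h *configHolder) get() *checkConfig {
	return h.current.Load()
}

// set replaces the published config in one atomic store.
func (h *configHolder) set(cfg *checkConfig) {
	h.current.Store(cfg)
}

func main() {
	h := &configHolder{}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100; j++ {
				h.set(&checkConfig{timeout: time.Now()})
				_ = h.get()
			}
		}()
	}
	wg.Wait()

	fmt.Println(h.get() != nil) // prints "true"
}
```

The difference from the two-counter scheme is that the whole published state lives in one word, so no goroutine can ever see half of an update.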

I'm not quite sure what your original scenario is, but I'm fairly confident that using a mutex will save you a lot of pain.
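As a rough illustration of what I mean, here is a minimal sketch of the same two-slot structure guarded by a sync.RWMutex. It assumes the writer field is only needed to pick the other slot, so it is dropped here. That is my guess at the intent, not something stated in the question.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type checkConfig struct {
	timeout time.Time
}

type checkConfigVersions struct {
	mu     sync.RWMutex
	config [2]*checkConfig
	reader uint32 // index of the slot that get() returns
}

// get returns the currently published slot under a read lock.
func (c *checkConfigVersions) get() *checkConfig {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.config[c.reader]
}

// set fills the slot that is not being read, then flips reader to it.
// The write lock makes both steps one indivisible update.
func (c *checkConfigVersions) set(cfg *checkConfig) {
	c.mu.Lock()
	defer c.mu.Unlock()
	writer := (c.reader + 1) & 1
	c.config[writer] = cfg
	c.reader = writer
}

func main() {
	ccv := &checkConfigVersions{}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100; j++ {
				ccv.set(&checkConfig{})
			}
		}()
	}
	wg.Wait()

	fmt.Println(ccv.get() != nil) // prints "true"
}
```

Running this with go run -race should stay quiet, because every access to config and reader happens while the lock is held.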