What happens when reading or writing concurrently without a mutex

Question

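In Go, a sync.Mutex or a chan is used to prevent concurrent access to shared objects. However, in some cases I am only interested in the "latest" value of a variable or of a field of an object. Or I want to write a value and do not care whether another goroutine overwrites it later or has just overwritten it.

Update: TL;DR: Just don't do this. It is not safe. Read the answers, comments, and linked documents!

Update 2021: The Go memory model is going to be specified more thoroughly, and there are three great articles by Russ Cox that will teach you more about the surprising effects of unsynchronized memory access. These articles summarize many of the discussions and learnings below.

Here are two variants, good and bad, of an example program. Both seem to produce "correct" output with the current Go runtime: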

package main

import (
    "flag"
    "fmt"
    "math/rand"
    "time"
)

var bogus = flag.Bool("bogus", false, "use bogus code")

func pause() {
    time.Sleep(time.Duration(rand.Uint32()%100) * time.Millisecond)
}

func bad() {
    stop := time.After(100 * time.Millisecond)
    var name string

    // start some producers doing concurrent writes (DANGER!)
    for i := 0; i < 10; i++ {
        go func(i int) {
            pause()
            name = fmt.Sprintf("name = %d", i)
        }(i)
    }

    // start consumer that shows the current value every 10ms
    go func() {
        tick := time.Tick(10 * time.Millisecond)
        for {
            select {
            case <-stop:
                return
            case <-tick:
                fmt.Println("read:", name)
            }
        }
    }()

    <-stop
}

func good() {
    stop := time.After(100 * time.Millisecond)
    names := make(chan string, 10)

    // start some producers concurrently writing to a channel (GOOD!)
    for i := 0; i < 10; i++ {
        go func(i int) {
            pause()
            names <- fmt.Sprintf("name = %d", i)
        }(i)
    }

    // start consumer that shows the current value every 10ms
    go func() {
        tick := time.Tick(10 * time.Millisecond)
        var name string
        for {
            select {
            case name = <-names:
            case <-stop:
                return
            case <-tick:
                fmt.Println("read:", name)
            }
        }
    }()

    <-stop
}

func main() {
    flag.Parse()
    if *bogus {
        bad()
    } else {
        good()
    }
}

The expected output looks like this:

...
read: name = 3
read: name = 3
read: name = 5
read: name = 4
...

Any combination of read: and read: name = [0-9] is correct output for this program. Receiving any other string as output would be an error.

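When this program is run with go run --race bogus.go, it is safe: the race detector reports nothing.

However, go run --race bogus.go -bogus warns about concurrent reads and writes.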

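For map types and when appending to slices, I always need a mutex or a similar method of protection to avoid segfaults or unexpected behavior. However, reading and writing literals (atomic values) to variables or field values seems to be safe.

Question: Which Go data types can I safely read and safely write concurrently without a mutex, without producing segfaults, and without reading garbage from memory?

Please explain in your answer why something is safe or unsafe in Go.

Update: I rewrote the example to better reflect the original code, where I had the concurrent-writes issue. The important learnings are already in the comments. I will accept an answer that summarizes these learnings in enough detail (especially regarding the Go runtime).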

Answer 1

Which Go data types can I safely read and safely write concurrently without a mutex and without producing segfaults and without reading garbage from memory?

None.

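It really is that simple: you cannot, under any circumstances, read and write anything concurrently in Go.

(By the way: your "correct" program is not correct. It is racy, and even if you got rid of the race condition, it would not produce its output deterministically.)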

Answer 2

Why not use channels?

package main

import (
    "fmt"
    "sync"
)

func main() {

    var wg sync.WaitGroup // wait group to close channel
    var buffer int = 1    // buffer of the channel

    // channel to get the share data
    cName := make(chan string, buffer)
    for i := 0; i < 10; i++ {
        wg.Add(1) // add to wait group
        go func(i int) {
            cName <- fmt.Sprintf("name = %d", i)
            wg.Done() // decrease wait group.
        }(i)

    }

    go func() {
        wg.Wait() // wait of wait group to be 0
        close(cName) // close the channel
    }()

    // process all the data
    for n := range cName {
        fmt.Println("read:", n)
    }

}

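The code above produces output such as the following (the order can differ from run to run, because the goroutines are not scheduled in a fixed order):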

read: name = 0
read: name = 5
read: name = 1
read: name = 2
read: name = 3
read: name = 4
read: name = 7
read: name = 6
read: name = 8
read: name = 9

https://play.golang.org/p/R4n9ssPMOeS

Article about channels

Answer 3

However, in some cases I am just interested in the latest value of a variable or field of an object.

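This is the fundamental problem: what does the word "latest" mean?

Suppose, mathematically speaking, that we have a sequence of values X_i, with 0 <= i < N. Then obviously X_j is "later than" X_i if j > i. This is a nice, simple definition of "latest" and is probably the one you want.

But when two separate CPUs within a single machine (including two goroutines in a Go program) are working at the same time, time itself loses meaning. We cannot say whether i < j, i == j, or i > j. So there is no correct definition for the word "latest".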

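To solve this kind of problem, modern CPU hardware, and Go as a programming language, give us certain synchronization primitives. If CPUs A and B execute memory fence instructions, or synchronization instructions, or use whatever other hardware provisions exist, the CPUs (and/or some external hardware) will insert whatever is required for the notion of "time" to regain its meaning. That is, if the CPU uses barrier instructions, we can say that a memory load or store executed before the barrier is a "before", and a memory load or store executed after the barrier is an "after".

(The actual implementation, in some modern hardware, consists of load and store buffers that can rearrange the order in which loads and stores go to memory. The barrier instruction either synchronizes the buffers or places an actual barrier in them, so that loads and stores cannot move across the barrier. This particular concrete implementation gives an easy way to think about the problem, but it is not complete: you should think of time as simply not existing outside the hardware-provided synchronization, i.e., all loads from, and stores to, some location happen simultaneously rather than in some sequential order, except for these barriers.)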

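In any case, Go's sync package gives you a simple, high-level access method to these kinds of barriers. Compiled code that executes before a mutex Lock call really does complete before the lock function returns, and code that executes after the call really does not start until after the lock function returns.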
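As a minimal sketch of that guarantee applied to the question's "latest value" case, the following wraps the shared string in a small type guarded by a sync.Mutex. The type name latestName and its Set and Get methods are hypothetical and do not come from the original post.

```go
package main

import (
	"fmt"
	"sync"
)

// latestName is a hypothetical helper (not from the original post). It guards
// one string with a sync.Mutex so that reads and writes never overlap.
type latestName struct {
	mu   sync.Mutex
	name string
}

// Set stores a new value while holding the lock.
func (l *latestName) Set(s string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.name = s
}

// Get returns the most recently stored value while holding the lock.
func (l *latestName) Get() string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.name
}

func main() {
	var l latestName
	var wg sync.WaitGroup

	// Ten producers write concurrently; the mutex serializes the writes.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			l.Set(fmt.Sprintf("name = %d", i))
		}(i)
	}
	wg.Wait()

	// Which producer wrote last is not deterministic, but the value is
	// always one complete string written by exactly one producer.
	fmt.Println("read:", l.Get())
}
```

Run with the race detector enabled, this version should not report a data race, since every access to name happens while the lock is held. The mutex still does not decide which write counts as the "latest"; it only ensures that each read observes one whole write.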

Go's channels provide the same kinds of before/after time guarantees.
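A small sketch of what that guarantee buys, assuming a hypothetical function produce that is not part of the original post: a goroutine writes a plain variable and then closes a channel, and the caller reads the variable only after receiving from that channel. The Go memory model states that closing a channel is synchronized before a receive that returns because the channel is closed, so the read is ordered after the write.

```go
package main

import "fmt"

// produce is a hypothetical example function. It writes name in another
// goroutine and uses a channel to order that write before the read.
func produce() string {
	var name string
	done := make(chan struct{})

	go func() {
		name = "name = 1" // plain write, no mutex
		close(done)       // ordered before the receive below completes
	}()

	<-done      // returns only after the channel has been closed
	return name // therefore observes the write above
}

func main() {
	fmt.Println("read:", produce()) // prints "read: name = 1"
}
```

Without the receive on done, the read of name would race with the write, which is the situation in the question's bad variant.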

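Go's sync/atomic package provides much lower-level guarantees. In general, you should avoid it in favor of the higher-level channel or sync.Mutex style guarantees. (Edit to add note: you could use sync/atomic's Pointer operations here, but not with the string type directly, because Go strings are actually implemented as a header containing two separate values: a pointer and a length. You could solve this with another layer of indirection, by updating a pointer that points to the string object. But before you even consider doing that, you should benchmark the language's preferred methods and verify that they really are a problem, because code that works at the sync/atomic level is hard to write and hard to debug.)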
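For illustration only, the extra layer of indirection described in that note might look like the sketch below. It assumes Go 1.19 or later, where the generic atomic.Pointer type is available (older code would use atomic.Value or the unsafe.Pointer functions instead). The names latest, setName, and getName are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// latest holds a pointer to a string. Only the single-word pointer is
// swapped atomically; the two-word string header it points to is never
// modified after it has been published.
var latest atomic.Pointer[string]

// setName publishes a new string by storing a pointer to a fresh copy.
func setName(s string) {
	latest.Store(&s)
}

// getName loads the current pointer and dereferences it. It returns the
// empty string if nothing has been stored yet.
func getName() string {
	if p := latest.Load(); p != nil {
		return *p
	}
	return ""
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			setName(fmt.Sprintf("name = %d", i))
		}(i)
	}
	wg.Wait()

	// The winner is not deterministic, but the result is always one
	// complete string, never a mix of one pointer and another length.
	fmt.Println("read:", getName())
}
```

This is a sketch under the stated assumptions, not a recommendation: as the note above says, the mutex or channel version should be measured first.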
