Is my Double-Checked Locking Pattern implementation right?

This is an example from Item 16 of Meyers' book Effective Modern C++.

In a class caching an expensive-to-compute int, you might try to use a pair of std::atomic variables instead of a mutex:

class Widget {
public:
    int magicValue() const {
        if (cacheValid) {
            return cachedValue;
        } else {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();

            cachedValue = val1 + val2;
            cacheValid = true;
            return cachedValue;
        }
    }
private:
    mutable std::atomic<bool> cacheValid { false };
    mutable std::atomic<int> cachedValue;
};

This will work, but sometimes it will work a lot harder than it should. Consider: a thread calls Widget::magicValue, sees cacheValid as false, performs the two expensive computations, and assigns their sum to cachedValue. At that point, a second thread calls Widget::magicValue, also sees cacheValid as false, and thus carries out the same expensive computations that the first thread has just finished.
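To make the duplicated work visible, here is a minimal sketch (the bodies of expensiveComputation1/expensiveComputation2 are hypothetical stand-ins that just sleep) in which two threads call magicValue() at about the same time; with this atomic-only variant both threads are likely to see cacheValid as false and both will run the computations:

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

// Hypothetical stand-ins for the expensive computations: each just sleeps.
static int expensiveComputation1() {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    return 1;
}
static int expensiveComputation2() {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    return 2;
}

class Widget {
public:
    int magicValue() const {
        if (cacheValid) {
            return cachedValue;
        } else {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();
            cachedValue = val1 + val2;
            cacheValid = true;
            return cachedValue;
        }
    }
private:
    mutable std::atomic<bool> cacheValid { false };
    mutable std::atomic<int> cachedValue;
};

int main() {
    Widget w;
    int r1 = 0, r2 = 0;
    // Both threads start before the cache is filled, so both are likely to
    // see cacheValid == false and to perform the expensive computations.
    std::thread t1([&] { r1 = w.magicValue(); });
    std::thread t2([&] { r2 = w.magicValue(); });
    t1.join();
    t2.join();
    std::cout << r1 << ' ' << r2 << '\n';
}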

He then gives a solution with a mutex:

class Widget {
public:
    int magicValue() const {
        std::lock_guard<std::mutex> guard(m);
        if (cacheValid) {
            return cachedValue;
        } else {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();

            cachedValue = val1 + val2;
            cacheValid = true;
            return cachedValue;
        }
    }
private:
    mutable std::mutex m;
    mutable bool cacheValid { false };
    mutable int cachedValue;
};

But I don't think that solution is very efficient, so I considered combining the mutex and the atomics into a Double-Checked Locking Pattern, as shown below.

class Widget {
public:
    int magicValue() const {
        if (!cacheValid)  {
            std::lock_guard<std::mutex> guard(m);
            if (!cacheValid) {
                auto val1 = expensiveComputation1();
                auto val2 = expensiveComputation2();

                cachedValue = val1 + val2;
                cacheValid = true;
            }
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable std::atomic<bool> cacheValid { false };
    mutable std::atomic<int> cachedValue;
};

Since I'm new to multithreaded programming, I'd like to know:

Is my code right?

Does it perform better?

Edit:


Fixed the code: if (!cachedValue) -> if (!cacheValid).

It is not correct:

int magicValue() const {
    if (!cachedValid)  {

        // This part is unprotected. What if a second thread evaluates
        // the previous test while this first one is here? It behaves
        // exactly like in the first example.

        std::lock_guard<std::mutex> guard(m);
        if (!cachedValue) {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();

            cachedValue = val1 + val2;
            cachedValid = true;
        }
    }
    return cachedValue;
}

As HappyCactus pointed out, the second check if (!cachedValue) should actually be if (!cachedValid). Apart from that typo, I think your demonstration of the Double-Checked Locking Pattern is correct. However, I think using std::atomic for cachedValue is unnecessary. The only place cachedValue is written is cachedValue = val1 + val2;. No thread can reach the statement return cachedValue; (the only place cachedValue is read) before that write has completed, so the write and the reads cannot be concurrent. Concurrent reads are not a problem either.

You can make your solution slightly more efficient by relaxing the memory-ordering requirements. The default sequentially consistent memory order of the atomic operations is not needed here.

The performance difference is negligible on x86, but noticeable on ARM, because sequentially consistent memory order is expensive on ARM. For details, see "Strong" and "weak" hardware memory models by Herb Sutter.

Suggested changes:

class Widget {
public:
    int magicValue() const {
        if (cacheValid.load(std::memory_order_acquire)) { // Acquire semantics.
            return cachedValue;
        } else {
            auto val1 = expensiveComputation1();
            auto val2 = expensiveComputation2();

            cachedValue = val1 + val2; // Non-atomic write.

            // Release semantics.
            // Prevents compiler and CPU store reordering.
            // Makes this and preceding stores by this thread visible to other threads.
            cacheValid.store(true, std::memory_order_release);
            return cachedValue;
        }
    }
private:
    mutable std::atomic<bool> cacheValid { false };
    mutable int cachedValue; // Non-atomic.
};

Is my code right?

Yes. Your application of the Double-Checked Locking Pattern is correct. But see below for some improvements.

Does it perform better?

Compared with the fully locked variant (the second one in your post), it generally performs better, except when magicValue() is called only once (and even in that case the performance loss is negligible).

Compared with the lock-free variant (the first one in your post), your code performs better unless computing the value is cheaper than waiting on the mutex.

For example, summing 10 values is (usually) faster than waiting on the mutex; in that case the first variant is preferable. On the other hand, reading 10 values from a file is slower than waiting on the mutex, so your variant is better than the first one.
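To check these trade-offs on a concrete machine, a rough micro-benchmark sketch along the following lines can be used; the thread and iteration counts are arbitrary, and expensiveComputation1/expensiveComputation2 are replaced here by trivial stand-ins so that the measurement is dominated by synchronization cost rather than by the computation itself:

#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

// Trivial stand-ins so the benchmark measures synchronization overhead.
static int expensiveComputation1() { return 20; }
static int expensiveComputation2() { return 22; }

// Fully locked variant (the second one in the post).
class LockedWidget {
public:
    int magicValue() const {
        std::lock_guard<std::mutex> guard(m);
        if (!cacheValid) {
            cachedValue = expensiveComputation1() + expensiveComputation2();
            cacheValid = true;
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable bool cacheValid { false };
    mutable int cachedValue;
};

// Double-checked locking variant (the third one in the post).
class DclpWidget {
public:
    int magicValue() const {
        if (!cacheValid.load(std::memory_order_acquire)) {
            std::lock_guard<std::mutex> guard(m);
            if (!cacheValid.load(std::memory_order_relaxed)) {
                cachedValue = expensiveComputation1() + expensiveComputation2();
                cacheValid.store(true, std::memory_order_release);
            }
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable std::atomic<bool> cacheValid { false };
    mutable int cachedValue;
};

// Runs `iterations` calls to magicValue() in each of `threads` threads and
// returns the elapsed wall-clock time in microseconds.
template <typename WidgetT>
long long measure(int threads, int iterations) {
    WidgetT w;
    std::vector<std::thread> pool;
    auto start = std::chrono::steady_clock::now();
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&w, iterations] {
            volatile int sink = 0;
            for (int i = 0; i < iterations; ++i)
                sink = w.magicValue();
            (void)sink;
        });
    }
    for (auto& th : pool) th.join();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}

int main() {
    std::cout << "fully locked: " << measure<LockedWidget>(4, 1000000) << " us\n";
    std::cout << "DCLP:         " << measure<DclpWidget>(4, 1000000) << " us\n";
}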


Actually, there are a few simple improvements to your code that make it faster (at least on some machines) and easier to understand:

  1. The cachedValue variable doesn't need atomic semantics at all. It is protected by the cacheValid flag, whose atomicity does all the work. Moreover, a single atomic flag can protect several non-atomic values.

  2. Also, as noted in the answer above, you don't need sequentially consistent ordering (which is applied by default when you simply read or write an atomic variable) when accessing the cacheValid flag; release-acquire ordering is sufficient.


class Widget {
public:
    int magicValue() const {
        //'Acquire' semantic when read flag.
        if (!cacheValid.load(std::memory_order_acquire))  { 
            std::lock_guard<std::mutex> guard(m);
            // Reading flag under mutex locked doesn't require any memory order.
            if (!cacheValid.load(std::memory_order_relaxed)) {
                auto val1 = expensiveComputation1();
                auto val2 = expensiveComputation2();

                cachedValue = val1 + val2;
                // 'Release' semantic when write flag
                cacheValid.store(true, std::memory_order_release);
            }
        }
        return cachedValue;
    }
private:
    mutable std::mutex m;
    mutable std::atomic<bool> cacheValid { false };
    mutable int cachedValue; // Atomic isn't needed here.
};