Why does Visual Studio compile this function correctly without optimisation, but incorrectly with optimisation?

Question

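I'm experimenting with y-combinator-like lambda wrapping (although I know they aren't y-combinators in the strict sense), and I've run into a very odd problem. My code behaves exactly as I expect in the Debug configuration (optimisations turned off), but skips large (and important!) bits in Release (set to Optimizations (Favor Speed) (/Ox)).

Please note that the insides of the lambda functions are basically irrelevant; they are only there to make sure the wrapper can recurse correctly, and so on.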

// main.cpp
#include <iostream>
#include <string>
#include <utility> // std::forward
#define uint unsigned int

// Defines a y-combinator-style thing to do recursive things. Includes a system where the lambda can declare itself to be obsolete.
// Yes, it's hacky and ugly. Don't worry about it, this is all just testing functionality.
template <class F>
class YCombinator {
public:
    F m_f; // the lambda will be stored here
    bool m_selfDestructing = false; //!< Whether the combinator will self-destruct should its lambda mark itself as no longer useful.
    bool m_selfDestructTrigger = false; //!< Whether the combinator's lambda has marked itself as no longer useful.

    // a forwarding operator:
    template <class... Args>
    decltype(auto) evaluate(Args&&... args) {
        // Avoid storing return if we can, 
        if (!m_selfDestructing) {
            // Pass itself to m_f, then the arguments.
            return m_f(*this, std::forward<Args>(args)...);
        }
        else {
            // Pass itself to m_f, then the arguments.
            auto r = m_f(*this, std::forward<Args>(args)...);
            // self-destruct if necessary, allowing lambdas to delete themselves if they know they're no longer useful.
            if (m_selfDestructTrigger) {
                delete this;
            }
            return r;
        }
    }
};
template <class F> YCombinator(F, bool sd)->YCombinator<F>;

// Tests some instances.
int main() {
    // Most basic test
    auto a = YCombinator{
        [](auto & self, uint in)->uint{
            uint out = in;
            for (uint i = 1u; i < in; ++i) {
                out += self.evaluate(i);
            }
            return out;
        },
        false
    };

    // Same as a, but checks it works as a pointer.
    auto b = new YCombinator{
        [](auto & self, uint in)->uint {
            uint out = in;
            for (uint i = 0u; i < in; ++i) {
                out += self.evaluate(i);
            }

            return out;
        },
        false
    };

    // c elided for simplicity

    // Checks the self-deletion mechanism
    auto d = new YCombinator{
        [&a, b](auto & self, uint in)->uint {
            std::cout << "Running d(" << in << ") [SD-" << self.m_selfDestructing << "]..." << std::endl;

            uint outA = a.evaluate(in);
            uint outB = b->evaluate(in);

            if (outA == outB)
                std::cout << "d(" << in << ") [SD-" << self.m_selfDestructing << "] confirmed both a and b produced the same output of " << outA << "." << std::endl;

            self.m_selfDestructTrigger = true;

            return outA;
        },
        true
    };

    uint resultA = a.evaluate(4u);
    std::cout << "Final result: a(4) = " << resultA << "." << std::endl << std::endl;

    uint resultB = (*b).evaluate(5u);
    std::cout << "Final result: b(5) = " << resultB << "." << std::endl << std::endl;

    uint resultD = d->evaluate(2u);
    std::cout << "Final result: d(2) = " << resultD << "." << std::endl << std::endl;

    resultD = d->evaluate(2u);
    std::cout << "Final result: d(2) = " << resultD << "." << std::endl << std::endl;
}

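What should happen is that the first evaluation of d works fine, sets d.m_selfDestructTrigger, and causes d to delete itself. The second evaluation of d should then crash, because d no longer really exists. That is exactly what happens in the Debug configuration. (Note: as @largest_prime_is_463035818 points out below, it shouldn't so much crash as run into undefined behaviour.)

In the Release configuration, however, as far as I can tell, all of the code in evaluate is skipped entirely and execution jumps straight to the lambda. Obviously, breakpoints in optimised code are a little suspect, but that appears to be what is happening. I've tried rebuilding the project, with no luck; VS seems pretty adamant about it.

Am I crazy? Have I missed something? Or is this an actual bug in VS (or even the compiler)? Any help in determining whether this is a code issue or a tool issue would be greatly appreciated.

Note: I'm on VS2019 16.8.3, using the /std:c++latest feature set.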

Answer 1

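Undefined behaviour is a non-local phenomenon. If your program encounters UB, the behaviour of the program as a whole is undefined, not merely the little part of it that did the bad thing.

As such, it is possible for UB to "time travel", affecting code that in theory should have executed correctly before the UB was reached. That is, there is no "correctly" in a program that exhibits UB; either the program is correct, or it is incorrect.

How far that goes depends on the implementation, but as far as the standard is concerned, VS's behaviour is consistent with the standard.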
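
To illustrate the point, here is a minimal sketch that is not taken from the question; both function names are made up for this example. Dereferencing a null pointer is UB, so an optimiser is allowed to assume that p is never null in the first function and may drop the check that follows the dereference. That is one way code that looks safe on paper can end up "skipped" in an optimised build. Whether a particular compiler actually does this depends on the implementation.

```cpp
// Hypothetical example: the null check comes too late to help.
int readThenCheck(int* p) {
    int value = *p;        // undefined behaviour if p is null
    if (p == nullptr) {    // the compiler may assume this is never true and remove it
        return -1;
    }
    return value;
}

// Hypothetical fix: test the pointer before it is ever dereferenced.
int checkThenRead(int* p) {
    if (p == nullptr) {
        return -1;
    }
    return *p;
}
```

With a valid pointer both functions return the pointed-to value; only checkThenRead has defined behaviour when it is handed a null pointer.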

Answer 2

The problem

Regardless of the optimisation options, the delete this in your code is called in both cases.

        if (m_selfDestructTrigger) {
            delete this;
        }

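In your code, the "d" object is deleted, but then you call evaluate() on it again, which causes an access violation because you are using heap memory that has already been freed.

In my case it gave an access violation error in both the Release and Debug configurations, but in your case the access violation may not occur under optimisation, for the following reason.

There can be cases, like yours, where using freed heap memory does not cause an error and you get the impression that the program works fine (as in the optimised or Release configuration), because the freed memory has not been cleared and still holds the old object.

This is not a compiler bug; it is caused by the way you delete the object.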

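It is generally bad style for objects to delete themselves, because you may end up referring to a deleted object, as in your case. There is a discussion on whether objects should delete themselves.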
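
One possible alternative, sketched below under my own assumptions, is to let the wrapper only report that its lambda wants to be retired and leave the actual destruction to whoever owns it. The names RetirableCombinator, retire, retired and evaluateAndMaybeRelease are hypothetical and do not appear in the question. Ownership sits in a std::unique_ptr, so after the object is released the owner holds a null pointer that can be tested, instead of a dangling one.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical variant of the question's YCombinator: it never deletes itself,
// it only records that its lambda asked to be retired.
template <class F>
class RetirableCombinator {
public:
    explicit RetirableCombinator(F f) : m_f(std::move(f)) {}

    // Pass the wrapper itself to the lambda, then the arguments.
    template <class... Args>
    decltype(auto) evaluate(Args&&... args) {
        return m_f(*this, std::forward<Args>(args)...);
    }

    void retire() { m_retired = true; }          // called by the lambda
    bool retired() const { return m_retired; }   // checked by the owner

private:
    F m_f;
    bool m_retired = false;
};

// Hypothetical helper: evaluates through the owning pointer, then lets the
// owner (not the object) free the wrapper if the lambda asked for it.
template <class F, class... Args>
auto evaluateAndMaybeRelease(std::unique_ptr<RetirableCombinator<F>>& owner,
                             Args&&... args) {
    assert(owner && "evaluating a combinator that was already released");
    auto result = owner->evaluate(std::forward<Args>(args)...);
    if (owner->retired()) {
        owner.reset();  // owner becomes null, so later misuse is detectable
    }
    return result;
}
```

A caller would create the wrapper with std::make_unique, call evaluateAndMaybeRelease(owner, 2u), and check owner for null before evaluating again. The second evaluation from the question then becomes a checked condition rather than a use of freed memory.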

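If you comment out the "delete" line, your code will run without the access violation. If you still suspect that it may be a compiler bug and that "execution jumps straight to the lambda", you can use a simpler approach to debug the application: avoid "delete" and output some text from the code blocks that you suspect the compiler is skipping.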
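
A rough sketch of that instrumentation idea follows. It is an assumption of mine about how one might do it, not code from the question: TracedCombinator and the m_log member are hypothetical, the branching mirrors the question's evaluate, and the delete is replaced by a log line so that running it never touches freed memory.

```cpp
#include <ostream>
#include <sstream>  // std::ostringstream, handy for capturing the trace
#include <string>
#include <utility>

// Hypothetical instrumented copy of the question's wrapper: same branches as
// evaluate, but `delete this` is replaced by a message on m_log.
template <class F>
class TracedCombinator {
public:
    TracedCombinator(F f, bool selfDestructing, std::ostream& log)
        : m_f(std::move(f)), m_selfDestructing(selfDestructing), m_log(log) {}

    F m_f;
    bool m_selfDestructing;
    bool m_selfDestructTrigger = false;
    std::ostream& m_log;  // where the trace goes, e.g. std::cout

    template <class... Args>
    decltype(auto) evaluate(Args&&... args) {
        if (!m_selfDestructing) {
            m_log << "[evaluate] plain branch\n";
            return m_f(*this, std::forward<Args>(args)...);
        }
        else {
            m_log << "[evaluate] self-destruct branch\n";
            auto r = m_f(*this, std::forward<Args>(args)...);
            if (m_selfDestructTrigger) {
                // The original code deletes the object here; only report it.
                m_log << "[evaluate] would delete this here\n";
            }
            return r;
        }
    }
};
```

If these lines show up in the output of both the Debug and the Release build, the body of evaluate is not being skipped, whatever the breakpoints in the optimised build seem to suggest.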

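The solution

You can use other compilers, in particular clang with sanitizers, to make sure that this is not a bug in the Microsoft Visual Studio compiler you are using.

For example, use: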

clang++.exe -std=c++20 -fsanitize=address calc.cpp

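and run the resulting executable.

In this example, your code is compiled with "Address Sanitizer", which is a memory error detector supported by this compiler. Using various sanitizers may help you debug your C/C++ programs in the future.

You will get an error like this, showing that you are using heap memory after it has been freed: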

=================================================================
==48820==ERROR: AddressSanitizer: heap-use-after-free on address 0x119409fa0380 at pc 0x7ff799c91d6c bp 0x004251cff720 sp 0x004251cff768
READ of size 1 at 0x119409fa0380 thread T0
    #0 0x7ff799c91d6b in main+0xd6b (c:\calc\clang\calc.exe+0x140001d6b)
    #1 0x7ff799c917de in main+0x7de (c:\calc\clang\calc.exe+0x1400017de)
    #2 0x7ff799cf799f in __scrt_common_main_seh d:\agent\_work\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #3 0x7ffe3cff53fd in BaseThreadInitThunk+0x1d (C:\WINDOWS\System32\KERNEL32.DLL+0x1800153fd)
    #4 0x7ffe3ddc590a in RtlUserThreadStart+0x2a (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18006590a)

0x119409fa0380 is located 16 bytes inside of 24-byte region [0x119409fa0370,0x119409fa0388)
freed by thread T0 here:
    #0 0x7ff799cf6684 in operator delete C:\src\llvm_package_6923b0a7\llvm-project\compiler-rt\lib\asan\asan_new_delete.cpp:160
    #1 0x7ff799c91ede in main+0xede (c:\calc\clang\calc.exe+0x140001ede)
    #2 0x7ff799c916e4 in main+0x6e4 (c:\calc\clang\calc.exe+0x1400016e4)
    #3 0x7ff799cf799f in __scrt_common_main_seh d:\agent\_work\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #4 0x7ffe3cff53fd in BaseThreadInitThunk+0x1d (C:\WINDOWS\System32\KERNEL32.DLL+0x1800153fd)
    #5 0x7ffe3ddc590a in RtlUserThreadStart+0x2a (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18006590a)

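The proof

You can also use the following batch file to compare the output of two clang builds, one optimised and one not, and see that they produce the same results: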

clang++ -std=c++20 -O3 -o calc-O3.exe calc.cpp
clang++ -std=c++20 -O0 -o calc-O0.exe calc.cpp
calc-O3.exe > calc-O3.txt
calc-O0.exe > calc-O0.txt
fc calc-O3.txt calc-O0.txt

It will give the following:

Comparing files calc-O3.txt and calc-O0.txt
FC: no differences encountered

For the Microsoft Visual Studio compiler, use the following batch file:

cl.exe /std:c++latest /O2 /Fe:calc-O3.exe calc.cpp
cl.exe /std:c++latest /Od /Fe:calc-O0.exe calc.cpp
calc-O3.exe > calc-O3.txt
calc-O0.exe > calc-O0.txt
fc calc-O3.txt calc-O0.txt

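It also produces the same results, so the code runs the same regardless of optimisation (rather than "all of the code in evaluate is skipped entirely", as you wrote). You have probably debugged it incorrectly because of the optimisations.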
