Is Python memory-safe?
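With Deno emerging as the new Node.js rival, a lot of news articles have mentioned the memory-safe nature of Rust. One piece in particular stated that Rust and Go are good for their memory safety, as are Swift and Kotlin, but that the latter two are not widely used for systems programming.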
Safe Rust is the true Rust programming language. If all you do is write Safe Rust, you will never have to worry about type-safety or memory-safety. You will never endure a dangling pointer, a use-after-free, or any other kind of Undefined Behavior.
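This piqued my interest in understanding whether Python can be regarded as memory-safe and, whether the answer is yes or no, how safe or unsafe it is.
From the outset, the Wikipedia article on memory safety does not even mention Python, and the article on Python seems to mention only memory management.
The closest I have come to finding an answer is this: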
The wikipedia article associates type-safe to memory-safe, meaning, that the same memory area cannot be accessed as e.g. integer and string. In this way Python is type-safe. You cannot change the type of a object implicitly.
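But even then, this only seems to imply a connection between the two aspects (using an association from Wikipedia, which is itself debatable), and it gives no definitive answer on whether Python can be regarded as memory-safe.
Wikipedia lists the following examples of memory safety issues: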
Access errors: invalid read/write of a pointer
Buffer overflow - out-of-bound writes can corrupt the content of adjacent objects, or internal data (like bookkeeping information for the heap) or return addresses.
Buffer over-read - out-of-bound reads can reveal sensitive data or help attackers bypass address space layout randomization.
Python at least tries to prevent these.
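As a small illustration of that point, the sketch below shows an out-of-range read and write on a list being rejected with `IndexError` instead of touching neighbouring memory. The helper names `read_at` and `write_at` are made up for this example. Note that negative indices are valid in Python and count from the end, so this protects memory, not program logic.

```python
def read_at(items, index):
    """Return items[index], or None if the index is out of range."""
    try:
        return items[index]
    except IndexError:
        return None


def write_at(items, index, value):
    """Store value at items[index]; return False if the index is out of range."""
    try:
        items[index] = value
        return True
    except IndexError:
        return False


buffer = [10, 20, 30]
print(read_at(buffer, 1))       # in range
print(read_at(buffer, 99))      # out of range: rejected, no over-read
print(write_at(buffer, 99, 0))  # out of range: rejected, no overflow
```

The list is left unchanged by the failed write, which is the behaviour the quoted buffer overflow entry is concerned with.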
Race condition - concurrent reads/writes to shared memory
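That is actually not hard to do in languages with mutable data structures. (Advocates of functional programming and immutable data structures often use this fact as an argument in their favor.)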
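A minimal sketch of such a race in Python, assuming a hypothetical `Counter` class shared between threads. The unsynchronized read-modify-write can lose updates, although whether it actually does on a given run depends on the interpreter and on thread scheduling. As far as I know, in CPython this corrupts the program's own state rather than the interpreter's memory.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # Read-modify-write without a lock: two threads can read the same
        # old value, and one of the two updates is then lost.
        current = self.value
        self.value = current + 1

    def safe_increment(self):
        # Holding the lock makes the read-modify-write step atomic.
        with self._lock:
            self.value += 1


def run(method_name, threads=4, iterations=10_000):
    counter = Counter()
    increment = getattr(counter, method_name)

    def work():
        for _ in range(iterations):
            increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value


print("unsafe:", run("unsafe_increment"))  # may be less than 40000
print("safe:  ", run("safe_increment"))    # 40000
```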
Invalid page fault - accessing a pointer outside the virtual memory space. A null pointer dereference will often cause an exception or program termination in most environments, but can cause corruption in operating system kernels or systems without memory protection, or when use of the null pointer involves a large or negative offset.
Use after free - dereferencing a dangling pointer storing the address of an object that has been deleted.
Uninitialized variables - a variable that has not been assigned a value is used. It may contain an undesired or, in some languages, a corrupt value.
Null pointer dereference - dereferencing an invalid pointer or a pointer to memory that has not been allocated
Wild pointers arise when a pointer is used prior to initialization to some known state. They show the same erratic behaviour as dangling pointers, though they are less likely to stay undetected.
There is no real way to prevent someone from trying to access a null pointer. In C# and Java, this results in an exception. In C++, this results in undefined behavior.
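For comparison, here is a rough sketch of what the Python counterparts look like. Using `None` as if it were an object raises `AttributeError`, and reading a local variable before it has been assigned raises `UnboundLocalError`. The `Node` class and the helper functions are hypothetical names used only for this illustration.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


def second_value(node):
    # node.next_node may be None; reading .value on None raises AttributeError.
    return node.next_node.value


def read_before_assignment(flag):
    if flag:
        result = "assigned"
    return result  # raises UnboundLocalError when flag is false


def describe(func, *args):
    """Call func and return the exception type name, or 'ok' on success."""
    try:
        func(*args)
        return "ok"
    except Exception as exc:
        return type(exc).__name__


print(describe(second_value, Node(1)))           # AttributeError
print(describe(second_value, Node(1, Node(2))))  # ok
print(describe(read_before_assignment, False))   # UnboundLocalError
```

In both failing cases the program gets a well-defined exception rather than a read from arbitrary memory.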
Memory leak - when memory usage is not tracked or is tracked incorrectly
Stack exhaustion - occurs when a program runs out of stack space, typically because of too deep recursion. A guard page typically halts the program, preventing memory corruption, but functions with large stack frames may bypass the page.
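Memory leaks in languages like C#, Java, and Python mean something different than they do in languages like C and C++, where you manage memory manually. In C or C++, you get a memory leak by failing to deallocate memory you allocated. In a language with managed memory, you do not have to deallocate memory explicitly, but you can still do something very similar by accidentally keeping a reference to an object somewhere, even after the object is no longer needed.
This is actually quite easy to do with things like event handlers in C# and long-lived collection classes; I have worked on projects that had memory leaks even though we were using managed memory. In a sense, working in a managed-memory environment can make these issues more dangerous, because programmers can have a false sense of security. In my experience, even experienced engineers often fail to do memory profiling or write test cases to check for this (probably because the environment gives them that false sense of security).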
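The same pattern can be sketched in Python. The `Session` class and the module-level `registry` list below are hypothetical stand-ins for a short-lived object and a long-lived collection; a `weakref` is used only to observe whether the object is still alive.

```python
import gc
import weakref


class Session:
    """Stand-in for an object that is meant to be short-lived."""

    def __init__(self, name):
        self.name = name


# Hypothetical long-lived registry, e.g. a module-level cache or a list
# of event listeners that nobody remembers to clear.
registry = []


def open_session(name):
    session = Session(name)
    registry.append(session)  # the forgotten reference
    return session


session = open_session("alice")
probe = weakref.ref(session)  # lets us check whether the object is alive

del session  # the caller is done with the object
gc.collect()
print(probe() is not None)  # True: the registry still keeps it alive

registry.clear()  # dropping the last reference lets it be reclaimed
gc.collect()
print(probe() is None)  # True
```

Nothing here is unsafe in the memory-corruption sense; the object is simply kept alive for as long as the registry holds on to it.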
Stack exhaustion is also quite easy to do in Python (e.g. with infinite recursion).
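A minimal sketch of that, using a hypothetical `recurse_forever` function. CPython enforces a recursion limit and raises `RecursionError` when it is exceeded, so the runaway recursion surfaces as an exception. Raising the limit far enough with `sys.setrecursionlimit` can still crash the interpreter, so this guard is best-effort.

```python
import sys


def recurse_forever(depth):
    # No base case, so the recursion never ends on its own.
    return recurse_forever(depth + 1)


def try_runaway_recursion():
    try:
        recurse_forever(0)
    except RecursionError:
        return "RecursionError"
    return "finished"


print(sys.getrecursionlimit())  # the configured depth limit
print(try_runaway_recursion())  # RecursionError
```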
Heap exhaustion - the program tries to allocate more memory than the amount available. In some languages, this condition must be checked for manually after each allocation.
Still quite possible. I am rather embarrassed to admit that I have personally done this in C# (although not yet in Python).
Double free - repeated calls to free may prematurely free a new object at the same address. If the exact address has not been reused, other corruption may occur, especially in allocators that use free lists.
Invalid free - passing an invalid address to free can corrupt the heap.
Mismatched free - when multiple allocators are in use, attempting to free memory with a deallocation function of a different allocator[20]
Unwanted aliasing - when the same memory location is allocated and modified twice for unrelated purposes.
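Unwanted aliasing is actually quite easy to do in Python. Here is an example in Java (full disclosure: I wrote the accepted answer); you could just as easily do something very similar in Python. The others are managed by the Python interpreter itself.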
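The Java example is not reproduced here. As a rough Python analogue, the sketch below builds a grid in two ways; the function names are made up for this illustration. This is aliasing in the sense of several names unintentionally sharing one mutable object, which is looser than the allocator-level problem the quoted definition describes.

```python
def make_grid_aliased(rows, cols):
    # Repeating the outer list copies references, so every "row"
    # is the same list object.
    return [[0] * cols] * rows


def make_grid_independent(rows, cols):
    # The comprehension builds a fresh inner list for each row.
    return [[0] * cols for _ in range(rows)]


aliased = make_grid_aliased(2, 3)
aliased[0][0] = 1
print(aliased)  # [[1, 0, 0], [1, 0, 0]]

independent = make_grid_independent(2, 3)
independent[0][0] = 1
print(independent)  # [[1, 0, 0], [0, 0, 0]]
```

In the aliased grid, the write to the first row shows up in the second row as well, because both entries refer to the same list.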
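So it would seem that memory safety is relative. Depending on exactly what you consider a "memory-safety issue", it can be quite difficult to avoid such issues entirely. High-level languages like Java, C#, and Python can prevent many of the worst of these errors, but there are other issues that are difficult or impossible to prevent completely.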