Python: bytearray object becomes bytes (immutable) when populated

Question

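I am trying to read the raw contents of a binary file so that I can manipulate them in memory. As I understand it, bytes() objects are immutable while bytearray() objects are mutable, so I read the file into a bytearray and then try to modify it: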

raw_data = bytearray()

try:
    with open(input_file, "rb") as f:
        raw_data = f.read()
except IOError:
    print('Error opening', input_file)

raw_data[0] = 55   # attempt to modify the first byte

However, the last line fails with TypeError: 'bytes' object does not support item assignment.
Wait... what 'bytes' object?

Let's look at the actual data types Python reports, before and after populating the array:

raw_data = bytearray()
print('Before:', type(raw_data))

try:
    with open(input_file, "rb") as f:
        raw_data = f.read()
except IOError:
    print('Error opening', input_file)

print('After: ', type(raw_data))

Output:

Before: <class 'bytearray'>
After:  <class 'bytes'>

What is going on here? Why is the type being changed, and can I prevent it?

I can always create another bytearray object from the contents of raw_data, but it would be nicer if I could save memory and just modify the original.

Answer 1

Why is the type being changed? Consider the following:

>>> x = 12
>>> type(x)
<class 'int'>
>>> x = 7.0
>>> type(x)
<class 'float'>

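Sure, I assigned the value 12 to x, and as a result x refers to an int. But then I assigned a new value, 7.0, to x, which changed the type of the value x refers to. This is Python's dynamic typing at work: a name has no fixed type, only the object it is currently bound to does.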

So it does not matter that you initially assigned a bytearray instance to raw_data. What matters is the last assignment to raw_data, namely:

raw_data = f.read()

and a call to f.read() on a file opened in binary mode returns an instance of class bytes.
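To make the difference between rebinding a name and mutating an object concrete, here is a minimal sketch. It assumes an in-memory io.BytesIO stream as a stand-in for the opened file, and the names original and buf are introduced only for illustration:

```python
import io

# io.BytesIO stands in for a file opened with "rb" (an assumption for this demo).
f = io.BytesIO(b"\x01\x02\x03")

raw_data = bytearray()
original = raw_data            # a second reference to the same bytearray

raw_data = f.read()            # rebinds the name to a brand-new bytes object
rebound_type = type(raw_data)
same_object = raw_data is original

f.seek(0)
buf = bytearray()
buf += f.read()                # in-place extend: buf is still the same bytearray
buf[0] = 55                    # item assignment now works
```

Note that the in-place variant still creates a temporary bytes object for the result of f.read(), so it keeps the type but does not avoid the extra copy.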

The way around this is to pre-allocate a bytearray of the correct size and use readinto:

with open(input_file, mode="rb") as f:
    # Seek to end of file and return offset from beginning:
    file_size = f.seek(0, 2)
    # Seek back to beginning:
    f.seek(0, 0)
    # Pre-allocate bytearray:
    raw_data = bytearray(file_size)
    f.readinto(raw_data)
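The same idea can be wrapped in a small, self-contained sketch. The helper name read_file_into_bytearray is hypothetical, the file size is taken from os.fstat instead of seeking (a swapped-in alternative, not part of the snippet above), and a temporary file is created only so the example runs on its own:

```python
import os
import tempfile


def read_file_into_bytearray(path):
    """Return the file's contents as a mutable bytearray (hypothetical helper)."""
    with open(path, "rb") as f:
        # Ask the OS for the file size instead of seeking to the end.
        size = os.fstat(f.fileno()).st_size
        buf = bytearray(size)
        # readinto fills the existing buffer and returns the number of bytes read.
        n = f.readinto(buf)
        # Trim the buffer in case fewer bytes were read than were allocated.
        del buf[n:]
    return buf


# Demo: write a small temporary file, read it back, and modify the first byte.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    tmp_path = tmp.name

data = read_file_into_bytearray(tmp_path)
data[0] = 55   # 55 is the ASCII code for "7"
os.remove(tmp_path)
```

Because the buffer is filled in place, no intermediate bytes object holding the whole file is created, which is the memory saving the question asks about.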
