Python: Edit specific hex values in a file
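I am trying to edit a specific line of data in a .m4a file (an audio file), but I can't find a way to do it in Python. I know there are other similar threads, but when I open a .m4a file in a hex editor program (HxD, for example) it gives me different hex data than what I get from my Python script. I'm a bit confused by the terminology. What I need to do is read the file with Python, convert it to the format my hex editor uses, replace the data, then convert it back and write it to a file. I really don't know if this is possible, or whether there is an easier way. I'm still new to Python, so I'm still learning. I really just need someone to point me in the right direction. The reason for doing this has to do with the metadata of the file, which I want to change.
My Python version: Python 3.7.4
Here is a link to the file in question: https://drive.google.com/file/d/1m8SpCLSyX265_I00MFT1IyltpTAvxntF/view?usp=sharing
My code:
# `file` holds the path to the .m4a file
with open(file, 'rb') as f:
    content = f.read().hex()
    print(content)
The following is the line I need to edit (from my hex editor):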
00 00 01 80 68 69 33 32
(Text translation: four non-printable bytes followed by hi32)
Replace with:
00 00 00 00 68 69 33 32
The beginning of my file looks like this in the hex editor (HxD):
00 00 00 00 00 00 00 00 01 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 01 0C 60 6D 64 69 61 00 00 00 20 6D 64 68 64 00 00 00 00 D9 98 96 40 D9 B2 F7 52 00 00 AC 44 00 84 EC 00 00 00 00 00 00 00 00 22 68 64 6C 72 00 00 00 00 00 00 00 00 73 6F 75 6E 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 0C 16 6D 69 6E 66 00 00 00 10 73 6D 68 64 00 00 00 00 00 00 00 00 00 00 00 24 64 69 6E 66 00 00 00 1C 64 72 65 66 00 00 00 00 00 00 00 01 00 00 00 0C 75 72 6C 20 00 00 00 01 00 01 0B DA 73 74 62 6C 00 00 80 76 73 74 73 64 00 00 00 00 00 00 00 01 00 00 80 66 6D 70 34 61 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 02 00 10 00 00 00 00 AC 44 00 00 00 00 00 33 65 73 64 73 00 00 00 00 03 80 80 80 22 00 00 00 04 80 80 80 14 40 15 00 18 00 00 04 82 90 00 03 E8 00 05 80 80 80 02 12 10 06 80 80 80 01 02 00 00 00
The beginning of the hex I get from my Python script looks like this:
d5df0d0ef02daf279fd6b15fae5c6e0bc79bec22095ceeada5e77371afc8ee36f10773b1b2c06b1b1ee4e5cccbf67403b26fd37cc6e3cc9f11019ab604f0071872ec6c092cc20b2a6d4460c55986623b50
Differences in reading hex
when I open a .m4a file in a hex editor program (HxD for example) it gives me different hex data than what I get from my python script.
Reading with Python
Here is what I see in Python, showing the first 32 characters:
with open('01 Choir (Remix).m4a', 'rb') as f:
    content = f.read().hex()
    print(content[:32])
00000020667479704d34412000000000
Reading with xxd
Using bash, again selecting the first 32 characters:
$ xxd -ps 01\ Choir\ \(Remix\).m4a | head -c 32
00000020667479704d34412000000000
Here, xxd -ps outputs the file as a plain hex string, and head takes the first 32 characters of that output.
Notice that both tools give the same hex.
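The two views differ only in presentation: a hex editor such as HxD shows uppercase byte pairs separated by spaces, while bytes.hex() returns one compact lowercase string. A minimal sketch of converting between the two, assuming a hypothetical helper named hex_dump_line (not part of any library) and using only the 16 bytes already printed above:

```python
def hex_dump_line(data: bytes) -> str:
    """Format bytes as uppercase hex pairs separated by spaces."""
    return ' '.join(f'{b:02X}' for b in data)


# The first 16 bytes of the file, rebuilt from the hex string printed above.
first_bytes = bytes.fromhex('00000020667479704d34412000000000')

print(first_bytes.hex())           # compact form, as returned by bytes.hex()
print(hex_dump_line(first_bytes))  # 00 00 00 20 66 74 79 70 4D 34 41 20 00 00 00 00
```

The join over a generator is used instead of the separator argument of bytes.hex() because that argument was only added in Python 3.8, and the question mentions Python 3.7.4.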
Rewriting the hex
The following is the line I need to edit (from my hex editor)
0000018068693332
Replace with:
0000000068693332
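You already have half of the solution: all that is left is a string replacement and writing the result back to a file. Keep in mind that while Python's regex library, re, is more powerful here, it is also not necessary, since all you need to do is string replacement. And string replacement is an order of magnitude faster than using regex.
If you do need to use regex, there are plenty of ways to edit hex.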
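For completeness, here is one way a regex version might look. This is only a sketch under stated assumptions: it works on bytes patterns rather than on the hex string, and the sample data is a made-up stand-in for the file contents. The lookahead restricts the replacement to the four bytes that come directly before hi32.

```python
import re

# Match the bytes 00 00 01 80 only when they are immediately followed by b'hi32'.
pattern = re.compile(rb'\x00\x00\x01\x80(?=hi32)')

# Hypothetical stand-in for the file contents: one match, one non-match.
sample = b'\x00\x00\x01\x80hi32' + b'\x00\x00\x01\x80abcd'

patched = pattern.sub(b'\x00\x00\x00\x00', sample)
print(patched.hex())  # 00000000686933320000018061626364
```

For the fixed pattern in this question, the plain replacement script that follows is enough.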
# replace_bytes.py
source_str = '0000018068693332'
replace_str = '0000000068693332'

# Read the whole file and convert its bytes to a hex string.
with open('01 Choir (Remix).m4a', 'rb') as f:
    content = f.read().hex()

print(source_str + " in `01 Choir (Remix).m4a`:", source_str in content)
content = content.replace(source_str, replace_str)

# Convert the edited hex string back to bytes and write it to a new file.
with open('01 Choir (Remix) edited.m4a', 'wb') as f:
    f.write(bytes.fromhex(content))

# Read the new file back to confirm that the original pattern is gone.
with open('01 Choir (Remix) edited.m4a', 'rb') as f:
    new_content = f.read().hex()

print(source_str + " in `01 Choir (Remix) edited.m4a`:", source_str in new_content)
Then run it:
$ python replace_bytes.py
0000018068693332 in `01 Choir (Remix).m4a`: True
0000018068693332 in `01 Choir (Remix) edited.m4a`: False
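One caveat with replacing inside the hex string: a match can start in the middle of a byte (at an odd character offset), so the pattern may be found where the bytes themselves do not contain it. A possible alternative is to skip the hex round trip and replace on the bytes directly. The sketch below assumes a hypothetical helper named patch_bytes and uses made-up in-memory sample data rather than the real file:

```python
SOURCE = bytes.fromhex('0000018068693332')
REPLACEMENT = bytes.fromhex('0000000068693332')


def patch_bytes(data: bytes, source: bytes, replacement: bytes) -> bytes:
    """Replace every occurrence of `source`, keeping the total length unchanged."""
    if len(source) != len(replacement):
        raise ValueError('source and replacement must have the same length')
    return data.replace(source, replacement)


# Hypothetical stand-in for the file contents.
sample = b'\x00\x00\x00\x20ftyp' + SOURCE + b'\x00\x00'
patched = patch_bytes(sample, SOURCE, REPLACEMENT)
print(SOURCE in sample, SOURCE in patched)  # True False

# A hex-string search can match across byte boundaries; a bytes search cannot.
misaligned = bytes.fromhex('a0000018068693332b')
print('0000018068693332' in misaligned.hex())  # True
print(SOURCE in misaligned)                    # False
```

With a real file, the data would come from open(path, 'rb').read() and the result would be written back with a file opened in 'wb' mode, as in the script above. The length check is there because changing the size of a region inside a media file can invalidate offsets stored elsewhere in it.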