Python Read Certain Number of Bytes After Character
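I'm working with a character-delimited hex file in which every field begins with a specific start code. I open the file in 'rb' mode, but I'd like to know: after using .find to get the index of the start code, how do I read a certain number of bytes from that position?
This is how I load the file and what I'm trying to do: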
with open(someFile, 'rb') as fileData:
    startIndex = fileData.find('(G')
    data = fileData[startIndex:7]
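where 7 is the number of bytes I want to read, starting at the index returned by find. I'm using Python 2.7.3.
In Python 2.7 you can get the position of a substring within a bytestring like this: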
>>> with open('student.txt', 'rb') as f:
...     data = f.read()
...
>>> data # holds the French word for student: élève
'\xc3\xa9l\xc3\xa8ve\n'
>>> len(data) # this shows we are dealing with bytes here, because "élève\n" would be 6 characters long, had it been properly decoded!
8
>>> len(data.decode('utf-8'))
6
>>> data.find('\xa8') # continue with the bytestring...
4
>>> bytes_to_read = 3
>>> data[4:4+bytes_to_read]
'\xa8ve'
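Applied to the start code from the question, the same find-then-slice pattern might look like the sketch below. The helper name `read_after_marker` and the sample bytes are made up for illustration; in practice `data` would come from `f.read()` on the file opened in 'rb' mode. Note that the end of the slice is `start + count`, not `count` on its own, and that `find` returns -1 when the marker is missing.

```python
def read_after_marker(data, marker, count):
    """Return `count` bytes of `data`, starting at the first `marker`.

    The returned slice includes the marker itself.
    Returns None when the marker does not occur in `data`.
    """
    start = data.find(marker)
    if start == -1:  # find() signals "not found" with -1
        return None
    return data[start:start + count]


# Hypothetical sample standing in for the real file contents.
sample = b'junk(G12345more(G67890'
field = read_after_marker(sample, b'(G', 7)
print(field)
```

If the bytes wanted are the ones after the start code rather than including it, the slice would begin at `start + len(marker)` instead.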
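You can search for special characters as well. For compatibility with Python 3 it is best to put a b in front of the literal to mark it as bytes (in Python 2.x this is optional). Note that Python 3 only accepts ASCII characters inside a bytes literal, so there the marker has to be written in escaped form, b'\xc3\xa8':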
>>> data.find(b'è') # in python2.x this works too (unfortunately, because it has led to a lot of confusion): data.find('è')
3
>>> bytes_to_read = 3
>>> pos = data.find(b'è')
>>> data[pos:pos+bytes_to_read] # when you use the syntax 'n:m', it will read bytes in a bytestring
'\xc3\xa8v'
>>>
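One way to keep the search portable across Python 2 and Python 3 is to encode the marker explicitly instead of typing the character inside a bytes literal. This is a minimal sketch that assumes the file is UTF-8 encoded, as in the session above; the variable names are illustrative only.

```python
# The same bytes that student.txt holds in the session above.
data = b'\xc3\xa9l\xc3\xa8ve\n'

# U+00E8 is the character è; encoding it yields the bytes to search for.
marker = u'\u00e8'.encode('utf-8')

bytes_to_read = 3
pos = data.find(marker)                 # index of the first byte of the marker
chunk = data[pos:pos + bytes_to_read]   # slicing a bytestring counts bytes, not characters
print(repr(chunk))
```

Because the slice counts bytes, three bytes here cover the two-byte è plus the following v.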