Efficiently convert a string of comma-separated values to bytes

My Python 3 program receives data from elsewhere as a string in the following format (... stands for more data that I have not typed out):

data = "0,12,145,234;1,0,0,128;2,255,255,255;...;909,100,100,100;"

I want to convert this to packed binary data, ignoring the , and ; characters. Currently, I am doing the following:

import struct

splitData = data.split(';')[:-1] # ignore the last ';'
buff = []
for item in splitData:
    addr, R, G, B = item.split(',')
    addr = int(addr) # two bytes
    R    = int(R)    # one byte
    G    = int(G)    # one byte
    B    = int(B)    # one byte
    packed = struct.pack('HBBB', addr, R, G, B)
    buff.append(packed)
dataBytes = b''.join(buff)

For the example data above, this process gives me the following:

dataBytes = b'\x00\x00\x0c\x91\xea\x01\x00\x00\x00\x80...\x8d\x03ddd'

This is what I want (roughly one third the size of the original string).
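As a quick check of the record layout, here is a minimal sketch that inspects the `HBBB` format with the standard `struct` module. The `<` prefix in the second half is an assumption added for illustration: the question uses the platform's native byte order, and the output shown above corresponds to a little-endian machine.

```python
import struct

# Native format as used in the question: one unsigned short, three unsigned bytes.
RECORD_FORMAT = "HBBB"

# The 2-byte field comes first, so no padding is inserted: 5 bytes per record.
record_size = struct.calcsize(RECORD_FORMAT)

# "<" pins little-endian byte order with standard sizes (an assumption here;
# the question relies on the platform's native order instead).
packed = struct.pack("<HBBB", 909, 100, 100, 100)
unpacked = struct.unpack("<HBBB", packed)

print(record_size, packed, unpacked)
```

A text record such as `909,100,100,100;` takes 16 characters, against 5 bytes once packed, which matches the "roughly one third" estimate.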

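However, this process takes roughly 0.002 seconds. I need to run it 33 times per frame, which adds up to about 0.05 seconds of computation per frame, or roughly 20 frames per second. I would like to speed this up if possible.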

Is there a faster way to convert the string data to bytes than the method above?

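In Python 2, using itertools (imap and izip) to replace the semicolons, split once, map to int, and finally zip the values into groups of four is about 25% faster: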

In [82]: data = "0,12,145,234;1,0,0,128;2,255,255,255;909,100,100,100;" * 1000
In [83]: from itertools import imap, izip
In [84]: %%timeit
splitData = data.split(';')[:-1] # ignore the last ';'
buff = []
for item in splitData:
    addr, R, G, B = item.split(',')
    addr = int(addr) # two bytes
    R    = int(R)    # one byte
    G    = int(G)    # one byte
    B    = int(B)    # one byte
    packed = struct.pack('HBBB', addr, R, G, B)
    buff.append(packed)
dataBytes = b''.join(buff)
   ....: 
100 loops, best of 3: 8.61 ms per loop

In [85]: %%timeit     
mapped = imap(int, data[:-1].replace(";", ",").split(","))
b"".join([struct.pack('HBBB', *sub) for sub in izip(mapped, mapped, mapped, mapped)])
   ....: 
100 loops, best of 3: 6.27 ms per loop
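The grouping step relies on passing the same iterator to zip four times. A minimal sketch of that idiom on its own, written with Python 3's built-in `zip` (which is lazy like `izip`), with a short hand-made list standing in for the parsed values:

```python
# One shared iterator over the flat sequence of values.
values = iter([0, 12, 145, 234, 1, 0, 0, 128])

# zip takes one item from each argument per output tuple. Because all four
# arguments are the same iterator, each tuple consumes four consecutive values.
groups = list(zip(values, values, values, values))

print(groups)  # [(0, 12, 145, 234), (1, 0, 0, 128)]
```

Note that zip stops at the shortest input, so a trailing group with fewer than four values would be dropped silently rather than raising an error.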

With Python 3, just use the built-in map and zip:

In [4]: %%timeit
mapped = map(int, data[:-1].replace(";", ",").split(","))
b"".join([struct.pack('HBBB', *sub) for sub in zip(mapped, mapped, mapped, mapped)])
   ...: 
100 loops, best of 3: 3.61 ms per loop

In [5]: %%timeit        
splitData = data.split(';')[:-1] # ignore the last ';'
buff = []
for item in splitData:
    addr, R, G, B = item.split(',')
    addr = int(addr) # two bytes
    R    = int(R)    # one byte
    G    = int(G)    # one byte
    B    = int(B)    # one byte
    packed = struct.pack('HBBB', addr, R, G, B)
    buff.append(packed)
dataBytes = b''.join(buff)
   ...: 
100 loops, best of 3: 4.89 ms per loop
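To reproduce the comparison outside IPython, the two approaches can be wrapped in functions and timed with the standard `timeit` module. This is only a sketch: the names `pack_loop` and `pack_zip` are hypothetical, the input is assumed to always end with a `;`, and absolute timings will differ from machine to machine.

```python
import struct
import timeit


def pack_loop(data):
    """Original approach: split per record, convert and pack one at a time."""
    buff = []
    for item in data.split(';')[:-1]:  # ignore the empty piece after the last ';'
        addr, r, g, b = item.split(',')
        buff.append(struct.pack('HBBB', int(addr), int(r), int(g), int(b)))
    return b''.join(buff)


def pack_zip(data):
    """Answer's approach: flatten to one list, map to int, group by four."""
    mapped = map(int, data[:-1].replace(';', ',').split(','))
    return b''.join([struct.pack('HBBB', *sub)
                     for sub in zip(mapped, mapped, mapped, mapped)])


sample = "0,12,145,234;1,0,0,128;2,255,255,255;909,100,100,100;" * 1000

# Both functions must produce identical bytes before timing means anything.
assert pack_loop(sample) == pack_zip(sample)

for func in (pack_loop, pack_zip):
    seconds = timeit.timeit(lambda: func(sample), number=20) / 20
    print(f"{func.__name__}: {seconds * 1000:.2f} ms per call")
```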