Binary data written with tobytes() cannot be read by software on Windows
I am trying to write some xyz point data to a .ply file using Python.
I am using this script, which writes a pandas DataFrame to binary format by converting it to a recarray and calling NumPy's tobytes() method:
import pandas as pd
import numpy as np

pc = pd.read_csv('points.txt')

# Note: the file is opened in text mode ('w'), yet raw bytes are written at the end.
with open('some_file.ply', 'w') as ply:
    ply.write("ply\n")
    ply.write('format binary_little_endian 1.0\n')
    ply.write("comment Author: Phil Wilkes\n")
    ply.write("obj_info generated with pcd2ply.py\n")
    ply.write("element vertex {}\n".format(len(pc)))
    ply.write("property float x\n")
    ply.write("property float y\n")
    ply.write("property float z\n")
    ply.write("end_header\n")
    pc[['x', 'y', 'z']] = pc[['x', 'y', 'z']].astype('f4')
    ply.write(pc[['x', 'y', 'z']].to_records(index=False).tobytes())
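This script works fine on my Mac, and software such as CloudCompare can read the resulting file. However, when I run the same script on a Windows machine, CloudCompare can read the header information but garbles the binary content.
When I read the text version of the file into CloudCompare and export it as a binary file, both the Linux and Windows versions can read it, but the file contents differ.
Here is the version produced by the above script, here is the version produced by CloudCompare on Windows, and here is the original data.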
The difference between made_with_code.ply and made_with_windows.ply is that in the latter all values are rounded to 2 decimal places, as you can see with:
with open('windows.ply', 'rb') as f:
    # np.rec is the public alias for np.core.records
    np.rec.fromfile(f, formats='f4,f4,f4,f4')
This is after extracting the data section with:
tail -c +274 made_with_windows.ply > windows.ply
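The byte offset passed to tail depends on the exact header length, so it changes whenever the header does. As a rough alternative, the header and payload can be separated in Python by searching for the end_header line. This is only a sketch: split_ply and read_vertices are hypothetical helper names, and it assumes the string end_header does not also appear earlier in a comment line and that every property is a 4-byte little-endian float.

```python
import numpy as np

def split_ply(raw):
    """Split raw PLY bytes into (header, payload) at the end_header line."""
    marker = b"end_header"
    end = raw.index(marker) + len(marker)
    # The header line may end in "\n" or, if written in Windows text mode, "\r\n".
    if raw[end:end + 2] == b"\r\n":
        end += 2
    elif raw[end:end + 1] == b"\n":
        end += 1
    return raw[:end], raw[end:]

def read_vertices(raw, floats_per_vertex):
    """Decode the payload as rows of little-endian float32 values."""
    _, payload = split_ply(raw)
    return np.frombuffer(payload, dtype="<f4").reshape(-1, floats_per_vertex)
```

Calling read_vertices on the bytes of made_with_windows.ply with floats_per_vertex=4 should give the same values as the fromfile call above, without needing a separate extracted file.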
The following code produces a file that is identical (in its data section) to made_with_windows.ply:
import pandas as pd
import numpy as np

pc = pd.read_csv('points.txt')

# Binary mode ('wb') expects bytes, so the header lines are written as byte strings.
with open('made_with_code_new.ply', 'wb') as ply:
    ply.write(b"ply\n")
    ply.write(b"format binary_little_endian 1.0\n")
    ply.write(b"comment Author: Phil Wilkes\n")
    ply.write(b"obj_info generated with pcd2ply.py\n")
    ply.write("element vertex {}\n".format(len(pc)).encode('ascii'))
    ply.write(b"property float x\n")
    ply.write(b"property float y\n")
    ply.write(b"property float z\n")
    ply.write(b"end_header\n")
    # Four floats are written per vertex, although the header above declares only x, y, z.
    pc[['x', 'y', 'z', 'n']] = pc[['x', 'y', 'z', 'n']].round(2).astype('f4')
    ply.write(pc[['x', 'y', 'z', 'n']].to_records(index=False).tobytes())
It turns out I needed to specify the line ending to use when opening the file:
open(output_name, 'w', newline='\n')
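To see what the newline argument changes, the snippet below simulates a text-mode file in memory with io.TextIOWrapper, so it behaves the same on any platform. It is only an illustration: bytes_written is a hypothetical helper, and newline="\r\n" stands in for what the default newline=None does on Windows, where os.linesep is "\r\n".

```python
import io

def bytes_written(text, newline):
    """Return the bytes a text-mode stream emits for `text` with this newline setting."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # keep `raw` from being closed together with the wrapper
    return data

header = "ply\nend_header\n"
windows_style = bytes_written(header, "\r\n")  # each "\n" becomes "\r\n"
forced_lf = bytes_written(header, "\n")        # "\n" is written unchanged

print(windows_style)
print(forced_lf)
```

With the translated line endings the header grows by one byte per line, so a reader that expects the binary payload to start right after "end_header\n" would be looking at the wrong offset.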
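After rewriting for Python 3, the file has to be written in two passes, one for the header and one for the binary payload, so the new function looks like this: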
import pandas as pd
import numpy as np

output_name = 'some_file.ply'  # placeholder; use your own output path
cols = ['x', 'y', 'z']
pc = pd.read_csv('points.txt')

# First pass: write the text header, forcing '\n' line endings on every platform.
with open(output_name, 'w', newline='\n') as ply:
    ply.write("ply\n")
    ply.write('format binary_little_endian 1.0\n')
    ply.write("comment Author: Phil Wilkes\n")
    ply.write("obj_info generated with pcd2ply.py\n")
    ply.write("element vertex {}\n".format(len(pc)))
    ply.write("property float x\n")
    ply.write("property float y\n")
    ply.write("property float z\n")
    ply.write("end_header\n")

# Second pass: append the binary payload.
with open(output_name, 'ab') as ply:
    pc[cols] = pc[cols].astype('f4')
    ply.write(pc[cols].to_records(index=False).tobytes())
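A possible variation, not part of the original answer, is to open the file once in binary mode and encode the header to ASCII by hand, which sidesteps newline translation entirely. The sketch below uses plain NumPy; write_binary_ply is a hypothetical name, and it assumes the points are available as an (N, 3) array, for example from pc[['x', 'y', 'z']].to_numpy().

```python
import numpy as np

def write_binary_ply(path, xyz):
    """Write an (N, 3) array of points as a binary little-endian PLY file."""
    # Force little-endian float32 so the payload matches the declared format.
    xyz = np.asarray(xyz, dtype="<f4").reshape(-1, 3)
    header = "\n".join([
        "ply",
        "format binary_little_endian 1.0",
        "element vertex {}".format(len(xyz)),
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]) + "\n"
    # Binary mode: bytes are written exactly as given, on every platform.
    with open(path, "wb") as ply:
        ply.write(header.encode("ascii"))
        ply.write(xyz.tobytes())
```

Because nothing is written in text mode, the header always ends in "\n" and the payload always starts immediately after it.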