Skip columns in an ASCII file using Python
I have an ASCII file with 15 columns. I can skip the headers by skipping rows, but the last column appears to be "fused" with the one before it, as in this sample:
86 0.0106 0.0186 0.0271 0.0457 0.0576 0.0691 0.0752 0.6623 1.6741 1.4979 1.7772 1.6318 1.0369-9.9999
95 0.0083 0.0167 0.0244 0.0420 0.0533 0.0655 0.0709 0.6574 1.7198 1.6143 1.8364 1.6730 1.1095-9.9999
101 0.0091 0.0173 0.0250 0.0433 0.0540 0.0659 0.0712 0.6583 1.6962 1.5280 1.8163 1.6663 1.0831-9.9999
131 0.0150 0.0250 0.0291 0.0456 0.0580 0.0703 0.0759 0.6569 1.2530 1.5717 1.6005 1.0979 1.0482-9.9999
5351 0.0062 0.0092 0.0171 0.0282 0.0356 0.0421 0.0441 0.6384 1.9588 1.4447 1.7263 2.0164 0.8406-9.9999
5381 0.0043 0.0085 0.0166 0.0277 0.0349 0.0400 0.0431 0.6415 2.0437 1.3245 1.7449 2.1263 0.8187-9.9999
5420 0.0040 0.0089 0.0167 0.0277 0.0354 0.0399 0.0428 0.6518 2.0006 1.3158 1.7609 2.0494 0.7287-9.9999
5427 0.0037 0.0080 0.0160 0.0262 0.0328 0.0372 0.0403 0.6540 2.0382 1.2688 1.6885 2.1329 0.7914-9.9999
5445 0.0031 0.0080 0.0159 0.0252 0.0323 0.0352 0.0386 0.6523 1.9921 1.1970 1.6521 2.0556 0.6766-9.9999
[......]
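I tried reading the file with numpy.loadtxt, for example:
data_aot = np.loadtxt(file_path, delimiter=' ', skiprows=4, usecols=(1,2,3,4,5,6,7,8,9,10,11,12))
But I still get the same error:
ValueError: could not convert string to float:
Does anyone know how I can load this file while skipping the last column, whose value is -9.9999?
Thanks in advance.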
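If the columns have fixed widths, you can load the file with np.genfromtxt, passing the field widths as the delimiter:
import numpy as np
# If the header rows are still in the file, also pass skip_header=4
a = np.genfromtxt("your_file.txt", delimiter=[4, *[7] * 13])  # <-- use 14 if you want the last column too
print(a)
Prints: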
[[8.6000e+01 1.0600e-02 1.8600e-02 2.7100e-02 4.5700e-02 5.7600e-02
6.9100e-02 7.5200e-02 6.6230e-01 1.6741e+00 1.4979e+00 1.7772e+00
1.6318e+00 1.0369e+00]
[9.5000e+01 8.3000e-03 1.6700e-02 2.4400e-02 4.2000e-02 5.3300e-02
6.5500e-02 7.0900e-02 6.5740e-01 1.7198e+00 1.6143e+00 1.8364e+00
1.6730e+00 1.1095e+00]
[1.0100e+02 9.1000e-03 1.7300e-02 2.5000e-02 4.3300e-02 5.4000e-02
6.5900e-02 7.1200e-02 6.5830e-01 1.6962e+00 1.5280e+00 1.8163e+00
1.6663e+00 1.0831e+00]
[1.3100e+02 1.5000e-02 2.5000e-02 2.9100e-02 4.5600e-02 5.8000e-02
7.0300e-02 7.5900e-02 6.5690e-01 1.2530e+00 1.5717e+00 1.6005e+00
1.0979e+00 1.0482e+00]
[5.3510e+03 6.2000e-03 9.2000e-03 1.7100e-02 2.8200e-02 3.5600e-02
4.2100e-02 4.4100e-02 6.3840e-01 1.9588e+00 1.4447e+00 1.7263e+00
2.0164e+00 8.4060e-01]
[5.3810e+03 4.3000e-03 8.5000e-03 1.6600e-02 2.7700e-02 3.4900e-02
4.0000e-02 4.3100e-02 6.4150e-01 2.0437e+00 1.3245e+00 1.7449e+00
2.1263e+00 8.1870e-01]
[5.4200e+03 4.0000e-03 8.9000e-03 1.6700e-02 2.7700e-02 3.5400e-02
3.9900e-02 4.2800e-02 6.5180e-01 2.0006e+00 1.3158e+00 1.7609e+00
2.0494e+00 7.2870e-01]
[5.4270e+03 3.7000e-03 8.0000e-03 1.6000e-02 2.6200e-02 3.2800e-02
3.7200e-02 4.0300e-02 6.5400e-01 2.0382e+00 1.2688e+00 1.6885e+00
2.1329e+00 7.9140e-01]
[5.4450e+03 3.1000e-03 8.0000e-03 1.5900e-02 2.5200e-02 3.2300e-02
3.5200e-02 3.8600e-02 6.5230e-01 1.9921e+00 1.1970e+00 1.6521e+00
2.0556e+00 6.7660e-01]]
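To see why the width list works, here is a minimal, self-contained sketch. It assumes the file was written with a 4-character integer field followed by 7-character float fields; `format_row` is a hypothetical stand-in for whatever produced the file, and the rows are shortened to three float columns. With that layout a value such as -9.9999 fills its whole field, so no space is left in front of it.

```python
import io
import numpy as np

def format_row(index, values):
    # Hypothetical writer: 4-character integer field, then 7-character floats.
    return f"{index:4d}" + "".join(f"{v:7.4f}" for v in values)

lines = [
    format_row(86, [0.0106, 1.0369, -9.9999]),
    format_row(5445, [0.0031, 0.6766, -9.9999]),
]
text = "\n".join(lines)  # first line is "  86 0.0106 1.0369-9.9999"

# Listing only the first three widths leaves the trailing field unread.
widths = [4, 7, 7]
trimmed = np.genfromtxt(io.StringIO(text), delimiter=widths)

# Adding one more width reads the fused last field as its own column.
full = np.genfromtxt(io.StringIO(text), delimiter=widths + [7])

print(trimmed)
print(full)
```

Because the split is done by character position rather than by a separator, the missing space before the minus sign does not matter. The widths must match the real file, so it is worth checking them against a few raw lines first.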
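If the last column is the only problem, one fix is to replace each "-" character with a space:
import numpy as np
from io import StringIO
file_path = 'path_to_your_file.txt'
# Replacing "-" with a space splits the fused field. It also drops the minus
# sign, which is harmless here because the last column is not in usecols.
with open(file_path) as f:
    s = f.read().replace('-', ' ')
# delimiter=' ' splits on every single space, so fields must be separated by exactly one space
data_aot = np.loadtxt(StringIO(s), delimiter=' ', skiprows=4, usecols=(1,2,3,4,5,6,7,8,9,10,11,12,13))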
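If the sign matters, or other columns can be negative, a variation on the same idea is to insert a space only where a minus sign directly follows a digit. This is a sketch of a different technique (a regular expression rather than a plain replace), shown on a shortened, made-up sample instead of the real file. It relies on the default whitespace splitting of np.loadtxt rather than delimiter=' '.

```python
import io
import re
import numpy as np

# Shortened sample rows with the last two fields fused, as in the question.
sample = (
    "  86 0.0106 1.0369-9.9999\n"
    "5445 0.0031 0.6766-9.9999\n"
)

# Insert a space before any "-" that immediately follows a digit.
# Exponents such as "1e-05" are untouched because "e" is not a digit.
separated = re.sub(r"(?<=\d)-", " -", sample)

# With no delimiter argument, loadtxt splits on any run of whitespace.
data = np.loadtxt(io.StringIO(separated), usecols=(1, 2))
all_columns = np.loadtxt(io.StringIO(separated))

print(data)
print(all_columns)
```

For the real file, the same substitution would be applied to the text read from disk, with skiprows and usecols set as in the question.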