Python file parser - invalid literal for int() with base 10: ''
I am trying to parse a file in this format:
2010-11-04 00:03:50.209589 M003 ON Sleeping begin
2010-11-04 00:03:57.399391 M003 OFF
2010-11-04 00:15:08.984841 T002 21.5
2010-11-04 00:30:19.185547 T003 21
2010-11-04 00:30:19.385336 T004 21
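I need to extract the number from the third column. For each line, I take its third field and split it into the type (M or T) and the number (the rest of the field). The problem is that when I try to convert the number, I get the following error: invalid literal for int() with base 10: ''. I have tried many things (such as stripping EOF or any kind of trailing character from "num"), but I still get the error.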
temp_sensors = 0  # total number of temperature sensors
f = open('data', 'r')  # open the dataset
line = f.readline()  # reading line
while line:
    step = line.split()  # dividing the line into different words
    sensor_type = step[2][:1]
    sensor_number = step[2][2:]
    sensor_value = step[3]
    #print(sensor_number)
    #num = sensor_number[:2]
    #print(type(num))
    num = sensor_number.rstrip()
    appoggio = int(num)
    #print(type(num))
    if sensor_type == "T":
        if appoggio > temp_sensors:
            temp_sensors = appoggio
    line = f.readline()
print("NUMERO TEMP MAX: " + str(temp_sensors))
To run the code you need a text file, data.txt, containing several sensor events in the format shown above. The error I get is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-b89fbf305c4a> in <module>
28 # print(type(num))
29 num = sensor_number.rstrip()
---> 30 appoggio = int(num)
31 #print(type(appoggio))
32 #print(type(num))
ValueError: invalid literal for int() with base 10: ''
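The message can be reproduced without the data file. The sketch below assumes that at least one line has a third field of two characters or fewer; the helper name question_slice and the short field "T" are made up for illustration. It applies the same [2:] slice as the code above, which yields an empty string for such a field, and int('') raises exactly this ValueError.

```python
def question_slice(field):
    # Same slice as in the code above: drops the first two characters.
    return field[2:]

print(repr(question_slice("T002")))  # '02'
print(repr(question_slice("T")))     # '' (hypothetical malformed field)

try:
    int(question_slice("T"))
except ValueError as exc:
    print(exc)  # invalid literal for int() with base 10: ''
```

Note that for a well-formed field such as "T002" the slice returns "02" rather than "002"; the conversion still gives 2 only because the dropped character is a leading zero.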
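I agree with the previous answers: one line seems to use a different format, so an empty string ends up being passed to int(), which raises the error. My suggestion is to check the format of each line before converting.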
import re

def readdata(fname):
    # Date, time with microseconds, sensor id (M or T plus three digits), then a non-empty value.
    ptrn = r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6} [MT]\d{3} \S.*$'
    sensor_count = 0
    with open(fname) as fp:
        for i, line in enumerate(fp):
            if not re.match(ptrn, line):
                print(f'Illegal format in line {i}: {line!r}')
                continue
            dt, tm, sensor, value = line.rstrip().split(' ', 3)
            sensor_type, sensor_number = sensor[0], int(sensor[1:])
            print(sensor_type, sensor_number)
            if sensor_type == 'T':
                sensor_count = max(sensor_count, sensor_number)
    print(f'Number of temperature sensors: {sensor_count}')
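An alternative to validating every line with a regular expression is to attempt the conversion and catch the failure. This is only a sketch under assumptions, not the approach used above: the function name max_temp_sensor is hypothetical, it works on any iterable of lines rather than a file name, and the blank line and the "T" entry in the sample are invented to stand in for whatever malformed line the real file contains.

```python
def max_temp_sensor(lines):
    """Return (highest T sensor number, list of (line_no, line) that could not be parsed)."""
    highest = 0
    rejected = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        try:
            sensor = fields[2]  # IndexError if the line has fewer than three fields
            sensor_type, sensor_number = sensor[0], int(sensor[1:])  # ValueError on '' or non-digits
        except (IndexError, ValueError):
            rejected.append((line_no, line))
            continue
        if sensor_type == "T":
            highest = max(highest, sensor_number)
    return highest, rejected


sample = [
    "2010-11-04 00:03:50.209589 M003 ON Sleeping begin\n",
    "2010-11-04 00:15:08.984841 T002 21.5\n",
    "\n",                                  # blank line: no third field
    "2010-11-04 00:20:00.000000 T 21\n",   # made-up malformed sensor id
    "2010-11-04 00:30:19.385336 T004 21\n",
]

highest, rejected = max_temp_sensor(sample)
print(highest)
for line_no, line in rejected:
    print(f"line {line_no}: {line!r}")
```

Printing the rejected lines with their numbers shows exactly which input triggers the error, which is usually the quickest way to decide whether to skip such lines or to fix the file.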