How to read a value from a tab-separated file in Python?
I have a text file in this format:
ConfigFile 1.1
;
; Version: 4.0.32.1
; Date="2021/04/08" Time="11:54:46" UTC="8"
;
Name
    John Legend
Type
    Student
Number
    s1054520
I want to get the value of Name, Type, or Number. How can I get it?
I tried the following, but it did not solve my problem.
import re
f = open("Data.txt", "r")
file = f.read()
Name = re.findall("Name", file)
print(Name)
My expected output is John Legend.
Can anyone help me? I would really appreciate it. Thanks.
Here I go through the file line by line. When I hit Name, I add the next line to a result list (you could also print it directly):
import re

def print_hi(name):
    result = []
    # Match lines that consist of the key "Name" only
    regexp = re.compile(r'Name\s*$')
    gotname = False
    with open('test.txt') as f:
        for line in f:
            if gotname:
                # The previous line was the key, so this line holds the value
                result.append(line.strip())
                gotname = False
            match = regexp.match(line)
            if match:
                gotname = True
    print(result)

if __name__ == '__main__':
    print_hi('test')
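The same next-line idea can also be written without a flag, by pulling the following line straight from the iterator. This is only a minimal sketch: value_after is a hypothetical helper name, and it assumes the key sits alone on its own line with the value on the line after it.

```python
def value_after(lines, key):
    """Return the stripped line following the first line equal to `key`, or None."""
    it = iter(lines)
    for line in it:
        if line.strip() == key:
            # next() consumes the following line; "" is used at end of input
            return next(it, "").strip()
    return None


# Sample lines shaped like the file in the question (values indented)
sample = ["ConfigFile 1.1\n", ";\n", "Name\n", "    John Legend\n", "Type\n", "    Student\n"]
print(value_after(sample, "Name"))
```

Because an open file is itself an iterable of lines, value_after(f, "Name") should work the same way inside a with open(...) as f block.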
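First, re.findall searches for all occurrences that match the given pattern. So in your case you are finding every "Name" in the file, because that is exactly what you asked for.
On the other hand, the computer has no way of knowing that "John Legend" is a name. It only knows that it is the line that follows the word "Name".
In your case, I suggest you check this link.
- Find the line number of "Name"
- Read the next line
- Get the name with the whitespace stripped
- This also works if there is more than one Name
The final code looks like this: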
def search_string_in_file(file_name, string_to_search):
    """Search for the given string in file and return lines containing that string,
    along with line numbers"""
    line_number = 0
    list_of_results = []
    # Open the file in read only mode
    with open(file_name, 'r') as read_obj:
        # Read all lines in the file one by one
        for line in read_obj:
            # For each line, check if line contains the string
            line_number += 1
            if string_to_search in line:
                # If yes, then add the line number & line as a tuple in the list
                list_of_results.append((line_number, line.rstrip()))
    # Return list of tuples containing line numbers and lines where string is found
    return list_of_results

# Read all lines up front; the context manager closes the file afterwards
with open('Data.txt') as file:
    content = file.readlines()
matched_lines = search_string_in_file('Data.txt', 'Name')
print('Total Matched lines : ', len(matched_lines))
for line_number, _ in matched_lines:
    # line_number is 1-based, so it is also the 0-based index of the next line
    if line_number < len(content):
        print(content[line_number].strip())
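To make the point about re.findall concrete: the pattern "Name" can only ever return the text "Name" itself. If a regular expression is still wanted, a capture group for the following line is one option. The sketch below assumes the key is alone on its line and the value is on the next line; the sample text is an abbreviated copy of the file from the question with the value lines indented.

```python
import re

text = (
    "ConfigFile 1.1\n"
    ";\n"
    "Name\n"
    "    John Legend\n"
    "Type\n"
    "    Student\n"
)

# The pattern "Name" only matches the literal word, never the value after it
print(re.findall("Name", text))

# Capture the content of the line that follows a line consisting of "Name"
match = re.search(r"^Name[ \t]*\n[ \t]*(.+)$", text, flags=re.MULTILINE)
value = match.group(1).strip() if match else None
print(value)
```

With re.MULTILINE, ^ and $ anchor at line boundaries, so the pattern does not fire on a "Name" that appears in the middle of some other line.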
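I would not use regular expressions here. Instead, I would write a parser for this file type. The rules might be:
1. The first line can be ignored.
2. Any line that starts with ; can be ignored.
3. Every line without leading whitespace is a key.
4. Every line with leading whitespace is a value that belongs to the last key.
I would start with a generator that yields every line that is not ignored: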
def read_data_lines(filename):
    with open(filename, "r") as f:
        # skip the first line
        f.readline()
        # read until no more lines
        while line := f.readline():
            # skip lines that start with ;
            if not line.startswith(";"):
                yield line
Then fill a dictionary according to rules 3 and 4:
def parse_data_file(filename):
    data = {}
    key = None
    for line in read_data_lines(filename):
        # No starting whitespace (space or tab) makes this a key
        if not line.startswith((" ", "\t")):
            key = line.strip()
        # Starting whitespace makes this a value for the last key
        else:
            data[key] = line.strip()
    return data
At this point you can parse the file and print any key you want:
data = parse_data_file("Data.txt")
print(data["Name"])
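A quick way to check the rules against the sample is to run them on in-memory lines instead of a file on disk. This sketch restates the same four rules in a single hypothetical function, parse_lines, that accepts any iterable of lines. It assumes values are indented with spaces or tabs, and it additionally skips blank lines, which the rules above do not mention.

```python
import io


def parse_lines(lines):
    """Build a {key: value} dict from an iterable of lines using the four rules."""
    data = {}
    key = None
    it = iter(lines)
    next(it, None)  # rule 1: ignore the first line
    for line in it:
        # rule 2: ignore comment lines (blank lines too, as an added assumption)
        if line.startswith(";") or not line.strip():
            continue
        if line[0].isspace():
            # rule 4: an indented line is the value of the most recent key
            if key is not None:
                data[key] = line.strip()
        else:
            # rule 3: an unindented line is a key
            key = line.strip()
    return data


sample = io.StringIO(
    "ConfigFile 1.1\n"
    ";\n"
    "; Version: 4.0.32.1\n"
    '; Date="2021/04/08" Time="11:54:46" UTC="8"\n'
    ";\n"
    "Name\n"
    "\tJohn Legend\n"
    "Type\n"
    "\tStudent\n"
    "Number\n"
    "\ts1054520\n"
)
data = parse_lines(sample)
print(data["Name"])
```

Taking an iterable rather than a filename keeps the parsing logic separate from file handling, so the same function can be fed an open file object.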
Assuming the label lines appear in the file in the same order as in the list, you can simply scan for them:
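labelList = ["Name", "Type", "Number"]
captures = dict()
with open("Data.txt", "rt") as f:
    for label in labelList:
        # Advance to the line that starts with the label; stop at end of file
        # so a missing label cannot loop forever
        while (line := f.readline()) and not line.startswith(label):
            pass
        # The value is on the line right after the label
        captures[label] = f.readline().strip()
for label in labelList:
    print(f"{label} : {captures[label]}")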