Search an external file for a specific word and store the very next word in a variable in Python
I have a file that contains a line similar to this:
"string" "playbackOptions -min 1 -max 57 -ast 1 -aet 57
Now I want to search the file, extract the value that follows "-aet" (57 in this case), and store it in a variable.
I am using
import mmap

# Open in binary mode: in Python 3 an mmap exposes bytes, so the needle must be bytes too
with open('file.txt', 'rb') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find(b'-aet') != -1:
        print('true')
for the search, but I can't get any further than this.
I would suggest using regular expressions to extract the value:
import re

# Open the file for reading
with open("file.txt", "r") as f:
    # Loop through all the lines:
    for line in f:
        # Find an exact match
        # ".*" skips other options,
        # (?P<aet_value>\d+) makes a search group named "aet_value"
        # if you need other values from that line just add them here
        line_match = re.search(r"\"string\" \"playbackOptions .* -aet (?P<aet_value>\d+)", line)
        # No match, search next line
        if not line_match:
            continue
        # The group only matches digits, so it's safe to convert to int
        aet_value = int(line_match.group("aet_value"))
        # Do whatever you need
        print("Found aet_value: {}".format(aet_value))
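If you would rather keep the mmap approach from the question, a pattern of the same kind can be applied to the mapped file directly. This is only a minimal sketch, assuming Python 3; the helper name find_aet and the simplified pattern (just "-aet" followed by digits) are my own illustrative choices, not part of the code above.

```python
import mmap
import re

# Bytes pattern, because an mmap object exposes the file contents as bytes.
# Simplified on purpose: it only looks for "-aet" followed by digits.
AET_PATTERN = re.compile(rb"-aet\s+(\d+)")


def find_aet(path):
    """Return the integer after the first "-aet" in the file, or None."""
    with open(path, "rb") as f:
        # Note: mmap raises ValueError if the file is empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            match = AET_PATTERN.search(mapped)
            if match is None:
                return None
            # Convert while the mapping is still open.
            return int(match.group(1))
```

Calling aet_value = find_aet("file.txt") then gives you the number in a variable, or None when the tag is not present.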
Here is another approach that uses only built-in string and list methods, since I tend to forget regex syntax when I haven't touched it for a while:
tag = "-aet"  # Define what tag we're looking for.

with open("file.txt", "r") as f:  # Read file.
    for line in f:  # Loop through every line.
        line_split = line.split()  # Split line by whitespace.
        if tag in line_split and line_split[-1] != tag:  # Check if the tag exists and that it's not the last element.
            try:
                index = line_split.index(tag) + 1  # Get the tag's index and increase by one to get its value.
                value = int(line_split[index])  # Convert string to int.
            except ValueError:
                continue  # We use try/except in case the value cannot be cast to an int. This may be omitted if the data is reliable.
            print(value)  # Now you have the value.
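I haven't benchmarked it, but regular expressions usually carry more overhead than plain string methods, so this may run faster, especially on very large files.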
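For anyone who wants to check that speed claim, here is a rough benchmark sketch using timeit. The sample data (the line from the question repeated 10,000 times), the function names with_regex and with_split, and the simplified regex are all assumptions made for illustration; real results will depend on your file and Python version.

```python
import re
import timeit

# Hypothetical sample data: the line from the question repeated many times.
LINE = '"string" "playbackOptions -min 1 -max 57 -ast 1 -aet 57\n'
LINES = [LINE] * 10000

# Precompiled, simplified pattern: "-aet", one space, then digits.
AET_RE = re.compile(r"-aet (\d+)")


def with_regex(lines):
    """Collect every -aet value using a precompiled regular expression."""
    values = []
    for line in lines:
        match = AET_RE.search(line)
        if match:
            values.append(int(match.group(1)))
    return values


def with_split(lines, tag="-aet"):
    """Collect every -aet value using str.split and list.index."""
    values = []
    for line in lines:
        parts = line.split()
        if tag in parts and parts[-1] != tag:
            try:
                values.append(int(parts[parts.index(tag) + 1]))
            except ValueError:
                continue
    return values


if __name__ == "__main__":
    # Time 20 passes over the sample data for each approach.
    for func in (with_regex, with_split):
        seconds = timeit.timeit(lambda: func(LINES), number=20)
        print("{}: {:.3f}s".format(func.__name__, seconds))
```

Both functions work on any iterable of lines, so you can pass an open file object instead of the in-memory list to measure against your real data.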