Search multiple strings (from a file) in a file and print the line
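Apologies again for the newbie question: I am trying to use the code below to read multiple strings from a keywords file, search for them in f, and print each matching line.
It works if I have only one keyword, but not if I have more than one.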
keywords = input("Please Enter keywords path as c:/example/ \n :")
keys = open((keywords), "r").readline()
with open("c:/saad/saad.txt") as f:
    for line in f:
        if (keys) in line:
            print(line)
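One of the challenges of searching for keywords is defining what a keyword is and how the file's contents should be parsed to find the full set of keywords. If "aa" is a keyword, should it match "aaa" or "aa()"? Can keywords contain digits?
A simple solution is to say that keywords are only letters and must exactly match a contiguous run of letters, ignoring case. Matching is also considered line by line, not sentence by sentence. We can use a regular expression to find the letter sequences and sets to check for membership (note that `\w+`, as used here, also matches digits and underscores), like this: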
keys.txt
aa bb
test.txt
aa is good
AA is good
bb is good
cc is not good
aaa is not good
test.py
import re

keyfile = "keys.txt"
testfile = "test.txt"

# Read the keywords from the first line of the key file, lowercased.
with open(keyfile) as kf:
    keys = set(key.lower() for key in re.findall(r'\w+', kf.readline()))

with open(testfile) as f:
    for line in f:
        # Compare whole words only, ignoring case.
        words = set(word.lower() for word in re.findall(r'\w+', line))
        if keys & words:
            print(line, end='')
Result:
aa is good
AA is good
bb is good
Add more rules for what counts as a match and it gets more complicated.
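As one illustration of a different rule set, here is a minimal sketch that keeps whole-word, case-insensitive matching but expresses it as a single compiled pattern instead of set intersection. The function names `build_keyword_pattern` and `matching_lines` are made up for this example, and it assumes a non-empty keyword list whose entries start and end with word characters (otherwise `\b` will not behave as intended).

```python
import re

def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern matching any keyword as a whole word."""
    # re.escape keeps characters such as "." literal; \b anchors each
    # alternative at word boundaries so "aa" does not match inside "aaa".
    alternatives = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

def matching_lines(lines, keywords):
    """Return the lines that contain at least one keyword."""
    pattern = build_keyword_pattern(keywords)
    return [line for line in lines if pattern.search(line)]

# Same sample data as test.txt above, held in memory for the demo.
sample = [
    "aa is good\n",
    "AA is good\n",
    "bb is good\n",
    "cc is not good\n",
    "aaa is not good\n",
]
for line in matching_lines(sample, ["aa", "bb"]):
    print(line, end="")
```

With this rule, "aa" also matches a line containing "aa()", because the parenthesis is a word boundary. Whether that is desirable depends on how you define a keyword.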
EDIT
Assuming you have one keyword per line and you only want a substring match (i.e. "aa" matches "aaa") rather than a keyword search, you can do this:
keyfile = "keys.txt"
testfile = "test.txt"

# One keyword per line; skip blank lines.
with open(keyfile) as kf:
    keys = [key for key in (line.strip() for line in kf) if key]

with open(testfile) as f:
    for line in f:
        for key in keys:
            if key in line:
                print(line, end='')
                break  # print each matching line only once
But I am only guessing at what your criteria are.
keywords = input("Please Enter keywords path as c:/example/ \n :")
with open(keywords, "r") as kf:
    keys = kf.readline().split(',')  # separates key strings
with open("c:/saad/saad.txt") as f:
    for line in f:
        for key in keys:
            if key.strip() in line:
                print(line)
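You are reading the line in as a single string. You need to make a list of the comma-separated strings, then test each line against each key (stripping the whitespace around the key).
This assumes your keywords file looks like this: aa good, bb good, spam, eggs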
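In a loop like the one above, a line that contains several keys is printed once per key, and an empty key (for example from a trailing comma) matches every line, since `"" in line` is always true. A possible variant, sketched here with made-up helper names `parse_keywords` and `lines_containing_any` and in-memory sample data in place of the real files, strips the keys once, drops empty ones, and uses `any()` so each line is reported at most once:

```python
def parse_keywords(text):
    """Split comma-separated text into stripped, non-empty keywords."""
    return [key.strip() for key in text.split(",") if key.strip()]

def lines_containing_any(lines, keywords):
    """Return each line once if any keyword is a substring of it."""
    return [line for line in lines if any(key in line for key in keywords)]

# Hypothetical keyword line with a trailing comma, as it might be read from a file.
keywords = parse_keywords("aa good, bb good, spam, eggs,\n")

# Hypothetical file contents; an open file object could be passed instead.
sample = ["spam and eggs\n", "nothing here\n", "bb good enough\n"]
for line in lines_containing_any(sample, keywords):
    print(line, end="")
```

This is still plain substring matching, so "spam" would also match "spammer".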
# The easiest one...
def strsearch():
    x = 'Product Name'
    y = 'Problem Description'
    z = 'Resolution Summary'
    # Read-only access is enough; the with block closes the file.
    with open('logfile.txt', mode='r') as fopen:
        for line in fopen:
            if x in line:
                print(line)
            if y in line:
                print(line)
            if z in line:
                print(line)

strsearch()
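If the list of strings grows, the three separate `if` tests can be folded into one check over a tuple. The following is only a sketch under that assumption: `find_lines` and `NEEDLES` are names invented here, the sample log lines are made up, and reporting the line number is an optional extra that the original does not do.

```python
NEEDLES = ("Product Name", "Problem Description", "Resolution Summary")

def find_lines(lines, needles):
    """Yield (line_number, text) for every line containing any of the needles."""
    for number, line in enumerate(lines, start=1):
        if any(needle in line for needle in needles):
            # Drop the trailing newline so print() does not double-space.
            yield number, line.rstrip("\n")

# In-memory stand-in for logfile.txt; an open file object works the same way.
log = [
    "Product Name: Widget\n",
    "Unrelated line\n",
    "Resolution Summary: replaced part\n",
]
for number, text in find_lines(log, NEEDLES):
    print(f"{number}: {text}")
```

Because `lines` is only iterated, passing the file object from `with open(...) as f:` reads the log lazily instead of loading it all with `readlines()`.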