How to print the portion of a string from 'startswith' to 'endswith'
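I would like to save the parts of the original text file that lie between a 'startswith' string and an 'endswith' string to a new text file.
Example: the input text file contains the following lines: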
...abc…
...starts with string...
...def...
...ends with string...
...ghi...
...jkl...
...starts with string...
...mno...
...ends with string...
...pqr...
I am interested in extracting the following lines to the output text file:
starts with string...def...ends with string
starts with string...mno...ends with string
My code below returns an empty list [ ]. Please help me correct it.
with open('file_in.txt','r') as fi:
    id = []
    for ln in fi:
        if ln.startswith("start with string"):
            if ln.endswith("ends with string"):
                id.append(ln[:])

with open(file_out.txt, 'a', encoding='utf-8') as fo:
    fo.write (",".join(id))
print(id)
I expect file_out.txt to contain every string that starts with "start with string" and ends with "ends with string".
startswith and endswith return True or False rather than a position you can use to slice your string. Try find or index instead. For example:
start = 'starts with string'
end = 'ends with string'
s = '...abc… ...starts with string... ...def... ...ends with string... ...ghi...'
sub = s[s.find(start):s.find(end) + len(end)]
print(sub)
# starts with string... ...def... ...ends with string
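You will need to add checks inside your loop to see whether the start and end strings are actually present, because find returns -1 when there is no match, and that leads to unexpected slices.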
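The checks mentioned above could look something like the following. This is only a minimal sketch: the helper name slice_between is made up for illustration, and it assumes both markers are expected inside a single string.

```python
def slice_between(text, start, end):
    """Return text from `start` through `end`, or None if either is missing."""
    begin = text.find(start)
    if begin == -1:
        return None  # start marker not present
    # Look for the end marker only after the start marker.
    finish = text.find(end, begin + len(start))
    if finish == -1:
        return None  # end marker not present after the start marker
    return text[begin:finish + len(end)]


s = '...abc… ...starts with string... ...def... ...ends with string... ...ghi...'
print(slice_between(s, 'starts with string', 'ends with string'))
# starts with string... ...def... ...ends with string
print(slice_between('no markers here', 'starts with string', 'ends with string'))
# None
```

Returning None instead of slicing with -1 lets the caller skip lines that do not contain both markers.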
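You can use a separate variable to indicate whether the current line is part of a section of interest, and toggle this variable based on the start and stop markers. You can then also turn this function into a generator: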
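def extract(fh, start, stop):
    sub = False  # True while we are inside a section of interest
    for line in fh:
        sub |= start in line  # switch on at the start marker
        if sub:
            yield line
            sub ^= stop in line  # switch off at the stop marker

with open('test.txt') as fh:
    print(''.join(extract(fh, 'starts with string', 'ends with string')))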
In Python 3.8 you can use assignment expressions:
import itertools as it

def extract(fh, start, stop):
    while any(start in (line := x) for x in fh):
        yield line
        yield from it.takewhile(lambda x: stop not in x, ((line := y) for y in fh))
        yield line

with open('test.txt') as fh:
    print(''.join(extract(fh, 'starts with string', 'ends with string')))
Variation: excluding the start and stop markers
If you want to exclude the start and stop markers from the output, we can again use itertools.takewhile:
import itertools as it

def extract(fh, start, stop):
    while any(start in x for x in fh):
        yield from it.takewhile(lambda x: stop not in x, fh)

with open('test.txt') as fh:
    print(''.join(extract(fh, 'starts with string', 'ends with string')))
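One detail worth spelling out: the variation above works because any and takewhile both consume the same file iterator, so each picks up where the other stopped. If you pass a list instead of an open file, every loop would restart from the first element. The sketch below (the name extract_inner and the sample list are only for illustration) wraps the input in iter() so that it behaves the same way for a list of lines.

```python
import itertools as it


def extract_inner(lines, start, stop):
    """Yield the lines strictly between each start and stop marker."""
    lines = iter(lines)  # make sure every loop below shares one iterator
    # any() consumes lines up to and including the start marker line.
    while any(start in x for x in lines):
        # takewhile() yields until it has consumed the stop marker line.
        yield from it.takewhile(lambda x: stop not in x, lines)


sample = [
    '...abc...\n',
    '...starts with string...\n',
    '...def...\n',
    '...ends with string...\n',
    '...ghi...\n',
    '...jkl...\n',
    '...starts with string...\n',
    '...mno...\n',
    '...ends with string...\n',
    '...pqr...\n',
]
print(list(extract_inner(sample, 'starts with string', 'ends with string')))
# ['...def...\n', '...mno...\n']
```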
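Every line read from the file ends with a newline character. I am assuming here that "start with string" and "ends with string" are on the same line. If that is not the case, add id.append(ln[:]) directly below the first if statement.
Try
ln.endswith("ends with string" + '\n')
or, if the file keeps its Windows line endings (for example when it is opened with newline=''),
ln.endswith("ends with string" + '\r\n')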
# Raw strings keep the backslashes literal ('\t' would otherwise be a tab).
with open(r'C:\Py\testing.txt', 'r') as fi:
    id = []
    copy_line = False  # True while we are inside a section of interest
    for ln in fi:
        if "starts with string" in ln:
            copy_line = True
        if copy_line:
            id.append(ln[:])
        if "ends with string" in ln:
            copy_line = False

with open(r'C:\Py\testing_out.txt', 'a', encoding='utf-8') as fo:
    fo.write(",".join(id))
print(id)
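If you would rather get one string per section, closer to the output shown in the question, the same flag idea can be extended to strip the line endings and trim anything outside the markers. This is only a sketch: collect_sections is a hypothetical helper, it joins the lines of a section without a separator, and it silently drops a section that never reaches its stop marker.

```python
import io


def collect_sections(lines, start, stop):
    """Return one string per section running from `start` through `stop`."""
    sections = []
    current = None  # None means we are outside a section
    for line in lines:
        text = line.rstrip('\r\n')  # drop the trailing line ending
        if current is None:
            pos = text.find(start)
            if pos == -1:
                continue  # still outside a section
            current = []
            text = text[pos:]  # drop anything before the start marker
        pos = text.find(stop)
        if pos != -1:
            # Keep the text up to and including the stop marker, then close.
            current.append(text[:pos + len(stop)])
            sections.append(''.join(current))
            current = None
        else:
            current.append(text)
    return sections


# An in-memory stand-in for the input file, so the sketch runs on its own.
sample = io.StringIO(
    '...abc...\n'
    '...starts with string...\n'
    '...def...\n'
    '...ends with string...\n'
    '...ghi...\n'
    '...starts with string...\n'
    '...mno...\n'
    '...ends with string...\n'
    '...pqr...\n'
)
for section in collect_sections(sample, 'starts with string', 'ends with string'):
    print(section)
# starts with string......def......ends with string
# starts with string......mno......ends with string
```

To write the result to a file, you could pass an open file handle in place of the StringIO object and write '\n'.join(...) of the returned list.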