Python: How do you split a string if the line starts with "ggggg"?

Question

This seems like it should be simple, but for some reason I can't get Python to split correctly on the following.

f = open('text', 'r')
x = f.read()
f.close()
result = x.split('^ggggg', 1)[0]

The file "text" has the following contents:

aaaaa1234
bbbbb1234
ccccc1234
ggggg1234
hhhhh1234

I expected "result" to contain everything before the ggggg line, but it contains the entire text. How do I get Python to split at the point where a line starts with "ggggg"?

Answer 1

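First, str.split() only splits on literal text or, when the separator is None (the default), on arbitrary whitespace. Regular expressions are not supported, so '^ggggg' is searched for as a literal caret followed by ggggg. You can split the file contents on \nggggg instead: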

x.split('\nggggg', 1)[0]

If you must use a regular expression, use the re.split() function.
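A minimal sketch of the re.split() route, assuming the file contents are already in a string. The sample text is copied from the question, and the names text and before are only illustrative. The re.MULTILINE flag makes ^ match at the start of every line instead of only at the start of the string:

```python
import re

# Sample contents copied from the question; `text` is an illustrative name.
text = "aaaaa1234\nbbbbb1234\nccccc1234\nggggg1234\nhhhhh1234\n"

# With re.MULTILINE, ^ anchors at the start of each line.
# maxsplit=1 stops after the first match; [0] keeps the part before it.
before = re.split(r"^ggggg", text, maxsplit=1, flags=re.MULTILINE)[0]
print(before)
```

Unlike splitting on '\nggggg', this keeps the newline that ends the preceding line, and it also matches when the ggggg line is the first line of the text.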

For efficiency, you can instead loop over the lines, test whether each line starts with ggggg, and stop iterating there:

result = []

with open('text', 'r') as f:
    for line in f:
        if line.startswith('ggggg'):
            break
        result.append(line)

This way you don't have to read the whole file. You can also use itertools.takewhile():

from itertools import takewhile
with open('text', 'r') as f:
    result = list(takewhile(lambda l: not l.startswith('ggggg'), f))

Both options produce a list of strings.
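If a single string is wanted rather than a list, the lines can be joined back together. A small sketch, using io.StringIO as a stand-in for the open file so that it runs without a file on disk (the names lines and joined are illustrative):

```python
from io import StringIO
from itertools import takewhile

# StringIO stands in for the open file; iterating it yields lines like a file does.
f = StringIO("aaaaa1234\nbbbbb1234\nccccc1234\nggggg1234\nhhhhh1234\n")

lines = list(takewhile(lambda l: not l.startswith("ggggg"), f))

# Each line keeps its trailing newline, so join with an empty separator.
joined = "".join(lines)
print(joined, end="")
```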

Answer 2

str.split() does not take a regular expression.

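However, you can use the string '\nggggg', which will match as long as the ggggg line is not the very first line of the file (there is no preceding \n in that case).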
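One way to cover the top-of-file case is to check for it before splitting. The helper below is a hypothetical sketch (before_marker is not a standard function), assuming the whole file has been read into a string:

```python
def before_marker(text, marker="ggggg"):
    """Return everything before the first line that starts with `marker`.

    Hypothetical helper for illustration only.
    """
    if text.startswith(marker):
        # The marker line is the very first line, so nothing precedes it.
        return ""
    # Otherwise a marker line is always preceded by a newline.
    # If the marker never appears, the whole text is returned.
    return text.split("\n" + marker, 1)[0]
```

Note that the newline before the marker line is consumed by the split, so the result does not end with '\n'.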

Another possibility is to use the regular expression functions in the re module.

Answer 3

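It would be better not to read the whole file, but for general knowledge, here is an easy way to handle your problem with plain string operations: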

result = x[0:x.find("ggggg")]
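One caveat with this approach: str.find() returns -1 when the substring is absent, so the slice silently drops the last character instead of returning the whole text. It also matches "ggggg" anywhere, not only at the start of a line. A hedged sketch of a guard for the first issue (the names idx, naive and result are illustrative):

```python
# Sample text with no ggggg line at all.
x = "aaaaa1234\nbbbbb1234\n"

idx = x.find("ggggg")   # -1 because the substring is missing

# Naive slice: x[0:-1] drops the final character.
naive = x[0:idx]

# Guarded version: fall back to the whole text when nothing is found.
result = x if idx == -1 else x[:idx]
```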

Answer 4

If I understand your question correctly, you want result to be set to everything before the ggggg line?

You could try the following:

result = ''
with open('text', 'r') as f:  # open the file 'text' read-only
    for line in f:  # for each line...
        if line.startswith('ggggg'):  # stop at the first line that starts with 'ggggg'
            break
        result += line  # each line already ends with '\n', so append it as-is
# no f.close() needed: the with block closes the file

If you want it to skip the ggggg line and keep everything else, try:

result = ''
with open('text', 'r') as f:  # open the file 'text' read-only
    for line in f:  # for each line...
        if not line.startswith('ggggg'):  # skip lines that start with 'ggggg'
            result += line  # each line already ends with '\n', so append it as-is
# no f.close() needed: the with block closes the file

Answer 5

The split function isn't needed at all. I got the same result with simple string functions. Apologies if you strictly need an answer that uses lists and split.

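#!/usr/bin/env python3
fh = open('text', 'r')

for line in fh:
    # stop at the first line that starts with 'ggggg'
    if line.startswith('ggggg'):
        break
    print(line, end='')  # line already ends with a newline

print("DONE")
fh.close()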
