Output of split is not what I was expecting

Question

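I'm just learning Python and I'm having trouble reading a .txt file that I created. My objective: I have a txt file containing lists of strings. I'm trying to read it, process it, and save every word into a new list.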

The example2.txt file: [one, two, THREE, one, two, ten, eight,cat, dog, bird, fish] [Alonso, Alicia, Bob, Lynn] , [red, blue, green, pink, cyan]

My output: ['one, two, THREE, one, two, ten, eight, cat, dog, bird, fish]\n'] ['Alonso, Alicia, Bob, Lynn], [red, blue, green, pink, cyan']

What I was expecting is something like this: ['one','two','THREE','one','two','ten','eight','cat','dog','bird','fish','Alonso','Alicia','Bob','Lynn','red','blue','green','pink','cyan']

My code in Python. This is what I have tried; you can ignore the comments.

import re
# Creating a variable to store later the contents of the file
list_String = []
# Reading the file
file = open("D:\dir\example2", "r")

for line in file:
    print(re.split('^[\s].', line.strip(' ][')))
    #list_String.append(line.strip('[]').strip("\n").split(","))
    #list_String = re.split(r'[^\S\t.]', line)
    #print(line.split(r"\S"))
    #print(line)

#print(list_String)

file.close()

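I have also been reading the re documentation, but I find it hard to understand, and I don't know whether that's just me.

I tried experimenting with what I read, but I'm still not getting what I want.

I even tried this: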

print(line.strip('][').strip('\n').strip(']').split(","))

Output:

['one', ' two', ' THREE', ' one', ' two', ' ten', ' eight', 'cat', ' dog', ' bird', ' fish']
['Alonso', ' Alicia', ' Bob', ' Lynn] ', ' [red', ' blue', ' green', ' pink', ' cyan']

As you can see, it kind of works. However, between Lynn and red, the square brackets and the comma did not disappear.

Thank you for your time and help.

Answer 1

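You might find that re.findall with the pattern \w+ works here. Instead of splitting on separators, it collects every run of word characters (letters, digits, and underscores), so the brackets, commas, and spaces are simply never matched: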

import re

inp = "[one, two, THREE, one, two, ten, eight,cat, dog, bird, fish] [Alonso, Alicia, Bob, Lynn] , [red, blue, green, pink, cyan]"
words = re.findall(r'\w+', inp)
print(words)

This prints:

['one', 'two', 'THREE', 'one', 'two', 'ten', 'eight', 'cat', 'dog', 'bird', 'fish',
 'Alonso', 'Alicia', 'Bob', 'Lynn', 'red', 'blue', 'green', 'pink', 'cyan']
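
The same idea can be applied while reading the file line by line. The following is only a minimal sketch: the helper name read_words is made up for illustration, and a temporary file with contents copied from the question stands in for example2.txt, since the real path is specific to the asker's machine.

```python
import re
import tempfile
from pathlib import Path


def read_words(path):
    """Return every run of word characters found in the file at `path`.

    `read_words` is a hypothetical helper name, not part of any library.
    """
    words = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            # \w+ matches letters, digits and underscores, so brackets,
            # commas, spaces and newlines are never part of a match.
            words.extend(re.findall(r"\w+", line))
    return words


# Assumption: a temporary file stands in for the asker's example2.txt.
sample = (
    "[one, two, THREE, one, two, ten, eight,cat, dog, bird, fish]\n"
    "[Alonso, Alicia, Bob, Lynn] , [red, blue, green, pink, cyan]\n"
)
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "example2.txt"
    sample_path.write_text(sample, encoding="utf-8")
    result = read_words(sample_path)

print(result)
```

Using with closes the file automatically, so no explicit file.close() is needed. Note that \w+ would also break apart entries containing spaces or hyphens (for example "Mary-Ann"), so this approach fits the sample data but may need a different pattern for other inputs.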

Tags: python, file-io, split, list, python-re