How to remove certain strings from a list?

Question

I am trying to remove strings from a list when they start with or contain "@", "#", "http", or "rt". Here is a sample list.

text_words1 = ['@football', 'haberci', '#sorumlubenim', 'dedigin', 'tarafsiz', 'olurrt', '@football', 'saysaniz', 'olur', '#sorumlubenim', 'korkakligin', 'sonu']

From the list above, I want to remove "@football" and "#sorumlubenim". I tried the code below.

 i = 0
 while i < len(text_words1):
     if text_words1[i].startswith('@'):
         del text_words1[i] 
     if text_words1[i].startswith('#'):
         del text_words1[i] 
     i = i+1
 print 'The updated list is: \n', text_words1

However, the code above removes only some of the strings, not all of the ones that start with "@" or "#".

I then added the code below to the code above, because not all of the strings I am interested in start with "@", "#", or "http"; some merely contain them.

 while i < len(text_words1):
     if text_words1[i].__contains__('@'):
         del text_words1[i] 
     if text_words1[i].__contains__('#'):
         del text_words1[i]
     if text_words1[i].__contains__('http'):
        del text_words1[i]
     i = i+1
 print 'The updated list: \n', text_words1

This code removed some, but not all, of the items containing "#" or "@".

Can someone tell me how to remove all items that start with or contain "@", "#", "http", or "rt"?

Answer 1

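As the comments point out, your approach deletes items while the index keeps advancing, so the loop loses track of its position and does not visit every element of the list. You can use a list comprehension to drop the unwanted words: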

new_list = [i for i in text_words1 if not i.startswith(('@', '#'))]
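
The comprehension above only covers words that start with a marker. Since the question also asks about words that contain one, the same idea can be extended with any(). This is only a minimal sketch: the names markers and cleaned are illustrative and do not come from the question.

```python
text_words1 = ['@football', 'haberci', '#sorumlubenim', 'dedigin', 'tarafsiz',
               'olurrt', '@football', 'saysaniz', 'olur', '#sorumlubenim',
               'korkakligin', 'sonu']

# Hypothetical name: the substrings that disqualify a word.
markers = ('@', '#', 'http', 'rt')

# Keep a word only if none of the markers occurs anywhere in it.
# "m in w" is a substring test, so it also covers the "starts with" case.
cleaned = [w for w in text_words1 if not any(m in w for m in markers)]
print(cleaned)
```

Note that this builds a new list instead of deleting from the original, which avoids the index problem entirely.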

Answer 2

Here is my solution:

import re
text_words1 = ['@football', 'haberci', '#sorumlubenim', 'dedigin', 'tarafsiz', 'olurrt', '@football', 'saysaniz', 'olur', '#sorumlubenim', 'korkakligin', 'sonu']
for i, word in reversed(list(enumerate(text_words1))):
    if re.search('(@|#|http|rt)', word):
        del text_words1[i]

With a list comprehension:

text_words1 = [w for w in text_words1 if not re.search('(@|#|http|rt)', w)]

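Note that I am using re.search because it checks for a match anywhere in the string, whereas re.match checks only at the beginning of the string. This matters because you want to remove words that start with and/or contain these characters.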
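
A small sketch of that difference, using the word 'olurrt' from the sample list. The variable names pattern, at_start, and anywhere are only illustrative.

```python
import re

pattern = '(@|#|http|rt)'

# re.match only tries the pattern at position 0 of the string,
# so a marker at the end of the word is not found.
at_start = re.match(pattern, 'olurrt')

# re.search scans the whole string and finds the trailing 'rt'.
anywhere = re.search(pattern, 'olurrt')

print(at_start is None)
print(anywhere.group())
```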

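The problem with your snippet is that you delete items while iterating. Each deletion shifts the remaining items one position to the left while i still advances, so the len(text_words1) loop condition does not guarantee that every item gets checked. Add a print statement inside the while loop and you will see what I mean.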
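
If you would rather keep the while loop, one way to avoid skipping items is to advance the index only when nothing was deleted. The following is a sketch under that assumption. The function name remove_marked and the short sample list are hypothetical, chosen so that two marked words sit next to each other, which is the case where always incrementing i skips a word.

```python
def remove_marked(words):
    """Delete, in place, every word that starts with '@' or '#'."""
    i = 0
    while i < len(words):
        if words[i].startswith(('@', '#')):
            # After del, the next word slides into position i,
            # so i must stay where it is.
            del words[i]
        else:
            # Nothing was removed, so it is safe to move on.
            i += 1
    return words

sample = ['@a', '@b', 'x', '#c']
print(remove_marked(sample))
```

With the original loop, '@b' would move to index 0 after '@a' is deleted and would never be examined, because i has already moved on to 1.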


Tags: python, string, contains, startswith