Count spaces in text (treating consecutive spaces as one)

Question

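How can I count the number of spaces or newline characters in a text so that consecutive whitespace is counted only once? For example, this comes very close to what I want: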

string = "This is an  example text.\n   But would be good if it worked."
counter = 0
for i in string:
    if i == ' ' or i == '\n':
        counter += 1
print(counter)

However, this returns 15, whereas the result should be 11.

Answer 1

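Just store the last character found. Set it to i on every pass through the loop. Then, inside your if, don't increment the counter when the last character found was also a whitespace character.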
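A minimal sketch of that idea, using the example string from the question. The function name count_whitespace_runs and the variable last are illustrative and not part of the original answer:

```python
def count_whitespace_runs(text):
    """Count runs of spaces/newlines, treating consecutive ones as one."""
    counter = 0
    last = ''  # last character seen; empty before the loop starts (assumption)
    for i in text:
        if i == ' ' or i == '\n':
            # Only count when the previous character was not whitespace too
            if last != ' ' and last != '\n':
                counter += 1
        last = i  # remember the current character for the next iteration
    return counter


string = "This is an  example text.\n   But would be good if it worked."
print(count_whitespace_runs(string))  # 11
```

Written this way, a run of whitespace at the very start or end of the text is counted as well.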

Answer 2

You can iterate over numbers and use them as indices.

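counter = 0
# Start at index 1 so that string[i-1] is always the preceding character
for i in range(1, len(string)):
    if string[i] in ' \n' and string[i-1] not in ' \n':
        counter += 1
# The first character has no predecessor, so check it separately
if string and string[0] in ' \n':
    counter += 1
print(counter)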

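Pay attention to the first character: the loop starts at the second character, because at index 0 the expression string[i-1] would be string[-1], which is the last character of the string rather than a preceding one.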

Answer 3

You can do this:

string = "This is an  example text.\n   But would be good if it worked."
counter = 0
# A boolean flag indicating whether the previous character was a space
previous = False 
for i in string:
    if i == ' ' or i == '\n': 
        # The current character is a space
        previous = True # Setup for the next iteration
    else:
        # The current character is not a space, check if the previous one was
        if previous:
            counter += 1

        previous = False
print(counter)

Answer 4

re to the rescue.

>>> import re
>>> string = "This is an  example text.\n   But would be good if it worked."
>>> spaces = sum(1 for match in re.finditer(r'\s+', string))
>>> spaces
11

This uses the least memory. An alternative that builds a temporary list is

>>> len(re.findall(r'\s+', string))
11

If you only want to consider space characters and newlines (and not tabs, for example), use the regex '(\n| )+' instead of '\s+'.
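To illustrate the difference, here is a small sketch with a tab in the input. The sample string and the variable names are made up for illustration, and the character class [ \n]+ is used as an assumed stand-in for '(\n| )+' that yields the same count without a capturing group:

```python
import re

# Hypothetical sample: a tab between "a" and "b", then spaces and a newline
sample = "a\tb  c\n d"

# \s+ matches any whitespace, so the tab counts as a run of its own
all_runs = len(re.findall(r'\s+', sample))

# [ \n]+ matches only spaces and newlines, so the tab is ignored
space_newline_runs = len(re.findall(r'[ \n]+', sample))

print(all_runs)            # 3
print(space_newline_runs)  # 2
```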

Answer 5

Assuming you are allowed to use Python regular expressions:

import re
print(len(re.findall(r"[ \n]+", string)))

Quick and simple!

Update: alternatively, use [\s] instead of [ \n] to match any whitespace character.

Answer 6

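By default, str.split() treats a run of consecutive whitespace as a single separator. So simply split the string, take the size of the resulting list, and subtract one.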

len(string.split())-1
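One caveat worth checking: this shortcut appears to assume that the text contains at least one word and has no leading or trailing whitespace. A quick sketch with hypothetical inputs (the helper name runs_via_split is illustrative):

```python
def runs_via_split(text):
    # Number of words minus one, as in the one-liner above
    return len(text.split()) - 1


print(runs_via_split("one two  three"))  # 2
print(runs_via_split(" one two "))       # 1, although there are 3 runs
print(runs_via_split(""))                # -1
```

If such inputs can occur, one of the character-by-character or regex approaches may be the safer choice.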

Answer 7

You can use enumerate and check that the next character is not also whitespace, so consecutive whitespace only counts as 1:

string = "This is an  example text.\n   But would be good if it worked."

print(sum(ch.isspace() and not string[i:i+1].isspace() for i, ch in enumerate(string, 1)))

You can also use iter with a generator function, keeping track of the last character and comparing:

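def con(s):
    it = iter(s)
    prev = next(it, None)
    if prev is None:
        return  # empty string: nothing to count
    for ele in it:
        # True at the last whitespace character of each run
        yield prev.isspace() and not ele.isspace()
        prev = ele
    # Count a run that reaches the end of the string
    yield prev.isspace()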

print(sum(con(string)))

An itertools version:

string = "This is an  example text.\n     But would be good if it worked.  "

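from itertools import tee, zip_longest  # izip_longest in Python 2

first, second = tee(string)
next(second, None)  # advance the second iterator by one character
print(sum(a.isspace() and not b.isspace() for a, b in zip_longest(first, second, fillvalue="")))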

Answer 8

Try:

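def word_count(my_string):
    # Start at 1: a text with n whitespace runs between words has n + 1 words
    count = 1
    for i in range(1, len(my_string)):
        # Count the first character of each run of spaces or newlines
        if my_string[i] in " \n" and my_string[i - 1] not in " \n":
            count += 1
    return count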

Answer 9

You can use the groupby() function to find groups of consecutive whitespace:

from collections import Counter
from itertools import groupby

s = 'This is an  example text.\n   But would be good if it worked.'

c = Counter(k for k, _ in groupby(s, key=lambda x: ' ' if x == '\n' else x))
print(c[' '])
# 11

Tags: python, spaces, python-3.x