How to get a zip of all characters in a string: zip misses the final characters and itertools.zip_longest adds None

Question

I'm passing the result of itertools.zip_longest to itertools.product, but I get an error when it reaches the end and finds None.

The error I get is: Error: (, TypeError('sequence item 0: expected str instance, NoneType found',), )

If I use zip instead of itertools.zip_longest, then I don't get all the items.

Here is the code I'm using to generate the zip:

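import itertools

def grouper(iterable, n, fillvalue=None):
    # n references to the same iterator, so each tuple takes n consecutive items
    args = [iter(iterable)] * n
    print(args)
    #return zip(*args)
    return itertools.zip_longest(*args, fillvalue=fillvalue)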

sCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~`!@#$%^&*()_-+={[}]|\"""':;?/>.<,"

for x in grouper(sCharacters, 4):
    print(x)

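Here is the output. The first is from itertools.zip_longest and the second is from plain zip. You can see that the first has None items and the second is missing the final item, the comma: ','

How can I get a zip of all the characters in the string without None at the end? Or how can I avoid this error?

Thanks for your time.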

Answer 1

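The length of sCharacters is 93 (note that 92 % 4 == 0). Because zip produces only as many tuples as its shortest input allows, it misses the last element. Here all n arguments are the same iterator, so the 93rd character is consumed while building a tuple that can never be completed, and zip discards it.

Note that the None values added by itertools.zip_longest are artificial padding, which may not be the behavior everyone wants. That is why zip simply ignores the leftover values.

Edit: to be able to use zip, you can append some spaces to your string: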
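A minimal sketch of the difference, using a short five-character string rather than the one in the question (the names `s`, `with_zip` and `with_zip_longest` are illustrative only). Because every argument is the same iterator, zip consumes the leftover character and then drops the incomplete tuple, while zip_longest pads it:

```python
from itertools import zip_longest

s = "abcde"  # illustrative input: length 5 is not a multiple of n
n = 2

# zip stops as soon as one argument is exhausted; the partial group ('e',) is lost
with_zip = list(zip(*[iter(s)] * n))

# zip_longest keeps going and pads the last group with fillvalue (None by default)
with_zip_longest = list(zip_longest(*[iter(s)] * n))

print(with_zip)          # [('a', 'b'), ('c', 'd')]
print(with_zip_longest)  # [('a', 'b'), ('c', 'd'), ('e', None)]
```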

n=4
sCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~`!@#$%^&*()_-+={[}]|\"""':;?/>.<,"
if len(sCharacters) % n > 0:
    sCharacters = sCharacters + (" "*(n-len(sCharacters) % n))
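As a rough illustration of the padding approach, here is the same idea on a short string (`s` and `groups` are assumed names, not from the original post). Note that the padding spaces end up inside the last group, which may or may not be acceptable for your use case:

```python
n = 4
s = "abcde"  # illustrative string; 5 % 4 == 1, so 3 spaces are needed
if len(s) % n > 0:
    s = s + " " * (n - len(s) % n)  # pad up to the next multiple of n

# With the length now a multiple of n, zip no longer drops anything
groups = [''.join(g) for g in zip(*[iter(s)] * n)]
print(groups)  # ['abcd', 'e   ']
```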

Edit 2: to recover the tail that is lost when using zip, use code like this:

tail = '' if len(sCharacters)%n == 0 else sCharacters[-(len(sCharacters)%n):]
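Putting Edit 2 together, a sketch that joins the full groups from zip and then appends the tail as a shorter final chunk. It uses a shortened example string, and the names `chunks` and `r` are assumptions for illustration:

```python
n = 4
sCharacters = "abcdefghij"  # shortened example; 10 % 4 == 2 characters left over

# Full groups of n characters; zip silently drops the incomplete last group
chunks = [''.join(group) for group in zip(*[iter(sCharacters)] * n)]

# Recover whatever zip dropped (empty string when the length divides evenly)
r = len(sCharacters) % n
tail = '' if r == 0 else sCharacters[-r:]
if tail:
    chunks.append(tail)

print(chunks)  # ['abcd', 'efgh', 'ij']
```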

Answer 2

I've had to solve this in performance-critical situations before, so here is the fastest code I've found (it works regardless of the values in iterable):

from itertools import zip_longest

def grouper(n, iterable):
    fillvalue = object()  # Guaranteed unique sentinel, cannot exist in iterable
    for tup in zip_longest(*(iter(iterable),) * n, fillvalue=fillvalue):
        if tup[-1] is fillvalue:
            yield tuple(v for v in tup if v is not fillvalue)
        else:
            yield tup
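To see why the unique sentinel matters, here is a small sketch that repeats the function above and feeds it data that already contains None (the list `data` is a made-up example). Real None values survive, and only the padding is stripped from the last group:

```python
from itertools import zip_longest

def grouper(n, iterable):
    fillvalue = object()  # unique sentinel, so None in the data is left alone
    for tup in zip_longest(*(iter(iterable),) * n, fillvalue=fillvalue):
        if tup[-1] is fillvalue:
            # Only the last group can be padded; strip the padding from it
            yield tuple(v for v in tup if v is not fillvalue)
        else:
            yield tup

data = [None, 1, None, 2, None]  # illustrative input with real None values
groups = list(grouper(2, data))
print(groups)  # [(None, 1), (None, 2), (None,)]
```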

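As far as I can tell, the above is unbeatable when the input is long enough and the chunk size is small enough. For fairly large chunk sizes it can lose to this uglier version, but usually not by much: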

try:
    from future_builtins import map  # Only on Py2, and required there
except ImportError:
    pass  # Py3: the built-in map is already lazy
from itertools import islice, repeat, starmap, takewhile
from operator import truth  # Faster than bool when guaranteed non-empty call

def grouper(n, iterable):
    '''Returns a generator yielding n sized groups from iterable
    
    For iterables not evenly divisible by n, the final group will be undersized.
    '''
    # Can add tests to special case other types if you like, or just
    # use tuple unconditionally to match `zip`
    rettype = ''.join if type(iterable) is str else tuple

    # Keep islicing n items and converting to groups until we hit an empty slice
    return takewhile(truth, map(rettype, starmap(islice, repeat((iter(iterable), n)))))

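Both approaches seamlessly leave the final group incomplete if there aren't enough items to fill it. The second one runs very fast because, after "setup", all the work is pushed to the C layer in CPython, so no matter how long the iterable is, the Python-level work is the same and only the C-level work increases. That said, it does a lot of C work, which is why the zip_longest solution (which does much less C work, and only trivial Python-level work for every chunk except the last) usually beats it.

A slower but more readable equivalent of option #2 (skipping the dynamic return type and supporting only tuple) is: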

def grouper(n, iterable):
    iterable = iter(iterable)
    while True:
        x = tuple(islice(iterable, n))
        if not x:
            return
        yield x

Or, more concisely, with the walrus operator in Python 3.8+:

def grouper(n, iterable):
    iterable = iter(iterable)
    while x := tuple(islice(iterable, n)):
        yield x
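For the use case in the question, a sketch of how the islice-based version could produce chunks that never contain None, so that ''.join no longer raises the TypeError. `chunk_strings` is a hypothetical helper name, and a shortened string stands in for sCharacters:

```python
from itertools import islice

def grouper(n, iterable):
    iterable = iter(iterable)
    while True:
        x = tuple(islice(iterable, n))
        if not x:
            return
        yield x

def chunk_strings(s, n):
    # Hypothetical helper: join each group of characters back into a string
    return [''.join(group) for group in grouper(n, s)]

print(chunk_strings("abcdefghij", 4))  # ['abcd', 'efgh', 'ij']
```

The final chunk is simply shorter than the others, so nothing needs to be padded or filtered before the chunks are passed on.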

python

itertools

python-3.x

nonetype