How to add an extra middle step into a list comprehension?

Let's say I have a list[str] object containing timestamps in "HH:mm" format, e.g.

timestamps = ["22:58", "03:11", "12:21"]

I want to convert it to a list[int] object holding the "number of minutes since midnight" value for each timestamp:

converted = [22*60+58, 3*60+11, 12*60+21]

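...but I want to do it in style and use a single list comprehension. The (syntactically incorrect) implementation I naively came up with was something like: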

def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]

...but this doesn't work, because for hh, mm = ts.split(":") is not valid syntax.

What would be a valid way of writing the same thing?

To clarify: I can see a formally satisfying solution in the form of:

def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(ts.split(":")[0]) * 60 + int(ts.split(":")[1]) for ts in timestamps]

...but this is wasteful, and I don't want to split the string twice.

If you don't want to split the string twice, you can use the := assignment operator:

converted = [int((s := t.split(":"))[0]) * 60 + int(s[1]) for t in timestamps]
print(converted)

Prints:

[1378, 191, 741]

Alternative:

print([int(h) * 60 + int(m) for h, m in (t.split(":") for t in timestamps)])

Prints:

[1378, 191, 741]

Note: := is a feature of Python 3.8+ commonly referred to as the "walrus operator". It was proposed in PEP 572 (Assignment Expressions).
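One detail worth knowing before using := inside a comprehension: per PEP 572, the name it binds belongs to the enclosing scope, not to the comprehension. A minimal sketch of that behaviour (the function name minutes_with_walrus is made up for illustration, and it assumes a non-empty input list):

```python
def minutes_with_walrus(timestamps: list[str]) -> tuple[list[int], list[str]]:
    # `parts` is bound by := inside the comprehension, but it becomes a
    # local variable of this function rather than of the comprehension.
    minutes = [int((parts := ts.split(":"))[0]) * 60 + int(parts[1]) for ts in timestamps]
    # After the comprehension, `parts` still holds the split of the last timestamp.
    # (With an empty input list it would never be bound at all.)
    return minutes, parts


result, leftover = minutes_with_walrus(["22:58", "03:11", "12:21"])
print(result)    # [1378, 191, 741]
print(leftover)  # ['12', '21']
```

So the temporary name is still visible after the comprehension finishes, which is something to keep in mind when picking a name for it.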

You could use an inner generator expression to do the splitting:

[int(hh)*60 + int(mm) for hh, mm in (ts.split(':') for ts in timestamps)]

Although personally, I'd rather use a helper function instead:

def timestamp_to_minutes(timestamp: str) -> int:
    hh, mm = timestamp.split(":")
    return int(hh)*60 + int(mm)

[timestamp_to_minutes(ts) for ts in timestamps]

# Alternative
list(map(timestamp_to_minutes, timestamps))
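One practical upside of the helper function is that it gives you a place to put extra checks that would be awkward inside a comprehension. A rough sketch, assuming you want to reject out-of-range values (timestamp_to_minutes_checked is a hypothetical name, not something from the question):

```python
def timestamp_to_minutes_checked(timestamp: str) -> int:
    """Hypothetical variant of the helper that rejects out-of-range values."""
    # Unpacking raises ValueError unless the string contains exactly one ":".
    hh, mm = timestamp.split(":")
    hours, minutes = int(hh), int(mm)
    if not (0 <= hours < 24 and 0 <= minutes < 60):
        raise ValueError(f"not a valid HH:mm timestamp: {timestamp!r}")
    return hours * 60 + minutes


print([timestamp_to_minutes_checked(ts) for ts in ["22:58", "03:11", "12:21"]])
# [1378, 191, 741]
```

The comprehension at the call site stays just as short as before.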

For fun, we could also use operator.methodcaller:

from operator import methodcaller
out = [int(h) * 60 + int(m) for h, m in map(methodcaller("split", ":"), timestamps)]

Output:

[1378, 191, 741]
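In case methodcaller is unfamiliar: it builds a callable that invokes the named method, with the given arguments, on whatever object it is later called with. A small sketch to show the equivalence (the names split_on_colon and split_lambda are just for illustration):

```python
from operator import methodcaller

# Calling split_on_colon(s) is equivalent to s.split(":").
split_on_colon = methodcaller("split", ":")
print(split_on_colon("22:58"))  # ['22', '58']

# The lambda spelling of the same callable, for comparison.
split_lambda = lambda s: s.split(":")
print(split_lambda("22:58") == split_on_colon("22:58"))  # True
```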

Late to the party... but why not use datetime / timedelta to convert your times?

For "hh:mm" this might be overkill, but you can easily adapt it to more complex time strings:

from datetime import datetime as dt
import typing

def timestamps_to_minutes(timestamps: typing.List[str]) -> typing.List[int]:
    """Uses datetime.strptime to parse a datetime string and return
    minutes spent in this day."""
    return [int(((p := dt.strptime(t,"%H:%M")) - dt(p.year,p.month, p.day)
                 ).total_seconds()//60) for t in timestamps]

timestamps = ["22:58", "03:11", "12:21"]

print(timestamps_to_minutes(timestamps))

Output:

[1378, 191, 741]
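As an illustration of adapting this to a more complex time string, here is a sketch that assumes the input is "HH:MM:SS" and that you want whole seconds since midnight. The function name timestamps_to_seconds and the input format are assumptions for the example, not part of the question:

```python
from datetime import datetime, timedelta


def timestamps_to_seconds(timestamps: list[str]) -> list[int]:
    """Parse "HH:MM:SS" strings and return whole seconds since midnight."""
    result = []
    for t in timestamps:
        # strptime raises ValueError for malformed or out-of-range input.
        p = datetime.strptime(t, "%H:%M:%S")
        since_midnight = timedelta(hours=p.hour, minutes=p.minute, seconds=p.second)
        result.append(int(since_midnight.total_seconds()))
    return result


print(timestamps_to_seconds(["22:58:30", "03:11:00", "00:00:01"]))
# [82710, 11460, 1]
```

Only the format string and the timedelta arguments change; the parsing and validation are still left to strptime.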

Your initial pseudocode

[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]

was very close to something you can actually write:

[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm in [ts.split(':')]]

In Python 3.9, expressions like this were optimized, so that creating a single-element list in a comprehension just to immediately unpack its single element is as fast as a simple assignment.
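The same "for name in [expr]" idiom can be repeated to chain several middle steps, and the names it binds can also be used in a trailing filter. A small sketch (the afternoon-only filter is just an illustration, not something the question asked for):

```python
timestamps = ["22:58", "03:11", "12:21"]

converted = [
    total
    for ts in timestamps
    for hh, mm in [ts.split(":")]           # step 1: split once, unpack both parts
    for total in [int(hh) * 60 + int(mm)]   # step 2: give the computed value a name
    if total >= 12 * 60                     # optional: keep only noon or later
]
print(converted)  # [1378, 741]
```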

If you use generators (rather than list comprehensions) for the middle steps, the whole list is still converted in a single pass:

timestamps = ["22:58", "03:11", "12:21"]

#NOTE: Use () for generators, not [].
hh_mms = (timestamp.split(':') for timestamp in timestamps)
converted = [int(hh) * 60 + int(mm) for (hh, mm) in hh_mms]

print(converted)
# [1378, 191, 741]

This lets you split the comprehension into multiple steps, written across several lines, without having to define any functions.
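The approach extends to more than one middle step. A sketch with two chained generators (the names parts and pairs are made up here); it also shows the main catch, which is that a generator can only be consumed once:

```python
timestamps = ["22:58", "03:11", "12:21"]

# Each middle step is a generator expression, so no intermediate list is built.
parts = (ts.split(":") for ts in timestamps)
pairs = ((int(hh), int(mm)) for hh, mm in parts)
converted = [hh * 60 + mm for hh, mm in pairs]

print(converted)    # [1378, 191, 741]
print(list(parts))  # [] because the generator has already been consumed
```

If you need to reuse an intermediate result, build it with [] instead of () at that step.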