How to add an extra middle step into a list comprehension?
Suppose I have a list[str] object containing timestamps in "HH:mm" format, e.g.
timestamps = ["22:58", "03:11", "12:21"]
I want to convert it to a list[int] object holding the "number of minutes since midnight" value for each timestamp:
converted = [22*60+58, 3*60+11, 12*60+21]
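...but I want to do it in style, using a single list comprehension. My naive (and syntactically incorrect) attempt looked something like this:
def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]
...but this doesn't work, because for hh, mm = ts.split(":") is not valid syntax.
What would be a valid way of writing the same thing?
To clarify: I can see a formally satisfying solution:
def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(ts.split(":")[0]) * 60 + int(ts.split(":")[1]) for ts in timestamps]
...but this is highly inefficient, and I don't want to split the string twice.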
If you don't want to split the string twice, you can use the := assignment expression operator:
converted = [int((s := t.split(":"))[0]) * 60 + int(s[1]) for t in timestamps]
print(converted)
Prints:
[1378, 191, 741]
Alternative:
print([int(h) * 60 + int(m) for h, m in (t.split(":") for t in timestamps)])
Prints:
[1378, 191, 741]
Note: := is a feature of Python 3.8+, commonly referred to as the "walrus operator". It was proposed in PEP 572.
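For completeness, here is a minimal sketch of the walrus version dropped into the function signature from the question. The name parts is just my choice for illustration, and note that the list[str] annotations need Python 3.9+ while := itself only needs 3.8+.

```python
def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    # (parts := ts.split(":")) binds the split result inside the expression.
    # The left operand of + is evaluated first, so parts is already bound
    # when parts[1] is read and the string is split only once per item.
    return [int((parts := ts.split(":"))[0]) * 60 + int(parts[1]) for ts in timestamps]


print(timestamps_to_minutes(["22:58", "03:11", "12:21"]))
# [1378, 191, 741]
```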
You can use an inner generator expression to do the splitting:
[int(hh)*60 + int(mm) for hh, mm in (ts.split(':') for ts in timestamps)]
Although personally, I'd rather use a helper function instead:
def timestamp_to_minutes(timestamp: str) -> int:
    hh, mm = timestamp.split(":")
    return int(hh)*60 + int(mm)
[timestamp_to_minutes(ts) for ts in timestamps]
# Alternative
list(map(timestamp_to_minutes, timestamps))
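Put together as one runnable snippet, the two helper-based variants above might look like this. The variable names via_comprehension and via_map are mine, purely for illustration; the only thing worth noting is that map is lazy in Python 3, which is why it is wrapped in list().

```python
def timestamp_to_minutes(timestamp: str) -> int:
    # Split once, unpack into hours and minutes, then combine.
    hh, mm = timestamp.split(":")
    return int(hh) * 60 + int(mm)


timestamps = ["22:58", "03:11", "12:21"]

via_comprehension = [timestamp_to_minutes(ts) for ts in timestamps]
via_map = list(map(timestamp_to_minutes, timestamps))  # map() returns a lazy iterator

print(via_comprehension)
# [1378, 191, 741]
print(via_comprehension == via_map)
# True
```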
For fun, we could also use operator.methodcaller:
from operator import methodcaller
out = [int(h) * 60 + int(m) for h, m in map(methodcaller("split", ":"), timestamps)]
Output:
[1378, 191, 741]
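In case methodcaller is unfamiliar, here is a small sketch of what it does on its own. The name split_on_colon is hypothetical; the point is only that methodcaller("split", ":") builds a callable that invokes .split(":") on whatever it is given.

```python
from operator import methodcaller

# A callable equivalent to: lambda s: s.split(":")
split_on_colon = methodcaller("split", ":")

print(split_on_colon("22:58"))
# ['22', '58']

# Passing it to map() splits each timestamp exactly once.
print(list(map(split_on_colon, ["22:58", "03:11"])))
# [['22', '58'], ['03', '11']]
```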
Late to the party... but why not use datetime / timedelta to convert your times?
For "hh:mm" this may be overkill, but you can easily adapt it to more complex time strings:
from datetime import datetime as dt
import typing
def timestamps_to_minutes(timestamps: typing.List[str]) -> typing.List[int]:
    """Uses datetime.strptime to parse a datetime string and return
    minutes spent in this day."""
    return [int(((p := dt.strptime(t, "%H:%M")) - dt(p.year, p.month, p.day)
                 ).total_seconds() // 60) for t in timestamps]
timestamps = ["22:58", "03:11", "12:21"]
print(timestamps_to_minutes(timestamps))
Output:
[1378, 191, 741]
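As a rough illustration of the "more complex time strings" remark, here is one way the same idea could be adapted to "HH:MM:SS" input. The function name timestamps_to_seconds and the fmt parameter are my own assumptions, not part of the answer above, and I use an inner generator instead of the walrus operator to keep the expression shorter.

```python
from datetime import datetime


def timestamps_to_seconds(timestamps, fmt="%H:%M:%S"):
    """Parse each string with strptime and return seconds since midnight."""
    return [
        # Subtract midnight of the same (placeholder) date to get a timedelta.
        int((p - p.replace(hour=0, minute=0, second=0, microsecond=0)).total_seconds())
        for p in (datetime.strptime(t, fmt) for t in timestamps)
    ]


print(timestamps_to_seconds(["22:58:30", "03:11:05"]))
# [82710, 11465]
```

Only the format string has to change to support other layouts that strptime understands.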
Your initial pseudocode
[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]
is very close to something you can actually do:
[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm in [ts.split(':')]]
In Python 3.9, expressions like this were optimized so that creating a single-element list in a comprehension, just to immediately access its single element, is as fast as a simple assignment.
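If you want to check this on your own interpreter, a rough timing sketch along these lines should do. The function names and the input size are arbitrary choices of mine, and I deliberately show no timings, since they depend on the machine and Python version.

```python
import timeit

timestamps = ["22:58", "03:11", "12:21"] * 1000


def single_element_loop():
    # The "for hh, mm in [expr]" idiom: iterate over a one-element list.
    return [int(hh) * 60 + int(mm) for ts in timestamps for hh, mm in [ts.split(":")]]


def inner_generator():
    # Same result via an inner generator expression, for comparison.
    return [int(hh) * 60 + int(mm) for hh, mm in (ts.split(":") for ts in timestamps)]


# Both variants must agree before their timings mean anything.
assert single_element_loop() == inner_generator()

for fn in (single_element_loop, inner_generator):
    print(fn.__name__, timeit.timeit(fn, number=50))
```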
If you use generators (rather than list comprehensions) for the middle steps, the whole list is still converted in a single pass:
timestamps = ["22:58", "03:11", "12:21"]
#NOTE: Use () for generators, not [].
hh_mms = (timestamp.split(':') for timestamp in timestamps)
converted = [int(hh) * 60 + int(mm) for (hh, mm) in hh_mms]
print(converted)
# [1378, 191, 741]
You can split the comprehension into multiple steps, written over multiple lines, and you don't need to define any functions.
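To make the "single pass" point a bit more concrete, here is a sketch with one more intermediate step. The names split_parts and as_ints are hypothetical. Nothing is computed until the final list comprehension starts pulling items through the chain.

```python
timestamps = ["22:58", "03:11", "12:21"]

# Step 1: lazy, no splitting happens on this line.
split_parts = (ts.split(":") for ts in timestamps)
# Step 2: still lazy, just wraps the previous generator.
as_ints = ((int(hh), int(mm)) for hh, mm in split_parts)
# Step 3: the list comprehension drives the whole pipeline, one item at a time.
converted = [hh * 60 + mm for hh, mm in as_ints]

print(converted)
# [1378, 191, 741]

# A generator can only be consumed once, so split_parts is now exhausted.
print(list(split_parts))
# []
```

The one-shot nature of generators is the main thing to watch out for if you need the intermediate values again.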