Conditional cases in match statement python3.10 (structural pattern matching)

Question

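I'm currently developing something and was wondering whether the new match statement in Python 3.10 is suited to this kind of use case, where the cases are conditional.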

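As input I have a timestamp and a dataframe containing dates and values. The goal is to loop over all rows and add each value to the corresponding bin based on its date. Which bin a value goes into depends on its date relative to the timestamp: dates within 1 month of the timestamp go into bin 1, dates within 2 months go into bin 2, and so on.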

My current code is as follows:

bins = [0] * 7

for date, value in zip(df.iloc[:,0],df.iloc[:,1]):
    match [date,value]:
        case [date,value] if date < timestamp + pd.Timedelta(1,'m'):
            bins[0] += value
        case [date,value] if date > timestamp + pd.Timedelta(1,'m') and date < timestamp + pd.Timedelta(2,'m'):
            bins[1] += value
        case [date,value] if date > timestamp + pd.Timedelta(2,'m') and date < timestamp + pd.Timedelta(3,'m'):
            bins[2] += value
        case [date,value] if date > timestamp + pd.Timedelta(3,'m') and date < timestamp + pd.Timedelta(4,'m'):
            bins[3] += value
        case [date,value] if date > timestamp + pd.Timedelta(4,'m') and date < timestamp + pd.Timedelta(5,'m'):
            bins[4] += value
        case [date,value] if date > timestamp + pd.Timedelta(5,'m') and date < timestamp + pd.Timedelta(6,'m'):
            bins[5] += value

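Correction: originally I said this code does not work. It turns out that it does. However, I would still like to know whether this is an appropriate use of the match statement.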

Answer 1

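I'd say this is not a good use of structural pattern matching, because there is no actual structure. You are checking the values of a single object, so an if/elif chain is a better, more readable and more natural choice.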

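I also have 2 issues with the way you wrote it:

You did not consider values that fall exactly on the edges of the bins.
You are checking the same conditions twice. In a match/case, by the time a case is tested you are guaranteed that none of the previous cases matched. So you do not need the if date > timestamp + pd.Timedelta(1,'m') and ... part: if the previous check if date < timestamp + pd.Timedelta(1,'m') failed, you already know the date is not smaller. (There is the edge case of equality, but that should be handled somehow anyway.)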

All in all, I think this would be the cleaner solution:

for date, value in zip(df.iloc[:,0],df.iloc[:,1]):

    if date < timestamp + pd.Timedelta(1,'m'):
        bins[0] += value
    elif date < timestamp + pd.Timedelta(2,'m'):
        bins[1] += value
    elif date < timestamp + pd.Timedelta(3,'m'):
        bins[2] += value
    elif date < timestamp + pd.Timedelta(4,'m'):
        bins[3] += value
    elif date < timestamp + pd.Timedelta(5,'m'):
        bins[4] += value
    elif date < timestamp + pd.Timedelta(6,'m'):
        bins[5] += value
    else:
        pass
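Since every branch of the chain has the same form, the bin index can also be looked up in a sorted list of edges. The following is a sketch under assumptions, not part of the answer: it uses bisect_right from the standard library, plain datetime objects instead of pandas ones, and made-up rows. Unlike the chain above, it puts everything at or past the last edge into a seventh overflow slot instead of ignoring it.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

timestamp = datetime(2022, 1, 1)   # hypothetical reference timestamp
step = timedelta(minutes=1)        # stand-in for pd.Timedelta(1, 'm')

# Upper edges of bins 0..5, in increasing order
edges = [timestamp + k * step for k in range(1, 7)]

rows = [  # hypothetical (date, value) pairs
    (timestamp + timedelta(seconds=30), 10),  # before the first edge
    (timestamp + step, 5),                    # exactly on the first edge
    (timestamp + timedelta(minutes=10), 1),   # past the last edge
]

bins = [0] * 7
for date, value in rows:
    # bisect_right counts the edges that are <= date, which is the bin index;
    # a date equal to an edge therefore goes into the bin that starts there
    bins[bisect_right(edges, date)] += value

print(bins)  # [10, 5, 0, 0, 0, 0, 1]
```

This makes the edge rule explicit in one place: each bin includes its lower edge and excludes its upper edge.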

Answer 2

This should really be done directly with Pandas functions:

import pandas as pd
from datetime import datetime

timestamp = datetime.now()
bins = [pd.Timestamp(year=1970, month=1, day=1)]+[pd.Timestamp(timestamp)+pd.Timedelta(i, 'm') for i in range(6)]+[pd.Timestamp(year=2100, month=1, day=1)] # plus open bin on the right
n_samples = 1000

data = {
  'date': [pd.to_datetime(timestamp)+pd.Timedelta(i,'s') for i in range(n_samples)],
  'value': list(range(n_samples))
}

df = pd.DataFrame(data)

df['bin'] = pd.cut(df.date, bins, right=False)
df.groupby('bin').value.sum()
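One thing to be aware of: in pd.Timedelta the unit 'm' means minutes, so the edges above are one minute apart, which suits the sample data spaced by seconds. If calendar months are wanted, as described in the question, one option is to build the edges with pd.DateOffset(months=...). The sketch below assumes that reading of the question; the dates and values are made up, and labels=False is used so that the bins come back as integer positions.

```python
import pandas as pd

timestamp = pd.Timestamp("2022-01-15")  # hypothetical reference timestamp

# 'm' is the minute unit, not months
same_unit = pd.Timedelta(1, 'm') == pd.Timedelta(minutes=1)

# Calendar-month edges: timestamp, timestamp + 1 month, ..., timestamp + 6 months
edges = [timestamp + pd.DateOffset(months=i) for i in range(7)]

df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-20", "2022-02-15", "2022-03-01", "2022-07-14"]),
    "value": [1, 2, 3, 4],
})

# right=False makes each bin include its lower edge and exclude its upper edge
df["bin"] = pd.cut(df["date"], bins=edges, right=False, labels=False)
sums = df.groupby("bin")["value"].sum()
print(same_unit)
print(sums)
```

Dates outside the outermost edges get NaN as their bin and are left out of the grouped sum, so add outer edges as in the code above if those rows should be counted.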
