How to extract features using date range within a month?

Question

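I want to extract features from a datetime column based on the day of the month. For example, if the day falls between the 1st and the 10th, a column named early_month should hold 1, and 0 otherwise.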

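A question I posted earlier gave me a solution that uses indexer_between_time to work with time ranges.

I am using the following code to extract the day of the month from the date: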

df["date_of_month"] = df["purchase_date"].dt.day

Thanks.

Answer 1

I believe you need to convert the boolean mask to integers; True values become 1 and False values become 0:

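import pandas as pd

# Sample data: ten dates spaced 17 days apart
rng = pd.date_range('2017-04-03', periods=10, freq='17D')
df = pd.DataFrame({'purchase_date': rng, 'a': range(10)})

# Boolean mask: True where the day of the month is 10 or earlier
m2 = df["purchase_date"].dt.day <= 10

df['early_month'] = m2.astype(int)
print(df)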
  purchase_date  a  early_month
0    2017-04-03  0            1
1    2017-04-20  1            0
2    2017-05-07  2            1
3    2017-05-24  3            0
4    2017-06-10  4            1
5    2017-06-27  5            0
6    2017-07-14  6            0
7    2017-07-31  7            0
8    2017-08-17  8            0
9    2017-09-03  9            1

Details:

print(df["purchase_date"].dt.day <= 10)
0     True
1    False
2     True
3    False
4     True
5    False
6    False
7    False
8    False
9     True
Name: purchase_date, dtype: bool
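
The mask above only checks an upper bound. As a minimal sketch, assuming the same sample data, Series.between can express a range with both bounds (inclusive on each end by default), which also covers other parts of the month. The mid_month column name is only an illustration here.

```python
import pandas as pd

# Same sample data as in the answer above.
rng = pd.date_range("2017-04-03", periods=10, freq="17D")
df = pd.DataFrame({"purchase_date": rng, "a": range(10)})

day = df["purchase_date"].dt.day

# between() includes both endpoints by default.
df["early_month"] = day.between(1, 10).astype(int)
df["mid_month"] = day.between(11, 20).astype(int)  # illustrative extra column
```

This stays vectorized, so no Python-level function is called per row.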

Answer 2

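It's not entirely clear from your question, but if you are trying to create a column that contains 1 when the day is between 1 and 10 and 0 otherwise, it's simple: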

df['early_month'] = df['date_of_month'].apply(lambda x: 1 if x <= 10 else 0)

df['mid_month'] = df['date_of_month'].apply(lambda x: 1 if x >= 11 and x <= 20 else 0)

If you are a Python beginner and want to avoid lambda functions, you can get the same result by defining a function and then applying it as follows:

def create_date_features(day, min_day, max_day):
    if day >= min_day and day <= max_day:
        return 1
    else:
        return 0

df['early_month'] = df['date_of_month'].apply(create_date_features, min_day=1, max_day=10)
df['mid_month'] = df['date_of_month'].apply(create_date_features, min_day=11, max_day=20)
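
If several ranges are needed at once, one possible alternative to calling apply once per range is pd.cut, which assigns each day to a bin in a single pass. This is a sketch under assumed bin edges and label names (the late_month label and the part_of_month column are not from the question).

```python
import pandas as pd

# Hypothetical sample of day-of-month values covering each bin boundary.
df = pd.DataFrame({"date_of_month": [1, 10, 11, 20, 21, 31]})

labels = ["early_month", "mid_month", "late_month"]

# Bins are right-inclusive: (0, 10], (10, 20], (20, 31].
df["part_of_month"] = pd.cut(df["date_of_month"], bins=[0, 10, 20, 31], labels=labels)

# One 0/1 indicator column per label.
for label in labels:
    df[label] = (df["part_of_month"] == label).astype(int)
```

Each row ends up with exactly one indicator set to 1, because the bins do not overlap.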

Answer 3

Maybe you need this:

import pandas as pd
from datetime import datetime
df = pd.DataFrame({'a':[1,2,3,4,5], 'time':['11.07.2018','12.07.2018','13.07.2018','14.07.2018','15.07.2018']})
df.time = pd.to_datetime(df.time, format='%d.%m.%Y')

df[df.time > datetime(2018, 7, 13)]  # filter rows after a given date
df[df.time.dt.day > datetime(2018, 7, 13).day]  # filter rows by day of month (compare day with day)
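
To filter rows on a range of days rather than a single threshold, the day of the month can be taken with .dt.day and compared as a plain integer. A minimal sketch, assuming the same sample frame as above; the bounds 12 and 14 are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "time": ["11.07.2018", "12.07.2018", "13.07.2018", "14.07.2018", "15.07.2018"],
})
df["time"] = pd.to_datetime(df["time"], format="%d.%m.%Y")

# Keep rows whose day of the month is between 12 and 14, inclusive.
in_range = df[df["time"].dt.day.between(12, 14)]
```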


python

feature-extraction

python-3.x

pandas