Formatting and replacing the date format in a pandas dataframe column with month
I am trying to replace the dates in a pandas dataframe column with the month name as a string, but I get an error in the process.
Code:
import re
from datetime import datetime

import pandas as pd

def date_format(url_link):
    match = re.compile(r'[\d]{2,4}[/|-][\d]{1,2}[/|-][\d]{2,4}')
    mo = match.search(url_link)
    mo = mo.group().replace('/','-')
    try:
        mo = datetime.strptime(mo, "%d-%m-%Y").strftime('%Y-%m-%d')
    except:
        pass
    datetime_in = datetime.strptime(mo, "%Y-%m-%d")
    datetime_out = datetime_in.strftime("%B")
    return datetime_out

texts = [["The date is 11/12/1998"],["The date is 11-12-1998"],["/events/performances"],["/events/2019/02/22/promedica-masterworks/brah"],["/events/performances/641/2019-10-13/dudamel"]]
df = pd.DataFrame(texts, columns = ['event'])
df["date_format"] = df["event"].apply(lambda x: x.replace(r'[\d]{2,4}[/|-][\d]{1,2}[/|-][\d]{2,4}', date_format(x)))
The expected output is a new pandas dataframe column with the following values:
The date is December
The date is December
/events/performances
/events/February/promedica-masterworks/brah
/events/performances/641/October/dudamel
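Two things in the snippet above plausibly account for the error. Python's built-in `str.replace` treats its first argument as literal text, not as a regular expression, and `date_format(x)` is evaluated for every row before any replacement happens, so a row without a date makes `match.search` return `None` and the following `.group()` call fail. A minimal sketch of both behaviours, where `DATE_RE`, `unchanged`, and `no_date` are illustrative names that do not appear in the original code:

```python
import re

# Illustrative pattern: same shape as the one in the question, with / or - as separators.
DATE_RE = re.compile(r'\d{2,4}[/-]\d{1,2}[/-]\d{2,4}')

# str.replace looks for the pattern text literally, so nothing is replaced.
text = "The date is 11/12/1998"
unchanged = text.replace(r'\d{2,4}[/-]\d{1,2}[/-]\d{2,4}', "December")
print(unchanged == text)

# A row without a date yields None, and None has no .group() method.
no_date = DATE_RE.search("/events/performances")
print(no_date is None)
```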
Use str.replace:
# Because my locale is French
# import locale
# locale.setlocale(locale.LC_TIME, 'en_US.UTF-8')
# Add capture group -v-------------------------------------v
match = re.compile(r'([\d]{2,4}[/|-][\d]{1,2}[/|-][\d]{2,4})')
# Replace values
date_to_month = lambda x: pd.to_datetime(x.group(0)).strftime('%B')
df['date_format'] = df['event'].str.replace(match, date_to_month, regex=True)
Output:
>>> df
                                           event                                  date_format
0                         The date is 11/12/1998                         The date is November
1                         The date is 11-12-1998                         The date is November
2                           /events/performances                         /events/performances
3  /events/2019/02/22/promedica-masterworks/brah  /events/February/promedica-masterworks/brah
4    /events/performances/641/2019-10-13/dudamel      /events/performances/641/October/dudamel
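Note that the first two rows come out as November because `pd.to_datetime` reads `11/12/1998` month-first, while the question expects December (day-first). One possible way to get the expected values is to try explicit formats instead of relying on inference. The sketch below assumes only the two layouts seen in the sample data. `DATE_RE`, `FORMATS`, and `month_name` are hypothetical names introduced for illustration, and the character class is narrowed to `[/-]` since `[/|-]` would also accept a literal `|`:

```python
import re
from datetime import datetime

import pandas as pd

# Separators are / or - only.
DATE_RE = re.compile(r'\d{2,4}[/-]\d{1,2}[/-]\d{2,4}')

# Assumed layouts, tried in order: day-first, then year-first.
FORMATS = ("%d-%m-%Y", "%Y-%m-%d")

def month_name(match):
    """Return the month name for a matched date, or the matched text if no format fits."""
    text = match.group(0).replace("/", "-")
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%B")
        except ValueError:
            continue
    return match.group(0)

df = pd.DataFrame({"event": [
    "The date is 11/12/1998",
    "The date is 11-12-1998",
    "/events/performances",
    "/events/2019/02/22/promedica-masterworks/brah",
    "/events/performances/641/2019-10-13/dudamel",
]})

# Rows without a date are left untouched because the callable is only invoked on matches.
df["date_format"] = df["event"].str.replace(DATE_RE, month_name, regex=True)
print(df["date_format"].tolist())
```

As in the answer above, `%B` follows the active locale, so the month names are English only when the time locale is English (or the default C locale).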