Replacing dates in a pandas DataFrame column with the month name

I am trying to replace the dates in a pandas DataFrame column with the month name as a string, but I get an error in the process.

Code:

import re
from datetime import datetime

import pandas as pd

def date_format(url_link):
    match = re.compile(r'[\d]{2,4}[/|-][\d]{1,2}[/|-][\d]{2,4}')
    mo = match.search(url_link)
    mo = mo.group().replace('/','-')
    try:
        mo = datetime.strptime(mo, "%d-%m-%Y").strftime('%Y-%m-%d')
    except:
        pass
    datetime_in = datetime.strptime(mo, "%Y-%m-%d")
    datetime_out = datetime_in.strftime("%B")
    return datetime_out

texts = [["The date is 11/12/1998"],["The date is 11-12-1998"],["/events/performances"],["/events/2019/02/22/promedica-masterworks/brah"],["/events/performances/641/2019-10-13/dudamel"]]
df = pd.DataFrame(texts, columns = ['event'])
df["date_format"] = df["event"].apply(lambda x: x.replace(r'[\d]{2,4}[/|-][\d]{1,2}[/|-][\d]{2,4}', date_format(x)))

The expected output is a new DataFrame column with the following values:

The date is December
The date is December
/events/performances
/events/February/promedica-masterworks/brah
/events/performances/641/October/dudamel

Use Series.str.replace with a callable as the replacement:

# Only needed if your locale is not English (mine is French)
# import locale
# locale.setlocale(locale.LC_TIME, 'en_US.UTF-8')

# Add capture group -v-------------------------------------v
match = re.compile(r'([\d]{2,4}[/|-][\d]{1,2}[/|-][\d]{2,4})')

# Replace values
date_to_month = lambda x: pd.to_datetime(x.group(0)).strftime('%B')
df['date_format'] = df['event'].str.replace(match, date_to_month, regex=True)

Output:

>>> df
                                           event                                  date_format
0                         The date is 11/12/1998                         The date is November
1                         The date is 11-12-1998                         The date is November
2                           /events/performances                         /events/performances
3  /events/2019/02/22/promedica-masterworks/brah  /events/February/promedica-masterworks/brah
4    /events/performances/641/2019-10-13/dudamel     /events/performances/641/October/dudamel
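
Note that rows 0 and 1 come out as November, while the question expected December: by default pd.to_datetime reads 11/12/1998 month-first. If the dates are meant to be day-first, one option is to parse with explicit formats, as the question's own function attempted. The following is only a minimal sketch using the standard library; the names DATE_RE, FORMATS and month_name are illustrative, and the list of formats is an assumption about which layouts occur in the data.

```python
import re
from datetime import datetime

# Same shape as the pattern above, without the stray "|" in the separator class.
DATE_RE = re.compile(r'\d{2,4}[/-]\d{1,2}[/-]\d{2,4}')

# Assumed layouts, tried in order: day-first, then year-first.
FORMATS = ("%d-%m-%Y", "%Y-%m-%d")

def month_name(match):
    """Return the month name for a matched date, or the text unchanged."""
    text = match.group(0).replace('/', '-')
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime('%B')
        except ValueError:
            continue  # try the next layout
    return match.group(0)  # nothing parsed: leave the original text

events = [
    "The date is 11/12/1998",
    "The date is 11-12-1998",
    "/events/performances",
    "/events/2019/02/22/promedica-masterworks/brah",
    "/events/performances/641/2019-10-13/dudamel",
]

# re.sub calls month_name once per match; strings without a date pass through.
months = [DATE_RE.sub(month_name, event) for event in events]
for line in months:
    print(line)
```

Since Series.str.replace also accepts a callable when regex=True, the same function should work on the DataFrame as df['event'].str.replace(DATE_RE, month_name, regex=True). The month names depend on the active LC_TIME locale, as mentioned above.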