How to get timezone strings from timezone string columns?

I have a CSV with a column ["timestamps1"] that contains GMT timestamp strings in the following format:

2020-02-28T12:53:47.167Z

I want a new column that contains the same timestamps as strings in Europe/Berlin time. The exact format does not matter much, for example:

2020-02-28 13:53:47

How can I do this? I have already tried parsing the dates:

import pandas as pd
from datetime import datetime
from pytz import timezone
path = "timestamps.csv"
data = pd.read_csv(path, sep=";", parse_dates=["timestamps1"])

data['timestamps1_new'] = data['timestamps1'].dt.tz_localize('GMT').dt.tz_convert('Europe/Berlin')

Then I get the error "Already tz-aware, use tz_convert to convert." When I do not parse the dates, I get "Can only use .dt accessor with datetimelike values". The error also occurs when I manipulate the strings like this:

data["timestamps1"] = data["timestamps1"].str[:-5]
data["timestamps1"] = data.timestamps1.replace("T"," ",regex=True)

Here is some sample data:

data = {'timestamps': ['2020-11-28T13:14:57.463Z','2020-11-28T13:14:57.603Z','2020-11-28T13:14:57.618Z']}
data = pd.DataFrame(data=data)

Thanks a lot!

pandas parses the ISO 8601 format directly. The trailing Z denotes UTC, so the result is a timezone-aware datetime (it "knows" that it is in UTC):

import pandas as pd

data = {'timestamps': ['2020-11-28T13:14:57.463Z','2020-11-28T13:14:57.603Z','2020-11-28T13:14:57.618Z']}
data = pd.DataFrame(data=data)

# you can skip this step by setting 'parse_dates' in read_csv
data['timestamps'] = pd.to_datetime(data['timestamps'])

print(data['timestamps'])
0   2020-11-28 13:14:57.463000+00:00
1   2020-11-28 13:14:57.603000+00:00
2   2020-11-28 13:14:57.618000+00:00
Name: timestamps, dtype: datetime64[ns, UTC]
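
The same applies when the column is parsed by read_csv, which is why tz_localize fails in the question. A minimal sketch, assuming an in-memory CSV in place of the real file (the "value" column and the sample rows are made up for illustration):

```python
import io

import pandas as pd

# Hypothetical stand-in for timestamps.csv, using the same ";" separator
csv_text = (
    "timestamps1;value\n"
    "2020-02-28T12:53:47.167Z;1\n"
    "2020-11-28T13:14:57.463Z;2\n"
)

df = pd.read_csv(io.StringIO(csv_text), sep=";", parse_dates=["timestamps1"])

# The trailing Z makes the parsed column timezone-aware (UTC)
print(df["timestamps1"].dt.tz)

# Localizing an already-aware column raises a TypeError
try:
    df["timestamps1"].dt.tz_localize("GMT")
    localize_failed = False
except TypeError as err:
    localize_failed = True
    print(err)
```

tz_localize is meant for naive datetimes that carry no time zone yet. Once the column is aware, only tz_convert is needed.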

So you can convert directly:

data['timeGermany'] = data['timestamps'].dt.tz_convert('Europe/Berlin')

print(data['timeGermany'])
0   2020-11-28 14:14:57.463000+01:00
1   2020-11-28 14:14:57.603000+01:00
2   2020-11-28 14:14:57.618000+01:00
Name: timeGermany, dtype: datetime64[ns, Europe/Berlin]
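
Since the question asks for strings rather than datetimes, the converted column can be formatted afterwards. A minimal sketch, assuming the format from the question is wanted; the column names "timeGermany_str" and "timeGermany_naive" are made up for illustration:

```python
import pandas as pd

data = pd.DataFrame({'timestamps': ['2020-11-28T13:14:57.463Z',
                                    '2020-11-28T13:14:57.603Z',
                                    '2020-11-28T13:14:57.618Z']})

# Parse to timezone-aware UTC datetimes, then convert to Berlin time
data['timestamps'] = pd.to_datetime(data['timestamps'], utc=True)
data['timeGermany'] = data['timestamps'].dt.tz_convert('Europe/Berlin')

# Option 1: plain strings in a chosen format (object dtype)
data['timeGermany_str'] = data['timeGermany'].dt.strftime('%Y-%m-%d %H:%M:%S')

# Option 2: keep datetimes but drop the UTC offset, leaving Berlin wall time
data['timeGermany_naive'] = data['timeGermany'].dt.tz_localize(None)

print(data['timeGermany_str'].tolist())
```

Option 1 gives text that is ready to write back to a CSV. Option 2 keeps the datetime dtype, so the .dt accessor still works on the column, but the information about which time zone the values are in is lost.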