How to convert a dataframe column into UTC datetime format?
I want to convert the Origin column of the dataframe data_copy into UTC datetime format:
import pandas as pd
>>>data_copy["Origin"]
0 1669-06-04 00:00:00
1 1669-06-22 00:00:00
2 1720-07-15 00:00:00
3 1803-09-01 00:00:00
4 1816-05-26 00:00:00
6395 2020-03-29 18:27:36
6396 2020-03-29 18:47:53
6397 2020-03-29 20:05:19
6398 2020-03-30 02:19:27
6399 2020-03-30 06:11:36
Some entries have a time of 00:00:00 (I need to convert those as well).
I tried this command:
data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],infer_datetime_format=True)
But I get an error like this:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2054, in objects_to_datetime64ns
values, tz_parsed = conversion.datetime_to_datetime64(data)
File "pandas\_libs\tslibs\conversion.pyx", line 350, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<ipython-input-93-aead2d23f264>", line 1, in <module>
data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],infer_datetime_format=True)
File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\tools\datetimes.py", line 803, in to_datetime
values = convert_listlike(arg._values, format)
File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\tools\datetimes.py", line 466, in _convert_listlike_datetimes
allow_object=True,
File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2059, in objects_to_datetime64ns
raise e
File "C:\ProgramData\Anaconda3\envs\roses\lib\site-packages\pandas\core\arrays\datetimes.py", line 2050, in objects_to_datetime64ns
require_iso8601=require_iso8601,
File "pandas\_libs\tslib.pyx", line 352, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 574, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 570, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 546, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslibs\np_datetime.pyx", line 113, in pandas._libs.tslibs.np_datetime.check_dts_bounds
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1669-06-04 00:00:00
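How can I convert the column into UTC datetime format?
The problem is that some of these datetimes fall outside the range that pandas timestamps can represent: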
In [92]: pd.Timestamp.min
Out[92]: Timestamp('1677-09-21 00:12:43.145225')
In [93]: pd.Timestamp.max
Out[93]: Timestamp('2262-04-11 23:47:16.854775807')
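To see which entries are affected before converting, the strings can be checked against these limits with plain Python datetimes. This is only a rough sketch: the helper name `fits_in_ns_timestamp` and the whole-day bounds are assumptions made for illustration, chosen to sit just inside the limits printed above.

```python
from datetime import datetime

# Conservative whole-day bounds just inside the nanosecond Timestamp limits
# shown above (assumed values, rounded inward for simplicity).
LOWER = datetime(1677, 9, 22)
UPPER = datetime(2262, 4, 11)

def fits_in_ns_timestamp(text):
    """Return True if the ISO-formatted string lies within the bounds above."""
    parsed = datetime.fromisoformat(text)
    return LOWER <= parsed <= UPPER

# Sample values copied from the question
values = ["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"]
flags = [fits_in_ns_timestamp(v) for v in values]
out_of_range = [v for v, ok in zip(values, flags) if not ok]
print(out_of_range)
```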
A possible solution is to replace the out-of-bounds values with NaT by passing the errors='coerce' parameter:
data_copy["Origin"] = pd.to_datetime(data_copy["Origin"],
                                     infer_datetime_format=True,
                                     errors='coerce')
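Since the question asks for UTC specifically, here is a small sketch of the same idea with utc=True added, on the assumption that a timezone-aware result is wanted. The Series `origin` is made up from values in the question. infer_datetime_format is left out because it is deprecated as of pandas 2.0. Which values end up as NaT depends on the resolution your pandas version uses for the result, so no output is shown.

```python
import pandas as pd

# Sample values copied from the question; the variable name is illustrative.
origin = pd.Series(["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"])

# errors="coerce" replaces values that cannot be represented with NaT
# instead of raising; utc=True returns timezone-aware (UTC) timestamps.
converted = pd.to_datetime(origin, errors="coerce", utc=True)

print(converted)
print(converted.isna().sum(), "value(s) became NaT")
```

Keep in mind that coercing discards the original value for every row that becomes NaT.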
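If you still need datetimes for those values, you can use Python's datetime class. However, this leaves you with a column of dtype object, which means pandas' datetime functionality (the dt accessor) is not available.
For example: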
from datetime import datetime, timezone
import pandas as pd
s = (pd.Series(["1669-06-04 00:00:00", "1816-05-26 00:00:00", "2020-03-29 18:27:36"])
.apply(lambda t: datetime.fromisoformat(t).replace(tzinfo=timezone.utc)))
# s
# 0 1669-06-04 00:00:00+00:00
# 1 1816-05-26 00:00:00+00:00
# 2 2020-03-29 18:27:36+00:00
# dtype: object
You can still use the datetime class's methods, but that requires iterating over the values (apply).
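As a rough illustration of that last point, attributes and methods of the individual datetime objects can be reached element by element. The names `years` and `iso` below are just for this example.

```python
from datetime import datetime, timezone
import pandas as pd

# Build the column of Python datetime objects as in the example above
s = pd.Series(["1669-06-04 00:00:00", "2020-03-29 18:27:36"]).apply(
    lambda t: datetime.fromisoformat(t).replace(tzinfo=timezone.utc)
)

# Without the dt accessor, call attributes and methods on each element via apply
years = s.apply(lambda d: d.year)
iso = s.apply(lambda d: d.isoformat())

print(years.tolist())
print(iso.tolist())
```

This runs a Python function per row, so it will be slower than vectorized dt operations on a large column.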