How to cast pyarrow timestamp dtype to time64 type?
I'm trying to cast a pyarrow timestamp type to the time64 type, but it fails with a cast error.
import pyarrow as pa
from datetime import datetime
dt = datetime.now()
table = pa.Table.from_pydict({'ts': pa.array([dt, dt])})
new_schema = table.schema.set(0, pa.field('ts', pa.time64('us')))
table.schema
# ts: timestamp[us]
new_schema
# ts: time64[us]
table.cast(new_schema)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/table.pxi", line 1329, in pyarrow.lib.Table.cast
  File "pyarrow/table.pxi", line 277, in pyarrow.lib.ChunkedArray.cast
  File "/home/inspiron/.virtualenvs/par/lib/python3.7/site-packages/pyarrow/compute.py", line 243, in cast
    return call_function("cast", [arr], options)
  File "pyarrow/_compute.pyx", line 446, in pyarrow._compute.call_function
  File "pyarrow/_compute.pyx", line 275, in pyarrow._compute.Function.call
  File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 105, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: Unsupported cast from timestamp[us] to time64 using function cast_time64
Is there any way to make this cast work?
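time64[us] is a time of day. It represents the number of microseconds since midnight. It isn't associated with any specific date, so it can't be converted to a timestamp. Going the other way, as in the question, means discarding the date, and the cast_time64 function in this pyarrow version doesn't implement that (hence the ArrowNotImplementedError).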
The Arrow documentation is a bit sparse, but the parquet docs explain it better:
TIME
TIME is used for a logical time type without a date with millisecond
or microsecond precision. The type has two type parameters: UTC
adjustment (true or false) and unit (MILLIS or MICROS, NANOS).
TIME with unit MILLIS is used for millisecond precision. It must
annotate an int32 that stores the number of milliseconds after
midnight.
TIME with unit MICROS is used for microsecond precision. It must
annotate an int64 that stores the number of microseconds after
midnight.
TIME with unit NANOS is used for nanosecond precision. It must
annotate an int64 that stores the number of nanoseconds after
midnight.
The sort order used for TIME is signed.
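To make the MICROS definition concrete: the stored int64 is just the time of day expressed in microseconds. A small stdlib-only sketch (the helper name micros_after_midnight is mine, not something from pyarrow or parquet):

```python
from datetime import datetime, time

def micros_after_midnight(t: time) -> int:
    """Return the number of microseconds elapsed since midnight for t."""
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return seconds * 1_000_000 + t.microsecond

dt = datetime(2021, 3, 4, 12, 30, 45, 123456)

# Only the time-of-day part contributes; the date (2021-03-04) is ignored
value = micros_after_midnight(dt.time())
print(value)  # 45045123456
```

Two datetimes on different days with the same wall-clock time give the same value, which is why the date can't be recovered from a time64.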