Error while reading a JSON file with Python Polars
I am trying to read a GeoJSON with Python Polars, like this:
import polars as pl
myfile = '{"type":"GeometryCollection","geometries":[{"type":"Linestring","coordinates":[[10,11.2],[10.5,11.9]]},{"type":"Point","coordinates":[10,20]}]}'
pl.read_json(myfile)
The error I get is:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "...\local-packages\Python39\site-packages\polars\functions.py", line 631, in read_json
return DataFrame.read_json(source) # type: ignore
File "...\local-packages\Python39\site-packages\polars\frame.py", line 346, in read_json
self._df = PyDataFrame.read_json(file)
RuntimeError: Other("Error(\"missing field `columns`\", line: 1, column: 143)")
I also tried putting the same content in a file, but I got a similar error.
As suggested on GitHub, I tried reading the file through Pandas, like this:
import pandas as pd
initial_df = pl.from_pandas(pd.read_json(file_path))
The error I get is:
File "...\file_splitter.py", line 13, in split_file
initial_df = pl.from_pandas(pd.read_json(file_path))
File "...\local-packages\Python39\site-packages\polars\functions.py", line 566, in from_pandas
data[name] = _from_pandas_helper(s)
File "...\local-packages\Python39\site-packages\polars\functions.py", line 534, in _from_pandas_helper
return pa.array(a)
File "pyarrow\array.pxi", line 302, in pyarrow.lib.array
File "pyarrow\array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow\error.pxi", line 97, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: cannot mix list and non-list, non-null values
How can I read a GeoJSON file?
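If you read the file with pandas, you get columns of type Object, which Arrow cannot map to a known type (they could hold anything).
If we cast the columns to string type, we know that Arrow and Polars can handle them.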
import pandas as pd
import polars as pl
myfile = '{"type":"GeometryCollection","geometries":[{"type":"Linestring","coordinates":[[10,11.2],[10.5,11.9]]},{"type":"Point","coordinates":[10,20]}]}'
print(pl.from_pandas(pd.read_json(myfile).astype(str)))
shape: (2, 2)
┌────────────────────┬─────────────────────────────────────┐
│ type               ┆ geometries                          │
│ ---                ┆ ---                                 │
│ str                ┆ str                                 │
╞════════════════════╪═════════════════════════════════════╡
│ GeometryCollection ┆ {'type': 'Linestring', 'coordina... │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ GeometryCollection ┆ {'type': 'Point', 'coordinates':... │
└────────────────────┴─────────────────────────────────────┘
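The string cast above stores each geometry as the text of a Python dict, which is awkward to parse back. As a rough alternative sketch using only the standard library, the geometries could be flattened into one row each, with the coordinates kept as JSON text so that every column holds plain strings. The helper name `geometries_to_rows` is hypothetical and is not part of Polars or pandas.

```python
import json

# Same GeoJSON string as in the question.
myfile = '{"type":"GeometryCollection","geometries":[{"type":"Linestring","coordinates":[[10,11.2],[10.5,11.9]]},{"type":"Point","coordinates":[10,20]}]}'


def geometries_to_rows(geojson_text):
    """Flatten a GeometryCollection into one dict per geometry.

    Hypothetical helper: the coordinates are re-serialized as JSON text,
    so nested lists of different depth never end up in the same column.
    """
    doc = json.loads(geojson_text)
    return [
        {"type": geom["type"], "coordinates": json.dumps(geom["coordinates"])}
        for geom in doc["geometries"]
    ]


rows = geometries_to_rows(myfile)
for row in rows:
    print(row)
```

Since every value is now a string, the list of dicts should be straightforward to hand to `pl.DataFrame(rows)`, assuming the installed Polars version accepts a list of dicts. `json.loads` on the `coordinates` column recovers the original nested lists when they are needed.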