Open multiple files with different time coordinate lengths
I have two (or more) netCDF files that I want to concatenate on the fly using the open_mfdataset function in xarray. If I open them separately with open_dataset, the printed structure is:
Dimensions: (lat: 103, lon: 241, time: 365)
Coordinates:
* lon (lon) float64 5.75 5.771 5.792 5.812 5.833 5.854 5.875 5.896 ...
* lat (lat) float64 45.75 45.77 45.79 45.81 45.83 45.85 45.88 45.9 ...
* time (time) datetime64[ns] 2014-01-01 2014-01-02 2014-01-03 ...
Data variables:
TabsD (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ...
for the first file, and
Dimensions: (lat: 103, lon: 241, time: 31)
Coordinates:
* lon (lon) float64 5.75 5.771 5.792 5.812 5.833 5.854 5.875 5.896 ...
* lat (lat) float64 45.75 45.77 45.79 45.81 45.83 45.85 45.87 45.9 ...
* time (time) datetime64[ns] 2015-01-01 2015-01-02 2015-01-03 ...
Data variables:
TabsD (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ...
for the second. The problem is that when I put the file names into flist and run data = xr.open_mfdataset(flist, concat_dim='time', cache=False), I get this traceback:
Traceback (most recent call last):
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\pydevd.py", line 1591, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\pydevd.py", line 1018, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/*/Documents/myfile.py", line 110, in <module>
    data = xr.open_mfdataset(flist, concat_dim='time', cache=False)
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\backends\api.py", line 514, in open_mfdataset
    combined = auto_combine(datasets, concat_dim=concat_dim, compat=compat)
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\combine.py", line 396, in auto_combine
    concatenated = [_auto_concat(ds, dim=dim) for ds in grouped]
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\combine.py", line 396, in <listcomp>
    concatenated = [_auto_concat(ds, dim=dim) for ds in grouped]
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\combine.py", line 332, in _auto_concat
    return concat(datasets, dim=dim)
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\combine.py", line 120, in concat
    return f(objs, dim, data_vars, coords, compat, positions)
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\combine.py", line 273, in _dataset_concat
    combined = concat_vars(vars, dim, positions)
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\variable.py", line 1442, in concat
    return Variable.concat(variables, dim, positions, shortcut)
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\variable.py", line 998, in concat
    data = duck_array_ops.concatenate(arrays, axis=axis)
  File "C:\Program Files\Anaconda3\lib\site-packages\xarray\core\duck_array_ops.py", line 48, in f
    return getattr(module, name)(*args, **kwargs)
  File "C:\Program Files\Anaconda3\lib\site-packages\dask\array\core.py", line 1871, in concatenate
    raise ValueError("Block shapes do not align")
ValueError: Block shapes do not align
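I have already tried:
- setting chunks={'time':10} (also 30, 40, 100, etc.)
- setting chunks={'lat':10, 'lon':10}
- checking whether there are gaps between the time spans: there are none
The result is basically the same every time.
What is the trick here?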
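The behavior you describe should not happen.
My guess is that some of your files actually have different dimension sizes, and that dask is giving you an unhelpful error message.
The error message you describe no longer appears in the latest version of dask, so it is also possible that the behavior has been fixed. Please update to the latest versions of xarray and dask and try again.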
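One rough way to test the guess about mismatched dimension sizes is to open every file on its own and compare the sizes of all dimensions other than the one being concatenated. The sketch below is only an illustration: find_dim_mismatches and collect_sizes are hypothetical helper names, not part of xarray, and it assumes flist is the list of paths from the question. The two summaries printed in the question show the same lat and lon sizes, so a mismatch, if there is one, would have to come from one of the other files.

```python
from typing import Dict, List


def find_dim_mismatches(
    sizes_by_file: Dict[str, Dict[str, int]], concat_dim: str = "time"
) -> List[str]:
    """Return one message per dimension whose size differs between files.

    The concatenation dimension is skipped, since it is expected to vary.
    """
    problems = []
    all_dims = sorted({d for sizes in sizes_by_file.values() for d in sizes})
    for dim in all_dims:
        if dim == concat_dim:
            continue
        # Map each file to its size along this dimension (None if absent).
        per_file = {name: sizes.get(dim) for name, sizes in sizes_by_file.items()}
        if len(set(per_file.values())) > 1:
            problems.append(f"{dim}: {per_file}")
    return problems


def collect_sizes(paths: List[str]) -> Dict[str, Dict[str, int]]:
    """Open each file separately and record its dimension sizes."""
    # Imported here so that find_dim_mismatches has no third-party dependency.
    import xarray as xr

    result = {}
    for path in paths:
        with xr.open_dataset(path) as ds:
            result[path] = {str(name): int(size) for name, size in ds.sizes.items()}
    return result
```

Calling find_dim_mismatches(collect_sizes(flist)) returns an empty list when lat and lon have the same length in every file. Any entry it does return names a dimension worth inspecting before retrying open_mfdataset.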