Applying stats.linregress to every grid cell in a netCDF file

Question

I have a netCDF file with the following contents:

<xarray.Dataset>
Dimensions:    (latitude: 65, longitude: 49, time: 7306)
Coordinates:
  * latitude   (latitude) float32 21.0 20.75 20.5 20.25 ... 5.75 5.5 5.25 5.0
  * longitude  (longitude) float32 116.0 116.25 116.5 ... 127.5 127.75 128.0
  * time       (time) datetime64[ns] 1985-12-31T23:00:00 ... 2005-12-31T11:00:00
Data variables:
    pr         (time, latitude, longitude) float32 0.049636062 ... 0.6215298
    time_bnds  (time) datetime64[ns] 1985-12-31T23:00:00 ... 2005-12-31T11:00:00

My goal is to apply stats.linregress to every grid cell of this dataset. I took the following approach:

from scipy import stats
import numpy as np
import xarray as xr

#load data
rain = xr.load_dataset('../precipitation.1986-2005.nc')

#group by (lon,lat) pairs by:
stacked = rain.stack(paired_points=['latitude','longitude'])
grouped = stacked.groupby('paired_points').apply(stats.linregress(stacked.time.astype(float),stacked['pr']))
unstack = grouped.unstack('paired_points')

After applying stats.linregress, I want to create a plot in which each grid cell is colored by the slope value computed by the linear regression.

When I run the code, a ValueError is raised:

ValueError: all the input array dimensions for the concatenation 
axis must match exactly, but along dimension 1, the array at index 0 has size 7306 and the 
array at index 1 has size 3185

The problem seems to lie in accessing the precipitation time series of each grid cell and applying stats.linregress to it.
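The sizes in the error message point at the cause: 7306 is the number of time steps, and 3185 = 65 × 49 is the number of grid cells. In the call above, stats.linregress(...) is evaluated once on the full stacked arrays, before apply ever runs, so it receives a 1-D time axis and a 2-D precipitation array rather than one series per cell. A minimal sketch of the shapes involved, using a small made-up array in place of the real file:

```python
import numpy as np
import xarray as xr

# Small hypothetical stand-in with the same layout as the file:
# 4 time steps on a 3 x 2 latitude/longitude grid.
pr = xr.DataArray(
    np.zeros((4, 3, 2)),
    dims=("time", "latitude", "longitude"),
    coords={
        "time": np.arange(4.0),
        "latitude": [21.0, 20.75, 20.5],
        "longitude": [116.0, 116.25],
    },
    name="pr",
)
stacked = pr.stack(paired_points=["latitude", "longitude"])

x_shape = stacked["time"].shape  # one value per time step
y_shape = stacked.shape          # (time steps, grid cells)
print(x_shape, y_shape)

# In the real file the second axis would have 65 * 49 entries.
n_cells = 65 * 49
print(n_cells)
```

Passing a function to apply, instead of the result of a call, lets the regression run on one cell at a time.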

Can anyone suggest a way forward?

Your help would be appreciated.

Answer 1

I solved this as follows:

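from scipy import stats
import xarray as xr

rain = xr.load_dataset('../precipitation.1986-2005.nc')

def slope(x):
    # x holds a single grid cell: fit its series against time and keep the slope
    sl = stats.linregress(x.time.astype(float), x[dict(paired_points=0)]).slope
    return xr.DataArray(sl)

# select the precipitation variable, then group by (lat, lon) pairs
# (newer xarray versions call this method .map instead of .apply):
stacked = rain['pr'].stack(paired_points=['latitude', 'longitude'])
grouped = stacked.groupby('paired_points').apply(slope)
unstack = grouped.unstack('paired_points')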
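As an alternative to the groupby approach, the per-cell fit can also be written with xr.apply_ufunc. The following is only a sketch under assumptions: it uses a small synthetic array instead of the real file, and cell_slope is a hypothetical helper name. With real datetime64 time values, the time coordinate would first need converting to numbers (for example with astype(float), as in the code above, which gives nanoseconds), so the slope would then be in units per nanosecond.

```python
import numpy as np
import xarray as xr
from scipy import stats

# Synthetic stand-in for the precipitation data (hypothetical values):
# every cell is an exact straight line in time, so its slope is known.
time = np.arange(50, dtype=float)
lat = np.array([21.0, 20.75, 20.5])
lon = np.array([116.0, 116.25])
true_slope = np.arange(1, lat.size * lon.size + 1, dtype=float).reshape(lat.size, lon.size)
values = true_slope[None, :, :] * time[:, None, None] + 1.0

pr = xr.DataArray(
    values,
    dims=("time", "latitude", "longitude"),
    coords={"time": time, "latitude": lat, "longitude": lon},
    name="pr",
)

def cell_slope(t, y):
    """Slope of a least-squares line through one cell's time series."""
    return stats.linregress(t, y).slope

# vectorize=True loops cell_slope over every (latitude, longitude) pair,
# handing it the 1-D time axis and that cell's 1-D series.
slopes = xr.apply_ufunc(
    cell_slope,
    pr["time"],
    pr,
    input_core_dims=[["time"], ["time"]],
    vectorize=True,
)
print(slopes)
```

The result is a 2-D DataArray on the latitude/longitude grid, so no stack or unstack step is needed.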

The solution above follows the approach taken by Dr. Ryan Abernathey in:

https://gist.github.com/rabernat/bc4c6990eb20942246ce967e6c9c3dbe
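For the map asked about in the question, a 2-D slope array on the latitude/longitude grid can be drawn with xarray's plotting wrapper around matplotlib. This is a minimal sketch, not part of the original solution: the slope values, the colormap, and the output file name are made up for illustration, and the Agg backend is assumed only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Hypothetical slope field standing in for the unstacked regression result.
slopes = xr.DataArray(
    np.array([[0.1, -0.2], [0.0, 0.3], [-0.1, 0.2]]),
    dims=("latitude", "longitude"),
    coords={"latitude": [21.0, 20.75, 20.5], "longitude": [116.0, 116.25]},
    name="slope",
)

fig, ax = plt.subplots()
# One colored cell per grid point; a colorbar is added by default.
mesh = slopes.plot.pcolormesh(ax=ax, x="longitude", y="latitude", cmap="RdBu_r")
ax.set_title("Linear trend per grid cell")
fig.savefig("slope_map.png")  # hypothetical output file name
```

A diverging colormap such as RdBu_r makes positive and negative trends easy to tell apart.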

python

scipy

netcdf

python-xarray