WRF netCDF file: subset a smaller array out of a dataset based on coordinate boundaries in Python

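I have two netCDF files from a WRF run, one with hourly data and a smaller one holding the coordinates (XLAT and XLONG). I am trying to retrieve a subset of the data based on specific coordinates.

One of the variables is the temperature 'T2', which has dimensions (1,1015,1359), corresponding to (time, south_north, west_east).

XLAT and XLONG have the same dimensions (1,1015,1359).

Someone already asked the same question, but because my lat/long dimensions are a bit different, that script does not work for me and I have not been able to figure out why. I tried flattening the coordinates into 1D arrays so that my case would resemble the previous question, but the script still fails with an index error.

If anyone could help me, that would be great! Thanks in advance :)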

import numpy as np
from netCDF4 import Dataset  
import matplotlib.pyplot as plt

lons = b.variables['XLONG'][:]
lats = b.variables['XLAT'][:]

lons2d =lons.reshape((1015,1359))
lons1d = lons2d.reshape((1379385))

lats2d =lats.reshape((1015,1359))
lats1d = lats2d.reshape((1379385))

lat_bnds, lon_bnds = [49,53], [-125,-115]
lat_inds = np.where((lats1d > lat_bnds[0]) & (lats1d < lat_bnds[1]))
lon_inds = np.where((lons1d > lon_bnds[0]) & (lons1d < lon_bnds[1]))

T_subset = a.variables['T2'][:,lat_inds,lon_inds]

But I get the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-0f8890d3b1c5> in <module>()
 25 lon_inds = np.where((lons1d > lon_bnds[0]) & (lons1d < lon_bnds[1]))
 26 
---> 27 T_subset = a.variables['T2'][:,lat_inds,lon_inds]
 28 
 29 

netCDF4/_netCDF4.pyx in      netCDF4._netCDF4.Variable.__getitem__(netCDF4/_netCDF4.c:35672)()

/Users/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/netCDF4/utils.pyc in _StartCountStride(elem, shape, dimensions, grp, datashape, put)
197         # Raise error if multidimensional indexing is used.
198         if ea.ndim > 1:
--> 199             raise IndexError("Index cannot be multidimensional")
200         # set unlim to True if dimension is unlimited and put==True
201         # (called from __setitem__)
IndexError: Index cannot be multidimensional

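I see an obvious problem with lat_inds: it can hold up to 1015*1359 entries, but you are trying to use it as an index into the latitude dimension, which has size 1015. So IMO you should first find the values common to lat_inds and lon_inds, i.e. the points that satisfy both the lon and the lat limits, and then use that array to index the flattened data. Something like: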

# np.where returns a tuple, so take element [0] to get the index arrays
uni_ind = np.intersect1d(lat_inds[0], lon_inds[0])
T_subset = a.variables['T2'][0].ravel()[uni_ind]

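Converting the array back to 2D may bring more problems, because I assume your original data is not in cylindrical coordinates, so the resulting subset may not be rectangular. This code is untested; if you share the original data files I can test it as well.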
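To illustrate the flat-index idea above without the WRF files, here is a minimal sketch on a small synthetic curvilinear grid. The grid values and the names `lats`, `lons` and `temp` are made up for illustration and only stand in for XLAT, XLONG and T2. `np.unravel_index` maps the flat indices back to (row, column) pairs, which makes it easy to see that the selection is generally not rectangular.

```python
import numpy as np

# Hypothetical 4x5 curvilinear grid: latitude drifts slightly with column,
# longitude drifts slightly with row (stand-ins for XLAT[0] / XLONG[0]).
rows, cols = np.meshgrid(np.arange(4), np.arange(5), indexing="ij")
lats = 48.0 + 1.5 * rows + 0.2 * cols
lons = -126.0 + 2.5 * cols + 0.3 * rows
temp = np.arange(20, dtype=float).reshape(4, 5)  # stand-in for T2[0]

lat_bnds, lon_bnds = [49, 53], [-125, -115]

# np.where returns a tuple; element [0] holds the flat index array.
lats1d, lons1d = lats.ravel(), lons.ravel()
lat_inds = np.where((lats1d > lat_bnds[0]) & (lats1d < lat_bnds[1]))[0]
lon_inds = np.where((lons1d > lon_bnds[0]) & (lons1d < lon_bnds[1]))[0]

# Flat indices of the points that satisfy both the lat and the lon limits.
uni_ind = np.intersect1d(lat_inds, lon_inds)
T_subset = temp.ravel()[uni_ind]

# Recover the 2D (row, column) position of every selected point.
sel_rows, sel_cols = np.unravel_index(uni_ind, temp.shape)
```

On this toy grid the last selected row contains fewer columns than the rows above it, so `T_subset` cannot simply be reshaped into a 2D block.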

EDIT: For proper plotting it is easier to use a mask; this example should be informative enough.

import numpy as np
from netCDF4 import Dataset
import matplotlib.pyplot as plt

b = Dataset('wrfout_conus_constants.nc')
a = Dataset('wrf2d_d01_2010-01-11_000000')

## Data coords
xlong = b.variables['XLONG'][0]
xlat = b.variables['XLAT'][0]
## Data var
temp = a.variables['T2'][0]
## Data bounds
longmax, longmin = -115, -125
latmax, latmin = 53, 49
## Mask coordinates according to bounds
latmask=np.ma.masked_where(xlat<latmin,xlat).mask+np.ma.masked_where(xlat>latmax,xlat).mask
lonmask=np.ma.masked_where(xlong<longmin,xlong).mask+np.ma.masked_where(xlong>longmax,xlong).mask
totmask = lonmask + latmask
## Show mask compared to full domain
plt.pcolormesh(totmask)
## Apply mask to data
temp_masked = np.ma.masked_where(totmask,temp)
## plot masked data
fig=plt.figure()
plt.contourf(temp_masked)
## plot full domain
fig=plt.figure()
plt.contourf(temp)
plt.show()

I'm not sure why it isn't working, but I think this does what you want and is cleaner:

import numpy as np
from netCDF4 import Dataset
import matplotlib.pyplot as plt

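# Open the coordinate file and the data file (file names as used in the
# other answer's example).
b = Dataset('wrfout_conus_constants.nc')
a = Dataset('wrf2d_d01_2010-01-11_000000')

# By indexing at 0 along the first dimension, we eliminate the time
# dimension, which only had size 1 anyway.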
lons = b.variables['XLONG'][0]
lats = b.variables['XLAT'][0]
temp = a.variables['T2'][0]

lat_bnds, lon_bnds = [49,53], [-125,-115]

# Just AND together all of them and make a big mask
subset = ((lats > lat_bnds[0]) & (lats < lat_bnds[1]) & 
          (lons > lon_bnds[0]) & (lons < lon_bnds[1]))

# Apply the mask: boolean indexing with a 2D mask on the 2D array
# returns a 1D array of the selected values
T_subset = temp[subset]
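
Note that indexing a 2D array with a 2D boolean mask returns a flat 1D array. If a rectangular sub-array is needed instead (for example to keep the time dimension, or to plot), one option is to slice the bounding box of the mask and then mask the cells inside the box that fall outside the bounds. The following is only a sketch under assumptions: it uses a small synthetic grid in place of XLAT, XLONG and T2, and the helper name `bounding_box_slices` is hypothetical.

```python
import numpy as np

def bounding_box_slices(mask):
    """Return (row_slice, col_slice) enclosing all True cells of a 2D mask."""
    rows_hit = np.where(mask.any(axis=1))[0]
    cols_hit = np.where(mask.any(axis=0))[0]
    if rows_hit.size == 0:
        raise ValueError("no grid points fall inside the requested bounds")
    return (slice(rows_hit[0], rows_hit[-1] + 1),
            slice(cols_hit[0], cols_hit[-1] + 1))

# Synthetic stand-ins for XLAT[0], XLONG[0] and T2 (values are made up).
rows, cols = np.meshgrid(np.arange(4), np.arange(5), indexing="ij")
lats = 48.0 + 1.5 * rows + 0.2 * cols
lons = -126.0 + 2.5 * cols + 0.3 * rows
t2 = np.arange(20, dtype=float).reshape(1, 4, 5)  # (time, south_north, west_east)

lat_bnds, lon_bnds = [49, 53], [-125, -115]
subset = ((lats > lat_bnds[0]) & (lats < lat_bnds[1]) &
          (lons > lon_bnds[0]) & (lons < lon_bnds[1]))

row_sl, col_sl = bounding_box_slices(subset)

# Plain slices keep the time dimension and give a rectangular block.
T_box = t2[:, row_sl, col_sl]

# Inside the box, hide the cells that lie outside the lat/lon bounds.
outside = np.broadcast_to(~subset[row_sl, col_sl], T_box.shape).copy()
T_box_masked = np.ma.masked_array(T_box, mask=outside)
```

Because `row_sl` and `col_sl` are ordinary slices, the same indexing pattern should also be usable on the netCDF variable itself, e.g. `a.variables['T2'][:, row_sl, col_sl]`, which avoids the multidimensional-index error from the question.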