使用 xarray interp 重新投影数据阵列?

Using xarray interp to reproject a dataarray?

我看了很多 the interp function 的 xarray 文档,但我无法真正理解它。我看到这是一个重投影,但它并不适合真实的案例。 例如,他们是否可以通过在 webmercator 数据上重新投影此数据集来理解它?

类似于示例:

import xarray as xr
from pyproj import Transformer

ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
lon, lat = np.meshgrid(ds.lon, ds.lat)
shp = lon.shape
# reproject the grid
gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
x, y = gcs_to_3857.transform(lon.ravel(), lat.ravel())
# future index for a regular raster
X= np.linspace(x.min(), x.max(), shp[1])
Y= np.linspace(y.min(), y.max(), shp[0])   
data["x"] = xr.DataArray(np.reshape(x, shp), dims=("lat", "lon"))
data["y"] = xr.DataArray(np.reshape(y, shp), dims=("lat", "lon"))

在这里,我卡住了

应该类似于 ds.interp(x=X,y=Y) 但该数组是在经纬度上建立索引的

这让我有点困惑...

我觉得大概是这样的:

In [1]: import xarray as xr
   ...: import numpy as np
   ...: from pyproj import Transformer
   ...:
   ...: ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)

设计目标网格转换后的space:

In [2]: # find the new bounds in mercator space
   ...: gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
   ...: x, y = gcs_to_3857.transform(
   ...:     [ds.lon.min(), ds.lon.max()],
   ...:     [ds.lat.min(), ds.lat.max()],
   ...: )

In [3]: # design a target grid for the re-projected data - can be any dims you want
   ...: X = np.linspace(x[0], x[1], 500)
   ...: Y = np.linspace(y[0], y[1], 600)
   ...: XX, YY = np.meshgrid(X, Y)

将此网格转换回 lat/lon

In [4]: # build a reverse transformer from Mercator back to lat/lons
   ...: merc_to_latlng = Transformer.from_crs(3857, 4326, always_xy=True)
   ...: new_lons, new_lats = merc_to_latlng.transform(XX.ravel(), YY.ravel())
   ...: new_lons = new_lons.reshape(XX.shape)
   ...: new_lats = new_lats.reshape(YY.shape)

创建新的 DataArrays 以索引与目标网格上的网格点对应的 lat/lon 值(在墨卡托 space 中由 x 和 y 索引):

In [5]: # create arrays indexed by (x, y); also convert lons back to (0, 360)
   ...: new_lons_da = xr.DataArray((new_lons % 360), dims=["y", "x"], coords=[Y, X])
   ...: new_lats_da = xr.DataArray(new_lats, dims=["y", "x"], coords=[Y, X])

使用 xarray 的高级索引将数据插值到新点 re-indexing 到新网格

In [6]: ds_mercator = ds.interp(lon=new_lons_da, lat=new_lats_da, method="linear")

现在数据由 x 和 y 索引,墨卡托 spaced 中的点相等 space:

In [7]: ds_mercator
Out[7]:
<xarray.Dataset>
Dimensions:  (y: 600, x: 500)
Coordinates:
    time     datetime64[ns] 2013-01-01
    lon      (y, x) float64 200.0 200.3 200.5 200.8 ... 329.2 329.5 329.7 330.0
    lat      (y, x) float64 15.0 15.0 15.0 15.0 15.0 ... 75.0 75.0 75.0 75.0
  * y        (y) float64 1.689e+06 1.708e+06 1.727e+06 ... 1.291e+07 1.293e+07
  * x        (x) float64 -1.781e+07 -1.778e+07 ... -3.369e+06 -3.34e+06
Data variables:
    air      (y, x) float64 nan nan nan nan nan ... 237.3 237.6 238.0 238.3 nan
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

与原始(左)数据集相比,可以在变换后(右)的轴(和失真)中看到新投影:

In [8]: fig, axes = plt.subplots(1, 2, figsize=(14, 5))
   ...: ds.air.plot(ax=axes[0])
   ...: ds_mercator.air.plot(ax=axes[1])
Out[8]: <matplotlib.collections.QuadMesh at 0x2b3b94be0

您也可以按照建议使用 rioxarray 中的重新投影。 这是代码:

import xarray as xr
import numpy as np
from rasterio.enums import Resampling
import matplotlib.pyplot as plt

ds = xr.tutorial.open_dataset('air_temperature').isel(time=0)
ds = ds.rio.write_crs('EPSG:4326')
dst = ds.rio.reproject('EPSG:3857', shape=(250, 250), resampling=Resampling.bilinear, nodata=np.nan)

fig, ax = plt.subplots(1, 2, figsize=(14, 5))
ds.air.plot(ax=ax[0])
dst.air.plot(ax=ax[1])