使用 xarray interp 重新投影数据阵列?
Using xarray interp to reproject a dataarray?
我看了很多 the interp function 的 xarray 文档,但我无法真正理解它。我看到这是一个重投影,但它并不适合真实的案例。
例如,他们是否可以通过在 webmercator 数据上重新投影此数据集来理解它?
类似于示例:
import xarray as xr
from pyproj import Transformer
ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
lon, lat = np.meshgrid(ds.lon, ds.lat)
shp = lon.shape
# reproject the grid
gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
x, y = gcs_to_3857.transform(lon.ravel(), lat.ravel())
# future index for a regular raster
X= np.linspace(x.min(), x.max(), shp[1])
Y= np.linspace(y.min(), y.max(), shp[0])
data["x"] = xr.DataArray(np.reshape(x, shp), dims=("lat", "lon"))
data["y"] = xr.DataArray(np.reshape(y, shp), dims=("lat", "lon"))
在这里,我卡住了
应该类似于 ds.interp(x=X,y=Y)
但该数组是在经纬度上建立索引的
这让我有点困惑...
我觉得大概是这样的:
In [1]: import xarray as xr
...: import numpy as np
...: from pyproj import Transformer
...:
...: ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
设计目标网格转换后的space:
In [2]: # find the new bounds in mercator space
...: gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
...: x, y = gcs_to_3857.transform(
...: [ds.lon.min(), ds.lon.max()],
...: [ds.lat.min(), ds.lat.max()],
...: )
In [3]: # design a target grid for the re-projected data - can be any dims you want
...: X = np.linspace(x[0], x[1], 500)
...: Y = np.linspace(y[0], y[1], 600)
...: XX, YY = np.meshgrid(X, Y)
将此网格转换回 lat/lon
In [4]: # build a reverse transformer from Mercator back to lat/lons
...: merc_to_latlng = Transformer.from_crs(3857, 4326, always_xy=True)
...: new_lons, new_lats = merc_to_latlng.transform(XX.ravel(), YY.ravel())
...: new_lons = new_lons.reshape(XX.shape)
...: new_lats = new_lats.reshape(YY.shape)
创建新的 DataArrays 以索引与目标网格上的网格点对应的 lat/lon 值(在墨卡托 space 中由 x 和 y 索引):
In [5]: # create arrays indexed by (x, y); also convert lons back to (0, 360)
...: new_lons_da = xr.DataArray((new_lons % 360), dims=["y", "x"], coords=[Y, X])
...: new_lats_da = xr.DataArray(new_lats, dims=["y", "x"], coords=[Y, X])
使用 xarray 的高级索引将数据插值到新点而 re-indexing 到新网格
In [6]: ds_mercator = ds.interp(lon=new_lons_da, lat=new_lats_da, method="linear")
现在数据由 x 和 y 索引,墨卡托 spaced 中的点相等 space:
In [7]: ds_mercator
Out[7]:
<xarray.Dataset>
Dimensions: (y: 600, x: 500)
Coordinates:
time datetime64[ns] 2013-01-01
lon (y, x) float64 200.0 200.3 200.5 200.8 ... 329.2 329.5 329.7 330.0
lat (y, x) float64 15.0 15.0 15.0 15.0 15.0 ... 75.0 75.0 75.0 75.0
* y (y) float64 1.689e+06 1.708e+06 1.727e+06 ... 1.291e+07 1.293e+07
* x (x) float64 -1.781e+07 -1.778e+07 ... -3.369e+06 -3.34e+06
Data variables:
air (y, x) float64 nan nan nan nan nan ... 237.3 237.6 238.0 238.3 nan
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
与原始(左)数据集相比,可以在变换后(右)的轴(和失真)中看到新投影:
In [8]: fig, axes = plt.subplots(1, 2, figsize=(14, 5))
...: ds.air.plot(ax=axes[0])
...: ds_mercator.air.plot(ax=axes[1])
Out[8]: <matplotlib.collections.QuadMesh at 0x2b3b94be0
您也可以按照建议使用 rioxarray
中的重新投影。
这是代码:
import xarray as xr
import numpy as np
from rasterio.enums import Resampling
import matplotlib.pyplot as plt
ds = xr.tutorial.open_dataset('air_temperature').isel(time=0)
ds = ds.rio.write_crs('EPSG:4326')
dst = ds.rio.reproject('EPSG:3857', shape=(250, 250), resampling=Resampling.bilinear, nodata=np.nan)
fig, ax = plt.subplots(1, 2, figsize=(14, 5))
ds.air.plot(ax=ax[0])
dst.air.plot(ax=ax[1])
我看了很多 the interp function 的 xarray 文档,但我无法真正理解它。我看到这是一个重投影,但它并不适合真实的案例。 例如,他们是否可以通过在 webmercator 数据上重新投影此数据集来理解它?
类似于示例:
import xarray as xr
from pyproj import Transformer
ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
lon, lat = np.meshgrid(ds.lon, ds.lat)
shp = lon.shape
# reproject the grid
gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
x, y = gcs_to_3857.transform(lon.ravel(), lat.ravel())
# future index for a regular raster
X= np.linspace(x.min(), x.max(), shp[1])
Y= np.linspace(y.min(), y.max(), shp[0])
data["x"] = xr.DataArray(np.reshape(x, shp), dims=("lat", "lon"))
data["y"] = xr.DataArray(np.reshape(y, shp), dims=("lat", "lon"))
在这里,我卡住了
应该类似于 ds.interp(x=X,y=Y)
但该数组是在经纬度上建立索引的
这让我有点困惑...
我觉得大概是这样的:
In [1]: import xarray as xr
...: import numpy as np
...: from pyproj import Transformer
...:
...: ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
设计目标网格转换后的space:
In [2]: # find the new bounds in mercator space
...: gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
...: x, y = gcs_to_3857.transform(
...: [ds.lon.min(), ds.lon.max()],
...: [ds.lat.min(), ds.lat.max()],
...: )
In [3]: # design a target grid for the re-projected data - can be any dims you want
...: X = np.linspace(x[0], x[1], 500)
...: Y = np.linspace(y[0], y[1], 600)
...: XX, YY = np.meshgrid(X, Y)
将此网格转换回 lat/lon
In [4]: # build a reverse transformer from Mercator back to lat/lons
...: merc_to_latlng = Transformer.from_crs(3857, 4326, always_xy=True)
...: new_lons, new_lats = merc_to_latlng.transform(XX.ravel(), YY.ravel())
...: new_lons = new_lons.reshape(XX.shape)
...: new_lats = new_lats.reshape(YY.shape)
创建新的 DataArrays 以索引与目标网格上的网格点对应的 lat/lon 值(在墨卡托 space 中由 x 和 y 索引):
In [5]: # create arrays indexed by (x, y); also convert lons back to (0, 360)
...: new_lons_da = xr.DataArray((new_lons % 360), dims=["y", "x"], coords=[Y, X])
...: new_lats_da = xr.DataArray(new_lats, dims=["y", "x"], coords=[Y, X])
使用 xarray 的高级索引将数据插值到新点而 re-indexing 到新网格
In [6]: ds_mercator = ds.interp(lon=new_lons_da, lat=new_lats_da, method="linear")
现在数据由 x 和 y 索引,墨卡托 spaced 中的点相等 space:
In [7]: ds_mercator
Out[7]:
<xarray.Dataset>
Dimensions: (y: 600, x: 500)
Coordinates:
time datetime64[ns] 2013-01-01
lon (y, x) float64 200.0 200.3 200.5 200.8 ... 329.2 329.5 329.7 330.0
lat (y, x) float64 15.0 15.0 15.0 15.0 15.0 ... 75.0 75.0 75.0 75.0
* y (y) float64 1.689e+06 1.708e+06 1.727e+06 ... 1.291e+07 1.293e+07
* x (x) float64 -1.781e+07 -1.778e+07 ... -3.369e+06 -3.34e+06
Data variables:
air (y, x) float64 nan nan nan nan nan ... 237.3 237.6 238.0 238.3 nan
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
与原始(左)数据集相比,可以在变换后(右)的轴(和失真)中看到新投影:
In [8]: fig, axes = plt.subplots(1, 2, figsize=(14, 5))
...: ds.air.plot(ax=axes[0])
...: ds_mercator.air.plot(ax=axes[1])
Out[8]: <matplotlib.collections.QuadMesh at 0x2b3b94be0
您也可以按照建议使用 rioxarray
中的重新投影。
这是代码:
import xarray as xr
import numpy as np
from rasterio.enums import Resampling
import matplotlib.pyplot as plt
ds = xr.tutorial.open_dataset('air_temperature').isel(time=0)
ds = ds.rio.write_crs('EPSG:4326')
dst = ds.rio.reproject('EPSG:3857', shape=(250, 250), resampling=Resampling.bilinear, nodata=np.nan)
fig, ax = plt.subplots(1, 2, figsize=(14, 5))
ds.air.plot(ax=ax[0])
dst.air.plot(ax=ax[1])