How to select specific data variables from an xarray dataset
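Background
I am trying to download GFS weather data netcdf4 files via xarray and OPeNDAP. Many thanks to Vorticity0123 for their earlier post, which let me put together the skeleton of the Python script (shown below).
Problem
The GFS dataset has 195 data variables, but I don't need most of them. I only need ten:
- ugrd100m, vgrd100m, dswrfsfc, tcdcclm, tcdcblcll, tcdclcll, tcdcmcll,
tcdchcll, tmp2m, gustsfc
Request for help
I have looked through the xarray readthedocs pages and elsewhere, but I couldn't work out how to narrow my dataset down to just these ten data variables. Does anyone know how to narrow down the list of variables in a dataset?
Python script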
import numpy as np
import xarray as xr
# File Details
dt = '20201124'
res = 25
step = '1hr'
run = '{:02}'.format(18)
# URL
URL = f'http://nomads.ncep.noaa.gov:80/dods/gfs_0p{res}_{step}/gfs{dt}/gfs_0p{res}_{step}_{run}z'
# Load data
dataset = xr.open_dataset(URL)
time = dataset.variables['time']
lat = dataset.variables['lat'][:]
lon = dataset.variables['lon'][:]
lev = dataset.variables['lev'][:]
# Narrow Down Selection
time_toplot = time
lat_toplot = np.arange(-43, -17, 0.5)
lon_toplot = np.arange(135, 152, 0.5)
lev_toplot = np.array([1000])
# Select required data via xarray
dataset = dataset.sel(time=time_toplot, lon=lon_toplot, lat=lat_toplot)
print(dataset)
You can use xarray's dict-like syntax: index the dataset with a list of variable names.
variables = [
'ugrd100m',
'vgrd100m',
'dswrfsfc',
'tcdcclm',
'tcdcblcll',
'tcdclcll',
'tcdcmcll',
'tcdchcll',
'tmp2m',
'gustsfc'
]
dataset[variables]
This gives you:
<xarray.Dataset>
Dimensions: (lat: 721, lon: 1440, time: 121)
Coordinates:
* time (time) datetime64[ns] 2020-11-24T18:00:00 ... 2020-11-29T18:00:00
* lat (lat) float64 -90.0 -89.75 -89.5 -89.25 ... 89.25 89.5 89.75 90.0
* lon (lon) float64 0.0 0.25 0.5 0.75 1.0 ... 359.0 359.2 359.5 359.8
Data variables:
ugrd100m (time, lat, lon) float32 ...
vgrd100m (time, lat, lon) float32 ...
dswrfsfc (time, lat, lon) float32 ...
tcdcclm (time, lat, lon) float32 ...
tcdcblcll (time, lat, lon) float32 ...
tcdclcll (time, lat, lon) float32 ...
tcdcmcll (time, lat, lon) float32 ...
tcdchcll (time, lat, lon) float32 ...
tmp2m (time, lat, lon) float32 ...
gustsfc (time, lat, lon) float32 ...
Attributes:
title: GFS 0.25 deg starting from 18Z24nov2020, downloaded Nov 24 ...
Conventions: COARDS\nGrADS
dataType: Grid
history: Sat Nov 28 05:52:44 GMT 2020 : imported by GrADS Data Serve...
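To try the selection without hitting the NOMADS server, here is a minimal sketch on a small synthetic dataset. The grid, the zero-filled values, and the extra `pressfc` variable are stand-ins invented for illustration, not the real GFS data. It also shows `drop_vars` as an alternative way to reach the same subset, which may be handier when the list of unwanted variables is the shorter one.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the GFS dataset: five variables on a tiny grid.
coords = {
    "time": np.arange(3),
    "lat": np.array([-43.0, -42.5]),
    "lon": np.array([135.0, 135.5]),
}
names = ["ugrd100m", "vgrd100m", "tmp2m", "gustsfc", "pressfc"]
ds = xr.Dataset(
    {
        name: (("time", "lat", "lon"), np.zeros((3, 2, 2), dtype="float32"))
        for name in names
    },
    coords=coords,
)

wanted = ["ugrd100m", "vgrd100m", "tmp2m"]

# Indexing with a list of names returns a new Dataset holding only those
# data variables; the coordinates they depend on are kept.
subset = ds[wanted]

# Same result, reached by dropping every data variable that is not wanted.
unwanted = [name for name in ds.data_vars if name not in wanted]
subset_by_drop = ds.drop_vars(unwanted)

print(subset)
```

Since `xr.open_dataset` loads values lazily, picking the variables first and then applying `.sel(...)` should keep the amount of data actually requested over OPeNDAP down to what you use.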