Getting the mean of a netCDF file using xarray

Question

I opened a netCDF file in Python using xarray; the dataset summary is shown below.

Dimensions:    (latitude: 721, longitude: 1440, time: 41)
Coordinates:
  * longitude  (longitude) float32 0.0 0.25 0.5 0.75 ... 359.25 359.5 359.75
  * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
    expver     int32 1
  * time       (time) datetime64[ns] 1979-01-01 1980-01-01 ... 2019-01-01
Data variables:
    z          (time, latitude, longitude) float32 50517.914 ... 49769.473
Attributes:
    Conventions:  CF-1.6
    history:      2020-03-02 12:47:40 GMT by grib_to_netcdf-2.16.0: /opt/ecmw...

I want to get the mean of z along the latitude and longitude dimensions.

I tried this code:

df.mean(axis = 0)

But it dropped the time coordinate and returned something like this:

Dimensions:  (latitude: 721, longitude: 1440)
Coordinates:
    expver   int32 1
Dimensions without coordinates: latitude, longitude
Data variables:
    z        (latitude, longitude) float32 49742.03 49742.03 ... 50306.242

Am I doing something wrong? Please help me solve this.

Answer 1

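You need to specify the dimension by name (dim) rather than by axis. With axis=0 the mean is taken over the first dimension, which here is time, and that is why the time coordinate disappeared.

Use df.mean(dim='longitude')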
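To average over both spatial dimensions at once, a list of dimension names can be passed to dim. Here is a minimal sketch on a small synthetic dataset; the name ds and all of the values are made up for illustration and merely stand in for the dataset in the question:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the dataset in the question (tiny grid, made-up values)
ds = xr.Dataset(
    {
        "z": (
            ("time", "latitude", "longitude"),
            np.arange(24, dtype="float64").reshape(2, 3, 4),
        )
    },
    coords={
        "time": np.array(["1979-01-01", "1980-01-01"], dtype="datetime64[ns]"),
        "latitude": [90.0, 0.0, -90.0],
        "longitude": [0.0, 90.0, 180.0, 270.0],
    },
)

# Average over both spatial dimensions by name; the time dimension is kept
spatial_mean = ds.mean(dim=["latitude", "longitude"])
print(spatial_mean.z.dims)
print(spatial_mean.z.values)
```

Note that this is a plain, unweighted mean: every grid cell counts equally regardless of its area.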

Answer 2

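Warning! If you apply this along latitude (which you need to do to fully answer the question), the accepted answer will give you the wrong result. You need to weight each cell, because the cells are not all the same size: on a regular latitude-longitude grid they get smaller as you move toward the poles.

xarray solution:

So, to compute a weighted mean, you need to construct the weights as in the following code: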

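import numpy as np

# Weight each grid cell by the cosine of its latitude (proportional to cell area)
weights = np.cos(np.deg2rad(df.latitude))
weights.name = "weights"
z_weighted = df.z.weighted(weights)
# Average over both spatial dimensions; the time dimension is kept
weighted_mean = z_weighted.mean(("longitude", "latitude"))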

See this discussion in the xarray documentation for further details and an example comparison.
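To get a feel for the difference, here is a small sketch comparing the unweighted and the cosine-weighted mean on a synthetic field. The field (warm at the equator, cold at the poles), the coarse grid and the name t are all invented for illustration, and weighted() assumes a reasonably recent xarray:

```python
import numpy as np
import xarray as xr

# Coarse hypothetical grid: 7 latitudes from pole to pole, 4 longitudes
lat = np.linspace(90.0, -90.0, 7)
lon = np.arange(0.0, 360.0, 90.0)

# Made-up "temperature": 30 at the equator, falling to 0 at the poles
profile = 30.0 * np.cos(np.deg2rad(lat))
data = np.broadcast_to(profile[None, :, None], (2, lat.size, lon.size)).copy()

da = xr.DataArray(
    data,
    dims=("time", "latitude", "longitude"),
    coords={"time": [0, 1], "latitude": lat, "longitude": lon},
    name="t",
)

# Weights proportional to grid-cell area on a regular lat-lon grid
weights = np.cos(np.deg2rad(da.latitude))
weights.name = "weights"

unweighted = da.mean(("longitude", "latitude"))
weighted = da.weighted(weights).mean(("longitude", "latitude"))

print(unweighted.values)
print(weighted.values)
```

In this toy case the unweighted mean comes out noticeably colder than the weighted one, because the small polar cells are counted as much as the large equatorial ones.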

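The size of the error depends on the region you average over and on how strong the variable's gradient is in the latitudinal direction: the larger the latitude range and the stronger the gradient, the worse it gets. For a global temperature field, which is the example in the xarray documentation, the error is well over 5 degrees Celsius! The unweighted answer is colder because the polar cells count just as much as all the others, even though the grid cells there are much smaller.

Alternative CDO solution

By the way, you can also do this from the command line with cdo, like this: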

cdo fldmean in.nc out.nc

cdo accounts for the grid, so you do not need to worry about the weighting. You can also call cdo directly from Python using the CDO package.
