Compute annual mean using xarray

Question

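I have a Python xarray dataset with dimensions time, x, y and a variable value1. I'm trying to compute the annual mean of value1 for each x,y coordinate pair.

While reading the documentation I came across this call and ran it: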

ds.groupby('time.year').mean()

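This appears to average value1 over all x,y coordinate pairs for each year, rather than computing the annual mean separately for each individual x,y coordinate pair.

Although the snippet above produces the wrong output, I'm very interested in its concise form. I'd really like to find the "xarray trick" for computing the annual mean for a given x,y coordinate pair, rather than hacking it together myself.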

Can anyone point me in the right direction? Should I temporarily convert this to a pandas object?

Answer 1

To avoid averaging over all dimensions by default, simply state explicitly which dimension to average over: ds.groupby('time.year').mean('time')
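A minimal sketch of the call above on a toy dataset, to show that x and y survive the reduction. The dataset here is hypothetical (random daily values on a 2 by 3 grid), built only for illustration. Depending on the xarray version, calling .mean() with no argument may already reduce only over the grouped dimension, but passing 'time' explicitly is unambiguous either way.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy dataset: two years of daily values on a 2 x 3 grid
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"value1": (("time", "x", "y"), rng.random((time.size, 2, 3)))},
    coords={"time": time, "x": [0, 1], "y": [0, 1, 2]},
)

# Group by calendar year and average over time only; x and y are kept
annual = ds.groupby("time.year").mean("time")

print(annual["value1"].dims)
print(dict(annual.sizes))
```

The result has one value per year for every x,y pair, so annual["value1"].sel(year=2000) is a 2 by 3 field.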

Answer 2

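Note that if you are working with monthly rather than daily data, calling ds.groupby('time.year').mean('time') will be incorrect. A plain mean gives months of different lengths, such as February and July, equal weight, which is wrong.

Instead, use the following from NCAR: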

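import numpy as np
import xarray as xr

def weighted_temporal_mean(ds, var):
  """
  Annual mean of ds[var], weighted by the number of days in each month.
  """
  # Determine the month length
  month_length = ds.time.dt.days_in_month

  # Calculate the weights: each month's share of the days in its year
  wgts = month_length.groupby("time.year") / month_length.groupby("time.year").sum()

  # Make sure the weights in each year add up to 1
  # (older versions of this snippet passed xr.ALL_DIMS, which recent xarray no longer provides)
  np.testing.assert_allclose(wgts.groupby("time.year").sum(dim="time"), 1.0)

  # Subset our dataset for our variable
  obs = ds[var]

  # Set up masking for NaN values: 0 where data is missing, 1 elsewhere
  cond = obs.isnull()
  ones = xr.where(cond, 0.0, 1.0)

  # Calculate the numerator
  # ("YS" is the year-start frequency; the older "AS" alias is deprecated in recent pandas)
  obs_sum = (obs * wgts).resample(time="YS").sum(dim="time")

  # Calculate the denominator (sum of the weights of the non-missing months)
  ones_out = (ones * wgts).resample(time="YS").sum(dim="time")

  # Return the weighted average
  return obs_sum / ones_out

# ds_first_five_years is a monthly dataset with a 'TEMP' variable
average_weighted_temp = weighted_temporal_mean(ds_first_five_years, 'TEMP')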
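To get a feel for how much the weighting matters, here is a small self-contained sketch comparing the plain and the day-weighted annual mean. The monthly series is hypothetical (the values 1 to 24 over 2001 and 2002), and this simplified version assumes there are no missing values, so it skips the NaN masking used in the function above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical monthly series: values 1..24 for Jan 2001 through Dec 2002
time = pd.date_range("2001-01-01", periods=24, freq="MS")
da = xr.DataArray(np.arange(1.0, 25.0), coords={"time": time}, dims="time")

# Number of days in each month, used as weights
days = da.time.dt.days_in_month

# Plain mean: every month counts equally
unweighted = da.groupby("time.year").mean("time")

# Day-weighted mean: sum(value * days) / sum(days), computed per year
weighted = (da * days).groupby("time.year").sum("time") / days.groupby("time.year").sum("time")

print(unweighted.values)
print(weighted.values)
```

For 2001 the plain mean is 6.5, while the day-weighted mean is 2382/365, roughly 6.526. The gap is small for this smooth series but can be larger for data with a strong seasonal cycle.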

time-series

python-xarray