Loop over netCDF datetime format and calculate mean based on month

Question

I have a dataset (netCDF4 input_file) with dimensions (504, 720, 500), where the first dimension holds datetime values:

0     1979-01-15
1     1979-02-15
2     1979-03-15
3     1979-04-15
4     1979-05-15
         ...    
499   2020-08-15
500   2020-09-15
501   2020-10-15
502   2020-11-15
503   2020-12-15
Length: 504, dtype: datetime64[ns]

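There is a variable whose values I want to average by month. In the end I want 12 values, where the variable is averaged according to the month in the first dimension.

I tried looping over it like this: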

# empty dataframe
df = pd.DataFrame(columns = ['Month', 'Value'])

for i in range(size(df['time'])):
    month = input_file['time'][i].month # get the current month
    avg = np.average(input_file['values'][i, :, :]) # average for the month of that year

    # append to df
    df = df.append(pd.DataFrame({'Month' : month,
                                 'Value' : avg})

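But I'm a bit lost at this point: this doesn't work (invalid syntax), and I would still need to loop over the values again to get the average for each month separately.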

Answer 1

Assuming the second and third dimensions are latitude and longitude, it looks like all you need is:

input_file.mean(dim = ['lat', 'lon'])

You can then convert the result to a DataFrame with .to_dataframe().
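
As a minimal sketch of this answer, the example below builds a small synthetic dataset in place of the real file, averages over the spatial dimensions, converts to a DataFrame, and then groups by calendar month to get the 12 values the question asks for. The names "values", "lat" and "lon", and the tiny grid size, are assumptions; the real file may use different names. Note that a plain mean over lat/lon is not area-weighted.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real file: 24 mid-month time steps on a 3 x 4 grid.
# The variable and dimension names here are assumptions, not taken from the file.
time = pd.date_range("1979-01-01", periods=24, freq="MS") + pd.Timedelta(days=14)
data = np.random.default_rng(0).random((24, 3, 4))
ds = xr.Dataset(
    {"values": (("time", "lat", "lon"), data)},
    coords={"time": time, "lat": np.arange(3), "lon": np.arange(4)},
)

# Average over the two spatial dimensions, leaving one value per time step.
spatial_mean = ds.mean(dim=["lat", "lon"])

# Convert to a pandas DataFrame indexed by time.
df = spatial_mean.to_dataframe()

# Group by calendar month so that all years collapse into 12 values.
monthly = df.groupby(df.index.month)["values"].mean()
print(monthly)
```

With the real data, ds would come from xr.open_dataset(...) instead of being built by hand.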

Answer 2

I'm not sure whether this is what you need:

import xarray as xr
ds = xr.open_dataset('file.nc')
ds.resample(time='M').mean()
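
One caveat: on data that is already monthly, resample(time='M') keeps one value per month of each year (504 here) rather than 12. If the goal is one mean per calendar month across all years, xarray's groupby on the "time.month" accessor may be closer to what is wanted. The sketch below uses a small synthetic DataArray; the name "values" and the dimension names are assumptions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in: 36 mid-month time steps on a 2 x 3 grid (names are assumed).
time = pd.date_range("1979-01-01", periods=36, freq="MS") + pd.Timedelta(days=14)
da = xr.DataArray(
    np.random.default_rng(1).random((36, 2, 3)),
    dims=("time", "lat", "lon"),
    coords={"time": time},
    name="values",
)

# groupby("time.month") pools all Januaries, all Februaries, and so on,
# across every year; the result has a "month" dimension of length 12.
climatology = da.groupby("time.month").mean(dim="time")

# Averaging over space as well gives one number per calendar month.
monthly_values = climatology.mean(dim=["lat", "lon"])
print(monthly_values.to_series())
```

This avoids the explicit loop from the question: the grouping and averaging are done by xarray in two calls.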
