当变量 lat/lon 在 R 中存储为矩阵时如何提取一组 NetCDF 值

Question

我正在处理 3 维（x、y、时间）NetCDF 文件，其中包含一年中每小时的 PM10 浓度估计值。我的目标是提取几个坐标的每小时估计值 --- 因此将是 365days*24hrs=8760 estimates/year/coordinate --- 然后平均到每日 (365) 估计值。

我的脚本（见下文）在 2013 年运行良好，但在 2012 年输出有很多 NA。我注意到的区别是 2012 年的 lon/lat 文件以矩阵形式存储...

File E:/ENSa.2012.PM10.yearlyrea_.nc (NC_FORMAT_CLASSIC):

     3 variables (excluding dimension variables):
        float lon[x,y]   
            long_name: Longitude
            units: degrees_east
        float lat[x,y]   
            long_name: Latitude
            units: degrees_north
        float PM10[x,y,time]   
            units: ug/m3

     3 dimensions:
        x  Size:701
        y  Size:401
        time  Size:8784   *** is unlimited ***
            units: day as %Y%m%d.%f
            calendar: proleptic_gregorian

head(lon) 
      [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9]
[1,] -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0 -25.0
[2,] -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9 -24.9

对于 2013 年的文件，lon 是 'normal' 像这样

File E:/ENSa.2013.PM25.yearlyrea.nc (NC_FORMAT_NETCDF4):

     1 variables (excluding dimension variables):
        float PM25[lon,lat,time]   (Chunking: [701,401,1])  
            long_name: PM25
            units: ug
            _FillValue: -999

     3 dimensions:
        lon  Size:701
            standard_name: longitude
            long_name: longitude
            units: degrees_east
            axis: X
        lat  Size:401
            standard_name: latitude
            long_name: latitude
            units: degrees_north
            axis: Y
        time  Size:8760   *** is unlimited ***
            standard_name: time
            long_name: time at end of period
            units: day as %Y%m%d.%f
            calendar: proleptic_gregorian

head(lon) 
[1] -25.0 -24.9 -24.8 -24.7 -24.6 -24.5

我正在使用以下脚本：

# Command brick reads all layers (time slices) in the file
  pm102013 <- brick("ENSa.2013.PM10.yearlyrea.nc", varname = "PM10")

# Get date index from the file
  idx <- getZ(pm102013)

# Put coordinates and extract values for all time steps
  coords <- matrix(c( -2.094278,    -1.830583,  -2.584482,  -0.175269,  -3.17625,   0.54797,    -2.678731,  -1.433611,  -1.456944,  -3.182186,  
 57.15736,  52.511722,  51.462839,  51.54421,   51.48178,   51.374264,  51.638094,  53.230583,  53.231722,  55.945589),
 ncol = 2) # longitude and latitude
 vals <- extract(pm102013, coords, df=T)

# Merge dates and values and fix data frame names
 df.pm102013 <- data.frame(idx, t(vals)[-1,])
 rownames(df.pm102013) <- NULL
 names(df.pm102013) <- c('date','UKA00399', 'UKA00479', 'UKA00494', 'UKA00259', 'UKA00217', 'UKA00553', 'UKA00515', 'UKA00530', 'UKA00529', 'UKA00454')

#output
 options(max.print=100000000)
 sink("PM10_2013.txt")
 print(df.pm102013)
 sink()

有人知道 'fix' lon/lat 问题的解决方法吗？或者有另一种有效的方法来提取和平均数据？

Answer 1

您可以从 bash 中的命令行中提取离位置 lon/lat 最近的点并使用 CDO 制作日平均值：

lon=34.4
lat=22.1
cdo daymean -remapnn,lon=${lon}/lat=${lat} input.nc output_${lon}_${lat}.nc

remapnn 上的减号表示结果通过管道传输到 daymean 命令。对于每个所需的点，您可以将其放入 bash 的循环中。

当变量 lat/lon 在 R 中存储为矩阵时如何提取一组 NetCDF 值

how to extract a set of NetCDF values when variable lat/lon stored as matrix in R

r

raster

netcdf