Pack/compress netCDF data ("add offset" and "scale factor") with CDO, NCO or similar

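I have a large number of netCDF files stored with 64-bit floating-point precision. I would like to pack them using specific values for the add_offset and scale_factor parameters (so that I can convert to short, i.e. 16-bit integer, precision). I have found information on unpacking with CDO operators, but nothing on packing.

Any help? Thanks in advance!

Edit: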

diego@LAcompu:~/new$ ncks -m in.nc
netcdf in {
  dimensions:
    bnds = 2 ;
    lat = 202 ;
    lon = 62 ;
    time = UNLIMITED ; // (15777 currently)

  variables:
    float lat(lat) ;
      lat:standard_name = "latitude" ;
      lat:long_name = "latitude" ;
      lat:units = "degrees_north" ;
      lat:axis = "Y" ;

    float lon(lon) ;
      lon:standard_name = "longitude" ;
      lon:long_name = "longitude" ;
      lon:units = "degrees_east" ;
      lon:axis = "X" ;

    double t2m(time,lat,lon) ;
      t2m:long_name = "2 metre temperature" ;
      t2m:units = "Celsius" ;
      t2m:_FillValue = -32767. ;
      t2m:missing_value = -32767. ;

    double time(time) ;
      time:standard_name = "time" ;
      time:long_name = "time" ;
      time:bounds = "time_bnds" ;
      time:units = "hours since 1900-01-01 00:00:00.0" ;
      time:calendar = "gregorian" ;
      time:axis = "T" ;

    double time_bnds(time,bnds) ;
} // group /
diego@LAcompu:~/new$ ncap2 -v -O -s 't2m=pack_short(t2m,0.00166667,0.0);' in.nc out.nc
ncap2: WARNING pack_short(): Function has been called with more than one argument
diego@LAcompu:~/new$ ncks -m out.nc
netcdf out {
  dimensions:
    lat = 202 ;
    lon = 62 ;
    time = UNLIMITED ; // (15777 currently)

  variables:
    float lat(lat) ;
      lat:standard_name = "latitude" ;
      lat:long_name = "latitude" ;
      lat:units = "degrees_north" ;
      lat:axis = "Y" ;

    float lon(lon) ;
      lon:standard_name = "longitude" ;
      lon:long_name = "longitude" ;
      lon:units = "degrees_east" ;
      lon:axis = "X" ;

    short t2m(time,lat,lon) ;
      t2m:scale_factor = -0.000784701646794361 ;
      t2m:add_offset = -1.01787074416207 ;
      t2m:_FillValue = -32767s ;
      t2m:long_name = "2 metre temperature" ;
      t2m:missing_value = -32767. ;
      t2m:units = "Celsius" ;

    double time(time) ;
      time:standard_name = "time" ;
      time:long_name = "time" ;
      time:bounds = "time_bnds" ;
      time:units = "hours since 1900-01-01 00:00:00.0" ;
      time:calendar = "gregorian" ;
      time:axis = "T" ;
} // group /

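Good question! I'll dig around and see if I can find a way, but in the meantime, did you know that cdo can convert to netCDF4 and compress the files with zip compression? That might also help, and you could also try switching to single-precision floats: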

cdo -b f32 -f nc4 -z zip_9 copy in.nc compressed.nc

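9 is the maximum compression level, although to be honest I usually use zip_4 or zip_5: I find that beyond level 4 or 5 you gain very little extra space, while compression and decompression become much slower.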

Not an answer to your question, but I hope it helps.
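To get a feel for the two ideas in the command above (halving the storage with single precision, then applying zip compression at different levels), here is a minimal Python sketch. It is only an illustration: the synthetic temperature series is made up, and Python's zlib module stands in for the deflate compression applied inside a netCDF4 file, so the byte counts will not match what cdo produces on real data.

```python
import math
import struct
import zlib

# Hypothetical data: a smooth series rounded to 0.1 degrees (an assumption,
# not taken from the files in the question).
values = [round(15.0 + 10.0 * math.sin(i / 50.0), 1) for i in range(20000)]

raw64 = struct.pack(f"<{len(values)}d", *values)  # double: 8 bytes per value
raw32 = struct.pack(f"<{len(values)}f", *values)  # float: 4 bytes per value

# Compress the single-precision bytes at a few zlib levels.
sizes = {level: len(zlib.compress(raw32, level)) for level in (1, 4, 5, 9)}

print(f"float64: {len(raw64)} bytes, float32: {len(raw32)} bytes")
for level, size in sizes.items():
    print(f"zlib level {level}: {size} bytes")
```

Timing each zlib.compress call on a sample of your own data is a quick way to decide whether the higher levels are worth the extra time.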

NCO will automatically pack with optimal values of scale_factor and add_offset, e.g.,

ncpdq -P in.nc out.nc

You can also add lossless compression with, e.g.,
ncpdq -P -L 1 -7 in.nc out.nc

Documentation is at http://nco.sf.net/nco.html#ncpdq
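For intuition about what "optimal" values look like, here is a rough Python sketch of a common way to derive them from a variable's minimum and maximum: the offset is the midpoint of the range and the scale is the range divided by the number of available integer steps. The function names and the sample range of -40 to 50 are hypothetical, and the sketch is not meant as a reproduction of the ncpdq source; the linked documentation describes the algorithm NCO actually uses.

```python
def packing_params(vmin, vmax, nbits=16):
    """Derive (scale_factor, add_offset) so [vmin, vmax] spans the packed type."""
    ndrv = 2 ** nbits - 2  # integer steps used; one code stays free, e.g. for _FillValue
    scale_factor = (vmax - vmin) / ndrv
    add_offset = 0.5 * (vmin + vmax)
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    """Convert a floating-point value to its packed integer."""
    return round((value - add_offset) / scale_factor)


def unpack(packed, scale_factor, add_offset):
    """netCDF convention: unpacked = packed * scale_factor + add_offset."""
    return packed * scale_factor + add_offset


# Hypothetical data range, chosen only for illustration.
scale_factor, add_offset = packing_params(-40.0, 50.0)
packed = pack(12.34, scale_factor, add_offset)
restored = unpack(packed, scale_factor, add_offset)
print(scale_factor, add_offset, packed, restored)
```

The round trip is lossy: the restored value differs from the original by at most half of scale_factor.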

ncap2 accepts specific values of scale_factor and add_offset for per-variable packing with the pack() function, which is documented at http://nco.sf.net/nco.html#ncap_mth

Demonstration:

zender@spectral:~$ ncap2 -v -O -s 'rec_pck=pack(three_dmn_rec_var,-0.001,40.0);' ~/nco/data/in.nc ~/foo.nc
zender@spectral:~$ ncks -m ~/foo.nc
netcdf foo {
  dimensions:
    lat = 2 ;
    lon = 4 ;
    time = UNLIMITED ; // (10 currently)

  variables:
    float lat(lat) ;
      lat:long_name = "Latitude (typically midpoints)" ;
      lat:units = "degrees_north" ;
      lat:bounds = "lat_bnd" ;

    float lon(lon) ;
      lon:long_name = "Longitude (typically midpoints)" ;
      lon:units = "degrees_east" ;

    short rec_pck(time,lat,lon) ;
      rec_pck:scale_factor = -0.001f ;
      rec_pck:add_offset = 40.f ;
      rec_pck:_FillValue = -99s ;
      rec_pck:long_name = "three dimensional record variable" ;
      rec_pck:units = "watt meter-2" ;

    double time(time) ;
      time:long_name = "time" ;
      time:units = "days since 1964-03-12 12:09:00 -9:00" ;
      time:calendar = "gregorian" ;
      time:bounds = "time_bnds" ;
      time:climatology = "climatology_bounds" ;
} // group /
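
When choosing fixed values yourself, it helps to check which range of data they can represent in a short. The Python sketch below uses the values from the question's attempt (scale_factor 0.00166667, add_offset 0.0) and the standard unpacking convention, unpacked = packed * scale_factor + add_offset. The helper names are hypothetical, and raising an error on overflow is a choice made for this illustration, not a description of what ncap2 does with out-of-range data.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def pack_fixed(value, scale_factor, add_offset):
    """Pack one value with user-chosen parameters, refusing int16 overflow."""
    packed = round((value - add_offset) / scale_factor)
    if not INT16_MIN <= packed <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a short with these parameters")
    return packed


def unpack(packed, scale_factor, add_offset):
    """netCDF convention: unpacked = packed * scale_factor + add_offset."""
    return packed * scale_factor + add_offset


scale_factor, add_offset = 0.00166667, 0.0  # values from the question

# Largest and smallest values these parameters can represent.
print(unpack(INT16_MAX, scale_factor, add_offset))
print(unpack(INT16_MIN, scale_factor, add_offset))

packed = pack_fixed(25.0, scale_factor, add_offset)
print(packed, unpack(packed, scale_factor, add_offset))
```

With these parameters the representable range is roughly -54.6 to 54.6, which covers typical 2 metre temperatures in Celsius; any value outside that range would need a different scale_factor or add_offset.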