ComputeBandStats 是否考虑了无数据?

Does ComputeBandStats take nodata into account?

我正在尝试计算仅部分被数据覆盖的图像的统计信息。我想知道 ComputeBandStats 是否忽略了与文件 nodata 具有相同值的像素。

这是我的代码:

inIMG = gdal.Open(infile)

# getting stats for the first 3 bands
# Using ComputeBandStats insted of stats array has min, max, mean and sd values
print "Computing band statistics"
bandas = [inIMG.GetRasterBand(b+1) for b in range(3)]
minMax = [b.ComputeRasterMinMax() for b in bandas]
meanSD = [b.ComputeBandStats(1) for b in bandas]
print minMax
print meanSD

对于没有 nodata 属性的图像,输出为:

Computing band statistics
[(0.0, 26046.0), (0.0, 24439.0), (0.0, 22856.0)]
[(762.9534697777777, 647.9056493556284), (767.642869, 516.0531530834181), (818.0449643333334, 511.5360132592902)]

对于 nodata = 0 的图像,输出为:

Computing band statistics
[(121.0, 26046.0), (202.0, 24439.0), (79.0, 22856.0)]
[(762.9534697777777, 647.9056493556284), (767.642869, 516.0531530834181), (818.0449643333334, 511.5360132592902)]

最小值和最大值已更改,因此 0 不再是最小值,这是有道理的,因为在第二个版本中它是无数据,因此不被 ComputeRasterMinMax() 考虑。但是,均值和标准差没有改变。

这是否意味着 ComputeBandStats 不会忽略无数据值?
有什么方法可以强制 ComputeBandStats 忽略无数据值吗?

设置 NoData 值对数据本身没有影响。你可以这样试试:

# First image, all valid data
data = numpy.random.randint(1,10,(10,10))
driver = gdal.GetDriverByName('GTIFF')
ds = driver.Create("stats1.tif", 10, 10, 1, gdal.GDT_Byte)
ds.GetRasterBand(1).WriteArray(data)
print ds.GetRasterBand(1).ComputeBandStats(1)
print ds.GetRasterBand(1).ComputeStatistics(False)
ds = None

# Second image, values of "1" set to no data
driver = gdal.GetDriverByName('GTIFF')
ds = driver.Create("stats2.tif", 10, 10, 1, gdal.GDT_Byte)
ds.GetRasterBand(1).SetNoDataValue(1)
ds.GetRasterBand(1).WriteArray(data)
print ds.GetRasterBand(1).ComputeBandStats(1)
print ds.GetRasterBand(1).ComputeStatistics(False)
ds = None

请注意,ComputeBandStats 返回的统计数据未更改,但 ComputeStatistics 返回的统计数据为:

>>> (4.97, 2.451346568725035)
>>> [1.0, 9.0, 4.970000000000001, 2.4513465687250346]

>>> (4.97, 2.451346568725035)
>>> [2.0, 9.0, 5.411111111111111, 2.1750833672117]

您可以手动确认统计数据是否正确:

numpy.mean(data)
numpy.mean(data[data != 1])
numpy.std(data)
numpy.std(data[data != 1])

>>> 4.9699999999999998
>>> 5.4111111111111114
>>> 2.4513465687250346
>>> 2.1750833672117