Does ComputeBandStats take nodata into account?
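I am trying to compute statistics for an image that is only partly covered by data. I would like to know whether ComputeBandStats ignores pixels whose value equals the file's nodata value.
Here is my code: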
from osgeo import gdal

inIMG = gdal.Open(infile)  # infile: path to the input raster
# Get stats for the first 3 bands.
# Using ComputeBandStats instead of the stats array, which has min, max, mean and sd values
print("Computing band statistics")
bandas = [inIMG.GetRasterBand(b + 1) for b in range(3)]
minMax = [b.ComputeRasterMinMax() for b in bandas]
meanSD = [b.ComputeBandStats(1) for b in bandas]
print(minMax)
print(meanSD)
For the image with no nodata attribute, the output is:
Computing band statistics
[(0.0, 26046.0), (0.0, 24439.0), (0.0, 22856.0)]
[(762.9534697777777, 647.9056493556284), (767.642869, 516.0531530834181), (818.0449643333334, 511.5360132592902)]
For the image with nodata = 0, the output is:
Computing band statistics
[(121.0, 26046.0), (202.0, 24439.0), (79.0, 22856.0)]
[(762.9534697777777, 647.9056493556284), (767.642869, 516.0531530834181), (818.0449643333334, 511.5360132592902)]
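The minimum and maximum values changed, so 0 is no longer the minimum. That makes sense: in the second version 0 is nodata, so ComputeRasterMinMax() does not consider it. However, the mean and standard deviation did not change.
Does this mean that ComputeBandStats does not ignore nodata values?
Is there any way to force ComputeBandStats to ignore nodata values?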
Setting the NoData value has no effect on the data itself. You can try it like this:
import numpy
from osgeo import gdal

# First image, all valid data
data = numpy.random.randint(1, 10, (10, 10))
driver = gdal.GetDriverByName('GTiff')
ds = driver.Create("stats1.tif", 10, 10, 1, gdal.GDT_Byte)
ds.GetRasterBand(1).WriteArray(data)
print(ds.GetRasterBand(1).ComputeBandStats(1))
print(ds.GetRasterBand(1).ComputeStatistics(False))
ds = None  # close the dataset
# Second image, values of "1" set to no data
driver = gdal.GetDriverByName('GTiff')
ds = driver.Create("stats2.tif", 10, 10, 1, gdal.GDT_Byte)
ds.GetRasterBand(1).SetNoDataValue(1)
ds.GetRasterBand(1).WriteArray(data)
print(ds.GetRasterBand(1).ComputeBandStats(1))
print(ds.GetRasterBand(1).ComputeStatistics(False))
ds = None  # close the dataset
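Note that the statistics returned by ComputeBandStats are unchanged between the two images, whereas those returned by ComputeStatistics change once the NoData value is set: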
>>> (4.97, 2.451346568725035)
>>> [1.0, 9.0, 4.970000000000001, 2.4513465687250346]
>>> (4.97, 2.451346568725035)
>>> [2.0, 9.0, 5.411111111111111, 2.1750833672117]
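The two ComputeStatistics results can be cross-checked with plain arithmetic. The sketch below assumes the standard deviation is the population form (the one numpy.std uses by default) and that exactly 10 of the 100 pixels had the value 1; that count is inferred from the printed numbers, it is not stated anywhere in the output. The helper name stats_without is made up for this illustration.

```python
import math

n_all = 100  # 10 x 10 pixels
# ComputeStatistics result for the image without a NoData value
mean_all, std_all = 4.97, 2.451346568725035

# Recover the sum and the sum of squares from the mean and population std.
total = mean_all * n_all
total_sq = (std_all ** 2 + mean_all ** 2) * n_all


def stats_without(k, value=1):
    """Mean and population std after dropping k pixels equal to `value`."""
    n = n_all - k
    mean = (total - k * value) / n
    var = (total_sq - k * value ** 2) / n - mean ** 2
    return mean, math.sqrt(var)


# Assumption: 10 pixels held the NoData value 1.
mean_valid, std_valid = stats_without(10)
print(mean_valid, std_valid)
```

With k = 10 this comes out at roughly 5.4111 and 2.1751, in line with the second ComputeStatistics line, which suggests that ComputeStatistics simply leaves the NoData pixels out of the calculation.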
You can confirm manually that the statistics are correct:
numpy.mean(data)
numpy.mean(data[data != 1])
numpy.std(data)
numpy.std(data[data != 1])
>>> 4.9699999999999998
>>> 5.4111111111111114
>>> 2.4513465687250346
>>> 2.1750833672117
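If you want a mean and standard deviation that skip NoData pixels without going through ComputeStatistics, the same filtering can be wrapped in a small helper. This is only a sketch: the function name stats_ignoring_nodata is hypothetical, and it assumes the array would come from band.ReadAsArray() and the nodata value from band.GetNoDataValue(), which returns None when no value is set. It does not handle NaN nodata values or floating-point tolerance.

```python
import numpy as np


def stats_ignoring_nodata(array, nodata):
    """Return (mean, std) over pixels that differ from `nodata`.

    `nodata` may be None, meaning every pixel counts as valid.
    Returns None if no valid pixel is left.
    """
    array = np.asarray(array)
    if nodata is None:
        valid = array.ravel()
    else:
        valid = array[array != nodata]
    if valid.size == 0:
        return None
    # numpy.std defaults to the population standard deviation (ddof=0)
    return float(valid.mean()), float(valid.std())


# Tiny stand-in for band.ReadAsArray(); the value 1 plays the NoData role.
sample = np.array([[1, 2, 3], [1, 4, 5]])
result = stats_ignoring_nodata(sample, 1)
print(result)
```

Reading the whole band into memory is fine for small rasters; for large ones, ComputeStatistics (which already honours the NoData value, as shown above) is probably the simpler choice.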