Raster creation from numpy array, taking values from a csv file

Question

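I have a geotiff. I want to replace the values in the raster with the corresponding values from a csv table.

The raster has class values 0 to n, and the csv has a calculated value (e.g. point density) for each class n of the raster. I want to create a new raster based on the corresponding values in the csv.

I am using GDAL and numpy. I tried using pandas, but ran into problems getting the values from the csv into a pandas dataframe of the raster. I will be doing this on a list of rasters with corresponding csv tables.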

Below is an example of my data (one raster):

#Example raster array
[5 2 2 3
 0 3 1 4
 2 0 1 3]

#Corresponding csv table
  Class   Count  Density
    0       2       6
    1       2       9
    2       2       4
    3       3       9
    4       1       7
    5       1       2


#Output Raster (to take the corresponding density values, 
#i.e. if class = 0, then output raster = 6, the corresponding density value)
    [2 4 4 9
     6 9 9 7
     4 6 9 9]

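I have code that creates an array from the raster and writes an array back to a raster; I found it on various stackexchange sites. I do not know how to build the loop that puts the values from the csv into the new raster. My 'for loop' code below is incomplete. Can anyone help?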

import numpy, sys
from osgeo import gdal
from osgeo.gdalconst import *

inRst = gdal.Open(r"c:/Raster1.tif")
band = inRst.GetRasterBand(1)
rows = inRst.RasterYSize
cols = inRst.RasterXSize
rstr_arry = band.ReadAsArray(0,0,cols,rows)

# create the output image
driver = inRst.GetDriver()
#print driver
outRst = driver.Create(r"c:/NewRstr.tif", cols, rows, 1, GDT_Int32)
outBand = outRst.GetRasterBand(1)
outData = numpy.zeros((rows,cols), numpy.int32)

for i in range(0, rows):
    for j in range(0, cols):
        if rstr_arry[i,j] =  :
            outData[i,j] = 
        elif rstr_arry[i,j] = :
            outData[i,j] = 
        else:
            outData[i,j] = 


# write the data
outBand.WriteArray(outData, 0, 0)
# flush data to disk and set the NoData value
outBand.FlushCache()
outBand.SetNoDataValue(-99)
# georeference the image and set the projection
outRst.SetGeoTransform(inRst.GetGeoTransform())
outRst.SetProjection(inRst.GetProjection())

Answer 1

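If I understand correctly what you are trying to achieve, you first have to read your csv file and create a mapping from Class values to Density values. This can be done as follows: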

import csv

mapping = {}

with open('test.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        mapping[int(row['Class'])] = int(row['Density'])

You will get a dict like this:

{0: 6, 1: 9, 2: 4, 3: 9, 4: 7, 5: 2}
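The snippet above relies on csv.DictReader's default comma delimiter. If the table is actually stored whitespace-aligned, the way it is printed in the question, a plain split() per line may be enough. The following is only a sketch under that assumption; table_text is a hypothetical stand-in for the contents of the real file.

```python
import io

# Hypothetical stand-in for the file contents, laid out as printed in the question
table_text = """  Class   Count  Density
    0       2       6
    1       2       9
    2       2       4
    3       3       9
    4       1       7
    5       1       2
"""

mapping = {}
with io.StringIO(table_text) as table_file:
    # Locate the columns by name from the header row
    header = table_file.readline().split()
    class_col = header.index('Class')
    density_col = header.index('Density')
    for line in table_file:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        mapping[int(fields[class_col])] = int(fields[density_col])

print(mapping)
# {0: 6, 1: 9, 2: 4, 3: 9, 4: 7, 5: 2}
```

If the density values are not whole numbers, float() would be needed instead of int(), and the output raster would then need a floating-point data type.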

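You can then use np.in1d to create a mask of the elements that need to be replaced, and np.searchsorted to replace them. Before doing so, you need to flatten the raster array, and restore its shape before writing the result back. Note that np.searchsorted assumes the class keys are in ascending order, as they are in the csv above. (Alternative ways of replacing elements in a numpy array can be found in the answers to this question: Fast replacement of values in a numpy array)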

# Save the shape of the raster array
s = rstr_arry.shape
# Flatten the raster array
rstr_arry = rstr_arry.reshape(-1)
# Create 2D replacement matrix:
replace = numpy.array([list(mapping.keys()), list(mapping.values())])
# Find elements that need replacement:
mask = numpy.in1d(rstr_arry, replace[0, :])
# Replace them:
rstr_arry[mask] = replace[1, numpy.searchsorted(replace[0, :], rstr_arry[mask])]
# Restore the shape of the raster array
rstr_arry = rstr_arry.reshape(s)
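As one of the alternative approaches mentioned above, a lookup table indexed by class value can do the replacement in a single step. This is not the method used in this answer, only a minimal sketch that assumes the classes are small non-negative integers and that every value in the raster is no larger than the largest class in the mapping (otherwise the indexing raises an IndexError). The names raster, lut and out are illustrative.

```python
import numpy as np

# Sample raster and mapping taken from the question
raster = np.array([[5, 2, 2, 3],
                   [0, 3, 1, 4],
                   [2, 0, 1, 3]])
mapping = {0: 6, 1: 9, 2: 4, 3: 9, 4: 7, 5: 2}

# Build a lookup table in which the index is the class value
lut = np.zeros(max(mapping) + 1, dtype=np.int32)
for cls, density in mapping.items():
    lut[cls] = density

# Integer-array indexing maps every cell at once and keeps the 2D shape
out = lut[raster]
print(out)
# [[2 4 4 9]
#  [6 9 9 7]
#  [4 6 9 9]]
```

Because the result is a new array, the input raster is left unchanged and no flatten/reshape step is needed.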

You can then write the data almost as you planned:

outBand.WriteArray(rstr_arry, 0, 0)
outBand.SetNoDataValue(-99)

# outRst and inRst are the datasets created/opened in the question's code
outRst.SetGeoTransform(inRst.GetGeoTransform())
outRst.SetProjection(inRst.GetProjection())

outBand.FlushCache()

Testing it on your sample data:

import numpy

rstr_arry = numpy.asarray([
    [5, 2, 2, 3],
    [0, 3, 1, 4],
    [2, 0, 1, 3]])

mapping = {0: 6, 1: 9, 2: 4, 3: 9, 4: 7, 5: 2}

s = rstr_arry.shape
rstr_arry = rstr_arry.reshape(-1)
replace = numpy.array([list(mapping.keys()), list(mapping.values())])
mask = numpy.in1d(rstr_arry, replace[0, :])
rstr_arry[mask] = replace[1, numpy.searchsorted(replace[0, :], rstr_arry[mask])]
rstr_arry = rstr_arry.reshape(s)

print(rstr_arry)
# [[2 4 4 9]
#  [6 9 9 7]
#  [4 6 9 9]]
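Since the question mentions a list of rasters, each with its own csv table, the steps tested above could be wrapped in a function and applied in a loop. The sketch below makes a few assumptions beyond the original answer: the function name remap_classes is hypothetical, the keys are sorted explicitly so that np.searchsorted stays valid even if the csv rows are out of order, np.isin is used in place of np.in1d, and cells whose class is missing from the mapping are set to the -99 NoData value used in the question.

```python
import numpy as np

def remap_classes(arr, mapping, nodata=-99):
    """Return a new int32 array with each class replaced by its mapped value.

    Cells whose class has no entry in `mapping` are set to `nodata`.
    """
    # Sort the keys so that searchsorted returns valid positions
    keys = np.array(sorted(mapping))
    values = np.array([mapping[k] for k in keys], dtype=np.int32)
    flat = arr.reshape(-1)
    out = np.full(flat.shape, nodata, dtype=np.int32)
    # Only cells with a known class are looked up
    mask = np.isin(flat, keys)
    out[mask] = values[np.searchsorted(keys, flat[mask])]
    return out.reshape(arr.shape)

# Illustrative (raster, mapping) pairs; the second pair is made up
pairs = [
    (np.array([[5, 2, 2, 3], [0, 3, 1, 4], [2, 0, 1, 3]]),
     {5: 2, 0: 6, 1: 9, 2: 4, 3: 9, 4: 7}),   # deliberately unsorted keys
    (np.array([[1, 7], [0, 5]]),
     {0: 6, 1: 9, 5: 2}),                     # class 7 has no entry
]

results = [remap_classes(raster, mapping) for raster, mapping in pairs]
for result in results:
    print(result)
# [[2 4 4 9]
#  [6 9 9 7]
#  [4 6 9 9]]
# [[  9 -99]
#  [  6   2]]
```

In a real workflow each array would come from band.ReadAsArray() and each mapping from the matching csv file, with the result written out through WriteArray as shown earlier.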

Tags: python, numpy, raster, gdal