为了 运行 它更快,我应该对此代码应用哪些更改?

What changes should I apply to this code in order to run it faster?

我写了一个 python 卫星图像处理脚本。基本上,代码所做的是查看图像中每个像素周围的每个 window,并将其与同一图像中的特定感兴趣区域进行比较。具有最相似信息的 window 被存储并转换为地理空间向量。

请允许我进一步解释:我有从 2013 年到 2020 年某个特定位置的月度卫星图像 运行,总共有 90 张图像(90 个月)。我还有一个包含 52 个特征的矢量文件 (.shp),我感兴趣的区域 (ROI)。对于每个月,即每张图像,我必须查看我的 ROI,收集 ROI 内所有像素的数字值,并计算其平均数字值。我对该图像中所有可能的 3x3 像素 (window) 执行相同的操作并比较它们的均值。最后,每个 ROI 都有对应的 window,它们的平均数字值之间的欧几里得距离较小。输出是90个vector文件(.shp,每个对应一个image/month),其中52个windows by vector file,每个kernel对应一个window,其信息最接近于投资回报率。

并非图像中的每个像素都允许成为内核的一部分,因此我还有 90 个蒙版可以让我测试 window 是否可选。

问题是每个月度向量需要大约 8 小时才能生成。总的来说,我花了大约 30 天的时间来生成所有向量,尽管我有一台相当不错的机器(w10、64 位、intelCorei7-4700HQ CPU @ 2.40GHz、gtx8520m、4GB ram ...)。我知道嵌套的 for 循环不是编码时的最佳实践,而且 python 也不能很好地支持并行计算(我怀疑这是最好的方法,因为 windows 是独立)。

我尝试了不同的方法,比如@jit,尝试使用 GPU 进行计算,还尝试使用所有 4 个内核进行多处理。前两种方法不起作用,据我了解,可能是因为我的函数不适合 t运行slated。最后一个运行没有问题,但花了双倍的时间才完成。

我的问题是:我应该对我的代码进行哪些更改才能使其更快,这不仅涉及“良好的写作”,而且还考虑到我想使用所有 4 个内核,可能还有我的 GPU 运行它。我不是计算机科学家,因此我真的很难找到解决方案。

代码如下:

import pandas as pd
import geopandas
from rasterio.windows import Window
import rasterio
import numpy as np
from rasterstats import zonal_stats
from shapely.geometry import box
import os
import ntpath
import time



def path_leaf(path):
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)


def getListOfFiles(dirName):
    listOfFile = os.listdir(dirName)
    allFiles = list()

    for entry in listOfFile:
        # Create full path
        fullPath = os.path.join(dirName, entry)
        # If entry is a directory then get the list of files in this directory
        if os.path.isdir(fullPath):
            allFiles = allFiles + getListOfFiles(fullPath)
        else:
            allFiles.append(fullPath)

    return allFiles


#The address of the folder containing the ROI's
SHP_address = 'F:\nota_tecnica_2\vetores\amostras_interesse_final\ROI_wgs84_9_final_dissolvido.shp'

#The address of the satellite images
a = getListOfFiles('C:\dados_nota_tecnica\MODIS\MODIS_quality_mosaic_rec')

#The address of the masks (binary images where 0 is a electable pixel and 1 is a not electable pixel)
b = 'C:\dados_nota_tecnica\VIIRS\final\'

#The folder where I store my vectors at the end
c = 'F:\nota_tecnica_2\vetores\amostras_controle\'

#A variable defining which images from the "a" list should be analyzed 
_from = 0
_to = 90

#The function that will search for the windows 
def janela_movel(MODIS_list,local_viirs,saida):

    start = time.time()
    df_vazio = pd.DataFrame(
        {'mean_b1': [],
         'mean_b2': [],
         'mean_b4': [],
         'euc_dist': [],
         'left': [],
         'right': [],
         'bottom': [],
         'top': [],
         'id_alvo': []})

    for file in (range(_from,_to)): #len(MODIS_list)
        MODIS_address = MODIS_list[file]

        #Searches for a maks image that matches the name of the satellite image
        viirs_address = local_viirs+path_leaf(MODIS_address)

        #Open the matched mask image
        raster_viirs = rasterio.open(viirs_address)
        #Open the raster image
        raster_MODIS = rasterio.open(MODIS_address)



        #Caculates the average digital value in a ROI for 3 bands of the image (RED, GREEN and NEAR-INFRARED) (actually surface reflectance at NADIR);
        media_zonal_b1 = zonal_stats(SHP_address, MODIS_address,
                        stats="mean",
                        nodata=0,
                        all_touched=False,
                        categorical=False,
                        geojson_out=True,
                               band=1)
        media_zonal_b2 = zonal_stats(SHP_address, MODIS_address,
                        stats="mean",
                        nodata=0,
                        all_touched=False,
                        categorical=False,
                        geojson_out=True,
                               band=3)

        media_zonal_b4 = zonal_stats(SHP_address, MODIS_address,
                        stats="mean count",
                        nodata=0,
                        all_touched=False,
                        categorical=False,
                        geojson_out=True,
                               band=5)
        
        #Now the code will access the ROI. For each one it will extract not only the already computated mean value, but also the coordinates 
        #of the ROI and its ID (identificator) 
        for x in range(len(media_zonal_b1)):

            mean_band1_alvo = media_zonal_b1[x]['properties']['mean']
            mean_band2_alvo = media_zonal_b2[x]['properties']['mean']
            mean_band4_alvo = media_zonal_b4[x]['properties']['mean']
            id_alvo = x
            array_vazio = []
            #Here I set the size of the window/kernel
            
            i = 3
            j = 3
            #Now it will access the satellite image and move this 3x3 window through all pixels
            for i_r in range(raster_viirs.height):
                for j_r in range(raster_viirs.width):
                   row_start = i_r
                   row_stop = i_r + i
                   col_start = j_r
                   col_stop = j_r + j
                   Win = Window.from_slices(slice(row_start, row_stop), slice(col_start, col_stop))
                   croped = raster_viirs.read(window=Win)
                    
                    #This is some code to prevent NAN values and not electable pixels
                   if (-999.0 not in croped) and (1 not in croped) and (np.isnan(croped).any() != True):

                        bounds = raster_viirs.window_bounds(Win) 
                        croped2 = raster_MODIS.read(window=Win) #aplicando a janela extraída ao dado e reflectancia

                        
                        if ((np.isnan(croped2).any()) != True) and (croped2.size != 0):
                            mean_band1 = np.mean(croped2[0])
                            mean_band2 = np.mean(croped2[2])
                            mean_band4 = np.mean(croped2[4])
                            
                            if (mean_band1_alvo or mean_band2_alvo or mean_band4_alvo) is None:
                                mean_band1_alvo = -999
                                mean_band2_alvo = -999
                                mean_band4_alvo = -999
                                dist = -999
                            #Calculates the euclidian distance between the bands of the pixels within the window and the ROI
                            else:
                                dist = (((mean_band1 - mean_band1_alvo) ** 2) + ((mean_band2 - mean_band2_alvo) ** 2) + (
                                            (mean_band4 - mean_band4_alvo) ** 2)) ** 0.5

                                # Creates a dataframe with all the electable kernels
                                array_dados = [dist,mean_band1,mean_band2,mean_band4,bounds[0],bounds[1],bounds[2],bounds[3]]

                                #aggreagte the kernels in a single array
                                array_vazio = array_vazio+array_dados


            #Transforms in a 2d array and order it to find the most alike kernel (smallest euclidian distance)
            array_dados_2d = np.reshape(array_vazio, (-1,8))
            array_dados_2d = array_dados_2d[np.argsort(array_dados_2d[:, 0])]

            #Number of windows per ROI
            numero_de_amostras = 1
            array_dados_filtered = array_dados_2d[0:numero_de_amostras, ]
            #Accumulates the windows of all the ROI in a single vector (.shp file)
            df_dados = pd.DataFrame(
             {'euc_dist': array_dados_filtered[:, 0],
              'mean_b1': array_dados_filtered[:, 1],
              'mean_b2': array_dados_filtered[:, 2],
              'mean_b4': array_dados_filtered[:, 3],
              'left': array_dados_filtered[:, 4],
              'bottom': array_dados_filtered[:, 5],
              'right': array_dados_filtered[:, 6],
              'top': array_dados_filtered[:, 7],
              'id_alvo': id_alvo})
            df_vazio = df_vazio.append(df_dados, ignore_index = True)

        #Some geocoding for the output vector file. 
        bbox = df_vazio.apply(lambda row: box(row.left, row.bottom, row.right, row.top), axis=1)
        geo_dataframe = geopandas.GeoDataFrame(df_vazio, geometry=bbox)
        geo_dataframe.crs = {'init': 'epsg:4326'}
        local = saida
        geo_dataframe.to_file(driver = 'ESRI Shapefile', filename= (local+path_leaf(MODIS_address)+'.shp'))

        df_vazio = pd.DataFrame(
            {'mean_b1': [],
             'mean_b2': [],
             'mean_b4': [],
             'euc_dist': [],
             'left': [],
             'right': [],
             'bottom': [],
             'top': [],
             'id_alvo': []})
    end = time.time()
    print(end - start)

#finally appling the function
janela_movel(a,b,c)```

主要问题可能是对 raster_MODIS.read 的调用量非常大,这可能会执行 缓慢的 I/O 操作 。此外,由于 window 大小,每个单元格实际上被读取了 9 次(平均)。

要解决这个问题,您可以将整个光栅读入内存 的二维 numpy 数组中,然后直接对其进行操作。如果光栅太大,可以分割成线带甚至2D瓦片(虽然这有点复杂)。

然后,如果这还不够,您可以将 numpy 计算部分(2 个嵌套内部循环)放在一个函数中,并使用 numba @njit 加速它(请注意,如果 array_vazio 是直接分配的 2D numpy 数组,则它将是 better/faster,大小合适,甚至大小过高,因为 numba 不喜欢 Python 列表)。 12=]

最后,如果还是不够,可以尝试使用multiprocessing模块并行计算每个栅格(在一个单独的进程中,最后合并进程数据).

生成的代码应该快几个数量级