不知道如何用 Cython 优化代码

Question

我是尝试使用 cython 进行优化的新手，我需要一些帮助，因为我知道我可以优化更多，但我不知道如何优化。

首先我做的是在C中声明变量，但是很多都是对象，我不知道如何声明。

然后我有 class，在 HTML 我可以看到每次代码想要获取 class 的对象时它很慢，所以我想优化它也是，我不知道该怎么做:(

import math
import random

def class City:
  def __init__(self, city_name='NoName', posx=0, posy=0):
    self.name = city_name
    self.x = posx
    self.y = posy

def distance_between_cities( object C1, object C2 ):
    """Compute distance between two cities"""
    dx = C1.x - C2.x
    dy = C1.y - C2.y
    return math.sqrt( dx*dx + dy*dy )

def read_cities_from_file ( str filename ):
    """Read list of Cities from text file"""
    cdef list Cities= [] # List of Empty Cities
    cdef list R = []
    with open(filename) as file:
        for line in file:
            R= line.split()
            Cities.append ( City( R[0], float(R[1]), float(R[2]) ) )
    return Cities

def path_distance( object Cities ):
    """Compute total distance of the path traversing all cities in order"""
    cdef int i = 0
    cdef double D = 0
    for i in range( len(Cities) ):
        D = D + distance_between_cities ( Cities[i],
                                          Cities[i+1 if i+1<len(Cities) else 0])
    return D

cdef int closerCity( object City, object Cities ):
    """Compute position of the city C in the list of Cities that is closer to City"""
    cdef float minDist = 2*1000*1000
    cdef int minIndex= 0
    cdef int i = 0
    cdef double dx = 0
    cdef double dy = 0
    cdef double dist = 0
    for i in range(len(Cities)):
        c = Cities[i]
        dx = City.x - c.x
        dy = City.y - c.y
        dist = dx*dx + dy*dy
        if dist < minDist:
            minDist = dist
            minIndex= i
    return minIndex

def GoodPath( object Cities ):
    """Generate a path with small total distance using greedy algorithm"""
    cdef list NotVisited = []
    cdef int len_C = len(Cities)
    cdef int i = 0
    for i in range(len_C):
      NotVisited.append(Cities[i])
    Path= [ Cities[0] ] # Start path with first city
    del NotVisited[0]
    while len(NotVisited) > 0:
        pos = closerCity( Path[-1], NotVisited )
        Path.append( NotVisited[pos] )
        del NotVisited[pos]
    return Path

if __name__ == "__main__":
  import argparse as arg
  parser = arg.ArgumentParser(prog='ARGUMENTS', usage='%(prog)s [options]')
  parser.add_argument("input",  type=str, help="File containing list of Cities")
  args = parser.parse_args()
  
  ListOfCities= read_cities_from_file ( str(args.input) )
  GoodList    = GoodPath( ListOfCities )
  print ( "Initial Distance =", path_distance( ListOfCities ),
          "  Good Distance =",  path_distance( GoodList) )

这是我在 cythonize 时从 HTML 得到的： Part 1 of the HTML Part 2

任何帮助将不胜感激

谢谢！

Answer 1

您算法的复杂度为 O(n^2)（城市数量为 n）。事实上，您遍历了所有城市，并且对于每个城市，您再次遍历了大部分城市（O(n) 个城市）。另请注意，项目删除是在 O(n) 时间内执行的。结果可以用更高效的O(n log n)算法计算：

你可以在 O(n log n) 时间内建造一个 K-d tree structure, or more specifically a Quad-tree 结构，然后用这个结构在 O(log n) 时间内找到任何其他城市最近的城市。此外，最好将 NotVisited 的类型更改为 set 而不是 list 以避免缓慢的列表项删除。

除此之外，我建议您将 object 类型的变量替换为 lower-level 类型 （例如 [=39= 的数组或元组） ]) 以便 Cython 可以生成更快的 C 代码。

请注意，您的方法通常找不到总的最短路径，甚至可能找不到非常小的路径。有better algorithms，但是比较贵

不知道如何用 Cython 优化代码

Don't know how to optimize the code with Cython

python

optimization

cython

cythonize