不知道如何用 Cython 优化代码
Don't know how to optimize the code with Cython
我是尝试使用 cython 进行优化的新手,我需要一些帮助,因为我知道我可以优化更多,但我不知道如何优化。
首先我做的是在C中声明变量,但是很多都是对象,我不知道如何声明。
然后我有 class,在 HTML 我可以看到每次代码想要获取 class 的对象时它很慢,所以我想优化它也是,我不知道该怎么做:(
import math
import random
def class City:
def __init__(self, city_name='NoName', posx=0, posy=0):
self.name = city_name
self.x = posx
self.y = posy
def distance_between_cities( object C1, object C2 ):
"""Compute distance between two cities"""
dx = C1.x - C2.x
dy = C1.y - C2.y
return math.sqrt( dx*dx + dy*dy )
def read_cities_from_file ( str filename ):
"""Read list of Cities from text file"""
cdef list Cities= [] # List of Empty Cities
cdef list R = []
with open(filename) as file:
for line in file:
R= line.split()
Cities.append ( City( R[0], float(R[1]), float(R[2]) ) )
return Cities
def path_distance( object Cities ):
"""Compute total distance of the path traversing all cities in order"""
cdef int i = 0
cdef double D = 0
for i in range( len(Cities) ):
D = D + distance_between_cities ( Cities[i],
Cities[i+1 if i+1<len(Cities) else 0])
return D
cdef int closerCity( object City, object Cities ):
"""Compute position of the city C in the list of Cities that is closer to City"""
cdef float minDist = 2*1000*1000
cdef int minIndex= 0
cdef int i = 0
cdef double dx = 0
cdef double dy = 0
cdef double dist = 0
for i in range(len(Cities)):
c = Cities[i]
dx = City.x - c.x
dy = City.y - c.y
dist = dx*dx + dy*dy
if dist < minDist:
minDist = dist
minIndex= i
return minIndex
def GoodPath( object Cities ):
"""Generate a path with small total distance using greedy algorithm"""
cdef list NotVisited = []
cdef int len_C = len(Cities)
cdef int i = 0
for i in range(len_C):
NotVisited.append(Cities[i])
Path= [ Cities[0] ] # Start path with first city
del NotVisited[0]
while len(NotVisited) > 0:
pos = closerCity( Path[-1], NotVisited )
Path.append( NotVisited[pos] )
del NotVisited[pos]
return Path
if __name__ == "__main__":
import argparse as arg
parser = arg.ArgumentParser(prog='ARGUMENTS', usage='%(prog)s [options]')
parser.add_argument("input", type=str, help="File containing list of Cities")
args = parser.parse_args()
ListOfCities= read_cities_from_file ( str(args.input) )
GoodList = GoodPath( ListOfCities )
print ( "Initial Distance =", path_distance( ListOfCities ),
" Good Distance =", path_distance( GoodList) )
这是我在 cythonize 时从 HTML 得到的:
Part 1 of the HTML
Part 2
任何帮助将不胜感激
谢谢!
您算法的复杂度为 O(n^2)
(城市数量为 n
)。事实上,您遍历了所有城市,并且对于每个城市,您再次遍历了大部分城市(O(n)
个城市)。另请注意,项目删除是在 O(n)
时间内执行的。结果可以用更高效的O(n log n)
算法计算:
你可以在 O(n log n)
时间内建造一个 K-d tree structure, or more specifically a Quad-tree 结构,然后用这个结构在 O(log n)
时间内找到任何其他城市最近的城市。此外,最好将 NotVisited
的类型更改为 set
而不是 list
以避免缓慢的列表项删除。
除此之外,我建议您将 object
类型的变量替换为 lower-level 类型 (例如 [=39= 的数组或元组) ]) 以便 Cython 可以生成更快的 C 代码。
请注意,您的方法通常找不到总的最短路径,甚至可能找不到非常小的路径。有better algorithms,但是比较贵
我是尝试使用 cython 进行优化的新手,我需要一些帮助,因为我知道我可以优化更多,但我不知道如何优化。
首先我做的是在C中声明变量,但是很多都是对象,我不知道如何声明。
然后我有 class,在 HTML 我可以看到每次代码想要获取 class 的对象时它很慢,所以我想优化它也是,我不知道该怎么做:(
import math
import random
def class City:
def __init__(self, city_name='NoName', posx=0, posy=0):
self.name = city_name
self.x = posx
self.y = posy
def distance_between_cities( object C1, object C2 ):
"""Compute distance between two cities"""
dx = C1.x - C2.x
dy = C1.y - C2.y
return math.sqrt( dx*dx + dy*dy )
def read_cities_from_file ( str filename ):
"""Read list of Cities from text file"""
cdef list Cities= [] # List of Empty Cities
cdef list R = []
with open(filename) as file:
for line in file:
R= line.split()
Cities.append ( City( R[0], float(R[1]), float(R[2]) ) )
return Cities
def path_distance( object Cities ):
"""Compute total distance of the path traversing all cities in order"""
cdef int i = 0
cdef double D = 0
for i in range( len(Cities) ):
D = D + distance_between_cities ( Cities[i],
Cities[i+1 if i+1<len(Cities) else 0])
return D
cdef int closerCity( object City, object Cities ):
"""Compute position of the city C in the list of Cities that is closer to City"""
cdef float minDist = 2*1000*1000
cdef int minIndex= 0
cdef int i = 0
cdef double dx = 0
cdef double dy = 0
cdef double dist = 0
for i in range(len(Cities)):
c = Cities[i]
dx = City.x - c.x
dy = City.y - c.y
dist = dx*dx + dy*dy
if dist < minDist:
minDist = dist
minIndex= i
return minIndex
def GoodPath( object Cities ):
"""Generate a path with small total distance using greedy algorithm"""
cdef list NotVisited = []
cdef int len_C = len(Cities)
cdef int i = 0
for i in range(len_C):
NotVisited.append(Cities[i])
Path= [ Cities[0] ] # Start path with first city
del NotVisited[0]
while len(NotVisited) > 0:
pos = closerCity( Path[-1], NotVisited )
Path.append( NotVisited[pos] )
del NotVisited[pos]
return Path
if __name__ == "__main__":
import argparse as arg
parser = arg.ArgumentParser(prog='ARGUMENTS', usage='%(prog)s [options]')
parser.add_argument("input", type=str, help="File containing list of Cities")
args = parser.parse_args()
ListOfCities= read_cities_from_file ( str(args.input) )
GoodList = GoodPath( ListOfCities )
print ( "Initial Distance =", path_distance( ListOfCities ),
" Good Distance =", path_distance( GoodList) )
这是我在 cythonize 时从 HTML 得到的: Part 1 of the HTML Part 2
任何帮助将不胜感激
谢谢!
您算法的复杂度为 O(n^2)
(城市数量为 n
)。事实上,您遍历了所有城市,并且对于每个城市,您再次遍历了大部分城市(O(n)
个城市)。另请注意,项目删除是在 O(n)
时间内执行的。结果可以用更高效的O(n log n)
算法计算:
你可以在 O(n log n)
时间内建造一个 K-d tree structure, or more specifically a Quad-tree 结构,然后用这个结构在 O(log n)
时间内找到任何其他城市最近的城市。此外,最好将 NotVisited
的类型更改为 set
而不是 list
以避免缓慢的列表项删除。
除此之外,我建议您将 object
类型的变量替换为 lower-level 类型 (例如 [=39= 的数组或元组) ]) 以便 Cython 可以生成更快的 C 代码。
请注意,您的方法通常找不到总的最短路径,甚至可能找不到非常小的路径。有better algorithms,但是比较贵