Is there a limit to how large an array can be inside of a function in Cython?
This code compiles and runs fine:
cdef enum:
    SIZE = 1000000

cdef int ar[SIZE]
cdef int i
for i in range(SIZE):
    ar[i] = i
print(ar[5])
But this code:
cdef enum:
    SIZE = 1000000

def test():
    cdef int ar[SIZE]
    cdef int i
    for i in range(SIZE):
        ar[i] = i
    print(ar[5])
test()
crashes the Python kernel (I run it using the Jupyter magic).
How large can an array inside a function be, and if there is a limit, is there a way to remove it?
In the first case, the array is a global, statically defined plain C array: it lives in static storage for the whole lifetime of the program, not on the stack. In the second case, the local-variable array is allocated on the stack. The catch is that the stack has a fixed maximum size (generally something like 2 MiB, though this varies a lot between platforms), and here 1,000,000 ints take about 4 MiB, well past that limit. If you write beyond the end of the stack you get a crash, or a nice stack-overflow error if you are lucky. Note that function calls and local variables also temporarily take up stack space. While the stack size can be increased, the better solution is to use dynamic allocation for relatively big arrays, or for arrays whose size is variable. Fortunately, this is possible in Cython using malloc and free (see the documentation), but you must be careful not to forget to free the array, and not to free it twice. Another solution is to create a NumPy array and then use a typed memoryview (this is more expensive, but it prevents any possible leak). Minimal sketches of both approaches follow.
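A sketch of the malloc/free approach, written to be run as a %%cython cell like the code in the question; the function name test_heap is made up for illustration:

from libc.stdlib cimport malloc, free

cdef enum:
    SIZE = 1000000

def test_heap():
    # Allocate on the heap, so the stack-size limit no longer applies.
    cdef int* ar = <int*> malloc(SIZE * sizeof(int))
    cdef int i
    if ar == NULL:
        raise MemoryError("malloc failed")
    try:
        for i in range(SIZE):
            ar[i] = i
        print(ar[5])
    finally:
        # Release the memory exactly once, even if an exception was raised.
        free(ar)

test_heap()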
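And a sketch of the NumPy/memoryview alternative; test_memview is likewise a hypothetical name, and np.intc is used on the assumption that it matches the C int type:

import numpy as np

cdef enum:
    SIZE = 1000000

def test_memview():
    # The buffer is owned by the NumPy array, so it is freed automatically
    # when the array is garbage collected; no manual free is needed.
    cdef int[::1] ar = np.empty(SIZE, dtype=np.intc)
    cdef int i
    for i in range(SIZE):
        ar[i] = i
    print(ar[5])

test_memview()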