How to loop over a list in Cython pure mode

Question

To speed up struct.pack(), I am using the following to pack an int into bytes:

import cython as c
from cython import nogil, compile, returns, locals, cfunc, pointer, address

int_bytes_buffer = c.declare(c.char[400], [0] * 400)


@locals(i = c.int, num = c.int)
@returns(c.int)
@cfunc
@nogil
@compile
def int_to_bytes(num):
    i = 0
    while num >0:
        int_bytes_buffer[i] = num%256
        num//=256
        i+=1

    return int_bytes_buffer[0]


int_to_bytes(259)

I am trying to make it work on a list of ints with the following erroneous code:

@locals(i = c.int, ints_p = pointer(c.int[100]), num = c.int)
@returns(c.int)
@cfunc
@nogil
@compile
def int_to_bytes(num):
    i = 0
    for num in ints_p:
        while num >0:
            int_bytes_buffer[i] = num%256
            num//=256
            i+=1

    return int_bytes_buffer[0]

ints = c.declare(c.int[100],  [259]*100)
int_to_bytes(address(ints))

This gives me:

    for num in ints_p:
              ^
----------------------------------------------------------

 Accessing Python global or builtin not allowed without gil

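Apparently I should not be using in, or looping over a pointer.

How can I loop over the array (built from a list) inside the function?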

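EDIT:

I am trying to pass a pointer to an array of ints to the function and have it work without the GIL so that it can be parallelized.

The parameter of the function should have been ints_p: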

@locals(ints_p = pointer(c.int[100]), i = c.int, num = c.int)
@returns(c.int)
@cfunc
@nogil
@compile
def int_to_bytes(ints_p):
    i = 0
    for num in (*ints_p):
        while num >0:
            int_bytes_buffer[i] = num%256
            num//=256
            i+=1

    return int_bytes_buffer[0]

ints = c.declare(c.int[100],  [259]*100)
int_to_bytes(address(ints))

I want to run through the actual ints and pack them (without the GIL).

EDIT 2:

I am aware of struct.pack. I want to build a parallelizable variant with Cython and nogil.

Answer 1

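There are a few errors in your code.

The error Accessing Python global or builtin not allowed without gil means that iterating with for ... in here needs the GIL, so you need to remove the @nogil decorator. After removing it the error no longer appears (tested with my code), but there are other errors.
Your function has some problems. In def int_to_bytes(num): you should not pass num to the function, because num is assigned inside the for loop. I changed it to def int_to_bytes(): and the function works, but it still reports an error: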

    @locals(i = c.int, ints_p = c.int(5), num = c.int)
    @returns(c.int)
    @cfunc
    @compile

    def int_to_bytes():
        ints_p = [1,2,3,4,5]
        i = 0
        for num in ints_p:
            while num >0:
                int_bytes_buffer[i] = num%256
                num//=256
                i+=1

        return int_bytes_buffer[1]

    a = int_to_bytes()
    print(a)

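Finally, I do not understand why you pass an address to the function, since the function should not take any arguments.

This code works for me: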

import cython as c
from cython import nogil, compile, returns, locals, cfunc, pointer, address

int_bytes_buffer = c.declare(c.char[400], [0] * 400)

ints = c.declare(c.int[100],  [259]*100)
# for i in list(*address(ints)):
#   print(i)
@locals(i = c.int, num = c.int)
@returns(c.int)
@cfunc
@compile

def int_to_bytes(values):
    i = 0
    for num in list(*address(values)):
        while num >0:
            int_bytes_buffer[i] = num%256
            num//=256
            i+=1

    return int_bytes_buffer

a = int_to_bytes(ints)
print([i for i in a])
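
As a cross-check on what the inner while loop writes, here is a minimal plain-Python sketch of the same arithmetic. The helper name int_to_byte_digits is hypothetical and not part of the question's code; it only assumes non-negative inputs, as the original loop does.

```python
def int_to_byte_digits(num):
    """Return the base-256 digits of a non-negative int, least significant first."""
    digits = []
    while num > 0:
        digits.append(num % 256)  # lowest byte of what is left
        num //= 256               # drop that byte
    return digits


# 259 == 1 * 256 + 3, so the loop emits the low byte 3 and then 1.
print(int_to_byte_digits(259))  # [3, 1]

# The digits match the little-endian byte representation of the number.
print(bytes(int_to_byte_digits(259)) == (259).to_bytes(2, "little"))  # True
```

Note that the output length depends on the value: 259 produces two bytes and 0 produces none, so this is not the fixed four-byte layout that struct.pack("i", ...) would give.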

Hope this helps.

Answer 2

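This is pointless:

A Python int can be arbitrarily large. The actual computational work in "packing" is working out whether it fits in a given size and then copying it into a space of that size. However, you are using an array of C ints. These have a fixed size. There is basically no work to be done in extracting them into a bytes array. All you have done is write a very inefficient version of memcpy. They are literally already in memory as a contiguous set of bytes; all you have to do is view them as such: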
```
# using Numpy (no Cython)
import numpy as np

# np.intc is the C int type (the old np.int alias was removed in NumPy 1.24)
ints = np.array([1,2,3,4,5,6,7], dtype=np.intc) # some numpy array already initialized
as_bytes = ints.view(dtype=np.byte) # no data is copied - wonderfully efficient
```
You can use a similar approach with another array library or with C arrays:
```
# slightly pointless use of pure-Python mode since this won't
# be valid in Python.
@cython.cfunc
@cython.returns(cython.p_char)
@cython.locals(x = cython.p_int)
def cast_ptr(x):
    return cython.cast(cython.p_char,x)
```
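
For readers without NumPy, the same "view, do not copy" idea can be sketched with the standard library only. This is an illustration under the assumption that the data already lives in an array.array of C ints; the variable names are made up for the example.

```python
from array import array

# C ints stored contiguously, much like a C int[] array.
ints = array("i", [1, 2, 3, 4, 5, 6, 7])

# Reinterpret the same memory as unsigned bytes; nothing is copied.
as_bytes = memoryview(ints).cast("B")

print(len(as_bytes) == len(ints) * ints.itemsize)  # True

# Because it is a view, a change to the array is visible through as_bytes.
ints[0] = 255
print(as_bytes.tobytes() == ints.tobytes())  # True
```

The byte order seen through the view is whatever the machine uses natively, which is also what struct.pack produces with its default native format.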
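You say you want nogil so that it can be parallelized. Parallelization works well when there is actual computational work to be done. It does not work well when the task is limited by memory access, because the threads tend to end up waiting for each other to access memory. This task will not parallelize well.
Memory management is a problem. You can only write into fixed-size buffers. To allocate variable-sized arrays you have a number of choices: you can use numpy or the Python array module (or similar) to let Python take care of the memory management, or you can use malloc and free to allocate arrays at the C level. Since you claim to need nogil, you have to use the C approach. However, you cannot do this from Cython's pure-Python mode, because everything also has to work in Python and there is no Python equivalent of malloc and free. If you insist on trying to make this work, you need to abandon Cython's pure-Python mode and use the standard Cython syntax, since what you are trying to do cannot be made compatible with both.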

Note that int_bytes_buffer is currently a global array. This means that multiple threads will share it, which is a disaster for your supposed parallelization.
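
To make the buffer-ownership point concrete, here is a rough plain-Python sketch in which every call allocates its own output buffer, sized to its input, instead of writing into one shared global. The function name pack_chunk and the chunk size of 250 are assumptions chosen for illustration. In CPython the GIL still serializes this work, so the sketch shows data ownership only, not a speedup.

```python
import struct
from concurrent.futures import ThreadPoolExecutor


def pack_chunk(chunk):
    """Pack one chunk of ints into a buffer owned by this call alone."""
    out = bytearray(4 * len(chunk))  # sized to the input, not fixed at 400
    # "=" selects native byte order with standard sizes, so "i" is 4 bytes.
    struct.pack_into(f"={len(chunk)}i", out, 0, *chunk)
    return bytes(out)


values = list(range(1000))
chunks = [values[i:i + 250] for i in range(0, len(values), 250)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() yields results in input order, so the pieces join correctly.
    packed = b"".join(pool.map(pack_chunk, chunks))

print(packed == struct.pack("=1000i", *values))  # True
```

Because no two calls touch the same buffer, the result does not depend on how the threads are scheduled.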

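You need to think about what your input is. If it is a list of Python ints, you cannot make this work with nogil (because you are manipulating Python objects, and that requires the GIL). If it is some C-level array (be it Numpy, the array module, or a Cython-declared C array), your data is already in the format you want and you just need to view it as such.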

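Edit: From the comments this is clearly an X-Y problem (you are asking about fixing this Cython syntax because you want to pack a list of ints), so I have added a fast way of packing a list of Python ints using Cython. In my timings it was about 7 times faster than struct pack and about 5 times faster than passing the list to array.array. It is mainly faster because it is specialized to do only one thing.

I have used bytearray as a convenient writable data store and the Python memoryview class (not quite the same as the Cython memoryview syntax...) as a way to cast the data types. No real effort has been spent optimizing it, so you may be able to improve it. Note that the copy into bytes at the end does not change the time measurably, which shows how irrelevant copying the memory is to the overall speed.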

cimport cython  # needed for the decorators below when compiled as a .pyx file

@cython.boundscheck(False)
@cython.wraparound(False)
def packlist(a):
    out = bytearray(4*len(a))
    cdef int[::1] outview = memoryview(out).cast('i')
    cdef int i
    for i in range(len(a)):
        outview[i] = a[i]
    return bytes(out)
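
To try the same idea without compiling anything, here is a plain-Python sketch of the function above with the cdef declarations dropped. The name packlist_py is hypothetical, and it will be far slower than the Cython version; it is only meant to let you check that the output matches struct.pack.

```python
import struct

INT_SIZE = struct.calcsize("i")  # size of a C int, 4 on common platforms


def packlist_py(a):
    """Pack a list of Python ints as native C ints via a bytearray view."""
    out = bytearray(INT_SIZE * len(a))
    outview = memoryview(out).cast("i")  # view the bytes as C ints
    for i in range(len(a)):
        outview[i] = a[i]  # raises if a[i] does not fit in a C int
    outview.release()
    return bytes(out)


data = [1, 2, 3, -4, 2**31 - 1]
print(packlist_py(data) == struct.pack(f"{len(data)}i", *data))  # True
```

Values outside the C int range raise an error here instead of being truncated, which is a reasonable thing to check before relying on any faster variant.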
