Python numba function compile error on del variable command

Here is sample code for the problem I'm running into:

import numba, numpy as np

@numba.jit
def f_plain(x):
   return x * (x - 1)

@numba.jit
def integrate_f_numba(a, b, N):
   s = 0
   dx = (b - a) / N
   for i in range(N):
       s += f_plain(a + i * dx)
   return s * dx

@numba.jit
def apply_integrate_f_numba(col_a, col_b, col_N):
   n = len(col_N)
   result = np.empty(n, dtype='float64')
   assert len(col_a) == len(col_b) == n
   for i in range(n):
      result[i] = integrate_f_numba(col_a[i], col_b[i], col_N[i])
   new = result
   #return result
   del result

def compute_numba(df):
   result = apply_integrate_f_numba(df['a'].values, df['b'].values, df['N'].values)
   return Series(result, index=df.index, name='result')

And I run it with the following commands:

import pandas as pd

from pandas import DataFrame, Series

from numpy.random import randn, randint

import numpy as np

df = DataFrame({'a': randn(1000), 'b': randn(1000),'N': randint(100, 1000, (1000)), 'x': 'x'})
%timeit compute_numba(df)

But when I use 'del result' inside the 'apply_integrate_f_numba' function, I get this error:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-31-6c46b74dae81> in <module>()
      8 
      9 df = DataFrame({'a': randn(1000), 'b': randn(1000),'N': randint(100, 1000, (1000)), 'x': 'x'})
---> 10 get_ipython().magic(u'timeit compute_numba(df)')

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\interactiveshell.pyc in magic(self, arg_s)
   2203         magic_name, _, magic_arg_s = arg_s.partition(' ')
   2204         magic_name = magic_name.lstrip(prefilter.ESC_MAGIC)
-> 2205         return self.run_line_magic(magic_name, magic_arg_s)
   2206 
   2207     #-------------------------------------------------------------------------

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\interactiveshell.pyc in run_line_magic(self, magic_name, line)
   2124                 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
   2125             with self.builtin_trap:
-> 2126                 result = fn(*args,**kwargs)
   2127             return result
   2128 

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\magics\execution.pyc in timeit(self, line, cell)

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\magic.pyc in <lambda>(f, *a, **k)
    191     # but it's overkill for just that one bit of state.
    192     def magic_deco(arg):
--> 193         call = lambda f, *a, **k: f(*a, **k)
    194 
    195         if callable(arg):

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\magics\execution.pyc in timeit(self, line, cell)
   1011             number = 1
   1012             for _ in range(1, 10):
-> 1013                 if timer.timeit(number) >= 0.2:
   1014                     break
   1015                 number *= 10

C:\Program Files\Python\python-2.7.9.amd64\lib\timeit.pyc in timeit(self, number)
    193         gc.disable()
    194         try:
--> 195             timing = self.inner(it, self.timer)
    196         finally:
    197             if gcold:

<magic-timeit> in inner(_it, _timer)

<ipython-input-30-f288c9b5ebe9> in compute_numba(df)
     25 
     26 def compute_numba(df):
---> 27    result = apply_integrate_f_numba(df['a'].values, df['b'].values, df['N'].values)
     28    return Series(result, index=df.index, name='result')

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\dispatcher.pyc in _compile_for_args(self, *args, **kws)
    151         assert not kws
    152         sig = tuple([self.typeof_pyval(a) for a in args])
--> 153         return self.jit(sig)
    154 
    155     def inspect_types(self, file=None):

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\dispatcher.pyc in jit(self, sig, **kws)
    142         """Alias of compile(sig, **kws)
    143         """
--> 144         return self.compile(sig, **kws)
    145 
    146     def _compile_for_args(self, *args, **kws):

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\dispatcher.pyc in compile(self, sig, locals, **targetoptions)
    277                                           self.py_func,
    278                                           args=args, return_type=return_type,
--> 279                                           flags=flags, locals=locs)
    280 
    281             # Check typing error if object mode is used

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\compiler.pyc in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library)
    550     pipeline = Pipeline(typingctx, targetctx, library,
    551                         args, return_type, flags, locals)
--> 552     return pipeline.compile_extra(func)
    553 
    554 

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\compiler.pyc in compile_extra(self, func)
    265             return self.stage_compile_interp_mode()
    266         else:
--> 267             raise res.exception
    268 
    269     def compile_bytecode(self, bc, lifted=(),

NotImplementedError: offset=142 opcode=0x7e opname=DELETE_FAST

I'm not sure what to do now. I need the del command because in my original code I have to free some memory, since I'm working with huge datasets.

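The problem occurs when the numba just-in-time (JIT) compiler tries to compile this function (N.B. rewritten according to what I think the OP intended around the del):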

@numba.jit
def apply_integrate_f_numba(col_a, col_b, col_N):
   n = len(col_N)
   result = np.empty(n, dtype='float64')
   assert len(col_a) == len(col_b) == n
   for i in range(n):
      result[i] = integrate_f_numba(col_a[i], col_b[i], col_N[i])
   new_res = result
   del result
   return new_res

Specifically, the problem is the line

del result

The JIT is actually telling you where the problem is in the error message:

NotImplementedError: offset=142 opcode=0x7e opname=DELETE_FAST

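That is, the numba compiler has not yet implemented a way to compile the DELETE_FAST Python opcode. If you're interested in the code, it looks like the error is thrown from here, and that file contains the list of bytecodes numba can handle.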
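If you want to see where that opcode comes from, the standard-library dis module will show it. This is only a small sketch: drop_local is a made-up function that mirrors the shape of the del in the question, and the exact bytecode listing varies between Python versions.

```python
import dis

def drop_local():
    # Hypothetical stand-in for the body of apply_integrate_f_numba
    result = [1.0, 2.0, 3.0]
    new_res = result
    del result          # deleting a local name compiles to DELETE_FAST
    return new_res

# Collect the opcode names CPython generated for the function
opnames = [ins.opname for ins in dis.get_instructions(drop_local)]
print("DELETE_FAST" in opnames)
```

Calling dis.dis(drop_local) prints the full listing if you want to look at the surrounding instructions.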

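You'll notice, I'm sure, that if you simply return result from apply_integrate_f_numba, everything works fine and you get roughly a 2x speedup over the function without numba.jit (assuming you keep the other functions decorated with @numba.jit).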

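I think you may be trying to achieve the impossible with your del statement. This answer to a more general question explains what del does: it removes a binding (a reference) to an object, not the object itself. You appear to be trying to free memory by deleting the name of an object that you have just bound to another name (namely new). So your code won't actually free any memory, because there is still a reference to the object, and you need to return that one. Note that when you exit the function, all local references to the object are dropped anyway, and result is local to the apply_integrate_f_numba function, so the del is actually redundant.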
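To make the binding-versus-object distinction concrete, here is a rough sketch in plain Python, with no numba involved. Big and make are made-up names, and the immediate-free behaviour relies on CPython's reference counting, so other interpreters may differ.

```python
import weakref

class Big:
    """Hypothetical stand-in for a large array."""

def make():
    result = Big()
    new = result                 # second name bound to the same object
    ref = weakref.ref(result)    # a weak reference does not keep the object alive
    del result                   # removes the name 'result' only
    alive_after_del = ref() is not None   # still reachable through 'new'
    return ref, alive_after_del

ref, alive_after_del = make()
# Once make() has returned, its locals (including 'new') are gone
freed_after_return = ref() is None
print(alive_after_del, freed_after_return)
```

The object survives the del because new still refers to it, and it goes away once the function returns, with or without the del.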

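If your dataset is so large that memory is a concern, you can del it once you are completely done with it, for example after you have written it to a file, plotted it, or done whatever else you wanted with it. Simply assigning it another name and deling the original name won't do that, with the added downside that you get this numba error.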
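As a sketch of that workflow: compute the result, use it, and only then del it in the calling code, outside the jitted function. Column, compute and the file name are all invented for illustration, and the memory being released straight away is again a CPython reference-counting detail.

```python
import os
import tempfile
import weakref

class Column(list):
    """Hypothetical stand-in for a large result array."""

def compute():
    # Placeholder for the expensive, jitted computation
    return Column(float(i) for i in range(1000))

result = compute()
tracker = weakref.ref(result)    # lets us observe when the object is freed

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "result.txt")
    with open(path, "w") as fh:
        fh.write("\n".join(str(v) for v in result))   # finish using the data
    written_bytes = os.path.getsize(path)

del result                       # last reference dropped, nothing else holds it
freed = tracker() is None
print(written_bytes > 0, freed)
```

The point is that the del only helps when it removes the last reference, which is why it belongs in the caller after the data has been consumed.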