Memory error in my program when I set theano.config.device to gpu

Question

My graphics card is a GT 550M.

When I run my program on the GPU I get the following error, and I don't know how to fix it:

MemoryError: error freeing device pointer 0x0000000500C60000 (the launch timed out and was terminated)
Apply node that caused the error: GpuReshape{4}(GpuConv{valid, (2, 2), None, (7, 7), True, (3, 224, 224), (7, 7)}.0, TensorConstant{[672   1 109 109]})
Inputs types: [CudaNdarrayType(float32, 4D), TensorType(int64, vector)]
Inputs shapes: [(7, 96, 109, 109), (4L,)]
Inputs strides: [(1140576, 11881, 109, 1), (8L,)]
Inputs scalar values: ['not scalar', 'not scalar']

Debugprint of the apply node: 
GpuReshape{4} [@A] <CudaNdarrayType(float32, (False, True, False, False))> ''   
 |GpuConv{valid, (2, 2), None, (7, 7), True, (3, 224, 224), (7, 7)} [@B] <CudaNdarrayType(float32, 4D)> ''   
 | |GpuDimShuffle{0,3,1,2} [@C] <CudaNdarrayType(float32, 4D)> ''   
 | | |GpuFromHost [@D] <CudaNdarrayType(float32, 4D)> ''   
 | |   |x [@E] <TensorType(float32, 4D)>
 | |<CudaNdarrayType(float32, 4D)> [@F] <CudaNdarrayType(float32, 4D)>
 |TensorConstant{[672   1 109 109]} [@G] <TensorType(int64, vector)>

HINT: Re-running with most Theano optimization disabled could give you a back-traces when this node was created. This can be done with by setting the Theano flags optimizer=fast_compile

Answer 1

CUDA errors are reported asynchronously, so parts of the error message may be unrelated to the real cause. Here is the first line:

MemoryError: error freeing device pointer 0x0000000500C60000 (the launch timed out and was terminated)

The answer is in the second part: the launch timed out and was terminated.

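Your GPU is connected to a display. In that case, each GPU kernel call is limited to 5 seconds. Here a kernel exceeded that limit, so the driver killed it. The limit exists to keep the screen from becoming unresponsive.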

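Possible solutions:
1) Use a different GPU for the display.
2) Make the kernel faster by using smaller inputs (for example, a smaller batch size).
3) Buy a faster GPU. I am not sure this would work, and even if it works at your current input size, the problem will come back at larger sizes.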
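As a rough illustration of the smaller-batch option, here is a minimal sketch that feeds the input to a function a few samples at a time, so each call does less work. The helper names `iter_minibatches` and `run_in_minibatches` are made up for this example and are not part of Theano. Whether smaller batches are enough to stay under the time limit depends on your model and GPU.

```python
def iter_minibatches(n_samples, batch_size):
    """Yield slices that cover range(n_samples) in chunks of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_samples, batch_size):
        yield slice(start, min(start + batch_size, n_samples))


def run_in_minibatches(fn, data, batch_size):
    """Call fn on consecutive chunks of data; return the list of per-chunk results.

    fn stands in for a compiled function (hypothetical here); data only needs
    to support len() and slicing along its first axis.
    """
    return [fn(data[s]) for s in iter_minibatches(len(data), batch_size)]
```

Assuming `f` is your compiled Theano function and `x` is the input array with the batch on the first axis, `outputs = run_in_minibatches(f, x, batch_size=1)` would run one sample per call, and `numpy.concatenate(outputs, axis=0)` would join the results back into a single array.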

python

gpu

theano