Numpy calculations always differ per function call inside loop for SVD reconstruction

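My function tries to reconstruct an image from its SVD using 3 loops. It sometimes produces different results and computes a reconstructed image with wrong values. Sometimes it works fine, but if I repeatedly reconstruct the same image with the same parameters, it eventually shows a bad result. This happens at random, and it may not be because variables are not being properly overwritten/deleted after each function call. Sometimes I also get this error message: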

/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:21: RuntimeWarning: invalid value encountered in add
/usr/local/lib/python3.7/dist-packages/matplotlib/image.py:452: RuntimeWarning: overflow encountered in double_scalars
  dv = np.float64(self.norm.vmax) - np.float64(self.norm.vmin)
/usr/local/lib/python3.7/dist-packages/matplotlib/image.py:455: RuntimeWarning: invalid value encountered in double_scalars
  newmin = vmid - dv * fact
/usr/local/lib/python3.7/dist-packages/matplotlib/colors.py:1026: RuntimeWarning: overflow encountered in double_scalars
  resdat /= (vmax - vmin)

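Here is my code to reproduce the error: https://www.kaggle.com/krajnovic/numpy-variance

I have been debugging for a while, and I think the problem occurs in the final summation into the "reco" object.

You didn't post the relevant code from Kaggle here, so I am posting it here for you.

Your code: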

import os
import imageio
import numpy as np
import matplotlib.pyplot as plt

im = imageio.imread("/kaggle/input/numpy-variance111/m1-1_slice125.png")
im = im -im.min() / im.max() - im.min()
u,s,vt = np.linalg.svd(im, full_matrices=False)
k = 20

def reconstruct_svd_for_loops3(u,s,vt,k):
    """SVD reconstruction for k components using 3 for-loops
    for + in svd for zeile for spalte
    Inputs:
    u: (m,n) numpy array
    s: (n) numpy array (diagonal matrix)
    vt: (n,n) numpy array
    k: number of reconstructed singular components
    
    Ouput:
    (m,n) numpy array U_mk * S_k * V^T_nk for k reconstructed components
    """
    ### BEGIN SOLUTION
    reco = np.empty(u.shape)
    for i in range(k): #for k components
        sum_ = np.empty(u.shape)
        for j in range(u[:,i].shape[0]): #for each element in ith column of u
            for k in range(vt[i,:].shape[0]): #for each element in ith row of vt
                sum_[j,k] = u[j,i] * vt[i,k]
        sum_ = s[i] * sum_
        reco += sum_
    ### END SOLUTION
    del sum_
    return reco

Errors and corrections

I found a few errors in your code, but only the first one listed causes the randomness.

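  1. The reason it sometimes works and sometimes doesn't is that you are using np.empty, which leaves your reco and sum_ arrays uninitialized, so they hold whatever garbage values happen to be in memory, anywhere from -inf to inf. Because reco is accumulated with +=, that garbage is carried into the result. Sometimes you get lucky and the initial values happen not to break anything. You can fix this by using np.zeros instead.

  2. (Retracted) Also, the shapes you specified are incorrect, and you implemented the outer product (the two inner loops) incorrectly. This would not cause the randomness, but your reconstruction would be wrong. See the code below. Edit: your code is fine. I misread something the first time.

  3. Your image normalization is wrong; you are missing some parentheses. See the code below.

  4. This is not an error but a suggestion: you can easily write all of this without any loops by using NumPy broadcasting and matrix multiplication. See the code in the section below.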
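
Points 1 and 3 can be seen in isolation with a small sketch. Since the contents of an `np.empty` buffer are not reproducible, the example below fakes them with NaN values; the helper `accumulate` and the toy arrays are made up for illustration and are not part of the original notebook.

```python
import numpy as np

def accumulate(start, terms):
    """Add every term onto a copy of `start`, mimicking `reco += sum_`."""
    total = start.copy()
    for term in terms:
        total += term
    return total

terms = [np.ones((2, 2)), 2 * np.ones((2, 2))]

# Point 1, safe case: the accumulator starts from exact zeros.
from_zeros = accumulate(np.zeros((2, 2)), terms)

# Point 1, simulated np.empty: the buffer already holds leftover values
# (NaN is an assumption chosen here to make the effect visible).
leftover = np.full((2, 2), np.nan)
from_garbage = accumulate(leftover, terms)

print(from_zeros)    # every entry is 3.0
print(from_garbage)  # every entry is nan

# Point 3: without parentheses, division binds tighter than subtraction,
# so the expression is im - (im.min() / im.max()) - im.min().
im = np.array([[2.0, 4.0], [6.0, 10.0]])
wrong = im - im.min() / im.max() - im.min()
right = (im - im.min()) / (im.max() - im.min())

print(wrong.min(), wrong.max())  # not rescaled to [0, 1]
print(right.min(), right.max())  # 0.0 1.0
```

With a real `np.empty` buffer the leftover values change from run to run, which would explain why the same call only sometimes fails.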

Implementing the fixes and suggestions

import numpy as np
from PIL import Image
from urllib import request
from matplotlib import pyplot as plt

url = "https://pbs.twimg.com/profile_images/656159226805399552/v6ffWuIc_400x400.jpg"
img = np.array(Image.open(request.urlopen(url))) # Image of owl

def normalize_image(X: np.ndarray):
    xmin = X.min()
    xmax = X.max()
    return (X - xmin) / (xmax - xmin)

X = img[..., 0] # Pick a single channel
X = normalize_image(X)

def reconstruct_loops(u, s, vt, k):
    # Start from exact zeros so that += accumulates only the k rank-one terms
    reco = np.zeros((u.shape[0], vt.shape[1]))
    for i in range(k): # for k components
        sum_ = np.zeros_like(reco)
        for row in range(u.shape[0]): # for each element in ith column of u
            # the column index is not named k, so it does not shadow the argument k
            for col in range(vt.shape[1]): # for each element in ith row of vt
                sum_[row, col] = u[row, i] * vt[i, col]
        reco += s[i] * sum_
    return reco

def reconstruct_vectorized(u, s, vt, k):
    return (u[:,:k]*s[:k])@vt[:k,:]

k = 10
u, s, vt = np.linalg.svd(X, full_matrices=False)
reco_loop = reconstruct_loops(u, s, vt, k)
reco_vector = reconstruct_vectorized(u, s, vt, k)

fig, axes = plt.subplots(1,3)
axes[0].set_title("Original")
axes[0].set_axis_off()
axes[0].imshow(X, cmap='gray')
axes[1].set_title("Using\nloops")
axes[1].set_axis_off()
axes[1].imshow(normalize_image(reco_loop), cmap='gray')
axes[2].set_title("Using numpy broadcasting\nand matrix\nmultiplication")
axes[2].set_axis_off()
axes[2].imshow(normalize_image(reco_vector), cmap='gray')
plt.show()

Result:
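
Independently of the plot, the equivalence of the two approaches can be checked numerically. The sketch below is only an illustration: it uses a small seeded random matrix as a stand-in for the image, and `reconstruct_outer` is a hypothetical helper that writes each rank-one term with `np.outer` instead of the two inner loops.

```python
import numpy as np

def reconstruct_outer(u, s, vt, k):
    """Rank-k reconstruction as an explicit sum of k scaled outer products."""
    reco = np.zeros((u.shape[0], vt.shape[1]))
    for i in range(k):
        reco += s[i] * np.outer(u[:, i], vt[i, :])
    return reco

def reconstruct_vectorized(u, s, vt, k):
    """Same reconstruction with broadcasting and one matrix product."""
    return (u[:, :k] * s[:k]) @ vt[:k, :]

# Seeded random matrix standing in for an image (an assumption for this sketch).
rng = np.random.default_rng(0)
X = rng.random((6, 4))
u, s, vt = np.linalg.svd(X, full_matrices=False)

k = 2
# Both formulations should agree up to floating-point rounding.
same = np.allclose(reconstruct_outer(u, s, vt, k),
                   reconstruct_vectorized(u, s, vt, k))
# Using all singular values should recover the original matrix.
full = np.allclose(reconstruct_vectorized(u, s, vt, len(s)), X)
print(same, full)  # True True
```

Running the same check several times in a row is a simple way to confirm that the run-to-run variation is gone once the accumulator starts from zeros.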