How to convert a matplotlib spectrogram image into a torch tensor

import numpy as np
from numpy import asarray
from matplotlib import pyplot as plt
import torch

# generate a signal
fs = 50 # sampling freq
ts = np.arange(0, 10, 1/fs) # times at which signal is sampled
s1 = np.sin(2 * np.pi * 2 * ts) # 2 hz
s2 = np.sin(2 * np.pi * 3 * ts) # 3 hz
s3 = np.sin(2 * np.pi * 6 * ts) # 6 hz
s = s1 + s2 + s3 # aggregate signal

# generate specgram
spectrum, freqs, t, im = plt.specgram(s, Fs=fs, xextent=((0, len(s)/fs)))

# convert matplotlib image to torch tensor
# bypassing the numpy part would be even better!
torch_tensor = torch.from_numpy(asarray(im, np.float32))

print(torch_tensor)

>>> TypeError: float() argument must be a string or a number, not 'AxesImage'

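I should add that the 'spectrum' variable is pretty much what I'm looking for, but I'm a bit confused by it: it only has two time columns, and I thought the specgram image had more than two time steps. If there is a way to use the spectrum variable to represent the whole image as a torch tensor, that would work for me too.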

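plt.specgram returns the spectrogram in the spectrum variable, so that is the variable you need to pass to torch.from_numpy. Also, by default specgram displays 10*log10(spectrum), i.e. the power on a dB scale, so you may want to apply that operation to your tensor, or at least use it when comparing what specgram displays with a plot of your tensor. See the code below: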

import numpy as np
from matplotlib import pyplot as plt
import torch

# generate a signal
fs = 50 # sampling freq
ts = np.arange(0, 10, 1/fs) # times at which signal is sampled
s1 = np.sin(2 * np.pi * 2 * ts) # 2 hz
s2 = np.sin(2 * np.pi * 3 * ts) # 3 hz
s3 = np.sin(2 * np.pi * 6 * ts) # 6 hz
s = s1 + s2 + s3 # aggregate signal

# generate specgram
ax1=plt.subplot(121)
ax1.set_title('Specgram image')
spectrum, freqs, t, im = ax1.specgram(s, Fs=fs, xextent=((0, len(s)/fs)))
ax1.axis('tight')

torch_tensor = torch.from_numpy(spectrum)

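# plot the torch tensor on the same dB scale that specgram uses
ax2=plt.subplot(122)
ax2.set_title('Torch tensor image')
# origin must be 'upper' or 'lower'; convert back to numpy for imshow
ax2.imshow(10*torch.log10(torch_tensor).numpy(),origin='lower',extent=[0,10,0,25])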
ax2.axis('tight')

plt.show()

Running this shows the specgram image and the tensor image side by side.
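On the point about spectrum having only two time columns: specgram defaults to NFFT=256 and noverlap=128, and only two full 256-sample windows fit into the 500 samples of this signal. The sketch below is one way to check this and to get more time steps. It uses matplotlib.mlab.specgram so that no figure is drawn. The smaller window (NFFT=64, noverlap=32) and the helper n_time_columns are illustrative assumptions, not something from the question.

```python
import numpy as np
import torch
from matplotlib import mlab  # computes the spectrogram without drawing a figure

fs = 50                          # sampling frequency, as in the question
n_samples = 500                  # 10 seconds of signal at 50 Hz
ts = np.arange(n_samples) / fs   # sample times
s = sum(np.sin(2 * np.pi * f * ts) for f in (2, 3, 6))  # 2, 3 and 6 Hz tones


def n_time_columns(n, nfft, noverlap):
    """Hypothetical helper: number of full windows of length nfft in n samples."""
    return (n - noverlap) // (nfft - noverlap)


# specgram defaults: NFFT=256, noverlap=128
spec_default, freqs, t = mlab.specgram(s, Fs=fs)
default_tensor = torch.from_numpy(spec_default.copy())

# assumed smaller window: fewer frequency rows, more time columns
nfft, noverlap = 64, 32
spec_fine, freqs_fine, t_fine = mlab.specgram(s, NFFT=nfft, Fs=fs, noverlap=noverlap)
fine_tensor = torch.from_numpy(spec_fine.copy())

# same dB scaling that specgram applies before displaying the image
fine_db = 10 * torch.log10(fine_tensor)

print(default_tensor.shape, n_time_columns(n_samples, 256, 128))   # 129 x 2, 2
print(fine_tensor.shape, n_time_columns(n_samples, nfft, noverlap))  # 33 x 14, 14
```

The trade-off is the usual one: a shorter window gives more time columns but fewer frequency rows (NFFT/2 + 1 for a real-valued signal), so with tones at 2, 3 and 6 Hz you may not want to shrink NFFT too far.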