Converting a vectorized loop to a for loop because of memory limitations
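I have some Octave / Matlab code that Andy and the team helped me a lot with. The problem I'm running into now is that I don't have enough memory to generate signals of longer duration.
My plan to get around this is:
1) Convert the vectorized loop to a for loop. (having a problem here)
2) Have the for loop export each segment as its own wav file instead of appending it, which is what the vectorized code does. (having a problem here)
3) Join the wav file segments using sox.
Most examples online go from for loops to vectorized code, not the other way around. Any ideas? I'm also open to other suggestions for fixing my memory issue. Note: I'm using a Raspberry Pi 2 with 1 GB of RAM. It works and it's pretty fast; I'm just trying to get longer-duration signals, and exporting each segment should allow for that.
I'm using Octave, which is compatible with Matlab.
See the working vectorized code below.
It is based on the Paul Nasca stretch algorithm found here:
http://www.paulnasca.com/algorithms-created-by-me#TOC-PaulStretch-extreme-sound-stretching-algorithm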
urlwrite('http://www.onewithall.net/rttmpfiles/3sec8000.wav','3sec8000.wav');
inputfn='3sec8000.wav' %change this to test another file
[d, fs, bps]=wavread(inputfn);
inputlen=rows (d)/fs;
printf ("Original duration of file in seconds = %.2f s\n", rows (d)/fs);
dur=60; %duration / length you want the file to be in seconds
stretch = dur/rows(d)*fs; %how much I need to stretch the file to get it to be the duration I want
windowsize = round (0.25 * fs);
step = round ((windowsize/2)/stretch);
## original window
fwin = @(x) (1-x.^2).^1.25;
win = fwin (linspace (-1, 1, windowsize));
#win = hanning (windowsize)';
## build index
ind = (bsxfun (@plus, 1:windowsize, (0:step:(rows(d)-windowsize))'))';
cols_ind = columns(ind);
## Only use left channel
left_seg = d(:,1)(ind);
clear d ind;
## Apply window
left_seg = bsxfun (@times, left_seg, win');
## FFT
fft_left_seg = fft (left_seg);
clear left_seg
#keyboard
## overwrite phases with random phases
fft_rand_phase_left = fft_left_seg.*exp(i*2*pi*rand(size(fft_left_seg)));
clear fft_left_seg;
ifft_left = ifft (fft_rand_phase_left);
clear fft_rand_phase_left;
## window again
ifft_left = bsxfun (@times, real(ifft_left), win');
## restore the windowed segments with half windowsize shift
restore_step = floor(windowsize/2);
ind2 = (bsxfun (@plus, 1:windowsize, (0:restore_step:(restore_step*(cols_ind-1)))'))';
left_stretched = sparse (ind2(:), repmat(1:columns (ind2), rows(ind2), 1)(:), real(ifft_left(:)), ind2(end, end), cols_ind);
clear ind2 ifft_left win;
left_stretched = full (sum (left_stretched, 2));
## normalize
left_stretched = 0.8 * left_stretched./max(left_stretched);
printf ("converted %s =%.2f(s) file to stretched.wav = %.2f(s)\n", inputfn, inputlen, rows (left_stretched)/fs);
wavwrite (left_stretched, fs, bps, "stretched.wav");
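As a rough sanity check on where the memory goes, the size of the window matrix in the vectorized code can be estimated before running it. This is only a sketch: it assumes the sample file is 3 s long at 8000 Hz (as its name suggests) and counts one real and one complex copy of the matrix, whereas the actual peak depends on how many temporaries Octave keeps alive.

```octave
## Assumed input: 3 s at 8000 Hz (inferred from the file name, not measured)
fs = 8000;
n_samples = 3 * fs;
dur = 60 * 10;                      # target duration in seconds

windowsize = round (0.25 * fs);     # 2000 samples
stretch = dur / (n_samples / fs);   # 200
step = round ((windowsize / 2) / stretch);

## Same window start positions as used for "ind" in the vectorized code
n_windows = numel (0:step:(n_samples - windowsize));

bytes_real = windowsize * n_windows * 8;   # one real double matrix
bytes_complex = 2 * bytes_real;            # one complex matrix (FFT output)

printf ("%d windows, %.1f MB real, %.1f MB complex\n", ...
        n_windows, bytes_real / 1e6, bytes_complex / 1e6);
```

The number of windows grows linearly with the target duration, so every intermediate matrix grows with it. A for loop that handles one window at a time avoids holding all of them at once.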
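I tried to track down the problem by placing display('line') at key points. The culprit looks like the line
left_stretched = sparse (ind2(:), repmat(1:columns (ind2), rows(ind2), 1)(:), real(ifft_left(:)), ind2(end, end), cols_ind);
The line above only seems to fail when I run out of RAM. It says: error: subscript indices must be either positive integers or logicals. Note that this only happens when I run out of memory trying to use a long duration by setting dur=60*1800. If I set dur=60*10, everything works fine.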
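One thing worth checking, offered as a possible explanation rather than a confirmed diagnosis: for very large stretch factors the analysis step can round to zero. The sketch below assumes the input is 3 s at 8000 Hz (inferred from the file name) and reuses the step formula from the code above. `hop` is a helper name introduced here for illustration.

```octave
## Assumed input: 3 s at 8000 Hz (inferred from the file name)
fs = 8000;
inputlen = 3;
windowsize = round (0.25 * fs);

## Same formula as "step" in the vectorized code
hop = @(dur) round ((windowsize / 2) / (dur / inputlen));

step_short = hop (60 * 10);     # the duration that works
step_long  = hop (60 * 1800);   # the duration that fails

## A range with a zero increment is empty, so no windows would be built
n_windows_long = numel (0:step_long:(inputlen * fs - windowsize));

printf ("step for dur=60*10:   %d\n", step_short);
printf ("step for dur=60*1800: %d (windows: %d)\n", step_long, n_windows_long);
```

If the step is 0, `ind` and `ind2` come out empty, and `ind2(end, end)` would then index into an empty matrix, which could produce a subscript error unrelated to RAM. Under that assumption, clamping the step with something like `max (1, round (...))` would avoid the empty range, at the cost of not reaching the requested duration exactly.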
Do you still remember me? I'm the author of the initial code you posted. Below is the code as a for loop. I have tested it with an output length of 800 s.
## based on http://hypermammut.sourceforge.net/paulstretch/
## https://github.com/paulnasca/paulstretch_python/blob/master/paulstretch_steps.png
more off
inputfn = "original.wav"
[d, fs, bps] = wavread (inputfn);
inputlen=rows (d)/fs;
printf ("Original duration of file in seconds = %.2f s\n", rows (d)/fs);
target_duration = 60; # in seconds
stretch = target_duration/inputlen;
# 1/4 s window len
windowsize = round (0.25 * fs);
# stepwidth between windows
step = round ((windowsize/2)/stretch);
numsteps = floor((rows(d)-windowsize)/step);
## restore the windowed segments with half windowsize shift
restore_step = floor (windowsize / 2);
## stretched duration
stretched_len = (numsteps*restore_step+windowsize)/fs;
printf ("Stretched duration of file in seconds = %.2f s\n", stretched_len);
stretched = zeros (numsteps*restore_step+windowsize, 1);
if (!exist ("out", "dir"))
mkdir ("out");
endif
## Matrix which holds the freq of the maximum amplitude and the max. amplitude
chunks_stats = zeros (numsteps, 2);
## original window
fwin = @(x) (1-x.^2).^1.25;
win = fwin (linspace (-1, 1, windowsize));
## loop over all windows
for k = 1:numsteps
if (! mod(k, 50))
printf ("Calculating chunk %i of %i...\n", k, numsteps);
fflush (stdout);
endif
## Only use left channel
s_ind = (k - 1) * step + 1;
e_ind = s_ind + windowsize - 1;
tmp = win' .* d(s_ind:e_ind, 1);
## FFT, overwrite phases with random phases and IFFT
tmp = fft(tmp);
[m, ind] = max (abs(tmp(1:floor(numel(tmp)/2))));  # floor keeps the index valid for odd window sizes
# Frequency in Hz
chunks_stats(k, 1) = (ind-1)/windowsize*fs;
# max Amplitude
chunks_stats(k, 2) = m;
printf ("Freq = %.2f Hz, max = %.2f\n", chunks_stats(k, :));
tmp = ifft (tmp .* exp(i*2*pi*rand(size(tmp))));
## window again
tmp = win' .* real (tmp);
fn = sprintf ("out/out_%04i.wav", k);
wavwrite (tmp, fs, bps, fn);
s_ind = (k - 1) * restore_step + 1;
e_ind = s_ind + windowsize - 1;
stretched (s_ind:e_ind) += tmp;
endfor
## normalize
stretched = 0.8 * stretched./max(abs(stretched));  # abs so that negative peaks cannot clip
wavwrite (stretched, fs, bps, "stretched.wav");
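If you want to write multiple wav files and join them later, this gets a bit more difficult because of the overlapping windows. But I think this code runs fine on a BeagleBone Black.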
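One way the overlap problem could be handled, sketched here with tiny made-up sizes and stand-in data: because each window is shifted by half a window, the first `restore_step` samples of the running buffer are final as soon as window k has been added. Those samples can be emitted as a non-overlapping piece while only one window-sized buffer stays in memory. The names `buf`, `chunks` and `segs` are hypothetical, and the pieces are collected in a cell array instead of being written to files.

```octave
windowsize = 8;                  # tiny hypothetical sizes, for illustration only
restore_step = floor (windowsize / 2);
numsteps = 5;
## Stand-in for the windowed IFFT output of each step (one column per window)
segs = reshape (1:windowsize*numsteps, windowsize, numsteps);

## Reference: overlap-add into one full-length vector, as in the loop above
total = (numsteps - 1) * restore_step + windowsize;
full_out = zeros (total, 1);
for k = 1:numsteps
  s_ind = (k - 1) * restore_step + 1;
  full_out(s_ind:s_ind+windowsize-1) += segs(:, k);
endfor

## Streaming: keep a single window-sized buffer and emit finished samples
buf = zeros (windowsize, 1);
chunks = cell (numsteps + 1, 1);
for k = 1:numsteps
  buf += segs(:, k);
  chunks{k} = buf(1:restore_step);     # final; could be written to a file here
  buf = [buf(restore_step+1:end); zeros(restore_step, 1)];
endfor
chunks{numsteps + 1} = buf(1:windowsize-restore_step);   # remaining tail

streamed = vertcat (chunks{:});
max_diff = max (abs (streamed - full_out));
printf ("Max difference to full overlap-add: %g\n", max_diff);
```

Pieces produced this way do not overlap, so they could be concatenated directly. The catch is normalization: the final `0.8 / max` scaling needs the global maximum, so a streaming version would have to use a fixed gain or a second pass over the pieces.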
Edit: Added saving the chunks to separate files, and collecting the maximum amplitude and its frequency for each chunk in chunks_stats.
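The frequency bookkeeping in chunks_stats can be checked in isolation with a synthetic tone. This is a minimal sketch that assumes an 8000 Hz sample rate and a 440 Hz sine (both chosen for illustration) and uses the same bin-to-Hz conversion as the loop above.

```octave
fs = 8000;                         # assumed sample rate
windowsize = round (0.25 * fs);    # 2000 samples, as in the code above
t = (0:windowsize-1)' / fs;
x = sin (2 * pi * 440 * t);        # hypothetical 440 Hz test tone

spec = fft (x);
## Look only at the first half of the spectrum (non-negative frequencies)
[m, ind] = max (abs (spec(1:floor (numel (spec) / 2))));

## Bin index (1-based) to frequency in Hz
freq = (ind - 1) / windowsize * fs;
printf ("Peak at %.2f Hz\n", freq);
```

With a quarter-second window the bin spacing is fs/windowsize = 4 Hz, so the reported frequency is only accurate to within that spacing for tones that do not fall exactly on a bin.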