MATLAB 编程/相位声码器和编码器
外观
| 一位华夏公益教科书用户质疑本页面的事实准确性,因为它不符合 Flanagan 1966 年在高频下的相位声码器方程。 请查看相关讨论。 |
以下相位声码器及其相应的频谱编码例程是为 MATLAB 克隆编译器编写的,它可能在GNU Octave中运行。有关背景信息,请查阅
- 相位声码器 在英文维基百科上;
- Laroche, J. 和 Dolson, M. (1999) "改进的相位声码器音频时标修改" IEEE 语音和音频处理学报 7(3):323-32;
- Miller Puckette (1995) "锁相声码器" [PostScript 文件] IEEE ASSP 应用信号处理到音频和声学会议论文集 (莫洪克,纽约:1995) - 通过在重新合成期间锁相相邻频谱箱和帧来开创相位相干传播;
- Axel Röbel (2003) "相位声码器中瞬态处理的新方法" 第 6 届国际数字音频效果大会论文集 (伦敦:2003 年 9 月 8-11 日);以及
- MATLAB 编程 华夏公益教科书,其中此代码作为示例给出。
以下代码、注释和相关评论由作者发布到公共领域,或者如果这不可行,则允许所有人将此材料用于任何目的。
% fs.m - synthesis from frequency spectra.
% [output: waveform] = fs(freqresp: series of spectra)
%
% James Salsman, 1999-2000
%
% Note: sometimes this produces data outside the (1,-1) range, even when
% run with input from that smaller domain given to af.m. It is safe to
% clamp outlying values to the extrema of the corresponding input domain.
% If the resulting signal sounds clipped, it is NOT because of clamping
% domain outliers of the range (e.g., 1.2, -1.3) to the domain edges (1,-1).
% To improve the quality, try increasing the length of the fft; if that
% doesn't work, then try increasing the sampling rate. To improve the speed,
% rewrite it in C; fftPrep is the only complex vector, and the call to 'sort'
% is only for the ordering of the dot-product magnitudes, not the sorting.
function [output] = fs(freqresp)
samplingRate = 16000; % changed from 8000
frameRate = 100; % should be even, maybe try 160
windowStep = samplingRate/frameRate; % 16000/100 = 160, shall be even
% windowSize = samplingRate/40; % must be even; use 256 for 11k/s rate
windowSize = windowStep*2; % this is better [2002]
[rows, cols] = size(freqresp); % input dimensions
origFFtSize = rows*2; % should be power of two
% Allocate (preextend) all space needed for output array and helpers.
%
output = zeros(cols*windowStep + (windowSize - windowStep), 1);
FFtPrep = zeros(1, origFFtSize) + 0*i; % complex ifft input
iFFtReal = zeros(1, origFFtSize); % real ifft output
phases = zeros(1, origFFtSize/2); % resynthesis phase array
newPhases = zeros(1, origFFtSize/2); % uses "-1" for unassigned phases
transInd = zeros(1, origFFtSize/2); % index into bins by magnitude
ignore = zeros(1, origFFtSize/2); % sort values unused but can't hurt
for n = 1:origFFtSize/2, % Initialize synthesis phases.
phases(n) = pi*mod(n, 2); % Each starting phase 180 deg's apart
end; % from the next one.
% pre-calculate phase transition from one frame to the next for base 2nd bin
%
phaseFactor = 2*pi*samplingRate/(frameRate*origFFtSize);
% resynthesis COLA window should be Portnoff (sinc) because iFFT will
% shape like the window it was taken from
%
wind = sinc((-windowSize/2:windowSize/2-1)/(windowSize/2)); % pure sinc
for c = 1:cols, % iterate over columns
first = (c-1)*windowStep + 1; % output index
last = first + windowSize - 1; % boundaries
for b = 2:origFFtSize/2, % iterate over bins
% prepare resynthesis
%
[re, im] = pol2cart(phases(b), freqresp(b, c)); % polar to cartesian
FFtPrep(b) = re + i*im; % rectangular to complex
FFtPrep(origFFtSize - b+2) = conj(FFtPrep(b)); % the conjugates count
end;
iFFtReal = real(ifft(FFtPrep)); % resynthesize
output(first:last) = output(first:last) ... % note: column vector (')
+ (iFFtReal( origFFtSize/2 - windowSize/2 : ...
origFFtSize/2 + windowSize/2 - 1 ) .* wind )'; % COLA
if c < cols, % figure out the next set of phases
newPhases = ones(1, origFFtSize/2)*-1; % "-1" for unassigned phase
% set transInd to index the least-to-greatest transitioning magnitudes
%
[ignore, transInd] = sort(freqresp(:,c)' .* freqresp(:,c+1)');
% iterate downwards such that indexed bin will be greatest mags first
%
for idx = origFFtSize/2:-1:1, % step by -1 from greatest to least
b = transInd(idx); % get bin from index
if b-1 > 0, % adjacent lower neighboring bin
lower = newPhases(b-1);
else
lower = -1; % nonexistent = unassigned
end;
if b+1 < origFFtSize/2, % adjacent higher neighboring bin
higher = newPhases(b+1);
else
higher = -1; % nonexistent = unassigned
end;
if (lower == -1) & (higher == -1), % both unassigned: propagate
newPhases(b) = mod(phases(b) + (b-1)*phaseFactor, 2*pi);
elseif lower == -1, % make 180 degs from higher
newPhases(b) = mod(higher + pi, 2*pi);
elseif higher == -1, % make 180 degs from lower
newPhases(b) = mod(lower + pi, 2*pi);
else % both neighbors assigned
avgAng = (lower + higher)/2; % Mean will be 180 deg off
if abs(lower - higher) > pi, % when the angles are that
avgAng = avgAng + pi; % far apart; if so, adjust
end; % 180 degs from mean angle.
newPhases(b) = mod(avgAng + pi, 2*pi);
end;
end;
phases = newPhases; % replace array for next iteration
end; % if -- skips last iteration
end;
以下是相应的编码器
% af.m - frequency spectrum analysis.
% [freqresp: sequence of spectra] = af(input: waveform)
%
% James Salsman, '99-2000; adapted from Malcolm Slaney's mfcc.m [from his
% "Auditory Toolbox" at www.interval.com.]
%
% Also works when samplingRate is 8000 or fftSize is 256, but I haven't tried
% both together.
function [freqresp] = af(input)
samplingRate = 16000; % samples/sec
frameRate = 100; % started at 100/sec, maybe use 160 later
windowStep = samplingRate/frameRate; % 16000/100 = 160; shall be even
windowSize = 400; % large, but from ETSI standard ES 201108
windowSize = samplingRate/40; % must be even; use 256 for 11k/s rate
fftSize = 512; % larger better but slower; usually 2^int.
[rows, cols] = size(input); % sanity-check input dimensions
if (rows > cols) % if there are more rows than columns
input=input'; % then transpose
end
% hamming window shapes input signal
%
hamWindow = 0.54 - 0.46*cos(2*pi*(0:windowSize-1)/windowSize);
% how many columns of data we will end up with
%
cols = ceil((length(input) - windowSize)/windowStep);
% Allocate (preextend) all the space we need for the output arrays.
%
freqresp = zeros(fftSize/2, cols);
fftData = zeros(1, fftSize);
fftMag = zeros(1, fftSize);
for c = 1:cols, % for each column of data
first = (c-1)*windowStep + 1; % determine
last = first + windowSize - 1; % endpoints
% shape the data with a hamming window
%
fftData(1:windowSize) = input(first:last).*hamWindow;
% find the magnitude of the fft
%
fftMag = abs(fft(fftData)); % keep only real magnitudes
freqresp(:, c) = fftMag(1:fftSize/2)'; % emit
end;