Objective Analysis of Time Stretching Algorithms
This research describes and implements new techniques for objective analysis and comparison of common time stretching algorithms. A set of test signals and samples is used in conjunction with spectral analysis to generate objective error measurements and subsequent comparisons of the algorithms. The time stretching techniques are treated as black-box processes, and are regarded as members of a general family of analyze-modify-resynthesize algorithms. This black-box approach of output signal analysis provides a basis for objective evaluation of the more general class of algorithms. Six common time stretching algorithms are analyzed: the classic phase vocoder, oscillator bank phase vocoder, peak tracking phase vocoder, phase-locked vocoder, synchronous overlap add technique, and overlap add technique. The six algorithms are then compared within a uniform framework to provide greater insight into the behavior of their output signals. Subjective listening evaluations are also performed, and all results contribute to the development of criteria that evaluate and rank the algorithms’ suitability in various signal processing situations.
• Audio – browse audio files, download audio files
• Graphs – browse graphs, download graphs
• Scripts – browse scripts, download scripts
• download all resources
M.L.A.
—
Baker, Cooper Everett. Objective Analysis of Time Stretching Algorithms.
Diss. University of California, San Diego, 2015.
A.P.A.
—
Baker, C. E. (2015). Objective Analysis of Time Stretching Algorithms
(Doctoral dissertation, University of California, San Diego).
Chicago
—
Baker, Cooper Everett. Objective Analysis of Time Stretching Algorithms.
Diss. University of California, San Diego, 2015.
ProQuest
—
In total, this research uses six test signals in conjunction with three analysis techniques to evaluate six time stretching algorithms. The techniques and algorithms are described in detail, and 128 different graphs representing the resultant data are interpreted and comparatively discussed. This particular example was selected to briefly illustrate a facet of the research and provide a general idea of what the work is about.
1 — Test Signal
Analysis of the Phase-Locked Vocoder in this example is performed using one of the special test signals shown below. This test signal is designed to be difficult to stretch, and consists of a steady 1000 Hertz sinusoid mixed with a logarithmically swept sinusoid that moves from 50 to 21,000 Hertz.
Test Signal
The input signal is synthesized with a duration of two seconds, and an ideal output signal is synthesized with a duration of four seconds. This allows comparison of the ideal output signal with the stretched output signal when the Phase-Locked Vocoder is configured to stretch its input by a factor of two. The input, ideal, and stretched signals used in this example may be heard below.
Input Signal
Ideal Output Signal
Stretched Output Signal
2 — Phase-Locked Vocoder
Time stretching of the input signal is performed by the Phase-Locked Vocoder developed by Miller Puckette. This algorithm is similar to the Classic Phase Vocoder, but it produces higher-fidelity results by mitigating phase errors with a technique that locks the phases of adjacent bins together before resynthesis. The following formulas show its operation, starting with a Discrete Fourier Transform (DFT) that generates spectral data, where X contains the spectra, w is the window function, x is the signal, H is the hop size in samples, S is the stretch factor, N is the window size in samples, and F is the number of frames:
[Equation: pvoc/fft.svg]
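The framed DFT step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the dissertation's formula: a Hann window is assumed for w, the analysis hop is assumed to be H / S so that resynthesis at hop H stretches the signal by S, and the function names are hypothetical.

```python
import cmath
import math

def hann(N):
    """Periodic Hann window of length N (assumed window shape)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]

def dft(frame):
    """Direct O(N^2) DFT, adequate for a small illustration."""
    N = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

def analysis_frames(x, N, H, S):
    """Return X[f][k]: windowed spectra read at an analysis hop of H / S."""
    w = hann(N)
    hop_in = H / S                       # assumed analysis hop
    F = int((len(x) - N) / hop_in) + 1   # number of complete frames
    X = []
    for f in range(F):
        start = int(round(f * hop_in))
        X.append(dft([w[n] * x[start + n] for n in range(N)]))
    return X
```

For example, a 64-sample cosine with a period of four samples, analyzed with N = 16, H = 4, and S = 2, produces 25 frames whose largest magnitude falls in bin 4.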
Phase accumulation occurs next, where P contains accumulated phase, the over-bar signifies the complex conjugate, and Y[f-1,k] represents the previous output spectrum:
[Equation: pvoc/phase_accum.svg]
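One common way to express this kind of phase accumulation is to multiply the previous output spectrum by the complex conjugate of an earlier input spectrum, so that the angle of each product carries the phase to be propagated. The sketch below assumes that form; the name back_spectrum (an input frame analyzed one hop before the current one) is hypothetical, and the dissertation's exact formula may differ.

```python
def accumulate_phase(prev_output, back_spectrum):
    """Return P[k] = Y_prev[k] * conj(X_back[k]) for every bin k.

    The angle of P[k] is the previous output phase minus the earlier
    input phase; the magnitude is kept so that strong bins carry more
    weight in any later combination of neighbouring bins (assumption).
    """
    return [y_prev * x_back.conjugate()
            for y_prev, x_back in zip(prev_output, back_spectrum)]
```

Keeping the magnitudes at this stage is a design choice of the sketch: normalization to unit length can be deferred until the output spectrum is formed.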
Following phase accumulation, phases are locked together into L, with the notation k-1 and k+1 signifying bin offsets, so that phases of three adjacent bins are combined:
[Equation: pvoc/phase_lock.svg]
After adjacent phases are combined, the output spectra are calculated as Y:
[Equation: pvoc/out_spect.svg]
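A minimal sketch of this step, assuming the locked values act only as a phase rotation: each locked value is normalized to unit length and multiplied with the current input spectrum, so that the output keeps the input magnitudes. The function name and the handling of zero-magnitude bins (no rotation) are assumptions.

```python
def output_spectrum(X_frame, L):
    """Return Y[k] = X[k] * L[k] / |L[k]| for every bin k."""
    Y = []
    for x, l in zip(X_frame, L):
        mag = abs(l)
        # A zero-magnitude locked value carries no phase information,
        # so the bin is passed through unrotated (assumption).
        unit = l / mag if mag > 0.0 else 1 + 0j
        Y.append(x * unit)
    return Y
```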
Finally, the output signal is windowed and overlap-added into y:
[Equation: pvoc/ifft.svg]
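The windowed overlap-add can be sketched as follows: each output spectrum is inverse-transformed, multiplied by the window, and summed into the output at multiples of the hop size H. The function names are hypothetical, and amplitude compensation for the overlapping windows is left out of the sketch.

```python
import cmath
import math

def idft_real(Y):
    """Real part of the inverse DFT of one spectrum (direct O(N^2) form)."""
    N = len(Y)
    return [sum(Y[k] * cmath.exp(2j * math.pi * k * n / N)
                for k in range(N)).real / N
            for n in range(N)]

def overlap_add(frames, w, H):
    """Window each resynthesized frame and sum it into y at hop H."""
    N = len(w)
    y = [0.0] * (H * (len(frames) - 1) + N)
    for f, Y in enumerate(frames):
        block = idft_real(Y)
        for n in range(N):
            y[f * H + n] += w[n] * block[n]
    return y
```

With a Hann window and a hop of N / 4, the overlapping windows sum to a constant of 2 away from the edges, so a constant-valued frame sequence resynthesizes to a flat signal in the steady region.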
3 — Moving Spectral Analysis
Moving Spectral Analysis was developed for this research to provide information about the behavior of a stretched signal over time. This technique affords insight into time domain features like transients or persistent amplitude modulations by comparing them to their counterparts in the ideal signal. In order to create meaningful comparisons, the signals are power-normalized according to the following equations, where x is the audio file, w is the windowing function, A is the start of the signal region, B is the end of the signal region, and R is the Root Mean Square (RMS) power value:
[Equation: norm/rms.svg]
R is then used to perform power-normalization of the stretched file so that it matches the ideal file, as shown by the following equation, where x_s is the stretched file, R_s is the stretched RMS power value, R_i is the ideal RMS power value, x_sr is the power-normalized stretched file, and N is the length of the file in samples:
[Equation: norm/norm.svg]
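The two normalization steps can be sketched in Python as follows. This is a simplified illustration: the RMS is taken directly over the samples from A to B inclusive, the role of the windowing function w is not reproduced, and the function names are hypothetical.

```python
import math

def rms(x, A, B):
    """RMS value of x over the region A..B (inclusive, assumed)."""
    region = x[A:B + 1]
    return math.sqrt(sum(v * v for v in region) / len(region))

def power_normalize(x_s, R_s, R_i):
    """Scale the stretched file so that its RMS matches the ideal RMS."""
    gain = R_i / R_s
    return [gain * v for v in x_s]

# Usage with toy data: the stretched file is half as loud as the ideal.
ideal = [1.0, -1.0, 1.0, -1.0]
stretched = [0.5, -0.5, 0.5, -0.5]
x_sr = power_normalize(stretched, rms(stretched, 0, 3), rms(ideal, 0, 3))
```

After normalization, any remaining difference between the two files reflects spectral or temporal error and not an overall level mismatch.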
Following normalization, Fourier analysis is used to compare the signals’ spectra, frame by frame, to create a measure of difference between the ideal and stretched versions. The first step is implemented with a DFT, where X represents the output spectrum, N is the number of samples per analysis frame, w is the window function, x is the signal, H is the hop size in samples, and F is the total number of frames:
[Equation: mov/fft.svg]
The next step is the calculation of a set of moving average magnitude values over time, or magnitude curve, according to the following equation, where M contains the magnitude curve:
[Equation: mov/mag_avg.svg]
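A minimal sketch of the magnitude curve, assuming each value is the mean bin magnitude of one analysis frame. The argument X is a list of frames, each a list of complex bins; whether the dissertation averages over all bins or only the positive-frequency half is not stated here, so averaging over every bin is an assumption.

```python
def magnitude_curve(X):
    """Return M[f], the mean bin magnitude of each frame X[f]."""
    return [sum(abs(v) for v in frame) / len(frame) for frame in X]
```

Each frame collapses to a single number, so the curve tracks how the overall spectral energy of the signal moves over time.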
The windowing function is then normalized out of the magnitude curve, and the magnitude curve is converted to sones, represented by S in the following equation:
[Equation: mov/sones_avg.svg]
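The conversion can be sketched with the standard sone relation, in which 40 phons equal one sone and loudness doubles for every additional 10 phons. Everything else in this sketch is an assumption: the window is normalized out by dividing by its mean value, the decibel level is used directly as the loudness level in phons, and full_scale_db is a hypothetical calibration that maps an amplitude of 1.0 to 100 dB.

```python
import math

def to_sones(M, w, full_scale_db=100.0):
    """Convert a magnitude curve M to sones (illustrative calibration)."""
    window_gain = sum(w) / len(w)  # mean window value (assumed normalization)
    sones = []
    for m in M:
        amp = m / window_gain
        # Floor the amplitude to avoid log10(0) on silent frames.
        level_db = full_scale_db + 20.0 * math.log10(max(amp, 1e-12))
        sones.append(2.0 ** ((level_db - 40.0) / 10.0))
    return sones
```

With this calibration an amplitude of 1.0 maps to 64 sones and an amplitude of 0.001 maps to 1 sone.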
After creating the sones curves in S, the following equation uses these curves to calculate an error curve that shows how the two signals differ, where E represents the error curve, X_i is the ideal spectrum, and X_s is the stretched spectrum:
[Equation: mov/error.svg]
Finally, the average and maximum error values are determined according to the following equations, where E_avg is the average value of the error spectrum, and E_max is the maximum value of the error spectrum:
[Equations: mov/avg.svg, mov/max.svg]
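The error curve and its summary values can be sketched as follows, taking the error as the absolute difference between the ideal and stretched sones curves. The function name and the assumption that both curves have the same number of frames are illustrative.

```python
def error_stats(S_i, S_s):
    """Return (E, E_avg, E_max) for ideal and stretched sones curves."""
    # zip() stops at the shorter curve; equal lengths are assumed.
    E = [abs(a - b) for a, b in zip(S_i, S_s)]
    E_avg = sum(E) / len(E)
    E_max = max(E)
    return E, E_avg, E_max

E, E_avg, E_max = error_stats([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
# E is [0.0, 1.0, 2.0], E_avg is 1.0, E_max is 2.0
```

The average summarizes overall fidelity, while the maximum exposes brief but severe deviations that an average alone would hide.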
4 — Analysis Graphs
The following graphs show the ideal swept sinusoid signal in blue, the stretched version in green, and the error, or absolute difference, in red. The Y axis is scaled in sones to more closely match our perception of sound, and the X axis represents time, so the graphs display the ideal and stretched signals over time while providing a more analytical representation of their differences than our ears can perceive. The upper graph is an overview of the entire signal, and the lower graphs show more detailed views of interesting areas within it.
Moving Spectral Analysis Graphs
Files
All.zip
Objective Analysis of Time Stretching Algorithms.pdf
example
    audio
        ideal.wav
        lock.wav
        original.wav
    images
        mov_lock.jpg
        mov_lock_big.jpg
        sweep.jpg
        sweep_big.jpg
    svg
        mov
            avg.svg
            error.svg
            fft.svg
            mag_avg.svg
            max.svg
            sones_avg.svg
        norm
            norm.svg
            rms.svg
        pvoc
            fft.svg
            ifft.svg
            out_spect.svg
            phase_accum.svg
            phase_lock.svg