Real-Time Speech Pitch Shifting on an FPGA
Project Report (Spring 2006)

Frequency Shifting

The simplest way to shift the frequency content of an audio signal is by
single-sideband (SSB) modulation. This process entails eliminating the
negative frequency content of a signal, modulating the positive frequencies
by multiplication with a complex sinusoid, and finally reconstructing the
real signal. Elimination of the lower sideband prevents frequency content
from switching sidebands during modulation. Multiplication with a complex
sinusoid (as opposed to a real-valued sinusoid) is used to avoid imaging
issues introduced by standard modulation.
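The three steps described above can be sketched in Python. This is a minimal illustration, not the project's MATLAB or FPGA implementation: the function names (`dft`, `analytic_signal`, `ssb_shift`) and the test values are hypothetical, and the analytic signal is built here by zeroing the negative-frequency bins of a naive DFT, which is only practical for short blocks.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (kept dependency-free)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Remove negative-frequency content and double the positive bins."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2   # positive frequencies
        elif k > n / 2:
            spectrum[k] = 0    # negative frequencies
        # DC (k == 0) and Nyquist (k == n / 2) are left unchanged
    return dft(spectrum, inverse=True)


def ssb_shift(x, shift_hz, fs):
    """Shift every frequency component of the real signal x by shift_hz."""
    xp = analytic_signal(x)
    wc = 2 * math.pi * shift_hz  # frequency shift in radians per second
    ts = 1.0 / fs                # sampling period in seconds
    # Multiply by a complex sinusoid, then keep the real part.
    return [(xp[i] * cmath.exp(1j * wc * i * ts)).real for i in range(len(x))]


fs = 800  # assumed sample rate in Hz
x = [math.cos(2 * math.pi * 100 * i / fs) for i in range(80)]  # 100 Hz cosine
y = ssb_shift(x, 50, fs)  # shift up by 50 Hz
```

With these assumed values the 100 Hz tone falls exactly on a DFT bin, so `y` should closely match a 150 Hz cosine. A real-time design would more likely use a streaming Hilbert-transform filter than a block DFT.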
For a real input signal, the Hilbert transform is used to retain only the upper sideband, producing a signal subsequently referred to as the analytic signal [1]. Appendix A derives how SSB modulation can be achieved using the Hilbert transform, yielding the following result:
where x_p[n] is the analytic signal, ω_c is the amount of frequency shift in radians per second, x[n] is the input signal, and T_s is the sampling period.
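For reference, the standard textbook form of Hilbert-transform SSB modulation can be written with these symbols as follows. The names y[n] (the shifted output) and \hat{x}[n] (the Hilbert transform of x[n]) are assumptions introduced here, and the expression derived in Appendix A may be arranged differently.

```latex
\begin{align}
x_p[n] &= x[n] + j\,\hat{x}[n] \\
y[n]   &= \operatorname{Re}\!\left\{ x_p[n]\, e^{j\omega_c n T_s} \right\}
        = x[n]\cos(\omega_c n T_s) - \hat{x}[n]\sin(\omega_c n T_s)
\end{align}
```

The product ω_c n T_s is in radians, since ω_c is in radians per second and n T_s is the time of sample n in seconds.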
To evaluate the effectiveness of SSB modulation as a pitch shifting solution, we simulated the algorithm in MATLAB for qualitative and quantitative analysis. As an initial test, classical music was modulated up in frequency by 100 Hz. Qualitatively, it is immediately evident that the algorithm dramatically changes the sound of the input: the original warm sound of the music becomes metallic and dissonant. While the pitch is audibly higher, the shift significantly distorts the harmonic structure of the music. Figure 2 shows the signal frequency spectrum before and after modulation.
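A quick arithmetic check suggests why a 100 Hz shift sounds dissonant. The sketch below uses a hypothetical tone with a 220 Hz fundamental and two overtones (these values and the helper names are illustrative assumptions, not taken from the simulation) and compares the ratios between partials before and after a constant offset is added.

```python
def shift_partials(partials_hz, shift_hz):
    """Add a constant offset to every partial, as a linear frequency shift does."""
    return [f + shift_hz for f in partials_hz]


def ratios_to_lowest(partials_hz):
    """Ratio of each partial to the lowest one (integers for a harmonic tone)."""
    lowest = min(partials_hz)
    return [f / lowest for f in partials_hz]


original = [220.0, 440.0, 660.0]           # hypothetical harmonic tone
shifted = shift_partials(original, 100.0)  # 100 Hz upward shift

print(ratios_to_lowest(original))  # [1.0, 2.0, 3.0]
print(ratios_to_lowest(shifted))   # [1.0, 1.6875, 2.375]
```

The shifted partials at 320, 540, and 760 Hz are no longer integer multiples of the lowest one, so the ear no longer hears them as a single harmonic tone.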
The output spectrum confirms that the input is linearly shifted along the frequency axis. That is, each frequency component is increased or decreased by an additive constant, 100 Hz in this example. Linear frequency shifts have many applications; however, human perception of sound relies on the harmonic relationship between frequency components. Modulation does not preserve this harmonic relationship, which results in the perceived degradation of sound quality. Speech input demonstrates that this shifting technique is less problematic for non-musical audio signals. Nonetheless, SSB modulation is certainly not an ideal pitch shifting solution.

Copyright 2006 Habib Estephan, Scott Sawyer, and Daniel Wanninger. All rights reserved.