0:00

Welcome back to the course on audio signal processing for music applications.

In the first part of this lecture we presented some properties of the discrete

Fourier transform.

We now continue with some more properties

that will be very useful when using the DFT.

In particular, we will talk about energy conservation and

decibels, phase unwrapping, zero-padding, and the Fast Fourier Transform,

together with what we call zero-phase windowing.

And finally, we will put it together with the concept of analysis and

synthesis of a sound.

0:42

The property of energy conservation

says that the energy of a signal,

measured either in the time domain or in the frequency domain,

is the same.

So we can compute the energy in one domain or the other.

So the energy is defined as

1:10

the sum of the squared absolute values of a signal.

And in the frequency domain,

we also take the absolute value squared and sum it.

And if we just add a normalization factor, dividing by N,

we get the same value.

Here we see an example: we have a time domain signal, we do

this energy computation, and we get this value, 11.8.

And if we do the same thing in the frequency domain, we take

the square of the absolute value,

sum, and then divide by N, and we get exactly the same value.
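
As a minimal sketch of this property, the following NumPy snippet computes the energy in both domains. The test signal is a hypothetical random sequence, not the one shown on the slide, so the value will not be 11.8.

```python
import numpy as np

# Hypothetical test signal (assumption: any real or complex sequence works).
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
N = len(x)

X = np.fft.fft(x)  # DFT of the signal

# Energy in the time domain: sum of squared absolute values.
energy_time = np.sum(np.abs(x) ** 2)

# Energy in the frequency domain: same sum, normalized by N.
energy_freq = np.sum(np.abs(X) ** 2) / N

print(energy_time, energy_freq)
```

The two printed values should agree up to floating-point rounding.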

1:53

Okay, a concept related to energy is amplitude, which is what

we normally use, either in the time domain or in the frequency domain.

When we obtain the polar representation of the spectrum of a signal from the DFT,

the amplitude is obtained by computing the absolute value,

which is a linear measure.

However, for the case of sound, a more intuitive representation of the amplitude

can be obtained by converting it to decibels, into a log value.

So the decibels are defined as we see here,

as 20 times log 10 of the absolute value of the signal.

So from the original time domain signal,

here we can see the absolute value of the spectrum in a linear representation.

And what we are now saying is that a more intuitive way

to visualize the amplitude in the frequency domain is to use a decibel scale.
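
The conversion can be sketched in a few lines of NumPy. The helper name `to_db` and the small floor `eps` (used to avoid taking the log of zero) are assumptions made for this illustration; they are not part of the lecture.

```python
import numpy as np

def to_db(magnitude, eps=np.finfo(float).eps):
    """Convert a linear amplitude to decibels: 20 * log10(|x|).

    eps is a small floor (an assumption of this sketch) that avoids log10(0).
    """
    return 20 * np.log10(np.maximum(magnitude, eps))

# Hypothetical example: a cosine with 5 cycles in N = 64 samples.
N = 64
n = np.arange(N)
x = 0.5 * np.cos(2 * np.pi * 5 * n / N)

mX = np.abs(np.fft.fft(x))  # linear magnitude spectrum
mX_db = to_db(mX)           # magnitude spectrum in decibels

print(np.argmax(mX_db[: N // 2]))  # bin with the largest magnitude
```

In decibels, a factor of 10 in amplitude corresponds to a difference of 20 dB, which is why small spectral components remain visible next to large ones.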

3:01

Okay, so the spectrum of a signal includes the amplitude,

now computed in decibels, and also the phase.

And phase unwrapping is a way to represent the phase spectrum

of the DFT in a way that is easier to visualize and understand.

So here we see an original signal,

the magnitude spectrum in decibels, and the phase spectrum,

computed as the angle of the complex values of the spectrum. And

here we see clearly that it's a very messy type of visualization.

So what the unwrapping does is basically smooth

that out by adding or subtracting multiples of two pi whenever there is a discontinuity.

So since the phase angle is bounded to an interval of width two pi,

whenever it goes beyond one end of that interval, it wraps around to the other end.

So what we're going to do is to unwrap that and

let it grow as it naturally would.

So we get these smoother functions that become much easier to read and interpret.
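
A small sketch of unwrapping with NumPy, assuming a hypothetical test signal: an impulse delayed by `d` samples, whose true phase is a straight line. `np.angle` returns values wrapped into the interval from minus pi to pi, and `np.unwrap` removes the jumps.

```python
import numpy as np

# Hypothetical signal: a unit impulse delayed by d samples.
# Its DFT has magnitude 1 and linear phase -2*pi*k*d/N.
N = 64
d = 5
x = np.zeros(N)
x[d] = 1.0

X = np.fft.fft(x)

wrapped = np.angle(X)           # phase wrapped into (-pi, pi]
unwrapped = np.unwrap(wrapped)  # jumps of 2*pi removed

# The unwrapped phase follows the expected straight line.
expected = -2 * np.pi * np.arange(N) * d / N
print(np.max(np.abs(unwrapped - expected)))
```

Plotting `wrapped` shows a sawtooth, while `unwrapped` is the straight line that the delay actually produces.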

4:17

Zero-padding means adding zeros at the end of a signal.

In the context of the DFT, if we zero-pad in one domain,

it produces an interpolated signal in the other domain.

So here we see an example of that:

we start from a signal x of size eight.

And this is the signal from which we compute the DFT,

so its DFT will also be of size eight.

And here we see the absolute value, in this case

converted to dB, to decibels, of these eight samples.

Now, instead of computing the DFT of these eight samples,

we can compute the DFT of these eight samples plus eight samples of zero.

Therefore the size of the DFT is 16; this is the second plot.

So by computing the DFT of size 16 but of only these eight samples,

what we're seeing is a much smoother visualization.

The samples of the magnitude spectrum for

N = 8 are exactly here.

But apart from those, there are interpolated values in between

that make the spectrum smoother.

And we can do even more: if we zero-pad further, up to N = 32,

we will get more interpolated values in between,

therefore resulting in a smoother spectrum.
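
The interpolation can be checked numerically. This is a sketch that assumes a hypothetical eight-sample signal (a Hann-shaped sequence), not the one on the slide. Passing `n` to `np.fft.fft` appends zeros at the end of the input before transforming.

```python
import numpy as np

# Hypothetical 8-sample signal (assumption: not the lecture's signal).
x = np.hanning(8)

X8 = np.fft.fft(x)         # DFT of size 8, no padding

# Explicit zero-padding: eight samples plus eight zeros.
x_padded = np.concatenate([x, np.zeros(8)])
X16 = np.fft.fft(x_padded)  # DFT of size 16

# Equivalent shortcut: let fft() pad to the requested size.
X32 = np.fft.fft(x, n=32)   # DFT of size 32

# Every 2nd sample of X16 and every 4th sample of X32 coincide with X8;
# the samples in between are the interpolated values.
print(np.allclose(X16[::2], X8), np.allclose(X32[::4], X8))
```

Note that zero-padding does not add information about the signal; it only samples the same underlying spectrum more densely.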

5:58

Okay, so now let's talk about the Fast Fourier Transform.

The DFT can be quite a demanding operation.

The implementation can be quite slow if

we don't pay attention to efficient implementations.

So the Fast Fourier Transform is an efficient implementation

of the DFT equation.

And it does that by taking advantage of symmetries.

So what it does is restrict the input signal

to have a size that is a power of two,

and thanks to that, a whole bunch of symmetries appear.

So in this example, with eight samples of a signal,

we can group the samples and

take advantage of these symmetries,

and then perform the computation on these pairwise groups of

samples, therefore having a much more efficient computation.
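
One common way to exploit these symmetries is the recursive radix-2 decimation-in-time algorithm, sketched below. This is an illustrative assumption about the grouping, not necessarily the exact scheme shown on the slide, and the function name `fft_radix2` is hypothetical.

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 FFT sketch; the length must be a power of two."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 0 or (N & (N - 1)) != 0:
        raise ValueError("length must be a power of two")
    if N == 1:
        return x  # the DFT of a single sample is the sample itself

    # Split into even-indexed and odd-indexed samples and transform each half.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])

    # Twiddle factors: the complex exponentials shared by both output halves.
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)

    # Symmetry: the second half reuses the same products with a sign flip.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

# Usage: compare against NumPy's FFT on eight hypothetical samples.
x = np.arange(8, dtype=float)
print(np.allclose(fft_radix2(x), np.fft.fft(x)))
```

Each level of the recursion halves the problem, which is where the N log N cost comes from.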

7:39

I computed DFTs of different sizes and measured the time that it took.

So as the size of the DFT was increasing,

the computation time increased quadratically, with the square of the size.

And here, at the last one I tried, a size of 16,000,

it was close to two minutes of compute time.

8:01

If I use the FFT implementation that comes with Python, of course,

it's an efficient implementation also because it's implemented in C.

But clearly we see the huge difference between the two.

So the FFT of size 16,000 samples takes

much less than a millisecond of compute time.

And this compute time is not growing quadratically,

but is growing much more slowly.

In fact, it's growing at a rate of N log N, which

is lower than the N-squared growth of the direct DFT implementation.
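
A rough timing comparison can be sketched as follows. The direct implementation here is a hypothetical matrix-based version of the DFT definition (the helper `dft_direct` is not the lecture's code), and the sizes are kept small so the script runs quickly; absolute timings depend on the machine.

```python
import time
import numpy as np

def dft_direct(x):
    """Direct DFT from the definition: an N x N matrix product, O(N^2)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)  # matrix of complex sinusoids
    return W @ x

rng = np.random.default_rng(0)
for N in (256, 512, 1024):  # small illustrative sizes, not the lecture's
    x = rng.standard_normal(N)

    t0 = time.perf_counter()
    X_direct = dft_direct(x)
    t1 = time.perf_counter()
    X_fft = np.fft.fft(x)
    t2 = time.perf_counter()

    print(f"N={N}: direct {t1 - t0:.6f} s, FFT {t2 - t1:.6f} s")
```

Doubling N roughly quadruples the work of the direct version, while the FFT's work only slightly more than doubles.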

8:51

Okay, so in order to use the FFT,

we need the input signal to have a power-of-two length.

But we want to compute the spectrum of a signal of any length.

So this is the way we propose to compute the spectrum of a signal.

We first do zero-padding and

then we use what we call zero-phase windowing.

Okay, so let's go through this example.

We will start from a fragment of a sound, x,

that has a given length, let's say 401 samples.

9:25

Now we want to use the FFT, so we'll need to use a power of two,

so the next power of two will be 512.

So we'll add zeros; this next representation has zeros, but

it doesn't add them at the end.

It does it by splitting the signal through the middle,

and this is what we call zero-phase windowing.

The zero sample, which is the center sample, is at the left side of the buffer.

That's where the zero sample is.

Then we have the positive-time samples, followed by the zero-padding in the middle.

And then at the right side, we have the negative samples,

the samples that are at negative time.

10:15

So this is the way we will pack the signal in

what we call the FFT buffer before calling the FFT.
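
The packing described above can be sketched as follows, using the sizes from the example (401 samples, FFT size 512). The variable names `hM1`, `hM2`, and `fftbuffer`, and the symmetric Hann-shaped fragment, are assumptions chosen for this illustration.

```python
import numpy as np

M = 401   # length of the sound fragment (odd, so there is a center sample)
N = 512   # FFT size: the next power of two

hM1 = (M + 1) // 2   # 201: center sample plus the positive-time half
hM2 = M // 2         # 200: the negative-time half

# Hypothetical fragment: a symmetric Hann shape (assumption, not real audio).
x = np.hanning(M)

fftbuffer = np.zeros(N)
fftbuffer[:hM1] = x[hM2:]       # center sample at index 0, then positive time
fftbuffer[N - hM2:] = x[:hM2]   # negative-time samples at the end of the buffer
# The zeros stay in the middle, between the two halves.

X = np.fft.fft(fftbuffer)

# For a fragment that is symmetric around its center, the spectrum is
# real-valued, so the phase is zero (or pi where the value is negative).
print(np.max(np.abs(X.imag)))
```

Because this hypothetical fragment is symmetric around its center, the imaginary part of its spectrum is zero up to rounding, which is where the name zero-phase comes from.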

And if we compute the FFT of that and

then compute the magnitude spectrum in dB and the phase,

unwrapping the phase, we get this visualization in which we see

the symmetry of the magnitude spectrum, and we see it quite nicely.

We see it quite smoothly.

And for the phase, we see the odd symmetry of the phase and

we see a very smooth phase visualization, for two reasons:

because we did the zero-phase windowing and because we did the unwrapping.

So because of the zero-phase windowing,

basically we are getting rid of the linear phase shift

that would occur if we had not centered the samples around zero.

And of course, the unwrapping allows us to see this very smooth visualization.

Okay, so this is the last part of what I wanted to talk about.

So we have seen the DFT, we have seen the different properties,

so now we can put it together.

We do the analysis and synthesis with the DFT in

what we call an analysis/synthesis type of operation.

So we can start from a signal,

compute the FFT, and represent it correctly in magnitude and phase.

And since there are symmetries, we only need to show half of it,

the positive side.

So this is the positive side of the magnitude spectrum and

the positive side of the phase spectrum.

So the full spectrum was twice as long.

And then we can do the inverse Fourier transform from these and

reconstruct the original signal and it should be exactly the same.

So if we do things right, the input signal and

the output signal should have exactly the same values.
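
A minimal round-trip sketch, assuming a hypothetical real-valued random input: keep only the positive half of the magnitude and phase spectra, rebuild the full spectrum from the symmetries, and invert it.

```python
import numpy as np

# Hypothetical real input signal (assumption: random noise, N a power of two).
rng = np.random.default_rng(0)
x = rng.standard_normal(512)
N = len(x)

# Analysis: FFT, then keep the positive half (bins 0 .. N/2).
X = np.fft.fft(x)
hN = N // 2 + 1
mX = np.abs(X[:hN])                # magnitude spectrum (linear)
pX = np.unwrap(np.angle(X[:hN]))   # unwrapped phase spectrum

# Synthesis: rebuild the full spectrum using the symmetry of real signals,
# where the negative-frequency bins are the complex conjugates of the positive ones.
Y = np.zeros(N, dtype=complex)
Y[:hN] = mX * np.exp(1j * pX)
Y[hN:] = np.conj(Y[hN - 2:0:-1])

y = np.real(np.fft.ifft(Y))        # inverse transform back to the time domain

print(np.max(np.abs(y - x)))       # reconstruction error
```

The reconstruction error should be at the level of floating-point rounding, so the output matches the input for practical purposes.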

12:27

So we have seen this slide before.

This is just a slide giving some references and

credits. Information on the DFT

is available in many places, and about the Fast Fourier Transform too.

The sounds are from Freesound.

Again, there is the reference to Julius's DFT

information on his website, and the standard credits.

12:52

So with this lecture,

we complete the presentation of the Fourier transform properties

that are of relevance to our audio processing work.

In the next lecture, we will take this further and

start working with more complex sounds.

So I hope to see you next class.

Bye-bye.