0:00

Hello, welcome back to the course on audio signal processing for music applications.

This week, we're talking about the short-time Fourier transform.

And one of the fundamental elements of this transform is the analysis window.

So in this programming lecture I want to talk about windows but

from a programming perspective.

0:42

But since I have to do all this zero-phase windowing,

centering things around zero, and zero padding,

in the end the code is a little bit longer than would be desirable.

So let's go through the code and talk about the main aspects of it.

Well first, we have the packages that we need:

we import numpy, scipy and matplotlib.

Then we have the lines that relate to the window, so

we define the window length, in this case 63,

and the variable that keeps it is M, capital M.

Then we call the function get_window,

which has two parameters:

the name of the window and the size. The window I chose is the hanning window,

which is a raised cosine function.

And then I need the variables that keep the information about the middle of the window.

They have to handle the even and odd cases, so

the middle can be different depending on whether the size is even or odd.

So the two sides might have a different number of samples.
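
The window setup described here can be sketched as follows. This is a minimal sketch, not the lecture's script: the names `hM1` and `hM2` are assumptions chosen for illustration, and `'hann'` is the name SciPy documents for the hanning window (some releases also accepted `'hanning'`).

```python
import numpy as np
from scipy.signal import get_window

M = 63                       # window length in samples
w = get_window('hann', M)    # raised cosine window with M samples

# Number of samples on each side of the middle of the window.
# The two counts are equal for even M and differ by one for odd M.
hM1 = (M + 1) // 2           # middle sample plus the second half
hM2 = M // 2                 # first half

print(M, hM1, hM2)
```

With an even size such as M = 64 the two variables come out equal, which is the even and odd distinction mentioned above.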

Okay, then we need to prepare the window for computing the FFT.

So, to place the window at the zeroth location,

what we're going to do is create a buffer of the size of the FFT

that we want to compute, in this case 512 samples.

And then we place the window around the zeroth sample.

So we take the second half of the window and place it at the beginning and

the first half of the window and place it at the end.
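
The placement around the zeroth sample can be sketched like this. The variable names (`fftbuffer`, `hM1`, `hM2`) are assumptions for illustration; only the sizes 63 and 512 come from the lecture.

```python
import numpy as np
from scipy.signal import get_window

M, N = 63, 512                     # window size and FFT size
w = get_window('hann', M)
hM1, hM2 = (M + 1) // 2, M // 2    # samples on each side of the middle

fftbuffer = np.zeros(N)            # buffer of the size of the FFT
fftbuffer[:hM1] = w[hM2:]          # second half of the window goes to the beginning
fftbuffer[N - hM2:] = w[:hM2]      # first half of the window goes to the end

# The same placement written as a circular shift of the zero-padded window.
shifted = np.roll(np.concatenate([w, np.zeros(N - M)]), -hM2)
```

Everything between the two halves stays at zero, which is the zero padding.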

2:45

Then we can compute the spectrum of this buffer, okay.

So we will compute the spectrum of the buffer using the FFT algorithm and

then we convert the complex spectrum into absolute value and phase.

To convert the absolute value to decibels,

which is the way that we like to see the information of the magnitude,

we have to make sure that there are no zeroes, so

that we never take the log of zero.

So we will have to check if the absolute value is

below this epsilon value, the machine epsilon of Python floats, which is a very small positive number.

So that's why we have these three lines of code.
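
A minimal sketch of this step, assuming NumPy's `np.fft.fft` in place of whichever FFT routine the script imports, and `np.finfo(float).eps` as the epsilon value:

```python
import numpy as np
from scipy.signal import get_window

M, N = 63, 512
w = get_window('hann', M)
hM1, hM2 = (M + 1) // 2, M // 2
fftbuffer = np.zeros(N)
fftbuffer[:hM1] = w[hM2:]
fftbuffer[N - hM2:] = w[:hM2]

X = np.fft.fft(fftbuffer)          # complex spectrum of the buffer
absX = np.abs(X)                   # absolute value
eps = np.finfo(float).eps          # machine epsilon for 64-bit floats
absX[absX < eps] = eps             # clamp so that log10 never receives zero
mX = 20 * np.log10(absX)           # magnitude in decibels
```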

3:34

And then we compute the angle of the complex values.

And then, in order to display the spectrum better,

the magnitude and phase spectra, and

see it centered in the middle of the array,

we undo this zero-phase windowing so

that the data is placed back in the middle of the array,

where it's easier to visualize.
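
Computing the phase and moving the data back to the middle of the array can be sketched as below. The names `pX`, `mX1` and `pX1` are assumptions for illustration. For an even FFT size the half swap gives the same result as `np.fft.fftshift`.

```python
import numpy as np
from scipy.signal import get_window

M, N = 63, 512
w = get_window('hann', M)
hM1, hM2 = (M + 1) // 2, M // 2
fftbuffer = np.zeros(N)
fftbuffer[:hM1] = w[hM2:]
fftbuffer[N - hM2:] = w[:hM2]

X = np.fft.fft(fftbuffer)
mX = 20 * np.log10(np.maximum(np.abs(X), np.finfo(float).eps))
pX = np.angle(X)                   # phase of the complex values

hN = N // 2
mX1 = np.zeros(N)
pX1 = np.zeros(N)
mX1[:hN] = mX[hN:]                 # negative frequencies go first
mX1[hN:] = mX[:hN]                 # positive frequencies follow, main lobe in the middle
pX1[:hN] = pX[hN:]
pX1[hN:] = pX[:hN]
```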

Okay let's run it, so

we are in IPython with pylab loaded, so

we have the matplotlib package already

available and we can run the script.

Okay, that's it and now we can start plotting things that we have computed.

For example, we can plot the window, okay.

So, this is the hanning window, which is this raised cosine going from

zero to one over the 63 samples that we have computed.

4:45

Then we can plot how this window has been placed in the FFT buffer.

So we can plot the FFT buffer.

Okay, so here the window is centered at zero.

So we have the second half from zero to half of the window, and the first half

5:22

Okay so this is the magnitude spectrum again centered around zero so

here we have the positive frequency values.

And in the second part we have the negative values.

So a better way to display that is to move things around and

locate the main lobe in the center. That's what we did, and

mx1 has the information in the center.

5:52

And if we plot the phase, we can also show the location

of the phases and here what is important of course are the phases in the middle.

And here we have these two-pi discontinuities, but basically everything

is zero, modulo two pi, and that is clear.

It's good because that means we centered the window around zero and

therefore we have what we call a zero phase window.

Okay, in order to visualize the x axis better,

here I typed some commands to plot it

in a way that the axes are better shown.

So here what I did was to plot against an x axis array

that has been normalized so

that the zero padding does not affect it, so we divide by N and multiply by M.

So we actually see the samples, let's say, with respect to the window, and

I also have normalized the magnitude so that the maximum is zero decibels.

And I'm only plotting the values that go in the x axis from minus 20 to 20 and

the dB values, I'm plotting from minus 80 to zero.
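
The normalized axis and the 0 dB normalization can be sketched as follows. The names `freqaxis`, `mX_norm` and `visible` are assumptions for illustration. Since a circular shift of the buffer changes only the phase, this sketch takes the magnitude directly from the zero-padded window.

```python
import numpy as np
from scipy.signal import get_window

M, N = 63, 512
w = get_window('hann', M)

# Magnitude in dB with the main lobe moved to the middle of the array.
absX = np.maximum(np.abs(np.fft.fft(w, N)), np.finfo(float).eps)
mX1 = np.fft.fftshift(20 * np.log10(absX))

freqaxis = M * np.arange(-N // 2, N // 2) / N   # divide by N, multiply by M
mX_norm = mX1 - mX1.max()                       # the maximum becomes 0 dB

# Samples that fall inside an x range of -20 to 20.
visible = np.abs(freqaxis) <= 20
```

Plotting `mX_norm` against `freqaxis` with matplotlib, with the axis limits set to the ranges mentioned above, should give a view like the one described here.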

Now, if we run this script with these lines added,

well we are seeing the magnitude spectrum of the hanning window.

But with an x axis that shows the center around zero.

And decibels starting from zero.

So now we can check the values we talked about that describe the window.

We mentioned the main lobe.

The width of the main lobe is an important characteristic of the window.

Here we can see, if we look at the bottom

right side, where the x value of the cursor is shown,

that the dip at one edge of this main lobe is at minus two.

And this other one is at two, so clearly for this main lobe

the width is around four samples, or what we call four bins.

And this is what we mentioned about the hanning window.

And then also we talked about the side lobe level, the highest side lobe level.

And here again we can put the cursor at highest side lobe level and

it tells me roughly the value that this has.

And it's around minus 31 decibels, which is the kind of value that we mentioned.
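
Instead of reading the values off the cursor, the two dips and the highest side lobe can also be located numerically. This is a rough sketch under the same assumptions as the lecture (M = 63, N = 512); the dip search simply walks down from the peak until the magnitude stops decreasing, and all variable names are hypothetical.

```python
import numpy as np
from scipy.signal import get_window

M, N = 63, 512
w = get_window('hann', M)

absX = np.maximum(np.abs(np.fft.fft(w, N)), np.finfo(float).eps)
mX = np.fft.fftshift(20 * np.log10(absX))
mX -= mX.max()                                  # main lobe peak at 0 dB
freqaxis = M * np.arange(-N // 2, N // 2) / N   # x axis in bins of the window

center = int(np.argmax(mX))
right = center
while mX[right + 1] < mX[right]:                # walk right down to the first dip
    right += 1
left = center
while mX[left - 1] < mX[left]:                  # walk left down to the first dip
    left -= 1

main_lobe_width = freqaxis[right] - freqaxis[left]
highest_side_lobe = mX[right:].max()            # dB relative to the peak
print(freqaxis[left], freqaxis[right], main_lobe_width, highest_side_lobe)
```

Because the spectrum is only sampled at N points, the dips land on the samples nearest to minus two and two rather than exactly on them.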

8:49

Okay so now we can keep changing the window and

for example, instead of the hanning, we can put

the hamming and run the same script.

And okay that's a different window and

if we measure the main lobe width we see that it also has four samples,

but now the side lobe level is much lower.

So the highest side lobe is around minus 42 decibels,

which is the kind of thing we mentioned in the theory class, okay?

And if we go to another window, for example we go to the Blackman window,

9:38

Blackman window and we save it and run it.

Okay well, yes that's what we saw in the theory class and

now the main lobe is wider.

It's around six bins and

the side lobe level is much lower, around minus 58 decibels.

And then finally the last window that we also talked about,

this was the Blackman-Harris window.

Blackman-Harris window that we can compute.

And if we run it, okay.

So now, in fact we are not seeing the side lobes and this is because

the range that we are specifying doesn't even reach where the side lobe is.

So let's change the display.

And let's change the axis: instead of having minus 80,

let's put at least minus 100 so

that when we show it, we are seeing the side lobes.

In fact the highest side lobe is around minus 92 decibels.

And now the main lobe is quite a bit wider, around eight bins.

Okay so this is a good way to visualize and try to understand the windows.
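
One way to play around with the four windows in a single run is sketched below. The helper `lobe_stats` is hypothetical, not part of the lecture's script. It uses `np.fft.rfft` on the zero-padded window, which is enough here because only magnitudes are compared, and the window names are the ones SciPy's `get_window` documents.

```python
import numpy as np
from scipy.signal import get_window

def lobe_stats(name, M=63, N=512):
    """Return (main lobe width in bins, highest side lobe level in dB)."""
    w = get_window(name, M)
    absX = np.maximum(np.abs(np.fft.rfft(w, N)), np.finfo(float).eps)
    mX = 20 * np.log10(absX)
    mX -= mX[0]                      # the main lobe peak sits at bin 0
    k = 0
    while mX[k + 1] < mX[k]:         # walk down to the first dip
        k += 1
    return 2 * k * M / N, mX[k:].max()

for name in ('hann', 'hamming', 'blackman', 'blackmanharris'):
    width, side = lobe_stats(name)
    print(f'{name:15s} width about {width:.1f} bins, side lobe about {side:.1f} dB')
```

The measured widths are approximate, since the dips fall between the samples of the 512-point spectrum.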

And so please feel free to play around with that.

11:14

And this is another script.

This is much shorter, because instead of calling the FFT and

worrying about all this centering around zero,

and worrying about the zero padding etc., I directly call the dftAnal

function that we have in the sms-tools directory.

So now what I do is I specify the directory of

the models, where I have all the models of sms-tools, and

I import the dftModel file, okay?

So that I can call this dftAnal function, which receives as input the signal,

the window, and the FFT size, and automatically does all

the operations that we just showed in the previous script.
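
To make those operations concrete, here is a hedged sketch of what such an analysis function might look like. It is not the sms-tools source: `dft_anal_sketch` is a hypothetical name, and details such as normalizing the window by its sum and unwrapping the phase are assumptions.

```python
import numpy as np
from scipy.signal import get_window

def dft_anal_sketch(x, w, N):
    """Positive-frequency magnitude (dB) and phase of x windowed by w (hypothetical helper)."""
    M = w.size
    hM1, hM2 = (M + 1) // 2, M // 2
    hN = N // 2 + 1                          # size of the positive half, including Nyquist
    xw = x * (w / w.sum())                   # window the signal (window normalized, assumed)
    fftbuffer = np.zeros(N)
    fftbuffer[:hM1] = xw[hM2:]               # zero-phase placement
    fftbuffer[N - hM2:] = xw[:hM2]
    X = np.fft.fft(fftbuffer)
    absX = np.maximum(np.abs(X[:hN]), np.finfo(float).eps)
    mX = 20 * np.log10(absX)                 # magnitude in dB
    pX = np.unwrap(np.angle(X[:hN]))         # unwrapped phase
    return mX, pX

# Usage with values in the spirit of the lecture (N = 512 is an assumption).
fs, f, M, N = 44100, 5000.0, 101, 512
x = np.cos(2 * np.pi * f * np.arange(M) / fs)
mX, pX = dft_anal_sketch(x, get_window('hamming', M), N)
```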

Okay, so in this script basically what I do is I generate

a sine wave with a given frequency.

In this case, the frequency is 5,000 hertz,

with a sampling rate of 44,100, of course, and

I compute 101 samples of the sinusoid, which is basically

going to be the same length as the window size I am using.

For example, in this case I am using the hanning window, but

maybe let's change it and let's use, for

example the hamming window.

12:55

Okay, and now we compute the analysis.

And here I also set up the display of the output so

that the x axis is expressed in hertz.

So we're going to display only the positive part,

because dftAnal only computes the positive part of the spectrum.

Then I will be displaying it in hertz.
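
A minimal sketch of this display setup, with `np.fft.rfft` standing in for dftAnal since it also returns only the positive part of the spectrum. The FFT size of 512 and the window normalization are assumptions, and the variable names are hypothetical.

```python
import numpy as np
from scipy.signal import get_window

fs = 44100                 # sampling rate in Hz
f = 5000.0                 # frequency of the sinusoid in Hz
M = 101                    # number of samples, same as the window size
N = 512                    # FFT size (assumed)

x = np.cos(2 * np.pi * f * np.arange(M) / fs)
w = get_window('hamming', M)

# Positive half of the spectrum only, in dB.
absX = np.maximum(np.abs(np.fft.rfft(x * w / w.sum(), N)), np.finfo(float).eps)
mX = 20 * np.log10(absX)

freq_hz = fs * np.arange(mX.size) / N    # x axis in hertz, from 0 to fs/2
peak_hz = freq_hz[np.argmax(mX)]
print(peak_hz)
```

The peak lands on the FFT bin closest to 5,000 hertz, so it can be off by up to half a bin, which is fs/N divided by two.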

So let's execute this file, test1.

13:30

Okay, and this is the spectrum of the sinusoid that I have analyzed.

And this is the shape of the window, in fact.

So this proves what we talked about in theory.

That when we analyze a sine wave, and we multiply it by a given window,

in fact, what we see in the spectrum is the transform of the window,

centered at the frequency of the sinusoid, okay?

5,000 hertz, and with the amplitude of the sinusoid.

The amplitude in this case was normalized, so it is not an important thing here,

but we would also see the amplitude of the sinusoid.

And of course we see the standard characteristics of the window and

here the x axis is in hertz.

So we need to understand the relationship between

frequency in hertz and all these other values, the bins that we talked about.
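
The relationship can be written down as a small sketch. The helper names are hypothetical and N = 512 is an assumption. Note that there are two kinds of bins in play: the bins of the FFT, spaced fs/N apart, and the bins used to describe the windows earlier, which were normalized by the window size and are therefore spaced fs/M apart.

```python
fs = 44100      # sampling rate in Hz
M = 101         # window size in samples
N = 512         # FFT size (assumed)

def bin_to_hz(k, fs, N):
    """Frequency in Hz of FFT bin k."""
    return k * fs / N

def hz_to_bin(f, fs, N):
    """Fractional FFT bin that corresponds to frequency f in Hz."""
    return f * N / fs

sinusoid_bin = hz_to_bin(5000.0, fs, N)   # a bit above bin 58
main_lobe_bins = 4                        # width quoted for the hanning and hamming windows
main_lobe_hz = main_lobe_bins * fs / M    # the same width expressed in hertz
print(sinusoid_bin, main_lobe_hz)
```

So a main lobe that is four bins wide covers roughly 1.7 kHz with a 101-sample window at this sampling rate, and it gets narrower in hertz as the window gets longer.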

And of course, if we change the window and

we use another one that we mentioned,

the Blackman-Harris, okay, so

now we do this computation, and now we are seeing

again the Blackman-Harris window, its magnitude spectrum, but

centered at the frequency of the sinusoid.

Okay, so that's basically all I wanted to talk about in this lecture.

So we have been using some packages from Python and also of course,

we have been using sms-tools in order to get the DFT analysis code.

15:28

And that's all, so we have been focusing on one aspect of the short-time

Fourier transform, which is the analysis window, a fundamental element.

And unless we understand how windows work it's going to be very difficult to

understand how spectrograms and how the short-time Fourier transform and

other spectral analysis systems work.

So please make sure that you understand the concept of windowing and

how it shows up, either by itself, when we see

the spectrum of the window, or when we window a particular signal,

and therefore how we see the spectrum of

a particular signal that has been windowed in a certain way.