0:18

many sounds that are not well modeled with sinusoids or

with a harmonic analysis approach.

There are signals that are very noisy,

what we call stochastic, that require a different modeling approach.

0:36

This modeling approach is based more on statistical signal processing, on

describing the sound from a probabilistic point of view.

In this programming class, I want to go over that from a programming perspective,

so I'll present the models that we have been implementing in sms-tools,

which implement this idea of stochastic modeling.

1:02

So we'll basically be talking about this block diagram, in which we

start from a signal and then, after windowing it and computing the FFT,

we just keep the magnitude spectrum. In stochastic signals,

the phase spectrum is not relevant.

And so the approximation is based on the magnitude spectrum,

which is then approximated with a smooth curve, and

this smooth curve is what we call the stochastic model.

And from that we can reconstruct, or we can regenerate a signal,

1:39

by sort of reconstructing a magnitude spectrum and

then generating phases randomly so that we can create another complex spectrum

that captures some of the essence of the original sound.

And then, by overlap-adding, we can generate a signal that,

if the original signal had these stochastic properties, will hopefully be

quite similar to the original sound.

So let's go to the text editor, where I wrote a very short

script that implements this idea of

approximating a fragment of a sound with this stochastic model.

Okay. So we start by importing a few

packages that we need.

And then the core of the code is basically reading a sound file,

this is the ocean sound file that we have already been using,

and it's clearly a stochastic type of signal.

And then, in order to analyze its spectrum, we have to define

the FFT size and the window size. For this type of sound and

this type of situation, there is no need to zero-pad.

There is no need to have odd-sized windows.

We have M, the window size, and N, the FFT size,

set to the same value, which is 256.

Then the key parameter is what we call the stochastic factor, which

is basically the smoothing factor or the downsampling factor.

So the 0.2 that we defined in this case

means that we are basically reducing the information to a fraction of

0.2 of the original and keeping only the coarse, surface-level information.

3:43

Then we get the window.

In this case, we get a Hamming window, but

it can be any type of window; it doesn't have to be very complex.

In fact, I would say that maybe it would be easier to just use the Hanning,

3:57

because it is even simpler, and it works.

Okay, then we need to select the portion of the sound we are going to analyze.

In this case we chose a section starting at sample 10,000 and

extending for the window size, and we multiplied it by the window.
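The setup described up to this point can be sketched roughly as follows. This is only an illustration under stated assumptions: white noise stands in for the ocean recording (no sound file is read), and the variable names `M`, `N`, `stocf`, `w` and `xw` are chosen for the sketch rather than taken from the script shown in the lecture.

```python
import numpy as np
from scipy.signal import get_window

# Assumption: one second of white noise replaces the ocean recording.
rng = np.random.default_rng(0)
fs = 44100
x = rng.standard_normal(fs)

M = N = 256                    # window size and FFT size are equal, so no zero-padding
stocf = 0.2                    # stochastic (downsampling) factor
w = get_window('hamming', M)   # any smooth window would do here

start = 10000                  # first sample of the fragment to analyze
xw = x[start:start + M] * w    # windowed fragment of M samples
```
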

4:31

And we do it by using this Python function called resample.

Resample is a function that basically uses an FFT approach,

so it uses a spectral analysis approach to downsample a signal,

or to resample a signal.

So we can downsample or upsample a signal.

So we give it an array, in this case mX, and

we just make sure there is nothing below -200 decibels.

5:04

It wouldn't make any sense.

And we specify the new length that we want to create.

So the original length was of size N over 2,

because we're only using half of the spectrum.

Okay, so from N over 2 samples, we are going to generate an approximation with N over

2 multiplied by this stochastic factor samples.

So basically, if this is 0.2,

you're going to have one-fifth of the samples that we started with.
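The downsampling step can be sketched as below. It is a minimal example, assuming white noise in place of the ocean sound and using `scipy.signal.resample`, which resamples with an FFT-based method. The names `mX` and `mXenv` follow the lecture; the small `eps` added before the logarithm is an extra guard introduced for this sketch.

```python
import numpy as np
from scipy.fft import fft
from scipy.signal import get_window, resample

# Assumption: white noise replaces the ocean recording.
rng = np.random.default_rng(0)
x = rng.standard_normal(44100)

M = N = 256
stocf = 0.2
w = get_window('hamming', M)
xw = x[10000:10000 + M] * w

X = fft(xw)
# dB magnitude of the positive half of the spectrum (N/2 = 128 bins);
# eps avoids log10(0).
mX = 20 * np.log10(np.abs(X[:N // 2]) + np.finfo(float).eps)

# Clip at -200 dB, then downsample to stocf * N/2 points (25 here).
mXenv = resample(np.maximum(-200, mX), int(stocf * N / 2))
```

With these values, `mX` has 128 points and `mXenv` has 25, which matches the sizes quoted in the lecture.
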

Okay, let's run this little script.

So we will run test.

And we can plot the signal that we start with,

the windowed fragment that we start with.

So this is the fragment of the ocean sound already windowed by a Hamming window.

6:02

Okay, and then we are computing the spectrum of that.

So the spectrum is this mX array.

Okay, so this is the magnitude spectrum of that.

And now we can plot on top of that the smooth approximation, or

the downsampled approximation, of that, which is mXenv.

But in order to make sure that it's shown on top of this array,

which is longer, we have to plot it differently.

So we have to say

np.arange of mXenv.size,

the size of that.

And then if we divide by the stochastic factor,

this should stretch these points

to the size that we had for mX.
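The stretching of the horizontal axis can be illustrated with the index arrays alone. This sketch only computes the x positions; the names `bins_mX` and `bins_env` are hypothetical and used just for the illustration.

```python
import numpy as np

N = 256
stocf = 0.2
n_bins = N // 2                       # 128 points in mX
n_env = int(stocf * n_bins)           # 25 points in mXenv

bins_mX = np.arange(n_bins)           # x positions for the full spectrum
bins_env = np.arange(n_env) / stocf   # x positions for the envelope, spaced 5 bins apart

# With matplotlib one could then plot both curves on the same axis, e.g.
# plt.plot(bins_mX, mX) followed by plt.plot(bins_env, mXenv).
```

The last envelope point lands at bin 120, so the 25 points span almost the same range as the 128 bins of the full spectrum.
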

7:04

Okay, so yes.

So here we show in green the approximation, the smooth approximation.

Of course it has fewer samples: mXenv has a size of 25 samples.

So from the 128 samples that the positive

side of this magnitude spectrum has,

we have reduced it to 25.

Okay?

Of course, if we change the stochastic factor,

we will have more or fewer samples in the approximation.

But clearly this green line is an approximation to the blue line.

7:51

Okay, so this is the kind of stochastic approximation that

we're going to be using.

And now let's go to this other script.

And what we will be doing is actually synthesizing a sound from this.

So we have the same first part up until here, okay, so we generate the envelope.

And now what we're going to do is to resample again

8:37

And then to create the whole complex spectrum, we will have to

deal with the symmetry, so we will recreate the magnitude and phase for

the negative frequencies and of course convert it to complex numbers.

So the positive complex spectrum would be 10 to the power

of the magnitude in dB divided by 20, multiplied by the complex exponential of the phase.

9:06

Okay, well this is of course the conversion from dB to linear scale.

And then for the negative part, we do the same thing but

we reverse the order and negate the phases.

And then, we just take the inverse FFT of the complex spectrum.
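The single-frame synthesis can be sketched as follows. It is a sketch under assumptions, not the script from the lecture: white noise replaces the ocean sound, and the DC phase is forced to zero (a choice made here) so that the spectrum is exactly Hermitian and the inverse FFT is real up to rounding error.

```python
import numpy as np
from scipy.fft import fft, ifft
from scipy.signal import get_window, resample

# Assumption: white noise replaces the ocean recording.
rng = np.random.default_rng(0)
x = rng.standard_normal(44100)

M = N = 256
hN = N // 2
stocf = 0.2
w = get_window('hamming', M)
xw = x[10000:10000 + M] * w

# Analysis: dB magnitude spectrum and its downsampled envelope.
mX = 20 * np.log10(np.abs(fft(xw)[:hN]) + np.finfo(float).eps)
mXenv = resample(np.maximum(-200, mX), int(stocf * hN))

# Synthesis: upsample the envelope and pair it with random phases.
mY = resample(mXenv, hN)             # back to N/2 points
pY = 2 * np.pi * rng.random(hN)      # random phases in [0, 2*pi)
pY[0] = 0                            # assumption of this sketch: keep the DC bin real

Y = np.zeros(N, dtype=complex)
Y[:hN] = 10 ** (mY / 20) * np.exp(1j * pY)                      # positive frequencies, dB to linear
Y[hN + 1:] = 10 ** (mY[:0:-1] / 20) * np.exp(-1j * pY[:0:-1])   # mirrored order, negated phases
y = np.real(ifft(Y))                 # synthesized frame of N samples
```

The waveform `y` will not resemble `xw` sample by sample; only the overall shape of the magnitude spectrum is preserved.
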

So let's run this, this is Test1 okay?

And in here we have the magnitude spectrum that we started with, mX.

Okay so this is the original spectrum.

And now we can plot on top of that the approximated spectrum,

upsampled to have the same size.

In fact, mY has the same size as this.

So now this green line started from the samples that we showed before.

So it was a more jagged line and had fewer points.

And now it has been converted to a smooth version

that has the same size as the original spectrum.

And of course the phase spectrum, if we plot pY,

this is the phase spectrum that we generated,

which is just random numbers from 0 to 2 pi.

And by taking the inverse FFT, we can just obtain the output signal.

So this is the synthesized signal from that approximated spectrum.

Of course this is different from the xw that we have started from.

10:50

So now the green is the original signal we started from.

Clearly very different, but that's not the point here.

Because, being a stochastic signal,

the shape in the time domain is not that relevant.

What is important is this general distribution in the magnitude spectrum.

11:09

Okay, so now let's look at how the actual analysis and

synthesis work for longer sounds.

So in here, we have the stochasticModel.py file,

in which we have different functions that implement

the analysis of a complete stochastic signal and

the synthesis from the analyzed stochastic representation.

And basically they do what I have done for

a single frame. The stochasticModelAnal function starts from the signal,

the hop size, the FFT size, and the stochastic factor.

And it iterates over the whole sound.

And it does what we just did: it performs the FFT.

It finds the dB magnitude spectrum, then it uses the resample function to

downsample it to fewer samples, depending on the stochastic factor.

And it keeps appending these to a sequence of envelopes.

Because we're going to have a time-varying envelope for

the stochastic representation.

And of course, the synthesis does, again, what I showed before:

it has to upsample the magnitude spectrum to the complete size,

it has to generate the random phases, and it has to convert to the complex spectrum.

And then it overlap-adds the frames to recover the whole signal.
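The frame-by-frame analysis and the overlap-add synthesis can be sketched as two small functions. These are hypothetical helpers written for illustration (`stochastic_analysis` and `stochastic_synthesis` are not the sms-tools functions), they assume white noise as input and a Hann window, and they make no attempt to normalize the output amplitude.

```python
import numpy as np
from scipy.fft import fft, ifft
from scipy.signal import get_window, resample

def stochastic_analysis(x, H, N, stocf):
    """Return one downsampled dB envelope per frame (hypothetical helper)."""
    hN = N // 2
    w = get_window('hann', N)
    n_frames = (x.size - N) // H + 1
    env = np.empty((n_frames, int(stocf * hN)))
    for i in range(n_frames):
        xw = x[i * H:i * H + N] * w
        mX = 20 * np.log10(np.abs(fft(xw)[:hN]) + np.finfo(float).eps)
        env[i] = resample(np.maximum(-200, mX), env.shape[1])
    return env

def stochastic_synthesis(env, H, N, rng):
    """Overlap-add frames built from the envelopes and random phases (hypothetical helper)."""
    hN = N // 2
    ws = get_window('hann', N)
    y = np.zeros((env.shape[0] - 1) * H + N)
    for i, frame in enumerate(env):
        mY = resample(frame, hN)               # upsample envelope to N/2 bins
        pY = 2 * np.pi * rng.random(hN)        # random phases
        pY[0] = 0                              # keep the DC bin real
        Y = np.zeros(N, dtype=complex)
        Y[:hN] = 10 ** (mY / 20) * np.exp(1j * pY)
        Y[hN + 1:] = 10 ** (mY[:0:-1] / 20) * np.exp(-1j * pY[:0:-1])
        y[i * H:i * H + N] += ws * np.real(ifft(Y))   # overlap-add
    return y

# Usage with assumed values: hop size 128, FFT size twice the hop size.
rng = np.random.default_rng(0)
x = rng.standard_normal(44100)     # assumption: noise instead of the ocean sound
H = 128
env = stochastic_analysis(x, H, 2 * H, 0.2)
y = stochastic_synthesis(env, H, 2 * H, rng)
```

The result `env` holds one row per frame and 25 columns, which is the time-varying envelope, and `y` is the resynthesized signal.
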

So let's call this for a particular sound.

So in here I wrote a little script that takes that ocean sound and

calls stochasticModelAnal, so it analyzes the whole sound.

Okay, and it uses a hop size of 128.

The FFT size is twice that; instead of specifying an FFT size,

the default is normally to just take twice the hop size.

And then there is the stochastic factor, which is 0.2.

So now let's run this test2.

13:23

Okay, now we have created stocEnv, and if we look at the shape of that,

it's a matrix that has 2,217 frames.

And each frame has 25 points, which are the points of the approximation.

And we can look at the individual frames,

but basically what we normally do is plot

it with pcolormesh.

And that will display this array in a kind of colored 3D type of plot.

Of course, this is shown in the wrong orientation;

we are accustomed to seeing time on the horizontal axis.

So what we normally do is show

it with np.transpose.

Okay, so this will flip the plot and

it will show the horizontal axis as time and

the vertical axis as frequency.
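A minimal plotting sketch of this step is shown below. It assumes random data with the shape quoted in the lecture (2,217 frames of 25 points) rather than a real analysis result, the array name `env` is chosen for the sketch, and a non-interactive backend is selected so that it runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')   # non-interactive backend; drop this line to show a window
import matplotlib.pyplot as plt

# Assumption: random values stand in for the analyzed envelopes.
rng = np.random.default_rng(0)
env = rng.standard_normal((2217, 25))   # frames x envelope points

fig, ax = plt.subplots()
# Transposing puts frames (time) on the horizontal axis and
# envelope points (frequency) on the vertical axis.
mesh = ax.pcolormesh(np.transpose(env))
ax.set_xlabel('frame index (time)')
ax.set_ylabel('envelope point (frequency)')
```
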

Okay, so this is the plot of the spectral

representation of these coefficients of the stochastic representation.

14:59

function that basically does this stochastic analysis and

synthesis for a complete sound and at the same time it plots it nicely.

This is the code that is used from the interface and

it plots a nice spectrogram of the stochastic representation.

Same here.

We have this main function that has some parameters by default.

It reads a sound.

It performs the analysis we just did.

And it does the synthesis from the whole analysis.

15:37

Then it outputs a sound file and then it plots the different components,

the input sound, the stochastic representation and output sound.

So let's run this,

run stochasticModel_function.py, okay.

And it plots the input sound, the stochastic approximation, and

the output sound.

And of course, now it has saved the file,

called ocean_stochasticModel.wav, and

I can just play it by typing play and ocean, okay.

[NOISE].

Okay, so this has played the complete sound.

And as we showed before, it's quite similar; even though it's not identical,

it definitely sounds like an ocean sound.

It has lost some of the low frequencies, though,

so we might have to change the stochastic approximation a little bit in order to

recover the complete sound a little bit better.

17:02

So we talked about this stochastic model implementation that we have

in the sms-tools package.

And of course, we use quite a few Python functions,

those from NumPy and SciPy, and also the plotting of matplotlib.

17:21

So we have seen how to implement

the stochastic model that we talked about in the theory class.

And this will be one important component that, in the next classes,

we will incorporate into the harmonic or sinusoidal plus stochastic representation.

So we will be using this model to analyze

the residuals that we obtain by subtracting the sinusoids or the harmonics.

So I hope to see you next lecture.

Bye, bye.