Data repositories in which cases are related to subcases are identified as hierarchical. This course covers the representation schemes of hierarchies and the algorithms that enable analysis of hierarchical data, and provides opportunities to apply several methods of analysis.

Associate Professor at Arizona State University in the School of Computing, Informatics & Decision Systems Engineering and Director of the Center for Accelerating Operational Efficiency

K. Selcuk Candan

Professor of Computer Science and Engineering and Director of ASU’s Center for Assured and Scalable Data Engineering (CASCADE)

Okay. So, let's basically focus on the first question.

The first question is: how can we quantify

the similarity or difference between time series?

So, the first alternative we have is to use Euclidean distance.

If you remember from earlier lectures,

we had discussed Euclidean distance in the context of vectors.

We had said that basically, if we are given two points in a vector space,

Euclidean distance can be measured using a formula that looks like this, right?
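
The formula shown at this point is not reproduced in the transcript. As a reference, the standard Euclidean distance between two points in an n-dimensional vector space can be written as below; the symbols x, y, and n are notation assumed here, not taken from the lecture.

```latex
d(x, y) = \sqrt{\sum_{i=1}^{n} \left( x_i - y_i \right)^2}
```

For time series, x_i and y_i would be the two observations at the i-th point on the timeline.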

Essentially, what we are saying here is that we can use

the same formula to measure also the distance or difference between time series.

So, what I can do is I can basically look at each and every point,

each and every day or each and every year,

each and every month along my timeline.

For each and every point,

I can measure the difference between two given time series.

So, I can do that for each and every point I have on my timeline.

For each and every observation,

I can measure the difference between the given two time series.

Then after that, to compute the Euclidean distance,

essentially what I have to do is I need to square these differences,

I need to add them up and I need to take the square root of it, right?
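
The three steps just described (square the pointwise differences, add them up, take the square root) can be sketched in Python as follows. This is a minimal illustration, assuming the two series are equal-length lists of numbers sampled at the same time points; the function name `euclidean_distance` is a hypothetical choice, not something from the lecture.

```python
import math


def euclidean_distance(series_a, series_b):
    """Euclidean distance between two equal-length time series."""
    if len(series_a) != len(series_b):
        raise ValueError("both series must have the same number of observations")
    # Square the difference at each time point and add the squares up.
    total = sum((a - b) ** 2 for a, b in zip(series_a, series_b))
    # Take the square root of the sum.
    return math.sqrt(total)
```

For example, `euclidean_distance([0, 0], [3, 4])` gives `5.0`, since the squared differences are 9 and 16.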

So, this essentially is a very commonly used measure

to sort of compare two different time series.

If I do that, I will see that basically once again before 2011,

if I do it up to 2011,

I will see that the Euclidean distance between big data and deep learning is very small.

But after 2011, if I do the same thing,

I will see that the difference basically between big data and deep learning,

the Euclidean distance between them is going to be very

large because I am basically finding the differences.

Differences are very large and I'm adding them up.

Of course, if I add up very large values,

the distance is going to be very large.
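
To make the before-and-after comparison concrete, here is a small sketch that computes the distance separately up to 2011 and after 2011. The yearly values below are invented purely for illustration and are not the actual curves from the lecture; the variable names are also hypothetical.

```python
import math


def euclidean_distance(series_a, series_b):
    """Square the pointwise differences, add them up, take the square root."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(series_a, series_b)))


years = list(range(2008, 2016))  # 2008 through 2015

# Hypothetical yearly scores, made up for this example only.
big_data = [1, 1, 2, 3, 20, 45, 70, 90]
deep_learning = [1, 2, 2, 3, 4, 6, 10, 20]

# Split the timeline so that 2011 belongs to the "before" part.
cut = years.index(2011) + 1

before = euclidean_distance(big_data[:cut], deep_learning[:cut])
after = euclidean_distance(big_data[cut:], deep_learning[cut:])

print("up to 2011:", before)
print("after 2011:", after)
```

With these made-up numbers, the two series nearly coincide up to 2011, so the first distance is 1.0, while the large pointwise gaps after 2011 add up to a distance of roughly 101.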

So, this is one of the most commonly used

measures to compare time series: Euclidean distance.