>> So you've seen the UC San Diego learners discuss the timings we were seeing and think about whether an actual machine running insertion sort on bigger and bigger data could really produce those timings.
So let's tie the timings that we got from running the code back to the analysis that we did earlier in the course, where we thought about the asymptotic behavior of various sorting algorithms and procedures.
And, in particular we're thinking about insertion sort.
And when we analyzed insertion sort using big O notation,
we saw that on average, and in the worst case,
insertion sort has quadratic behavior.
It's big O of n squared.
But when we look at this graph, that doesn't seem to jibe, right?
The graph looks very linear, it's going up at a constant rate.
It doesn't curve upward like a parabola, so it appears linear and not quadratic.
And so we're trying to understand how is this possible.
And one clue comes from looking at that other cell in the table and
seeing that there is a case when insertion sort can behave like a linear algorithm.
And so let's think back to when we might be in that best case performance,
that best case scenario.
And to think about that,
we have to go back to the code where we actually implement insertionSort.
And we realize that the variations in performance,
the variations in timing for the insertion sort algorithm come from this while loop.
And this while loop executes a different number of times
based on how organized or unorganized the data is to begin with.
So, if we start with almost sorted or fully sorted data,
then this while loop just checks whether the current element that
we care about is in its correct location relative to the other elements, and
if it is, we don't need to do anything.
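To make that concrete, here is a minimal sketch of an insertionSort implementation in Java; the class and variable names are just illustrative, not necessarily the exact code from the course, but the commented while loop is the one we're talking about:

```java
public class InsertionSortDemo {
    // Sorts the array in place using insertion sort.
    public static void insertionSort(int[] values) {
        for (int pos = 1; pos < values.length; pos++) {
            int current = values[pos];
            int j = pos - 1;
            // This while loop is where the running time varies:
            // on already-sorted data the condition fails immediately,
            // so each pass does constant work and the whole sort is linear.
            while (j >= 0 && values[j] > current) {
                values[j + 1] = values[j]; // shift the larger element right
                j--;
            }
            values[j + 1] = current; // drop the current element into place
        }
    }
}
```

On sorted input the while condition is false on the very first check for every position, so the total work is proportional to n rather than n squared.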
And so, if all of the tests that we did to create that timing chart we saw before
were on really, really big, but sorted data,
then we could very well have gotten that linear behavior that we saw, because the
insertion sort algorithm is super fast on sorted or nearly sorted data.
So indeed, that timing chart we saw could be the result of running plain old
insertion sort on really big data, as long as that data happened to be sorted or nearly sorted.
And so the moral of this story is that when we look at timing data, we have to be
careful about the conclusions we draw from it.
And we want to be cautious about trying to deduce the big O complexity
class of an algorithm based on timing alone.
That's not enough.
Usually we need to understand what the data we were testing on looks like,
and whether there are other factors at play as well.
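As a rough illustration of that caution, here is a sketch of a timing experiment, assuming the hypothetical InsertionSortDemo class sketched above, that runs the same algorithm on sorted versus random input; the two sets of timings grow very differently even though it's the same code:

```java
import java.util.Random;

public class TimingDemo {
    public static void main(String[] args) {
        Random rand = new Random(42);
        for (int n = 10_000; n <= 80_000; n *= 2) {
            int[] sortedInput = new int[n];
            int[] randomInput = new int[n];
            for (int i = 0; i < n; i++) {
                sortedInput[i] = i;              // already in order: best case
                randomInput[i] = rand.nextInt(); // unordered: average case
            }
            long t0 = System.nanoTime();
            InsertionSortDemo.insertionSort(sortedInput);
            long t1 = System.nanoTime();
            InsertionSortDemo.insertionSort(randomInput);
            long t2 = System.nanoTime();
            System.out.printf("n=%,d  sorted: %.1f ms  random: %.1f ms%n",
                    n, (t1 - t0) / 1e6, (t2 - t1) / 1e6);
        }
    }
}
```

This is only a sketch: real benchmarks would also need to account for things like JIT warm-up and repeated trials, which is exactly the kind of "other factors at play" that make raw timings a shaky basis for big O claims.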