So you can ask yourselves the same question again.

Is that enough?

Is that always going to give us one unique consensus, one unique ranking of web pages?

And the answer now is yes, that is enough, and it will always guarantee that we have a unique solution with the PageRank algorithm.

And taking those two things into consideration does give us the PageRank scheme.

And so the process we presented before, which accounts for dangling nodes and disconnected graphs, forms the basis for our page ranking.

As with any company, you can assume they use a slightly different approach and have optimized it a lot over time.

But that's the basic idea of how they go about ranking the web pages.

Notice how difficult their problem is, though, because there are billions and billions of those web pages.

So they have to be able to do the computation very quickly.

And they have to, basically, deal with lots of large data sets and

large data objects like matrices.

And they have to be able to do computations on them very quickly, and update those computations quickly.

So let's take this graph again, just for a second, to illustrate exactly how that calculation would go with PageRank.

The first thing we notice again is that V is a dangling node down here.

So what we would do is assume there's a link coming up here, that there's a link here, a link to this guy, and a link to x. And that there's also a self-link back to V.
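That dangling-node fix can be sketched numerically. Here is a minimal example with a hypothetical five-page graph (the link pattern is made up for illustration): a dangling page's empty column in the link matrix is replaced by uniform links to every page, including itself.

```python
import numpy as np

# Hypothetical five-page web; entry (i, j) = 1 means page j links to page i.
# The last column is a dangling node like V: it links to no one.
A = np.array([
    [0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
], dtype=float)

n = A.shape[0]
col_sums = A.sum(axis=0)

# Normalize each column to sum to 1; a dangling column (sum 0) is
# replaced by uniform probability 1/n over all pages, itself included.
denom = np.where(col_sums == 0, 1.0, col_sums)
S = np.where(col_sums > 0, A / denom, 1.0 / n)
```

After the fix, every column of `S` is a proper probability distribution over pages.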

And this graph doesn't actually have disconnected components here, but if there were a disconnected component, maybe we could assume that there are nodes over here, like A and B, for instance.

And if they're connected to each other, going back and forth, then that other randomization aspect would basically connect them and give them a pseudo-connection to each of the other guys, through that randomization.

Say that there is a way to get there. It's just that you have to go through your browser, or the random surfer has to go through the browser to get there, in our analogy.

And we could do this for every node.

So then every node has, in a sense, a connection to every other node,

as we said.

So as long as we account for dangling nodes and disconnected graphs,

then we can be guaranteed that that procedure will have a unique solution.
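Putting both fixes together, the procedure can be sketched as a power iteration on the "Google matrix": blend the patched link matrix with a uniform jump matrix, then repeatedly apply it to any starting distribution. The small column-stochastic matrix below is illustrative only (dangling nodes assumed already patched as above).

```python
import numpy as np

n = 5
# Illustrative column-stochastic link matrix; each column sums to 1.
S = np.array([
    [0.0, 0.0, 1.0, 0.0, 0.2],
    [0.5, 0.0, 0.0, 0.0, 0.2],
    [0.5, 0.5, 0.0, 0.0, 0.2],
    [0.0, 0.5, 0.0, 0.0, 0.2],
    [0.0, 0.0, 0.0, 1.0, 0.2],
])

alpha = 0.85  # follow links 85% of the time, jump at random 15%
# Every entry gets a share of the random jump, so the chain is
# irreducible and the stationary vector is unique.
G = alpha * S + (1 - alpha) * np.ones((n, n)) / n

# Power iteration: start from any distribution and apply G repeatedly.
r = np.ones(n) / n
for _ in range(100):
    r = G @ r
# r is now (numerically) the unique PageRank vector.
```

Because the randomization makes every page reachable from every other, the iteration converges to the same `r` no matter which starting distribution you pick.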

So now the question is, how much randomization should we have?

So in that random surfer philosophy, how many times should we just be following the web graph itself versus how many times should we be jumping to random pages?

Including that will have an effect on the algorithm itself and how quickly it can approach its solution.

PageRank itself assumes a 15% randomization.

And so what that's basically saying is that 15% of the time we're going to rely on the randomness, or just random URLs that we jump through, when we use the PageRank process.

And the other 85% of the time,

we're going to rely on the connectivity in the web graph.

And there's a trade-off there
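That trade-off can be seen numerically: the heavier the randomization (smaller link-following fraction), the faster the power iteration converges, but the less the ranking depends on the actual link structure. A small experiment, using the same illustrative five-page matrix as before (all values hypothetical):

```python
import numpy as np

def iterations_to_converge(S, alpha, tol=1e-8):
    """Power-iterate the Google matrix until the rank vector stops moving."""
    n = S.shape[0]
    G = alpha * S + (1 - alpha) / n  # (1 - alpha)/n added to every entry
    r = np.ones(n) / n
    for k in range(1, 10_000):
        r_next = G @ r
        if np.abs(r_next - r).sum() < tol:
            return k
        r = r_next
    return k

# Illustrative column-stochastic link matrix (made-up five-page web).
S = np.array([
    [0.0, 0.0, 1.0, 0.0, 0.2],
    [0.5, 0.0, 0.0, 0.0, 0.2],
    [0.5, 0.5, 0.0, 0.0, 0.2],
    [0.0, 0.5, 0.0, 0.0, 0.2],
    [0.0, 0.0, 0.0, 1.0, 0.2],
])

# More randomization (smaller alpha) converges in fewer steps, but the
# ranking leans less on the real web graph.
fast = iterations_to_converge(S, alpha=0.50)
slow = iterations_to_converge(S, alpha=0.99)
```

The convergence rate is governed by alpha, which is one reason the 15% figure sits where it does: enough randomization to converge quickly, but not so much that the link structure stops mattering.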