0:18

This analysis will show three claims.

The first is that the gossip protocol is lightweight, even when the groups are very large, meaning they contain a large number of nodes. The second is that it spreads a multicast quickly, even in large groups. And the third is that it is highly fault tolerant, in spite of node failures and in spite of packets being dropped.

0:36

So, the analysis here is derived from an old branch of mathematics called epidemiology, and it is taken from a well-known textbook on epidemiology by Bailey, published in 1975. Essentially, epidemiology is an old branch of mathematics that studied the spread of epidemic diseases among prison populations and among human populations in society.

The traditional, classical version of epidemiology analysis has a population of n+1 individuals mixing homogeneously.

Essentially, each of these individuals is one of the nodes in our system. Think of these individuals as molecules inside a jar: you are constantly shaking the jar, so the molecules move around and hit each other all the time.

The contact rate per time unit between any pair of individuals is beta. Beta is typically a number between 0 and 1; it is the probability that two individuals come into contact with each other during a given time unit.

At any time, each individual is either uninfected, and there are x such uninfected individuals, or, once it turns infected, it stays infected, and the number of infected is y. Initially, only one infected individual is present: this is the multicast sender, so y_0 = 1, and the remaining n individuals are all uninfected at time zero. Of course, at all times, because an individual is either uninfected or infected, the sum x + y at any given point in time is n+1, the number of individuals in the group.

2:05

When an uninfected individual comes into contact with any infected individual, the uninfected individual turns infected and stays infected thereafter. This models the gossip-based protocol that we have seen so far.
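The contact process just described can be sketched as a round-based simulation. This is my own minimal sketch, not part of the lecture; the group size n, the fan-out b, and the choice of node 0 as the initial sender are illustrative assumptions.

```python
import random

def push_gossip_rounds(n, b, seed=1):
    """Simulate synchronous push gossip: every infected node picks b
    targets uniformly at random each round; a contacted node becomes
    (and stays) infected. Returns the number of rounds until all n
    nodes are infected."""
    rng = random.Random(seed)
    infected = {0}  # node 0 is the initial multicast sender (y_0 = 1)
    rounds = 0
    while len(infected) < n:
        targets = set()
        for _ in infected:
            for _ in range(b):
                targets.add(rng.randrange(n))  # may re-hit infected nodes
        infected |= targets
        rounds += 1
    return rounds

print(push_gossip_rounds(1024, 2))  # on the order of log(n) rounds
```

Running this for various n suggests the logarithmic convergence that the analysis below makes precise.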

So, how do you analyze this? Well, this is a continuous-time process, and you can essentially write it as a differential equation for the rate of change of x, the number of uninfected in the system: dx/dt = -beta*x*y. First of all, it is negative because the number of uninfected is obviously going down over time. Now, why is it -beta*x*y? Well, x*y is the total number of potential infected/uninfected contact pairs per time unit, and among all those possible x*y contacts, only a fraction, beta, actually happen, because beta is, after all, the contact rate; and for each of those contacts, one uninfected turns into infected.

3:29

One of the things to notice here is that the equation for x has a t in it, which is the time, that is, the number of rounds since the gossip-based protocol started. Remember again that here we are using a synchronous model where all the processes or nodes proceed in lock step from one round to the next. This is only for ease of analysis; the analysis holds even if the processes are all unsynchronized.

3:56

So, you notice that as t goes to infinity, as t becomes very large, the denominator in the expression for x becomes extremely large, and x goes down to zero. Similarly, when t becomes very large, the second factor in the denominator of the expression for y goes to zero, and so y goes to n+1. So, eventually the number of uninfected becomes very close to zero, and the number of infected becomes very close to n+1, which is the total number of nodes in the group. So, eventually gossip converges and everyone receives the gossip. That's not surprising. What we want to show is that gossip actually converges fairly fast.
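For reference, solving dx/dt = -beta*x*y together with x + y = n+1, x_0 = n, y_0 = 1 gives the standard closed forms that the lecture is reading off the slide (these match Bailey's treatment):

```latex
x(t) = \frac{n(n+1)}{n + e^{\beta (n+1) t}},
\qquad
y(t) = \frac{n+1}{1 + n\, e^{-\beta (n+1) t}}
```

As t grows, the exponential in the denominator of x(t) blows up, so x goes to 0; and the term n*e^(-beta(n+1)t) in the denominator of y(t) vanishes, so y goes to n+1.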

4:37

So, in the gossip-based epidemic multicast protocol, the value of beta is actually b/n. Why is this? Consider an infected node and one particular uninfected node. The probability that the infected node picks this particular uninfected node as a gossip target during a round is the probability that it is one of the b targets picked by the infected node during that round. Since there are n possible targets per round, the probability that this uninfected node is picked as a gossip target is simply b/n.

So, you substitute this value of beta into the equation that we derived, and you substitute the value t = c*log(n), which says that c*log(n) rounds have happened so far. Substituting into the previous equations, you get that the number of infected nodes in the system is n+1, the total number of nodes, minus the quantity 1/n^(cb-2). The c comes from the c*log(n), and the b comes from the fan-out of the gossip.

Now, if I set c to be some small number like 2, and b to be a small fan-out like 2, this term becomes 1/n^(4-2), or 1/n^2. That's a very small number, very close to zero. Essentially, what that is saying is that after 2*log(n) rounds, as long as I'm using a gossip fan-out of 2, the number of infected nodes in the system will be n+1 minus a very small number, 1/n^2, which is very close to 0; in fact, as n increases, this number gets even closer to 0. Essentially, this says that the gossip converges within a logarithmic number of rounds, and it gets very close to n+1 infected nodes in the system.
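To see how small that shortfall is numerically, here is a quick sketch (my own, with illustrative values of n) evaluating n+1 - 1/n^(cb-2) for c = b = 2:

```python
def infected_after_c_log_n_rounds(n, c=2, b=2):
    # From the analysis: after c*log(n) rounds with fan-out b,
    # the expected number of infected is (n+1) - 1/n^(c*b - 2).
    return (n + 1) - 1.0 / n ** (c * b - 2)

for n in (100, 1000, 10000):
    shortfall = (n + 1) - infected_after_c_log_n_rounds(n)
    print(n, shortfall)  # shortfall = 1/n^2, shrinking as n grows
```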

6:16

So, as long as you set c and b to be small numbers that are independent of n, that is, constants, then within c*log(n) rounds, which means low latency, all but a very small fraction of the nodes receive the multicast. This means that the multicast is highly reliable: it gets to almost everyone, with high probability, within a logarithmic number of rounds. And since only c*log(n) rounds have happened, each node has, in the worst case, sent out c*b*log(n) copies of the gossip message. So, the overhead on each node is also logarithmic.

7:00

Now, I hope that some of you are thinking, "Well, it's log(n), it's not constant," right? The latency is O(log(n)), not constant; the overhead is O(log(n)), not constant. But why is constant so sacrosanct? Well, log(n) is really not constant in theory, but when it comes to practice, it is a very slowly growing number.

Log base 2 of 1,000 is about 10; log base 2 of 1 million is about 20; log base 2 of 1 billion is about 30; and if you consider the IPv4 address space, log base 2 of the number of all IPv4 addresses is 32.

So, these are fairly small numbers, and as far as practitioners are concerned, many consider log(n) to be a fairly small number, and a constant for practical purposes.
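The arithmetic here is easy to check; a quick sketch:

```python
import math

print(math.log2(1_000))          # about 10
print(math.log2(1_000_000))      # about 20
print(math.log2(1_000_000_000))  # about 30
print(math.log2(2 ** 32))        # exactly 32: the IPv4 address space
```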

7:55

Suppose 50% of the packets get dropped in the network. Again, we can analyze the protocol by simply replacing b by b/2, because, after all, the gossip targets are selected at random; so even if you have packet losses all over the network, these packet losses will be distributed uniformly at random among the gossip messages.

So, you simply do the same analysis as before, replacing b by b/2, because half the gossip messages get dropped. It turns out, and you can check this for yourself, that to achieve the same reliability as with a 0% packet loss rate, you only need to wait for twice as many rounds: instead of using c*log(n) rounds, you wait for 2c*log(n) rounds, and you get the same reliability as with the 0% packet loss rate.

With node failures, say 50% of the nodes fail, you replace n by n/2, because only half the nodes are alive in the system, and you care only about the reliability at those nodes; and you replace b by b/2, because, again, half the gossips get dropped, since they are sent to failed nodes. And once again, you get a similar result as before: by increasing the number of rounds by only a constant multiplicative factor, you get the same reliability as if you had no node failures in the system.
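Both fixes can be sanity-checked against the reliability exponent from the earlier analysis: after c*log(n) rounds with fan-out b, the shortfall is about 1/n^(cb-2). This sketch (my own, with illustrative c and b) confirms that doubling the rounds exactly compensates for halving the effective fan-out:

```python
def reliability_exponent(c, b):
    # Shortfall after c*log(n) rounds with effective fan-out b
    # is about 1/n^(c*b - 2); a larger exponent means higher reliability.
    return c * b - 2

c, b = 2, 4
lossless = reliability_exponent(c, b)        # 0% packet loss
lossy = reliability_exponent(2 * c, b / 2)   # 50% loss: b -> b/2, rounds doubled
print(lossless, lossy)  # equal exponents, hence equal reliability
```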

9:14

So, with failures, one of the things that could happen is that the gossip could die out very quickly, and if this does happen, it typically happens very early in the gossip. Very early on, if the sender dies before it sends out a single copy of the gossip, then obviously the gossip will not spread anywhere.

Or the sender might send out copies of the gossip to a few nodes in the system, just one round, and then all of these first-round recipients, along with the sender itself, might die. The probability of this happening is very low, because the targets are selected at random; so even if you have gossip running in a data center and an entire rack goes out, as long as one process outside the rack got the gossip, the gossip will still spread, and so it is very resilient.

9:55

So, once the gossip has infected a small number of nodes in the system, after just a few rounds, it is very hard to kill the gossip. And this should be familiar to you, because it parallels the way in which diseases spread in human populations, or rumors spread in society. Once a rumor or a disease gets out a little bit farther, it is very hard to contain.

Okay, so, all the analysis that we've seen in the previous slides holds with high probability, and this has been shown in another book, by Galey and Dani, which also analyzed gossip.

10:40

So, what about the pull-gossip protocol? So far we have looked at the push-gossip protocol; what about pull? In all forms of gossip, whether push or pull, it takes O(log(n)) rounds before about half the nodes in the system get the gossip. Why is this? Well, the best you can do is essentially build a spanning tree among the nodes in the system, and if you have a constant fan-out, a constant number of children per node in the spanning tree, it takes O(log(n)) rounds for about half the nodes to get the gossip message, even in the best case.

However, thereafter, after about half the nodes have received the gossip, pull is faster than push. To see this, let's look at the following: after the ith round, let p_i be the fraction of uninfected processes in the system. Then, in the (i+1)th round, the value of p_(i+1) is essentially (p_i)^(k+1), where k is the number of pull-gossip targets; k = b in this particular case.

Why is this? Well, the probability that you stay uninfected at the end of the (i+1)th round is the probability that you were uninfected at the beginning of the (i+1)th round, which is p_i by itself, times the probability that each of the k (or b) targets that you contacted at random was also uninfected, which is (p_i)^k. And that is how you get (p_i)^(k+1).

So, this goes down very quickly; in fact, it goes down super-exponentially, faster than exponentially, and one can show that this second phase takes O(log(log(n))) rounds, not just O(log(n)), so it is much faster than logarithmic. So, overall the pull-gossip protocol is still O(log(n)), but it is much faster in the second half.
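The recursion p_(i+1) = (p_i)^(k+1) can be iterated directly. In this sketch (my own; k = 2 and the starting fraction 0.5 are illustrative), I count how many pull rounds it takes to drive the uninfected fraction below 1/n, starting from the point where half the nodes already have the gossip:

```python
def pull_rounds(n, k=2, p0=0.5):
    """Iterate p <- p^(k+1), the pull-phase recursion, until fewer
    than a 1/n fraction of processes remain uninfected."""
    p, rounds = p0, 0
    while p >= 1.0 / n:
        p = p ** (k + 1)
        rounds += 1
    return rounds

for n in (10**3, 10**6, 10**9):
    print(n, pull_rounds(n))  # grows like log(log(n)): barely moves as n explodes
```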

12:23

So, this should give you the idea of designing, perhaps, a push-pull hybrid protocol, where you use push in the beginning to get the gossip out very quickly, and then use pull in the later rounds to get it out to everyone else in the system very quickly.

12:42

Finally, the gossip protocols that we've discussed so far are not topology-aware; so, in a hierarchical topology, whether it's on the internet using subnets or in a data center using racks, the core switches and routers might get overloaded quite a bit.

So, for instance, consider a scenario where I have two subnets, perhaps two racks: a top subnet and a bottom subnet, each with n/2 nodes from the group. Since each node selects a gossip target uniformly at random, about half the gossips are going to go across from one subnet to the other, and likewise from the bottom to the top. This means that the router sees n/2, actually b*(n/2), gossips go across per gossip period, and so the load on the router is going to be O(n), which is very, very high.

So, the fix is to do the following: you have each node gossip to nodes in its own subnet with a high probability, and to nodes outside its subnet with a lower probability. If your subnet contains n_i nodes, you gossip outside your subnet with probability 1/(n_i); with probability 1 - 1/(n_i), you gossip within your own subnet.

So, this means that within your own subnet, because the probability of gossiping internally is still very close to one, the gossip is going to spread in O(log(n)) time, and this is true for any one of these subnets. And then, after a subnet has been completely infected, since everyone gossips outside with probability 1/(n_i), and there are n_i such nodes gossiping outside, it takes O(1) rounds in expectation for the gossip to go across the router to the other subnet and for someone in the other subnet to get infected. After that, again, it is going to take O(log(n)) rounds to spread within that subnet itself.

So, overall it is still O(log(n)) + O(1) + O(log(n)) for the gossip to get across, and so it is still O(log(n)) rounds of dissemination time; and what we have done instead is reduce the load on the router to O(1). Why is it O(1)? Well, because essentially you have n_i nodes in one subnet, and since each of them sends a gossip outside with probability 1/(n_i), the expected number of gossips going through the router is n_i*(1/(n_i)), which is O(1).
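That expectation is easy to verify empirically. A minimal sketch (my own; the subnet size and round count are arbitrary) in which each of n_i nodes independently gossips across the router with probability 1/(n_i):

```python
import random

def avg_router_crossings(n_i, rounds, seed=7):
    """Average number of cross-subnet gossips per round when each of
    n_i nodes goes outside its subnet with probability 1/n_i."""
    rng = random.Random(seed)
    total = 0
    for _ in range(rounds):
        total += sum(1 for _ in range(n_i) if rng.random() < 1.0 / n_i)
    return total / rounds

print(avg_router_crossings(500, 2000))  # close to 1 per round: O(1) router load
```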