0:00

So the key idea is that instead of scaling up the number of ports per

Â router, we're going to scale out the topology through what's called a multi

Â stage switch network. This is a network with many switches.

Â For example, a simple two by two switch, two input, two output.

Â Can be either configured like this, or it can be configured.

Â Like that. So there are only two possible

Â configurations for this small two by two switch.

Â And if we connect many of these kind of switches which could be two by two or

Â larger, two by three, three by two, four by four then we form a network of

Â connectivity determined by the configuration inside each switch.

Â If this runs through multiple stages, we call that a multi-stage switch network.

Â It turns out that you can view large network using only small switches but

Â many of them. And arranged in a smart way across

Â multiple stages. And that's what we mean by skating out a

Â topology, to enable the skating of a network.

Â Even though, we do not scale up the number of ports for each network

Â [INAUDIBLE]. So here's a simple example to illustrate

Â that. We got on the left, the exact kind of

Â tree that we just drew. You've got eight servers as leaf nodes in

Â circles, and you've got switches in rectangles starting from small switches,

Â two by two, going to four by four, going to eight by eight.

Â Okay? So suppose, let's say that we don't want

Â switches larger than two by two. So this layer of switches do okay in the

Â tree, but this and this layer is not. So, one possibility is to replace each of

Â these four by four switches with two, two by two switches.

Â So, one four by four somehow is equivalent now to two, two by two.

Â For example, this is a two by two this is another two by two.

Â I put the two together, and now the lower layer switch will actual go talk to each

Â of these switches. And that collectively function as this

Â four by four switch. By symmetries identical on the right-hand

Â side. And then, the uppermost layer for this

Â one example is an eight by eight switch, and we show that it can be done

Â equivalently as four two by two switches. Okay?

Â Each of these is a two by two switch, and then each of them is connected one link

Â to the left half and the other to the right half.

Â Collectively, these four act as an eight by eight switch.

Â And by the way, each one is, indeed, two by two.

Â Even though you only see, two links here. Because two links need to come in.

Â And then they also need to go out, right? So this is, just an artifact, a visual

Â artifact created by folding these switches.

Â We'll look example of this again. But these, each of these is a two by two

Â switch. So now, if you look at this,

Â instead of having, what, all together, seven switches but some are big switches.

Â And we now have a collection of twelve switches.

Â So almost doubled in this small example, the number of switches.

Â But each switch if only two by two now. Instead of scaling up the port count per

Â switch we scale out the topology of connectivity pattern.

Â We will look at a special case called the Cloche network of such connectivity

Â pattern. Invented by Cloche in 1953, but in

Â general there are so many choices in their study in the field of

Â interconnection network which has been studied in different application context.

Â Starting with 1950's and 60's in actually circuit switch networks, even in

Â telephone network PSTN, people realized by then that hey you cannot build a large

Â enough mechanical or electronic switch. So you need to scale out apology with

Â multistage network. Then we'll study in parallel computation

Â in the 60's, 70's to 80's. Okay with many mainframe computers

Â connected together into a supercomputer then the scale got shrunken to a much

Â smaller one. we're looking at systems on a chip and

Â multi-core processor in the 90's and the last decade and then in recent years

Â revived again in the context of data center networking.

Â Connecting servers within a large data center.

Â All these are different cases of interconnection network, and the common

Â theme is that connectivity pattern is a resource that you have to carefully

Â design. And there are many different metrics to

Â compare different kinds of inter connection

Â networks. We'll mention two.

Â One is called by section bandwidth, and the other is called blocking, or non

Â blocking property. Now, the idea of a bisection bandwidth

Â has to go back to a simple graph oretic operation called a cut.

Â So suppose you are given a graph G, which is really a set of nodes V, a set of

Â lengths E. A cut is to say that I'm going to divide

Â this node set into two parts V1 and V2 which equals to the original set of V the

Â set of nodes in V1. Okay?

Â 5:40

And a bisection is basically a, a binary cut where the two subsets are of equal

Â number of nodes. For example, here would be a network with

Â a bisection, this way. Okay, so I divide the total set of nodes

Â into V1 and V2. Now what about the links?

Â Some links are entirely inside your V1 or V2 connecting only nodes in all in V1,

Â all in V2, then those links are just fine.

Â What about the links that actually connects across V1 and V2 like these

Â three links? Well, we can count them, if it's an

Â unweighted graph. Or if it is weighted, we can add up their

Â weights like two, three, five. All right, then we add them up and this

Â is called the capacity of that cut, okay? The capacity of a cut, in this case.

Â Ten, in general, it is some real number representing the sum of the links

Â connecting acoss to, to size. In the case of a bi-section, two equal

Â parts, then we can say it's the capacity of that

Â bisection. Now there many ways to cut the given

Â graph into two parts or two equal parts. For example, this will be another cut.

Â Okay? Now these two nodes belongs to one side

Â and these belong to another set, side. So you can look at all the possible ways

Â to cut or bisect and then find out the minimum one.

Â Okay? The one with the smallest capacity,

Â that's called the minimum cut. And the resulting capacity, in the case

Â of a bisection to equal parts, is called bisection bandwidth.

Â It's the worst case connectivity, between two equal parts of a network.

Â 7:30

For example this cut, let's say these two links cut ways one

Â and four, then this weight of cut gives you a capacity of eight,

Â this weight of cut gives a capacity of ten.

Â So, we just saw eight is smaller than ten.

Â So, the minimum cut is this minimum bisection in this case.

Â And a bisection bandwith is eight units. Now, there's also a famous, theorem

Â connecting the mean cut with the amount of flow you can squeeze from one node to

Â another. Let's say you want to squeeze a lot of

Â flows from the source to this destination through this graph.

Â Okay. For example I want to squeeze say 30,

Â well 30 is impossible because there are only 16 plus, 13 29 units that's

Â available to come out of the source. So let's say alright I want to squeeze 29

Â then, alright. And split 16 out of 16 go out this way,

Â 13 out of 13 go out this way. So both links capacity at 40 utilized

Â Alright? Then what? The 13 arrives here, it can go out.

Â Okay, either part of this along this link with capacity 4 or along this length of

Â capacity 14 can all go along this length, for example.

Â Now the problem is 16 units arrive here and there's only one way to go out, that

Â way is the link with capacity only 12 which is smaller than 16 and so you

Â cannot squeeze all 16 units out of this node.

Â So 29 is not a flow that you can squeeze through from the source to the

Â destination. So now the questions is, what is the

Â maximum amount of flow that you can squeeze from source to destination for

Â this given topology and weights or capacity on the links.

Â Well, that is the famous maximum flow problem and there are several very famous

Â efficient algorithm to solve that. Now this is not a graph theory course of

Â whom will go through all of them. But I want to highlight one very elegant

Â theorem, assess the following. Well if you want to find out the maximum

Â flow through the network between two points,

Â it's the same as finding the minimum cut where the source lies on one side of the

Â cut, and destination on the other. So, source belongs to B1, the subside of

Â nodes, and destination belongs to B2, another subside of nodes.

Â It turns out that this shaded, configuration is the cut with the

Â smallest capacity. That is the min cut, okay?

Â You can convince this yourself, by cutting in different other ways.

Â In this min cut, we see that going from the V1 side to the V2 side, there are

Â three links of capacity 12, 7 and 1. So you add up together that is 23 units.

Â So the mean cut capacity for this graph, were source and destination lies on two

Â different sides of the cut is 23 units. So is it indeed possible to actually push

Â 23 units from source to destination? Yes and here's one way to do it.

Â You put 11 on this link, 12 on this link. How do we get to this split into 11 and

Â 12 this way. well that can be discovered by those

Â graph oretic algorithms that we're skipping.

Â But now, once you get to 11 here that's fine.

Â 11 can all go on this way. 12 comes here.

Â 11 will go here and one will go this way. So you all together got one plus 11, 12

Â units that fully utilizing those link. Okay.

Â This link's not used and the 11 arrives here.

Â You can split into 4 and 7. Now, you see why we took one unit off

Â this way. Because if all 12 unit gets to here, then

Â you won't be able to put them all to these two links.

Â So, it's not a surprise. In fact, a small example, you can eyeball

Â that indeed, you should put one out there.

Â And then the one together with 11 will saturate this link and the 11 will be

Â able to saturate these two links. And then once the seminar arrives here

Â with 12 as of to 19, but it can be supported out of this link unit of 20

Â capacity. So now finally, this destination gets two

Â things coming in one with 19, one with 4's all 23 units are received as they

Â were transmitted out of the source. So indeed as you can verify yourself,

Â there is no way to increase the max, the flow from source to destination.

Â So the maximum flow is 23 units, which is the same as the minimum cut size.

Â So that's a long detour of a fun and important classic result.

Â But in data center design when you scale out the topology, you want a large

Â bisection bandwidth okay. A large potential bottleneck to the

Â performance of communication within the data center.

Â The second criteria we use is called non blocking.

Â It originated from the circuit switching word for the telephone network.

Â There's something called non-blocking and something called rearrangably

Â non-blocking. Non-blocking says that, for whatever

Â traffic pattern arrives in the multi-stage network, which I'm going to

Â draw momentarily, as long a this input port and that output ports themselves

Â available, you should be able to find a way to pass

Â it through this multi-stage switched network.

Â So the connectivity pattern should be able to support that.

Â And rearrangeably non-blocking is a weaker condition, that says,

Â well, you may not be able to do that. But if I allow you to rearrange some of

Â the existing connectivity pattern. For example, instead of going this way,

Â say well, you should go this way. And this opens up a way for this input

Â code to this output. As long as you can rearrange existing

Â configuration and then be able to achieve non-blocking then that's also good okay.

Â So this is a weaker condition. Now what kind of switch network will have

Â blocking or rearrange for the non-blocking property?

Â Well only those that have rich enough connectivity patterns in them.

Â And indeed, in 1953, Clark invented a particular connectivity pattern and we're

Â going to see the kind of non-blocking conditions on the parameters for such

Â networks.

Â