0:41
And Skype is best known for making IP to IP phone calls, which are free.
Now making phone calls is intrinsically P2P, okay?
A peer calls up another peer, okay? You do not call up a server, for example.
So it's intrinsically P2P for the voice calls themselves, but also for the control
signaling. That's what really makes Skype tick: in order to discover the
other IP phones, what you need to do is to use the peers as an infrastructure for
discovery of connectivity.
1:25
So the peers collectively act like a bridge, okay?
And they can bridge two IP phones. Let's say you are one of these nodes on the
Skype network. Now you see some of these nodes are
bigger; we'll later call these super nodes, SN, okay?
As opposed to just a node. Now, when you want to make a Skype call,
first of all you have to log in at a server. So the server will authenticate you, make
sure you are indeed a Skype user and then you will tell the server that hey, I would
like to call up the following Skype user. And then the server will direct you to a
particular super node. And then your communication will be
carried out by a super node. Now these super nodes are basically just
like a normal node. And some of your desktop machines might be
used as a super node. And you may not even know that,
okay? So basically, it can be any node that has a public
IP address. You may recall from two lectures ago that an
IP address could be public, or it could be hidden behind a firewall or a network
address translator, NAT. Okay?
This one is public, and it should have abundant resources, including upload
bandwidth, CPU power, memory and so on. So an ordinary node will connect to a
super node, because it knows the super node's public IP address.
Let's say, if you want to make a phone call to your friend, who is a Skype node
over here. And let's say this one sits behind some
kind of firewall, Okay?
Or network address translation. So you cannot actually directly locate
this IP node. Now in a telephone network, the old
telephone network, which has been there for 100+ years, you would need maybe a
switchboard, with a human being there or not, mechanical or electrical, in order to
connect you to that phone. But in the IP world, what can you do if
this node is sitting behind some kind of firewall or NAT?
Well, the good news is that he talks to his own super node, and these
super nodes have public IP addresses. So your super node will form a
communication path with his super node.
Okay, and then his super node would further connect to his node.
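To make the relay idea concrete, here is a minimal sketch in Python. It is an illustration only, not Skype's actual protocol: the `Node` class, the `relay_path` function, and the node names are all hypothetical, and we simply assume every ordinary node is attached to one super node with a public IP address.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    super_node: str  # name of the super node this ordinary node attaches to


def relay_path(caller: Node, callee: Node) -> List[str]:
    """Return the hops of a call relayed through super nodes (toy model)."""
    hops = [caller.name, caller.super_node]
    # If both ends share a super node, one relay is enough.
    if callee.super_node != caller.super_node:
        hops.append(callee.super_node)
    hops.append(callee.name)
    return hops


# Both ends sit behind a firewall or NAT, so neither can reach the other directly.
you = Node("you", super_node="SN1")
friend = Node("friend", super_node="SN2")

path = relay_path(you, friend)
print(path)  # ['you', 'SN1', 'SN2', 'friend']
```

The point of the sketch is only that the two hidden nodes never need each other's address; each one talks to a super node it can reach.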
4:54
Peers, especially these powerful peers with public IP addresses, called super nodes.
Now, BitTorrent, in contrast, uses P2P not just for connectivity discovery, but
for actual content sharing. So a typical BitTorrent graph may look
like this. This black dot is a BitTorrent machine,
let's say, your machine. And it would like to engage in a torrent.
And what it does is, first, communicate, again, with a server managed by
BitTorrent. You authenticate yourself and obtain the IP
address and port number of another machine called a tracker.
Okay, this tracker has a database that keeps track of the peers.
And then you talk to the tracker and say, you know what?
I would like to participate in the torrent for watching a particular movie, okay?
Now, the movie may or may not be properly protected by digital copyright.
But, let's say, we ignore that part. Okay?
The tracker will then say, all right, good.
I can tell you that in your proximity, okay, you've got the following, whatever
number, say eight, neighbors. Okay?
And you can form connections with them, okay.
And notice that they may be forming connections among themselves as well.
Now, how do you decide what neighbors to pick as peers to actually do transmission
of packets? That's something we'll talk about very
soon. But as you can see, the main idea behind
BitTorrent is that often you have to do multicast.
So far, our attention has been on unicast, meaning one source to one
destination. Multicast is one source to n destinations.
Now, you may have heard of doing multicast in the IP layer, the network layer. But
except for certain controlled channels, except for IPTV, which we'll talk about in
lecture eighteen, and except for local area networks, LANs (those are the three main
kinds of exceptions), you do not see IP multicast really widely adopted. Part of
the reason is management overhead. And, instead, an architectural alternative
to doing multicast, one-to-n delivery of packets, in
the network layer is to do it in the application layer.
This is also called application layer multicast. Meaning that I do not have to figure out
how I can actually establish multicast capability from one source to, say, three
end hosts, three destinations, in the network layer. I'm going to build an overlay network,
okay, among these nodes, and then build
some kind of tree in the application layer.
So the network layer does not need to be bothered.
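Here is a minimal sketch of that idea in Python, under stated assumptions: the host names, the tree shape, and the function `multicast_over_tree` are hypothetical. Each host forwards the packet to its children in an application-layer tree, and every forwarding step is an ordinary unicast send.

```python
from collections import deque
from typing import Dict, List, Tuple


def multicast_over_tree(tree: Dict[str, List[str]], source: str) -> List[Tuple[str, str]]:
    """Deliver one packet to every host reachable in an overlay tree.

    `tree` maps each host to the children it forwards to. Each entry in the
    result is one unicast transmission (sender, receiver), so the network
    layer needs no multicast support.
    """
    sends = []
    queue = deque([source])
    while queue:
        host = queue.popleft()
        for child in tree.get(host, []):
            sends.append((host, child))
            queue.append(child)
    return sends


# One source S and three destinations A, B, C.
overlay_tree = {"S": ["A", "B"], "A": ["C"]}
sends = multicast_over_tree(overlay_tree, "S")
print(sends)  # [('S', 'A'), ('S', 'B'), ('A', 'C')]
```

Notice that the source uploads the packet only twice even though there are three destinations; host A does part of the work by relaying to C.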
8:05
Now some say that IP multicast would have been a better solution, but it does carry
the price of management overhead. So BitTorrent is an application layer
multicast. And the key idea is to say, I'm going to
divide the file we're going to share, whether it's a text or a movie, into many
smaller units called chunks. And then the tracker will see which peers
possess the chunks that you need and then give you a list of what we call neighbors.
These are the potential peers that can share content with you,
and then you have to pick a subset of these neighbors as the actual peers.
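As a rough sketch of these two steps, chunking a file and asking the tracker for neighbors, consider the Python below. It follows the simplified description in this lecture rather than any real tracker protocol: the functions `split_into_chunks` and `neighbor_list`, the `have` table of who holds which chunk, and the peer names are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut a file into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def neighbor_list(have: Dict[str, Set[int]], requester: str,
                  max_neighbors: int = 8) -> List[str]:
    """Toy tracker lookup: peers holding at least one chunk the requester lacks."""
    already_have = have.get(requester, set())
    candidates = sorted(
        peer for peer, chunks in have.items()
        if peer != requester and chunks - already_have
    )
    return candidates[:max_neighbors]


chunks = split_into_chunks(b"abcdefghij", chunk_size=4)
print(len(chunks))  # 3

# Chunk indices each peer currently holds, as recorded by the toy tracker.
have = {"you": {0}, "A": {0, 1}, "B": {0}, "C": {2}}
neighbors = neighbor_list(have, "you")
print(neighbors)  # ['A', 'C']
```

Peer B is left out because it has nothing new to offer. Which of the returned neighbors become actual peers is a separate decision that this sketch does not attempt.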
8:56
And we'll soon describe the very smart ideas behind how BitTorrent does that.
So each chunk in this multicast process follows some kind of tree, okay?
I've got, say, six peers, labeled A to F, and this blue square or rectangle is a
particular chunk. If you look at the lifetime of this chunk, it basically
follows some kind of tree pattern. It doesn't loop back, because that would
be very inefficient. However, one single session of multicast
in BitTorrent consists of many, many chunks.
9:43
And these different chunks can traverse different kinds of trees.
For example, in this graph, we can see that these six nodes can form all kinds
of connectivity patterns. In the advanced material part of the lecture,
we'll look at optimizing the connectivity pattern.
But right now, let's say the solid lines are the actual peering relationships
formed. And the dotted lines are those peering
relationships that could have been formed, but are not at this particular instant.
And then at the next time instant, we may see a different kind of connectivity pattern.
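A small sketch of the picture so far, in Python: each chunk travels along its own tree among the six peers A to F, and we can put the trees of two chunks on top of each other to see what the combined connectivity looks like. The two example trees and the helper names `is_tree` and `has_cycle` are made up for illustration; they are not taken from the figure.

```python
from typing import FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]  # (sender, receiver) for one chunk transfer


def is_tree(edges: List[Edge]) -> bool:
    """True if the transfers form a tree: one root, one parent per other peer."""
    nodes = {n for e in edges for n in e}
    receivers = [dst for _, dst in edges]
    if len(receivers) != len(set(receivers)):
        return False  # some peer received the chunk twice
    roots = nodes - set(receivers)
    if len(roots) != 1:
        return False
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
    seen, stack = set(), [next(iter(roots))]
    while stack:
        n = stack.pop()
        seen.add(n)
        stack.extend(children.get(n, []))
    return seen == nodes  # every peer is reached from the root


def has_cycle(undirected: Set[FrozenSet[str]]) -> bool:
    """A graph with at least as many edges as nodes must contain a cycle."""
    nodes = {n for e in undirected for n in e}
    return len(undirected) >= len(nodes)


chunk1 = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"), ("E", "F")]
chunk2 = [("D", "B"), ("D", "F"), ("B", "A"), ("F", "C"), ("C", "E")]
print(is_tree(chunk1), is_tree(chunk2))  # True True

# Overlay the two trees, ignoring direction.
mesh = {frozenset(e) for e in chunk1 + chunk2}
print(len(mesh), has_cycle(mesh))  # 7 True
```

Each chunk on its own never loops back, yet the union of the two trees already has seven links among six peers, so it cannot be a tree.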
Okay, so, if you're looking at the lifetime of a particular chunk, it
traverses a tree. But if you look at the overall picture, it's
more like a mesh, rather than a tree. Now, both BitTorrent and Skype, as we
just saw in the very basic description, are overlay networks.
Now we've been using this term a few times already.
What exactly is an overlay network? Now think about the following topology.
You've got a graph, which is a set of nodes and a set of
links, okay?
11:03
The white circles are the nodes in set V, and the solid lines are the links in
the set of edges E. Okay?
That's given to you. An overlay says, you know what?
I would like to establish a connection between this node and this node.
So I'm going to draw this on top of it in dark just to differentiate.
These two dark nodes are logical entities. A source and destination.
I want to find a path, let's say this dotted line, to connect
them. Similarly, I can say I want to take this
node and this node and form a connection between them.
So it's drawn in 2D, but think of 3D, where these black
dots, four of them, and the dotted lines sort of lie right above this plane.
Okay, now you may say what about this node?
Well this node is not part of the overlay network.
It is used basically as a bridge, a relay in the underlay network.
So we can represent the overlay network by another graph,
G2D. It also consists of a set of nodes and a
set of links, V2D and E2D, respectively.
But the set of nodes here is a subset of the original one.
In other words, V2D is a subset of the original set V.
For example, these four black nodes are a subset of the original one.
And there are some nodes that are in the underlay G, but not in the overlay
G2D. They are hidden from the viewpoint of the
overlay network. Furthermore, each link, each element in the set E2D in
the overlay network, is really a path, formed by concatenating different links in the
underlay graph. Okay?
So a link in the overlay is really a path in the underlay.
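The definition above can be written down as a short Python sketch. The topology here is hypothetical (four end nodes a, b, c, d and one relay r), chosen only to mirror the description; the variable names `V2D` and `overlay_paths` and the function `is_valid_overlay` are assumptions for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Underlay graph G = (V, E); links are undirected.
V: Set[str] = {"a", "b", "c", "d", "r"}
E: Set[FrozenSet[str]] = {
    frozenset(link) for link in [("a", "r"), ("r", "b"), ("c", "r"), ("r", "d")]
}

# Overlay graph G2D: each overlay link in E2D is stored with its underlay path.
overlay_paths: Dict[Tuple[str, str], List[str]] = {
    ("a", "b"): ["a", "r", "b"],
    ("c", "d"): ["c", "r", "d"],
}
V2D: Set[str] = {n for link in overlay_paths for n in link}


def is_valid_overlay(V, E, overlay_paths) -> bool:
    """Check V2D is a subset of V and every overlay link is an underlay path."""
    nodes = {n for link in overlay_paths for n in link}
    if not nodes <= V:
        return False
    for (src, dst), path in overlay_paths.items():
        if path[0] != src or path[-1] != dst:
            return False
        # Every consecutive hop must be a link of the underlay.
        if any(frozenset(hop) not in E for hop in zip(path, path[1:])):
            return False
    return True


print(is_valid_overlay(V, E, overlay_paths))  # True
print(sorted(V2D))                            # ['a', 'b', 'c', 'd']
print(V - V2D)                                # {'r'}
```

The node r is in V but not in V2D: it carries traffic for the overlay without being visible in it.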
13:12
Now, what does the overlay graph look like in this case?
It's a very strange looking graph. Nonetheless, we've basically got these four
nodes, and there are two links. One link is this
path in the underlay; the other is this other path in the underlay.
So you actually see just two parallel lines.
That is our new graph G2D, which is a disconnected graph.
And just by looking at this G2D, you would have no idea that there is
another node hidden there. And furthermore, both links
actually use this underlying physical node.
13:53
Okay. Now, that is shared fate.
If this node goes down, both connections would go down.
That information is also hidden from the graph representing G2D.
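A minimal sketch of shared fate, again in Python with a hypothetical topology: two overlay links whose underlay paths both pass through one hidden relay r. The names `shared_relays` and `links_broken_by` are made up for this example.

```python
from typing import Dict, List, Set, Tuple

# Each overlay link and the underlay path that realizes it (assumed example).
overlay_paths: Dict[Tuple[str, str], List[str]] = {
    ("a", "b"): ["a", "r", "b"],
    ("c", "d"): ["c", "r", "d"],
}


def shared_relays(paths: Dict[Tuple[str, str], List[str]]) -> Set[str]:
    """Underlay nodes sitting in the interior of more than one overlay path."""
    counts: Dict[str, int] = {}
    for path in paths.values():
        for node in set(path[1:-1]):
            counts[node] = counts.get(node, 0) + 1
    return {node for node, count in counts.items() if count > 1}


def links_broken_by(failed: str, paths: Dict[Tuple[str, str], List[str]]) -> List[Tuple[str, str]]:
    """Overlay links that go down when one underlay node fails."""
    return [link for link, path in paths.items() if failed in path]


print(shared_relays(overlay_paths))         # {'r'}
print(links_broken_by("r", overlay_paths))  # [('a', 'b'), ('c', 'd')]
```

Looking only at the overlay, the two links appear independent. It takes the underlay paths to see that a single failure removes both.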
So this is the idea of building an overlay. And you can say, you know what?
If I cannot change the routing protocol down there, I'm going to just establish a
tree or mesh or some kind of graph among the sources and destinations.
Let's say these are the BitTorrent machines alive at a given time.
And then build my own graph G2D. That is the powerful idea of overlay.