And in directed graphs, you have twice the number of pairs because for

any pair s, t, you could have a path from s to t, but

also a potentially different path from t to s.

So you would divide the betweenness centrality of node v by (N-1)(N-2).

And in network x, you can use the function betweenness

centrality to find the centrality of every node in the network.

And you have the various options that we've discussed here.

So for example, you can choose to normalize or not.

And you can also choose the question of the endpoints,

whether you use the node that you're computing the centrality of as one of

the endpoints in the computation of its centrality.

So you can choose to do this in any way you want here.

So for example, if we look at the karate club and look at betweenness centrality,

compute the betweenness centrality of all the nodes and then find the five largest,

the five nodes with the largest betweenness centrality,

we find that these are the nodes 1, 34, 33, 3, and 32.

Now one of the issues with betweenness centrality is that it can be

very computationally expensive.

Depending on the specific algorithm you're using,

this computation can take up to order number of nodes cubed time.

And just for us to get some idea about this,

look at the network of friendship, marital tie, and

family tie among these 2,200 people.

This is a relatively small network, yet when you look at the number pairs of nodes

that it can have, you have a new order of 4.8 million pairs of nodes.

And so even small networks have lots and lots and lots of pairs of nodes.

And so computing the betweenness centrality of these networks becomes

pretty expensive.

So one of the things that you can do is rather than the computing betweenness

centrality based on all the nodes s and t, all the possible nodes s,

t in the network, you can approximate it by just looking at a sample of nodes,

instead of looking at all the nodes.

And in network x, you can do this by using the parameter k

that says how many nodes you should use to compute the betweenness centrality.

And so here, I'm computing the betweenness centrality of the nodes in

the karate club network using only 10 nodes rather than 34 nodes.

And so this gives you an approximation for

what the betweenness centrality of the nodes actually is.

And if I look at the five nodes with the largest

approximated betweenness centrality,

we find that these are nodes 1, 34, 32, 3, and 2.

So we get almost exactly the same list as we did when we didn't approximate,

when we find the actual betweenness centrality, except that we now get

2 as one of the top five and we lose 33 as one of the top five.

So it gives us something that is close to the actual.

But of course, there can be some differences since now

you're only using 10 rather than 34 nodes to compute the centrality