Hi folks. Now we're going to talk about models of networks that allow us to capture interdependencies, which are often known as exponential random graph models. We've been going through different forms of models; last time we talked about stochastic block models, and now we'll turn to this popular class known as exponential random graph models. We're also going to talk about some variations on these kinds of models, some newer types that are easier to estimate than exponential random graph models, and I'll get to that after we cover the basics.

Okay. So first of all, a quote from Jacob Levy Moreno and Helen Hall Jennings from 1938. This gives us the idea that when we're looking at social interactions where people are interrelated, we're not going to be able to look at just binary relationships, dyads; we're going to have to look at bigger configurations. That is something driven home by their original quote: that a pertinent form of statistical treatment is one which deals with social configurations as wholes, and not single series of facts, more or less artificially separated from the total picture. A very interesting thing here: I pulled a picture from Moreno in 1932. Moreno is also known as the father of sociometry. He was a social psychologist, and here he was mapping out ties between individuals in different houses in New York; he was at Columbia University in the 1930s. This was one of the first sociograms, one of the first situations I know of where a network of social interactions between individuals was mapped out in graphic form.

Okay, so let's talk about this class of models. The previous models we've talked about are not great at fitting data with lots of clustering and other kinds of dependencies.
In particular, they're not great for testing many social and economic theories, which give us reasons why people might be interacting with each other in certain particular forms. The idea here is that the link between two individuals i and j could depend on the presence of a friend in common: does j know k, and does i know k as well? That's something we want to capture. The difficulty is that once we allow one link to depend on another link, we open a Pandora's box where now all the links could be interdependent. If the link between i and j depends on whether they have a friend in common, and that friendship depends on whether those individuals have other friends in common, and also on whether the link between i and j is there, then everything gets tied back together, and we have to specify all of the interdependencies. Frank and Strauss in 1986 started working with a class of models which became known as p* models; these were brought into the social networks literature, with force, by Wasserman and Pattison in the 1990s, and have since become known as exponential random graph models.

Let's start by going through an example. This is the simplest possible complication we could add to a given model: where before we just had links, now we're going to allow the probability of the network to depend not only on the number of links present, but also on the number of triangles present. Okay, so we could say that networks with triangles are more likely to appear than networks without triangles. We're just complicating things on one very simple dimension, but one that actually turns out to be quite powerful in enriching the picture that we have, and in particular this idea that a link can depend on whether you have friends in common.
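To make the link-and-triangle idea concrete, here's a small Python sketch (my own illustration, not part of the lecture; function names like `ergm_weight` are made up) that counts the two statistics and computes the unnormalized weight exp(beta_L * L(g) + beta_T * T(g)) the model assigns to a graph:

```python
import math
from itertools import combinations

def links_and_triangles(nodes, edges):
    """Count the two statistics of the simple ERGM: links L(g) and triangles T(g)."""
    edge_set = {frozenset(e) for e in edges}
    links = len(edge_set)
    triangles = sum(
        1 for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
    )
    return links, triangles

def ergm_weight(nodes, edges, beta_links, beta_triangles):
    """Unnormalized ERGM weight exp(beta_L * L(g) + beta_T * T(g))."""
    L, T = links_and_triangles(nodes, edges)
    return math.exp(beta_links * L + beta_triangles * T)

# Two graphs with the same number of links; only one closes a triangle.
nodes = [1, 2, 3, 4]
open_path = [(1, 2), (2, 3), (3, 4)]   # three links, no triangle
closed_tri = [(1, 2), (2, 3), (1, 3)]  # three links, one triangle

w_open = ergm_weight(nodes, open_path, 0.5, 1.0)
w_closed = ergm_weight(nodes, closed_tri, 0.5, 1.0)
```

With a positive triangle parameter, the triangle-closing graph gets exp(beta_T) times the weight of the open path, even though both have the same number of links; setting beta_triangles to zero makes the two weights equal.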
That's going to give precedence to more triangles than would appear in something like an Erdos-Renyi random graph. So what's the idea behind an exponential random graph model? Now the probability of the network depends on the number of links and the number of triangles, each multiplied by some parameter. If we set the triangle parameter to zero, that zeros out that term and all that matters is links. But if it's not zero, then we have a situation where both links and triangles matter. In order to make things into probabilities, we have to make sure this expression is non-negative and lies between zero and one. The first thing we can do, instead of having the probability be directly proportional to this sum, is to exponentiate it, so it's always non-negative. This is a standard trick in statistics: work with the exponential family. So now the probability of a graph is proportional to the exponential of some function of the number of links and the number of triangles.

A very powerful theorem by Hammersley and Clifford shows that pretty much any network model can be expressed in the exponential family with some counts of graph statistics. It might not be links and triangles; it might be links, triangles, and the number of three-pronged stars you have, or it could depend on the number of cliques of size six. It could be a quite complicated list of statistics, but their theorem says that pretty much any network model you can think of can be expressed in this form. Just as a check on this, let's go back to our Erdos-Renyi random network model and see how this works. Let p be the probability of a link, and let L of g be the number of links in a given graph g.
What's the probability of g in that world? Under Erdos-Renyi, the chance that you get exactly this network is p for each of the links that formed and 1 minus p for each of the links that did not form. So the probability of getting exactly a particular network with L links is p to the L, times 1 minus p to the n times n minus 1 over 2, minus L. Then let's rewrite this by pulling out the L. I'm just repackaging: we can write this as p over 1 minus p, raised to the L of g, times 1 minus p raised to the n times n minus 1 over 2. Now let's write this in exponential form. Another way to write this is as the exponential of the log of p over 1 minus p, times L of g, plus an extra term which doesn't involve the number of links at all. So what does this look like? It looks like an exponential function of a statistic of the graph, where the statistic is the number of links in the graph and the parameter on it is log of p over 1 minus p. So you can write the probability of an Erdos-Renyi graph in this form. This is not a proof of the Hammersley-Clifford theorem by any means, but it does give us an idea of how you can convert a lot of other kinds of models into the exponential random graph family, and that's going to be quite useful.

Now, one other thing: to make this a probability, these have to sum to one. In particular, that means we have to normalize, to make sure that when we sum the probability of each graph across all graphs, it adds up to one. So the denominator of this probability is going to be the sum, across all possible graphs, of the relative weights of those graphs.
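The algebra above can be checked numerically. Here is a quick sketch (my own check, not from the lecture) comparing the direct Erdos-Renyi probability p^L (1-p)^(N-L) with the exponential-family form exp(log(p/(1-p)) * L + N * log(1-p)), where N = n(n-1)/2 is the number of possible links:

```python
import math

def er_prob_direct(p, L, N):
    """P(g) for a specific graph with L links out of N = n(n-1)/2 possible pairs."""
    return p ** L * (1 - p) ** (N - L)

def er_prob_exponential(p, L, N):
    """The same probability in exponential-family form:
    exp(log(p/(1-p)) * L(g) + N * log(1-p))."""
    theta = math.log(p / (1 - p))  # the single ERGM parameter, on the link count
    return math.exp(theta * L + N * math.log(1 - p))

n = 5
N = n * (n - 1) // 2  # 10 possible links
p = 0.3
for L in range(N + 1):
    assert math.isclose(er_prob_direct(p, L, N), er_prob_exponential(p, L, N))
```

The two expressions agree for every possible link count L, which is exactly the repackaging done on the slide.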
So now, if we divide by the sum of these weights across all graphs g prime, then summing P of g across all graphs gives exactly 1, because the sum in the numerator matches the denominator. You can also rewrite this by bringing the denominator, which doesn't depend on g, into the exponent as minus some constant: it's just the corresponding constant that captures the denominator. So what we've got is an exponential random graph family, which allows us to capture different types of statistics. That's going to be quite powerful in allowing us to fit a lot of things and incorporate a lot of things. The main challenge is going to be in estimating these kinds of models. In the next video, I'll talk a little bit about how to work with these models to estimate things and begin to fit them.
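For very small networks, the normalizing denominator can be computed exactly by brute force, which also illustrates why estimation is hard: the sum runs over all 2^(n(n-1)/2) possible graphs. A minimal sketch (illustrative code of my own, not the lecture's):

```python
import math
from itertools import combinations

def ergm_probabilities(n, beta_links, beta_triangles):
    """Exact ERGM probabilities by enumerating every graph on n nodes.
    Feasible only for tiny n, since there are 2^(n(n-1)/2) graphs."""
    pairs = list(combinations(range(n), 2))
    weights = []
    for mask in range(2 ** len(pairs)):
        # Decode the bitmask into an edge set for this graph.
        edges = {pairs[i] for i in range(len(pairs)) if mask >> i & 1}
        L = len(edges)
        T = sum(1 for a, b, c in combinations(range(n), 3)
                if (a, b) in edges and (b, c) in edges and (a, c) in edges)
        weights.append(math.exp(beta_links * L + beta_triangles * T))
    Z = sum(weights)  # the normalizing constant (the denominator)
    return [w / Z for w in weights]

probs = ergm_probabilities(4, -1.0, 0.5)
```

For n = 4 there are only 64 graphs and the probabilities sum to one by construction; for n = 50 the sum has more than 2^1000 terms, which is why the next video turns to estimation methods rather than direct computation.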