Return to the model where we have a Bernoulli likelihood with n observations, or a binomial likelihood. Using a beta prior, we get a posterior which is a beta with parameters alpha plus the sum of the y sub i's, and beta plus n minus the sum of the y sub i's. Here it is clear how both the prior and the data contribute to the posterior. For a prior which is a beta with parameters alpha and beta, this is equivalent to having alpha plus beta additional observations. We call this the effective sample size of the prior, and we'll see how this is so in the next calculation.
Recall that the expected value, or mean, of a beta is its first parameter over the sum of the parameters. So in the case of our posterior here, its posterior mean is alpha plus the sum of the y sub i's, divided by alpha plus the sum of the y sub i's, plus beta plus n minus the sum of the y sub i's. Simplifying, we can write this as alpha plus the sum of the y sub i's, over alpha plus beta plus n. This can be further decomposed as alpha plus beta over alpha plus beta plus n, times alpha over alpha plus beta, plus n over alpha plus beta plus n, times the sum of the y sub i's over n.
What we see here is that the posterior mean is equal to a prior weight times the prior mean, plus a data weight times the data mean. The mean of the data is the sum of the y sub i's over n. The prior mean is alpha over alpha plus beta. The weight for the prior and the weight for the data add up to one, so they are weights. And so the posterior mean is a weighted average of the prior mean and the data mean. We can also see that n, our sample size, appears in the numerator of the data weight, and so the corresponding term in the prior weight, alpha plus beta, tells us the effective sample size of the prior. Thus the effective sample size of the prior, for a beta prior on a Bernoulli or binomial likelihood, is alpha plus beta.
This effective sample size also gives you an idea of how much data you would need to make sure that your prior doesn't have much influence on your posterior. If alpha plus beta is small compared to n, then the posterior will largely be driven by the data. If alpha plus beta is large relative to n, then your posterior will be largely driven by the prior.
Thinking about intervals, recall that a frequentist 95% confidence interval for theta is theta hat plus or minus 1.96 times the square root of theta hat times 1 minus theta hat, over n. Now as a Bayesian, we can make a 95% credible interval using our posterior distribution for theta. We would use a computer package, such as R, to numerically find the values for the interval. We can find an interval that actually has 95% probability of containing theta, and this is a true probability statement. For a confidence interval, we can't make a probability statement that theta is in the interval.
Sequential analysis is also straightforward in Bayesian statistics. Suppose we start with the prior f(theta) and we observe n data points, y1 through yn. We then update our prior and get a posterior for theta given y1 through yn. So far so good. Then suppose you come in the next day and you observe more data. You observe m more data points, y n+1 through y n+m. Now that we have more data, can we update again? Indeed we can. We treat yesterday's posterior as today's prior. So we now use the distribution of theta given y1 through yn as our prior, update it with the new data, and get a new posterior for theta given y1 through y n+m, all of the data.
And so we can chain this together, doing a sequential update every time we get new data. We get a new posterior by using our previous posterior as the prior in another update with Bayes' theorem. In this way, we get the same answer mathematically as if we had started with the original prior, all the data had come at once, and we had done a single update with Bayes' theorem. We get the exact same posterior. So the Bayesian paradigm is consistent whether we do sequential updates or a single batch update.
This consistency is very different from the frequentist paradigm, where you can't do sequential updates in this way, and you get different answers depending upon when you observe your data and how you combine them. One of the earlier areas where Bayesian statistics was accepted under government regulation was medical device testing, and this is because of its ability to do sequential updates. With medical devices, you often have very small sample sizes, but you're only making minor updates to the devices and then doing new trials. The ability of Bayesian statistics to do easy sequential updates made it very practical and appealing for the medical device testing industry.
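
The conjugate update, the weighted-average form of the posterior mean, and the equivalence of sequential and batch updating discussed in this segment can be checked numerically. The following is a minimal sketch, not the lecture's own code: it uses Python rather than R, the function names and the example data (a Beta(2, 2) prior with 10 and then 5 observations) are made up for illustration, and the credible interval is approximated by sampling from the posterior instead of calling an exact quantile function such as R's qbeta.

```python
import random
from fractions import Fraction


def update(alpha, beta, ys):
    """Conjugate update: Beta(alpha, beta) prior plus Bernoulli data ys (0/1 values)."""
    successes = sum(ys)
    return alpha + successes, beta + len(ys) - successes


def beta_mean(alpha, beta):
    """Mean of a beta: first parameter over the sum of the parameters.

    Assumes integer parameters so the result can be kept as an exact fraction.
    """
    return Fraction(alpha, alpha + beta)


def weighted_average_mean(alpha, beta, ys):
    """Posterior mean as prior weight * prior mean + data weight * data mean."""
    n = len(ys)
    prior_weight = Fraction(alpha + beta, alpha + beta + n)
    data_weight = Fraction(n, alpha + beta + n)
    data_mean = Fraction(sum(ys), n)
    return prior_weight * beta_mean(alpha, beta) + data_weight * data_mean


def credible_interval(alpha, beta, level=0.95, draws=20_000, seed=0):
    """Monte Carlo approximation to an equal-tailed credible interval."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1 - level) / 2
    lower = samples[int(tail * draws)]
    upper = samples[int((1 - tail) * draws) - 1]
    return lower, upper


# Hypothetical example data: 10 observations on day one, 5 more on day two.
prior = (2, 2)
day_one = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
day_two = [0, 1, 1, 0, 1]

posterior_day_one = update(*prior, day_one)
sequential = update(*posterior_day_one, day_two)  # yesterday's posterior is today's prior
batch = update(*prior, day_one + day_two)         # all the data in a single update

print("posterior after day one:", posterior_day_one)
print("sequential:", sequential, "batch:", batch)
print("posterior mean:", beta_mean(*posterior_day_one))
print("weighted average:", weighted_average_mean(*prior, day_one))
print("approximate 95% credible interval:", credible_interval(*posterior_day_one))
```

In this example the prior has an effective sample size of 4 against 10 observations on day one, so the prior weight is 4/14 and the data weight is 10/14. The sequential and batch posteriors come out with identical parameters, which is the consistency property described above.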