A second important case is the gamma-Poisson conjugate family. In this case the data come from a Poisson distribution, and the prior and posterior are both gamma distributions. The Poisson random variable can take any non-negative integer value, with no upper bound. It is used to describe count data, where one counts the number of independent events that occur in a fixed amount of time, a fixed area, or a fixed volume. The Poisson distribution has been used to describe the number of phone calls one receives in an hour, or the number of pediatric cancer cases in a city, for example to see whether pollution has elevated the cancer rate above that of previous years or of similar cities. It is also used in medical screening for diseases such as HIV, where one can count the number of T-cells in a tissue sample.
The Poisson distribution has a single parameter, lambda, which is both the mean and the variance of the Poisson random variable. The probability mass function for the Poisson is lambda raised to the power of k, over k factorial, times e raised to the negative lambda power. This gives the probability of observing a random variable equal to k. Here, k factorial is k times k minus one times k minus two, all the way down to one. And lambda must be greater than zero.
Famously, von Bortkiewicz used the Poisson distribution to study the number of Prussian cavalrymen who were kicked to death by a horse each year. This is count data over the course of a year, and the events are probably independent, so the Poisson model makes sense. He had data on 15 cavalry units for the 20 years between 1875 and 1894, inclusive. The total number of cavalrymen who died by horse kick was 200. One can imagine that a Prussian general might want to estimate lambda, the average number of deaths per year, per unit.
Perhaps he wants to see whether some educational campaign about best practices for equine safety would make a difference. And suppose the general is a Bayesian. Introspective elicitation leads him to think that lambda is about 0.75, and his uncertainty in this belief, expressed as a standard deviation, is one. Since this time period predates the era of modern computing, the general will need to express his prior as a member of a family conjugate to the Poisson. It turns out that this family consists of the gamma distributions.
Gamma distributions describe continuous non-negative random variables. As we know, lambda in the Poisson can take any positive value, so this fits. And the gamma family is pretty flexible; one can see a wide range of gamma shapes. The probability density function for the gamma is indexed by two parameters, k and theta. But be warned that some books parameterize it in a slightly different way than we shall use here, and you should always check which parameterization is being used. For our parameterization, the mean of the gamma is k times theta, and the standard deviation is theta times the square root of k. So the general's prior is the gamma such that k times theta equals 0.75 and theta times the square root of k equals one. Solving these simultaneous equations shows that the general's k is nine sixteenths and the general's theta is four thirds.
For the gamma-Poisson conjugate family, suppose we observe data that are Poisson distributed, with values x sub one, x sub two, and so on up to x sub n. Then, in the same way that we recognized the kernel of the beta distribution in the integral form of Bayes' rule for the beta-binomial family, we would recognize the kernel of the gamma when using the gamma-Poisson family. It turns out that the posterior gamma has updated parameters k star and theta star, where k star is k plus the sum of the observed values, and theta star is theta divided by the quantity n times theta plus one.
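To make the two steps just described concrete, here is a minimal sketch in Python. It assumes the parameterization used here (mean equal to k times theta, standard deviation equal to theta times the square root of k). The function names `gamma_from_mean_sd` and `gamma_poisson_update` are made up for illustration and are not part of the lecture or of any library.

```python
import math


def gamma_from_mean_sd(mean, sd):
    """Return (k, theta) for the gamma with the given mean and standard deviation.

    Uses mean = k * theta and sd = theta * sqrt(k), so mean / sd = sqrt(k).
    """
    k = (mean / sd) ** 2
    theta = sd ** 2 / mean
    return k, theta


def gamma_poisson_update(k, theta, counts):
    """Return the posterior (k_star, theta_star) after observing Poisson counts."""
    n = len(counts)
    k_star = k + sum(counts)              # k plus the sum of the observed values
    theta_star = theta / (n * theta + 1)  # theta over (n times theta plus one)
    return k_star, theta_star


# The general's prior: mean 0.75, standard deviation 1.
prior_k, prior_theta = gamma_from_mean_sd(0.75, 1.0)
print(prior_k, prior_theta)

# A toy update with three invented counts, only to show the mechanics.
toy_k_star, toy_theta_star = gamma_poisson_update(1.0, 1.0, [2, 0, 1])
print(toy_k_star, toy_theta_star)
```

The prior comes out as k of 0.5625 and theta of about 1.333, which agrees with nine sixteenths and four thirds. The three counts in the toy update are invented purely to exercise the formula.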
For this dataset, there are n equals 15 times 20, or 300, observations, and the total number of cavalrymen who died was 200. Therefore, the general now thinks that the average number of Prussian cavalrymen who die at the hooves of their horses follows a gamma distribution with parameters k star equal to 200.5625 and theta star equal to 0.0033. How has the general changed his mind after seeing the data? Before he saw the data, he believed that lambda was about 0.75. Now he believes it's about 0.67. And before he saw the data, his uncertainty about lambda, expressed as a standard deviation, was one. After seeing the data, his uncertainty has shrunk to 0.047.
What have we learned? We learned a new pair of conjugate families, the gamma-Poisson. To do that we had to learn about the Poisson distribution, which is a useful model for count data, and about the gamma distribution, a flexible family of distributions for continuous, non-negative random variables. We learned the updating formula, and applied it to a classical, and I hope memorable, data set.
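As a check on the arithmetic for the general's posterior, the numbers can be reproduced with a few lines of Python. This is only a sketch: it uses the totals quoted here (300 observations, 200 deaths) because the individual yearly counts are not given, and the variable names are chosen just for this illustration.

```python
import math

# Prior in the shape/scale parameterization used here: k = 9/16, theta = 4/3.
k, theta = 9 / 16, 4 / 3

# Data summary: 15 units over 20 years, 200 deaths in total.
n = 15 * 20
total_deaths = 200

# Conjugate update: only n and the sum of the counts are needed.
k_star = k + total_deaths
theta_star = theta / (n * theta + 1)

# Posterior mean and standard deviation under this parameterization.
post_mean = k_star * theta_star
post_sd = theta_star * math.sqrt(k_star)

print("k star:", k_star)
print("theta star:", round(theta_star, 4))
print("posterior mean:", round(post_mean, 2))
print("posterior sd:", round(post_sd, 3))
```

Rounded as in the printout, these agree with the values above: theta star of 0.0033, a posterior mean of 0.67, and a posterior standard deviation of 0.047.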