In the last section, we studied parallel Gaussian channels. In this section, we study correlated Gaussian channels, which are a generalization of parallel Gaussian channels. The system of parallel Gaussian channels that we have studied can be represented more compactly by the figure on the right-hand side, where bold X is the random vector consisting of the components X_1, X_2, up to X_k, bold Y is the random vector consisting of the components Y_1, Y_2, up to Y_k, and bold Z is the noise vector consisting of the components Z_1, Z_2, up to Z_k. The noise vector Z is jointly Gaussian with mean 0 and covariance matrix N, where N is a diagonal matrix with diagonal elements N_1, N_2, up to N_k. This means that the noise variables Z_1, Z_2, up to Z_k are uncorrelated, and because these random variables are jointly Gaussian, they are mutually independent. With this representation of parallel Gaussian channels, we can generalize to correlated Gaussian channels, where the covariance matrix K_Z of the noise vector Z is not necessarily diagonal. For parallel Gaussian channels, we imposed a constraint on the input power. Likewise, for correlated Gaussian channels, we also impose a constraint on the input power. Our task is to determine the capacity of a system of correlated Gaussian channels. The main idea of the analysis is to decorrelate the noise vector. Consider a system of correlated Gaussian channels, where the covariance matrix K_Z of the noise vector can be diagonalized as Q lambda Q transpose. We now convert the original system by installing a linear transformation Q at the input and a linear transformation Q transpose at the output. For this new system, the input is the vector X prime and the output is the vector Y prime. From this figure, we see that Y prime is equal to Q transpose times Y.
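The diagonalization K_Z = Q lambda Q transpose can be illustrated numerically. The following sketch (not part of the lecture) uses a hypothetical 3-by-3 noise covariance matrix and the symmetric eigendecomposition from numpy; any symmetric positive-definite K_Z would work.

```python
import numpy as np

# A hypothetical noise covariance matrix K_Z for k = 3 correlated channels
# (symmetric and positive definite, chosen only for illustration).
K_Z = np.array([[2.0, 0.5, 0.3],
                [0.5, 1.5, 0.2],
                [0.3, 0.2, 1.0]])

# Diagonalize K_Z = Q @ Lambda @ Q.T via the symmetric eigendecomposition.
eigvals, Q = np.linalg.eigh(K_Z)
Lambda = np.diag(eigvals)

# Q is orthogonal, and the decomposition reconstructs K_Z exactly.
assert np.allclose(Q @ Q.T, np.eye(3))
assert np.allclose(Q @ Lambda @ Q.T, K_Z)
```

The columns of Q are the eigenvectors of K_Z, and the diagonal of Lambda holds its eigenvalues, which will play the role of the noise variances of the equivalent parallel channels.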
And X prime is equal to Q transpose times X, because from the above, we see that X is equal to Q times X prime. Likewise, we let Z prime be Q transpose times Z. Because the vector Z is a Gaussian vector, Z prime is also a Gaussian vector. We now derive the input-output relation of the new system. First of all, Y prime is equal to Q transpose times Y, where Y is equal to X plus Z. So we have Q transpose times X plus Q transpose times Z, where Q transpose times X is equal to X prime and Q transpose times Z is equal to Z prime. Therefore, Y prime is equal to X prime plus Z prime. In other words, in the new system, Z prime is the equivalent noise vector. The components of this equivalent noise vector Z prime are uncorrelated. To see this, consider the covariance matrix of Z prime, which is equal to Q transpose times the covariance matrix of Z times Q, because Z prime is equal to Q transpose times Z. Now the covariance matrix of Z can be diagonalized as Q times lambda times Q transpose. Since Q transpose times Q is the identity matrix, the product Q transpose times Q lambda Q transpose times Q reduces to lambda, which is a diagonal matrix. That is, the equivalent noise variable for the i-th channel, Z_i prime, is a Gaussian random variable with mean zero and variance lambda_i, where lambda_i is the i-th diagonal element of the matrix lambda. Because lambda is a diagonal matrix, the equivalent noise variables Z_i prime are uncorrelated, and hence mutually independent. Thus, for the new system, we have the output Y prime equal to the input X prime plus the equivalent noise vector Z prime, where the components of Z prime are mutually independent. Therefore, the new system, which we call the equivalent system, is a system of parallel Gaussian channels, and we can represent it as shown. In the rest of this section, we will show that the equivalent system and the original system have the same capacity.
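The decorrelation step above can be checked numerically. The following sketch (again using a hypothetical K_Z, not from the lecture) verifies both the algebraic identity, that the covariance of Z prime equals lambda, and an empirical version, by transforming samples of correlated Gaussian noise.

```python
import numpy as np

# Hypothetical noise covariance matrix, diagonalized as K_Z = Q Lambda Q^T.
K_Z = np.array([[2.0, 0.5, 0.3],
                [0.5, 1.5, 0.2],
                [0.3, 0.2, 1.0]])
eigvals, Q = np.linalg.eigh(K_Z)

# Algebraic check: cov(Z') = Q^T K_Z Q = Lambda, a diagonal matrix.
K_Zp = Q.T @ K_Z @ Q
assert np.allclose(K_Zp, np.diag(eigvals))

# Empirical check: draw correlated Gaussian noise and apply Q^T.
rng = np.random.default_rng(0)
Z = rng.multivariate_normal(np.zeros(3), K_Z, size=200_000)
Zp = Z @ Q                      # each row is (Q^T z)^T
emp_cov = np.cov(Zp.T)          # sample covariance of Z'
assert np.allclose(emp_cov, np.diag(eigvals), atol=0.05)
```

The off-diagonal entries of the sample covariance of Z prime are close to zero, confirming that the transformed noise components are uncorrelated with variances lambda_i.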
In the original system, we have a power constraint P on the input vector X. We now relate this constraint on X to a constraint on X prime, the input of the equivalent system. Since X prime is equal to Q transpose times X and Q transpose is an orthogonal matrix, by Proposition 10.9, the energy is preserved. That is, the expectation of the summation over i of X_i prime squared is equal to the expectation of the summation over i of X_i squared. Therefore, the input power constraint, expectation of the summation over i of X_i squared less than or equal to P, of the original system translates to the input power constraint, expectation of the summation over i of X_i prime squared less than or equal to P, of the equivalent system. We now prove that the equivalent system and the original system have the same capacity. This is done by proving the following proposition, which asserts that the mutual information between X prime and Y prime is equal to the mutual information between X and Y. The mutual information between X prime and Y prime is equal to the differential entropy of Y prime minus the differential entropy of Y prime conditioning on X prime. Now the differential entropy of Y prime conditioning on X prime is equal to the differential entropy of Z prime conditioning on X prime. This is by means of a straightforward vector generalization of Lemma 11.22. Next, the differential entropy of Z prime conditioning on X prime is equal to the differential entropy of Z prime, because Z prime and X prime are independent. To see this, note that Z is independent of X, and so Q transpose times Z is independent of Q transpose times X; that is, Z prime is independent of X prime. Now Y prime is equal to Q transpose times Y and Z prime is equal to Q transpose times Z. The differential entropy of Q transpose times Y is equal to the differential entropy of Y plus the log of the absolute value of the determinant of Q transpose.
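The energy-preservation step, that an orthogonal transformation leaves the total power unchanged, can be sketched as follows (a minimal numerical illustration, with Q generated by a QR factorization rather than taken from the lecture's diagonalization).

```python
import numpy as np

rng = np.random.default_rng(1)

# Any orthogonal matrix Q serves to illustrate; here one is obtained
# from the QR factorization of a random matrix.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
x = rng.standard_normal(4)

# Orthogonal transforms preserve energy: ||Q^T x||^2 = ||x||^2,
# so the power constraint on X carries over to X' = Q^T X unchanged.
energy_xp = np.sum((Q.T @ x) ** 2)
energy_x = np.sum(x ** 2)
assert np.isclose(energy_xp, energy_x)
```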
Likewise, the differential entropy of Q transpose times Z is equal to the differential entropy of Z plus the log of the absolute value of the determinant of Q transpose. These two logarithm terms cancel each other (indeed, since Q transpose is orthogonal, the absolute value of its determinant is 1, so each term is zero), and we are left with the differential entropy of Y minus the differential entropy of Z. In the remaining steps, we simply reverse what we did at the beginning of the proof. Namely, the differential entropy of Z is equal to the differential entropy of Z given X, which in turn is equal to the differential entropy of Y given X. And finally, the differential entropy of Y minus the differential entropy of Y given X is equal to the mutual information between X and Y. This proves the proposition. Therefore, the equivalent system and the original system have the same capacity. We have seen that the equivalent system is a system of parallel Gaussian channels. Therefore, applying the capacity formula for a system of parallel Gaussian channels, we see that the capacity of a system of correlated Gaussian channels is given by one half times the summation for i equal to 1 up to k of log of 1 plus a_i star divided by lambda_i, where a_i star is the optimal power allocated to the i-th channel in the equivalent system, and lambda_i is the i-th diagonal element of the matrix lambda, that is, the variance of the equivalent noise variable for the i-th channel. The values of the a_i star's can be obtained by water-filling. This completes the characterization of the capacity of a system of correlated Gaussian channels.
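The capacity computation above, eigenvalues of K_Z followed by water-filling, can be sketched end to end. The example values below (noise variances 1, 2, 3 with total power P = 2, and a 2-by-2 K_Z) are hypothetical, chosen only to exercise the formula; the capacity is computed in nats since natural logarithms are used.

```python
import numpy as np

def water_filling(lambdas, P):
    """Water-filling: a_i* = max(nu - lambda_i, 0), with the water
    level nu chosen so that the allocations sum to the total power P."""
    lam = np.asarray(lambdas, dtype=float)
    s = np.sort(lam)
    for m in range(len(s), 0, -1):       # try filling the m quietest channels
        nu = (P + s[:m].sum()) / m       # candidate water level
        if nu > s[m - 1]:                # all m allocations strictly positive
            return np.maximum(nu - lam, 0.0)
    return np.zeros_like(lam)

def correlated_capacity(K_Z, P):
    """Capacity (in nats per use) of correlated Gaussian channels with
    noise covariance K_Z under total input power P."""
    lambdas = np.linalg.eigvalsh(K_Z)    # noise variances of the equivalent
    a = water_filling(lambdas, P)        # parallel channels, then water-fill
    return 0.5 * np.sum(np.log(1.0 + a / lambdas))

# Example: equivalent noise variances 1, 2, 3 and total power P = 2.
a = water_filling([1.0, 2.0, 3.0], 2.0)
# Water level nu = 2.5: allocations (1.5, 0.5, 0), which sum to P;
# the noisiest channel receives no power.
```

With P = 2 only the two quietest channels are used, which is exactly the behavior the water-filling picture describes: power is poured over the noise levels lambda_i until the total budget is spent.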