Now we know what a grayscale digital image is. The next question is how we can obtain and represent color images. What is color? Color is a psychological property of our visual experience when we look at objects and lights; it is not a physical property of those objects and lights. Color is the result of the interaction between physical light in the environment and our human visual system.

Visible light is electromagnetic radiation in the range from 380 to 780 nanometers. Our eyes are sensitive to electromagnetic radiation in this range and insensitive to electromagnetic radiation outside it. The reason is probably that almost half of the solar energy that reaches Earth is carried by light in this range; almost the same amount falls to infrared light, and the rest to ultraviolet radiation. To describe a source of light, we can build its spectrum. To do this, at each wavelength in the visible range we measure the amount of emitted energy per unit of time, for example in the number of photons emitted per millisecond.

When light reaches the eye, photoreceptor cells in the retina react to it. Each rod and cone acts as a filter on the spectrum: to get the output of this filter, we multiply its response curve by the spectrum and integrate over all wavelengths, so each photoreceptor cell yields one number. There are three types of cones with different responses; by the positions of their response peaks they are called long, medium, and short cones. So our eyes represent a spectrum with three numbers from the three types of cones. Can we represent a full spectrum with only three numbers? Of course not, and most of the information in the signal is lost. As a result, two very different spectra may appear indistinguishable to a human observer. On the slide, I give two examples of such spectra, both of which look like a purple color. You can see from the curves that they are vastly different: one is very smooth and the other has a lot of peaks, but to a human observer these two very different light sources produce the same color sensation. Spectra that produce the same color sensation are called metamers, so these two examples are two metamers of purple.

Let's select three colors P1, P2, P3 and try to represent all other colors as their linear combinations. Such colors are called primaries, and the weights used to represent the other colors serve as color coordinates in this color model, or color coordinate system. The trichromatic color theory says that we can get all visible colors by combining just three primaries. To verify this theory, we can run color matching experiments. In these experiments, a human observer is presented with different lights: on one part of the screen there is a spot from the test light, and on the other part a spot combined from three primary light sources. The observer can change the relative powers of these primary sources, and the goal is, by changing the relative powers, to match the color of the combination of primaries to the color of the test light. We can repeat the experiment with different observers, compare the powers used by different subjects to match the same test light, and average the results.
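As a compact way to write down the cone filtering described above (the notation is mine, not from the lecture): if $S(\lambda)$ is the spectrum of the light and $f_c(\lambda)$ is the response curve of a cone of type $c$, then each cone type outputs one number

$$R_c = \int_{380}^{780} S(\lambda)\, f_c(\lambda)\, d\lambda, \qquad c \in \{L, M, S\}.$$

Two different spectra $S_1 \neq S_2$ are metamers exactly when $R_c(S_1) = R_c(S_2)$ for all three cone types, which is why two vastly different curves can both look purple. The matching experiments just described probe exactly these three numbers.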
Based on these color matching experiments, a set of empirical laws, called the Grassmann laws, was derived. The basic idea behind the Grassmann laws is that color matching appears to be linear. The following properties of color matching are valid. First, if two test lights can be matched with the same set of weights, then these two test lights match each other. Second, if we mix two test lights, then mixing their matches will match the mixture of the test lights. Third, if we scale a test light, then its match gets scaled by the same amount.

Based on these laws, we can build different linear color models: it is enough to select three primary sources, and then we obtain all other colors by mixing them. For example, we can select the cone perception modes as primaries. These modes correspond to red, green, and blue colors, so such a model is called an RGB model. In 1931, the International Commission on Illumination proposed the CIE RGB 1931 color model, in which the primaries are three monochromatic lights. If we use such a model for a computer display, the primaries correspond to the three types of phosphors in the display. We can visualize all colors that can be represented by an RGB model as a 3D cube aligned with the coordinate system. The origin (0, 0, 0) corresponds to black. The primaries are located at the three corners of the cube along the coordinate axes. The corner opposite to black, with coordinates (1, 1, 1), corresponds to white. The three remaining corners correspond to combinations of two primary colors, and they look like cyan, magenta, and yellow.

The next question is whether this RGB model can represent all colors visible to a human observer. To verify this, we can continue the color matching experiments and match the RGB primaries against single-wavelength light sources across the visible range. The weights from these experiments give us the color matching functions: for each wavelength, we get the three weights used to match the single-wavelength light. We average the matching functions of many observers, and during this experiment we find that sometimes it is impossible to match the test color with the available primaries. To solve this problem, a primary color has to be added to the test source rather than to the mix of primaries. To match the color in this example, we need to add some amount of the P2 primary to the test-color side, and this corresponds to a negative weight: if we need to add some primary to the test source, we write it down as a negative coordinate of that primary in the color matching functions. So the result of this experiment is that the RGB model cannot represent all visible colors; subtractive matching is required for some wavelengths.

The other problem with the RGB model is that it is not very intuitive. Intuitively, we usually describe visible light using the intensity of light, or brightness, as a measure of its strength, and using color, or chromaticity; sometimes chromaticity is further broken down into hue and saturation. In the RGB model, we have no explicit chromaticity information, only the weights of the primary colors, and there is no explicit brightness information either. To solve this problem, in 1931, together with the standard RGB model, another model, called XYZ, was proposed. The goal was to get a linear additive model XYZ where the component Y corresponds to perceived brightness.
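Before turning to the details of XYZ, it is worth writing the Grassmann laws down compactly (hedged notation of mine: $A \sim B$ means that lights $A$ and $B$ match):

$$A \sim C,\; B \sim C \;\Rightarrow\; A \sim B; \qquad A \sim A',\; B \sim B' \;\Rightarrow\; A + B \sim A' + B'; \qquad A \sim B \;\Rightarrow\; \alpha A \sim \alpha B.$$

This linearity is what makes color coordinates meaningful: a test light $T$ gets coordinates $(c_1, c_2, c_3)$ such that $T \sim c_1 P_1 + c_2 P_2 + c_3 P_3$, and a negative $c_i$ records that primary $P_i$ had to be added to the test side, as in the subtractive matching above.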
The XYZ model should cover all visible colors, and all matching functions should be everywhere positive, so there is no such thing as subtractive matching in the XYZ model. The X and Z components capture the chromaticity of the light, and the points (0, 0, 1), (0, 1, 0), and (1, 0, 0) are imaginary primaries. All values of X, Y, and Z range from zero to infinity. The color matching functions for XYZ are shown on this slide.

To visualize all possible colors in the XYZ model, we use the normalized values x and y, where x = X/(X+Y+Z) and y = Y/(X+Y+Z). By varying x and y in the range from zero to one, we can build a chromaticity diagram, also called the color gamut. From this gamut, we can make several observations. First, single-wavelength light sources lie on the bounding curve of the gamut. The bottom line segment of purple colors corresponds to colors that are unachievable by single-wavelength light sources; so there are some colors that humans can see only if they are produced by a light source with a range of wavelengths, not by a laser. And from the shape of the gamut, we can see that no three real primaries can cover all visible colors: any three real primaries span a triangle inside the gamut, and the curved boundary always leaves some visible colors outside that triangle. So all linear models will either have imaginary primaries or not cover all visible colors. On the slide, I display several variants of the RGB model. The main difference between the models is the selection of primaries, and you can see the difference in color representation between them: depending on the selected primaries, a model covers a different area of the gamut.

There is one more important property of light perception: the non-uniformity of intensity perception. On the slide there are two sets of samples: the first with physically uniform brightness steps, and the second with perceptually uniform brightness steps. For the physically uniform samples, you can see that the difference between black and dark gray is very significant to a human observer, much more significant than the difference between the bright samples on the right. From this we can make the following observation: it is ineffective to store brightness linearly, because our eyes distinguish differences in dark areas much better than differences in bright areas. So we can apply a gamma transformation and store gamma-corrected values instead of linear RGB values.

Storing gamma-corrected values is the main idea behind the sRGB model. sRGB is the standard RGB model used in high-definition television, digital cameras, displays, et cetera. To obtain sRGB values from the XYZ model, we perform two steps: first, a linear transformation from XYZ coordinates to linear RGB coordinates, and then a per-channel non-linear transformation. If we want to estimate light intensity, or brightness, from sRGB values, we apply these two transformations in reverse: first the inverse non-linear transformation to obtain linear RGB values, and then the linear transformation from linear RGB to XYZ; the Y value gives us the brightness of the color.

There are other types of color models, for example the YIQ model, which was first introduced for analog color TV. This is also a linear model with a separated brightness component: Y corresponds to the brightness component, and I and Q carry the chromaticity information.
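Before continuing with YIQ, here is a minimal Python sketch of the sRGB-to-brightness computation just described. It assumes the standard sRGB transfer function and the luminance (Y) row of the standard linear-sRGB-to-XYZ matrix for the D65 white point; the function names are mine.

```python
import numpy as np

def srgb_to_linear(c):
    """Invert the per-channel sRGB non-linearity (input in [0, 1])."""
    c = np.asarray(c, dtype=np.float64)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def srgb_to_luminance(rgb):
    """sRGB triple in [0, 1] -> CIE Y (relative luminance)."""
    r, g, b = srgb_to_linear(rgb)
    # Middle (Y) row of the standard linear-sRGB -> XYZ matrix for D65.
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

print(srgb_to_luminance([1.0, 1.0, 1.0]))  # white -> 1.0
print(srgb_to_luminance([0.5, 0.5, 0.5]))  # mid-gray code -> ~0.21, not 0.5
```

Note that the stored value 0.5 corresponds to only about 21% of the luminance of white: this is exactly the non-uniform perception that gamma encoding exploits. For YIQ, the brightness component is computed in the same spirit but with the older NTSC weights, approximately Y = 0.299R + 0.587G + 0.114B.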
Because brightness in YIQ is separated, we can transmit it over the grayscale channel, while the color information is transmitted over additional bandwidth. This allows us to broadcast a single signal that can be received both by grayscale TVs and by color TVs. The YIQ model is used in the NTSC standard; the PAL standard uses a slightly different model.

Now we can understand how to obtain color images. To obtain a color image, we need to measure the R, G, and B values at each point of the optical image. This can be done, for example, by taking three pictures of the same scene with red, green, and blue filters. In Russia, the history of color photography was started by Sergei Prokudin-Gorsky. He built a special projector to display color photographs from three separate images of the red, green, and blue channels. He also received a grant from the last Russian emperor, Nicholas II, to make a color gallery of the Russian Empire. You can see images reconstructed from the photos of Sergei Prokudin-Gorsky at the link displayed on the slide.

In digital photography, instead of taking three images with different filters, we can use another technique, called the Bayer pattern. In this case, each pixel of the sensor has its own filter, so in each pixel we obtain the value for only one color channel, and the values for the other color channels are missing and need to be reconstructed. We can interpolate the values of the missing channels using information from neighboring pixels; this procedure is called demosaicing. Demosaicing can lead to image artifacts, as demonstrated on the slide. For example, fine black-and-white details can be interpreted during demosaicing as changes in color rather than as small details, and we get rainbow colors in the image. To solve this problem, specific demosaicing techniques should be used, and they are actually implemented in most modern cameras. So even this simple procedure of taking color images is not so simple.

Now we can understand what a digital color image is. A digital color image is a three-dimensional array where each pixel at coordinates (x, y) is represented by a color vector of length C, where C is the number of channels. Usually the number of channels is three, for example for the RGB model, the YIQ model, or some other model. But the number of channels can be different: for example, we could try to store a full description of the spectrum, and then the vector would be very large. Usually, though, we have only three channels. Usually, each component is discretized to the interval from 0 to 255 and stored in one byte. This is usually enough for recognition purposes. What if you want more precision in each channel? You can use 10-bit or 16-bit words and obtain better accuracy in color representation, but for recognition this is rarely required, so we will assume that most of the images we work with have three channels of eight-bit words.
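As a hedged illustration of the demosaicing step described above, here is a minimal bilinear-interpolation sketch for an RGGB Bayer layout (the layout, the function names, and the plain averaging are my assumptions for illustration; real cameras use more elaborate, edge-aware methods precisely to suppress the rainbow artifacts mentioned earlier):

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear(mosaic):
    """Bilinear demosaicing of an RGGB Bayer mosaic (H x W array in [0, 1])."""
    h, w = mosaic.shape
    # Masks marking which sensor pixels carry which channel (RGGB layout).
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1
    g_mask = 1.0 - r_mask - b_mask

    # Classic bilinear kernels: at a sampled pixel they return the sample itself;
    # elsewhere they average the nearest samples of that channel.
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0

    out = np.zeros((h, w, 3))
    for c, (mask, k) in enumerate([(r_mask, k_rb), (g_mask, k_g), (b_mask, k_rb)]):
        out[:, :, c] = convolve(mosaic * mask, k, mode='mirror')
    return out
```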
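And a tiny, self-contained sketch of the storage convention from the last paragraph (variable names are mine): quantizing each channel to one byte gives exactly the H x W x 3 eight-bit array we will assume from now on.

```python
import numpy as np

rgb = np.random.rand(8, 8, 3)                        # float channels in [0, 1]
img = np.clip(np.round(rgb * 255), 0, 255).astype(np.uint8)
print(img.shape, img.dtype)                          # (8, 8, 3) uint8
print(img[0, 0])                                     # one pixel: three one-byte components
```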