In this video, you'll learn about unpaired image-to-image translation. First, I'll compare paired image-to-image translation, which you saw with Pix2Pix, to the unpaired variant. Then you'll see how unpaired image-to-image translation works, where the model learns a mapping between two piles of images in different styles. It's really about finding the common content of these two different piles as well as their differences.

You've seen paired image-to-image translation before, and that's when you have a clear input-output pair between, let's say, edges and a realistic handbag. Because you can run an edge detector to go from a realistic image to those edges, you can get that paired dataset pretty easily. That's great when you have some algorithm to produce that paired training data, or you already have that paired training data for some reason. In unpaired image-to-image translation, it's not so straightforward. Let's say you want to change a horse into a zebra, or turn a photograph into a painting by Monet. This generation is much harder, because you probably did not start off with paired training data where, for every single photograph, you had something that looked like it was painted by Monet, or vice versa. There probably weren't thousands of training pairs there.

The difference between these two image-to-image translation tasks is that for one of them, you have these paired images. You pair up x_i and y_i, so here i is probably 0, 1, 2, and you have all these different pairs that match onto each other. But you don't necessarily have this correspondence all the time, and so in unpaired image-to-image translation, you actually just have two piles of images in two different styles, X and Y. One pile might be realistic photos; the other might be paintings by Monet, or Cezanne, or someone else. Or you can have a pile of winter-looking images versus summer scenes, or a pile of horses in one and zebras in another, where you don't have a one-to-one correspondence at all.

Using these two piles, X and Y, you want your model to learn the general stylistic elements of each pile and transform images from one into the other, and sometimes also vice versa. Here, that's photograph to Monet, and maybe you can also have photograph to Van Gogh, and vice versa, to Cezanne and Ukiyo-e. But what's key here is that you're making this photo of a poppy field look like it was painted by, say, Monet, but it's still a field of poppies in this Monet. There's still some type of content that is preserved; it's just the stylistic elements that are changed. That's pretty key in thinking about this translation task, because there are commonalities and there are stylistic differences unique to each pile, and you want to be able to tease out what's common, keep those common elements, and transfer only the elements that are unique to each pile.

Concretely, with zebras and horses here, you have a pile of zebras, and the horse you want to generate from this zebra image should still be in the same orientation; you just want those stripes to go away. The model's goal is to learn that mapping between these two piles of horses and zebras, and to figure out those common elements and unique elements. The common elements are often known as content: the content of the image is what's common to both of these piles. Style often refers to what is different between them. Again, content here is the general shape of the horse or zebra, and maybe even the background. The style is obviously the stripes on zebras, as opposed to a single color or less of a pattern on horses.
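To make that data-setup difference concrete, here's a minimal sketch, assuming PyTorch. The class names, file lists, and transform are hypothetical, and the point is only the structural difference: a paired dataset returns a matched (x_i, y_i) at index i, while an unpaired dataset just draws from two independent piles with no y that "belongs" to any x.

```python
# Minimal sketch (assumes PyTorch); names and paths are illustrative only.
import random
from PIL import Image
from torch.utils.data import Dataset

class PairedDataset(Dataset):
    """Paired translation (e.g. Pix2Pix): index i returns a matched (x_i, y_i)."""
    def __init__(self, x_paths, y_paths, transform):
        assert len(x_paths) == len(y_paths)  # one-to-one correspondence
        self.x_paths, self.y_paths, self.transform = x_paths, y_paths, transform

    def __len__(self):
        return len(self.x_paths)

    def __getitem__(self, i):
        x = self.transform(Image.open(self.x_paths[i]))  # e.g. an edge map
        y = self.transform(Image.open(self.y_paths[i]))  # its matching photo
        return x, y

class UnpairedDataset(Dataset):
    """Unpaired translation: two independent piles, no correspondence at all."""
    def __init__(self, x_paths, y_paths, transform):
        self.x_paths, self.y_paths, self.transform = x_paths, y_paths, transform

    def __len__(self):
        return max(len(self.x_paths), len(self.y_paths))

    def __getitem__(self, i):
        x = self.transform(Image.open(self.x_paths[i % len(self.x_paths)]))
        # y is drawn at random: there is no y_i that "belongs" to this x
        y = self.transform(Image.open(random.choice(self.y_paths)))
        return x, y
```

Notice that in the unpaired case, the model never gets a ground-truth target to compare its output against, so the mapping has to be learned from the overall statistics of each pile instead.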
You'll learn about how this is done in the following videos this week. In summary, unpaired image-to-image translation uses piles of differently styled images instead of paired images. The model learns the mapping between those two piles by keeping the content that is present in both, while changing the style that is different or unique to each of those piles.