Okay, so we've seen that the cyclopeptide Tyrocidine B1 is not produced directly from the genome, so its sequence is not going to be hidden in the genome. We need to move beyond genome analysis to figure out what that sequence is. The workhorse tool of this approach is called a mass spectrometer. Here's a photo of one that UCSD uses, and we're going to think of it as an expensive molecular scale. Now, this raises an important question: how do we actually measure the mass of a molecule? We're going to define a Dalton as approximately the mass of a proton or a neutron. That's not an exact definition, but it's approximately what a Dalton is. The mass of a molecule is then approximately equal to the total number of protons and neutrons in all of the atoms of the molecule. So let's look at an example and compute the mass of Glycine. Glycine is an amino acid, and as a residue within a peptide it has chemical formula C2H3ON, all right? So it has 2 carbon atoms, and carbon has 12 protons and neutrons, so we do 12 times 2. Then we have 3 hydrogen atoms, each of mass 1, a single oxygen atom of mass 16, and a nitrogen atom of mass 14. When we add these together, 24 + 3 + 16 + 14, we get an approximate value of 57 Daltons. The actual mass is 57.02 Daltons, and getting that requires a more technically precise definition of what a Dalton is. But we're going to use this approximation rounded to an integer, and we're going to refer to it as the integer mass of Glycine. We can compute the integer mass of each amino acid, and this gives us what we call the integer mass table, all right? You'll notice we have the 20 amino acid masses beneath the one-letter abbreviation of each amino acid. So the question then is, what's the mass of something like Tyrocidine B1? We know the sequence of Tyrocidine B1, so we just need to go through and compute its mass as the sum of the masses of its individual amino acids.
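To make the integer mass table concrete, here is a minimal Python sketch of the sum just described. The table values are the standard integer masses of the 20 amino acids. The names `INTEGER_MASS`, `peptide_mass`, and `TYROCIDINE_B1` are made up for this illustration, and the sequence `VKLFPWFNQY` for Tyrocidine B1 is an assumption taken from outside this transcript, which only names V, K, and L explicitly.

```python
# Integer mass (in Daltons) of each amino acid, keyed by one-letter code.
INTEGER_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99,
    "T": 101, "C": 103, "I": 113, "L": 113, "N": 114,
    "D": 115, "K": 128, "Q": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def peptide_mass(peptide: str) -> int:
    """Return the integer mass of a peptide as the sum of its residue masses."""
    return sum(INTEGER_MASS[amino_acid] for amino_acid in peptide)


# Assumed sequence of Tyrocidine B1 (not spelled out in the transcript).
TYROCIDINE_B1 = "VKLFPWFNQY"

if __name__ == "__main__":
    print(peptide_mass("G"))  # Glycine on its own: 57
    print(peptide_mass(TYROCIDINE_B1))
```

A plain dictionary is enough here because the table is small and fixed. An unknown letter raises a `KeyError`, which is probably the behavior you want for malformed input.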
So V has mass 99, K has mass 128, L has mass 113, and so on. We simply add the constituent masses, and we get 1322. So that's the integer mass of this Tyrocidine B1 peptide. I'll note, before we continue, that two pairs of amino acids, I and L as well as K and Q, each have the same integer mass. So when we weigh the molecules, we're not actually going to be able to distinguish between I and L, or between K and Q. And so we're going to move essentially from an alphabet of 20 amino acids to an alphabet of just 18 integer masses. We'll work mainly with integers. Now, in this talk, I'll show the actual amino acids in terms of their letters, but if you're working with them on a computer, you probably want to just assume I and L are the same and assume K and Q are the same. So let me show you how this mass spectrometer works. You're going to take a sample containing multiple copies of this peptide, and you're going to plug it into the machine. The machine is going to blast those copies into a lot of different fragments, and then, essentially, it's going to weigh these fragments. So this is an ideal scenario where it gave us the mass of every possible subpeptide, or fragment, of the cyclic peptide NQEL. And I'll highlight the masses, because the machine doesn't tell us what the subpeptides are, it only tells us what their masses are. Notice that here, two of the subpeptides have the same mass, 242. So we're going to call this the theoretical spectrum. To get the theoretical spectrum of a cyclic peptide, you chop it up into all possible fragments, weigh those fragments, and then add 0, which is the mass of the empty string, and the mass of the entire peptide, which here is 484. So if you know the sequence of a peptide, computing its spectrum is just a trivial exercise. Chop it up into all pieces, use the integer mass table to figure out what the mass of each fragment is, and that's not bad. What about reversing it?
What about going from the spectrum to the peptide? That's really the computational problem that we have. The mass spectrometer generates a spectrum, an experimental spectrum, and then we want to determine the peptide that it came from. That's going to be a harder problem. We call this the Cyclopeptide Sequencing Problem: reconstruct a cyclic peptide from its theoretical spectrum.
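The forward direction, going from a known peptide to its theoretical spectrum, can be sketched in a few lines of Python. This is only an illustration under the definitions above, not the lecture's own code. The names `INTEGER_MASS` and `cyclic_spectrum` are hypothetical, and the sketch simply walks every start position and fragment length around the ring rather than doing anything clever.

```python
# Integer mass (in Daltons) of each amino acid, keyed by one-letter code.
INTEGER_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99,
    "T": 101, "C": 103, "I": 113, "L": 113, "N": 114,
    "D": 115, "K": 128, "Q": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def cyclic_spectrum(peptide: str) -> list[int]:
    """Return the sorted theoretical spectrum of a cyclic peptide.

    The spectrum holds the mass of every subpeptide of the ring,
    plus 0 (the empty string) and the mass of the whole peptide.
    """
    if not peptide:
        return [0]
    masses = [INTEGER_MASS[amino_acid] for amino_acid in peptide]
    n = len(masses)
    # Doubling the list lets fragments wrap around the end of the ring.
    doubled = masses + masses
    spectrum = [0, sum(masses)]
    for length in range(1, n):       # proper fragments only
        for start in range(n):       # every starting position on the ring
            spectrum.append(sum(doubled[start:start + length]))
    return sorted(spectrum)


if __name__ == "__main__":
    print(cyclic_spectrum("NQEL"))
    # [0, 113, 114, 128, 129, 227, 242, 242, 257, 355, 356, 370, 371, 484]
```

Note that 242 shows up twice, once for NQ and once for EL, which is why the spectrum is kept as a list with repeats rather than a set. Going the other way, from this list back to NQEL, is the Cyclopeptide Sequencing Problem.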