OK, so now we have an idea for a different algorithm, and it's going to be based off of an approach called “branch and bound”. So let's learn a little bit about this and I'm going to give a really hazy, vague overview of what a branch and bound algorithm's idea is. So, you're going to start with a simple candidate solution and you're going to “expand” it or “branch” that out into different solutions, and then you might do this again. Then, after you've branched out into a number of possible solutions, or small seeds of larger possible solutions, then you're going to say, OK, can we exclude any of those? Can we eliminate any of these candidate solutions that we have? And that's called the “bound step”. So from here to here we've eliminated five solutions, and we're left with, these are not necessarily solutions, but they're at least candidates for solutions, at the leaves of this tree. Then we say, okay let's branch again, and then we'll need to bound. Based off of some condition, we'll exclude some of these edges when we move from branch to bound. And now we have, maybe some candidate solutions, and we can go ahead and test these. So we've found a way of eliminating certain solutions, before we construct them. So we want to apply the same idea to eliminating some candidate peptides, before they ever get generated. Let's apply it to cyclopeptide sequencing, then. Here's a possible spectrum. And our first question is, which amino acids have masses in the spectrum? So if we bring back this integer mass table, then we highlight, P has mass 97, and it occurs in the Spectrum, V has mass 99, it occurs in the Spectrum. We also get T and C. So we're going to start with four 1-mer peptides. We're going to start with just these single amino acids, and what we're going to do is, we're going to extend these four into all possible 2-mer peptides, all possible amino acid strings of length 2. So we get PA, VA, TA, and CA. We get PC, VC, TC, and CC, and so on. So, we started with four, and we're going to now multiply that by 18, because we're adding each possible amino acid onto each of these four and we're left with 72. So, we want to know how we can trim this list. So we've done our branch step, and now we need to do our bound step and bring it back. To do this trimming, we'll notice that PV is what's called “consistent” with Spectrum. Here's one possible 2-mer, and you'll notice that the mass of P is equal to 97, it’s present. The mass of V, the other letter in PV, is present as 99, and the mass of this entire small linear peptide, PV, is 196, and its also detected. So that's called “consistent”, if when you chop up this candidate solution, and weigh its fragments, that all those fragments are found in the spectrum. CD is one that we actually created, in the expand step, but it's inconsistent with this spectrum. So although 103, the mass of C, is present, is detected, 115 and 218, the mass of D and the mass of the entire 2-mer peptide CD, they're not found. You can't find 115 or 218 in this spectrum. So we want to exclude CD. After we do this step, and we figure out what it is that we need to trim we're left with ten amino acid strings of length two. From our 72, we've cut it down to ten. So let's do this exact same thing. Let's expand each of these in all 18 possible directions. We'll get now, 180 different 3-mer peptides, and then let's check whether or not each one is consistent, and we'll trim it down. After we do this, we go from 180 to 15, and we say, OK, this is going well, we're preventing these peptides from growing crazily. So we expand them, and then we'll trim. We go from 15 and do 15 times 18 to get 270. And then we find which of those 270 4-mers are consistent with the spectrum, and we're down to ten. And we do it one more time, expand and then trim, and we're left with ten consistent 5-mers. And you can verify that each of these corresponds to the exact same cyclic peptide, and the spectrum of this cyclic peptide coincides exactly with our experimental spectrum. So now we have a method that's going to work and it looks like it's going to be pretty efficient. I'll just summarize what it is. We find all the amino acids whose masses occur in the spectrum, and we add them to a list. Then we extend every peptide that's currently in the list in every possible direction. So we add each of 18 different amino acid masses, onto the end of it, and then we have this trim step, where we say, let's take which of these are inconsistent and let's just get rid of them. Then we'll return any peptides whose theoretical spectra match the experimental spectrum that we're given. And we'll do steps 2 through 4 over and over again, this branching and bounding and then testing, until we run out of peptides to consider, until we have all inconsistent peptides. We've seen it works really well on an example, but we ask ourselves whether or not this algorithm is efficient in practice? The brute force algorithm that we saw before, it was exponential, and by “exponential” I mean, if you look at its input size, the number of steps that it has to take or, in this case, the number of peptides that it has to consider, grows exponentially with respect to the input size. It may be true that the branch and bound algorithm that we've come up with is exponential on some dataset. Like, somewhere way out in the galaxy you can form some dataset that takes a really, really long time to run and is prohibitive. But in practice, it's actually really fast. So, it looks like we're done and, we can have a party and go home. That's the question though: can we really? And I'm going to have to pop that balloon and say “no”.