0:00

Okay, now after, after going back and forth with the biologist we

Â have gone through the first two phases of algorithmic thinking,

Â which is understanding the problem and formulating it.

Â So we now understand what the problem is, what the biologist has a sequence,

Â a primary sequence of, which a string over a, a UCG.

Â The biologist would want, the, the secondary structure,

Â which is in the form of a set of pairs of indices of that,

Â of that sequence, such that they satisfy certain, certain constraints,

Â four of them, and it has the maximum number of such pairs.

Â And we define the problem cleanly now.

Â Right? So it is a problem we can actually, for

Â the time being, we can completely ignore the biology and

Â say it is a problem that is defined over strings.

Â Right?

Â Now the question is how do we solve this problem.

Â How do I find a set of pairs, of indices, over a, over a given RNA string.

Â That max, that, that is of maximum size, the set is of maximum size and

Â satisfies these properties.

Â Let's reason carefully about this.

Â OK? So I'm going to assume that I have a,

Â an RNA sequence, that has "N" letters in it.

Â Again, each one of these are, b i is in.

Â A, U, C, G, okay.

Â 1:45

And I'm going to, introduce, I'm going to call that optimal solution,

Â remember that the optimal solution is the, is a set

Â of the maximum size in terms of the, in terms of the number of pairs then, okay?

Â So [COUGH] I'm going to introduce this nosh, notation of OPT.

Â Of i, j.

Â OPT of i, j is going to be the maximum

Â number of pairs of indices, the maximum numbers of pairs of indices that I

Â can put in the set that will satisfy the four requirements of, of infeasibility.

Â When I'm just looking at the, at the pa, the part of the sequence between i and j.

Â Okay?

Â Between positions i and j.

Â In particular, I am really interested in OPT of 1,n.

Â Okay? So if we focus on this OPT of 1, n is

Â a notation that's going to, that is going to be the number, the number of pairs in

Â the set with the maximum of pairs between that satisfies the four conditions.

Â When I look at the entire sequence from between 1 and n.

Â Again, let me repeat, OPT 1 of n in the maximum number of pairs,

Â of indices that I can match together to satisfy the four

Â feasible requirements in the feasibility of the problem.

Â When I am looking at the entire sequence.

Â Or, or in other words, this is the size of the,

Â the set that the biologist is interested in.

Â Okay?

Â Now I want to get, what is the maximum number of pairs that I can,

Â I can match when I am given this sequence?

Â Okay?

Â Now we can look at, let's look just at the very last letter.

Â Okay, that the, the letter at the very last position.

Â 3:31

And let's ask the question of is this position,

Â is this letter, going to be matched with something, or

Â it's not going to be matched with anything, in the, in the solution.

Â There are only two possibilities for every letter here.

Â It's either part of the solution, or it's outside the solution.

Â Right?

Â So I can either match this with some element here, and

Â it is part of the solution.

Â Or I say the solution that has the maximum number of

Â pairs is not going to involve this one.

Â Right? It's not going to involve this one.

Â So I have two options here.

Â Either the position n, bn is part of the solution.

Â Or it's not.

Â 4:28

Okay? This is the first case [SOUND] if

Â position n is not part of the solution.

Â Then I know that OPT of 1,

Â n then the, the maximum number of pairs when I am looking at the entire string is

Â going to be really the maximum number of pairs when I am looking at just 1 to n

Â minus 1 because this is not going to effect the maximum of pairs.

Â So whatever maximum I can get between 1 and n minus 1 is going to be

Â the same to the max between 1 and n, because this is not going to be involved.

Â Right?

Â So in this case, OPT of on n is really of OPT of 1.

Â 6:04

If this is the case, if, if, if OPT(1,n) is going to have.

Â It's going to have a pair that has or the,

Â the optimal solution is going to have the pair t, n in it.

Â Then I can now continue to solve the problem by looking at

Â this part and looking at this part here.

Â 6:29

And see what solutions, how do I continue to complete the solution by look,

Â by, by pairing things here and by pairing things here?

Â Why did I not consider pairing anything here with something here?

Â 6:43

Because one of the feasibility conditions is that, is that condition of no crossing.

Â Right? If I have this,

Â I cannot have a pair crossing.

Â So that condition allows me that once I decide that the pair (t,n) is going to

Â be in the solution, it allows me now to break the problem into two subproblems.

Â This one, now looking at the positions t plus 1 to n minus 1 and

Â t minus 1 all the way to 1.

Â 7:23

Whatever I can get by pairing things between 1 and t minus 1, and

Â whatever I can get in terms of pairing things between t plus 1 and n minus 1.

Â Okay?

Â So this really corresponds to the solution in this sub-space.

Â This corresponds to the solution in this subspace and

Â this one here is because of this sphere.

Â Okay.

Â 7:55

Now there is nothing else.

Â Either.

Â Either position n is not part of the solution, in which case I say okay,

Â I cannot do anything about it, let me now try to solve the problem once again but

Â now on positions from 1 to n minus 1, or it is part of the solution,

Â it matches with some of position t, and if it matches with position t,

Â then the solution should have one, at least one peer.

Â Plus, I repeat the exercise, now by matching things between 1 and

Â t minus 1, and between t plus 1 and n minus 1, and nothing else.

Â I know that I shouldn't be matching anything before with

Â something after t now.

Â 8:37

This position n could have matched something here,

Â could have matched something here, some match, something, and so on.

Â So t, t really ranges, this one here.

Â This should, this is t ranges between 1 and n minus 1,

Â and it has to satisfy the feasibility constraints, right?

Â So t has to be.

Â 8:58

T is between 1 and smaller than n minus 4.

Â Right?

Â Because I cannot have t too close to 1 because of that and

Â it can be all the way to 1 here.

Â Of course, when I look at b, t and.

Â 9:16

And, b, n, they have to satisfy the complementarity

Â that is if b, t is either A or, or, b,

Â t and b, n have to be either a, u and c, g and so on.

Â 9:38

And I take the one that will maximize the, the, this number here.

Â Right?

Â So if we look at this solution now, what we are getting is a sa,

Â is an interesting formula that tells us that OPT of 1,

Â n is either OPT of 1, n minus 1, or 1 plus this.

Â Now which one of them it is?

Â It is the max.

Â If this gives me the maximum size for this, I will choose this.

Â Or in other words, if excluding position n from the solution will give me

Â a maximum number of pairs, I will exclude it.

Â If it doesn't, I will find that position t and match b, n with b, t.

Â Such that this is maximum.

Â Okay?

Â Now if you look at this carefully, here.

Â 10:25

We are, we started from 1, n, but now I'm looking at,

Â in terms of 1, n minus 1, I'm looking at one t minus 1, and so on.

Â So this is why OPT of y, j.

Â Can be defined generally.

Â It doesn't have to be defined for 1 to n.

Â We can define it opt of i, j, which is, again, the maximum number of pairs we

Â can obtain by solving the problem on the string that's between positions i and j.

Â Okay?

Â And, again, in particular, when we solve it on the pairs 1 to n,

Â we are solving that entire problem.

Â But it's good to define this because we can make use of it here.

Â All right, so if I have this quantity and this quantity,

Â then this quantity can be obtained as 1 plus this plus this, okay?

Â So this all together, this all combined can basically

Â gives me the fact that if I replace the 1 and n by any i n g.

Â That opt of i, n,

Â j is the max of OPT i, j minus 1.

Â This is when position j is not part of the solution, and.

Â 12:20

This will be the actual formula that we have, that opt of i, j, if the max of

Â two quantities, the first one is, well, position j is not part of the solution.

Â Second one, j is part of the solution and it matches with position t.

Â And what, what is shown here again is,

Â again from position i to position j, j, if j goes with t.

Â If j goes with t, then we have, all the way from i to t minus 1.

Â And then we have the sub solution of t plus 1 to j minus 1.

Â Okay?

Â T plus 1 to j minus 1 is solving this sub problem.

Â Opt of i, t minus 1 is solving this sub problem.

Â And the 1 is for this pair.

Â And as we said.

Â T has to have some constraints, so when I a range t goes between 1 and

Â j minus 1 with some prop between i and

Â j minus 1 with some properties that again it has to satisfy the no sharp turns,

Â it has to satisfy the condition that I am matching it with u and c with j and so on.

Â 13:25

Now this is the mean,

Â the mean formula that will allow us now to solve the problem because if you

Â look at this, you are immediately seeing that if OPT is the name of a function.

Â This is nothing but

Â a recursive formula that can be turned into a recurves of algorithm.

Â So you want to, you want to write, you want to compute OPT of 1, n.

Â Just write the function that will call the function itself three times, and

Â take the max of these values.

Â And this will be a recursive algorithm for this problem.

Â Of course, it's giving us the number of pairs.

Â But with simple, bookkeeping here, or

Â simple traversal of the solution, we can get the actual pairs themselves.

Â