0:00

Okay, now after, after going back and forth with the biologist we

have gone through the first two phases of algorithmic thinking,

which is understanding the problem and formulating it.

So we now understand what the problem is, what the biologist has a sequence,

a primary sequence of, which a string over a, a UCG.

The biologist would want, the, the secondary structure,

which is in the form of a set of pairs of indices of that,

of that sequence, such that they satisfy certain, certain constraints,

four of them, and it has the maximum number of such pairs.

And we define the problem cleanly now.

Right? So it is a problem we can actually, for

the time being, we can completely ignore the biology and

say it is a problem that is defined over strings.

Right?

Now the question is how do we solve this problem.

How do I find a set of pairs, of indices, over a, over a given RNA string.

That max, that, that is of maximum size, the set is of maximum size and

satisfies these properties.

Let's reason carefully about this.

OK? So I'm going to assume that I have a,

an RNA sequence, that has "N" letters in it.

Again, each one of these are, b i is in.

A, U, C, G, okay.

1:45

And I'm going to, introduce, I'm going to call that optimal solution,

remember that the optimal solution is the, is a set

of the maximum size in terms of the, in terms of the number of pairs then, okay?

So [COUGH] I'm going to introduce this nosh, notation of OPT.

Of i, j.

OPT of i, j is going to be the maximum

number of pairs of indices, the maximum numbers of pairs of indices that I

can put in the set that will satisfy the four requirements of, of infeasibility.

When I'm just looking at the, at the pa, the part of the sequence between i and j.

Okay?

Between positions i and j.

In particular, I am really interested in OPT of 1,n.

Okay? So if we focus on this OPT of 1, n is

a notation that's going to, that is going to be the number, the number of pairs in

the set with the maximum of pairs between that satisfies the four conditions.

When I look at the entire sequence from between 1 and n.

Again, let me repeat, OPT 1 of n in the maximum number of pairs,

of indices that I can match together to satisfy the four

feasible requirements in the feasibility of the problem.

When I am looking at the entire sequence.

Or, or in other words, this is the size of the,

the set that the biologist is interested in.

Okay?

Now I want to get, what is the maximum number of pairs that I can,

I can match when I am given this sequence?

Okay?

Now we can look at, let's look just at the very last letter.

Okay, that the, the letter at the very last position.

3:31

And let's ask the question of is this position,

is this letter, going to be matched with something, or

it's not going to be matched with anything, in the, in the solution.

There are only two possibilities for every letter here.

It's either part of the solution, or it's outside the solution.

Right?

So I can either match this with some element here, and

it is part of the solution.

Or I say the solution that has the maximum number of

pairs is not going to involve this one.

Right? It's not going to involve this one.

So I have two options here.

Either the position n, bn is part of the solution.

Or it's not.

4:28

Okay? This is the first case [SOUND] if

position n is not part of the solution.

Then I know that OPT of 1,

n then the, the maximum number of pairs when I am looking at the entire string is

going to be really the maximum number of pairs when I am looking at just 1 to n

minus 1 because this is not going to effect the maximum of pairs.

So whatever maximum I can get between 1 and n minus 1 is going to be

the same to the max between 1 and n, because this is not going to be involved.

Right?

So in this case, OPT of on n is really of OPT of 1.

6:04

If this is the case, if, if, if OPT(1,n) is going to have.

It's going to have a pair that has or the,

the optimal solution is going to have the pair t, n in it.

Then I can now continue to solve the problem by looking at

this part and looking at this part here.

6:29

And see what solutions, how do I continue to complete the solution by look,

by, by pairing things here and by pairing things here?

Why did I not consider pairing anything here with something here?

6:43

Because one of the feasibility conditions is that, is that condition of no crossing.

Right? If I have this,

I cannot have a pair crossing.

So that condition allows me that once I decide that the pair (t,n) is going to

be in the solution, it allows me now to break the problem into two subproblems.

This one, now looking at the positions t plus 1 to n minus 1 and

t minus 1 all the way to 1.

7:23

Whatever I can get by pairing things between 1 and t minus 1, and

whatever I can get in terms of pairing things between t plus 1 and n minus 1.

Okay?

So this really corresponds to the solution in this sub-space.

This corresponds to the solution in this subspace and

this one here is because of this sphere.

Okay.

7:55

Now there is nothing else.

Either.

Either position n is not part of the solution, in which case I say okay,

I cannot do anything about it, let me now try to solve the problem once again but

now on positions from 1 to n minus 1, or it is part of the solution,

it matches with some of position t, and if it matches with position t,

then the solution should have one, at least one peer.

Plus, I repeat the exercise, now by matching things between 1 and

t minus 1, and between t plus 1 and n minus 1, and nothing else.

I know that I shouldn't be matching anything before with

something after t now.

8:37

This position n could have matched something here,

could have matched something here, some match, something, and so on.

So t, t really ranges, this one here.

This should, this is t ranges between 1 and n minus 1,

and it has to satisfy the feasibility constraints, right?

So t has to be.

8:58

T is between 1 and smaller than n minus 4.

Right?

Because I cannot have t too close to 1 because of that and

it can be all the way to 1 here.

Of course, when I look at b, t and.

9:16

And, b, n, they have to satisfy the complementarity

that is if b, t is either A or, or, b,

t and b, n have to be either a, u and c, g and so on.

9:38

And I take the one that will maximize the, the, this number here.

Right?

So if we look at this solution now, what we are getting is a sa,

is an interesting formula that tells us that OPT of 1,

n is either OPT of 1, n minus 1, or 1 plus this.

Now which one of them it is?

It is the max.

If this gives me the maximum size for this, I will choose this.

Or in other words, if excluding position n from the solution will give me

a maximum number of pairs, I will exclude it.

If it doesn't, I will find that position t and match b, n with b, t.

Such that this is maximum.

Okay?

Now if you look at this carefully, here.

10:25

We are, we started from 1, n, but now I'm looking at,

in terms of 1, n minus 1, I'm looking at one t minus 1, and so on.

So this is why OPT of y, j.

Can be defined generally.

It doesn't have to be defined for 1 to n.

We can define it opt of i, j, which is, again, the maximum number of pairs we

can obtain by solving the problem on the string that's between positions i and j.

Okay?

And, again, in particular, when we solve it on the pairs 1 to n,

we are solving that entire problem.

But it's good to define this because we can make use of it here.

All right, so if I have this quantity and this quantity,

then this quantity can be obtained as 1 plus this plus this, okay?

So this all together, this all combined can basically

gives me the fact that if I replace the 1 and n by any i n g.

That opt of i, n,

j is the max of OPT i, j minus 1.

This is when position j is not part of the solution, and.

12:20

This will be the actual formula that we have, that opt of i, j, if the max of

two quantities, the first one is, well, position j is not part of the solution.

Second one, j is part of the solution and it matches with position t.

And what, what is shown here again is,

again from position i to position j, j, if j goes with t.

If j goes with t, then we have, all the way from i to t minus 1.

And then we have the sub solution of t plus 1 to j minus 1.

Okay?

T plus 1 to j minus 1 is solving this sub problem.

Opt of i, t minus 1 is solving this sub problem.

And the 1 is for this pair.

And as we said.

T has to have some constraints, so when I a range t goes between 1 and

j minus 1 with some prop between i and

j minus 1 with some properties that again it has to satisfy the no sharp turns,

it has to satisfy the condition that I am matching it with u and c with j and so on.

13:25

Now this is the mean,

the mean formula that will allow us now to solve the problem because if you

look at this, you are immediately seeing that if OPT is the name of a function.

This is nothing but

a recursive formula that can be turned into a recurves of algorithm.

So you want to, you want to write, you want to compute OPT of 1, n.

Just write the function that will call the function itself three times, and

take the max of these values.

And this will be a recursive algorithm for this problem.

Of course, it's giving us the number of pairs.

But with simple, bookkeeping here, or

simple traversal of the solution, we can get the actual pairs themselves.