In this session we will discuss an important and very useful class, GRanges, from the package called GenomicRanges. Let's load it.

GRanges is the constructor for a GRanges object, and a GRanges is essentially very similar to an IRanges, with some additional information having to do with chromosome and strand. Chromosomes in GRanges are called seqnames. Say we have the strands "+", "-", "+", and then the ranges, which we give as an IRanges with some start values. So it prints, and we can see we have three ranges there, on two different strands and on a single chromosome.

Strands in GRanges can have three different values. They can be plus or minus, which are the forward and the reverse strand. And then they can be star, which indicates either that the strand is unknown or that the entity is present on both strands.

A lot of the things we've learned about IRanges carry over directly to GRanges; we just have the additional bookkeeping of the chromosome and the strand. But there are a couple of new operations that have to do with the fact that things now have a direction because of the strand. We can do things such as get the flanking sequence of the GRanges, and you can see here that the flanking sequence is relative to the direction of transcription. On the positive strand the flank goes to the left; on the negative strand it goes to the right. There are other things, such as promoters, and a couple of other straightforward things there. By default, promoters gives a 2,200-base interval, where 200 bases are downstream and 2,000 bases are upstream of the transcription start site.

And a GRanges, since it has chromosomes, operates within a universe that is given by something called seqinfo, or sequence info. Right now we haven't given it a lot of information. It knows that there's a chromosome 1, but it doesn't know how long it is, it doesn't know whether it's circular, and it doesn't know what genome it is from. So can we, for example, give it a length? That is quite useful.
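The steps described so far can be sketched in R roughly as follows. This is a minimal sketch, not a record of what was typed on screen: the start positions, the width of 3, the flank width of 5, and the variable names `gr`, `up`, and `prom` are illustrative assumptions. Only the three strands, the single chromosome, and the default promoter window come from the talk.

```r
library(GenomicRanges)

# Three ranges on one chromosome and two strands.
# The coordinates (starts 1, 3, 5; width 3) are assumed for illustration.
gr <- GRanges(seqnames = "chr1",
              ranges   = IRanges(start = c(1, 3, 5), width = 3),
              strand   = c("+", "-", "+"))
gr

# Flanking regions are strand-aware: upstream of a "+" range lies to
# the left, upstream of a "-" range lies to the right.
# The flank width of 5 is an arbitrary choice.
up <- flank(gr, 5)
up

# Default promoter window: 2000 bases upstream and 200 bases downstream
# of the transcription start site, 2200 bases in total.
prom <- promoters(gr)
width(prom)

# The "universe" of the object: so far only the chromosome name is known;
# length, circularity, and genome are all missing (NA).
seqinfo(gr)
```

With coordinates this close to position 1, the flanks and promoter windows extend to zero or negative positions. That is harmless for a toy example but would not make sense on a real chromosome.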
So we're going to say chromosome 1 has a length of 10. And let's see it there. There. We can also see the different chromosome names, which are given by seqlevels.

And now that we have an end of the chromosome, we can talk about operations that are relative to the entire chromosome. An example is gaps. Gaps is a function that basically gives us all the stuff on the chromosome that is not covered by a range in the GRanges. Here we see a couple of things that may be surprising. The first is that the star strand is treated differently from the plus and the minus strand. There were no ranges originally on the star strand, so when we take the gaps, we get the entire star strand back. Furthermore, you can see that all of the ranges end at 10, because that's the end of the chromosome, so we know that's where it stops.

Let's try to expand this with a new chromosome. It seems straightforward that you should be able to do something like set seqnames of gr equal to chromosome 1, chromosome 2, chromosome 1. But we get an error, and the reason we get an error is that it has recorded that there's a single chromosome in this hypothetical organism. So we need to start off by either reconstructing the object, or we need to tell it that there are actually two different seqlevels. And now that it knows that the seqlevels can take two different values, we can assign a new vector of seqnames. Now we have two different chromosomes here.

Often we sort GRanges, and the sorting order seems sensible here, but it's actually relative to the order of the seqlevels. So here we have seqlevels that are chromosome 1 and chromosome 2. Let's reverse that, so we're essentially saying that chromosome 2 comes before chromosome 1, and then we're going to sort again [INAUDIBLE], and now the range on chromosome 2 is the first range that comes out.
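The sequence of operations just described (setting a length, taking gaps, adding a seqlevel, and re-sorting) might look like the sketch below. As before, the coordinates and the names `g`, `res`, and `sorted` are assumptions made for illustration. The failing assignment is wrapped in `try()` so the script keeps running.

```r
library(GenomicRanges)

# Same illustrative object as before (coordinates are assumed).
gr <- GRanges(seqnames = "chr1",
              ranges   = IRanges(start = c(1, 3, 5), width = 3),
              strand   = c("+", "-", "+"))

# Tell the object that chromosome 1 has length 10.
seqlengths(gr) <- c(chr1 = 10)
seqinfo(gr)
seqlevels(gr)

# Everything on the chromosome not covered by a range, computed per
# strand. The "*" strand had no ranges, so it comes back whole.
g <- gaps(gr)
g

# Per the talk, assigning an unknown chromosome name is expected to
# fail while "chr1" is the only seqlevel.
res <- try(seqnames(gr) <- c("chr1", "chr2", "chr1"), silent = TRUE)

# Declare the second seqlevel first, then assign the new seqnames.
seqlevels(gr) <- c("chr1", "chr2")
seqnames(gr)  <- c("chr1", "chr2", "chr1")

# Sorting follows the order of the seqlevels, so reversing them
# puts the chr2 range first.
sort(gr)
seqlevels(gr) <- c("chr2", "chr1")
sorted <- sort(gr)
sorted
```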
This is useful for things like chromosome 1 and chromosome 10, where in standard computer sorting chromosome 10 would come before chromosome 2.

We could also assign a genome to these things, so let's try to do that. Let's say that the genome of this GRanges is some string, let's say hg19. And we print it. We can see it says hg19, and it actually turns out, if you look at the seqinfo, that every single chromosome can have its own genome. That seems very esoteric, but it exists because, say, we sequence several organisms at once, or say we are spiking external DNA into our sequencing experiment. This external DNA is sometimes added as quality controls, or spike-in controls. These external pieces of DNA may come from different organisms. So it is at least possible that each sequence comes from its own genome. The usual case is that they all come from the same genome.

So why is this nice? Well, it is very nice to label your GRanges with a genome because it gives you a certain kind of protection. We are all used to dealing with data from different genome builds, and that is, I feel, a frequent source of errors in computational biology. But here there's some built-in protection. So let's make a copy of the GRanges and say that this new copy is actually on hg18. Now let's do something like a findOverlaps, which we haven't really discussed fully, but we'll do that in a different session. It works in the same way as for IRanges. And if we do that, we basically get an error saying that we're trying to do a findOverlaps between two GRanges and they have incompatible genomes. That's a really, really nice safety feature that I would encourage people to make use of.
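The genome-label safety check can be tried out with a sketch like the one below. The coordinates and the names `gr2` and `res` are again illustrative assumptions. The overlap call is wrapped in `try()` so that the expected error is captured rather than stopping the script.

```r
library(GenomicRanges)

# Same illustrative object as before (coordinates are assumed).
gr <- GRanges(seqnames = "chr1",
              ranges   = IRanges(start = c(1, 3, 5), width = 3),
              strand   = c("+", "-", "+"))

# Label the object with a genome build; the label is stored per
# sequence in the seqinfo.
genome(gr) <- "hg19"
seqinfo(gr)

# A copy that claims to be on a different build.
gr2 <- gr
genome(gr2) <- "hg18"

# Comparing ranges across incompatible builds is refused.
res <- try(findOverlaps(gr, gr2), silent = TRUE)
inherits(res, "try-error")
cat(res)  # shows the captured error message
```

Labelling objects this way costs one line when they are created, and it turns a silent mix-up of genome builds into an immediate error.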