Okay. So this is the last of the lecture slide sections for this week, and I will be talking more about cytometry data analysis. There are many software packages available to do cytometry data analysis. Until a few years ago, many of the decent ones you had to pay for, and they were actually quite expensive, but there's now one available for free called Cytobank, which is actually quite good. It's a web-based program accessible by the website here. We're gonna show you some examples of using Cytobank in some of the lab-based videos. And it's what we've transitioned over to using in my lab. Essentially, it allows you to analyze both mass cytometry and flow cytometry data. And it's cloud-based, so you can upload your data and then share it with many people. It's accessible. People can see exactly how you analyzed your data, so you can share those sorts of things. So it's really a nice piece of software that's readily available, and it's worthwhile getting to know it and learning how to use it. So with that being said, really the most critical aspect of analyzing any flow cytometry data set is gating. And I briefly alluded to that in the first lecture, when I was talking about fluorescence barcoding and defining some of these regions of different fluorescence intensity in your different fluorophore channels as a gate. A gate is just defining a subpopulation of events or cells on which you can then do further analysis. The most common gate to start with in a flow cytometry experiment is one based on light scattering properties, okay? So forward scatter versus side scatter. And you saw this picture before, when I was talking about how you can identify different cell types based on their light scattering properties. But even if you just have a homogeneous cell population from a cell culture, not all of the events that go by the laser are going to be cells. There's junk that just gets into your sample. 
Maybe bacteria, maybe yeast or other fungi, maybe just dirt, other small particles, etc. And you don't want to analyze that. You only want to analyze what are highly likely to be cell events. And the way that you do that is by looking for events with a certain combination of forward and side scatter. So debris typically shows up in the bottom corner here, in forward versus side scatter. These particles are typically small and they don't have a lot of granularity inside of them, so they show up down here. So if you only want to analyze cells, then you can draw a region around the rest of your events, kinda like this. Different people do it in slightly different ways, but as long as you're getting the events that have reasonably large forward scatter and side scatter, you can create a gate which separates the debris, or at least what's very likely to be debris, from those things that are very likely to be cells. Once you do that, you can do a whole range of other analyses. And before I talk about one of those other analyses, I'd just like to illustrate why a slower flow rate is usually better. This has to do with a phenomenon called doublets, which we can also gate out. Once we identify what we think are cells, we want to figure out which are actually highly likely to have been single cells passing by the laser beam, as opposed to those that are perhaps two cells stuck together, a so-called doublet, or maybe a clump of cells, or, at higher flow rates, maybe two cells that happen to be passing through the laser beam at the same time. So this kind of illustrates why lower flow rates are better. You're more likely to get single cells passing by the laser beam, because the hydrodynamic focusing is much better behaved at these lower flow rates than it is at the higher flow rates here, where the stream is larger and you can actually have two or more cells going by the laser beam at the same time. 
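The scatter gate described above, the one that separates likely cells from likely debris, can be sketched in a few lines of Python. This is only a minimal illustration, not how Cytobank or any other package implements it: the event values, the threshold names FSC_MIN and SSC_MIN, and the rectangular shape of the gate are all assumptions made for the example. In practice the region is drawn by eye on the forward versus side scatter plot.

```python
# Hypothetical events: (forward scatter, side scatter) in arbitrary units.
events = [
    (12_000, 3_000),    # small, low granularity: likely debris
    (85_000, 40_000),
    (92_000, 55_000),
    (8_000, 1_500),     # likely debris
    (110_000, 62_000),
]

# Assumed thresholds; a real gate is usually a hand-drawn region on the plot.
FSC_MIN = 30_000
SSC_MIN = 10_000


def is_cell(fsc, ssc, fsc_min=FSC_MIN, ssc_min=SSC_MIN):
    """Return True if both forward and side scatter reach the thresholds."""
    return fsc >= fsc_min and ssc >= ssc_min


# Keep only the events that fall inside the cell gate.
cells = [event for event in events if is_cell(*event)]
print(f"{len(cells)} of {len(events)} events pass the cell gate")
```

Everything downstream, such as doublet discrimination or intensity statistics, would then be run on the gated events only.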
So how do you gate for something like that? Well, there are actually many ways in which one can do this. A very common one can be understood if you remember what the pulse of an event looks like as it's seen on the detector. You have these different characteristics of the pulse, which correspond to the area, the height, and the width of the event. So if you look at different properties of this pulse, you would expect different characteristics depending on whether it was a single cell or a doublet going through. A very common approach, probably the most common one, is to look at area versus width. So if you plot area on the x axis versus width on the y axis, what you expect is for area and width to follow a fairly tight relationship here for singlets. And for doublets or higher multiplets, you have a much bigger width for similar areas. So you might create a gate like this that you would define as so-called singlets. You can also look at properties such as height versus area. And some people look at this in a fluorescence channel, when you have a fluorophore that's present in essentially all of the things that you expect to be cells. If you don't have that, you can also look at it in the side scatter channel. It's not quite as good, but it can be sufficient sometimes to eliminate doublets or higher-order cell clumps. An example of what happens when you clean up your data this way is shown here. In this case, somebody was doing cell cycle analysis and looking at the DNA staining histogram. If you just consider things that are gated to be cells, but not gated to be singlets, this is what the data look like. So you have your kinda traditional G0/G1, G2/M, and your S phase in between. But you also have these higher-order bits up here. 
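The area versus width singlet gate described above can be approximated very roughly in code. The sketch below is a simplification and not the gate drawn on the slide: it assumes that singlets make up the majority of events, takes the median pulse width as the typical singlet width, and excludes anything more than an assumed tolerance above it. The pulse values and the TOLERANCE constant are made up for illustration.

```python
from statistics import median

# Hypothetical pulse measurements: (area, width) in arbitrary units.
pulses = [
    (50_000, 100),
    (52_000, 102),
    (48_000, 98),
    (55_000, 180),   # much wider pulse for a similar area: likely a doublet
    (51_000, 101),
    (95_000, 195),   # likely a doublet
]

TOLERANCE = 0.25  # assumed: accept widths up to 25% above the median width


def singlet_gate(pulses, tolerance=TOLERANCE):
    """Keep events whose width is no more than `tolerance` above the median."""
    typical_width = median(width for _, width in pulses)
    cutoff = typical_width * (1 + tolerance)
    return [(area, width) for area, width in pulses if width <= cutoff]


singlets = singlet_gate(pulses)
print(f"{len(singlets)} singlets kept, {len(pulses) - len(singlets)} excluded")
```

A hand-drawn region on the area versus width plot can follow the singlet cloud more closely than a single width cutoff, so this is best read as a picture of the idea rather than a replacement for the gate.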
And one of the really important things in cell cycle analysis is that, if you have an event here that's in the so-called G2/M peak, unless you gate for singlets, you can't really be sure whether that event was one cell that really had G2/M DNA content, or whether it was two cells that each had G0/G1 DNA content. So when you clean up the data in that kind of a way, gating for singlets, you eliminate that problem, and you can see that a lot of this higher-order behavior here is gone. And actually the proportions between the G2/M and the G0/G1 peaks are different, and the data are much cleaner here. So those are two typical ways that are used to gate almost every flow cytometry experiment. First you gate for cells versus debris, using light scattering properties, and then you gate to separate singlets from doublets, which can be done in a variety of ways. But when you have lots and lots of parameters on a single-cell level, that kind of bivariate gating can exponentially explode when you have, say, 30 samples. So in a relatively recent publication, a very nice algorithm called SPADE was developed to do gating in a much more automated way, where instead of manually defining these gates, it does it algorithmically in a very high dimension. You'll see a little bit of the data analysis from the mass cytometry in the lab-based video, where you can use SPADE and these 10- to 20- to 30-dimensional measurements in order to define populations which then might have some biological relevance, for instance, certain types of T cells in a blood cell sample versus erythrocytes versus neutrophils, etc. But these are found in an automated way, so it really takes a lot of the subjectivity out of defining these gates. I mentioned compensation a few times in the first lecture. And if you remember, it's a technique that's used when you're using multiple fluorophores that may overlap in their emission spectra. 
So the problem with this is illustrated here on this slide. Let's say, for example, that we're using two fluorophores: one called FITC, which I explained before, a very common green fluorophore, and a fluorophore called PE, which is more red-shifted. If you look at the emission spectra of both of these fluorophores, there's some overlap here. So let's say we're exciting with a 488 nm laser, so we're exciting both with the same laser, and we look at this channel here, defined by this band-pass filter. We're gonna get a lot of fluorescence from the PE fluorophore, but we're also gonna be getting fluorescence from the FITC fluorophore. So for example, if we have a sample that's stained only for FITC and not stained for PE, and we plot the PE intensity, as defined by this channel here, we're actually gonna get some signal when there shouldn't be any signal at all. So the idea then is to apply this so-called compensation: to understand what fraction of the fluorescence in this channel is attributable to what I'm seeing in the FITC channel, and to subtract that out of this channel. So it's essentially just defining a percentage, and you can define this percentage in a so-called compensation matrix, which defines the crossover of your fluorophores into your channels. So when things are properly compensated, your data go from looking like this, with only FITC stain and no PE stain, to looking like this, where now if you're FITC positive, you really don't have any intensity in the so-called PE channel when you don't have PE staining. With all that being said, I'd like to just make a few notes about actually obtaining quantitative data from these types of experiments. Although fluorescence and these mass intensities are inherently quantitative, there are still some important considerations to take into account if you want to get good quantitative data out of these types of experiments. 
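Going back to the compensation idea above, the "percentage" can be illustrated with a small two-color sketch. It estimates the FITC-into-PE spillover fraction from a FITC-only control and subtracts that fraction of the FITC signal from the PE channel. The numbers are invented, the function names are hypothetical, and the sketch assumes one-way spillover with no background or autofluorescence. Real compensation handles all channels at once through the full compensation matrix.

```python
from statistics import median

# Hypothetical FITC-only control: (FITC channel, PE channel) intensities.
# Any PE-channel signal here is assumed to come from FITC spillover alone.
fitc_only_control = [(1000.0, 150.0), (2000.0, 300.0), (4000.0, 600.0)]


def estimate_spillover(control):
    """Estimate the fraction of FITC signal that appears in the PE channel."""
    return median(pe / fitc for fitc, pe in control)


def compensate_pe(fitc, pe, spillover):
    """Subtract the estimated FITC contribution from the measured PE signal."""
    return pe - spillover * fitc


spill = estimate_spillover(fitc_only_control)

# After compensation, the FITC-only control should have roughly zero PE signal.
compensated_control = [compensate_pe(f, p, spill) for f, p in fitc_only_control]

# A hypothetical doubly stained event: part of its PE signal is real.
corrected_pe = compensate_pe(2000.0, 800.0, spill)
print(f"spillover fraction: {spill:.2f}, corrected PE: {corrected_pe:.1f}")
```

With more than two fluorophores, the same idea becomes a linear system: each single-stained control fills in one row of spillover fractions, and the measured intensities are corrected using the inverse of that matrix.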
So, as you saw in the first few slides of this lecture, gating is extremely important, especially in a case like cell cycle analysis. Unless you really get a good gating for singlets versus multiplets, your ability to do quantitative analysis is very limited. So you have to have a very rigorous and good gating protocol for your experiments. And this can change depending on the types of samples that you have, the cell types, even between cell lines of similar cell types, etc. Compensation. So, if it's possible, it's best to choose fluorophores for which you do not need any compensation. And we'll be showing you an example of a protocol that we've developed in our lab, where we do four-color imaging but we don't really need any compensation, because the fluorophores that we choose are all essentially spectrally distinct. But sometimes you have no choice, and if that's the case, then you need to do compensation. So you've got to plan for those controls, do these singly labeled controls versus doubly labeled controls, and figure out what percentage of each fluorophore is showing up in the other channels. Okay, so if you're looking at DNA, of course, as I mentioned in the first lecture, you've got to look at that on a linear scale, not on a log scale. And for any parameter where you expect subtle variations instead of log-scale variations, you need to look at that on a linear quantitative scale instead of a log scale. For antibody-based quantitation, it's very important to do antibody titrations. So if you just incubate your cells with a certain concentration of antibody, you'll get a signal, and that signal will probably vary across the different cells in your population. But it's very hard to tell whether the signal that you get is being limited by the target that's in the cell, or by the antibody itself. 
It's very important for quantitative analyses to make sure that the cellular target is the limiting factor and not the antibody, because you don't want variations in antibody concentration from sample to sample to be affecting the magnitude of the measurement that you make. And, of course, as opposed to something like a Western blot, where you're separating everything in a cell by molecular weight, in this case the cell is in its native context, so you have lots of opportunity for nonspecific binding of these antibodies. So there are quite stringent requirements for the quality of the antibody, in terms of its specificity and its nonspecific binding to other things in the cell, which need to be evaluated before you can be sure that the quantitative data you get from the antibody is really usable. Sometimes a quantitative measurement is just the frequency of cells in different gates. For example, in a cell cycle analysis, you're not really concerned about the amount of DNA in each cell; you're rather concerned with the fraction of cells that have certain amounts of DNA. And of course this requires acquisition of a sufficient number of cells to be able to make those determinations, and a proper gating protocol to make sure that it's robust and gives you an adequate number of cells in each gate. Sometimes you're interested not in the variability across cells in a certain gate, or of a certain population, but in the average magnitude of staining within that gate. And often you get outliers, severe outliers, in flow cytometry, which may be real, but sometimes may be artifactual. And these outliers can sometimes very severely distort the average, the arithmetic mean, of the events in this cell population. So oftentimes in flow cytometry, the median is used as the measure of such a magnitude, rather than the mean. 
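To make the point about the median concrete, here is a small sketch with made-up intensities for the events in one gate. Seven events sit near 100 and one event is a severe outlier. The values are purely illustrative.

```python
from statistics import mean, median

# Hypothetical staining intensities for the events in one gate.
# The last value stands in for a rare, possibly artifactual outlier.
intensities = [100, 105, 98, 102, 101, 99, 103, 50_000]

gate_mean = mean(intensities)
gate_median = median(intensities)

print(f"mean:   {gate_mean}")    # pulled far above the typical event
print(f"median: {gate_median}")  # stays close to the typical event
```

A single outlier moves the arithmetic mean to well over sixty times the typical intensity, while the median barely changes.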
These outliers, even though they're pretty rare, do show up in pretty much every experiment when you're collecting thousands and thousands of events. So you need to be aware that they can severely distort the quantitative analysis of what's going on in such a case. So, that's all I have for flow cytometry, and now there are several lab-based videos that will follow this to show you what a flow cytometry experiment looks like, what a mass cytometry experiment looks like, and what some of the analyses are that you can do on the data you get off those machines.