Hello. So let's finish our discussion of Hamming encoding for SEC-DED. As we finished up last time, we talked about the four basic rules. Either there's no error; there's a single-bit error that we can correct; there's a parity error for the entire encoded word of distributed parity and data bits, and we can just recompute that; or there's been a double-bit error, and we're going to fail safe and allow the redundancy in our high-availability, high-reliability design to recover us, essentially. The only unknown is if we have a triple, quadruple, or even higher number of errors beyond a double bit, which we showed before is highly unlikely because it requires three SEU hits in the same word, so in a very small area. It's like hitting three holes-in-one in a row on a par-five hole. Just not likely to happen, even with a fantastic golfer. I don't know what it is that always causes me to think of golf analogies. Or hitting the golf ball at the chain-link fence, as I described before. We just assume that it's going to be one of these four rules. If it's not, then we've probably already had a totally non-recoverable triple fault. Our system design criteria are such that we want to recover from single faults and fail safe on double faults. For triple faults, we just assume there's no way to recover. Let's look at a specific case. In this case there are no errors. We write data through the ECC logic into the ECC memory. Here I've simplified it to just be eight bits instead of 32 bits. This scales out to any size word you want, generally, but this just makes it easier to show you exactly what's happening. Then I'll show you at the end how it scales out to a 32-bit word. There's an Excel model that goes with this. If you click on the link in the slides, you can get the Excel model. I'll put it in the resources as well. This is read-after-write: we first write the data and then we're going to read it.
We're going to see that there are no errors, and that fulfills the first rule: the check bits are zero. The check bits are also formally called the syndrome. I labeled it SYN here, but I just call each one of these bits a check bit. They would be in a register, not stored in memory, because they're just computed as part of the scratch-pad logic here. If the syndrome is zero and the parity pW2 agrees with the original parity pW, which is in this column (this column agrees with this column), then we know there are no errors. It could also be some really strange triple, quadruple, or higher number of bit errors that just happens to look like no error. It's possible that it could be some really, really unlikely false negative, but we're not going to worry about that, as I said. It turns out that for eight data bits, with 13 total bits, there are 8,192 different scenarios as far as bit flips. We could have no error, like we're looking at here; that's one case. Next, we could have 13 SBEs: any one of these bits, zero through twelve, can flip. Then it turns out we could have 78 DBEs. How do I know that? It's just 13 choose 2, a standard probability computation. That leaves 8,100, which is the remaining MBEs. I guess if we really wanted to, we could test all those MBE cases and plot out the false positives and false negatives we might get. But like I said, a triple fault is extremely unlikely. This is the no-error case. Let's see how we compute this. Here are our data bits. This data we actually write from one of our registers. The distributed parity is sitting there in memory; we don't care what it is to start with, because we're going to compute it anyway. We take these data bits and we look across a row in our scratch-pad space. This is our digital logic in here. We take fields which are basically one bit wide, every other bit, starting with the first data bit.
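As a quick sanity check on those counts, here's a small sketch (a Python stand-in, since the spreadsheet itself isn't reproduced here) of the scenario bookkeeping for a 13-bit stored word:

```python
import math

# Flip scenarios for the 13 stored bits (8 data + 4 distributed parity + pW).
total = 2 ** 13                       # 8192 possible error patterns
no_error = math.comb(13, 0)           # 1: nothing flipped
sbe = math.comb(13, 1)                # 13 single-bit errors
dbe = math.comb(13, 2)                # 78 double-bit errors (13 choose 2)
mbe = total - no_error - sbe - dbe    # 8100 multi-bit patterns left over

print(no_error, sbe, dbe, mbe)        # 1 13 78 8100
```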
We compute p01 across that single-bit-wide field, every other bit. If we look across all the bits out to the last data bit, we see that we have zero here. We have an odd-parity protocol: we only set the bit if the count is odd, and we leave it at zero (or set it to zero, since we don't know what was there) if it's even. That takes care of that row. On the next row, for p02, we compute it based on fields of two bits, starting two bits over from pW: do two bits, skip two bits, do two bits, skip two bits. You're probably already catching on: the fields get twice as big, skip twice as much, and start that distance over from pW. This is a positional encoding strategy. It's still pretty magic to me that Hamming came up with this. I always like to imagine how people come up with something when there's such a clever approach taken, but I honestly don't know; I haven't read the original paper or anything like that. The next row would be four bits over, every bit field four long, skip four, and you just keep going. If you run off the end, you just fill in the portion that fits in your word and you don't have to worry about the rest. We look across each row: here on this one it's even, so zero. Here it's odd, so it's one. Then we do every eight, eight over, so 1, 2, 3, 4, 5, 6, 7, 8; again we ran off the end, but we don't care. We just look at what's there, and it's a one, so we fill in the one. These then, as the arrows indicate, come straight down in each case into the encoded data. This is what we're actually going to store. It's going to take us 13 bits to store eight data bits, because we're going to include this distributed parity. Now, what's interesting is that we then compute another parity, a parity for the encoded data, not including the parity bit itself. Here we count: 1, 2, 3, 4, 5.
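That field pattern (fields of width 2^k, skipping 2^k, starting 2^k positions over) is equivalent to saying that the check bit at position 2^k covers every codeword position whose binary representation has bit k set. A minimal sketch, assuming the conventional 1-indexed layout with check bits at positions 1, 2, 4, 8:

```python
def covered(k, n=12):
    """Positions checked by the distributed parity bit at position 2**k."""
    return [pos for pos in range(1, n + 1) if pos & (1 << k)]

# p01 covers every other position, p02 covers runs of two, and so on.
print(covered(0))   # [1, 3, 5, 7, 9, 11]
print(covered(1))   # [2, 3, 6, 7, 10, 11]
print(covered(2))   # [4, 5, 6, 7, 12]
print(covered(3))   # [8, 9, 10, 11, 12]
```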
Five ones: that's odd parity, so we set this bit to indicate that, and it gets stored as well. If you remember back to here, this is the parity out here around the whole thing. Now that's what you wrote in there. It gets stored, and we can leave it there for a really long time, until we read it again. Maybe this location gets read because the code requires it, or maybe it's not going to get read at all for a really long time, but the scrubber gets to it. At some point later in time, someone finally reads it, either the scrubber or code that needs the data. When it's read, we go back through the internal logic. On the read, we're going to check, but there's something special here: instead of the parities being put into the working register we have here, we now have bits in the distributed parity as well as bits in the data, sitting at rest in memory. We take the same field logic, every other bit, and we recompute this parity value here. This is across this row. We see that it's even, so this is a zero. This one is even; every two, looking at two, it's zero. This one is every four, and we look across, so it's a one. Now this one is essentially every eight, and it's a one, so it's a one here as well. Now, these check bits only get set if the recomputed parity is different from what it was supposed to be. This one is the same, as expected, so that's a zero. This is as expected, so that's a zero. This is as expected, so even though it's a one, the check bit is a zero. This is as expected, so even though it's a one, it becomes a zero. These other ones we're not using, but I put them up there in case we want to go out to a bigger word size. We got zero check bits. Then when we compute parity on this whole thing again, we compute parity over p01 out to d08, not including pW, and we get 1, 2, 3, 4, 5 ones. That's a one, and that's a one, and they match. That means there are no errors. That's the first case.
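Here's a minimal sketch of that encode step in Python (the lecture's own tooling is the Excel model and a C emulator; this is just an illustrative stand-in). It assumes the conventional layout: check bits at positions 1, 2, 4, 8 of a 12-bit word, data bits filling the remaining positions, plus the overall parity pW over all 12 encoded bits:

```python
def hamming_encode(data):
    """Encode 8 data bits into a 12-bit Hamming word plus overall parity pW."""
    n = 12
    word = [0] * (n + 1)                     # 1-indexed; word[0] unused
    # data goes in the non-power-of-two positions: 3,5,6,7,9,10,11,12
    data_pos = [p for p in range(1, n + 1) if p & (p - 1)]
    for pos, bit in zip(data_pos, data):
        word[pos] = bit
    for k in range(4):                       # check bit at 2**k is the parity
        p = 1 << k                           # of positions with bit k set
        for pos in range(1, n + 1):
            if pos & p and pos != p:
                word[p] ^= word[pos]
    pw = 0
    for pos in range(1, n + 1):              # whole-word parity pW
        pw ^= word[pos]
    return word[1:], pw

enc, pw = hamming_encode([1, 0, 1, 1, 0, 0, 1, 0])
print(enc, pw)   # [1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0] 0
```

The sample data word here is an arbitrary choice for illustration, not the exact word from the spreadsheet.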
Now, let's make it more interesting. Let's see an error. Here's a read-after-write where there's a data SBE. We can have an SBE in a parity bit or a data bit. That's a requirement: if we couldn't detect errors in the parity bits, that would also be a problem. We can get a single-bit error anywhere in the encoded data; the whole length is data bits, distributed parity bits, and the parity for the entire word, and we need to handle all those cases. Otherwise it's not really a good SEC-DED code. The syndrome is now going to encode the position of the flip. What I did (and this is an active spreadsheet that you can play with, by the way; it actually updates, and you can click on it and use pull-down menus to change a bit) is flip this bit from a one, which it was originally up here, to a zero. When I flipped that, you see these fields immediately update in the interactive spreadsheet, because now, if we look across this single-bit field, every other bit, we have a one here instead of a zero. It was supposed to be a zero and it's a one, so we set that check bit to say, "oops, those fields have an error." Over here, we check it, and it was even parity, so it's okay. We check this one, and it's even parity, but it was originally odd parity, so that's an error, and I'm marking it with red. We check this last one, and it was odd parity and was originally odd parity, so it's okay. Now, what's magic about this is that this is two to the zero, which is one, and this is two to the two, which is four. Four plus one is five, and guess what? It was bit position five that we flipped. The syndrome gives the position of the bit that flipped on a single-bit error, and we know that the check bits are not zero and that the two parities, the original parity and pW2, do not agree. We can check that. Well, we know that's going to be true because we had an odd number of bit flips anyway.
We have one, two, three, four ones, and that's going to be even. But originally we had odd parity, because this was a one: we had one, two, three, four, five bits set, and that gave us the odd parity. Now one of them flipped to a zero, so the count became even and pW2 became a zero, but the original was a one, and there's your disagreement. That's the single-bit error with a data bit that flipped. Study through this and use the spreadsheet interactively, and I think you'll be able to convince yourself that it works. Double-bit error: that's the next case. Each time it's a different rule, and for the double-bit error we can pick any two bits we want; I picked two data bits. The syndrome is not equal to zero, but it's ambiguous. Unfortunately, it doesn't tell us which bits to flip back. It looks like it does, but we don't really know, and we'll talk about that. I do the same encoding that I always do, and then I compute the parity on the whole thing here, the same way with the bit fields, so I won't trace through that. Now I come in here, and I decided to flip bit position nine and bit position five, indexed from zero. That's going to cause errors going across, because this changed in this bit field here, so that would obviously show an error, and this changed this bit field here, so that would obviously show an error. But note that it didn't change anything in this field over here, because both flips fall in this field: one and one, two flips, so there was no net change. That's interesting. The syndrome value comes out to 12, and that's interesting because I guess it would indicate that maybe it's this bit out here that's in error. That might be one interpretation, but that's definitely not the one in error. It could indicate something different, like maybe four plus eight or two plus ten or something like that, but that's certainly not the error either. It's totally ambiguous, or just wrong, might be the other way of thinking of it.
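One compact way to see both of these cases at once: with the positional layout, the syndrome works out to the XOR of the 1-indexed positions of all the set bits, which is zero for a valid codeword. Flip one bit and the syndrome is exactly that bit's position; flip two and you get the XOR of the two positions, nonzero but ambiguous. A sketch, using a hand-encoded 12-bit word (an assumption for illustration, not the exact word in the spreadsheet):

```python
from functools import reduce

def syndrome(bits):
    """bits[i] holds codeword position i+1; XOR the positions of set bits."""
    return reduce(lambda s, pos: s ^ pos,
                  (pos for pos, b in enumerate(bits, start=1) if b), 0)

codeword = [1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0]   # a valid Hamming(12,8) word
print(syndrome(codeword))        # 0: no error

sbe = list(codeword)
sbe[5 - 1] ^= 1                  # flip position 5
print(syndrome(sbe))             # 5: syndrome names the flipped position

dbe = list(codeword)
dbe[5 - 1] ^= 1                  # flip positions 5 and 9
dbe[9 - 1] ^= 1
print(syndrome(dbe))             # 12 = 5 XOR 9: nonzero but ambiguous
```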
But we know that the check bits won't be zero here, yet they can't be trusted, because pW equals pW2. We know we had an even number of bit flips; with the parities equal, even means either no flips or two flips, so it either has to be a double-bit error or nothing changed, and when the check bits aren't zero, something definitely changed. But we can't trust the position indicator, and it's really that simple. Let's go on to the next case. What if a parity single-bit error occurred, in one of the parity bits rather than a data bit? In this case, it's really the same. We're going to get check bits not equal to zero, and the parities disagree. We do the same encoding, but the syndrome is going to locate the error just as easily for a distributed parity bit as it can for a data bit. It turns out it's no different. Eight comes up with eight, and that is the position, as you can clearly see, so there's really no problem there. We computed everything exactly the same way. It's really no different from a data bit error, so I won't belabor it; I'll let you play around with that. There's one final case, the simple parity error. What if this pW that we compute, based on everything else here, is the thing that flips? What would happen is we would calculate pW2, and it would come out to a one while the stored bit flipped to a zero, so pW2 would not agree, but the check bits would be zero. In that case, that means it's a pW error, and we can just flip it or recompute it, and we're done. We've covered all the cases. We could have a data error (a d error), we could have a distributed parity error (a p-sub-n error), or we could have a pW error, and we could have the no-error case; and of course we could have an SBE and we can have a DBE. Then the MBE case is interesting: we really don't know what's going to happen there, because we could have a false positive or a false negative.
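Putting the four rules together, a decode step might look like the following sketch (again a hypothetical Python stand-in for the C emulator, using the same assumed 1-indexed layout with check bits at positions 1, 2, 4, 8):

```python
def secded_check(enc, pw):
    """Apply the four SEC-DED rules to a 12-bit encoded word + stored pW."""
    word = list(enc)
    syn = 0
    for k in range(4):                       # recompute each check field
        p = 1 << k
        parity = 0
        for pos in range(1, 13):
            if pos & p:
                parity ^= word[pos - 1]
        if parity:                           # field parity disagrees
            syn |= p
    pw2 = 0
    for b in word:                           # recomputed whole-word parity
        pw2 ^= b
    if syn == 0 and pw2 == pw:
        return "no error", word
    if syn != 0 and pw2 != pw:               # rule 2: SBE, syndrome = position
        word[syn - 1] ^= 1
        return "corrected SBE", word
    if syn == 0 and pw2 != pw:               # rule 3: pW itself flipped
        return "pW error", word
    return "DBE detected", word              # rule 4: fail safe, no correction

good = [1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0]  # assumed valid codeword, pW = 0
bad = list(good)
bad[5 - 1] ^= 1                              # single flip at position 5
print(secded_check(bad, 0)[0])               # corrected SBE
```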
We can't do anything about it anyway. The SBE we can correct, as we've said many times, using the position; the DBE we can detect but can't correct. And here, of course, we can both detect and correct. I've given an example of every possible case for the 1 + 13 + 78 = 92 cases where we know definitely what happened, matching one of the four rules, and then there are 8,100 cases of highly improbable triple-bit, quadruple-bit, etc. errors, which we just don't expect to happen. There's nothing we can really do about those without a much more advanced code. But why work on that if it's super unlikely to ever happen? If a single-bit error is one in a million, then the double-bit error is one over a million times a million, and the triple-bit error is one over a million times a million times a million. We're talking a double-bit error of maybe one in a trillion, and a triple-bit error of one in a million trillion; it's just not going to happen. There are probably other things we should be worrying about before we worry about that. How does this scale out to a nice 32-bit value? Just the same way. I didn't fill out all the bits here, but you can see the pattern. It starts one over and then goes every other one all the way across; it just keeps going, and certainly this is something that's very amenable to digital logic. I have a C code emulator for it that you can play with, that I wrote. Here it just goes every two, starting two over. If you think about it, what's really going on here is that we're using the field location and size and the intersection of the fields. Now that Wikipedia diagram starts to make more sense.
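The back-of-the-envelope numbers from a moment ago can be written down directly. This assumes, hypothetically, a one-in-a-million SBE chance and independent upsets, so the probabilities just multiply:

```python
# Rough likelihood argument: independent bit upsets multiply.
p_sbe = 1e-6           # assumed single-bit error probability (hypothetical)
p_dbe = p_sbe ** 2     # on the order of one in a trillion
p_tbe = p_sbe ** 3     # on the order of one in a million trillion
print(p_dbe, p_tbe)
```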
These fields intersect: these ones intersect right here, and there's a three-way intersection here. Those different intersections of the fields cause the syndrome, interestingly enough, to encode the bit position no matter how big we make this. The pattern of sub-fields, distributed and sized in a growing pattern, segments the entire word, including the distributed parity and the data bits, into sets whose intersections encode the location of the bit flip. I've tried to indicate that here, so we have p 0, 1, 2, 3, 4, 5, 6 across: you start out with a high density of parity bits at the beginning, and then they spread out as you go further and further out. That's the good news. That's how you get an information rate of slightly greater than 80 percent. You can make this even with 8-bit byte lanes, or just keep it in an 8-bit parity lane; I don't know what you'd do with the extra bit, maybe use it to encode some status or something like that. But you now have SEC-DED for a standard 32-bit word, and this can certainly be expanded out for 64-bit words. Hamming seems a little bit magic, but once you understand it, it's really just a sub-bit-field encoding scheme that computes the syndrome, as it's called, or the check bits, to tell you where the bit in error is, as well as detecting that there is an error. You can think of these as parities of growing subsections of the data word as you go across, and that's it. Thank you very much.
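As a closing footnote on that information-rate claim: the number of distributed parity bits needed for m data bits is the smallest r with 2^r >= m + r + 1, and a short sketch confirms the word sizes mentioned in the lecture (13 bits for 8 data bits, about 82 percent for 32):

```python
def check_bits_needed(m):
    """Smallest r such that 2**r >= m + r + 1 (the SEC Hamming bound)."""
    r = 1
    while 2 ** r < m + r + 1:
        r += 1
    return r

for m in (8, 32, 64):
    r = check_bits_needed(m)
    total = m + r + 1                 # data + distributed parity + overall pW
    print(m, r, total, round(m / total, 3))
# 8 -> 4 check + pW = 13 total; 32 -> 39 total (rate ~0.821); 64 -> 72 total
```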