Cyclic redundancy check
Table of Contents
- Overview
- The Basic Idea Behind CRC Algorithms
- Binary Arithmetic with No Carries
- Choosing A Poly
- A Straightforward CRC Implementation
- A Table-Driven Implementation
- A Slightly Mangled Table-Driven Implementation
- "Reflected" Table-Driven Implementations
- Initial and Final Values
- Example
- CRC Algorithms
- More
Overview
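A Cyclic Redundancy Check (CRC) computes a check code from a block of data. The check code is used to verify whether the data was altered or corrupted during transmission.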
The Basic Idea Behind CRC Algorithms
The basic idea of CRC algorithms is simply to treat the message as an enormous binary number, to divide it by another fixed binary number, and to make the remainder from this division the checksum. Upon receipt of the message, the receiver can perform the same division and compare the remainder with the "checksum" (transmitted remainder).
Binary Arithmetic with No Carries
Adding two numbers in CRC arithmetic is the same as adding numbers in ordinary binary arithmetic except there is no carry. This means that each pair of corresponding bits determines the corresponding output bit without reference to any other bit positions. For example:

 10011011
+11001010
 --------
 01010001
 --------
There are only four cases for each bit position:
0+0=0
0+1=1
1+0=1
1+1=0 (no carry)

Subtraction is identical:

 10011011
-11001010
 --------
 01010001
 --------

In fact, both addition and subtraction in CRC arithmetic are equivalent to the XOR operation, and the XOR operation is its own inverse.
Here's a fully worked division of the augmented message 11010110110000 (the message 1101011011 followed by W=4 zero bits) by the poly 10011:

            1100001010
       _______________
10011 ) 11010110110000
        10011,,.,,....
        -----,,.,,....
         10011,.,,....
         10011,.,,....
         -----,.,,....
          00001.,,....
          00000.,,....
          -----.,,....
           00010,,....
           00000,,....
           -----,,....
            00101,....
            00000,....
            -----,....
             01011....
             00000....
             -----....
              10110...
              10011...
              -----...
               01010..
               00000..
               -----..
                10100.
                10011.
                -----.
                 01110
                 00000
                 -----
                  1110 = Remainder
Thus we see that CRC arithmetic is primarily about XORing particular values at various shifting offsets.
Choosing A Poly
Choosing a poly is something of a black art, and the reader is referred to [1] (pp. 130-132), which has a very clear discussion of this issue.
Some popular polys are:
Name | Polynomial | Hex |
---|---|---|
CRC12 | x^12 + x^11 + x^3 + x^2 + x + 1 | 0x80F |
CRC16 | x^16 + x^15 + x^2 + 1 | 0x8005 |
CRC16-CCITT | x^16 + x^12 + x^5 + 1 | 0x1021 |
CRC32 | x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1 | 0x04C11DB7 |
A Straightforward CRC Implementation
            3   2   1   0   Bits
          +---+---+---+---+
 Pop! <-- |   |   |   |   | <----- Augmented message
          +---+---+---+---+

       1    0   1   1   1   = The Poly
To perform the division, proceed as follows:

Load the register with zero bits.
Augment the message by appending W zero bits to the end of it.
While (more message bits)
   Begin
   Shift the register left by one bit, reading the next bit of the
      augmented message into register bit position 0.
   If (a 1 bit popped out of the register during the shift)
      Register = Register XOR Poly.
   End
The register now contains the remainder.
A Table-Driven Implementation
The straightforward method operates at the bit level; it is rather awkward to code (even in C) and inefficient to execute (it has to loop once for each bit). To speed it up, we need to find a way to enable the algorithm to process the message in units larger than one bit.
For the purposes of discussion, let us switch from a 4-bit poly to a 32-bit one. Our register looks much the same, except the boxes represent bytes instead of bits, and the Poly is 33 bits (one implicit 1 bit at the top and 32 "active" bits) (W=32).
              3    2    1    0   Bytes
           +----+----+----+----+
  Pop! <-- |    |    |    |    | <----- Augmented message
           +----+----+----+----+

          1<------32 bits------>
Consider for a moment that we use the top 8 bits of the register to calculate the value of the top bit of the register during the next 8 iterations. Suppose that we drive the next 8 iterations using the calculated values (which we could perhaps store in a single byte register and shift out to pick off each bit). Then we note three things:
- The top byte of the register now doesn't matter. No matter how many times and at what offset the poly is XORed to the top 8 bits, they will all be shifted out of the left hand side of the register during the next 8 iterations anyway.
- The remaining bits will be shifted left by one byte position, and the next message byte will be shifted into the rightmost byte of the register.
- While all this is going on, the register will be subjected to a series of XORs in accordance with the bits of the pre-calculated control byte.
Perhaps you can see the solution now. Putting all the pieces together we have an algorithm that goes like this:
While (augmented message is not exhausted)
   Begin
   Examine the top byte of the register
   Calculate the control byte from the top byte of the register
   Sum all the Polys at various offsets that are to be XORed into
      the register in accordance with the control byte
   Shift the register left by one byte, reading a new message byte
      into the rightmost byte of the register
   XOR the summed polys to the register
   End
As it stands this is not much better than the SIMPLE algorithm. However, it turns out that most of the calculation can be precomputed and assembled into a table. As a result, the above algorithm can be reduced to:
While (augmented message is not exhausted)
   Begin
   Top = top_byte(Register);
   Register = (Register << 8) | next_augmessage_byte;
   Register = Register XOR precomputed_table[Top];
   End

The above is a very efficient algorithm requiring just a shift, an OR, an XOR, and a table lookup per byte.
In C, the algorithm main loop looks like this:
r=0;
while (len--)
   {
   unsigned char t = (r >> 24) & 0xFF;
   r = (r << 8) | *p++;
   r^=table[t];
   }

where len is the length of the augmented message in bytes, p points to the augmented message, r is the register, t is a temporary, and table is the computed table. This code can be made even more unreadable as follows (from here on, t denotes the table rather than the temporary):
r=0;
while (len--)
r = ((r << 8) | *p++) ^ t[(r >> 24) & 0xFF];
A Slightly Mangled Table-Driven Implementation
Despite the terse beauty of the above lines, those optimizing hackers couldn't leave it alone. The trouble, you see, is that this loop operates upon the AUGMENTED message and in order to use this code, you have to append W/8 zero bytes to the end of the message before pointing p at it. Depending on the run-time environment, this may or may not be a problem; if the block of data was handed to us by some other code, it could be a BIG problem. One alternative is simply to append the following line after the above loop, once for each zero byte:
(In other words, W/8 zero bytes still have to be fed in at the end.)
for (i=0; i<W/8; i++)
r = (r << 8) ^ t[(r >> 24) & 0xFF];
However, at the further expense of clarity (which, you must admit, is already a pretty scarce commodity in this code) we can reorganize this small loop further so as to avoid the need to either augment the message with zero bytes, or to explicitly process zero bytes at the end as above.
                    3    2    1    0   Bytes
                 +----+----+----+----+
        +-----<  |    |    |    |    | <----- Augmented message
        |        +----+----+----+----+
        |                  ^
        |                  |
        |                 XOR
        |                  |
        |     0+----+----+----+----+
        v      +----+----+----+----+
        |      +----+----+----+----+
        |      +----+----+----+----+
        |      +----+----+----+----+
        |      +----+----+----+----+
        |      +----+----+----+----+
        +----->+----+----+----+----+
               +----+----+----+----+
               +----+----+----+----+
               +----+----+----+----+
               +----+----+----+----+
            255+----+----+----+----+

Algorithm
1. Shift the register left by one byte, reading in a new message byte.
2. Use the top byte just rotated out of the register to index the table of 256 32-bit values.
3. XOR the table value into the register.
4. Goto 1 iff more augmented message bytes.
Now, note the following facts:
- TAIL (handling the zero bytes appended at the end): The W/8 augmented zero bytes that appear at the end of the message will be pushed into the register from the right as all the other bytes are, but their values (0) will have no effect whatsoever on the register because 1) XORing with zero does not change the target byte, and 2) the four bytes are never propagated out the left side of the register where their zeroness might have some sort of influence. Thus, the sole function of the W/8 augmented zero bytes is to drive the calculation for another W/8 byte cycles so that the end of the REAL data passes all the way through the register.
- HEAD (if the register starts at zero, the first four iterations merely shift those zeros out): If the initial value of the register is zero, the first four iterations of the loop will have the sole effect of shifting in the first four bytes of the message from the right. This is because the first 32 control bits are all zero and so nothing is XORed into the register. Even if the initial value is not zero, the first 4 byte iterations of the algorithm will have the sole effect of shifting the first 4 bytes of the message into the register and then XORing them with some constant value (that is a function of the initial value of the register).
These facts, combined with the XOR property
(A xor B) xor C = A xor (B xor C)
mean that message bytes need not actually travel through the W/8 bytes of the register. Instead, they can be XORed into the top byte just before it is used to index the lookup table. This leads to the following modified version of the algorithm.
      +-----<Message (non augmented)
      |
      v         3    2    1    0   Bytes
      |      +----+----+----+----+
     XOR----<|    |    |    |    |
      |      +----+----+----+----+
      |                ^
      |                |
      |               XOR
      |                |
      |     0+----+----+----+----+
      v      +----+----+----+----+
      |      +----+----+----+----+
      |      +----+----+----+----+
      |      +----+----+----+----+
      |      +----+----+----+----+
      |      +----+----+----+----+
      +----->+----+----+----+----+
             +----+----+----+----+
             +----+----+----+----+
             +----+----+----+----+
             +----+----+----+----+
          255+----+----+----+----+

Algorithm
1. Shift the register left by one byte. No message byte is read into the register any more, so zeros enter from the right.
2. XOR the top byte just rotated out of the register with the next message byte to yield an index into the table ([0,255]).
3. XOR the table value into the register.
4. Goto 1 iff more message bytes.
This is an IDENTICAL algorithm and will yield IDENTICAL results. The C code looks something like this:
r=0;
while (len--)
r = (r<<8) ^ t[(r >> 24) ^ *p++];
"Reflected" Table-Driven Implementations
DEFINITION: A value/register is reflected if its bits are swapped around its centre. For example: 0101 is the 4-bit reflection of 1010.
Turns out that UARTs (those handy little chips that perform serial IO) are in the habit of transmitting each byte with the least significant bit (bit 0) first and the most significant bit (bit 7) last (i.e. reflected).
The bytes are processed in the same order, but the bits in each byte are swapped; bit 0 is now bit 7, bit 1 is now bit 6, and so on.
In this situation, a normal sane software engineer would simply reflect each byte before processing it. However, it would seem that normal sane software engineers were thin on the ground when this early ground was being broken, because instead of reflecting the bytes, whoever was responsible held down the byte and reflected the world, leading to the following "reflected" algorithm which is identical to the previous one except that everything is reflected except the input bytes. (In other words, the message bytes are not mirrored; the algorithm itself is changed.)
          Message (non augmented) >-----+
                                        |
          Bytes   0    1    2    3      v
               +----+----+----+----+    |
               |    |    |    |    |>----XOR
               +----+----+----+----+    |
                         ^              |
                         |              |
                        XOR             |
                         |              |
               +----+----+----+----+0   |
               +----+----+----+----+    v
               +----+----+----+----+    |
               +----+----+----+----+    |
               +----+----+----+----+    |
               +----+----+----+----+    |
               +----+----+----+----+    |
               +----+----+----+----+<-----+
               +----+----+----+----+
               +----+----+----+----+
               +----+----+----+----+
               +----+----+----+----+
               +----+----+----+----+255
Notes:
- The table is identical to the one in the previous algorithm except that each entry has been reflected.
- The initial value of the register is the same as in the previous algorithm except that it has been reflected.
- The bytes of the message are processed in the same order as before (i.e. the message itself is not reflected).
- The message bytes themselves don't need to be explicitly reflected, because everything else has been!
Initial and Final Values
In addition to the complexity already seen, CRC algorithms differ from each other in two other regards:
- The initial value of the register.
- The value to be XORed with the final register value.
For example, the "CRC32" algorithm initializes its register to FFFFFFFF and XORs the final register value with FFFFFFFF.
Example
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define TRUE  1
#define FALSE 0

typedef uint32_t crc;

#define CRC_NAME          "CRC-32"
#define POLYNOMIAL        0x04C11DB7
#define INITIAL_REMAINDER 0xFFFFFFFF
#define FINAL_XOR_VALUE   0xFFFFFFFF
#define REFLECT_DATA      TRUE
#define REFLECT_REMAINDER TRUE
#define CHECK_VALUE       0xCBF43926

#define WIDTH  (8 * sizeof(crc))
#define TOPBIT ((crc)1 << (WIDTH - 1))

/* Turn each TRUE/FALSE switch into a function-like macro of the same name. */
#if (REFLECT_DATA == TRUE)
#undef  REFLECT_DATA
#define REFLECT_DATA(X) ((unsigned char) reflect((X), 8))
#else
#undef  REFLECT_DATA
#define REFLECT_DATA(X) (X)
#endif

#if (REFLECT_REMAINDER == TRUE)
#undef  REFLECT_REMAINDER
#define REFLECT_REMAINDER(X) ((crc) reflect((X), WIDTH))
#else
#undef  REFLECT_REMAINDER
#define REFLECT_REMAINDER(X) (X)
#endif

/* Reverse the order of the lowest n_bits bits of data. */
unsigned long reflect(unsigned long data, unsigned char n_bits)
{
    unsigned long reflection = 0x00000000;
    unsigned char bit;

    for (bit = 0; bit < n_bits; ++bit) {
        if (data & 0x1) {
            reflection |= (1UL << ((n_bits - 1) - bit));
        }
        data >>= 1;
    }
    return (reflection);
}

crc crc_table[256];

void CrcInit(void)
{
    crc remainder;
    int dividend;
    unsigned char bit;

    /* Compute the remainder of each possible dividend. */
    for (dividend = 0; dividend < 256; ++dividend) {
        remainder = (crc)dividend << (WIDTH - 8);
        for (bit = 8; bit > 0; --bit) {
            if (remainder & TOPBIT) {
                remainder = (remainder << 1) ^ POLYNOMIAL;
            } else {
                remainder <<= 1;
            }
        }
        crc_table[dividend] = remainder;
    }
}

crc CrcFast(unsigned char const message[], int n_bytes)
{
    crc remainder = INITIAL_REMAINDER;
    unsigned char data;
    int byte;

    /* XOR each (reflected) message byte into the top byte of the register. */
    for (byte = 0; byte < n_bytes; ++byte) {
        data = (unsigned char)(REFLECT_DATA(message[byte]) ^ (remainder >> (WIDTH - 8)));
        remainder = crc_table[data] ^ (remainder << 8);
    }
    return (REFLECT_REMAINDER(remainder) ^ FINAL_XOR_VALUE);
}

int main(void)
{
    unsigned char test[] = "123456789";

    printf("wid=%lu, top=0x%lx\n", (unsigned long)WIDTH, (unsigned long)TOPBIT);
    CrcInit();
    /* The result should equal CHECK_VALUE. */
    printf("The crcFast() of \"123456789\" is 0x%lX\n",
           (unsigned long)CrcFast(test, (int)strlen((const char *)test)));
    return 0;
}
CRC Algorithms
- A "CRC16" (CRC-16-CCITT) implementation on AutomationWiki.
- Implementing The CCITT Cyclical Redundancy Check on Dr Dobbs.
- Fast CRC32 Compare
- Best CRC Polynomials
- A C++ Class that encapsulates the official CRC32 algorithm
- CRC32 C or C++ implementation on the stackoverflow
- A CRC algorithm in C: crc.zip
A Parameterized Model For CRC Algorithms
The algorithm is from A Parameterized Model For CRC Algorithms.
- REFIN This is a boolean parameter. If it is FALSE, input bytes are processed with bit 7 being treated as the most significant bit (MSB) and bit 0 being treated as the least significant bit. If this parameter is TRUE, each byte is reflected before being processed.
- REFOUT This is a boolean parameter. If it is set to FALSE, the final value in the register is fed into the XOROUT stage directly, otherwise, if this parameter is TRUE, the final register value is reflected first.
The CRC algorithm and the code for generating a lookup table are in crcmodel.tar.gz.
Footnotes:
[1] Tanenbaum, A.S., "Computer Networks", Prentice Hall, 1981, ISBN: 0-13-164699-0. Comment: Section 3.5.3 on pages 128 to 132 provides a very clear description of CRC codes. However, it does not describe table-driven implementation techniques.