When human beings use numbers - whether keying them into computers, dialing them on telephones, or reading them and telling them to others --- they tend to make certain kinds of mistakes more often than others. According to Richard Hamming (Coding and Information Theory, 2e, Prentice-Hall, 1986, p. 27), the two most common human errors are:
We can eliminate (or easily detect) the problem of omitting or adding digits by restricting the input field to a given number of digits if we are dealing with numbers which are fixed in format, such as credit card numbers, Social Insurance Numbers, local phone numbers, and student ID numbers.
Other errors are detected by calculating whether the check equation for a particular check digit scheme is true. The check digit is included in the equation so that it is protected against errors as well. If the equation is not true, an error is present; if it is true, there may or may not be an error.
The International Standard Book Number (ISBN) uses a weighted code: Each digit is weighted according to its position in the number and the check digit is chosen so the weighted sum is evenly divisible by a prime number. The check digit is the rightmost digit in a 10-digit number. The digit positions are numbered 1..10 from right to left. The weighted sum is divided by 11. Since the remainder resulting from division by 11 can be any number between 0 and 10, an 'X' is used to represent a check digit of 10 if necessary.
This scheme detects any single error and the transposition of any two digits at any distance (assuming the overall number is 10 or fewer digits long).
For example, given the number
0 - 1 3 1 5 - 2 4 4 7 - X
the check equation is
10x0+9x1+8x3+7x1+6x5+5x2+4x4+3x4+2x7+1x10 mod 11 = 132 mod 11 = 0
The "IBM check", which is used by MasterCard, VISA, and most other credit card companies (including the new Hudson's Bay Company cards, but not the older ones), is an even/odd weighted code. The digits in the even positions (numbering from the right) are multiplied by 2, then reduced to a single digit (if > 9) by "casting out nines" (subtracting 9, which is equivalent to adding the digits). All digits are then summed and a check digit added to make the result evenly divisible by 10.
For example, given the number
6 1 8 2 0 9 2 3 1 5 5 3
the leading 6 is doubled, giving 12, which is then reduced to 3 by adding the digits of 12 together; similarly, the 8 becomes 16 and then 7; the 0 is impervious to doubling; the 2 becomes 4; the 1 becomes 2; and the 5 in the second-last position becomes 10 and thus 1. Thus the check equation is
6#2 + 1 + 8#2 + 2 + 0#2 + 9 + 2#2 + 3 + 1#2 + 5 + 5#2 + 3 mod 10 = 0
where '#' represents multiplication with casting out nines, giving
3 + 1 + 7 + 2 + 0 + 9 + 4+ 3 + 2 + 5 + 1 + 3 mod 10 = 40 mod 10 = 0
This scheme catches all single errors and most adjacent transpositions, but not jump transpositions (such as 553 becoming 355) or 09 becoming 90.
The check digit scheme used on routing numbers for Electronic Funds Transfer (EFT) between banks uses a 9-digit number with position weightings of 3, 7, and 1. The check equation for a number a1a2a3a4a5a6a7a8a9 is
3a1 + 7a2 + a3 + 3a4 + 7a5 + a6 + 3a7 + 7 a8 + a9 mod 10 = 0
This scheme is based on the fact that multiplication modulo 10 yields a permutation of all 10 decimal digits if the multiplication factor is one of the digits 1, 3, 7, or 9, but only a subset of the decimal digits if the factor is 5 or an even digit, as illustrated in the following table:
This scheme cannot detect adjacent transpositions of digits that differ by 5.
The Universal Product Code (UPC) is similar to the IBM check, but uses a weighting factor of 3 (instead of 2) for the digits in the even positions (counting from the right, including the check digit).
It shares the weakness of the previous scheme: overlooking adjacent transpositions of digits that differ by 5.
Verhoeff proposed a scheme which avoids the weakness of the preceding three schemes in failing to detect some adjacent transpositions due to using addition modulo 10. His solution is based on multiplication in the dihedral group D5, which is not commutative (i.e., a*b is not always equal to b*a). The following table shows the result of multiplying i by j in D5:
Verhoeff's check equation is of the form
where multiplication is over D5 and f1, f2, . . . , fn are permutations of the ten decimal digits. Verhoeff found that the special case where fi is the ith iteration of a fixed permutation f yielded an excellent check:
Note that the function F cycles with period 8 (i.e., F[8, *] = F[0. *]).
The full table for function F is thus:
If the check digit is appended at position 0 in the number (i.e., as the rightmost digit if the positions are numbered from right to left, beginning with 0), then the check digit is calculated as the inverse (in D5) of the result of the successive multiplications
Verhoeff's check equation catches all single errors, all adjacent transpositions, over 95% of twin errors, over 94% of jump transpositions and jump twin errors, and most phonetic errors.
The following form allows experimentation with Verhoeff's check digit scheme. To calculate a Verhoeff check digit, enter a decimal number in the first box below, then click the Compute button. The correct check digit will then be appended at the right end of the number entered.
To test that the scheme catches the sorts of errors claimed, try changing selected digits, then clicking the Check button. The result of the check will be displayed in the lower text box.
Note that the scheme does not catch most jump twin errors involving digits with a difference of 5, such as 050 vs. 505, 161 vs. 616, 272 vs. 727, and 494 vs. 949, but it does catch 383 vs. 838.
Copyright © 1999 Jonathan Mohr