checksum computes a 16-bit cyclic redundancy checksum (CRC) for a file, as well as counts of the words, lines and characters.
With no filenames on the command line, stdin and stdout are assumed. With one filename, input is from that file, with output to stdout. With two filenames, input is from the first, and output to the second.
If the checksum has previously been installed in the input file, and the input file has not been corrupted since then, the output file will be identical to the input file.
With the -c option, only an input file is expected; it may be named on the command line or read from stdin. checksum then computes a new checksum and writes only the new checksum line to stdout. This may be convenient for programmable editors that need to update a file checksum.
With the -v option, only an input file is expected; it may be named on the command line or read from stdin. checksum then verifies whether the checksum embedded in the file is correct. A zero status code is returned for a correct checksum, and a non-zero one otherwise; in UNIX, the status code may be conveniently tested in shell scripts. In either case, an informative message is printed on stdout.
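The status-code test described above can also be made from a script in another language. The following is a minimal sketch, assuming a checksum executable is on the search path; the function name checksum_ok and its command parameter are hypothetical and exist only for this illustration.

```python
import subprocess

def checksum_ok(path, command=("checksum",)):
    """Run `checksum -v path`; return (verified, message).

    The command tuple is a hypothetical hook so that a different
    executable name or location can be supplied.
    """
    result = subprocess.run(
        [*command, "-v", path],
        stdout=subprocess.PIPE,   # capture the informative message
        text=True,
    )
    # A zero status code means the embedded checksum is correct.
    return result.returncode == 0, result.stdout
```

A caller would typically branch on the first element of the returned pair and log the second.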
For many text files, it is possible to hide the ``critical line'' in a comment near the beginning of the file.
It is difficult to arrange for a file to contain its own checksum. Instead, the field xxxxx holds the checksum, written in decimal as five digits (with leading 0's if needed), of the file obtained from the output file by replacing the checksum field with the string ZZZZZ.
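The placeholder scheme can be sketched as follows. This is an illustration under stated assumptions, not the program's own code: the regular expression FIELD and the function names are hypothetical, the three counts are left aside, and the standard-library CRC-CCITT routine binascii.crc_hqx stands in for whichever CRC-16 variant checksum actually uses.

```python
import binascii
import re

# Hypothetical pattern for the checksum field: five digits, or the
# ZZZZZ placeholder, immediately after 'checksum = "'.
FIELD = re.compile(r'(checksum = ")(\d{5}|ZZZZZ)')

def crc16(data: bytes) -> int:
    # Stand-in 16-bit CRC; checksum's own CRC-16 variant may differ.
    return binascii.crc_hqx(data, 0)

def with_placeholder(text: str) -> str:
    # Replace the checksum field by ZZZZZ so the file no longer
    # depends on its own checksum.
    return FIELD.sub(lambda m: m.group(1) + "ZZZZZ", text, count=1)

def embed(text: str) -> str:
    # Checksum the placeholder version, then write the value back
    # as five decimal digits with leading zeros.
    neutral = with_placeholder(text)
    value = crc16(neutral.encode("utf-8"))
    return FIELD.sub(lambda m: m.group(1) + f"{value:05d}", neutral, count=1)

def verify(text: str) -> bool:
    # Recompute over the placeholder version and compare with the field.
    m = FIELD.search(text)
    if m is None or m.group(2) == "ZZZZZ":
        return False
    return int(m.group(2)) == crc16(with_placeholder(text).encode("utf-8"))
```

Because a 16-bit value never exceeds 65535, five decimal digits always suffice, so installing the checksum does not change the length of the file.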
If, after the word ``checksum'', the critical line already contains precisely two quotation marks, and the first of them is the last character of the four-character string `` = "'' (i.e. <blank><equals><blank><quotation mark>), then the material between the two quotation marks is deleted and replaced by a checksum and three counts as described above.
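The condition on the critical line can be expressed as a small test. The sketch below is an assumption about how such a check might be written; the function name replace_field is hypothetical, and the new field contents are taken as a ready-made string rather than computed.

```python
def replace_field(line, new_value):
    """Return the line with the quoted material replaced, or None
    if the line does not satisfy the condition."""
    idx = line.find("checksum")
    if idx < 0:
        return None
    split_at = idx + len("checksum")
    head, tail = line[:split_at], line[split_at:]
    # Precisely two quotation marks must follow the word "checksum".
    if tail.count('"') != 2:
        return None
    first = tail.index('"')
    # The first one must end the four-character string ' = "'.
    if first < 3 or tail[first - 3:first + 1] != ' = "':
        return None
    last = tail.rindex('"')
    # Keep both quotation marks; replace only what lies between them.
    return head + tail[:first + 1] + new_value + tail[last:]
```

For example, under these assumptions the line `%%% checksum = "old stuff",` would be accepted, while a line with a third quotation mark, or with no blank before the equals sign, would be rejected.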
While the counts of words, characters, and lines could be obtained by the UNIX wc(1) utility, that information is still not sufficient to detect character substitutions, or transpositions of characters, lines, and words. The CRC-16 checksum remedies that, since the resulting checksum depends on the order and value of every single byte in the file.
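A short demonstration of this point: two inputs that differ only by a transposed pair of characters have identical counts but different CRC values. The sketch assumes the standard-library CRC-CCITT routine binascii.crc_hqx as a stand-in for checksum's CRC-16, and the counts helper is a hypothetical approximation of wc(1).

```python
import binascii

def counts(data: bytes):
    # (lines, words, characters), roughly as wc(1) would report them.
    return (data.count(b"\n"), len(data.split()), len(data))

original = b"the quick brown fox\n"
swapped = b"the quick brown fxo\n"   # last two letters transposed

# The counts cannot tell the two inputs apart ...
same_counts = counts(original) == counts(swapped)

# ... but the CRC depends on the order of the bytes as well.
same_crc = binascii.crc_hqx(original, 0) == binascii.crc_hqx(swapped, 0)
```

Here same_counts is true and same_crc is false, since swapping two adjacent bytes changes a span of at most 16 bits, which a 16-bit CRC of this kind always detects.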
checksum is intended to support the reliable exchange of text files between different computers, even ones with different operating systems. Thus, the newline character sequence that terminates each line is treated as if it were an ASCII newline (linefeed) character, even though it may be a carriage return, a carriage return and a line feed, or simply an end-of-record condition in the file, depending on the operating system and file type. The file checksum is therefore independent of the particular representation of end-of-line.
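The end-of-line independence can be illustrated by normalizing line terminators before the CRC is computed. This is only a sketch: the function names are hypothetical, binascii.crc_hqx stands in for checksum's CRC-16, and record-oriented files with an end-of-record condition cannot be modelled as a plain byte string, so only the carriage-return and linefeed conventions are covered.

```python
import binascii

def normalize_newlines(data: bytes) -> bytes:
    # Treat CR LF, a lone CR, and a lone LF alike as a single linefeed.
    # CR LF must be handled first so it does not become two linefeeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def portable_crc(data: bytes) -> int:
    # Stand-in 16-bit CRC over the normalized bytes.
    return binascii.crc_hqx(normalize_newlines(data), 0)

unix_text = b"first line\nsecond line\n"
crlf_text = b"first line\r\nsecond line\r\n"
cr_text = b"first line\rsecond line\r"
```

All three byte strings above yield the same portable_crc value, which is the property that lets a file keep a valid checksum when it moves between systems.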
Although UNIX systems have a file checksum utility, sum(1), the result it produces differs between UNIX variants, and in any event, it is neither publicly available for porting to other systems, nor independent of the end-of-line representation. checksum is freely available.
Robert M. Solovay, Department of Mathematics, University of California, Berkeley, CA, USA. Tel: +1 415 642 2252. Email: solovay@math.berkeley.edu
Amiga support and many typographical formatting improvements:
Andreas Scherer, Abt Wolf Strasse 17, 96215 Lichtenfels, Germany. Tel: (0 95 71) 2013
General maintenance for the TeX Users Group:
Nelson H. F. Beebe, Center for Scientific Computing, Department of Mathematics, University of Utah, Salt Lake City, UT 84112, USA. Tel: +1 801 581 5254. FAX: +1 801 581 4148. Email: beebe@math.utah.edu (Internet)