The library catalog server can be specified by a command-line option, with the default server being the world's largest library catalog, the US Library of Congress.
ANSI/NISO Standard Z39.50-1995 and ISO Standard 23950:1998 ``Information and documentation --- Information retrieval (Z39.50) --- Application service definition and protocol specification'' define a library catalog protocol that allows client programs to communicate with library catalog servers around the world, and retrieve data in a small number of different formats, notably USMARC (United States MAchine-Readable Cataloging) and SUTRS (Simple Unstructured Text Record Syntax).
    http://www.indexdata.dk/targettest/
In this list, vertical bars separate alternatives, and an asterisk matches any word with that prefix:
    alberta | ab        University of Alberta
    amherst | umass     University of Massachusetts, Amherst
    boulder | co        University of Colorado, Boulder
    british | br        British Library
    calgary             University of Calgary
    congress | lc       US Library of Congress
    copac | uk          COPAC (union of 24 research-university
                        catalogs in the UK and Ireland)
    denmark | dk        Royal Library of Denmark
    dsb | dsl           Danish State Library
    duke                Duke University
    florida | fl        Florida Center for Library Automation
    marriott | ut       University of Utah Marriott Library
    melbourne           University of Melbourne
    melvyl | ca         University of California MELVYL catalog
    minn* | mn          University of Minnesota
    newyork | ny        New York University
    nla | au            National Library of Australia
    nlnz | nz           National Library of New Zealand
    norway | no         National Library of Norway
    nsw | unsw          University of New South Wales
    odense | sdu        University of Southern Denmark
    oxford | ox*        Oxford University
    sweden | se         National Library of Sweden
    texas | tx          University of Texas at Austin
    toronto             University of Toronto
    usc                 University of Southern California
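The matching rule described for the list above (alternatives separated by vertical bars, a trailing asterisk meaning "any word with this prefix") can be illustrated with a minimal Python sketch. This is not cattobib's own code: the `SERVERS` dictionary is a hypothetical four-entry excerpt of the list, and `lookup_server` is an illustrative name.

```python
# Hypothetical excerpt of the server list: pattern string -> catalog name.
SERVERS = {
    "british | br": "British Library",
    "congress | lc": "US Library of Congress",
    "minn* | mn": "University of Minnesota",
    "oxford | ox*": "Oxford University",
}


def lookup_server(name):
    """Return the catalog whose pattern matches name, or None."""
    name = name.lower()
    for patterns, catalog in SERVERS.items():
        # Vertical bars separate the alternatives for one catalog.
        for pattern in (p.strip() for p in patterns.split("|")):
            if pattern.endswith("*"):
                # A trailing asterisk matches any word with that prefix.
                if name.startswith(pattern[:-1]):
                    return catalog
            elif name == pattern:
                return catalog
    return None


if __name__ == "__main__":
    for word in ("br", "minnesota", "oxon", "harvard"):
        print(word, "->", lookup_server(word))
```

Under this reading, both `minn` and `minnesota` select the University of Minnesota, while a name that matches no pattern is reported as unknown.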
    % cattobib 1-57586-011-2
    %% Searching [z3950.loc.gov:7090/Voyager] for [1575860112]: flags = [@attr 1=7]
    @Book{Knuth:1999:DT,
      author =       "Donald Ervin Knuth",
      title =        "Digital typography",
      volume =       "78",
      publisher =    "CSLI Publications",
      address =      "Stanford, Calif.",
      pages =        "xv + 685",
      year =         "1999",
      ISBN =         "1-57586-011-2 (cloth), 1-57586-010-4 (paperback)",
      LCCN =         "Z249.3 .K59 1999",
      bibdate =      "Fri Nov 19 07:23:50 MST 2004",
      bibsource =    "z3950.loc.gov:7090/Voyager",
      series =       "CSLI lecture notes",
      URL =          "ftp://uiarchive.cso.uiuc.edu/pub/etext/gutenberg/;
                     http://www.loc.gov/catdir/description/cam029/98027331.html;
                     http://www.loc.gov/catdir/toc/cam022/98027331.html",
      acknowledgement = ack-nhfb,
      subject =      "Printing; Data processing; Computerized typesetting;
                     Computer fonts; TeX (Computer file); METAFONT",
    }
Remark: The ISBN is a unique identifier assigned to books published throughout the world since about 1972. It consists of ten decimal digits, the last of which may also be the letter X, divided into four hyphen- (or rarely, space-) separated parts: country or language, publisher, book number within the publisher, and a final check digit that can be used to detect invalid ISBNs.
Country/language groups 0 and 1 are English, 2 is French, 3 is German, 4 is Japanese, 5 is Russian, and so on. The Republic of Srpska is 99938.
Large publishers have small numbers (e.g., Collins is 00, McGraw-Hill is 07, and Prentice-Hall is 13), and small publishers have big numbers (e.g., Peachpit Press is 938151 and Personal TeX is 9631044).
When a publisher exhausts its range of book numbers, it gets a new publisher number: O'Reilly and Associates is assigned numbers 937175, 56592, and 596.
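The check digit mentioned in the remark can be verified mechanically. The sketch below uses the standard ISBN-10 rule: weight the ten digits 10, 9, ..., 1 from left to right, count a final X as 10, and require the weighted sum to be divisible by 11. It illustrates that rule only and is not taken from cattobib; the function names are hypothetical.

```python
DIGITS = "0123456789"


def normalize_isbn10(text):
    """Drop hyphens and spaces, e.g. '1-57586-011-2' -> '1575860112'."""
    return text.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(text):
    """Return True if text is a ten-character ISBN with a correct check digit."""
    isbn = normalize_isbn10(text)
    if len(isbn) != 10:
        return False
    # The first nine characters must be decimal digits.
    if any(c not in DIGITS for c in isbn[:9]):
        return False
    # The last character is a digit, or X standing for the value 10.
    if isbn[9] == "X":
        last = 10
    elif isbn[9] in DIGITS:
        last = int(isbn[9])
    else:
        return False
    values = [int(c) for c in isbn[:9]] + [last]
    # Weights run 10, 9, ..., 1; a valid ISBN sums to a multiple of 11.
    total = sum(w * v for w, v in zip(range(10, 0, -1), values))
    return total % 11 == 0


if __name__ == "__main__":
    for candidate in ("1-57586-011-2", "0-471-14811-3", "1-57586-011-3"):
        print(candidate, is_valid_isbn10(candidate))
```

The first two candidates are ISBNs that appear in the EXAMPLES output; the third differs from the first only in its final digit, so the check rejects it.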
Search the default Z39.50 server for a book by its title:
    % cattobib 'Digital Typography Sourcebook'
    %% Searching [z3950.loc.gov:7090/Voyager] for [Digital Typography Sourcebook]: flags = []
    @Book{Bryan:1996:DTS,
      author =       "Marvin Bryan",
      title =        "The digital typography sourcebook",
      publisher =    "Wiley",
      address =      "New York",
      pages =        "xxiv + 384, 3",
      year =         "1996",
      ISBN =         "0-471-14811-3 (paper/CD-ROM)",
      LCCN =         "Z250.7 .B79 1996",
      bibdate =      "Fri Nov 19 07:25:19 MST 2004",
      bibsource =    "z3950.loc.gov:7090/Voyager",
      URL =          "ftp://uiarchive.cso.uiuc.edu/pub/etext/gutenberg/;
                     http://www.loc.gov/catdir/bios/wiley047/96013161.html;
                     http://www.loc.gov/catdir/description/wiley033/96013161.html;
                     http://www.loc.gov/catdir/toc/onix04/96013161.html",
      acknowledgement = ack-nhfb,
      subject =      "Computer fonts",
    }
Search the British Library for the same book:
    % cattobib --server br 'Digital Typography Sourcebook'
    %% Searching [z3950cat.bl.uk:9909/BLAC] for [Digital Typography Sourcebook]: flags = []
    %% IGNORED: Number of hits: 1, setno 1
    ...
    @Book{Bryan:1997:DTS,
      author =       "Marvin Bryan",
      title =        "The digital typography sourcebook",
      publisher =    "Wiley",
      address =      "New York ; Chichester",
      pages =        "xxiv + 384",
      year =         "1997",
      ISBN =         "0-471-14811-3 (paperback)",
      bibdate =      "Fri Nov 19 07:26:13 MST 2004",
      acknowledgement = ack-nhfb,
      subject =      "Computer fonts",
    }
Search the National Library of Australia for two books by ISBN:
    % cattobib -q --server au 0-06-621285-5 0-19-860702-4
    @Book{Winchester:2003:KDW,
      author =       "Simon Winchester",
      title =        "Krakatoa: the day the world exploded, 27 August 1883",
      publisher =    "HarperCollins Publishers",
      address =      "New York",
      pages =        "xvi + 416",
      year =         "2003",
      ISBN =         "0-06-621285-5",
      bibdate =      "Fri Nov 19 07:37:12 MST 2004",
      bibsource =    "catalogue.nla.gov.au:7090/Voyager",
      acknowledgement = ack-nhfb,
      remark =       "Includes bibliographical references and index.",
      subject =      "Natural disasters; Indonesia; Krakatoa; Social aspects;
                     Volcanoes; Indonesia; Krakatoa; Krakatoa (Indonesia);
                     Eruption, 1883",
      usmarc-019 =   "019 1 $a 24669279",
      usmarc-043 =   "043 $a a-io---",
      usmarc-250 =   "250 $a 1st U.S. ed.",
      usmarc-984 =   "984 $a ANL $c YY 551.2109598 W759",
    }

    @Book{Winchester:2003:MES,
      author =       "Simon Winchester",
      title =        "The meaning of everything: the story of the Oxford
                     English Dictionary",
      publisher =    "Oxford University Press",
      address =      "Oxford",
      pages =        "xxv + 260",
      year =         "2003",
      ISBN =         "0-19-860702-4 (hbk.), 0-19-860702-4 (hbk.)",
      bibdate =      "Fri Nov 19 07:37:23 MST 2004",
      bibsource =    "catalogue.nla.gov.au:7090/Voyager",
      price =        "No price",
      acknowledgement = ack-nhfb,
      remark =       "Includes ndex.",
      subject =      "Oxford English dictionary; Lexicology; History",
      usmarc-019 =   "019 1 $a 25073662",
    }
- completely wrong author lists;
- duplicated records, sometimes with minor variations;
- faulty title capitalization;
- incomplete, inaccurate, or missing page numbers;
- incorrect author order;
- mangled and missing accents;
- off-by-one copyright years;
- truncated author lists and titles;
- ...
The best advice to the user is to search three or more catalogs for the same data, and then merge the results, using a majority vote to resolve discrepancies.
Agreement among multiple catalogs suggests that the data are reliable. However, the user is warned that libraries around the world share cataloging data, so geographically distant catalogs may be less independent than they appear.
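The majority-vote merge recommended above can be sketched in a few lines of Python. This is an illustration only, not part of cattobib: it assumes each catalog record has already been parsed into a dictionary of field names and values, and the name `merge_by_majority` is hypothetical. The third record in the demonstration is invented for illustration; the first two carry the author, year, and pages values from the Library of Congress and British Library output in the EXAMPLES section.

```python
from collections import Counter


def merge_by_majority(records):
    """Merge field dictionaries; return (merged, unresolved).

    A value is accepted when more than half of the records that
    supply the field agree on it.  Fields without such a majority
    are returned in unresolved with their competing values.
    """
    merged = {}
    unresolved = {}
    fields = sorted({name for record in records for name in record})
    for name in fields:
        values = [record[name] for record in records if name in record]
        counts = Counter(values)
        value, votes = counts.most_common(1)[0]
        if votes > len(values) / 2:
            merged[name] = value
        else:
            unresolved[name] = sorted(counts)
    return merged, unresolved


if __name__ == "__main__":
    loc = {"author": "Marvin Bryan", "year": "1996", "pages": "xxiv + 384, 3"}
    bl = {"author": "Marvin Bryan", "year": "1997", "pages": "xxiv + 384"}
    # Hypothetical third catalog record, invented for this illustration.
    third = {"author": "Marvin Bryan", "year": "1996", "pages": "xxiv + 384"}
    merged, unresolved = merge_by_majority([loc, bl, third])
    print(merged)
    print(unresolved)
```

A field supplied by only one catalog is accepted unchanged, and because catalogs may share cataloging data, a majority is evidence rather than proof. Fields left in `unresolved` still need a human decision.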
While the conversion of USMARC and SUTRS markup to BibTeX works reasonably well, there are many catalog record types that are not converted. When they are known not to be useful in BibTeX entries, they are silently discarded. Otherwise, cattobib preserves them as additional key/value pairs, such as the usmarc-nnn keys in the BibTeX output in the EXAMPLES section, or else complains about them in diagnostic messages.
cattobib produces only BibTeX @Book{...} entries, even for conference proceedings, for which a @Proceedings{...} entry is required. Library catalog information often does not distinguish between these document types, so the user must convert such entries.
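Part of that conversion can be automated with a heuristic. The Python sketch below retags an `@Book` entry as `@Proceedings` when its title contains a word that suggests a conference volume. The keyword list and the name `retag_proceedings` are assumptions made for illustration, and the sketch assumes each entry ends with a closing brace at the start of a line. It changes only the entry type; any author-to-editor changes and a check of the result remain manual work.

```python
import re

# Hypothetical keyword list; extend or trim it to suit the bibliography.
PROCEEDINGS_WORDS = ("proceedings", "conference", "symposium", "workshop")

# One @Book entry, assumed to end with "}" at the start of a line.
ENTRY_RE = re.compile(r"@Book\{.*?\n\}", re.DOTALL)

# The title field; \b keeps this from matching inside "booktitle".
TITLE_RE = re.compile(r'\btitle\s*=\s*"(?P<title>[^"]*)"', re.IGNORECASE)


def retag_proceedings(bibtex):
    """Return bibtex with likely proceedings retagged as @Proceedings."""

    def retag(match):
        entry = match.group(0)
        found = TITLE_RE.search(entry)
        if found:
            title = found.group("title").lower()
            if any(word in title for word in PROCEEDINGS_WORDS):
                return "@Proceedings{" + entry[len("@Book{"):]
        return entry

    return ENTRY_RE.sub(retag, bibtex)
```

A typical use would read the saved cattobib output from a file, pass the text through `retag_proceedings`, and review each retagged entry by hand.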
A certain amount of manual cleanup of the BibTeX output is almost always necessary.
    Nelson H. F. Beebe
    University of Utah
    Department of Mathematics, 110 LCB
    155 S 1400 E RM 233
    Salt Lake City, UT 84112-0090
    Tel: +1 801 581 5254
    FAX: +1 801 581 4148
    Email: beebe@math.utah.edu, beebe@acm.org, beebe@computer.org
    WWW URL: http://www.math.utah.edu/~beebe