.\" ====================================================================
.\"  @Troff-man-file{
.\"     author          = "Nelson H. F. Beebe",
.\"     version         = "0.02",
.\"     date            = "16 October 1996",
.\"     time            = "12:12:13 MDT",
.\"     filename        = "citesub.man",
.\"     address         = "Center for Scientific Computing
.\"                        Department of Mathematics
.\"                        University of Utah
.\"                        Salt Lake City, UT 84112
.\"                        USA",
.\"     telephone       = "+1 801 581 5254",
.\"     FAX             = "+1 801 581 4148",
.\"     checksum        = "20947 262 1052 8058",
.\"     email           = "beebe@math.utah.edu (Internet)",
.\"     codetable       = "ISO/ASCII",
.\"     keywords        = "bibliography, BibTeX, citation label, LaTeX,
.\"                        TeX",
.\"     supported       = "yes",
.\"     docstring       = "This file contains the UNIX manual pages
.\"                        for the citesub utility, a program for
.\"                        substituting new BibTeX citation labels
.\"                        standardized to the BibNet Project form,
.\"                        Lastname:year:abbrev.  The companion
.\"                        program, biblabel, can be used to generate
.\"                        the substitution file needed by this
.\"                        program.
.\"
.\"                        The checksum field above contains a CRC-16
.\"                        checksum as the first value, followed by the
.\"                        equivalent of the standard UNIX wc (word
.\"                        count) utility output of lines, words, and
.\"                        characters.  This is produced by Robert
.\"                        Solovay's checksum utility.",
.\"  }
.\" ====================================================================
.if t .ds Bi B\s-2IB\s+2T\\h'-0.1667m'\\v'0.20v'E\\v'-0.20v'\\h'-0.125m'X
.if n .ds Bi BibTeX
.if t .ds La L\\h'-0.24m'\\v'-0.15v'\\s-2A\\s+2\\h'-0.15m'\\v'0.15v'T\\h'-0.1667m'\\v'0.20v'E\\v'-0.20v'\\h'-0.125m'X
.if n .ds La LaTeX
.if t .ds Te T\\h'-0.1667m'\\v'0.20v'E\\v'-0.20v'\\h'-0.125m'X
.if n .ds Te TeX
.TH CITESUB 1 "16 October 1996" "Version 0.02"
.\"======================================================================
.SH NAME
citesub \- substitute standardized BibTeX citation labels
.\"======================================================================
.SH SYNOPSIS
.B citesub
[
.BI \-f " substitution-file"
]
[
.B \-v
]
[ file(s) ]  >outfile
.\"======================================================================
.SH DESCRIPTION
.B citesub
applies citation label substitutions to input
\*(Bi\&, \*(La\&, and \*(Te\& files.  The
substitutions will usually be generated
automatically by the companion
.B biblabel
program (see below), which can be used to
standardize the form of \*(Bi\& citation labels to
the conventions adopted for the BibNet Project.
.PP
Most existing \*(Bi\& bibliography files have
haphazardly chosen, unsystematic citation labels
that are very likely to conflict with labels in
other bibliography files;
.B biblabel
and
.B citesub
provide an automatic way to rectify this.
.PP
To avoid confusion between labels with common
prefixes, such as
.I Smith80
and
.IR Smith80a ,
.B citesub
checks for leading context of a left brace, quote,
comma, whitespace, or beginning of line and
trailing context of a right brace, comma, quote,
percent, whitespace, or end of line so as to match
these styles:
.PP
.RS
.nf
@Book{Smith:1980:ABC,

crossref = "Smith:1980:ABC",

crossref = {Smith:1980:ABC},

\ecite{Smith:1980:ABC}

\ecite{Smith:1980:ABC,Jones:1994:DEF}

\ecite{%
       Smith:1980:ABC,%
       Jones:1994:DEF%
}
.fi
.RE
.PP
Although one might expect that simple application
of standard software tools like the UNIX
.BR awk (1)
and
.BR sed (1)
utilities could do the string substitution job,
this is not the case.  For one thing, the required
context sensitivity complicates the
regular-expression patterns that are needed.  For
another, most UNIX
.BR sed (1)
implementations have a built-in limit of about 100
substitutions, which is far too few for typical
bibliographies.  Finally, simple application of
.BR sed (1)
and
.BR awk (1)
involves matching every input line with every
substitution pattern, which results in quadratic
run-time behavior that proves impossibly slow for
large bibliographies.
.PP
.B citesub
provides an efficient solution whose run time is
essentially proportional to the size of the input
files, and
.I independent
of the number of substitutions to be carried out.
This is achieved by tokenizing the input lines,
and then looking up each token in a
constant-access-time (hash) table of
substitutions.  An initial prototype programmed in
the
.I awk
language led to a final version in C that ran
about 50 times faster, processing about 4000 input
lines/sec on an entry-level Sun SPARCstation LX
workstation.
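.PP
The token lookup just described can be sketched in
Python as follows.  This is only an illustration
under stated assumptions, not the C code of
.BR citesub :
the names
.IR LABEL_CHARS ,
.IR LEADING ,
.IR TRAILING ,
and
.I substitute_line
are hypothetical, a token is assumed to be a
maximal run of valid label characters, and a Python
dictionary stands in for the hash table.
.PP
.nf
```python
import string

# Characters permitted in citation labels (see the list under
# WARNING AND ERROR MESSAGES).
LABEL_CHARS = set(string.ascii_letters + string.digits + ":-+/.'_")
# Context characters named in DESCRIPTION; whitespace and the two
# ends of the line are handled separately below.
LEADING = set('{",')
TRAILING = set('},"%')

def substitute_line(line, table):
    """Return line with each whole-token old label replaced from table."""
    out = []
    i, n = 0, len(line)
    while i < n:
        if line[i] not in LABEL_CHARS:
            out.append(line[i])
            i += 1
            continue
        # Collect a maximal run of label characters as one token.
        j = i
        while j < n and line[j] in LABEL_CHARS:
            j += 1
        token = line[i:j]
        lead_ok = i == 0 or line[i - 1] in LEADING or line[i - 1].isspace()
        trail_ok = j == n or line[j] in TRAILING or line[j].isspace()
        if lead_ok and trail_ok:
            # One hash lookup per token, whatever the table size.
            token = table.get(token, token)
        out.append(token)
        i = j
    return "".join(out)
```
.fi
.PP
Because each token costs a single dictionary lookup,
the work per input line in this sketch does not grow
with the number of substitution pairs, and a label
such as
.I Smith80a
is never mistaken for
.IR Smith80 .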
.\"======================================================================
.SH OPTIONS
Except for the options described below, all
command-line words are assumed to be input files.
Should such a filename begin with a hyphen, it
must be disguised by a leading absolute or
relative directory path, e.g.
.I /tmp/-foo.bib
or
.IR ./-foo.bib .
.TP \w'\-f-labels-in-use-file'u+2n
.BI \-f " substitution-file"
This option specifies the name of a file
containing pairs of old and new citation labels,
one pair per line, surrounded by arbitrary amounts
of whitespace.  This file is most easily generated
by the companion program
.BR biblabel (1).
.IP
If this option is omitted, then the substitution
filename will be derived from that of the first
input file by replacing its extension by
.IR ".sub" .
Thus, the commands
.nf
.I "citesub -f foo.sub foo.bib >foo.bib-new"
and
.I "citesub foo.bib >foo.bib-new"
.fi
are equivalent.
.IP
If the substitution file is named "-", then
.B citesub
follows the common UNIX convention and interprets
it to mean standard input, allowing the
substitutions to be provided from a pipeline, such
as
.nf
.I "biblabel foo.bib | citesub -f - >foo.new"
.fi
.TP
.B \-v
Display the program version number, and possibly
installer, location, and compile-time information,
on
.IR stderr .
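.PP
The substitution-file conventions described under
.B \-f
can be sketched in Python as follows.  This is an
illustration only, not the code of
.BR citesub :
the function names
.I default_substitution_file
and
.I load_substitutions
and the warning text are hypothetical, and silently
skipping blank lines is an assumption.
.PP
.nf
```python
import os
import sys

def default_substitution_file(first_input):
    """Replace the extension of the first input file by .sub."""
    root, _ext = os.path.splitext(first_input)
    return root + ".sub"

def load_substitutions(lines):
    """Build an old-label to new-label dict from an iterable of lines."""
    table = {}
    for number, line in enumerate(lines, start=1):
        fields = line.split()      # surrounding whitespace is ignored
        if not fields:
            continue               # assumption: blank lines are skipped
        if len(fields) != 2:
            # Hypothetical wording; the line is ignored, as documented.
            print("warning: line %d: wrong field count" % number,
                  file=sys.stderr)
            continue
        old, new = fields
        table[old] = new
    return table
```
.fi
.PP
With these definitions,
.I default_substitution_file("foo.bib")
yields
.IR foo.sub ,
matching the equivalence of the two commands shown
above.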
.\"======================================================================
.SH "WARNING AND ERROR MESSAGES"
.B citesub
will issue warning messages in the following cases:
.TP \w'\(bu'u+1n
\(bu
.I "Incorrect number of fields in substitution file."
The line is ignored.
.TP
\(bu
.I "Invalid citation label character."
.B citesub
expects that citation labels follow the
requirements of the \*(Bi\& grammar described in
the paper
.IP
Nelson H. F. Beebe,
.IR "Bibliography prettyprinting and syntax checking" ,
TUGboat 14(3), 222, October 1993, and TUGboat
14(4), 395\(en419, December 1993.
.IP
Citation labels must contain only these
characters:
.nf
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
0123456789
:-+/.'_
.fi
.B citesub
will continue processing, but since only input
tokens containing the above set of characters are
candidates for substitution, such erroneous labels
will not be substituted.
.TP
\(bu
.I "No label substitutions found in file."
This is not necessarily an error, but might be.
.B citesub
will then simply copy its input to its output.
.\"======================================================================
.SH "SEE ALSO"
.BR awk (1),
.BR bibcheck (1),
.BR bibclean (1),
.BR bibextract (1),
.BR bibjoin (1),
.BR biblabel (1),
.BR biblex (1),
.BR biborder (1),
.BR bibparse (1),
.BR bibsort (1),
.BR bibtex (1),
.BR bibunlex (1),
.BR sed (1).
.\"======================================================================
.SH AUTHOR
.nf
Nelson H. F. Beebe, Ph.D.
Center for Scientific Computing
Department of Mathematics
University of Utah
Salt Lake City, UT 84112
Tel: +1 801 581 5254
FAX: +1 801 581 4148
Email: <beebe@math.utah.edu>
.fi
.\"==============================[The End]==============================
