


CITESUB(1)               USER COMMANDS                 CITESUB(1)



NAME
     citesub - substitute standardized BibTeX citation labels

SYNOPSIS
     citesub [ -f substitution-file ] [ -v ] [ file(s) ] >outfile

DESCRIPTION
     citesub applies citation label substitutions to input BibTeX,
     LaTeX, and TeX files.  The substitutions will usually be
     generated automatically by the companion biblabel program
     (see below), which can be used to standardize the form of
     BibTeX citation labels to the conventions adopted for the
     BibNet Project.

     Most existing BibTeX bibliography files have been found to
     have haphazardly chosen, unsystematic citation labels that
     are very likely to conflict with labels in other bibliography
     files; biblabel and citesub provide an automatic way to
     rectify this.

     To avoid confusion between labels with common prefixes, such
     as  Smith80 and Smith80a, citesub checks for leading context
     of a left brace, quote, comma, whitespace, or  beginning  of
     line  and  trailing  context of a right brace, comma, quote,
     percent, whitespace, or end of line so  as  to  match  these
     styles:

          @Book{Smith:1980:ABC,

          crossref = "Smith:1980:ABC",

          crossref = {Smith:1980:ABC},

          \cite{Smith:1980:ABC}

          \cite{Smith:1980:ABC,Jones:1994:DEF}

          \cite{%
                 Smith:1980:ABC,%
                 Jones:1994:DEF%
          }

     Although one might expect that simple application of standard
     software tools like the UNIX awk(1) and sed(1) utilities
     could do the string substitution job, this is not the case.
     For one thing, the required context sensitivity complicates
     the regular-expression patterns that are needed.  For
     another, most UNIX sed(1) implementations have a built-in
     limit of about 100 substitutions, which is far too few for
     typical bibliographies.  Finally, simple application of
     awk(1) and sed(1) involves matching every input line with
     every substitution pattern, which results in quadratic
     run-time behavior that proves impossibly slow for large
     bibliographies.

     citesub provides an efficient solution  whose  run  time  is
     essentially proportional to the size of the input files, and
     independent of the number of  substitutions  to  be  carried
     out.   This  is  achieved by tokenizing the input lines, and
     then looking up each token in a constant-access time  (hash)
     table  of substitutions.  An initial prototype programmed in
     the awk language led to a final version in C that ran  about
     50 times faster, processing about 4000 input lines/sec on an
     entry-level Sun SPARCstation LX workstation.
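
     The tokenize-and-look-up approach described above can be
     sketched in Python as follows.  This is only an illustration,
     not the citesub source: the names LABEL_CHARS, TOKEN, and
     substitute_line are hypothetical, the old labels and the label
     Smith:1980:XYZ are made up for the example, and the leading
     and trailing context checks are approximated by treating every
     maximal run of label characters as one token.

```python
import re

# Characters permitted in citation labels (letters, digits, and : - + / . ' _).
LABEL_CHARS = "A-Za-z0-9:+/.'_-"
# A token is a maximal run of label characters; everything else is a delimiter.
TOKEN = re.compile("[" + LABEL_CHARS + "]+")


def substitute_line(line, table):
    """Replace each whole token found in table; copy everything else unchanged."""
    # A dict lookup takes expected constant time, so the cost per line does
    # not depend on how many substitutions the table holds.
    return TOKEN.sub(lambda m: table.get(m.group(0), m.group(0)), line)


# Hypothetical old -> new label pairs.
table = {"Smith80": "Smith:1980:ABC", "Smith80a": "Smith:1980:XYZ"}
print(substitute_line(r"\cite{Smith80,Smith80a}", table))
```

     Because whole tokens are looked up, Smith80 and Smith80a are
     distinct keys, so a label is never rewritten merely because it
     shares a prefix with another.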

OPTIONS
     Except for the options described below, all command-line
     words are assumed to be input files.  Should such a filename
     begin with a hyphen, it must be disguised by a leading
     absolute or relative directory path, e.g., /tmp/-foo.bib or
     ./-foo.bib.

     -f substitution-file   This option specifies the name  of  a
                            file  containing pairs of old and new
                            citation labels, one pair  per  line,
                            surrounded  by  arbitrary  amounts of
                            whitespace.  This file is most easily
                            generated  by  the  companion program
                            biblabel(1).

                            If this option is omitted,  then  the
                            substitution filename will be derived
                            from that of the first input file  by
                            replacing   its  extension  by  .sub.
                            Thus, the commands
                            citesub -f foo.sub foo.bib >foo.bib-new
                            and
                            citesub foo.bib >foo.bib-new
                            are equivalent.
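
                            The default-name rule and the pair
                            format described above can be
                            sketched in Python as follows.  This
                            is an illustration only, not the
                            citesub source: default_sub_name and
                            read_substitutions are hypothetical
                            names, and treating every line
                            without exactly two fields
                            (including a blank one) as a
                            warning is an assumption.

```python
import os
import sys


def default_sub_name(first_input):
    """Replace the extension of the first input file by .sub."""
    root, _ext = os.path.splitext(first_input)
    return root + ".sub"


def read_substitutions(lines):
    """Build an old-label -> new-label table from whitespace-separated pairs."""
    table = {}
    for number, line in enumerate(lines, start=1):
        fields = line.split()  # arbitrary surrounding whitespace is discarded
        if len(fields) != 2:
            # Assumption: any line without exactly two fields is warned
            # about on stderr and then ignored.
            print(f"warning: line {number}: expected 2 fields, ignored",
                  file=sys.stderr)
            continue
        old, new = fields
        table[old] = new
    return table


print(default_sub_name("foo.bib"))
print(read_substitutions(["  Smith80   Smith:1980:ABC  ", "broken line here"]))
```

                            Here lines may be any iterable of
                            strings, such as an open file
                            object.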

                            If the substitution file is named
                            "-", then citesub follows the common
                            UNIX convention and interprets it to
                            mean standard input, allowing the
                            substitutions to be provided from a
                            pipeline, such as
                            biblabel foo.bib | citesub -f - foo.bib >foo.new

     -v                     Display the program  version  number,
                            and possibly installer, location, and
                            compile-time information, on stderr.

WARNING AND ERROR MESSAGES
     citesub will issue warning messages in the following cases:

     + Incorrect number of fields in substitution file. The  line
       is ignored.

     + Invalid citation label  character.  citesub  expects  that
       citation  labels  follow  the  requirements  of the BibTeX
       grammar described in the paper

       Nelson H. F. Beebe, Bibliography prettyprinting and syntax
       checking,  TUGboat  14(3), 222, October (1993) and TUGboat
       14(4), 395--419, December (1993).

       Citation labels must contain only these characters:
       ABCDEFGHIJKLMNOPQRSTUVWXYZ
       abcdefghijklmnopqrstuvwxyz
       0123456789
       :-+/.'_
       citesub will continue processing,  but  since  only  input
       tokens  containing  the above set of characters are candi-
       dates for substitution, such erroneous labels will not  be
       substituted.

     + No label substitutions found in file.  This is not  neces-
       sarily  an  error, but might be.  citesub will then simply
       copy its input to its output.
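
     The label character check in the second case above can be
     sketched in Python as follows.  This is a minimal illustration
     rather than the citesub implementation; VALID_LABEL_CHARS and
     invalid_chars are hypothetical names, and the label
     Smith&Jones80 is a made-up example of an invalid label.

```python
import string

# The character set listed above for citation labels.
VALID_LABEL_CHARS = frozenset(string.ascii_letters + string.digits + ":-+/.'_")


def invalid_chars(label):
    """Return the characters of label that fall outside the permitted set, in order."""
    return [ch for ch in label if ch not in VALID_LABEL_CHARS]


for label in ("Smith:1980:ABC", "Smith&Jones80"):
    bad = invalid_chars(label)
    print(label, "ok" if not bad else "invalid: " + "".join(bad))
```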

SEE ALSO
     awk(1), bibcheck(1), bibclean(1), bibextract(1), bibjoin(1),
     biblabel(1), biblex(1), biborder(1), bibparse(1), bibsort(1),
     bibtex(1), bibunlex(1), sed(1).

AUTHOR
     Nelson H. F. Beebe, Ph.D.
     Center for Scientific Computing
     Department of Mathematics
     University of Utah
     Salt Lake City, UT 84112
     Tel: +1 801 581 5254
     FAX: +1 801 581 4148
     Email: <beebe@math.utah.edu>













Version 0.00      Last change: 05 November 1994                 3



