


User Commands					       BIBSORT(1)



NAME
     bibsort - sort a BibTeX bibliography file

SYNOPSIS
     bibsort [-?]  [-author]
	     [-byday  or -bylabel  or -bypages
	      or -byseriesvolume  or -byvolume	or -byyear]
	     [-copyright] [-help] [-reverse] [-version]
	     [ optional	sort(1)	options	]
	     [ <infile or BibTeXfile(s)	] >outfile

DESCRIPTION
     bibsort filters a BibTeX bibliography, or bibliography frag-
     ment,  on	its standard input, printing on	standard output	a
     sorted bibliography.

     Sorting is	normally by BibTeX citation  label  name,  or  by
     @String macro name, and letter case is always ignored in the
     sorting.

OPTIONS
     Command-line options may be abbreviated to	a unique  leading
     prefix,  and  letter  case	 is  ignored,  so  that	 -option,
     -Option, -OPTION, -oPtIoN,	etc. are all equivalent.

     For the sort order	options	beginning -by, the last	one  seen
     overrides all earlier ones.

     All options are parsed before any input  bibliography  files
     are read, no matter what their order on the command line.

     Except for	the options described below,  command-line  words
     beginning	with  a	 hyphen	 are  assumed to be options to be
     passed to sort(1).

     The leading hyphen	 that  distinguishes  an  option  from	a
     filename  may  be	doubled,  for  compatibility with GNU and
     POSIX  conventions.   Thus,   -author   and   --author   are
     equivalent.

     All remaining command-line	words are  assumed  to	be  input
     files.   Should such a filename begin with	a hyphen, it must
     be	disguised by a leading	absolute  or  relative	directory
     path, e.g., /tmp/-foo.bib or ./-foo.bib.
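
     The rules above for telling bibsort options, sort(1) options,
     and input files apart can be sketched as follows.  This is an
     illustration only, not the actual implementation (which is a
     shell script); the function name classify_word is
     hypothetical, and handing ambiguous prefixes such as -by to
     sort(1) is an assumption.

```python
# Illustrative sketch only; names here are hypothetical.
OPTIONS = (
    "?", "author", "byday", "bylabel", "bypages", "byseriesvolume",
    "byvolume", "byyear", "copyright", "help", "reverse", "version",
)


def classify_word(word):
    """Return (kind, value) for one command-line word."""
    if not word.startswith("-"):
        return ("file", word)
    # The leading hyphen may be doubled: -author and --author are equal.
    name = word[2:] if word.startswith("--") else word[1:]
    name = name.lower()  # letter case is ignored in option names
    matches = [opt for opt in OPTIONS if name and opt.startswith(name)]
    if len(matches) == 1:
        return ("bibsort", matches[0])
    # Assumption: unknown or ambiguous words go to sort(1) unchanged.
    return ("sort", word)
```

     Under these assumptions, -AUTH resolves to -author, -u is
     passed to sort(1), and ./-foo.bib is taken as an input file.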

     The sort(1) -f option to ignore letter case differences is
     always supplied.  The sort(1) -u option, if given, removes
     duplicate bibliography entries from the input stream;
     however, such entries must match exactly, including all
     white space.

     Sort keys are constructed from several parts of  the  BibTeX
     entry.   If  non-numeric  values are found	where numbers are
     normally expected (that is, for BibTeX day,  number,  pages,
     volume,  and year keys), they are replaced	by large integers
     that will sort higher  than  any  reasonable  integer  value
     likely  to	 be present.  Nondigits	after the first	character
     are ignored, so 20S will reduce  to  20:	such  values  are
     occasionally seen for volume, number, and pages values.

     However, uncertain	year values of the form	19xx or	20xx  are
     sorted at the end of their	century.
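
     A minimal sketch of the numeric normalization described
     above.  The sentinel LARGE, the function names, and the use
     of a half-year offset to place 19xx after 1999 are all
     assumptions made for illustration; the values bibsort
     actually uses are not documented here.  Start and end pages
     are assumed to have been split apart already.

```python
import re

# Hypothetical sentinel: larger than any plausible field value.
LARGE = 10**9


def numeric_key(value):
    """Reduce a day, number, pages, volume, or year value to a number."""
    text = value.strip()
    if not text or text[0] not in "0123456789":
        return LARGE  # non-numeric values sort after real numbers
    # Nondigits after the first character are ignored: "20S" -> 20.
    return int(text[0] + re.sub(r"[^0-9]", "", text[1:]))


def year_key(value):
    """Like numeric_key, but place 19xx or 20xx at the end of its century."""
    text = value.strip().lower()
    match = re.fullmatch(r"([0-9][0-9])xx", text)
    if match:
        # Assumption: sort just after year ..99 of the same century.
        return int(match.group(1)) * 100 + 99.5
    return numeric_key(text)
```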

     -?               Give a brief help message on stderr, then
                      process all further options, but exit with
                      a successful status code (on UNIX, 0)
                      before processing any files.

     -author	      Give an author credit on stderr, then  pro-
		      cess  all	 further options, but exit with	a
		      successful status	code (on UNIX, 0)  before
		      processing any files.

     -byday	      This  option  is	intended  for  use   with
		      bibliographies  of  publications containing
		      day, month, and year data, such as  techni-
		      cal reports, newspapers, and magazines.

		      With  -byday  sorting,  a	 day  keyword  is
		      recognized  (it  will be standard	in BibTeX
		      1.0), but	for backward compatibility, month
		      entries of the form


		      "daynumber " # monthname
		      "daynumber~" # monthname
		      {daynumber } # monthname
		      {daynumber~} # monthname
		      monthname	# "daynumber "
		      monthname	# "daynumber~"
		      monthname	# {daynumber }
		      monthname	# {daynumber~}

		      are also recognized, and will yield both	a
		      day  and	a  month.  If a	day number is not
		      available, a very	large value  is	 assumed,
		      which will sort the entry	after others that
		      have day values in the same year and month.

		      The sort keys are:  <part>  <year>  <month>
		      <day>  <start-pages> <end-pages> <citation-
		      label>, in that order.

		      The <part> key represents	one of the BibTeX
		      file parts described in a	later section.

     -bylabel	      Sort the input by	 BibTeX	 citation  label.
		      This  is	the default, if	no -byxxx options
		      are specified.

		      The sort keys are:  <part> <citation-label>
		      <journal>	 <year>	<volume> <number> <start-
		      pages> <end-pages>.

		      The use of additional sort keys  after  the
		      initial  two  or three is	intentional: that
		      way, entries  that  are  otherwise  `equal'
		      will  be	consistently ordered according to
		      their publication	times.

     -bypages	      This  option  is	intended  for  use   with
		      bibliographies of	articles from those jour-
		      nals where page numbers increase	monotoni-
		      cally  through the volume, across	all issue
		      numbers.	Do not use it for  bibliographies
		      of journals or magazines where page numbers
		      are reset	at each	issue.

		      -bypages is similar  to  -byvolume,  except
		      that the issue number is ignored.

		      The reason for ignoring the issue	number is
		      that   some  journal  databases  lack  that
		      information.  If -byvolume were used,  then
		      articles	lacking	 issue	numbers	 would be
		      sorted separately	 from  those  with  issue
		      numbers, which makes it harder to	check for
		      duplicates, or to	compare	entries	with ori-
		      ginal journal issues.

		      The sort keys are:  <part> <journal> <year>
		      <volume>	    <start-pages>     <end-pages>
		      <citation-label>.

     -byseriesvolume  This  option  is	intended  for  use   with
		      bibliographies  of  series, such as Lecture
		      Notes in Mathematics.

		      The  sort	 keys	are:	<part>	 <volume>
		      <citation-label>	<journal> <year> <volume>
		      <number> <start-pages> <end-pages>.

     -byvolume	      This  option  is	intended  for  use   with
		      bibliographies of	single journals.

		      The journal name is included  in	the  sort
		      keys, so that in a bibliography with multi-
		      ple  journals,  output  entries  for   each
		      journal are kept together.

                      With -byvolume sorting, warnings are issued
                      for any entry in which any of the sort-key
                      fields is missing, and a value for the
                      missing field is supplied that will sort
                      higher than any printable value.

		      Because -byvolume	sorting	is first on jour-
		      nal  name,  it  is  essential that there be
		      only one form of	each  journal  name;  the
		      best  way	 to  ensure this is to always use
		      @String{...}    abbreviations   for   them.
		      Order  -byvolume is convenient for checking
		      a	bibliography against the  original  jour-
		      nal, but less convenient for a bibliography
		      user.

		      The sort keys are:  <part> <journal> <year>
		      <volume> <number>	<start-pages> <end-pages>
		      <citation-label>.

     -byyear	      If this option is	given,	then  sorting  is
		      first  by	 year,	then  by  citation label.
		      This is useful for keeping  a  bibliography
		      in approximate chronological order, ordered
		      by citation label	within each year.

		      The   sort   keys	  are:	  <part>   <year>
		      <citation-label>	<journal> <year> <volume>
		      <number> <start-pages> <end-pages>.

     -copyright	      Give a brief copyright message  on  stderr,
		      then  process all	further	options, but exit
		      with a successful	status code (on	UNIX,  0)
		      before processing	any files.

     -help	      Give a brief help	message	on  stderr,  then
		      process  all further options, but	exit with
		      a	 successful  status  code  (on	UNIX,  0)
		      before processing	any files.

     -reverse	      Reverse the order	of the sort.  This option
		      does  not	affect the ordering of the BibTeX
		      file parts (see below).  It applies only to
		      the bibliographic	entries, and within those
		      entries, only to	the  citation  label  and
		      `numeric'	 fields	 (volume,  number, pages,
		      day, month, and year).

		      Thus,  bibsort  -reverse	-byvolume  for	a
		      bibliography  with  multiple  journals will
		      sort entries for each  journal  in  reverse
		      publication  order,  but the journal blocks
		      will still be in ascending order by journal
		      name.

     -version	      Give a  brief  version  number  message  on
		      stderr,  then  process all further options,
		      but exit with a successful status	code  (on
		      UNIX, 0) before processing any files.
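
     The backward-compatible month forms listed under -byday can
     be recognized along the following lines.  This is a hedged
     sketch, not the code bibsort uses; split_day_month and the
     regular expressions are hypothetical, and a missing day is
     reported as None rather than as a very large value.

```python
import re

# A quoted or braced day number, optionally followed by a space or tie.
_DAY = r'["{]\s*([0-9]+)[ ~]?\s*["}]'
_MONTH = r"([A-Za-z]+)"
_DAY_FIRST = re.compile(rf"^\s*{_DAY}\s*#\s*{_MONTH}\s*$")
_MONTH_FIRST = re.compile(rf"^\s*{_MONTH}\s*#\s*{_DAY}\s*$")


def split_day_month(value):
    """Return (day, monthname) from a BibTeX month field value."""
    match = _DAY_FIRST.match(value)
    if match:
        return int(match.group(1)), match.group(2).lower()
    match = _MONTH_FIRST.match(value)
    if match:
        return int(match.group(2)), match.group(1).lower()
    # No embedded day number: only the month is known.
    return None, value.strip().lower()
```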

BIBTEX FILE PARTS
     The input stream is conceptually divided  into  five  parts,
     any of which may be absent.

	  1.  Introductory  material  such  as	 comments,   file
	      headers,	and edit logs that are ignored by BibTeX.
	      No line in this part begins with an at-sign, ``@''.

	  2.  Preamble material	delineated by ``@Preamble{''  and
	      a	 matching closing ``}'', intended to be	processed
	      by TeX.  Normally, there is only one such	entry  in
	      a	 bibliography file, although BibTeX, and bibsort,
	      permit more than one.

	  3.  Macro  definitions  (abbreviations)  of  the   form
	      ``@String{...}''.	 Any single @String specification
	      may span multiple	 lines,	 and  there  are  usually
	      several such definitions.

	  4.  Bibliography  entries  such  as  ``@Article{...}'',
	      ``@Book{...}'', ``@InProceedings{...}'', and so on,
	      provided	that  their  citation  labels  have   not
	      already  been  encountered in a crossref assignment
	      in a preceding entry.  For bibsort, any  line  that
	      begins with an ``@'' followed by letters and digits
	      and an open brace	 is  considered	 to  be	 such  an
	      entry.   Optional	 spaces	and tabs may surround the
	      ``@'', and precede  the  first  open  brace;  these
	      spaces  and tabs will be deleted from the	output to
	      help standardize the appearance.

	  5.  ``@Proceedings{...}'' bibliography  entries,  which
	      are  likely to be	cross-referenced by ``@InProceed-
	      ings{...}'' entries,  and	 any  other  bibliography
	      entries  for  which  a  crossref assignment was met
	      before the entry itself.

	  An unfortunate implementation	limitation of the current
	  BibTeX  requires  cross-referenced  entries  to  appear
	  after	all  other  entries  that  cross-reference  them,
	  although this	limitation works to the	advantage of bib-
	  sort,	allowing single-pass processing.
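
     The single-pass division into five parts might look like
     the sketch below.  It assumes the input has already been
     split into one text chunk per entry, with any leading
     commentary as the first chunk; the names classify,
     ENTRY_START, and CROSSREF are hypothetical, and comparing
     labels without regard to letter case is an assumption.

```python
import re

# "@" + letters/digits + "{", with optional spaces and tabs around "@".
ENTRY_START = re.compile(
    r"^[ \t]*@[ \t]*([A-Za-z0-9]+)[ \t]*\{[ \t]*([^,\s]*)")
CROSSREF = re.compile(r'crossref\s*=\s*["{]\s*([^"}\s]+)', re.IGNORECASE)


def classify(chunks):
    """Yield (part, chunk) pairs, with part in 1..5, in one pass."""
    seen_crossrefs = set()
    for chunk in chunks:
        match = ENTRY_START.match(chunk)
        if not match:
            part = 1  # introductory material
        else:
            kind = match.group(1).lower()
            label = match.group(2).lower()
            if kind == "preamble":
                part = 2
            elif kind == "string":
                part = 3
            elif kind == "proceedings" or label in seen_crossrefs:
                part = 5  # cross-referenced entries go last
            else:
                part = 4
            # Remember labels named in crossref assignments so far.
            seen_crossrefs.update(
                name.lower() for name in CROSSREF.findall(chunk))
        yield part, chunk
```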

     The order of these parts is preserved in the output stream.
     Part 1 will be unchanged, but parts 2 through 5 will be
     sorted within themselves.

     The sort key of ``@Preamble'' entries is their initial line;
     of ``@String'' entries, the abbreviation name.  For all
     other BibTeX entries, the sort key is the citation label
     between the open curly brace and the trailing comma, unless
     the sort key is prefixed with additional fields as requested
     by a sort order option such as -byvolume or -byyear.
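
     As an illustration, the key orders documented for the -by
     options, together with the documented effect of -reverse,
     can be expressed as a composite key.  The sketch assumes
     each entry is a dictionary whose values have already been
     normalized (numbers for numeric fields, lowercase strings
     otherwise); sort_key, KEY_ORDER, and Descending are
     hypothetical names, not part of bibsort.

```python
from functools import total_ordering

# Key orders as listed under the -by options.
KEY_ORDER = {
    "byday": ("part", "year", "month", "day",
              "start-pages", "end-pages", "citation-label"),
    "bylabel": ("part", "citation-label", "journal", "year", "volume",
                "number", "start-pages", "end-pages"),
    "bypages": ("part", "journal", "year", "volume",
                "start-pages", "end-pages", "citation-label"),
    "byseriesvolume": ("part", "volume", "citation-label", "journal",
                       "year", "volume", "number",
                       "start-pages", "end-pages"),
    "byvolume": ("part", "journal", "year", "volume", "number",
                 "start-pages", "end-pages", "citation-label"),
    "byyear": ("part", "year", "citation-label", "journal", "year",
               "volume", "number", "start-pages", "end-pages"),
}
# -reverse leaves the file parts and the journal name ascending.
NOT_REVERSED = {"part", "journal"}


@total_ordering
class Descending:
    """Wrap a value so that it compares in the opposite order."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return self.value > other.value


def sort_key(entry, mode="bylabel", reverse=False):
    """Build the composite sort key for one normalized entry."""
    key = []
    for field in KEY_ORDER[mode]:
        value = entry[field]
        if reverse and field not in NOT_REVERSED:
            value = Descending(value)
        key.append(value)
    return tuple(key)
```

     With mode "byvolume" and reverse set, entries for each
     journal come out newest first while the journals themselves
     stay in ascending order, matching the -reverse description.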

     bibsort will correctly handle UNIX	files with LF line termi-
     nators, as	well as	IBM PC DOS files with CR LF line termina-
     tors; the essential requirement is	that input lines be  del-
     ineated by	LF characters.	Thus, files from the Apple Macin-
     tosh, which uses bare CR to  terminate  lines,  would  first
     have  to  be  converted to	UNIX or	PC DOS line format before
     giving them to bibsort.
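
     One way to perform that conversion beforehand is sketched
     below; the function name to_lf is hypothetical, and the
     sketch assumes the whole file fits in memory.

```python
def to_lf(data: bytes) -> bytes:
    """Convert CR LF and bare CR line terminators to LF."""
    # Handle CR LF first so that it does not become two line breaks.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```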

CAVEATS
     BibTeX has	loose syntactical requirements that  the  current
     simple  implementation of bibsort does not	support.  In par-
     ticular, outer parentheses	may  not  be  used  in	place  of
     braces  following ``@keyword'' patterns.  If you have such	a
     file, you can use bibclean(1) to prettyprint it into a  form
     that bibsort can handle successfully.

     The user must be aware that sorting a  bibliography  is  not
     without peril, for	at least these reasons:

	  1.  BibTeX has a requirement that entry labels given in
	      crossref = label pairs in	a bibliography entry must
	      refer to entries defined later,  rather  than  ear-
	      lier,  in	 the bibliography file.	 This regrettable
	      implementation limitation	of the current	(pre-1.0)
	      BibTeX  prevents arbitrary ordering of entries when
	      crossref values are present.   To	 partially  solve
	      this  problem,  bibsort will place ``@Proceedings''
	      entries last,  since  they  are  frequently  cross-
	      referenced by ``@InProceedings'' entries.	 However,
	      it is also possible for ``@Book'', ``@InBook'', and
	      ``@InCollection''	   entries   to	  cross-reference
	      ``@Book''	entries, and for ``@Article'' entries  to
              cross-reference other ``@Article'' entries.  Neither
              of these cases is dealt with by bibsort, except
              that ``@Book'' entries that contain a ``booktitle''
              assignment, and entries that are explicitly
              cross-referenced before their definition, are
              sorted with ``@Proceedings''.

	  2.  If the BibTeX file contains interspersed commentary
	      between  ``@keyword{...}''  entries,  this material
	      will be considered part of the preceding entry, and
	      will be sorted with it.  Leading commentary is more
	      common, and will be moved	elsewhere in the file.

	      This is normally not  a  problem	for  the  part	1
	      material before the ``@Preamble'', since it is kept
	      together at the beginning	of the output stream.

	  3.  Some kinds of bibliography files should be kept  in
	      a	 different  order than alphabetically by citation
	      labels.  Good examples are a bibliography	file with
	      the  contents  of	a journal, or a	personal publica-
	      tion list, for both of which chronological publica-
	      tion order is likely to be preferred.

     While a much more sophisticated  implementation  of  bibsort
     could  deal  with	the first point, and the -byvolume option
     provides a	partial	solution to the	third point, in	 general,
     a	satisfactory  solution	requires  human	 intelligence and
     natural language understanding that computers lack.

     bibsort uses octal	ASCII control characters 001 through 007,
     0177,  and	 0377,	for  temporary modifications of	the input
     stream.  If any of	these are already present in  the  input,
     they  will	 be  altered on	output.	 This is unlikely to be	a
     problem, because those characters neither have a printable
     representation nor are conventionally used to mark line or
     page boundaries in text files.

PROGRAMMING NOTES
     Some text editors permit application of an	arbitrary  filter
     command  to a region of text.  For	example, in GNU	emacs(1),
     the   command   C-u    M-x	   shell-command-on-region,    or
     equivalently,  C-u	 M-|,  can  be	used  to run bibsort on	a
     region of the buffer that is devoid of cross references  and
     other material that cannot	be safely sorted.

     Some  implementations  of	BibTeX	editing	 support  in  GNU
     emacs(1)  have  a	sort-bibtex-entries command that is func-
     tionally similar to bibsort.  However, the	 file  size  that
     can  be  processed	by emacs(1) is limited,	while bibsort can
     be	used on	arbitrarily large  files,  since  it  acts  as	a
     filter,  processing  a  small amount of data at a time.  The
     sort stage	needs the entire data  stream,	but  fortunately,
     the  UNIX sort(1) command is clever enough	to deal	with very
     large inputs.

     The current implementation	of bibsort follows the UNIX trad-
     ition  of	combining simple already-available tools.  A six-
     stage pipeline of	egrep(1),  nawk(1),  sort(1),  and  tr(1)
     accomplishes  the	job  in	 one pass with about 900 lines of
     heavily-commented shell script, about 500 lines of which are
     a	nawk(1)	 program for insertion of sort keys.  The initial
     prototype of bibsort was written and tested on several large
     bibliographies  in	a couple of hours, and after considerable
     use, was later extended with advanced  sorting  capabilities
     and cross-reference recognition in	a couple of days of work.
     By	contrast, bibtex(1) is more than 11 000	lines of code and
     documentation,  and  bibclean(1)  is  more	than 15	000 lines
     long; both	took months to develop,	implement, and test.

BUGS
     bibsort may fail on  some	UNIX  systems  if  their  sort(1)
     implementations  cannot  handle very long lines, because for
     sorting purposes, each complete bibliography entry	 is  tem-
     porarily  folded  into  a	single	line.  You may be able to
     overcome this problem by adding  a	 -znnnnn  option  to  the
     sort(1)  command (passed via the command line to bibsort) to
     increase the maximum line size to some larger value of nnnnn
     bytes.   According	to their documentation,	some UNIX sort(1)
     implementations require a space after -z, others forbid  it,
     and  still	 others	 do not	support	it at all.  If a space is
     required, you must	quote the  pair,  to  prevent  the  nnnnn
     value from	being interpreted as a filename	by bibsort.
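
     The folding of each entry into a single line can be pictured
     with the following sketch.  Which control character serves
     which purpose is an assumption made here for illustration,
     and the names fold, unfold, and sort_entries are
     hypothetical; the sketch also shows why such characters
     must not already occur in the input.

```python
# Assumed roles for two of the reserved control characters.
NEWLINE_MARK = "\001"  # stands in for a line break inside an entry
KEY_SEP = "\002"       # separates the sort key from the entry text


def fold(key, entry):
    """Fold a multi-line entry into one line, prefixed by its sort key."""
    return key + KEY_SEP + entry.replace("\n", NEWLINE_MARK)


def unfold(line):
    """Drop the sort key and restore the original line breaks."""
    _, _, body = line.partition(KEY_SEP)
    return body.replace(NEWLINE_MARK, "\n")


def sort_entries(entries, key_of):
    """Sort entries by key_of(entry), ignoring letter case."""
    lines = [fold(key_of(entry).lower(), entry) for entry in entries]
    lines.sort()  # the role played by sort(1) in the pipeline
    return [unfold(line) for line in lines]
```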

SEE ALSO
     bibcheck(1),  bibclean(1),	 bibdup(1),  bibextract(1),  bib-
     join(1),  biblabel(1),  biblex(1),	biborder(1), bibparse(1),
     bibsearch(1),    bibsplit(1),    bibtex(1),     bibunlex(1),
     citesub(1),  egrep(1),  emacs(1), gawk(1),	mawk(1), nawk(1),
     sort(1), tr(1).

AUTHOR
     Nelson H. F. Beebe, Ph.D.
     Center for	Scientific Computing
     University	of Utah
     Department	of Mathematics,	322 INSCC
     155 S 1400	E RM 233
     Salt Lake City, UT	84112-0090
     USA
     Tel: +1 801 581 5254
     FAX: +1 801 585 1640, +1 801 581 4148
     Email: beebe@math.utah.edu, beebe@acm.org,	beebe@ieee.org (Internet)
     WWW URL: http://www.math.utah.edu/~beebe

COPYRIGHT
     ########################################################################
     ########################################################################
     ########################################################################
     ###								  ###
     ###	     bibsort: sort a BibTeX bibliography file		  ###
     ###								  ###
     ###	      Copyright	(C) 2000 Nelson	H. F. Beebe		  ###
     ###								  ###
     ### This program is covered by the	GNU General Public License (GPL), ###
     ### version 2 or later, available as the file COPYING in the program ###
     ### source	distribution, and on the Internet at			  ###
     ###								  ###
     ###	       ftp://ftp.gnu.org/gnu/GPL			  ###
     ###								  ###
     ###	       http://www.gnu.org/copyleft/gpl.html		  ###
     ###								  ###
     ### This program is free software;	you can	redistribute it	and/or	  ###
     ### modify	it under the terms of the GNU General Public License as	  ###
     ### published by the Free Software	Foundation; either version 2 of	  ###
     ### the License, or (at your option) any later version.		  ###
     ###								  ###
     ### This program is distributed in	the hope that it will be useful,  ###
     ### but WITHOUT ANY WARRANTY; without even	the implied warranty of	  ###
     ### MERCHANTABILITY or FITNESS FOR	A PARTICULAR PURPOSE.  See the	  ###
     ### GNU General Public License for	more details.			  ###
     ###								  ###
     ### You should have received a copy of the	GNU General Public	  ###
     ### License along with this program; if not, write	to the Free	  ###
     ### Software Foundation, Inc., 59 Temple Place, Suite 330,	Boston,	  ###
     ### MA 02111-1307 USA						  ###
     ########################################################################
     ########################################################################
     ########################################################################

AVAILABILITY
     Internet source distributions of bibsort  are  available  at
     the World-Wide Web	Uniform	Resource Locator addresses

	  ftp://ftp.math.utah.edu/pub/tex/bib/bibsort-x.yy.jar
	  ftp://ftp.math.utah.edu/pub/tex/bib/bibsort-x.yy.tar.gz
	  ftp://ftp.math.utah.edu/pub/tex/bib/bibsort-x.yy.zip
	  ftp://ftp.math.utah.edu/pub/tex/bib/bibsort-x.yy.zoo

	  http://www.math.utah.edu/pub/tex/bib/bibsort-x.yy.jar
	  http://www.math.utah.edu/pub/tex/bib/bibsort-x.yy.tar.gz
	  http://www.math.utah.edu/pub/tex/bib/bibsort-x.yy.zip
	  http://www.math.utah.edu/pub/tex/bib/bibsort-x.yy.zoo

     where x.yy	is the current	version	 (0.15	for  the  version
     whose documentation you are now reading).

     That site is mirrored to several other Internet archives, so
     you  may  also be able to find it elsewhere on the	Internet;
     try searching for the string bibsort at one or more  of  the
     popular Web search	sites, such as

	  http://altavista.digital.com/
	  http://search.microsoft.com/us/default.asp
	  http://www.dejanews.com/
	  http://www.dogpile.com/index.html
	  http://www.euroseek.net/page?ifl=uk
	  http://www.excite.com/
	  http://www.go2net.com/search.html
	  http://www.google.com/
	  http://www.hotbot.com/
	  http://www.infoseek.com/
	  http://www.inktomi.com/
	  http://www.lycos.com/
	  http://www.northernlight.com/
	  http://www.snap.com/
	  http://www.stpt.com/
	  http://www.yahoo.com/

Version 0.15      Last change: 17 January 2000