% This program is copyright (C) 1985 by Oren Patashnik; all rights reserved.
% Copying of this file is authorized only if (1) you are Oren Patashnik, or if
% (2) you make absolutely no changes to your copy. (The WEB system provides
% for alterations via an auxiliary file; the master file should stay intact.)
% See Appendix H of the WEB manual for hints on how to install this program.
% Version 0.98f was released in March 1985.
% Version 0.98g was released in April; it removed some system dependencies
% (introducing term_in and term_out in place of just tty, and removing
% some nonlocal goto's) and it gave context for certain parsing errors
% Version 0.98h was released in April; it patched a bug in the output
% line-breaking routine that can arise with some nonstandard style files
% Version 0.98i was released in May; its main change split up the main program
% and some procedures to help certain compilers cope with size
% limitations, among other things changing error and warning macros so
% they'd produce (much) less inline code; it also redefined the class of
% legal style-file identifiers---although this affects only the bizarre
% ones, it makes BibTeX's error messages more coherent; and it had many
% minor changes, including about a 15% speed-up on TOPS-20
% Please report any bugs to Oren Patashnik (PATASHNIK@@SU-SCORE.ARPA)
% Although considerable effort has been expended to make the BibTeX program
% correct and reliable, no warranty is implied; the author disclaims any
% obligation or liability for damages, including but not limited to
% special, indirect, or consequential damages arising out of or in
% connection with the use or performance of this software
% This program was written by Oren Patashnik, in consultation with Leslie
% Lamport, to be used with Lamport's LaTeX document preparation system.
% Some modules were taken from Knuth's TeX and TeXware with his permission.
% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=amr9
\let\mc=\ninerm % medium caps for names like PASCAL
\def\PASCAL{{\mc PASCAL}}
\def\ph{{\mc PASCAL-H}}
\def\<#1>{$\langle#1\rangle$}
\def\section{\mathhexbox278}
\def\(#1){} % this is used to make section names sort themselves better
\def\9#1{} % this is used for sort keys in the index via @@:sort key}{entry@@>

% Note: WEAVE will typeset an upper-case `E' in a PASCAL identifier a
% bit strangely so that the `TeX' in the name of this program is typeset
% correctly; if this becomes a problem remove these three lines to get
% normal upper-case `E's in PASCAL identifiers
\def\drop{\kern-.1667em\lower.5ex\hbox{E}\kern-.125em} % middle of TeX
\catcode`E=13 \uppercase{\def E{e}}
\def\\#1{\hbox{\let E=\drop\it#1\/\kern.05em}} % italic type for identifiers

\font\sc=amcsc10
\def\BibTeX{{\rm B\kern-.05em{\sc i\kern-.025em b}\kern-.08em
    T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}
\def\LaTeX{{\rm L\kern-.36em\raise.3ex\hbox{\sc a}\kern-.15em
    T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}
\def\title{\BibTeX\ }
\def\today{\ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \space\number\day, \number\year}
\def\topofcontents{\null\vfill
  \def\titlepage{F}
  \centerline{\:\titlefont The {\:\ttitlefont \BibTeX} preprocessor}
  \vskip 15pt
  \centerline{(Version 0.98i---\today)}
  \vfill}

@* Introduction.
@↑system dependencies@>
@!@:BibTeX}{\BibTeX@>
@:LaTeX}{\LaTeX@>
\BibTeX\ is a preprocessor (with elements of postprocessing as explained
below) for the \LaTeX\ document-preparation system. It handles most of
the formatting decisions for a reference list, producing a \.{.bbl} file
that a user can edit to add any finishing touches \BibTeX\ isn't designed
to handle; with this file \LaTeX\ actually produces the reference list.

\indent\BibTeX\ works as follows.
It takes as input an \.{.aux} file produced by \LaTeX\ on an earlier run, a \.{.bst} file (usually written by a system wizard in a special-purpose language described in the \BibTeX\ documentation---see the file {\.{btxdoc.tex}}.) that specifies the general reference-list style and specifies how to format individual entries, and \.{.bib} file(s) constituting a database of all reference-list entries the user might ever hope to use. \BibTeX\ chooses from the \.{.bib} file(s) only those entries specified by the \.{.aux} file (that is, those given by \LaTeX's \.{\\cite} command), and creates a \.{.bbl} file containing these entries together with the formatting commands specified by the \.{.bst} file. \LaTeX\ will use this \.{.bbl} file, perhaps edited by the user, to actually produce the reference list. Many modules of this code were taken from Knuth's TeX and TeXware with his permission. Megathanks to Howard Trickey, for whose suggestions future users and style writers would be eternally grateful, if only they knew. All known system-dependent modules of \BibTeX\ are marked in the index entry ``system dependencies''; Dave Fuchs helped exorcise unwanted ones. The |banner| string defined here should be changed whenever \BibTeX\ gets modified. @d banner=='This is BibTeX, Version 0.98i' {printed when the program starts} @ @↑system dependencies@> Terminal output goes to the file |term_out|, while terminal input comes from |term_in|. On our system, these (system-dependent) files are already opened at the beginning of the program, and have the same real name. @d term_out == tty @d term_in == tty @ @↑system dependencies@> This program uses the term |print| instead of |write| when writing on both the |log_file| and (system-dependent) |term_out| file, and it uses |trace_pr| when in |trace| mode, for which it writes on just the |log_file|. 
If you want to change where either set of macros writes to, you should also change the other macros in this program for that set; each such macro begins with |print_| or |trace_pr_|. @d print(#) == begin write(log_file,#); write(term_out,#); end @d print_ln(#) == begin write_ln(log_file,#); write_ln(term_out,#); end @d print_newline == begin write_ln(log_file); write_ln(term_out); end @# @d trace_pr(#) == begin write(log_file,#); end @d trace_pr_ln(#) == begin write_ln(log_file,#); end @d trace_pr_newline == begin write_ln(log_file); end @ @↑debugging@> @↑statistics@> Some of the code below is intended to be used only when diagnosing the strange behavior that sometimes occurs when \BibTeX\ is being installed or when system wizards are fooling around with \BibTeX\ without quite knowing what they are doing. Such code will not normally be compiled; it is delimited by the codewords `$|debug|\ldots|gubed|$', with apologies to people who wish to preserve the purity of English. Similarly, there is some conditional code delimited by `$|stat|\ldots|tats|$' that is intended only for use when statistics are to be kept about \BibTeX's memory/cpu usage, and there is conditional code delimited by `$|trace|\ldots|ecart|$' that is intended to be a trace facility for use mainly when debugging \.{.bst} files. 
@d debug == @{ { remove the `|@{|' when debugging } @d gubed == @t@>@} { remove the `|@}|' when debugging } @f debug == begin @f gubed == end @# @d stat == @{ { remove the `|@{|' when keeping statistics } @d tats == @t@>@} { remove the `|@}|' when keeping statistics } @f stat == begin @f tats == end @# @d trace == @{ { remove the `|@{|' when in |trace| mode } @d ecart == @t@>@} { remove the `|@}|' when in |trace| mode } @f trace == begin @f ecart == end @ @↑system dependencies@> We assume that |case| statements may include a default case that applies if no matching label is found, since most \PASCAL\ compilers have plugged this hole in the language by incorporating some sort of default mechanism. For example, the \ph\ compiler allows `|others|:' as a default label, and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or `\\{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases| and |endcases| should be changed to agree with local conventions. Note that no semicolon appears before |endcases| in this program, so the definition of |endcases| should include a semicolon if the compiler wants one. (Of course, if no default mechanism is available, the |case| statements of \BibTeX\ will have to be laboriously extended by listing all remaining cases. People who are stuck with such \PASCAL s have in fact done this, successfully but not happily!) @d othercases == others: {default for cases not listed explicitly} @d endcases == @+end {follows the default case in an extended |case| statement} @f othercases == else @f endcases == end @ Labels are given symbolic names by the following definitions, so that occasional |goto| statements will be meaningful. We insert the label `|exit|:' just before the `\ignorespaces|end|\unskip' of a procedure in which we have used the `|return|' statement defined below (and this is the only place `|exit|:' appears). 
This label is sometimes used for exiting loops that are set up with the |loop| construction defined below. Another generic label is `|loop_exit|:'; it appears immediately after a loop. Incidentally, this program never declares a label that isn't actually used, because some fussy \PASCAL\ compilers will complain about redundant labels. @d exit=10 {go here to leave a procedure} @d loop_exit=15 {go here to leave a loop within a procedure} @d loop1_exit=16 {the first generic label for procedure with two} @d loop2_exit=17 {the second} @ @↑for loops@> And |while| we're discussing loops: This program makes into |while| loops many that would otherwise be |for| loops because of Standard \PASCAL\ limitations (it's a bit complicated---standard \PASCAL\ doesn't allow a global variable as the index of a |for| loop inside a procedure; furthermore, many compilers have fairly severe limitations on the size of a block, including the main block of the program; so most of the code in this program occurs inside procedures, and since for other reasons this program must use primarily global variables, it doesn't use many |for| loops). @ @↑program conventions@> This program uses this convention: If there are several quantities in a boolean expression, they are ordered by expected frequency (except perhaps when an error message results) so that execution will be fastest; this is more an attempt to understand the program than to make it faster. @ Here are some macros for common programming idioms. @d incr(#) == #:=#+1 {increase a variable by unity} @d decr(#) == #:=#-1 {decrease a variable by unity} @d loop == @+ while true do@+ {repeat over and over until a |goto| happens} @f loop == xclause {\.{WEB}'s |xclause| acts like `\ignorespaces|while true do|\unskip'} @d do_nothing == {empty statement} @d return == goto exit {terminate a procedure call} @f return == nil @d empty=0 {symbolic name for a null constant} @* The main program. 
@↑system dependencies@> @:LaTeX}{\LaTeX@> This program first reads the \.{.aux} file that \LaTeX\ produces, (\romannumeral1) determining which \.{.bib} file(s) and \.{.bst} file to read and (\romannumeral2) constructing a list of cite keys in order of occurrence. The \.{.aux} file may have other \.{.aux} files nested within. Second, it reads and executes the \.{.bst} file, (\romannumeral1) determining how and in which order to process the database entries in the \.{.bib} file(s) corresponding to those cite keys in the list (or in some cases, to all the entries in the \.{.bib} file(s)), (\romannumeral2) determining what text to be output for each entry and determining any additional text to be output, and (\romannumeral3) actually outputting this text to the \.{.bbl} file. In addition, the program sends error messages and other remarks to the |log_file| and terminal. @d close_up_shop=9998 {jump here after fatal errors} @d exit_program=9999 {jump here if we couldn't even get started} @d abort(#)==begin {fatal error---close up shop} print_ln(#); goto close_up_shop; end @p @t\4@>@@/ program BibTEX; {all files are opened dynamically} label close_up_shop,@!exit_program @; const @ type @ var @@; @@; @ @# begin initialize; print_ln(banner);@/ @; @; close_up_shop: @; exit_program: end. @ @↑overflow in arithmetic@> @↑system dependencies@> If the first character of a \PASCAL\ comment is a dollar sign, \ph\ treats the comment as a list of ``compiler directives'' that will affect the translation of this program into machine language. The directives shown below specify full checking and inclusion of the \PASCAL\ debugger when \BibTeX\ is being debugged, but they cause range checking and other redundant code to be eliminated when the production system is being generated. Arithmetic overflow will be detected in all cases. 
@= @{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead} @!debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging} @ @↑gymnastics@> All procedures in this program (except for |initialize|) are grouped into one of the seven classes below, and these classes are dispersed throughout the program. However: Much of this program is written top down, yet \PASCAL\ wants its procedures bottom up. Since mooning is neither a technically nor a socially acceptable solution to the bottom-up problem, this section instead performs the topological gymnastics that \.{WEB} allows, ordering these classes to satisfy \PASCAL\ compilers. There are a few procedures still out of place after this ordering, though, and the other modules that complete the task have ``gymnastics'' as an index entry. @= @@; @@; @@; @@; @@; @@; @ @ This procedure gets things started properly. @= procedure initialize; label @; var @ begin @; if (bad > 0) then begin write_ln (term_out,bad:0,' is a bad bad'); goto exit_program; end; @; @; @; end; @ @↑system dependencies@> These parameters can be changed at compile time to extend or reduce \BibTeX's capacity. @= @!buf_size=1000; {maximum number of characters in an input line (or string)} @!min_print_line=3; {minimum \.{.bbl} line length: must be |>=3|} @!max_print_line=79; {the maximum: must be |>=min_print_line| and | These parameters can also be changed at compile time, but they're needed to define some \.{WEB} numeric macros so they must be so defined themselves. @d hash_size=5000 {must be |>= max_strings| and |>= hash_prime|} @d hash_prime=4253 {a prime number about 85\% of |hash_size| and |>= 128| and |< @t$2↑{14}-2↑6$@>|} @d file_name_size=40 {file names shouldn't be longer than this} @ In case somebody has inadvertently made bad settings of the ``constants,'' \BibTeX\ checks them using a global variable called |bad|. This is the first of many sections of \BibTeX\ where global variables are defined. 
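The checks in this program append one decimal digit to |bad| for each test that fails, so a single integer can report several wrong settings at once. The stand-alone program below is a minimal sketch of that idea and is not part of \BibTeX: the four settings are hypothetical values chosen so that exactly two tests fail, and only four of the tests are reproduced.

```pascal
program BadDigits;
{ Hypothetical settings, chosen only so that two of the checks fail. }
const
  min_print_line = 3;
  max_print_line = 2000;  { deliberately not less than buf_size }
  buf_size = 1000;
  hash_prime = 97;        { deliberately less than 128 }
var
  bad: integer;
begin
  bad := 0;
  { Each failed test appends its own digit to the decimal value of bad. }
  if min_print_line < 3 then bad := 1;
  if max_print_line < min_print_line then bad := 10 * bad + 2;
  if max_print_line >= buf_size then bad := 10 * bad + 3;
  if hash_prime < 128 then bad := 10 * bad + 4;
  writeln(bad : 0, ' is a bad bad');
end.
```

With these settings the third and fourth tests fail, so the digits 3 and 4 are appended in turn and the program reports 34.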
@= @!bad:integer; {is some ``constant'' wrong?} @ Each digit-value of |bad| has a specific meaning. @= bad := 0; if (min_print_line < 3) then bad:=1; if (max_print_line < min_print_line) then bad:=10*bad+2; if (max_print_line >= buf_size) then bad:=10*bad+3; if (hash_prime < 128) then bad:=10*bad+4; if (hash_prime > hash_size) then bad:=10*bad+5; if (hash_prime >= (16384-64)) then bad:=10*bad+6; if (max_strings > hash_size) then bad:=10*bad+7; if (max_cites > max_strings) then bad:=10*bad+8; if (ent_str_size > buf_size) then bad:=10*bad+9; if (glob_str_size > buf_size) then bad:=10*bad+9; {well, almost each} @* The character set. @↑ASCII code@> (The following material is copied (almost) verbatim from \TeX. Thus, the same system-dependent changes should be made to both programs.) In order to make \TeX\ readily portable between a wide variety of computers, all of its input text is converted to an internal seven-bit code that is essentially standard ASCII, the ``American Standard Code for Information Interchange.'' This conversion is done immediately when each character is read in. Conversely, characters are converted from ASCII to the user's external representation just before they are output to a text file. Such an internal code is relevant to users of \TeX\ primarily because it governs the positions of characters in the fonts. For example, the character `\.A' has ASCII code $65=@'101$, and when \TeX\ typesets this letter it specifies character number 65 in the current font. If that font actually has `\.A' in a different position, \TeX\ doesn't know what the real position is; the program that does the actual printing from \TeX's device-independent files is responsible for converting from ASCII to a particular font encoding. \TeX's internal code is relevant also with respect to constants that begin with a reverse apostrophe. @ Characters of text that have been converted to \TeX's internal form are said to be of type |ASCII_code|, which is a subrange of the integers. 
@= @!ASCII_code=0..127; {seven-bit numbers} @ @↑character set dependencies@> @↑system dependencies@> The original \PASCAL\ compiler was designed in the late 60s, when six-bit character sets were common, so it did not make provision for lowercase letters. Nowadays, of course, we need to deal with both capital and small letters in a convenient way, especially in a program for typesetting; so the present specification of \TeX\ has been written under the assumption that the \PASCAL\ compiler and run-time system permit the use of text files with more than 64 distinguishable characters. More precisely, we assume that the character set contains at least the letters and symbols associated with ASCII codes @'40 through @'176; all of these characters are now available on most computer terminals. Since we are dealing with more characters than were present in the first \PASCAL\ compilers, we have to decide what to call the associated data type. Some \PASCAL s use the original name |char| for the characters in text files, even though there now are more than 64 such characters, while other \PASCAL s consider |char| to be a 64-element subrange of a larger data type that has some other name. In order to accommodate this difference, we shall use the name |text_char| to stand for the data type of the characters that are converted to and from |ASCII_code| when they are input and output. We shall also assume that |text_char| consists of the elements |chr(first_text_char)| through |chr(last_text_char)|, inclusive. The following definitions should be adjusted if necessary. 
@d text_char == char {the data type of characters in text files} @d first_text_char=0 {ordinal number of the smallest element of |text_char|} @d last_text_char=127 {ordinal number of the largest element of |text_char|} @= i:0..last_text_char; {this is the first one declared} @ The \TeX\ processor converts between ASCII code and the user's external character set by means of arrays |xord| and |xchr| that are analogous to \PASCAL's |ord| and |chr| functions. @= @!xord: array [text_char] of ASCII_code; {specifies conversion of input characters} @!xchr: array [ASCII_code] of text_char; {specifies conversion of output characters} @ @↑character set dependencies@> @↑system dependencies@> Since we are assuming that our \PASCAL\ system is able to read and write the visible characters of standard ASCII (although not necessarily using the ASCII codes to represent them), the following assignment statements initialize most of the |xchr| array properly, without needing any system-dependent changes. On the other hand, it is possible to implement \TeX\ with less complete character sets, and in such cases it will be necessary to change something here. 
@= xchr[@'40]:=' '; xchr[@'41]:='!'; xchr[@'42]:='"'; xchr[@'43]:='#'; xchr[@'44]:='$'; xchr[@'45]:='%'; xchr[@'46]:='&'; xchr[@'47]:='''';@/ xchr[@'50]:='('; xchr[@'51]:=')'; xchr[@'52]:='*'; xchr[@'53]:='+'; xchr[@'54]:=','; xchr[@'55]:='-'; xchr[@'56]:='.'; xchr[@'57]:='/';@/ xchr[@'60]:='0'; xchr[@'61]:='1'; xchr[@'62]:='2'; xchr[@'63]:='3'; xchr[@'64]:='4'; xchr[@'65]:='5'; xchr[@'66]:='6'; xchr[@'67]:='7';@/ xchr[@'70]:='8'; xchr[@'71]:='9'; xchr[@'72]:=':'; xchr[@'73]:=';'; xchr[@'74]:='<'; xchr[@'75]:='='; xchr[@'76]:='>'; xchr[@'77]:='?';@/ xchr[@'100]:='@@'; xchr[@'101]:='A'; xchr[@'102]:='B'; xchr[@'103]:='C'; xchr[@'104]:='D'; xchr[@'105]:='E'; xchr[@'106]:='F'; xchr[@'107]:='G';@/ xchr[@'110]:='H'; xchr[@'111]:='I'; xchr[@'112]:='J'; xchr[@'113]:='K'; xchr[@'114]:='L'; xchr[@'115]:='M'; xchr[@'116]:='N'; xchr[@'117]:='O';@/ xchr[@'120]:='P'; xchr[@'121]:='Q'; xchr[@'122]:='R'; xchr[@'123]:='S'; xchr[@'124]:='T'; xchr[@'125]:='U'; xchr[@'126]:='V'; xchr[@'127]:='W';@/ xchr[@'130]:='X'; xchr[@'131]:='Y'; xchr[@'132]:='Z'; xchr[@'133]:='['; xchr[@'134]:='\'; xchr[@'135]:=']'; xchr[@'136]:='↑'; xchr[@'137]:='_';@/ xchr[@'140]:='`'; xchr[@'141]:='a'; xchr[@'142]:='b'; xchr[@'143]:='c'; xchr[@'144]:='d'; xchr[@'145]:='e'; xchr[@'146]:='f'; xchr[@'147]:='g';@/ xchr[@'150]:='h'; xchr[@'151]:='i'; xchr[@'152]:='j'; xchr[@'153]:='k'; xchr[@'154]:='l'; xchr[@'155]:='m'; xchr[@'156]:='n'; xchr[@'157]:='o';@/ xchr[@'160]:='p'; xchr[@'161]:='q'; xchr[@'162]:='r'; xchr[@'163]:='s'; xchr[@'164]:='t'; xchr[@'165]:='u'; xchr[@'166]:='v'; xchr[@'167]:='w';@/ xchr[@'170]:='x'; xchr[@'171]:='y'; xchr[@'172]:='z'; xchr[@'173]:='{'; xchr[@'174]:='|'; xchr[@'175]:='}'; xchr[@'176]:='~';@/ xchr[0]:=' '; xchr[@'177]:=' '; {ASCII codes 0 and |@'177| do not appear in text} @ @↑character set dependencies@> @↑system dependencies@> Some of the ASCII codes without visible characters have been given symbolic names in this program because they are used with a special meaning. 
The |tab| character may be system dependent. @d null_code=@'0 {ASCII code that might disappear} @d tab=@'11 {ASCII code treated as |white_space|} @d space=@'40 {ASCII code treated as |white_space|} @d invalid_code=@'177 {ASCII code that should not appear} @ @↑character set dependencies@> @↑system dependencies@> @:TeXbook}{\sl The \TeX book@> The ASCII code is ``standard'' only to a certain extent, since many computer installations have found it advantageous to have ready access to more than 94 printing characters. Appendix~C of {\sl The \TeX book\/} gives a complete specification of the intended correspondence between characters and \TeX's internal representation. If \TeX\ is being used on a garden-variety \PASCAL\ for which only standard ASCII codes will appear in the input and output files, it doesn't really matter what codes are specified in |xchr[1..@'37]|, but the safest policy is to blank everything out by using the code shown below. However, other settings of |xchr| will make \TeX\ more friendly on computers that have an extended character set, so that users can type things like `\.↑↑Z' instead of `\.{\\ne}'. At MIT, for example, it would be more appropriate to substitute the code $$\hbox{|for i:=1 to @'37 do xchr[i]:=chr(i);|}$$ \TeX's character set is essentially the same as MIT's, even with respect to characters less than~@'40. People with extended character sets can assign codes arbitrarily, giving an |xchr| equivalent to whatever characters the users of \TeX\ are allowed to have in their input files. It is best to make the codes correspond to the intended interpretations as shown in Appendix~C whenever possible; but this is not necessary. For example, in countries with an alphabet of more than 26 letters, it is usually best to map the additional letters into codes less than~@'40. @= for i:=1 to @'37 do xchr[i]:=' '; xchr[tab]:=chr(tab); @ This system-independent code makes the |xord| array contain a suitable inverse to the information in |xchr|. 
Note that if |xchr[i]=xchr[j]| where |i<j<@'177|, the value of
|xord[xchr[i]]| will turn out to be |j| or more; hence, standard ASCII
code numbers will be used instead of codes below @'40 in case there is a
coincidence.

@=
for i:=first_text_char to last_text_char do xord[chr(i)]:=invalid_code;
for i:=1 to @'176 do xord[xchr[i]]:=i;

@ Also, various characters are given symbolic names; all the ones this
program uses are collected here.

@d double_quote = """" {delimits strings}
@d number_sign = "#" {marks an |int_literal|}
@d comment = "%" {ignore the rest of a \.{.bst} input line}
@d single_quote = "'" {marks a quoted function}
@d left_paren = "(" {optional database entry left delimiter}
@d right_paren = ")" {corresponding right delimiter}
@d comma = "," {separates various things}
@d minus_sign = "-" {for a negative number}
@d equals_sign = "=" {separates a field name from a field value}
@d at_sign = "@@" {the beginning of a database entry}
@d left_brace = "{" {left delimiter of many things}
@d right_brace = "}" {corresponding right delimiter}
@d period = "." {these are three}
@d question_mark = "?" {string-ending characters}
@d exclamation_mark = "!" {of interest in \.{add.period\$}}
@d tie = "~" {the default space char, in \.{format.name\$}}

@ These arrays give a lexical classification for the |ASCII_code|s;
|lex_class| is used for general scanning and |id_class| is used for
scanning identifiers.

@=
@!lex_class: array [ASCII_code] of lex_type;
@!id_class: array [ASCII_code] of id_type;

@ Every character has one of the following lexical classifications.

@d illegal = 0 {the unrecognized |ASCII_code|s}
@d white_space = 1 {things like |space|s that you can't see}
@d alpha = 2 {the upper- and lower-case letters}
@d numeric = 3 {the ten digits}
@d other_lex = 4 {when none of the above applies}
@d last_lex = 4 {the same number as on the line above}
@#
@d illegal_id_char = 0 {a few forbidden ones}
@d legal_id_char = 1 {most printing characters}

@=
@!lex_type = 0..last_lex;
@!id_type = 0..1;

@ @↑character set dependencies@>
@↑system dependencies@>
Now we initialize the system-dependent |lex_class| array. The |tab|
character may be system dependent.
Note that the order of these assignments is important here.

@=
for i:=0 to @'177 do lex_class[i] := other_lex;
for i:=0 to @'37 do lex_class[i] := illegal;
lex_class[invalid_code] := illegal;
lex_class[tab] := white_space;
lex_class[space] := white_space;
for i:=@'60 to @'71 do lex_class[i] := numeric;
for i:=@'101 to @'132 do lex_class[i] := alpha;
for i:=@'141 to @'172 do lex_class[i] := alpha;

@ @↑character set dependencies@>
@↑system dependencies@>
And now the |id_class| array.

@=
for i:=0 to @'177 do id_class[i] := legal_id_char;
for i:=0 to @'37 do id_class[i] := illegal_id_char;
id_class[space] := illegal_id_char;
id_class[tab] := illegal_id_char;
id_class[double_quote] := illegal_id_char;
id_class[number_sign] := illegal_id_char;
id_class[comment] := illegal_id_char;
id_class[single_quote] := illegal_id_char;
id_class[left_paren] := illegal_id_char;
id_class[right_paren] := illegal_id_char;
id_class[comma] := illegal_id_char;
id_class[equals_sign] := illegal_id_char;
id_class[left_brace] := illegal_id_char;
id_class[right_brace] := illegal_id_char;

@ The array |char_width| gives relative printing widths of each
|ASCII_code|, and |string_width| will be used later to sum up
|char_width|s in a string.

@=
@!char_width : array [ASCII_code] of integer;
@!string_width : integer;

@ @↑character set dependencies@>
@↑system dependencies@>
Now we initialize the system-dependent |char_width| array. |space| is the
only |white_space| character given a nonzero printing width. The widths
here are taken from Stanford's July~'84 $amr10$~font and represent
hundredths of a point (rounded), but since they're used only for relative
comparisons, the units have no meaning.
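To see how |string_width| can be obtained from |char_width|, here is a minimal stand-alone sketch that is not part of \BibTeX. It assumes a \PASCAL\ dialect with a built-in string type and |length| function (Free Pascal, for instance). The three widths are the ones this section assigns to `T', `e', and `X'; the program name, the sample string, and the variable |s| are hypothetical.

```pascal
program WidthSketch;
{ Sums per-character widths over a string.  Only the widths needed
  for the sample string are filled in; all others stay at zero. }
type
  ASCII_code = 0..127;
var
  char_width: array [ASCII_code] of integer;
  string_width: integer;
  i: integer;
  s: string;  { assumes a dialect with a built-in string type }
begin
  for i := 0 to 127 do char_width[i] := 0;
  char_width[ord('T')] := 722;
  char_width[ord('e')] := 444;
  char_width[ord('X')] := 750;
  s := 'TeX';
  string_width := 0;
  for i := 1 to length(s) do
    string_width := string_width + char_width[ord(s[i])];
  writeln(string_width : 0);
end.
```

For the sample string the sum is 722+444+750, so the program prints 1916. As the section says, only comparisons between such sums matter, not the units.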
@= for i:=0 to @'177 do char_width[i] := 0; @# char_width[@'40] := 278; char_width[@'41] := 278; char_width[@'42] := 500; char_width[@'43] := 833; char_width[@'44] := 500; char_width[@'45] := 833; char_width[@'46] := 778; char_width[@'47] := 278; char_width[@'50] := 389; char_width[@'51] := 389; char_width[@'52] := 500; char_width[@'53] := 778; char_width[@'54] := 278; char_width[@'55] := 333; char_width[@'56] := 278; char_width[@'57] := 500; char_width[@'60] := 500; char_width[@'61] := 500; char_width[@'62] := 500; char_width[@'63] := 500; char_width[@'64] := 500; char_width[@'65] := 500; char_width[@'66] := 500; char_width[@'67] := 500; char_width[@'70] := 500; char_width[@'71] := 500; char_width[@'72] := 278; char_width[@'73] := 278; char_width[@'74] := 278; char_width[@'75] := 778; char_width[@'76] := 472; char_width[@'77] := 472; char_width[@'100] := 778; char_width[@'101] := 750; char_width[@'102] := 708; char_width[@'103] := 722; char_width[@'104] := 764; char_width[@'105] := 681; char_width[@'106] := 653; char_width[@'107] := 785; char_width[@'110] := 750; char_width[@'111] := 361; char_width[@'112] := 514; char_width[@'113] := 778; char_width[@'114] := 625; char_width[@'115] := 917; char_width[@'116] := 750; char_width[@'117] := 778; char_width[@'120] := 681; char_width[@'121] := 778; char_width[@'122] := 736; char_width[@'123] := 556; char_width[@'124] := 722; char_width[@'125] := 750; char_width[@'126] := 750; char_width[@'127] :=1028; char_width[@'130] := 750; char_width[@'131] := 750; char_width[@'132] := 611; char_width[@'133] := 278; char_width[@'134] := 500; char_width[@'135] := 278; char_width[@'136] := 500; char_width[@'137] := 278; char_width[@'140] := 278; char_width[@'141] := 500; char_width[@'142] := 556; char_width[@'143] := 444; char_width[@'144] := 556; char_width[@'145] := 444; char_width[@'146] := 306; char_width[@'147] := 500; char_width[@'150] := 556; char_width[@'151] := 278; char_width[@'152] := 306; char_width[@'153] := 528; 
char_width[@'154] := 278; char_width[@'155] := 833; char_width[@'156] := 556; char_width[@'157] := 500; char_width[@'160] := 556; char_width[@'161] := 528; char_width[@'162] := 395; char_width[@'163] := 394; char_width[@'164] := 389; char_width[@'165] := 556; char_width[@'166] := 528; char_width[@'167] := 722; char_width[@'170] := 528; char_width[@'171] := 528; char_width[@'172] := 444; char_width[@'173] := 500; char_width[@'174] :=1000; char_width[@'175] := 500; char_width[@'176] := 500; @* Input and output. The basic operations we need to do are (1)~inputting and outputting of text characters to or from a file; (2)~instructing the operating system to initiate (``open'') or to terminate (``close'') input or output to or from a specified file; and (3)~testing whether the end of an input file has been reached. @= @!alpha_file=packed file of text_char; {files that contain textual data} @ @↑system dependencies@> Most of what we need to do with respect to input and output can be handled by the I/O facilities that are standard in \PASCAL, i.e., the routines called |get|, |put|, |eof|, and so on. But standard \PASCAL\ does not allow file variables to be associated with file names that are determined at run time, so it cannot be used to implement \BibTeX; some sort of extension to \PASCAL's ordinary |reset| and |rewrite| is crucial for our purposes. We shall assume that |name_of_file| is a variable of an appropriate type such that the \PASCAL\ run-time system being used to implement \BibTeX\ can open a file whose external name is specified by |name_of_file|. 
@= @!name_of_file:packed array[1..file_name_size] of char; {on some systems this is a \&{record} variable} @!name_length:0..file_name_size; {this many characters are relevant in |name_of_file| (the rest are blank)} @!name_ptr:0..file_name_size+1; {index variable into |name_of_file|} @ @↑system dependencies@> @:PASCAL H}{\ph@> The \ph\ compiler with which the present version of \TeX\ was prepared has extended the rules of \PASCAL\ in a very convenient way. To open file~|f|, we can write $$\vbox{\halign{#\hfil\qquad&#\hfil\cr |reset(f,@t\\{name}@>,'/O')|&for input;\cr |rewrite(f,@t\\{name}@>,'/O')|&for output.\cr}}$$ The `\\{name}' parameter, which is of type `\ignorespaces|packed array[@t\<\\{any}>@>] of text_char|', stands for the name of the external file that is being opened for input or output. Blank spaces that might appear in \\{name} are ignored. The `\.{/O}' parameter tells the operating system not to issue its own error messages if something goes wrong. If a file of the specified name cannot be found, or if such a file cannot be opened for some other reason (e.g., someone may already be trying to write the same file), we will have |@!erstat(f)<>0| after an unsuccessful |reset| or |rewrite|. This allows \TeX\ to undertake appropriate corrective action. \TeX's file-opening procedures return |false| if no file identified by |name_of_file| could be opened. @d reset_OK(#)==erstat(#)=0 @d rewrite_OK(#)==erstat(#)=0 @= function erstat(var f:file):integer; extern; {in the runtime library} @#@t\2@> function a_open_in(var f:alpha_file):boolean; {open a text file for input} begin reset(f,name_of_file,'/O'); a_open_in:=reset_OK(f); end; @# function a_open_out(var f:alpha_file):boolean; {open a text file for output} begin rewrite(f,name_of_file,'/O'); a_open_out:=rewrite_OK(f); end; @ @↑system dependencies@> Files can be closed with the \ph\ routine `|close(f)|', which should be used when all input or output with respect to |f| has been completed. 
This makes |f| available to be opened again, if desired; and if |f| was used for output, the |close| operation makes the corresponding external file appear on the user's area, ready to be read. @= procedure a_close(var f:alpha_file); {close a text file} begin close(f); end; @ Text output is easy to do with the ordinary \PASCAL\ |put| procedure, so we don't have to make any other special arrangements. The treatment of text input is more difficult, however, because of the necessary translation to |ASCII_code| values, and because \TeX's conventions should be efficient and they should blend nicely with the user's operating environment. @ Input from text files is read one line at a time, using a routine called |input_ln|. This function is defined in terms of global variables called |buffer| and |last|. The |buffer| array contains |ASCII_code| values, and |last| is an index into this array marking the end of a line of text. (Occasionally, |buffer| is used for something else, in which case it is copied to a temporary array.) @= @!buffer:buf_type; {usually, lines of characters being read} @!last:buf_pointer; {end of the line just input to |buffer|} @ The type |buf_type| is used for |buffer|, for saved copies of it, or for scratch work. It's not |packed| because otherwise the program would run more than 25 percent slower (on a TOPS-20). @= @!buf_pointer = 0..buf_size; {an index into a |buf_type|} @!buf_type = array[buf_pointer] of ASCII_code; {for various buffers} @ @↑kludge@> And while we're at it, we declare another buffer for general use. Because buffers are not packed and can get large, we use |sv_buffer| for lots of different reasons; this is a bit kludgy, but it helps make the stack space not overflow on some machines. 
It's used when reading an \.{.aux} file \.{\\citation} command, when scanning a \.{.bst} implicit function (in the \.{function} command), when reading the entire database file (in the \.{read} command), when doing name-handling in the |built_in| functions \.{format.names\$} and \.{num.names\$} (through the alias |name_buf|), and when executing the |built_in| function \.{swap\$}. @= @!sv_buffer : buf_type; @!sv_ptr1 : buf_pointer; @!sv_ptr2 : buf_pointer; @!tmp_ptr : buf_pointer; {used only as a copy pointer} @ @.BibTeX capacity exceeded@> When something in the program wants to be bigger or something out there wants to be smaller, it's time to call it a run. @d overflow(#)==abort('Sorry, you''ve exceeded BibTeX''s ',#:0) @ @:BibTeX capacity exceeded}{\quad buffer size@> The |input_ln| function brings the next line of input from the specified file into available positions of the buffer array and returns the value |true|, unless the file has already been entirely read, in which case it returns |false| and sets |last:=0|. In general, the |ASCII_code| numbers that represent the next line of the file are input into |buffer[0]|, |buffer[1]|, \dots, |buffer[last-1]|; and the global variable |last| is set equal to the length of the line. Trailing |white_space| characters are removed from the line (|white_space| characters are explained later---most likely they're blanks); thus, either |last=0| (in which case the line was entirely blank) or |lex_class[buffer[last-1]]<>white_space|. An overflow error is given if the normal actions of |input_ln| would make |last>buf_size|. Standard \PASCAL\ says that a file should have |eoln| immediately before |eof|, but \BibTeX\ needs only a weaker restriction: If |eof| occurs in the middle of a line, the system function |eoln| should return a |true| result (even though |f↑| will be undefined). 
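The contract of |input_ln| stated above can be illustrated outside of \PASCAL. What follows is only a sketch in Python, not part of \BibTeX: the names |BUF_SIZE|, |WHITE_SPACE|, and |CapacityExceeded| are assumptions made for the illustration, the buffer is passed as a parameter instead of being global, and the value of |last| is returned alongside the boolean result.

```python
import io

BUF_SIZE = 20                        # assumed small capacity, for demonstration only
WHITE_SPACE = {ord(" "), ord("\t")}  # assumed stand-in for lex_class = white_space


class CapacityExceeded(Exception):
    """Stands in for the overflow/abort path (hypothetical name)."""


def input_ln(f, buffer):
    """Read the next line of f into buffer and return (ok, last).

    ok is False, and last is 0, once the file has been entirely read.
    Trailing white space is not counted in last.
    """
    line = f.readline()
    if line == "":                   # end of file
        return False, 0
    line = line.rstrip("\n")
    last = 0
    for ch in line:
        if last >= BUF_SIZE:         # checked before storing, as in the source
            raise CapacityExceeded("buffer size %d" % BUF_SIZE)
        buffer[last] = ord(ch)
        last += 1
    while last > 0 and buffer[last - 1] in WHITE_SPACE:
        last -= 1                    # remove trailing white space
    return True, last
```

Note that the capacity check happens before trailing white space is removed, so a line that is too long only because of trailing blanks still counts as an overflow in this sketch.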
@= function input_ln(var f:alpha_file) : boolean; {inputs the next line or returns |false|} label loop_exit; begin last:=0; if (eof(f)) then input_ln:=false else begin while (not eoln(f)) do begin if (last >= buf_size) then overflow('buffer size ',buf_size); buffer[last]:=xord[f↑]; get(f); incr(last); end; get(f); while (last > 0) do {remove trailing |white_space|} if (lex_class[buffer[last-1]] = white_space) then decr(last) else goto loop_exit; loop_exit: input_ln:=true; end; end; @* String handling. \BibTeX\ uses variable-length strings of seven-bit characters. Since \PASCAL\ does not have a well-developed string mechanism, \BibTeX\ does all its string processing by homegrown (predominantly \TeX's) methods. Unlike \TeX, however, \BibTeX\ does not use a |pool_file| for string storage; it creates its few pre-defined strings at run-time. The necessary operations are handled with a simple data structure. The array |str_pool| contains all the (seven-bit) ASCII codes in all the strings \BibTeX\ must ever search for (generally identifier names), and the array |str_start| contains indices of the starting points of each such string. Strings are referred to by integer numbers, so that string number |s| comprises the characters |str_pool[j]| for |str_start[s]<=j<str_start[s+1]|. @= @!str_pool : packed array[pool_pointer] of ASCII_code; {the characters} @!str_start : packed array[str_number] of pool_pointer; {the starting pointers} @!pool_ptr : pool_pointer; {first unused position in |str_pool|} @!str_ptr : str_number; {start of the current string being created} @!str_num : str_number; {general index variable into |str_start|} @!p_ptr1,@!p_ptr2 : pool_pointer; {several procedures use these locally} @ Where |pool_pointer| and |str_number| are pointers into |str_pool| and |str_start|.
@= @!pool_pointer = 0..pool_size; {for variables that point into |str_pool|} @!str_number = 0..max_strings; {for variables that point into |str_start|} @ @↑kludge@> @↑system dependencies@> @:this can't happen}{\quad illegal string number@> This procedure sends a string in |str_pool| to an output file. Note: The |term_out| file is system dependent. @d max_pop = 3 {---see the |built_in| functions section} @# @d print_pool_str(#) == begin out_pool_str(term_out,#); out_pool_str(log_file,#); end @# @d trace_pr_pool_str(#) == begin out_pool_str(log_file,#); end @= procedure out_pool_str (var f:alpha_file; @!s:str_number); var i:pool_pointer; begin {allowing |str_ptr <= s < str_ptr+max_pop| is a \.{.bst}-stack kludge} if ((s<0) or (s>=str_ptr+max_pop) or (s>=max_strings)) then print_ln ('this can''t happen---illegal string number:',s:0); for i := str_start[s] to str_start[s+1]-1 do write(f,xchr[str_pool[i]]); end; @ @.WEB@> Several of the elementary string operations are performed using \.{WEB} macros instead of using \PASCAL\ procedures, because many of the operations are done quite frequently and we want to avoid the overhead of procedure calls. For example, here is a simple macro that computes the length of a string. @d length(#) == (str_start[#+1]-str_start[#]) {the number of characters in string number \#} @ @:BibTeX capacity exceeded}{\quad pool size@> Strings are created by appending character codes to |str_pool|. The macro called |append_char|, defined here, does not check to see if the value of |pool_ptr| has gotten too high; this test is supposed to be made before |append_char| is used. To test if there is room to append |l| more characters to |str_pool|, we shall write |str_room(l)|, which aborts \BibTeX\ and gives an error message if there isn't enough room. 
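To make the division of labor among |str_room|, |append_char|, and |make_string| (which is defined shortly) concrete, here is a minimal sketch in Python; it is an illustration only and not part of \BibTeX. The tiny limits |POOL_SIZE| and |MAX_STRINGS|, the class names, and the helper |text| are assumptions; unlike the \PASCAL\ procedure, this |make_string| also returns the number of the string it has just completed, purely for convenience.

```python
POOL_SIZE = 16      # assumed tiny limits so that the overflow path is easy to reach
MAX_STRINGS = 4


class CapacityExceeded(Exception):
    """Stands in for the overflow/abort path (hypothetical name)."""


class StringPool:
    def __init__(self):
        self.str_pool = [0] * (POOL_SIZE + 1)
        self.str_start = [0] * (MAX_STRINGS + 1)
        self.pool_ptr = 0         # first unused position in str_pool
        self.str_ptr = 1          # string number 0 stays unused
        self.str_start[self.str_ptr] = self.pool_ptr

    def length(self, s):
        """Number of characters in string number s."""
        return self.str_start[s + 1] - self.str_start[s]

    def str_room(self, n):
        """Fail unless n more characters fit; call this before appending."""
        if self.pool_ptr + n > POOL_SIZE:
            raise CapacityExceeded("pool size %d" % POOL_SIZE)

    def append_char(self, code):
        """Append one character code; deliberately performs no capacity check."""
        self.str_pool[self.pool_ptr] = code
        self.pool_ptr += 1

    def make_string(self):
        """Turn the characters appended so far into an official string."""
        if self.str_ptr == MAX_STRINGS:
            raise CapacityExceeded("number of strings %d" % MAX_STRINGS)
        self.str_ptr += 1
        self.str_start[self.str_ptr] = self.pool_ptr
        return self.str_ptr - 1

    def text(self, s):
        """Hypothetical helper: the characters of string s as a Python str."""
        a, b = self.str_start[s], self.str_start[s + 1]
        return bytes(self.str_pool[a:b]).decode("ascii")
```

The point of the split is that one |str_room| test can guard a whole run of |append_char| calls, so the inner loop carries no per-character check.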
@d append_char(#) == {put |ASCII_code| \# at the end of |str_pool|} begin str_pool[pool_ptr]:=#; incr(pool_ptr); end @d str_room(#) == {make sure that the pool hasn't overflowed} begin if (pool_ptr+# > pool_size) then overflow('pool size ',pool_size); end @ @:BibTeX capacity exceeded}{\quad number of strings@> Once a sequence of characters has been appended to |str_pool|, it officially becomes a string when the procedure |make_string| is called. @= procedure make_string; {current string enters the pool} begin if (str_ptr=max_strings) then overflow('number of strings ',max_strings); incr(str_ptr); str_start[str_ptr]:=pool_ptr; end; @ The macro |flush_string| destroys the string at the end of the pool. @d flush_string == begin decr(str_ptr); pool_ptr := str_start[str_ptr]; end @ This subroutine compares string |s| with another string that appears in the buffer |buf| between positions |bf_ptr| and |bf_ptr+len-1|; the result is |true| if and only if the strings are equal. @= function str_eq_buf (@!s:str_number; var buf:buf_type; @!bf_ptr,@!len:buf_pointer) : boolean; {test equality of strings} label exit; var i : buf_pointer; {running} @!j : pool_pointer; {indices} begin if (length(s) <> len) then {strings of unequal length} begin str_eq_buf := false; return; end; i := bf_ptr; j := str_start[s]; while (j < str_start[s+1]) do begin if (str_pool[j] <> buf[i]) then begin str_eq_buf := false; return; end; incr(i); incr(j); end; str_eq_buf := true; exit: end; @ This subroutine compares two |str_pool| strings and returns |true| if and only if the strings are equal.
@= function str_eq_str (@!s1,@!s2:str_number) : boolean; label exit; begin if (length(s1) <> length(s2)) then begin str_eq_str := false; return; end; p_ptr1 := str_start[s1]; p_ptr2 := str_start[s2]; while (p_ptr1 < str_start[s1+1]) do begin if (str_pool[p_ptr1] <> str_pool[p_ptr2]) then begin str_eq_str := false; return; end; incr(p_ptr1); incr(p_ptr2); end; str_eq_str:=true; exit: end; @ @:BibTeX capacity exceeded}{\quad file name size@> This procedure copies file name |file_name| into the beginning of |name_of_file|, if it will fit. It also sets the global variable |name_length| to the appropriate value. @= procedure start_name (@!file_name:str_number); var p_ptr: pool_pointer; {running index} begin if (length(file_name) > file_name_size) then begin print ('file '); print_pool_str (file_name); print (','); overflow('file name size ',file_name_size); end; name_ptr := 1; p_ptr := str_start[file_name]; while (p_ptr < str_start[file_name+1]) do begin name_of_file[name_ptr] := chr (str_pool[p_ptr]); incr(name_ptr); incr(p_ptr); end; name_length := length(file_name); end; @ @:BibTeX capacity exceeded}{\quad file name size@> This procedure copies file extension |ext| into the array |name_of_file| starting at position |name_length+1|. It also sets the global variable |name_length| to the appropriate value. 
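The way |start_name| and |add_extension| cooperate on the fixed-size, blank-padded |name_of_file| can be sketched as follows. This is an illustration in Python under stated assumptions, not the program's own code: |FILE_NAME_SIZE| is an arbitrary small limit, the class |FileName| and the exception are hypothetical, file names are ordinary Python strings instead of string numbers, and position |i| of |name_of_file| is stored at list index |i-1|.

```python
FILE_NAME_SIZE = 12   # assumed small limit, for demonstration only


class CapacityExceeded(Exception):
    """Stands in for the overflow/abort path (hypothetical name)."""


class FileName:
    def __init__(self):
        self.chars = [" "] * FILE_NAME_SIZE   # chars[i-1] plays name_of_file[i]
        self.name_length = 0

    def start_name(self, file_name):
        """Copy file_name into the beginning of the array, if it fits."""
        if len(file_name) > FILE_NAME_SIZE:
            raise CapacityExceeded("file name size %d" % FILE_NAME_SIZE)
        for i, ch in enumerate(file_name):
            self.chars[i] = ch
        self.name_length = len(file_name)

    def add_extension(self, ext):
        """Append ext after the current name and pad the rest with blanks."""
        if self.name_length + len(ext) > FILE_NAME_SIZE:
            raise CapacityExceeded("file name size %d" % FILE_NAME_SIZE)
        for i, ch in enumerate(ext):
            self.chars[self.name_length + i] = ch
        self.name_length += len(ext)
        for i in range(self.name_length, FILE_NAME_SIZE):
            self.chars[i] = " "               # pad with blanks

    def __str__(self):
        return "".join(self.chars)
```

As in the procedures themselves, only |add_extension| pads with blanks; |start_name| leaves whatever followed the name in place, which is harmless as long as an extension is always added afterward.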
@= procedure add_extension(@!ext:str_number); var p_ptr: pool_pointer; {running index} begin if (name_length + length(ext) > file_name_size) then begin print ('file=',name_of_file,', with extension '); print_pool_str (ext); print (','); overflow('file name size ',file_name_size); end; name_ptr := name_length + 1; p_ptr := str_start[ext]; while (p_ptr < str_start[ext+1]) do begin name_of_file[name_ptr] := chr (str_pool[p_ptr]); incr(name_ptr); incr(p_ptr); end; name_length := name_length + length(ext); name_ptr := name_length+1; while (name_ptr <= file_name_size) do {pad with blanks} begin name_of_file[name_ptr] := ' '; incr(name_ptr); end; end; @ @:BibTeX capacity exceeded}{\quad file name size@> This procedure copies the default logical area name |area| into the array |name_of_file| starting at position 1, after shifting up the rest of the filename. It also sets the global variable |name_length| to the appropriate value. @= procedure add_area(@!area:str_number); var p_ptr: pool_pointer; {running index} begin if (name_length + length(area) > file_name_size) then begin print ('file='); print_pool_str (area); print (name_of_file,','); overflow('file name size ',file_name_size); end; name_ptr := name_length; while (name_ptr > 0) do {shift up name} begin name_of_file[name_ptr+length(area)] := name_of_file[name_ptr]; decr(name_ptr); end; name_ptr := 1; p_ptr := str_start[area]; while (p_ptr < str_start[area+1]) do begin name_of_file[name_ptr] := chr (str_pool[p_ptr]); incr(name_ptr); incr(p_ptr); end; name_length := name_length + length(area); end; @ This system-independent procedure converts upper-case characters to lower case for the specified part of |buf|. It is system independent because it uses only the internal representation for characters. 
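The claim of system independence rests on doing arithmetic on internal (ASCII) codes only. A minimal sketch in Python, given purely as an illustration, shows the idea; the buffer is assumed to be a list of integer codes, and |CASE_DIFFERENCE| mirrors the macro |case_difference|.

```python
CASE_DIFFERENCE = ord("a") - ord("A")   # 32 in ASCII


def lower_case(buf, bf_ptr, length):
    """Convert upper-case codes in buf[bf_ptr : bf_ptr+length] in place."""
    for i in range(bf_ptr, bf_ptr + length):
        if ord("A") <= buf[i] <= ord("Z"):
            buf[i] += CASE_DIFFERENCE


def upper_case(buf, bf_ptr, length):
    """Convert lower-case codes in buf[bf_ptr : bf_ptr+length] in place."""
    for i in range(bf_ptr, bf_ptr + length):
        if ord("a") <= buf[i] <= ord("z"):
            buf[i] -= CASE_DIFFERENCE
```

Codes outside the two letter ranges, such as digits and punctuation, are left untouched.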
@d case_difference = "a" - "A" @= procedure lower_case (var buf:buf_type; @!bf_ptr,@!len:buf_pointer); var i:buf_pointer; begin if (len > 0) then for i := bf_ptr to bf_ptr+len-1 do if ((buf[i]>="A") and (buf[i]<="Z")) then buf[i] := buf[i] + case_difference; end; @ This system-independent procedure is the same as the previous except that it converts lower- to upper-case letters. @= procedure upper_case (var buf:buf_type; @!bf_ptr,@!len:buf_pointer); var i:buf_pointer; begin if (len > 0) then for i := bf_ptr to bf_ptr+len-1 do if ((buf[i]>="a") and (buf[i]<="z")) then buf[i] := buf[i] - case_difference; end; @* The hash table. All static strings that \BibTeX\ might have to search for, generally identifiers, are stored and retrieved by means of a fairly standard hash-table algorithm (but slightly altered here) called the method of ``coalescing lists'' (cf.\ Algorithm 6.4C in {\sl The Art of Computer Programming\/}). Once a string enters the table, it is never removed. The actual sequence of characters forming a string is stored in the |str_pool| array. The hash table consists of the four arrays |hash_next|, |hash_text|, |hash_ilk|, and |ilk_info|. The first array, |hash_next[p]|, points to the next identifier belonging to the same coalesced list as the identifier corresponding to~|p|. The second, |hash_text[p]|, points to the |str_start| entry for |p|'s string. If position~|p| of the hash table is empty, we have |hash_text[p]=0|; if position |p| is either empty or the end of a coalesced hash list, we have |hash_next[p]=empty|; an auxiliary pointer variable called |hash_used| is maintained in such a way that all locations |p>=hash_used| are nonempty. The third, |hash_ilk[p]|, tells how this string is used (as ordinary text, as a variable name, as an \.{.aux} file command, etc). 
The fourth, |ilk_info[p]|, contains information specific to the corresponding |hash_ilk|---for |integer_ilk|s: the integer's value; for |cite_ilk|s: a pointer into |cite_list|; for |lc_cite_ilk|s: a pointer into |cite_list|; for |command_ilk|s: a constant to be used in a |case| statement; for |bst_fn_ilk|: function-specific information; for |macro_ilk|s: a pointer to its definition string; for all other |ilk|s it contains no information. This |ilk|-specific information is set in other parts of the program rather than here in the hashing routine. @d hash_base = empty + 1 {lowest numbered hash-table location} @d hash_max = hash_base + hash_size - 1 {highest numbered hash-table location} @d hash_is_full == (hash_used=hash_base) {test if all positions are occupied} @# @d text_ilk = 0 {a string of ordinary text} @d integer_ilk = 1 {an integer (possibly with a |minus_sign|)} @d aux_command_ilk = 2 {an \.{.aux} file command} @d aux_file_ilk = 3 {an \.{.aux} file name} @d bst_command_ilk = 4 {a \.{.bst} file command} @d bst_file_ilk = 5 {a \.{.bst} file name} @d bib_file_ilk = 6 {a \.{.bib} file name} @d file_ext_ilk = 7 {one of \.{.aux}, \.{.bst}, \.{.bib}, \.{.bbl}, or \.{.blg}} @d file_area_ilk = 8 {one of \.{texinputs:} or \.{texbib:}} @d cite_ilk = 9 {a \.{\\citation} argument} @d lc_cite_ilk = 10 {a \.{\\citation} argument converted to lower case} @d bst_fn_ilk = 11 {a \.{.bst} function name} @d bib_command_ilk = 12 {a \.{.bib} function name} @d macro_ilk = 13 {a \.{.bst} macro or a \.{.bib} string} @d last_ilk = 13 {the same number as on the line above} @= @!hash_loc=hash_base..hash_max; {a location within the hash table} @!hash_pointer=empty..hash_max; {either |empty| or a |hash_loc|} @# @!str_ilk=0..last_ilk; {the legal string types} @ @= @!hash_next : packed array[hash_loc] of hash_pointer; {coalesced-list link} @!hash_text : packed array[hash_loc] of str_number; {pointer to a string} @!hash_ilk : packed array[hash_loc] of str_ilk; {the type of string} @!ilk_info 
: packed array[hash_loc] of integer; {|ilk|-specific info} @!hash_used : hash_base..hash_max+1; {allocation pointer for hash table} @!hash_found : boolean; {set to |true| if it's already in the hash table} @!dummy_loc : hash_loc; {receives |str_lookup| value whenever it's useless} @ @= @!k:hash_loc; @ Now it's time to initialize the hash table; note that |str_start[0]| must be unused if |hash_text[k] := 0| is to have the desired effect. @= for k:=hash_base to hash_max do begin hash_next[k] := empty; hash_text[k] := 0; {thus, no need to initialize |hash_ilk| or |ilk_info|} end; hash_used := hash_max + 1; {nothing in table initially} @ Here is the subroutine that searches the hash table for a (string,~|str_ilk|) pair, where the string is of length |l>=0| and appears in |buffer[j..(j+l-1)]|. If it finds the pair, it returns the corresponding hash-table location and sets the global variable |hash_found| to |true|. Otherwise it sets |hash_found| to |false|, and if the parameter |insert_it| is |true|, it inserts the pair into the hash table, inserts the string into |str_pool| if not previously encountered, and returns its location. Note that two different pairs can have the same string but different |str_ilk|s, in which case the second pair encountered, if |insert_it| were |true|, would be inserted into the hash table though its string wouldn't be inserted into |str_pool| because it would already be there. 
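The behavior just described can be tried out in a small model. The following Python sketch is an illustration only: the table sizes are tiny assumed values, the list |strings| stands in for |str_pool| and |str_start|, strings are passed as Python strings instead of buffer positions, and the control flow is inferred from the prose above and from the code fragments in this part of the program, so it should not be read as a transcription of |str_lookup|.

```python
HASH_SIZE = 8          # assumed tiny table
HASH_PRIME = 7         # prime, roughly 85% of HASH_SIZE
EMPTY = 0
HASH_BASE = EMPTY + 1
HASH_MAX = HASH_BASE + HASH_SIZE - 1


class CapacityExceeded(Exception):
    """Stands in for the overflow/abort path (hypothetical name)."""


class HashTable:
    def __init__(self):
        n = HASH_MAX + 1
        self.hash_next = [EMPTY] * n
        self.hash_text = [0] * n       # 0 marks an empty location
        self.hash_ilk = [0] * n
        self.hash_used = HASH_MAX + 1  # nothing in the table initially
        self.strings = [None]          # entry 0 unused, like str_start[0]
        self.hash_found = False

    @staticmethod
    def hash_code(text):
        h = 0                          # also works for zero-length strings
        for ch in text:
            h = (h + h + ord(ch)) % HASH_PRIME
        return h

    def str_lookup(self, text, ilk, insert_it):
        p = self.hash_code(text) + HASH_BASE
        self.hash_found = False
        old_string = 0                 # number of an already stored equal string
        while True:
            if self.hash_text[p] > 0 and self.strings[self.hash_text[p]] == text:
                if self.hash_ilk[p] == ilk:
                    self.hash_found = True
                    return p
                old_string = self.hash_text[p]      # right string, wrong ilk
            if self.hash_next[p] == EMPTY:          # end of the coalesced list
                if not insert_it:
                    return p                        # value is meaningless here
                if self.hash_text[p] > 0:           # p is occupied: find a free slot
                    while True:
                        if self.hash_used == HASH_BASE:
                            raise CapacityExceeded("hash size %d" % HASH_SIZE)
                        self.hash_used -= 1
                        if self.hash_text[self.hash_used] == 0:
                            break
                    self.hash_next[p] = self.hash_used
                    p = self.hash_used
                if old_string:
                    self.hash_text[p] = old_string  # share the stored string
                else:
                    self.strings.append(text)
                    self.hash_text[p] = len(self.strings) - 1
                self.hash_ilk[p] = ilk
                return p
            p = self.hash_next[p]
```

Looking up the same string under two different ilks produces two table entries that share one stored string, which is the situation described at the end of the paragraph above.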
@d max_hash_value = hash_prime+hash_prime-2+127 {|h|'s maximum value} @d do_insert == true {insert string if not found in hash table} @d dont_insert == false {don't insert string} @# @d str_found = 40 {go here when you've found the string} @d str_not_found = 45 {go here when you haven't} @= function str_lookup(var buf:buf_type; @!j,@!l:buf_pointer; @!ilk:str_ilk; @!insert_it:boolean) : hash_loc; {search the hash table} label str_found,@!str_not_found; var h:0..max_hash_value; {hash code} @!p:hash_loc; {index into |hash_| arrays} @!k:buf_pointer; {index into |buf| array} @!old_string:boolean; {set to |true| if it's an already encountered string} @!str_num:str_number; {pointer to an already encountered string} begin @; p:=h+hash_base; {start searching here; note that |0<=h<hash_prime|} hash_found := false; old_string := false; loop begin @; if (hash_next[p]=empty) then {location |p| may or may not be empty} begin if (not insert_it) then goto str_not_found; @; goto str_found; end; p:=hash_next[p]; {old and new locations |p| are not empty} end; str_not_found: do_nothing; {don't insert pair; function value meaningless} str_found: str_lookup:=p; end; @ @↑for loops@> @.WEB@> The value of |hash_prime| should be roughly 85\% of |hash_size|, and it should be a prime number (it should also be less than $2↑{14} - 2↑{6} = 16320$ because of \.{WEB}'s simple-macro bound). The theory of hashing tells us to expect fewer than two table probes, on the average, when the search is successful. @= h := 0; {note that this works for zero-length strings} k := j; while (k < j+l) do {not a |for| loop in case |j = l = 0|} begin h:=h+h+buf[k]; while (h >= hash_prime) do h:=h-hash_prime; incr(k); end; @ Here we handle the case in which we've already encountered this string; note that even if we have, we'll still have to insert the pair into the hash table if |str_ilk| doesn't match.
@= if (hash_text[p]>0) then {there's something here} if (str_eq_buf(hash_text[p],buf,j,l)) then {it's the right string} if (hash_ilk[p] = ilk) then {it's the right |str_ilk|} begin hash_found := true; goto str_found; end else begin {it's the wrong |str_ilk|} old_string := true; str_num := hash_text[p]; end; @ @↑for loops@> @:BibTeX capacity exceeded}{\quad hash size@> This code inserts the pair in the appropriate unused location. @= begin if (hash_text[p]>0) then {location |p| isn't empty} begin repeat if (hash_is_full) then overflow('hash size ',hash_size); decr(hash_used); until (hash_text[hash_used]=0); {search for an empty location} hash_next[p]:=hash_used; p:=hash_used; end; {now location |p| is empty} if (old_string) then {it's an already encountered string} hash_text[p] := str_num else begin {it's a new string} str_room(l); {make sure it'll fit in |str_pool|} k := j; while (k < j+l) do {not a |for| loop in case |j = l = 0|} begin append_char(buf[k]); incr(k); end; make_string; {and make it official} hash_text[p] := str_ptr - 1; end; hash_ilk[p] := ilk; end @ @↑string pool@> Now that we've defined the hash-table workings we can initialize the string pool. Unlike \TeX, \BibTeX\ does not use a |pool_file| for string storage; instead it inserts its pre-defined strings into |str_pool|---this makes one file fewer for the \BibTeX\ implementor to deal with. This section initializes |str_pool|; the pre-defined strings will be inserted into it shortly; and other strings are inserted while processing the input files. @= pool_ptr:=0; str_ptr:=1; {hash table must have |str_start[0]| unused} str_start[str_ptr]:=pool_ptr; @ The longest pre-defined string determines type definitions used to insert the pre-defined strings into |str_pool|. 
@d longest_pds=12 {the length of `\.{change.case\$}'} @= @!pds_loc = 1..longest_pds; @!pds_len = 0..longest_pds; @!pds_type = packed array [pds_loc] of char; @ The variables in this program beginning with |s_| specify the locations in |str_pool| for certain often-used strings. Those here have to do with the file system; the next section will actually insert them into |str_pool|. @= @!s_aux_extension : str_number; {\.{.aux}} @!s_log_extension : str_number; {\.{.blg}} @!s_bbl_extension : str_number; {\.{.bbl}} @!s_bst_extension : str_number; {\.{.bst}} @!s_bib_extension : str_number; {\.{.bib}} @!s_bst_area : str_number; {\.{texinputs:}} @!s_bib_area : str_number; {\.{texbib:}} @ @↑important note@> @↑system dependencies@> It's time to insert some of the pre-defined strings into |str_pool| (and thus the hash table). These system-dependent strings should contain no upper-case letters. The |pre_define| routine appears shortly. Important notes: These pre-definitions must not have any glitches or the program may bomb because the |log_file| hasn't been opened yet, and |text_ilk|s should be pre-defined later, for \.{.bst}-function-execution purposes. @= {these pre-defined strings must all be exactly |longest_pds| long} pre_define('.aux        ',4,file_ext_ilk); s_aux_extension := hash_text[pre_def_loc]; pre_define('.bbl        ',4,file_ext_ilk); s_bbl_extension := hash_text[pre_def_loc]; pre_define('.blg        ',4,file_ext_ilk); s_log_extension := hash_text[pre_def_loc]; pre_define('.bst        ',4,file_ext_ilk); s_bst_extension := hash_text[pre_def_loc]; pre_define('.bib        ',4,file_ext_ilk); s_bib_extension := hash_text[pre_def_loc]; pre_define('texinputs:  ',10,file_area_ilk); s_bst_area := hash_text[pre_def_loc]; pre_define('texbib:     ',7,file_area_ilk); s_bib_area := hash_text[pre_def_loc]; @ This global variable gives the hash-table location of pre-defined strings generated by calls to |str_lookup|.
@= @!pre_def_loc : hash_loc; @ This procedure initializes a pre-defined string of length at most |longest_pds|. @= procedure pre_define (@!pds:pds_type; @!len:pds_len; @!ilk:str_ilk); var i : pds_len; begin for i:=1 to len do buffer[i] := xord[pds[i]]; pre_def_loc := str_lookup(buffer,1,len,ilk,do_insert); end; @ These constants all begin with |n_| and are used for the |case| statement that determines which command to execute. The variable |command_num| is set to one of these and is used to do the branching, but it must have the full |integer| range because at times it can assume an arbitrary |ilk_info| value (though it will be one of the values here when we actually use it). @d n_aux_bibdata = 0 {\.{\\bibdata}} @d n_aux_bibstyle = 1 {\.{\\bibstyle}} @d n_aux_citation = 2 {\.{\\citation}} @d n_aux_input = 3 {\.{\\@@input}} @# @d n_bst_entry = 0 {\.{entry}} @d n_bst_execute = 1 {\.{execute}} @d n_bst_function = 2 {\.{function}} @d n_bst_integers = 3 {\.{integers}} @d n_bst_iterate = 4 {\.{iterate}} @d n_bst_macro = 5 {\.{macro}} @d n_bst_read = 6 {\.{read}} @d n_bst_reverse = 7 {\.{reverse}} @d n_bst_sort = 8 {\.{sort}} @d n_bst_strings = 9 {\.{strings}} @# @d n_bib_comment = 0 {\.{comment}} @d n_bib_string = 1 {\.{string}} @= @!command_num : integer; @ Now we pre-define the command strings; they must all be exactly |longest_pds| long. 
@= pre_define('\citation   ',9,aux_command_ilk); ilk_info[pre_def_loc] := n_aux_citation; pre_define('\bibdata    ',8,aux_command_ilk); ilk_info[pre_def_loc] := n_aux_bibdata; pre_define('\bibstyle   ',9,aux_command_ilk); ilk_info[pre_def_loc] := n_aux_bibstyle; pre_define('\@@input     ',7,aux_command_ilk); ilk_info[pre_def_loc] := n_aux_input; @# pre_define('entry       ',5,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_entry; pre_define('execute     ',7,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_execute; pre_define('function    ',8,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_function; pre_define('integers    ',8,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_integers; pre_define('iterate     ',7,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_iterate; pre_define('macro       ',5,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_macro; pre_define('read        ',4,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_read; pre_define('reverse     ',7,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_reverse; pre_define('sort        ',4,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_sort; pre_define('strings     ',7,bst_command_ilk); ilk_info[pre_def_loc] := n_bst_strings; @# pre_define('comment     ',7,bib_command_ilk); ilk_info[pre_def_loc] := n_bib_comment; pre_define('string      ',6,bib_command_ilk); ilk_info[pre_def_loc] := n_bib_string; @* Scanning an input line. This section describes the various |buffer| scanning routines. The two global variables |buf_ptr1| and |buf_ptr2| are used in scanning an input line. Between scans, |buf_ptr1| points to the first character of the current token and |buf_ptr2| points to that of the next. The global variable |last|, set by the function |input_ln|, marks the end of the current line; it equals 0 at the end of the current file. All the procedures and functions in this section will indicate an end-of-line when it's the end of the file.
@d token_len == (buf_ptr2 - buf_ptr1) {of the current token} @d scan_char == buffer[buf_ptr2] {the current character} @= @!buf_ptr1:buf_pointer; {points to the first position of the current token} @!buf_ptr2:buf_pointer; {used to find the end of the current token} @ @↑system dependencies@> This procedure sends the current token, in |buffer[buf_ptr1]| to |buffer[buf_ptr2-1]|, to an output file. Note: The |term_out| file is system dependent. @d print_token == begin out_token(term_out); out_token(log_file); end @# @d trace_pr_token == begin out_token(log_file); end @= procedure out_token (var f:alpha_file); var i:buf_pointer; begin i := buf_ptr1; while (i < buf_ptr2) do begin write(f,xchr[buffer[i]]); incr(i); end; end; @ This function scans the |buffer| for the next token, starting at the global variable |buf_ptr2| and ending just before either the single specified stop-character or the end of the current line, whichever comes first, respectively returning |true| or |false|; afterward, |scan_char| is the first character following this token. @= function scan1 (@!char1:ASCII_code) : boolean; begin buf_ptr1 := buf_ptr2; {scan until end-of-line or the specified character} while ((scan_char <> char1) and (buf_ptr2 < last)) do incr(buf_ptr2); if (buf_ptr2 < last) then scan1 := true else scan1 := false; end; @ This function is the same but stops at |white_space| characters as well. @= function scan1_white (@!char1:ASCII_code) : boolean; begin buf_ptr1 := buf_ptr2; {scan until end-of-line, the specified character, or |white_space|} while ((lex_class[scan_char] <> white_space) and (scan_char <> char1) and (buf_ptr2 < last)) do incr(buf_ptr2); if (buf_ptr2 < last) then scan1_white := true else scan1_white := false; end; @ This function is similar to |scan1|, but stops at either of two stop-characters as well as the end of the current line. 
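The two-pointer convention shared by |scan1|, |scan1_white|, and |scan2| can be seen in a few lines of Python. This is a sketch for illustration, not the program's code: the class |Scanner|, its generic method |scan|, and the helper |token| are assumptions, and Python's |isspace| stands in for the test |lex_class[scan_char]=white_space|.

```python
class Scanner:
    def __init__(self, line):
        self.buffer = [ord(c) for c in line]
        self.last = len(self.buffer)   # end of the current line
        self.buf_ptr1 = 0              # start of the current token
        self.buf_ptr2 = 0              # the character being scanned

    def token(self):
        """Hypothetical helper: the current token as a Python str."""
        return "".join(chr(c) for c in self.buffer[self.buf_ptr1:self.buf_ptr2])

    def scan(self, stops, stop_at_white=False):
        """Advance buf_ptr2 to a stop character or to the end of the line.

        Returns True if a stop character was reached, False at end-of-line.
        """
        self.buf_ptr1 = self.buf_ptr2
        while self.buf_ptr2 < self.last:
            c = self.buffer[self.buf_ptr2]
            if c in stops or (stop_at_white and chr(c).isspace()):
                break
            self.buf_ptr2 += 1
        return self.buf_ptr2 < self.last

    def scan1(self, char1):
        return self.scan({ord(char1)})

    def scan1_white(self, char1):
        return self.scan({ord(char1)}, stop_at_white=True)

    def scan2(self, char1, char2):
        return self.scan({ord(char1), ord(char2)})
```

After each call the token lies between |buf_ptr1| and |buf_ptr2|, and the caller steps over the stop character itself before scanning again.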
@= function scan2 (@!char1,@!char2:ASCII_code) : boolean; begin buf_ptr1 := buf_ptr2; {scan until end-of-line or the specified characters} while ((scan_char <> char1) and (scan_char <> char2) and (buf_ptr2 < last)) do incr(buf_ptr2); if (buf_ptr2 < last) then scan2 := true else scan2 := false; end; @ This function is the same but stops at |white_space| characters as well. @= function scan2_white (@!char1,@!char2:ASCII_code) : boolean; begin buf_ptr1 := buf_ptr2; {scan until end-of-line, the specified characters, or |white_space|} while ((scan_char <> char1) and (scan_char <> char2) and (lex_class[scan_char] <> white_space) and (buf_ptr2 < last)) do incr(buf_ptr2); if (buf_ptr2 < last) then scan2_white := true else scan2_white := false; end; @ This function is similar to |scan2|, but stops at either of three stop-characters as well as the end of the current line. @= function scan3 (@!char1,@!char2,@!char3:ASCII_code) : boolean; begin buf_ptr1 := buf_ptr2; {scan until end-of-line or the specified characters} while ((scan_char <> char1) and (scan_char <> char2) and (scan_char <> char3) and (buf_ptr2 < last)) do incr(buf_ptr2); if (buf_ptr2 < last) then scan3 := true else scan3 := false; end; @ This function scans for letters, stopping at the first nonletter; it returns |true| if there is at least one letter. @= function scan_alpha : boolean; begin buf_ptr1 := buf_ptr2; {scan until end-of-line or a nonletter} while ((lex_class[scan_char] = alpha) and (buf_ptr2 < last)) do incr(buf_ptr2); if (token_len = 0) then scan_alpha := false else scan_alpha := true; end; @ These are the possible values for |scan_result|; they're set by the |scan_identifier| procedure and are described in the next section. @d id_null = 0 @d specified_char_adjacent = 1 @d other_char_adjacent = 2 @d white_adjacent = 3 @= @!scan_result : id_null..white_adjacent; @ This procedure scans for an identifier, stopping at the first |illegal_id_char|, or stopping at the first character if it's |numeric|. 
It sets the global variable |scan_result| to |id_null| if the identifier is null, else to |white_adjacent| if it ended at a |white_space| character or an end-of-line, else to |specified_char_adjacent| if it ended at one of |char1| or |char2|, else to |other_char_adjacent| if it ended at a nonspecified, non|white_space| |illegal_id_char|. @= procedure scan_identifier (@!char1,@!char2:ASCII_code); begin buf_ptr1 := buf_ptr2; if (lex_class[scan_char] <> numeric) then {scan until end-of-line or an |illegal_id_char|} while ((id_class[scan_char] = legal_id_char) and (buf_ptr2 < last)) do incr(buf_ptr2); if (token_len = 0) then scan_result := id_null else if ((lex_class[scan_char] = white_space) or (buf_ptr2 = last)) then scan_result := white_adjacent else if ((scan_char = char1) or (scan_char = char2)) then scan_result := specified_char_adjacent else scan_result := other_char_adjacent; end; @ The next two procedures scan for an integer, setting the global variable |token_value| to the corresponding integer. @d char_value == (scan_char - "0") {the value of the digit being scanned} @= @!token_value : integer; {the numeric value of the current token} @ This function scans for a nonnegative integer, stopping at the first nondigit; it sets the value of |token_value| accordingly. It returns |true| if the token was a legal nonnegative integer (i.e., consisted of one or more digits). @= function scan_nonneg_integer : boolean; begin buf_ptr1 := buf_ptr2; token_value := 0; {scan until end-of-line or a nondigit} while ((lex_class[scan_char] = numeric) and (buf_ptr2 < last)) do begin token_value := token_value*10 + char_value; incr(buf_ptr2); end; if (token_len = 0) then {there were no digits} scan_nonneg_integer := false else scan_nonneg_integer := true; end; @ This procedure scans for an integer, stopping at the first nondigit; it sets the value of |token_value| accordingly. 
It returns |true| if the token was a legal integer (i.e., consisted of an optional |minus_sign| followed by one or more digits). @d negative == (sign_length = 1) {if this integer is negative} @= function scan_integer : boolean; var sign_length : 0..1; {1 if there's a |minus_sign|, 0 if not} begin buf_ptr1 := buf_ptr2; if (scan_char = minus_sign) then {it's a negative number} begin sign_length := 1; incr(buf_ptr2); {skip over the |minus_sign|} end else sign_length := 0; token_value := 0; {scan until end-of-line or a nondigit} while ((lex_class[scan_char] = numeric) and (buf_ptr2 < last)) do begin token_value := token_value*10 + char_value; incr(buf_ptr2); end; if (negative) then token_value := -token_value; if (token_len = sign_length) then {there were no digits} scan_integer := false else scan_integer := true; end; @ This function scans over |white_space| characters, stopping either at the first nonwhite character or the end of the line, respectively returning |true| or |false|. @= function scan_white_space : boolean; begin {scan until end-of-line, or a nonwhite} while ((lex_class[scan_char] = white_space) and (buf_ptr2 < last)) do incr(buf_ptr2); if (buf_ptr2 < last) then scan_white_space := true else scan_white_space := false; end; @ The |print_bad_input_line| procedure prints the current input line, splitting it at the character being scanned: It prints |buffer[0]|, |buffer[1]|, \dots, |buffer[buf_ptr2-1]| on one line and |buffer[buf_ptr2]|, \dots, |buffer[last-1]| on the next (and both lines start with a colon between two |space|s). Each |white_space| character is printed as a |space|. 
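The layout produced by |print_bad_input_line| is easiest to see on an example. The Python sketch below is an illustration under stated assumptions: the function |bad_input_lines| is hypothetical, it returns the two lines instead of printing them, the line is an ordinary Python string, and |isspace| stands in for the |white_space| test.

```python
def bad_input_lines(line, buf_ptr2):
    """Return the two lines that split `line` at position buf_ptr2."""
    # every white-space character is shown as a single space
    shown = "".join(" " if ch.isspace() else ch for ch in line)
    first = " : " + shown[:buf_ptr2]
    # the second line is indented so the remainder starts under the split point
    second = " : " + " " * buf_ptr2 + shown[buf_ptr2:]
    return first, second
```

Because the second line is indented by |buf_ptr2| spaces, the unscanned remainder begins directly below the point where scanning stopped.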
@= procedure print_bad_input_line; var bf_ptr : buf_pointer; begin print (' : '); bf_ptr := 0; while (bf_ptr < buf_ptr2) do begin if (lex_class[buffer[bf_ptr]] = white_space) then print (xchr[space]) else print (xchr[buffer[bf_ptr]]); incr(bf_ptr); end; print_newline; print (' : '); bf_ptr := 0; while (bf_ptr < buf_ptr2) do begin print (xchr[space]); incr(bf_ptr); end; bf_ptr := buf_ptr2; while (bf_ptr < last) do begin if (lex_class[buffer[bf_ptr]] = white_space) then print (xchr[space]) else print (xchr[buffer[bf_ptr]]); incr(bf_ptr); end; print_newline; end; @* Getting the top-level auxiliary file name. @↑system dependencies@> These modules read the name of the top-level \.{.aux} file---as currently implemented the name comes from the user's terminal. The name goes into the |char| array |name_of_file|, and the files relevant to this name are opened. @d aux_found=41 {go here when the \.{.aux} name is legit} @d aux_not_found=46 {go here when it's not} @= aux_found {this is the first such label in the program} ,@!aux_not_found {this isn't} @ @= @!aux_name_length : 0..file_name_size+1; {\.{.aux} name sans extension} @ @↑system dependencies@> This loop reads a (nonnull) file name, adds the various extensions, and tries to open the files with the resulting name. Note: The |term_out| and |term_in| files are system dependent. 
@= loop begin write (term_out,'Please type input file name (no extension)--'); if (eoln(term_in)) then {so the first |read| works} read_ln (term_in); aux_name_length := 0; while (not eoln(term_in)) do begin if (aux_name_length = file_name_size) then begin while (not eoln(term_in)) do {discard the rest of the line} get(term_in); @; end; incr(aux_name_length); name_of_file[aux_name_length] := term_in↑; get(term_in); end; if ((aux_name_length + length(s_aux_extension) > file_name_size) or@| (aux_name_length + length(s_log_extension) > file_name_size) or@| (aux_name_length + length(s_bbl_extension) > file_name_size)) then @; @; @; goto aux_found; aux_not_found: end; aux_found: {now we're ready to read the \.{.aux} file} @ @↑system dependencies@> @↑user abuse@> I mean, this is truly disgraceful. A user has to type something in to the terminal just once during the entire run. And it's not some complicated string where you have to get every last punctuation mark just right, and it's not some fancy list where you get nervous because if you forget one item you have to type the whole thing again; it's just a simple, ordinary, file name. Now you'd think a five-year-old could do it; you'd think it's so simple a user should be able to do it in his sleep. But noooooooooo. He had to sit there droning on and on about who knows what until he exceeded the bounds of common sense, and he probably didn't even realize it. Just pitiful. What's this world coming to? We should probably just delete all his files and be done with him. Note: The |term_out| file is system dependent. @= begin write (term_out,'file name---'); name_ptr := 1; while (name_ptr <= aux_name_length) do begin write (term_out,name_of_file[name_ptr]); incr(name_ptr); end; write_ln (term_out,'---is too long'); goto aux_not_found; end @ Here we set up definitions and declarations for files opened in this section. 
Each element in |aux_list| is a pointer to the appropriate string in |str_pool| representing the \.{.aux} file name. |aux_list[aux_stack_size]| is unused. The array |aux_file| contains the corresponding \PASCAL\ |file| variables. @d cur_aux_str == aux_list[aux_ptr] {shorthand for the current \.{.aux} file} @d cur_aux_file == aux_file[aux_ptr] {shorthand for the current |aux_file|} @d cur_aux_line == aux_ln_stack[aux_ptr] {line number of current \.{.aux} file} @= @!aux_file : array[aux_number] of alpha_file; {open \.{.aux} |file| variables} @!aux_list : array[aux_number] of str_number; {the open \.{.aux} file list} @!aux_ptr : aux_number; {points to the currently open \.{.aux} file} @!aux_ln_stack : array[aux_number] of integer; {open \.{.aux} line numbers} @# @!top_lev_str : str_number; {the top-level \.{.aux} file's name} @# @!log_file : alpha_file; {the |file| variable for the \.{.blg} file} @!bbl_file : alpha_file; {the |file| variable for the \.{.bbl} file} @ Where |aux_number| is the obvious. @= @!aux_number = 0..aux_stack_size; {gives the |aux_list| range} @ @↑system dependencies@> We must make sure the (top-level) \.{.aux}, \.{.blg}, and \.{.bbl} files can be opened. @= name_length := aux_name_length; {set to last used position} add_extension (s_aux_extension); {this also sets |name_length|} aux_ptr := 0; {initialize the \.{.aux} file stack} if (not a_open_in(cur_aux_file)) then @; @# name_length := aux_name_length; add_extension (s_log_extension); {this also sets |name_length|} if (not a_open_out(log_file)) then @; @# name_length := aux_name_length; add_extension (s_bbl_extension); {this also sets |name_length|} if (not a_open_out(bbl_file)) then @; @ @↑system dependencies@> @↑user abuse@> We've abused the user enough for one section; suffice it to say here that most of what we said a few modules ago still applies. Note: The |term_out| file is system dependent. 
@= begin write (term_out,'I couldn''t open file name---'); name_ptr := 1; while (name_ptr <= name_length) do begin write (term_out,name_of_file[name_ptr]); incr(name_ptr); end; write_ln (term_out); goto aux_not_found; end @ @:this can't happen}{\quad already encountered auxiliary file@> This code puts the \.{.aux} file name, both with and without the extension, into the hash table and initializes |aux_list|. Note that all previous top-level \.{.aux}-file stuff must have been successful. @= name_length := aux_name_length; add_extension (s_aux_extension); {this also sets |name_length|} name_ptr := 1; while (name_ptr <= name_length) do begin buffer[name_ptr] := xord[name_of_file[name_ptr]]; incr(name_ptr); end; lower_case (buffer, 1, name_length); top_lev_str := hash_text[ str_lookup(buffer,1,aux_name_length,text_ilk,do_insert)]; cur_aux_str := hash_text[ str_lookup(buffer,1,name_length,aux_file_ilk,do_insert)]; {note that this has initialized |aux_list|} if (hash_found) then begin print ('already encountered auxiliary file '); print_aux_name; abort ('---this can''t happen'); end; cur_aux_line := 0; {this finishes initializing the top-level \.{.aux} file} @ Print the name of the current \.{.aux} file. @= procedure print_aux_name; begin print_pool_str (cur_aux_str); print_newline; end; @* Reading the auxiliary file(s). Now it's time to read the \.{.aux} file. The only commands we handle are \.{\\citation} (there can be arbitrarily many, each having arbitrarily many arguments), \.{\\bibdata} (there can be just one, but it can have arbitrarily many arguments), \.{\\bibstyle} (there can be just one, and it can have just one argument), and \.{\\@@input} (there can be arbitrarily many, each with one argument, and they can be nested to a depth of |aux_stack_size|). Each of these commands is assumed to be on just a single line. The rest of the \.{.aux} file is ignored. 
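To make the shape of these four commands concrete, here is a small Python sketch (an illustration only, not the \PASCAL\ code). It assumes each command of interest is a single line with no blanks, of the form name, left brace, arguments, right brace; the names |parse_aux_line|, |AUX_COMMANDS|, and |MULTI_ARG| are hypothetical. Unlike the real program it raises an exception on a malformed command instead of printing a message and flushing the line, and it does not enforce the "just one" rules for \.{\\bibdata} and \.{\\bibstyle}.

```python
AUX_COMMANDS = ("\\citation", "\\bibdata", "\\bibstyle", "\\@input")
MULTI_ARG = ("\\citation", "\\bibdata")  # these take comma-separated lists


def parse_aux_line(line):
    """Return (command, args) for a command of interest, else None."""
    brace = line.find("{")
    if brace == -1:
        return None  # no left brace: the line is ignored
    command = line[:brace]
    if command not in AUX_COMMANDS:
        return None  # some other command: the line is ignored
    rest = line[brace + 1:]
    for i, ch in enumerate(rest):
        if ch.isspace():
            raise ValueError("white space in argument to %s command" % command)
        if ch == "}":
            break
    else:
        # the loop ran off the end of the line without seeing a right brace
        raise ValueError('no "}" for %s command' % command)
    if i != len(rest) - 1:
        raise ValueError('stuff after "}" for %s command' % command)
    body = rest[:i]
    args = body.split(",") if command in MULTI_ARG else [body]
    return command, args


if __name__ == "__main__":
    print(parse_aux_line("\\citation{knuth84,Lamport86}"))
    print(parse_aux_line("\\relax"))
```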
@d aux_done=31 {go here when finished with the \.{.aux} files} @= ,@!aux_done @ We keep reading and processing input lines until none left. This is part of the main program. @= print ('The top-level auxiliary file: '); print_aux_name; loop begin {|pop_the_aux_stack| will exit the loop} incr(cur_aux_line); if (not input_ln(cur_aux_file)) then {end of current \.{.aux} file} pop_the_aux_stack else get_aux_command_and_process; end; trace trace_pr_ln ('finished reading the auxiliary file(s)'); ecart@/ aux_done: last_check_for_aux_errors; @ When we find a bug, we print a message and flush the rest of the line. This macro must be called from within a procedure that has an |exit| label. This is the first of several macros that have associated procedures so that they produce less inline code. @d aux_err(#) == begin print (#); aux_err_print; return; {flush this input line} end @= procedure aux_err_print; begin print ('---line ',cur_aux_line:0,' of file '); print_aux_name; end; @ @:this can't happen}{\quad unknown auxiliary-file command@> We're not at the end of an \.{.aux} file, so we see if the current line might be a command of interest. A command of interest will be a line without blanks, consisting of a command name, a |left_brace|, one or more arguments separated by commas, and a |right_brace|. @= procedure get_aux_command_and_process; label exit; begin buf_ptr2 := 0; {mark the beginning of the next token} if (not scan1(left_brace)) then {no |left_brace|---flush line} return; command_num := ilk_info[ str_lookup(buffer,buf_ptr1,token_len,aux_command_ilk,dont_insert)]; if (hash_found) then case command_num of n_aux_bibdata : aux_bib_data_command; n_aux_bibstyle : aux_bib_style_command; n_aux_citation : aux_citation_command; n_aux_input : aux_input_command; othercases aux_err ('this can''t happen---unknown auxiliary-file command') endcases; exit: end; @ Here we introduce some variables for processing a \.{\\bibdata} command. 
Each element in |bib_list| is a pointer to the appropriate string in |str_pool| representing the \.{.bib} file name. |bib_list[max_bib_files]| is unused. The array |bib_file| contains the corresponding \PASCAL\ |file| variables. @d cur_bib_str == bib_list[bib_ptr] {shorthand for current \.{.bib} file} @d cur_bib_file == bib_file[bib_ptr] {shorthand for current |bib_file|} @= @!bib_list : array[bib_number] of str_number; {the \.{.bib} file list} @!bib_ptr : bib_number; {pointer for the current \.{.bib} file} @!num_bib_files : bib_number; {the total number of \.{.bib} files} @!bib_seen : boolean; {|true| if we've already seen a \.{\\bibdata} command} @!bib_file : array[bib_number] of alpha_file; {corresponding |file| variables} @ Where |bib_number| is the obvious. @= @!bib_number = 0..max_bib_files; {gives the |bib_list| range} @ @= bib_ptr := 0; {this makes |bib_list| empty} bib_seen := false; {we haven't seen a \.{\\bibdata} command yet} @ A \.{\\bibdata} command will have its arguments between braces and separated by commas. There must be exactly one such command in the \.{.aux} file(s). All upper-case letters are converted to lower case. 
@= procedure aux_bib_data_command; label exit; begin if (bib_seen) then aux_err ('illegal, another \bibdata command'); bib_seen := true; {now we've seen a \.{\\bibdata} command} while (scan_char <> right_brace) do begin incr(buf_ptr2); {skip over the previous stop-character} if (not scan2_white(right_brace,comma)) then aux_err ('no "',xchr[right_brace],'" for \bibdata command'); if (lex_class[scan_char] = white_space) then aux_err ('white space in argument to \bibdata command'); if ((last > buf_ptr2+1) and (scan_char = right_brace)) then aux_err ('stuff after "',xchr[right_brace],'" for \bibdata command'); @; end; exit: end; @ @:BibTeX capacity exceeded}{\quad number of \.{.bib} files@> Now we add the just-found argument to |bib_list| if it hasn't already been encountered as a \.{\\bibdata} argument and if, after appending the |s_bib_extension| string, the resulting file name can be opened. @= if (bib_ptr = max_bib_files) then overflow('number of database files ',max_bib_files); lower_case (buffer, buf_ptr1, token_len); {ignore case differences} cur_bib_str := hash_text[ str_lookup(buffer,buf_ptr1,token_len,bib_file_ilk,do_insert)]; if (hash_found) then {already encountered this as a \.{\\bibdata} argument} begin print ('This database file appears more than once: '); print_bib_name; aux_err (' for \bibdata command'); end; start_name (cur_bib_str); add_extension (s_bib_extension); if (not a_open_in(cur_bib_file)) then begin add_area (s_bib_area); if (not a_open_in(cur_bib_file)) then begin print ('I couldn''t open database file '); print_bib_name; aux_err (' for \bibdata command'); end; end; trace trace_pr_pool_str (cur_bib_str); trace_pr_pool_str (s_bib_extension); trace_pr_ln (' is a bibdata file'); ecart@/ incr(bib_ptr); @ Print the name of the current \.{.bib} file. @= procedure print_bib_name; begin print_pool_str (cur_bib_str); print_pool_str (s_bib_extension); print_newline; end; @ Here we introduce some variables for processing a \.{\\bibstyle} command.
@= @!bst_seen : boolean; {|true| if we've already seen a \.{\\bibstyle} command} @!bst_str : str_number; {the string number for the \.{.bst} file} @!bst_file : alpha_file; {the corresponding |file| variable} @ And we initialize. @= bst_str := 0; {mark |bst_str| as unused} bst_seen := false; {we haven't seen a \.{\\bibstyle} command yet} @ A \.{\\bibstyle} command will have exactly one argument, and it will be between braces. There must be exactly one such command in the \.{.aux} file(s). All upper-case letters are converted to lower case. @= procedure aux_bib_style_command; label exit; begin if (bst_seen) then aux_err ('illegal, another \bibstyle command'); bst_seen := true; {now we've seen a \.{\\bibstyle} command} incr(buf_ptr2); {skip over the |left_brace|} if (not scan1_white(right_brace)) then aux_err ('no "',xchr[right_brace],'" for \bibstyle command'); if (lex_class[scan_char] = white_space) then aux_err ('white space in argument to \bibstyle command'); if (last > buf_ptr2+1) then aux_err ('stuff after "',xchr[right_brace],'" for \bibstyle command'); @; exit: end; @ @:this can't happen}{\quad already encountered style file@> Now we open the file whose name is the just-found argument appended with the |s_bst_extension| string, if possible. @= lower_case (buffer, buf_ptr1, token_len); {ignore case differences} bst_str := hash_text[ str_lookup(buffer,buf_ptr1,token_len,bst_file_ilk,do_insert)]; if (hash_found) then begin print ('this can''t happen---already encountered style file '); print_bst_name; aux_err (' for \bibstyle command'); end; start_name (bst_str); add_extension (s_bst_extension); if (not a_open_in(bst_file)) then begin add_area (s_bst_area); if (not a_open_in(bst_file)) then begin print ('I couldn''t open style file '); print_bst_name;@/ bst_str := 0; {mark as unused again} aux_err (' for \bibstyle command'); end; end; print ('The style file: '); print_bst_name; @ Print the name of the \.{.bst} file. 
@= procedure print_bst_name; begin print_pool_str (bst_str); print_pool_str (s_bst_extension); print_newline; end; @ Here we introduce some variables for processing a \.{\\citation} command. Each element in |cite_list| is a pointer to the appropriate string in |str_pool|. The cite-key list is kept in order of occurrence with duplicates removed. |cite_list[max_cites]| is unused. @d cur_cite_str == cite_list[cite_ptr] {shorthand for the current cite key} @= @!cite_list : packed array[cite_number] of str_number; {the cite-key list} @!cite_ptr : cite_number; {pointer for the current cite key} @!num_cites : cite_number; {the total number of distinct cite keys} @!citation_seen : boolean; {|true| if we've seen a \.{\\citation} command} @!cite_loc : hash_loc; {the hash-table location of a cite key} @!lc_cite_loc : hash_loc; {and of its lower-case equivalent} @!cite_found : boolean; {|true| if we've already seen this cite key} @ Where |cite_number| is the obvious. @= @!cite_number = 0..max_cites; {gives the |cite_list| range} @ @= cite_ptr := 0; {this makes |cite_list| empty} citation_seen := false; {we haven't seen a \.{\\citation} command yet} @ @:LaTeX}{\LaTeX@> A \.{\\citation} command will have its arguments between braces and separated by commas. Upper/lower cases are considered to be different for \.{\\citation} arguments, which is the same as the rest of \LaTeX\ but different from the rest of \BibTeX. But they're only different as far as certain warning messages are concerned; a cite key needn't exactly case-match its corresponding database key to work. 
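The case rule can be sketched in a few lines of Python (an illustration under stated assumptions, not the \PASCAL\ code). The class |CiteList| and its method |add| are hypothetical names. The sketch keeps keys in order of first occurrence, treats a repeat with identical spelling as a duplicate, and treats a repeat that differs only in case as an error. It raises an exception for the mismatch, whereas the program prints a message and flushes the rest of the line; the capacity check against |max_cites| is omitted.

```python
class CiteList:
    """Cite keys in order of first occurrence, duplicates removed."""

    def __init__(self):
        self.keys = []        # the keys exactly as first written
        self._by_lower = {}   # lower-cased key -> index into self.keys

    def add(self, key):
        """Return 'new' or 'duplicate'; raise ValueError on a case mismatch."""
        index = self._by_lower.get(key.lower())
        if index is None:
            # lower-case form not seen before: a new cite key
            self._by_lower[key.lower()] = len(self.keys)
            self.keys.append(key)
            return "new"
        if self.keys[index] != key:
            raise ValueError(
                "case mismatch between \\cite keys %s and %s"
                % (key, self.keys[index])
            )
        return "duplicate"


if __name__ == "__main__":
    cites = CiteList()
    for k in ("Knuth84", "lamport86", "Knuth84"):
        print(k, cites.add(k))
```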
@= procedure aux_citation_command; label exit; begin citation_seen := true; {now we've seen a \.{\\citation} command} while (scan_char <> right_brace) do begin incr(buf_ptr2); {skip over the previous stop-character} if (not scan2_white(right_brace,comma)) then aux_err ('no "',xchr[right_brace],'" for \citation command'); if (lex_class[scan_char] = white_space) then aux_err ('white space in argument to \citation command'); if ((last > buf_ptr2+1) and (scan_char = right_brace)) then aux_err ('stuff after "',xchr[right_brace],'" for \citation command'); @; end; exit: end; @ We must check if (the lower-case version of) this cite key has been previously encountered, and proceed accordingly. @= trace trace_pr_token; trace_pr (' \cite key encountered'); ecart@/ tmp_ptr := buf_ptr1; while (tmp_ptr < buf_ptr2) do begin sv_buffer[tmp_ptr] := buffer[tmp_ptr]; incr(tmp_ptr); end; lower_case (sv_buffer, buf_ptr1, token_len); {convert to `canonical' form} lc_cite_loc := str_lookup(sv_buffer,buf_ptr1,token_len,lc_cite_ilk,do_insert); if (hash_found) then {already encountered this as a \.{\\citation} argument} @ else @; {it's a new cite key---add it to |cite_list|} @ The lower-case version has already been encountered, so we check that the actual version exactly matches the actual version of the corresponding previously-encountered cite key(s). @= begin trace trace_pr_ln (' previously'); ecart@/ dummy_loc := str_lookup(buffer,buf_ptr1,token_len,cite_ilk,dont_insert); if (not hash_found) then {case mismatch error} begin print ('case mismatch between \cite keys '); print_token; print (' and '); print_pool_str (cite_list[ilk_info[lc_cite_loc]]); print_newline; aux_err (' '); end; end @ @:BibTeX capacity exceeded}{\quad number of cite keys@> @:this can't happen}{\quad cite hash error@> Now we add the just-found argument to |cite_list| if there isn't anything funny happening. 
@= begin trace trace_pr_newline; ecart@/ if (cite_ptr = max_cites) then overflow('number of \cite keys ',max_cites); cite_loc := str_lookup(buffer,buf_ptr1,token_len,cite_ilk,do_insert); if (hash_found) then aux_err ('this can''t happen---cite hash error'); cur_cite_str := hash_text[cite_loc]; ilk_info[cite_loc] := cite_ptr; ilk_info[lc_cite_loc] := cite_ptr; incr(cite_ptr); end @ An \.{\\@@input} command will have exactly one argument, it will be between braces, and it must have the |s_aux_extension|. All upper-case letters are converted to lower case. @= procedure aux_input_command; label exit; var aux_extension_ok : boolean; {to check for a correct file extension} begin incr(buf_ptr2); {skip over the |left_brace|} if (not scan1_white(right_brace)) then aux_err ('no "',xchr[right_brace],'" for \@@input command'); if (lex_class[scan_char] = white_space) then aux_err ('white space in argument to \@@input command'); if (last > buf_ptr2+1) then aux_err ('stuff after "',xchr[right_brace],'" for \@@input command'); @; exit: end; @ @:BibTeX capacity exceeded}{\quad number of \.{.aux} files@> We must check that this potential \.{.aux} file won't overflow the stack, that it has the correct extension, and that we haven't encountered it before (to prevent, among other things, an infinite loop).
@= incr(aux_ptr); if (aux_ptr = aux_stack_size) then overflow('number of auxiliary files ',aux_stack_size); aux_extension_ok := true; lower_case (buffer, buf_ptr1, token_len); {ignore case differences} if (token_len < length(s_aux_extension)) then aux_extension_ok := false {else |str_eq_buf| might bomb the program} else if (not str_eq_buf(s_aux_extension, buffer, buf_ptr2-length(s_aux_extension), length(s_aux_extension))) then aux_extension_ok := false; if (not aux_extension_ok) then begin print_token; print_ln (' has a wrong extension'); decr(aux_ptr); aux_err (' for \@@input command'); end; cur_aux_str := hash_text[ str_lookup(buffer,buf_ptr1,token_len,aux_file_ilk,do_insert)]; if (hash_found) then begin print ('already encountered file '); print_aux_name; decr(aux_ptr); aux_err (' for \@@input command'); end; @; @ We check that this \.{.aux} file can actually be opened, and then open it. @= start_name (cur_aux_str); {extension already there for \.{.aux} files} name_ptr := name_length+1; while (name_ptr <= file_name_size) do {pad with blanks} begin name_of_file[name_ptr] := ' '; incr(name_ptr); end; if (not a_open_in(cur_aux_file)) then begin print ('I couldn''t open auxiliary file '); print_aux_name; decr(aux_ptr); aux_err (' for \@@input command'); end; print ('A level-',aux_ptr:0,' auxiliary file: '); print_aux_name; cur_aux_line := 0; @ Here we close the current-level \.{.aux} file and go back up a level, if possible, by decrementing |aux_ptr|. @= procedure pop_the_aux_stack; begin a_close (cur_aux_file); if (aux_ptr=0) then goto aux_done else decr(aux_ptr); end; @ @↑gymnastics@> That's it for processing \.{.aux} commands, except for finishing the procedural gymnastics. @= @ @ We must complain if anything's amiss. @d aux_end_err(#) == begin print (#); aux_end_err_print; end @= procedure aux_end_err_print; begin print ('---while reading file '); print_aux_name; end; @ Before proceeding, we see if we have any complaints. 
@= procedure last_check_for_aux_errors; begin num_cites := cite_ptr; {record the number of distinct cite keys} num_bib_files := bib_ptr; {and the number of \.{.bib} files} if (not citation_seen) then aux_end_err ('I found no \citation commands') else if (num_cites = 0) then aux_end_err ('I found no \cite keys'); if (not bib_seen) then aux_end_err ('I found no \bibdata command') else if (num_bib_files = 0) then aux_end_err ('I found no database files'); if (not bst_seen) then aux_end_err ('I found no \bibstyle command') else if (bst_str = 0) then aux_end_err ('I found no style file'); end; @* Reading the style file. This part of the program reads the \.{.bst} file, which consists of a sequence of commands. Each \.{.bst} command consists of a name (for which case differences are ignored) followed by zero or more arguments, each enclosed in braces. @d bst_done=32 {go here when finished with the \.{.bst} file} @d no_bst_file=9932 {go here when skipping the \.{.bst} file} @= ,@!bst_done,@!no_bst_file @ The |bbl_line_num| gets initialized along with the |bst_line_num|, so it's declared here too. @= @!bbl_line_num : integer; {line number of the \.{.bbl} (output) file} @!bst_line_num : integer; {line number of the \.{.bst} file} @ When there's a serious error parsing the \.{.bst} file, we flush the rest of the current command; a blank line is assumed to mark the end of a command (but for the purposes of error recovery only). Thus, error recovery will be better if style designers leave blank lines between \.{.bst} commands. This macro must be called from within a procedure that has an |exit| label.
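The blank-line recovery rule can be pictured with the following Python sketch (illustrative only). The function name |skip_to_blank_line| is hypothetical, the style file is modelled as a list of strings, and treating a line of only white space as blank is an assumption about how input lines are trimmed.

```python
def skip_to_blank_line(lines, start):
    """Skip from the offending line to just past the next blank line.

    Returns the index of the line at which scanning would resume,
    or None if the file ends before a blank line is found.
    """
    i = start
    while i < len(lines):
        if lines[i].strip() == "":
            return i + 1  # resume with the line after the blank one
        i += 1
    return None  # end of file: nothing more to process


if __name__ == "__main__":
    style = ["ENTRY {", "  bad stuff", "", "READ"]
    print(skip_to_blank_line(style, 0))
```

In this model a style file with blank lines between commands loses only the faulty command, which is the point of the advice to style designers above.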
@d bst_err(#) == begin {serious error during \.{.bst} parsing} print (#); bst_err_print_and_look_for_a_blank_line; return; end @= procedure bst_err_print_and_look_for_a_blank_line; begin print ('---line ',bst_line_num:0,' of file '); print_bst_name; print_bad_input_line; while (last <> 0) do {look for a blank input line} if (not input_ln(bst_file)) then {or the end of the file} goto bst_done else incr (bst_line_num); buf_ptr2 := last; {to input the next line} end; @ When there's a harmless (syntactically, at least) error parsing the \.{.bst} file, we just give an error message. @d bst_warn(#) == begin {non-serious error during \.{.bst} parsing} print (#); bst_warn_print; end @= procedure bst_warn_print; begin print ('---line ',bst_line_num:0,' of file '); print_bst_name; end; @ Here's the outer loop for reading the \.{.bst} file---it keeps reading and processing \.{.bst} commands until none left. This is part of the main program. @= if (bst_str = 0) then {there's no \.{.bst} file to read} goto no_bst_file; {this is a |goto| so that |bst_done| is not in a block} bst_line_num := 0; {initialize things} bbl_line_num := 1; {best spot to initialize the output line number} buf_ptr2 := last; {to get the first input line} loop begin if (not eat_bst_white_space) then {the end of the \.{.bst} file} goto bst_done; get_bst_command_and_process; end; bst_done: a_close (bst_file); no_bst_file: a_close (bbl_file); @ This \.{.bst}-specific scanning function skips over |white_space| characters (and comments) until hitting a nonwhite character or the end of the file, respectively returning |true| or |false|. It also updates |bst_line_num|, the line counter. 
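A minimal Python sketch of this scanning rule follows (an illustration, not the \PASCAL\ function). The name |eat_white_space| and the constant |COMMENT| are hypothetical; the comment character is assumed to be a percent sign, and the file is modelled as a list of strings. A position is returned instead of a boolean so the result can be inspected: a pair means a nonwhite character was found, |None| means the end of the file was reached.

```python
COMMENT = "%"  # assumption: the style-file comment character


def eat_white_space(lines, line_no, pos):
    """Return (line_no, pos) of the next significant character, or None."""
    while line_no < len(lines):
        line = lines[line_no]
        # skip white space on the current line
        while pos < len(line) and line[pos].isspace():
            pos += 1
        if pos < len(line) and line[pos] != COMMENT:
            return line_no, pos  # a nonwhite, noncomment character
        # end of line or a comment: move to the start of the next line
        line_no += 1
        pos = 0
    return None  # ran out of lines


if __name__ == "__main__":
    style = ["  % a comment", "", "   entry {"]
    print(eat_white_space(style, 0, 0))
```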
@= function eat_bst_white_space : boolean; label exit; begin loop begin if (scan_white_space) then {hit a nonwhite character on this line} if (scan_char <> comment) then {it's not a comment character; return} begin eat_bst_white_space := true; return; end; if (not input_ln(bst_file)) then {end-of-file; return |false|} begin eat_bst_white_space := false; return; end; incr(bst_line_num); buf_ptr2 := 0; end; exit: end; @ It's often illegal to end a \.{.bst} command in certain places, and this is where we come to check. @d eat_bst_white_and_eof_check(#) == begin if (not eat_bst_white_space) then begin eat_bst_print; bst_err (#); end; end @= procedure eat_bst_print; begin print ('illegal end of style file in command: '); end; @ We must attend to a few details before getting to work on this \.{.bst} command. @= procedure get_bst_command_and_process; label exit; begin if (not scan_alpha) then bst_err ('"',xchr[scan_char],'" can''t start a style-file command'); lower_case (buffer, buf_ptr1, token_len); {ignore case differences} command_num := ilk_info[ str_lookup(buffer,buf_ptr1,token_len,bst_command_ilk,dont_insert)]; if (not hash_found) then begin print_token; bst_err (' is an illegal style-file command'); end; @; exit: end; @ @:this can't happen}{\quad unknown style-file command@> Here we determine which \.{.bst} command we're about to process, and then go to it. @= case command_num of n_bst_entry : bst_entry_command; n_bst_execute : bst_execute_command; n_bst_function : bst_function_command; n_bst_integers : bst_integers_command; n_bst_iterate : bst_iterate_command; n_bst_macro : bst_macro_command; n_bst_read : bst_read_command; n_bst_reverse : bst_reverse_command; n_bst_sort : bst_sort_command; n_bst_strings : bst_strings_command; othercases bst_err ('this can''t happen---unknown style-file command') endcases @ We need data structures for the function definitions, the entry variables, the global variables, and the actual entries corresponding to the cite-key list. 
First we define the classes of `function's used. Functions in all classes are of |bst_fn_ilk| except for |macro|s, which are of |macro_ilk|; |int_literal|s, which are of |integer_ilk|; and |str_literal|s, which are of |text_ilk|. @d built_in = 0 {the `primitive' functions} @d macro = 1 {defines a string substitution} @d wiz_defined = 2 {defined in the \.{.bst} file} @d int_literal = 3 {integer `constants'} @d str_literal = 4 {string `constants'} @d field = 5 {things like `author' and `title'} @d int_entry_var = 6 {integer entry variable} @d str_entry_var = 7 {string entry variable} @d int_global_var = 8 {integer global variable} @d str_global_var = 9 {string global variable} @d last_fn_class = 9 {the same number as on the line above} @ @:this can't happen}{\quad unknown function class@> Occasionally we'll want to |print| the name of one of these function classes. @= procedure print_fn_class (@!fn_loc : hash_loc); begin case fn_type[fn_loc] of built_in : print ('built-in'); macro : print ('macro'); wiz_defined : print ('wizard-defined'); int_literal : print ('integer-literal'); str_literal : print ('string-literal'); field : print ('field'); int_entry_var : print ('integer-entry-variable'); str_entry_var : print ('string-entry-variable'); int_global_var : print ('integer-global-variable'); str_global_var : print ('string-global-variable'); othercases print ('this can''t happen---unknown function class') endcases; end; @ @:this can't happen}{\quad unknown function class@> This version is for printing when in |trace| mode. 
@= trace procedure trace_pr_fn_class (@!fn_loc : hash_loc); begin case fn_type[fn_loc] of built_in : trace_pr ('built-in'); macro : trace_pr ('macro'); wiz_defined : trace_pr ('wizard-defined'); int_literal : trace_pr ('integer-literal'); str_literal : trace_pr ('string-literal'); field : trace_pr ('field'); int_entry_var : trace_pr ('integer-entry-variable'); str_entry_var : trace_pr ('string-entry-variable'); int_global_var : trace_pr ('integer-global-variable'); str_global_var : trace_pr ('string-global-variable'); othercases print ('this can''t happen---unknown function class') endcases; end; ecart @ Besides the function classes, we have types based on \BibTeX's capacity limitations and one based on what can go into the array |wiz_functions| explained below. @d quote_next_fn = hash_base - 1 {special marker used in defining functions} @d end_of_def = hash_max + 1 {another such special marker} @= @!fn_class = 0..last_fn_class; {the \.{.bst} function classes} @!wiz_fn_loc = 0..wiz_fn_space; {|wiz_defined|-function storage locations} @!int_ent_loc = 0..max_ent_ints; {|int_entry_var| storage locations} @!str_ent_loc = 0..max_ent_strs; {|str_entry_var| storage locations} @!str_glob_loc = 0..max_glob_strs; {|str_global_var| storage locations} @!field_loc = 0..max_fields; {individual field storage locations} @!hash_ptr2 = quote_next_fn..end_of_def; {a special marker or a |hash_loc|} @ @↑wasted space@> We store information about the \.{.bst} functions in arrays the same size as the hash-table arrays and in locations corresponding to their hash-table locations. The two arrays |fn_info| (an alias of |ilk_info| described earlier) and |fn_type| accomplish this: |fn_type| specifies one of the above classes, and |fn_info| gives information dependent on the class. Note: Since in practice, functions generally won't comprise a large fraction of the hash table, this scheme wastes some space. If space becomes a problem, this can be fixed. 
Five other arrays give the contents of functions: The array |wiz_functions| holds definitions for |wiz_defined| functions---each such function consists of a sequence of pointers to hash-table locations of other functions (with the two special-marker exceptions above); the array |entry_ints| contains the current values of |int_entry_var|s; the array |entry_strs| contains the current values of |str_entry_var|s; the array |global_strs| contains the current values of |str_global_var|s; and the array |field_info|, for each field of each entry, contains either a hash-table pointer to the string or the special value |missing|. @d fn_info == ilk_info {an alias used with functions} @# @d missing = empty {a special pointer for missing fields} @= @!fn_loc : hash_loc; {the hash-table location of a function} @!wiz_loc : hash_loc; {the hash-table location of a wizard function} @!literal_loc : hash_loc; {the hash-table location of a literal function} @!macro_name_loc : hash_loc; {the hash-table location of a macro name} @!macro_def_loc : hash_loc; {the hash-table location of a macro definition} @!fn_type : packed array[hash_loc] of fn_class; @!wiz_def_ptr : wiz_fn_loc; {storage location for the next wizard function} @!wiz_fn_ptr : wiz_fn_loc; {general |wiz_functions| location} @!wiz_functions : packed array[wiz_fn_loc] of hash_ptr2; @!int_ent_ptr : int_ent_loc; {general |int_entry_var| location} @!entry_ints : array[int_ent_loc] of integer; @!num_ent_ints : integer; {the number of distinct |int_entry_var| names} @!str_ent_ptr : str_ent_loc; {general |str_entry_var| location} @!entry_strs : array[str_ent_loc] of packed array[0..ent_str_size] of ASCII_code; @!num_ent_strs : integer; {the number of distinct |str_entry_var| names} @!str_glb_ptr : str_glob_loc; {general |str_global_var| location} @!global_strs : array[str_glob_loc] of packed array[0..glob_str_size] of ASCII_code; @!num_glb_strs : str_glob_loc; {the number of distinct |str_global_var| names} @!field_ptr : field_loc; 
{general |field_info| location} @!field_info : packed array[field_loc] of hash_pointer; @!num_fields : integer; {the number of distinct field names} @ Now we initialize storage for the |wiz_defined| functions and we initialize variables so that the first |str_entry_var|, |int_entry_var|, |str_global_var|, and |field| name will be assigned the number~0. Note: The variable |num_ent_strs| will also be set when pre-defining strings. @= wiz_def_ptr := 0; num_glb_strs := 0; num_ent_ints := 0; num_ent_strs := 0; num_fields := 0; @* Style-file commands. There are ten \.{.bst} commands: Five (\.{entry}, \.{function}, \.{integers}, \.{macro}, and \.{strings}) declare and define functions, one (\.{read}) reads in the \.{.bib}-file entries, and four (\.{execute}, \.{iterate}, \.{reverse}, and \.{sort}) manipulate the entries and produce output. The boolean variables |entry_seen| and |read_seen| indicate whether we've yet encountered an \.{entry} and a \.{read} command. There must be exactly one of each of these, and the \.{entry} command, as well as any \.{macro} command, must precede the \.{read} command. Furthermore, the \.{read} command must precede the four that manipulate the entries and produce output. @= @!entry_seen : boolean; {|true| if we've already seen an \.{entry} command} @!read_seen : boolean; {|true| if we've already seen a \.{read} command} @!read_performed : boolean; {|true| if we actually read the database file(s)} @ And we initialize them. @= entry_seen := false; read_seen := false; read_performed := false; @ @:this can't happen}{\quad identifier scanning error@> This macro is used to scan all \.{.bst} identifiers. The argument supplies the \.{.bst} command name. The associated procedure simply prints an error message. 
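For orientation, here is a Python sketch of the four-way classification that |scan_identifier| produces and that this macro tests (illustrative only). The set |ILLEGAL_ID_CHARS| is a hypothetical stand-in for the program's |illegal_id_char| class, and the result is returned as a string together with the token instead of being stored in a global variable.

```python
ILLEGAL_ID_CHARS = set('"#%\'(),={}')  # assumption: stand-in for illegal_id_char


def scan_identifier(line, start, char1, char2):
    """Return (token, result) for an identifier scanned from line[start:]."""
    pos = start
    # an identifier may not begin with a digit
    if not (pos < len(line) and line[pos].isdigit()):
        while (pos < len(line) and not line[pos].isspace()
               and line[pos] not in ILLEGAL_ID_CHARS):
            pos += 1
    token = line[start:pos]
    if token == "":
        result = "id_null"
    elif pos == len(line) or line[pos].isspace():
        result = "white_adjacent"
    elif line[pos] in (char1, char2):
        result = "specified_char_adjacent"
    else:
        result = "other_char_adjacent"
    return token, result


def identifier_ok(result):
    # the macro accepts only these two outcomes
    return result in ("white_adjacent", "specified_char_adjacent")


if __name__ == "__main__":
    print(scan_identifier("author}", 0, "}", "%"))
    print(scan_identifier("a=b", 0, "}", "%"))
```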
@d bst_identifier_scan(#) == begin scan_identifier (right_brace,comment); if ((scan_result = white_adjacent) or (scan_result = specified_char_adjacent)) then do_nothing else begin bst_id_print; bst_err (#); end; end @= procedure bst_id_print; begin if (scan_result = id_null) then print ('"',xchr[scan_char],'" begins identifier, command: ') else if (scan_result = other_char_adjacent) then print ('"',xchr[scan_char],'" immediately follows identifier, command: ') else print ('this can''t happen---identifier scanning error, command: '); end; @ This macro just makes sure we're at a |left_brace|. @d bst_get_and_check_left_brace(#) == begin if (scan_char <> left_brace) then begin bst_left_brace_print; bst_err (#); end; incr(buf_ptr2); {skip over the |left_brace|} end @= procedure bst_left_brace_print; begin print ('"',xchr[left_brace],'" is missing in command: '); end; @ And this one, a |right_brace|. @d bst_get_and_check_right_brace(#) == begin if (scan_char <> right_brace) then begin bst_right_brace_print; bst_err (#); end; incr(buf_ptr2); {skip over the |right_brace|} end @= procedure bst_right_brace_print; begin print ('"',xchr[right_brace],'" is missing in command: '); end; @ This macro complains if we've already encountered a function to be inserted into the hash table. @d check_for_already_seen_function(#) == begin if (hash_found) then {already encountered this as a \.{.bst} function} begin already_seen_function_print (#); return; end; end @= procedure already_seen_function_print (@!seen_fn_loc : hash_loc); label exit; {so the call to |bst_err| works} begin print_pool_str (hash_text[seen_fn_loc]); print (' is already a type '); print_fn_class (seen_fn_loc); print_ln (' function name'); bst_err (' '); exit: end; @ An \.{entry} command has three arguments, each a (possibly empty) list of function names between braces (the names are separated by one or more |white_space| characters). All function names in this and other commands must be legal \.{.bst} identifiers. 
Upper/lower cases are considered to be the same for function names in these lists---all upper-case letters are converted to lower case. These arguments give lists of |field|s, |int_entry_var|s, and |str_entry_var|s. @= procedure bst_entry_command; label exit; begin if (entry_seen) then bst_err ('illegal, another entry command'); entry_seen := true; {now we've seen an \.{entry} command} eat_bst_white_and_eof_check ('entry'); @; eat_bst_white_and_eof_check ('entry'); if (num_fields = 0) then bst_warn ('---I didn''t find any fields'); @; eat_bst_white_and_eof_check ('entry'); @; exit: end; @ This module reads a |left_brace|, the list of |field|s, and a |right_brace|. The |field|s are those like `author' and `title.' @= bst_get_and_check_left_brace ('entry'); eat_bst_white_and_eof_check ('entry'); while (scan_char <> right_brace) do begin bst_identifier_scan ('entry'); @; eat_bst_white_and_eof_check ('entry'); end; incr(buf_ptr2); {skip over the |right_brace|} @ Here we insert the just found field name into the hash table, record it as a |field|, and assign it a number to be used in indexing into the |field_info| array. @= trace trace_pr_token; trace_pr_ln (' is a field'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,do_insert); check_for_already_seen_function (fn_loc); fn_type[fn_loc] := field;@/ fn_info[fn_loc] := num_fields; {give this field a number, take away its name} incr(num_fields); @ This module reads a |left_brace|, the list of |int_entry_var|s, and a |right_brace|. @= bst_get_and_check_left_brace ('entry'); eat_bst_white_and_eof_check ('entry'); while (scan_char <> right_brace) do begin bst_identifier_scan ('entry'); @; eat_bst_white_and_eof_check ('entry'); end; incr(buf_ptr2); {skip over the |right_brace|} @ Here we insert the just found |int_entry_var| name into the hash table and record it as an |int_entry_var|. 
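The numbering scheme shared by |field|s, |int_entry_var|s, and |str_entry_var|s can be pictured outside of \PASCAL. The following Python fragment is only a sketch under stated assumptions: a dictionary stands in for the hash table together with |fn_type| and |fn_info|, and the names \.{SymbolTable} and \.{declare} are hypothetical, not part of this program.

```python
class SymbolTable:
    """Toy stand-in for the hash table plus fn_type/fn_info (assumption: a dict suffices)."""

    def __init__(self):
        self.entries = {}  # lower-cased name -> (function class, number)
        self.counts = {"field": 0, "int_entry_var": 0, "str_entry_var": 0}

    def declare(self, name, fn_class):
        """Record a new name and return the number assigned within its class."""
        key = name.lower()  # case differences are ignored
        if key in self.entries:
            seen_class = self.entries[key][0]
            raise ValueError(f"{key} is already a type {seen_class} function name")
        number = self.counts[fn_class]  # the first name of each class gets 0
        self.entries[key] = (fn_class, number)
        self.counts[fn_class] = number + 1
        return number
```

Each class keeps its own counter, so the first name of every class receives the number 0, and a name already present under any class is rejected.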
@= trace trace_pr_token; trace_pr_ln (' is an integer entry-variable'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,do_insert); check_for_already_seen_function (fn_loc); fn_type[fn_loc] := int_entry_var;@/ fn_info[fn_loc] := num_ent_ints; {give this |int_entry_var| a number} incr(num_ent_ints); @ This module reads a |left_brace|, the list of |str_entry_var|s, and a |right_brace|. @= bst_get_and_check_left_brace ('entry'); eat_bst_white_and_eof_check ('entry'); while (scan_char <> right_brace) do begin bst_identifier_scan ('entry'); @; eat_bst_white_and_eof_check ('entry'); end; incr(buf_ptr2); {skip over the |right_brace|} @ Here we insert the just found |str_entry_var| name into the hash table, record it as a |str_entry_var|, set its pointer into |entry_strs|, and initialize its value there to the null string. @= trace trace_pr_token; trace_pr_ln (' is a string entry-variable'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,do_insert); check_for_already_seen_function (fn_loc); fn_type[fn_loc] := str_entry_var;@/ fn_info[fn_loc] := num_ent_strs; {give this |str_entry_var| a number} incr(num_ent_strs); @ An \.{execute} command has one argument, a single |built_in| or |wiz_defined| function name between braces. Upper/lower cases are considered to be the same---all upper-case letters are converted to lower case. Also, we must make sure we've already seen a \.{read} command. This module reads a |left_brace|, a single function to be executed, and a |right_brace|. 
@= procedure bst_execute_command; label exit; begin if (not read_seen) then bst_err ('illegal, execute command before read command'); eat_bst_white_and_eof_check ('execute'); bst_get_and_check_left_brace ('execute'); eat_bst_white_and_eof_check ('execute'); bst_identifier_scan ('execute'); @; eat_bst_white_and_eof_check ('execute'); bst_get_and_check_right_brace ('execute'); @; exit: end; @ Before executing the function, we must make sure it's a legal one. It must exist and be |built_in| or |wiz_defined|. @= trace trace_pr_token; trace_pr_ln (' is a to be executed function'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,dont_insert); if (not hash_found) then {unknown \.{.bst} function} begin print_token; bst_warn (' is an unknown function'); end else if ((fn_type[fn_loc] <> built_in) and (fn_type[fn_loc] <> wiz_defined)) then print_function_type_bst_bad; @ Where |print_function_type_bst_bad| tells what's wrong. @= procedure print_function_type_bst_bad; begin print_token; print (' has a bad function type---'); print_fn_class (fn_loc); bst_warn (' '); end; @ A \.{function} command has two arguments; the first is a |wiz_defined| function name between braces. Upper/lower cases are considered to be the same---all upper-case letters are converted to lower case. The second argument defines this function. It consists of a sequence of functions, between braces, separated by |white_space| characters. Upper/lower cases are considered to be the same for function names but not for |str_literal|s. @= procedure bst_function_command; label exit; begin eat_bst_white_and_eof_check ('function'); @; eat_bst_white_and_eof_check ('function'); bst_get_and_check_left_brace ('function'); scan_fn_def(wiz_loc); {this scans the function definition} exit: end; @ This module reads a |left_brace|, a |wiz_defined| function name, and a |right_brace|. 
@= bst_get_and_check_left_brace ('function'); eat_bst_white_and_eof_check ('function'); bst_identifier_scan ('function'); @; eat_bst_white_and_eof_check ('function'); bst_get_and_check_right_brace ('function'); @ The function name must exist and be a new one; we mark it as |wiz_defined|. Also, see if it's the default entry-type function. @= trace trace_pr_token; trace_pr_ln (' is a wizard-defined function'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} wiz_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,do_insert); check_for_already_seen_function (wiz_loc); fn_type[wiz_loc] := wiz_defined; if (hash_text[wiz_loc] = s_default) then {we've found the default entry-type} b_default := wiz_loc; {see the |built_in| functions for |b_default|} @ We're about to start scanning tokens in a function definition. When a function token is illegal, we skip until it ends; a |white_space| character, an end-of-line, a |right_brace|, or a |comment| marks the end of the current token. This macro is similar to |bst_warn|. @d next_token=25 {a bad function token; go read the next one} @# @d skip_token(#) == begin {not-so-serious error during \.{.bst} parsing} print (#); skip_token_print; {also, skip to the current token's end} goto next_token; end @= procedure skip_token_print; begin print ('---line ',bst_line_num:0,' of file '); print_bst_name; if (scan2_white(right_brace,comment)) then {ok if token ends line} do_nothing; end; @ This recursive function reads and stores the list of functions (separated by |white_space| characters or end-of-lines) that define this new function, and reads a |right_brace|. 
@= procedure scan_fn_def (@!fn_hash_loc : hash_loc); label next_token,exit; type @!fn_def_loc = 0..single_fn_space; {for a single |wiz_defined|-function} var singl_function : packed array[fn_def_loc] of hash_ptr2; @!single_ptr : fn_def_loc; {next storage location for this definition} @!copy_ptr : fn_def_loc; {dummy variable} @!end_of_num : buf_pointer; {the end of an implicit function's name} @!impl_fn_loc : hash_loc; {an implicit function's hash-table location} begin eat_bst_white_and_eof_check ('function'); single_ptr := 0; while (scan_char <> right_brace) do begin @; next_token: eat_bst_white_and_eof_check ('function'); end; @; incr(buf_ptr2); {skip over the |right_brace|} exit: end; @ @:BibTeX capacity exceeded}{\quad single function space@> This macro inserts a hash-table location (or one of the two special markers |quote_next_fn| and |end_of_def|) into the |singl_function| array, which will later be copied into the |wiz_functions| array. @d insert_fn_loc(#) == begin singl_function[single_ptr] := #; if (single_ptr = single_fn_space) then overflow('single function space ',single_fn_space); incr(single_ptr); end @ There are five possibilities for the first character of the token representing the next function of the definition: If it's a |number_sign|, the token is an |int_literal|; if it's a |double_quote|, the token is a |str_literal|; if it's a |single_quote|, the token is a quoted function; if it's a |left_brace|, the token isn't really a token, but rather the start of another function definition (which will result in a recursive call to |scan_fn_def|); if it's anything else, the token is the name of an already-defined function. Note: To prevent recursion, we have to check that neither a quoted function nor an already-defined-function is actually the currently-being-defined function (which is stored at |wiz_loc|).
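The five-way dispatch just described can be sketched in a few lines of Python. This is an illustration rather than the program's own code; it assumes that |number_sign|, |double_quote|, |single_quote|, and |left_brace| are the ASCII characters shown, and the function name \.{classify\_token} and the returned labels are hypothetical.

```python
def classify_token(first_char):
    """Map the first character of a token to the kind of function it introduces."""
    kinds = {
        "#": "int_literal",         # number_sign (assumed to be '#')
        '"': "str_literal",         # double_quote
        "'": "quoted_function",     # single_quote
        "{": "implicit_function",   # left_brace: start of a nested definition
    }
    # Anything else must name a function that has already been defined.
    return kinds.get(first_char, "already_defined_function")
```

For example, the token \.{\#1} would be classified as an integer literal, while \.{skip\$} would fall through to the last case.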
@= case scan_char of number_sign : @; double_quote : @; single_quote : @; left_brace : @; othercases @ endcases; @ An |int_literal| is preceded by a |number_sign|, consists of an integer (i.e., an optional |minus_sign| followed by one or more |numeric| characters), and is followed by a |white_space| character, an end-of-line, a |right_brace|, or a |comment|. The array |fn_info| contains the value of the integer for |int_literal|s. @= begin incr(buf_ptr2); {skip over the |number_sign|} if (not scan_integer) then skip_token ('illegal integer in integer literal'); trace trace_pr ('#'); trace_pr_token; trace_pr_ln (' is an integer literal with value ',token_value:0); ecart@/ literal_loc := str_lookup(buffer,buf_ptr1,token_len,integer_ilk,do_insert); if (not hash_found) then begin fn_type[literal_loc] := int_literal; {set the |fn_class|} fn_info[literal_loc] := token_value; {the value of this integer} end; if ((lex_class[scan_char]<>white_space) and (buf_ptr2<last) and (scan_char<>right_brace) and@| (scan_char<>comment)) then skip_token ('"',xchr[scan_char],'" can''t follow a literal'); insert_fn_loc (literal_loc); {add this function to |wiz_functions|} end @ A |str_literal| is preceded by a |double_quote| and consists of all characters on this line up to the next |double_quote|. Also, there must be either a |white_space| character, an end-of-line, a |right_brace|, or a |comment| following (since functions in the definition must be separated by |white_space|). The array |fn_info| contains nothing for |str_literal|s.
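As a rough illustration of the rule above, here is a minimal Python sketch of scanning one string literal on a single line. It assumes that |right_brace| is \.{\char'175} and that |comment| is the percent sign, and it treats any character Python considers whitespace as |white_space|; the name \.{scan\_str\_literal} is hypothetical.

```python
def scan_str_literal(line, pos):
    """Scan a string literal whose opening double quote is at line[pos].

    Returns (text, next_pos), where next_pos indexes the character after
    the closing quote. Raises ValueError for the two error cases.
    """
    if pos >= len(line) or line[pos] != '"':
        raise ValueError("not at a double quote")
    close = line.find('"', pos + 1)  # the literal must end on this line
    if close < 0:
        raise ValueError("no closing double quote to end string literal")
    text = line[pos + 1:close]
    nxt = close + 1
    # End-of-line, white space, a right brace, or a comment may follow.
    if nxt < len(line) and not line[nxt].isspace() and line[nxt] not in "}%":
        raise ValueError(f'"{line[nxt]}" can\'t follow a literal')
    return text, nxt
```

The sketch reports errors by raising exceptions, whereas the program prints a message and skips to the end of the offending token.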
@= begin incr(buf_ptr2); {skip over the |double_quote|} if (not scan1(double_quote)) then skip_token ('no `',xchr[double_quote],''' to end string literal'); trace trace_pr ('"'); trace_pr_token; trace_pr ('"'); trace_pr_ln (' is a string literal'); ecart@/ literal_loc := str_lookup(buffer,buf_ptr1,token_len,text_ilk,do_insert);@/ fn_type[literal_loc] := str_literal; {set the |fn_class|} incr(buf_ptr2); {skip over the |double_quote|} if ((lex_class[scan_char]<>white_space) and (buf_ptr2<last) and (scan_char<>right_brace) and@| (scan_char<>comment)) then skip_token ('"',xchr[scan_char],'" can''t follow a literal'); insert_fn_loc (literal_loc); {add this function to |wiz_functions|} end @ A quoted function is preceded by a |single_quote| and consists of all characters up to the next |white_space| character, end-of-line, |right_brace|, or |comment|. @= begin incr(buf_ptr2); {skip over the |single_quote|} if (scan2_white(right_brace,comment)) then {ok if token ends line} do_nothing; trace trace_pr (''''); trace_pr_token; trace_pr (' is a quoted function '); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,dont_insert); if (not hash_found) then {unknown \.{.bst} function} begin print_token; skip_token (' is an unknown function'); end else @; end @ Here we check that this quoted function is a legal one---the function name must already exist, but it mustn't be the currently-being-defined function (which is stored at |wiz_loc|).
@= if (fn_loc = wiz_loc) then begin trace trace_pr_newline; ecart@/ print ('curse you, wizard, before you recurse me: function '); print_token; print_ln (' is illegal'); skip_token ('in its own definition'); end else begin trace trace_pr ('of type '); trace_pr_fn_class (fn_loc); trace_pr_newline; ecart@/ insert_fn_loc (quote_next_fn); {add special marker together with} insert_fn_loc (fn_loc); {this function to |wiz_functions|} end @ @:this can't happen}{\quad already encountered implicit function@> This module marks the implicit function as being quoted, generates a name, and stores it in the hash table. This name is strictly internal to this program, starts with a |single_quote| (since that will make this function name unique), and ends with the variable |impl_fn_num| converted to ASCII. @= begin sv_buffer[0] := single_quote; int_to_ASCII (impl_fn_num,sv_buffer,1,end_of_num); impl_fn_loc := str_lookup(sv_buffer,0,end_of_num,bst_fn_ilk,do_insert); if (hash_found) then begin print ('this can''t happen---already encountered implicit function '); print_pool_str (hash_text[impl_fn_loc]); print_newline; end; trace trace_pr_pool_str (hash_text[impl_fn_loc]); trace_pr_ln (' is an implicit function'); ecart@/ incr(impl_fn_num); fn_type[impl_fn_loc] := wiz_defined;@/ insert_fn_loc (quote_next_fn); {all implicit functions are quoted} insert_fn_loc (impl_fn_loc); {add it to |wiz_functions|} incr(buf_ptr2); {skip over the |left_brace|} scan_fn_def (impl_fn_loc); {this is the recursive call} end @ The variable |impl_fn_num| counts the number of implicit functions seen in the \.{.bst} file. @= @!impl_fn_num : integer; {the number of implicit functions seen so far} @ Now we initialize it. @= impl_fn_num := 0; @ @:BibTeX capacity exceeded}{\quad buffer size@> This module appends a character to |int_buf| after checking to make sure it will fit; for use in |int_to_ASCII|. 
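To show where this check fits, here is a rough Python sketch of the conversion technique that |int_to_ASCII| relies on: digits are produced least significant first, each store is guarded by a bounds check, and the digits are then flipped in place. The sketch is an assumption-laden illustration, not a transcription; a Python list stands in for |int_buf|, and the names \.{int\_to\_ascii} and \.{append} are hypothetical.

```python
def int_to_ascii(value, buf, begin):
    """Write the decimal form of value into buf starting at index begin.

    Returns the index of the first unused location in buf.
    """
    ptr = begin

    def append(ch):
        nonlocal ptr
        if ptr == len(buf):  # check that the character will fit
            raise OverflowError("buffer size")
        buf[ptr] = ch
        ptr += 1

    if value < 0:  # emit the minus sign, then use the absolute value
        append("-")
        value = -value
    first_digit = ptr
    while True:  # digits come out in reverse order
        append(chr(ord("0") + value % 10))
        value //= 10
        if value == 0:
            break
    end = ptr
    lo, hi = first_digit, ptr - 1
    while lo < hi:  # flip the digits into reading order
        buf[lo], buf[hi] = buf[hi], buf[lo]
        lo += 1
        hi -= 1
    return end
```

Starting at index 1 leaves room for a leading |single_quote|, which is how the names of implicit functions are built.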
@d append_int_char(#) == begin if (int_ptr = buf_size) then overflow('buffer size ',buf_size); int_buf[int_ptr]:=#; incr(int_ptr); end @ This procedure takes the integer |int|, copies the appropriate |ASCII_code| string into |int_buf| starting at |int_begin|, and sets the |var| parameter |int_end| to the first unused |int_buf| location. The ASCII string will consist of decimal digits, the first of which will not be a~0 if the integer is nonzero, with a prepended minus sign if the integer is negative. @= procedure int_to_ASCII (@!int:integer; var int_buf:buf_type; @!int_begin:buf_pointer; var int_end:buf_pointer); var int_ptr,@!int_xptr : buf_pointer; {pointers into |int_buf|} @!int_tmp_val : ASCII_code; {the temporary element in an exchange} begin int_ptr := int_begin; if (int < 0) then {add the |minus_sign| and use the absolute value} begin append_int_char (minus_sign); int := -int; end; int_xptr := int_ptr; repeat {copy digits into |int_buf|} append_int_char ("0" + (int mod 10)); int := int div 10; until (int = 0); int_end := int_ptr; {set the string length} decr(int_ptr); while (int_xptr < int_ptr) do {and reorder (flip) the digits} begin int_tmp_val := int_buf[int_xptr]; int_buf[int_xptr] := int_buf[int_ptr]; int_buf[int_ptr] := int_tmp_val; decr(int_ptr); incr(int_xptr); end end; @ An already-defined function consists of all characters up to the next |white_space| character, end-of-line, |right_brace|, or |comment|. This function name must already exist, but it mustn't be the currently-being-defined function (which is stored at |wiz_loc|).
@= begin if (scan2_white(right_brace,comment)) then {ok if token ends line} do_nothing; trace trace_pr_token; trace_pr (' is a function '); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,dont_insert); if (not hash_found) then {unknown \.{.bst} function} begin print_token; skip_token (' is an unknown function'); end else if (fn_loc = wiz_loc) then begin trace trace_pr_newline; ecart@/ print ('curse you, wizard, before you recurse me: function '); print_token; print_ln (' is illegal'); skip_token ('in its own definition'); end else begin trace trace_pr ('of type '); trace_pr_fn_class (fn_loc); trace_pr_newline; ecart@/ insert_fn_loc (fn_loc); {add this function to |wiz_functions|} end; end @ @:BibTeX capacity exceeded}{\quad wizard-defined function space@> Now we add the |end_of_def| special marker, make sure this function will fit into |wiz_functions|, and put it there. @= insert_fn_loc (end_of_def); {add special marker ending the definition} if (single_ptr + wiz_def_ptr > wiz_fn_space) then begin print (single_ptr + wiz_def_ptr : 0,' '); overflow('wizard-defined function space ',wiz_fn_space); end; fn_info[fn_hash_loc] := wiz_def_ptr; {pointer into |wiz_functions|} copy_ptr := 0; while (copy_ptr < single_ptr) do {make this function official} begin wiz_functions[wiz_def_ptr] := singl_function[copy_ptr]; incr (copy_ptr); incr (wiz_def_ptr); end; @ An \.{integers} command has one argument, a list of function names between braces (the names are separated by one or more |white_space| characters). Upper/lower cases are considered to be the same for function names in these lists---all upper-case letters are converted to lower case. Each name in this list specifies an |int_global_var|. There may be several \.{integers} commands in the \.{.bst} file. This module reads a |left_brace|, a list of |int_global_var|s, and a |right_brace|. 
@= procedure bst_integers_command; label exit; begin eat_bst_white_and_eof_check ('integers'); bst_get_and_check_left_brace ('integers'); eat_bst_white_and_eof_check ('integers'); while (scan_char <> right_brace) do begin bst_identifier_scan ('integers'); @; eat_bst_white_and_eof_check ('integers'); end; incr(buf_ptr2); {skip over the |right_brace|} exit: end; @ Here we insert the just found |int_global_var| name into the hash table and record it as an |int_global_var|. Also, we initialize it by setting |fn_info[fn_loc]| to 0. @= trace trace_pr_token; trace_pr_ln (' is an integer global-variable'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,do_insert); check_for_already_seen_function (fn_loc); fn_type[fn_loc] := int_global_var;@/ fn_info[fn_loc] := 0; {initialize} @ An \.{iterate} command has one argument, a single |built_in| or |wiz_defined| function name between braces. Upper/lower cases are considered to be the same---all upper-case letters are converted to lower case. Also, we must make sure we've already seen a \.{read} command. This module reads a |left_brace|, a single function to be iterated, and a |right_brace|. @= procedure bst_iterate_command; label exit; begin if (not read_seen) then bst_err ('illegal, iterate command before read command'); eat_bst_white_and_eof_check ('iterate'); bst_get_and_check_left_brace ('iterate'); eat_bst_white_and_eof_check ('iterate'); bst_identifier_scan ('iterate'); @; eat_bst_white_and_eof_check ('iterate'); bst_get_and_check_right_brace ('iterate'); @; exit: end; @ Before iterating the function, we must make sure it's a legal one. It must exist and be |built_in| or |wiz_defined|. 
@= trace trace_pr_token; trace_pr_ln (' is a to be iterated function'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,dont_insert); if (not hash_found) then {unknown \.{.bst} function} begin print_token; bst_warn (' is an unknown function'); end else if ((fn_type[fn_loc] <> built_in) and (fn_type[fn_loc] <> wiz_defined)) then print_function_type_bst_bad; @ A \.{macro} command, like a \.{function} command, has two arguments; the first is a macro name between braces. The name must be a legal \.{.bst} identifier. Upper/lower cases are considered to be the same---all upper-case letters are converted to lower case. The second argument defines this macro. It consists of a |double_quote|-delimited string (which must be on a single line) between braces, with optional |white_space| characters between the braces and the |double_quote|s. This |double_quote|-delimited string is parsed exactly as a |str_literal| is for the \.{function} command. @= procedure bst_macro_command; label exit; begin if (read_seen) then bst_err ('illegal, macro command after read command'); eat_bst_white_and_eof_check ('macro'); @; eat_bst_white_and_eof_check ('macro'); @; exit: end; @ This module reads a |left_brace|, a macro name, and a |right_brace|. @= bst_get_and_check_left_brace ('macro'); eat_bst_white_and_eof_check ('macro'); bst_identifier_scan ('macro'); @; eat_bst_white_and_eof_check ('macro'); bst_get_and_check_right_brace ('macro'); @ The macro name must be a new one; we mark it as a |macro|. 
@= trace trace_pr_token; trace_pr_ln (' is a macro'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} macro_name_loc := str_lookup(buffer,buf_ptr1,token_len,macro_ilk,do_insert); check_for_already_seen_function (macro_name_loc); fn_type[macro_name_loc] := macro; fn_info[macro_name_loc] := macro_name_loc; {a default in case of error} @ This module reads a |left_brace|, the |double_quote|-delimited string that defines this macro, and a |right_brace|. @= bst_get_and_check_left_brace ('macro'); eat_bst_white_and_eof_check ('macro'); if (scan_char <> double_quote) then bst_err ('a macro definition must be ',xchr[double_quote],'-delimited'); @; eat_bst_white_and_eof_check ('macro'); bst_get_and_check_right_brace ('macro'); @ A macro definition-string is preceded by a |double_quote| and consists of all characters on this line up to the next |double_quote|. The array |fn_info| contains a pointer to this string for the macro name. @= incr(buf_ptr2); {skip over the |double_quote|} if (not scan1(double_quote)) then bst_err ('no `',xchr[double_quote],''' to end macro string literal'); trace trace_pr ('"'); trace_pr_token; trace_pr ('"'); trace_pr_ln (' is a macro string'); ecart@/ macro_def_loc := str_lookup(buffer,buf_ptr1,token_len,text_ilk,do_insert);@/ fn_type[macro_def_loc] := str_literal; {set the |fn_class|} fn_info[macro_name_loc] := macro_def_loc; incr(buf_ptr2); {skip over the |double_quote|} @ @↑gymnastics@> We need to include stuff for \.{.bib} reading here because that's done by the \.{read} command. @= @ @ The \.{read} command has no arguments so there's no more parsing to do. We must make sure we haven't seen a \.{read} command before and we've already seen an \.{entry} command. 
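The ordering constraints among the \.{.bst} commands, which this and the neighboring command procedures enforce with |entry_seen| and |read_seen|, can be summarized in one place. The Python sketch below is an illustration only: the class \.{BstState} and the function \.{check\_command} are hypothetical names, and returning a message stands in for the call to |bst_err|.

```python
class BstState:
    """Holds the two flags that record which commands have been seen."""

    def __init__(self):
        self.entry_seen = False
        self.read_seen = False


def check_command(state, command):
    """Return an error message, or None if the command is legal here."""
    if command == "entry":
        if state.entry_seen:
            return "illegal, another entry command"
        state.entry_seen = True
    elif command == "macro":
        if state.read_seen:
            return "illegal, macro command after read command"
    elif command == "read":
        if state.read_seen:
            return "illegal, another read command"
        state.read_seen = True  # set before the entry check, as in the source
        if not state.entry_seen:
            return "illegal, read command before entry command"
    elif command in ("execute", "iterate", "reverse", "sort"):
        if not state.read_seen:
            return f"illegal, {command} command before read command"
    # function, integers, and strings carry no ordering check of this kind
    return None
```

Note that a \.{read} command sets its flag even when it is rejected for preceding the \.{entry} command, so a later \.{execute} is not also reported as premature.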
@= procedure bst_read_command; label exit; begin if (read_seen) then bst_err ('illegal, another read command'); read_seen := true; {now we've seen a \.{read} command} if (not entry_seen) then bst_err ('illegal, read command before entry command'); read_performed := true; sv_ptr1 := buf_ptr2; {save the contents of the \.{.bst} input line} sv_ptr2 := last; tmp_ptr := sv_ptr1; while (tmp_ptr < sv_ptr2) do begin sv_buffer[tmp_ptr] := buffer[tmp_ptr]; incr(tmp_ptr); end; @; buf_ptr2 := sv_ptr1; {and restore} last := sv_ptr2; tmp_ptr := buf_ptr2; while (tmp_ptr < last) do begin buffer[tmp_ptr] := sv_buffer[tmp_ptr]; incr(tmp_ptr); end; exit: end; @ A \.{reverse} command has one argument, a single |built_in| or |wiz_defined| function name between braces. Upper/lower cases are considered to be the same---all upper-case letters are converted to lower case. Also, we must make sure we've already seen a \.{read} command. This module reads a |left_brace|, a single function to be iterated in reverse, and a |right_brace|. @= procedure bst_reverse_command; label exit; begin if (not read_seen) then bst_err ('illegal, reverse command before read command'); eat_bst_white_and_eof_check ('reverse'); bst_get_and_check_left_brace ('reverse'); eat_bst_white_and_eof_check ('reverse'); bst_identifier_scan ('reverse'); @; eat_bst_white_and_eof_check ('reverse'); bst_get_and_check_right_brace ('reverse'); @; exit: end; @ Before iterating the function in reverse, we must make sure it's a legal one. It must exist and be |built_in| or |wiz_defined|. 
@= trace trace_pr_token; trace_pr_ln (' is a to be iterated in reverse function'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,dont_insert); if (not hash_found) then {unknown \.{.bst} function} begin print_token; bst_warn (' is an unknown function'); end else if ((fn_type[fn_loc] <> built_in) and (fn_type[fn_loc] <> wiz_defined)) then print_function_type_bst_bad; @ The \.{sort} command has no arguments so there's no more parsing to do, but we must make sure we've already seen a \.{read} command. @= procedure bst_sort_command; label exit; begin if (not read_seen) then bst_err ('illegal, sort command before read command'); @; exit: end; @ A \.{strings} command has one argument, a list of function names between braces (the names are separated by one or more |white_space| characters). Upper/lower cases are considered to be the same for function names in these lists---all upper-case letters are converted to lower case. Each name in this list specifies a |str_global_var|. There may be several \.{strings} commands in the \.{.bst} file. This module reads a |left_brace|, a list of |str_global_var|s, and a |right_brace|. @= procedure bst_strings_command; label exit; begin eat_bst_white_and_eof_check ('strings'); bst_get_and_check_left_brace ('strings'); eat_bst_white_and_eof_check ('strings'); while (scan_char <> right_brace) do begin bst_identifier_scan ('strings'); @; eat_bst_white_and_eof_check ('strings'); end; incr(buf_ptr2); {skip over the |right_brace|} exit: end; @ @:BibTeX capacity exceeded}{\quad number of string global-variables@> Here we insert the just found |str_global_var| name into the hash table, record it as a |str_global_var|, set its pointer into |global_strs|, and initialize its value there to the null string. 
@d end_of_string = invalid_code {this illegal |ASCII_code| ends a string} @= trace trace_pr_token; trace_pr_ln (' is a string global-variable'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} fn_loc := str_lookup(buffer,buf_ptr1,token_len,bst_fn_ilk,do_insert); check_for_already_seen_function (fn_loc); fn_type[fn_loc] := str_global_var;@/ fn_info[fn_loc] := num_glb_strs; {pointer into |global_strs|} global_strs[num_glb_strs][0] := end_of_string; {make it the null string} if (num_glb_strs = max_glob_strs) then overflow('number of string global-variables ',max_glob_strs); incr (num_glb_strs); @ @↑gymnastics@> That's it for processing \.{.bst} commands, except for finishing the procedural gymnastics. Also, we need to include stuff for \.{.bib} reading because that's done by the \.{read} command. @= @ @* Reading the database file(s). This section reads the \.{.bib} file(s), each of which consists of a sequence of entries (perhaps with a few \.{.bib} commands thrown in, as explained later). Each entry consists of an |at_sign|, an entry type, and, between braces or parentheses and separated by |comma|s, a database key and a list of fields. Each field consists of a field name, an |equals_sign|, and either a nonnegative number, a macro name (like `jan'), or a brace-balanced string delimited by either |double_quote|s or braces. Finally, case differences are ignored for all but delimited strings and database keys, and |white_space| characters and end-of-lines may appear in all reasonable places (i.e., anywhere except within entry types, database keys, field names, and macro names); furthermore, comments may appear anywhere between entries (or before the first or after the last) as long as they contain no |at_sign|s (except when using the \.{comment} command---see its description). @ These global variables are used while reading the \.{.bib} file(s). 
The elements of |type_list| point either to a |hash_loc| or to one of two special markers: |empty|, from which |hash_base = empty + 1| was defined, means we haven't yet encountered the \.{.bib} entry corresponding to this cite key, and |undefined| means we've encountered it but it had an unknown entry type. Thus, the array |type_list| is of type |hash_ptr2|, also defined before. @d undefined = hash_max + 1 {a special marker used for |type_list|} @= @!bib_line_num : integer; {line number of the \.{.bib} file} @!entry_type_loc : hash_loc; {the hash-table location of an entry type} @!type_list : packed array[cite_number] of hash_ptr2; @!type_exists : boolean; {|true| if this entry type is \.{.bst}-defined} @!store_entry : boolean; {|true| if we're to store info for this entry} @!field_name_loc : hash_loc; {the hash-table location of a field name} @!field_val_loc : hash_loc; {the hash-table location of a field value} @!store_field : boolean; {|true| if we're to store info for this field} @!right_outer_delim : ASCII_code; {either a |right_brace| or a |right_paren|} @!right_str_delim : ASCII_code; {either a |right_brace| or a |double_quote|} @ When there's a serious error parsing a \.{.bib} file, we flush everything up to the beginning of the next entry. @d bib_err(#) == begin {serious error during \.{.bib} parsing} print (#); bib_err_print; return; end @= procedure bib_err_print; begin print ('---line ',bib_line_num:0,' of file '); print_bib_name; print_bad_input_line; end; @ When there's a harmless error parsing a \.{.bib} file, we just give a warning message. This is always called after other stuff has been printed out. @= procedure bib_warn_print; begin print ('--warning--line ',bib_line_num:0,' of file '); print_bib_name; end; @ For all |num_bib_files| database files, we keep reading and processing \.{.bib} entries until none left. 
@= @; bib_ptr := 0; while (bib_ptr < num_bib_files) do begin print ('Database file #',bib_ptr+1:0,': '); print_bib_name;@/ bib_line_num := 0; {initialize to get the first input line} buf_ptr2 := last; while (not eof(cur_bib_file)) do get_bib_command_or_entry_and_process; a_close (cur_bib_file); incr(bib_ptr); end; trace trace_pr_ln ('finished reading the database file(s)'); ecart@/ @; @ These are initialized here so that we can still try to read the database file(s) if there was an earlier error while parsing. @= @; @; @; @; @ @:BibTeX capacity exceeded}{\quad total number of fields@> This module initializes all fields of all entries to |missing|, the value to which all fields are initialized. @= if (num_fields*num_cites > max_fields) then begin print (num_fields*num_cites,' '); overflow('total number of fields ',max_fields); end; field_ptr := 0; while (field_ptr < num_fields*num_cites) do begin field_info[field_ptr] := missing; incr(field_ptr); end; @ @:BibTeX capacity exceeded}{\quad total number of integer entry-variables@> This module initializes all |int_entry_var|s of all entries to 0, the value to which all integers are initialized. @= if (num_ent_ints*num_cites > max_ent_ints) then begin print (num_ent_ints*num_cites,' '); overflow('total number of integer entry-variables ',max_ent_ints); end; int_ent_ptr := 0; while (int_ent_ptr < num_ent_ints*num_cites) do begin entry_ints[int_ent_ptr] := 0; incr(int_ent_ptr); end; @ @:BibTeX capacity exceeded}{\quad total number of string entry-variables@> This module initializes all |str_entry_var|s of all entries to the null string, the value to which all strings are initialized. Note: Both |num_ent_strs| and |num_cites| are positive. 
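The three initializations share one pattern: compute the product of the per-entry count and |num_cites|, refuse to continue if it exceeds the capacity, and otherwise fill that many slots with the initial value. A minimal Python sketch of the pattern follows; the function name \.{init\_per\_entry}, its parameters, and the use of \.{None} for |missing| are assumptions made for illustration, and an exception stands in for |overflow|.

```python
MISSING = None  # stand-in for the special value `missing` (assumption)


def init_per_entry(num_vars, num_cites, capacity, initial, what):
    """Return a flat list with one slot per variable per cited entry."""
    total = num_vars * num_cites
    if total > capacity:  # the check precedes any store
        raise OverflowError(f"{total} {what} (capacity {capacity})")
    return [initial] * total


# Example with made-up sizes: 3 names of each kind, 4 cited entries.
field_info = init_per_entry(3, 4, 100, MISSING, "total number of fields")
entry_ints = init_per_entry(3, 4, 100, 0, "total number of integer entry-variables")
entry_strs = init_per_entry(3, 4, 100, "", "total number of string entry-variables")
```

The sizes in the example are invented; the real limits are the compile-time constants |max_fields|, |max_ent_ints|, and |max_ent_strs|.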
@= if (num_ent_strs*num_cites > max_ent_strs) then begin print (num_ent_strs*num_cites,' '); overflow('total number of string entry-variables ',max_ent_strs); end; str_ent_ptr := 0; while (str_ent_ptr < num_ent_strs*num_cites) do begin entry_strs[str_ent_ptr][0] := end_of_string; incr(str_ent_ptr); end; @ The array |type_list| must be initialized so that we can detect duplicate (or missing) entries for cite keys on |cite_list|. Also, the array |sorted_cites| initially specifies that the entries are to be processed in order of cite-key occurrence. The \.{sort} command may change this to whatever it likes (which, we hope, is whatever the style-designer instructs it to like). @= cite_ptr := 0; while (cite_ptr < num_cites) do begin type_list[cite_ptr] := empty; sorted_cites[cite_ptr] := cite_ptr; incr(cite_ptr); end; @ Before we actually start the code for reading a database file, we must define this \.{.bib}-specific scanning function. It skips over |white_space| characters until hitting a nonwhite character or the end of the file, respectively returning |true| or |false|. It also updates |bib_line_num|, the line counter. @= function eat_bib_white_space : boolean; label exit; begin while (not scan_white_space) do {no characters left; read another line} begin if (not input_ln(cur_bib_file)) then {end-of-file; return |false|} begin eat_bib_white_space := false; return; end; incr(bib_line_num); buf_ptr2 := 0; end; eat_bib_white_space := true; exit: end; @ It's often illegal to end a \.{.bib} command in certain places, and this is where we come to check. @d eat_bib_white_and_eof_check == begin if (not eat_bib_white_space) then begin eat_bib_print; return; end; end @= procedure eat_bib_print; label exit; {so the call to |bib_err| works} begin bib_err ('illegal end of database file'); exit: end; @ And here are a bunch of error-message macros, each called more than once, that thus save space as implemented. 
This one is for when one of two possible characters is expected while scanning. @d bib_one_of_two_expected_err(#) == begin bib_one_of_two_print (#); return; end @= procedure bib_one_of_two_print (@!char1,@!char2:ASCII_code); label exit; {so the call to |bib_err| works} begin bib_err ('I was expecting a `',xchr[char1],''' or a `',xchr[char2],''''); exit: end; @ This one's for an expected |equals_sign|. @d bib_equals_sign_expected_err == begin bib_equals_sign_print; return; end @= procedure bib_equals_sign_print; label exit; {so the call to |bib_err| works} begin bib_err ('I was expecting an "',xchr[equals_sign],'"'); exit: end; @ This complains about unbalanced braces. @d bib_unbalanced_braces_err == begin bib_unbalanced_braces_print; return; end @= procedure bib_unbalanced_braces_print; label exit; {so the call to |bib_err| works} begin bib_err ('unbalanced braces'); exit: end; @ And this one about an overly exuberant field. @d bib_field_too_long_err == begin bib_field_too_long_print; return; end @= procedure bib_field_too_long_print; label exit; {so the call to |bib_err| works} begin bib_err ('your field is more than ',buf_size:0,' characters'); exit: end; @ @:this can't happen}{\quad identifier scanning error@> This macro is used to scan all \.{.bib} identifiers. The argument tells what was happening at the time. The associated procedure simply prints an error message. 
@d bib_identifier_scan_check(#) == begin if ((scan_result = white_adjacent) or (scan_result = specified_char_adjacent)) then do_nothing else begin bib_id_print; bib_err (#); end; end @= procedure bib_id_print; begin if (scan_result = id_null) then print ('"',xchr[scan_char],'" begins ') else if (scan_result = other_char_adjacent) then print ('"',xchr[scan_char],'" immediately follows ') else print ('this can''t happen---identifier scanning error, for '); end; @ This module either reads a database entry, whose three main components are an entry type, a database key, and a list of fields, or it reads a \.{.bib} command, whose structure is command dependent and explained later. @= procedure get_bib_command_or_entry_and_process; label exit; begin @; @; eat_bib_white_and_eof_check; @; eat_bib_white_and_eof_check; @; exit: end; @ This module skips over everything until hitting an |at_sign| or the end of the file. It also updates |bib_line_num|, the line counter. @= while (not scan1(at_sign)) do {no |at_sign|; get next line} begin if (not input_ln(cur_bib_file)) then {end-of-file} return; incr(bib_line_num); buf_ptr2 := 0; end; @ @:this can't happen}{\quad an at-sign disappeared@> This module reads an |at_sign| and an entry type (like `book' or `article') or a \.{.bib} command. If it's an entry type, it must be defined in the \.{.bst} file if this entry is to be included in the reference list. 
@= if (scan_char <> at_sign) then bib_err ('this can''t happen---an "',xchr[at_sign],'" disappeared'); incr(buf_ptr2); {skip over the |at_sign|} eat_bib_white_and_eof_check; scan_identifier (left_brace,left_paren); bib_identifier_scan_check ('an entry type'); trace trace_pr_token; trace_pr_ln (' is an entry type or a database-file command'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} command_num := ilk_info[ str_lookup(buffer,buf_ptr1,token_len,bib_command_ilk,dont_insert)]; if (hash_found) then @ else begin {process an entry type} entry_type_loc := str_lookup( buffer,buf_ptr1,token_len,bst_fn_ilk,dont_insert); if ((not hash_found) or (fn_type[entry_type_loc]<>wiz_defined)) then@/ type_exists := false {no such entry type defined in the \.{.bst} file} else type_exists := true; end; @ @:this can't happen}{\quad unknown database-file command@> Here we determine which \.{.bib} command we're about to process, then go to it. @= begin case command_num of n_bib_comment : @; n_bib_string : @; othercases bib_err ('this can''t happen---unknown database-file command') endcases; end @ The \.{comment} command is implemented for SCRIBE compatibility. It's not really needed because \BibTeX\ treats (flushes) everything not within an entry as a comment anyway. @= begin return; {flush comments} end @ The \.{string} command is implemented both for SCRIBE compatibility and for allowing a user: to override a \.{.bst}-file \.{macro} command, to define one that the \.{.bst} file doesn't, or to engage in good, wholesome, typing laziness. The \.{string} command does mostly the same thing as the \.{.bst}-file's \.{macro} command (but the syntax is different and the \.{string} command compresses |white_space|). In fact, later in this program, the term ``macro'' refers to either a \.{.bst} ``macro'' or a \.{.bib} ``string''. The syntax for a \.{string} command has either braces or parentheses as outer delimiters. 
Inside is the string's name (it must be a legal identifier, and case differences are ignored---all upper-case letters are converted to lower case), an |equals_sign|, and the string's definition enclosed in either braces or |double_quote|s. Of course |white_space| may occur in all the usual places. @= begin eat_bib_white_and_eof_check; @; eat_bib_white_and_eof_check; @; return; end @ This module reads a left outer-delimiter and a string name. @= if (scan_char = left_brace) then right_outer_delim := right_brace else if (scan_char = left_paren) then right_outer_delim := right_paren else bib_one_of_two_expected_err (left_brace,left_paren); incr(buf_ptr2); {skip over the left-delimiter} eat_bib_white_and_eof_check; scan_identifier (equals_sign,equals_sign); bib_identifier_scan_check ('a string name'); @; @ @↑commented-out code@> This module marks this string as a |macro|; the commented-out code will give a warning message when overwriting a previously defined |macro|. @= trace trace_pr_token; trace_pr_ln (' is a database-defined macro'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} macro_name_loc := str_lookup(buffer,buf_ptr1,token_len,macro_ilk,do_insert); fn_type[macro_name_loc] := macro;@/ fn_info[macro_name_loc] := macro_name_loc; {a default in case of error} @{ if (hash_found) then {already known \.{.bst} function} begin print ('I''m overwriting a string definition for "'); print_token; print_ln ('"'); bib_warn_print; end; @}@/ @ This module skips over the |equals_sign|, reads and stores the brace- or |double_quote|-delimited string that defines this macro (compressing |white_space|), and reads a |right_outer_delim|. 
@= if (scan_char <> equals_sign) then bib_equals_sign_expected_err; incr(buf_ptr2); {skip over the |equals_sign|} eat_bib_white_and_eof_check; @; eat_bib_white_and_eof_check; if (scan_char <> right_outer_delim) then bib_err ('missing "',xchr[right_outer_delim],'" in string command'); incr(buf_ptr2); {skip over the |right_outer_delim|} @ This module uses the code that is used to scan a brace-balanced string for an entry's field value; that is, the delimited string is treated as if it were an ordinary field value. @= store_field := true; if (scan_char = left_brace) then begin right_str_delim := right_brace; if (not scan_balanced_braces) then return; end else if (scan_char = double_quote) then begin right_str_delim := double_quote; if (not scan_balanced_braces) then return; end else bib_one_of_two_expected_err (left_brace,double_quote); fn_info[macro_name_loc] := field_val_loc; @ And now, back to processing an entry (rather than a command). This module reads a left outer-delimiter and a database key. @= if (scan_char = left_brace) then right_outer_delim := right_brace else if (scan_char = left_paren) then right_outer_delim := right_paren else bib_one_of_two_expected_err (left_brace,left_paren); incr(buf_ptr2); {skip over the left-delimiter} eat_bib_white_and_eof_check; if (right_outer_delim = right_paren) then {allow it in a database key} begin if (scan1_white(comma)) then {ok if database key ends line} do_nothing; end else if (scan2_white(comma,right_brace)) then {|right_brace=right_outer_delim|} do_nothing; @; @ @↑kludge@> The lower-case version of this database key must correspond to one in |cite_list| if this entry is to be included in the reference list. Accordingly, this module sets |store_entry|, which determines whether the relevant information for this entry is stored. The alias kludge helps make the stack space not overflow on some machines. 
@d sv_buf2 == ex_buf {an alias, used only in this module} @= trace trace_pr_token; trace_pr_ln (' is a database key'); ecart@/ tmp_ptr := buf_ptr1; while (tmp_ptr < buf_ptr2) do begin sv_buf2[tmp_ptr] := buffer[tmp_ptr]; incr(tmp_ptr); end; lower_case (sv_buf2, buf_ptr1, token_len); {convert to `canonical' form} lc_cite_loc := str_lookup(sv_buf2,buf_ptr1,token_len,lc_cite_ilk,dont_insert); if (not hash_found) then store_entry := false {no such cite key read on |cite_list|} else begin store_entry := true; @; end; @ @↑case mismatch errors@> @↑commented-out code@> We must give a warning if there's a case difference between the database key and the corresponding cite key (perhaps), if this entry type doesn't exist, or if there's already something stored for this entry. Also, we point the appropriate entry of |type_list| to the entry type just read above. The code to give a warning for a case mismatch between a cite key and a database key is commented-out here, partly for SCRIBE compatibility. Those systems without any SCRIBE users may want to uncomment it. (Note: Case mismatches between cite keys or between database keys still produce warnings). 
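As an illustrative aside (not part of the program), the bookkeeping on |type_list| can be sketched in Python. The function |record_entry| and the string markers are hypothetical; the program itself uses integer markers and hash-table locations. The order of the checks follows the module: first the duplicate test, then the test for a style-file-defined entry type.

```python
EMPTY = "empty"          # no database entry seen yet for this cite key
UNDEFINED = "undefined"  # entry seen, but its type is unknown to the style


def record_entry(type_list, cite_index, entry_type, known_types, warnings):
    """Point type_list[cite_index] at the entry type, warning as needed."""
    if type_list[cite_index] != EMPTY:
        warnings.append("duplicate database entry")
    if entry_type in known_types:
        type_list[cite_index] = entry_type
    else:
        type_list[cite_index] = UNDEFINED
        warnings.append("entry type is not style-file defined")


type_list = [EMPTY, EMPTY]
warnings = []
record_entry(type_list, 0, "book", {"book", "article"}, warnings)
record_entry(type_list, 0, "weird", {"book", "article"}, warnings)
```

After these two calls the first slot holds the |UNDEFINED| marker and both warnings have been recorded, while the second slot is still |EMPTY| and would later be reported as a cite key without a database entry.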
@= @{ dummy_loc := str_lookup(buffer,buf_ptr1,token_len,cite_ilk,dont_insert); if (not hash_found) then {give a warning if there is a case difference} begin print ('case mismatch between database key "'); print_token; print ('" and \cite key "'); print_pool_str (cite_list[ilk_info[lc_cite_loc]]); print_ln ('"'); bib_warn_print; end; @}@/ if (type_list[ilk_info[lc_cite_loc]] <> empty) then begin print ('database entry "'); print_token; print_ln ('" is a duplicate'); bib_warn_print; end; if (type_exists) then type_list[ilk_info[lc_cite_loc]] := entry_type_loc else begin type_list[ilk_info[lc_cite_loc]] := undefined; print ('the entry type for database entry "'); print_token; print_ln ('" is not style-file defined'); bib_warn_print; end; @ This module reads a |comma| and a field as many times as it can, and then reads a |right_outer_delim|, ending the current entry. @= while (scan_char <> right_outer_delim) do begin if (scan_char <> comma) then bib_one_of_two_expected_err (comma,right_outer_delim); incr(buf_ptr2); {skip over the |comma|} eat_bib_white_and_eof_check; @; eat_bib_white_and_eof_check; @; eat_bib_white_and_eof_check; @; end; incr(buf_ptr2); {skip over the |right_outer_delim|} @ This module reads a field name; its contents won't be stored unless it was declared in the \.{.bst} file and |store_entry = true|. 
@= scan_identifier (equals_sign,equals_sign); bib_identifier_scan_check ('a field name'); trace trace_pr_token; trace_pr_ln (' is a field name'); ecart@/ store_field := false; if (store_entry) then begin lower_case (buffer, buf_ptr1, token_len); {ignore case differences} field_name_loc := str_lookup( buffer,buf_ptr1,token_len,bst_fn_ilk,dont_insert); if ((hash_found) and (fn_type[field_name_loc]=field)) then@/ store_field := true; {field name was declared in the \.{.bst} file} end; eat_bib_white_and_eof_check; if (scan_char <> equals_sign) then bib_equals_sign_expected_err; incr(buf_ptr2); {skip over the |equals_sign|} @ There are four possibilities for the first character of the token representing the field value: If it's a |left_brace| or a |double_quote|, the token (with balanced braces, up to the matching |right_str_delim|) is a string; if it's |numeric|, the token is a number; if it's anything else, the token is a macro name (and should thus be defined by either the \.{.bst}-file's \.{macro} command or the \.{.bib}-file's \.{string} command). @= case scan_char of left_brace : begin right_str_delim := right_brace; if (not scan_balanced_braces) then return; end; double_quote : begin right_str_delim := double_quote; if (not scan_balanced_braces) then return; end; "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" : @; othercases @ endcases; @ Some of these local variables for the next procedure are also needed for |compress_bib_white|, so they're made global. @d brace_bal_str == ex_buf {an alias, used only in |scan_balanced_braces|} @= @!bib_brace_level : integer; {brace nesting depth (excluding |str_delim|s)} @!field_end : buf_pointer; {the end marker for the field-value string} @ @↑gymnastics@> Since |scan_balanced_braces| calls the yet-to-be-described |compress_bib_white|, we must perform some topological gymnastics. @= @ @ This \.{.bib}-specific function scans a string with balanced braces, stopping just past the matching |right_str_delim|.
How much work it does depends on whether |store_field = true|. It returns |false| if there was a serious syntax error. @= function scan_balanced_braces : boolean; label exit; begin scan_balanced_braces := false; {now it's easy to exit if necessary} incr(buf_ptr2); {skip over the left-delimiter} eat_bib_white_and_eof_check; {this removes leading |white_space|} bib_brace_level := 0; if (store_field) then @ else @; incr(buf_ptr2); {skip over the |right_str_delim|} scan_balanced_braces := true; exit: end; @ This module scans over a brace-balanced string without keeping track of anything else. It starts with |bib_brace_level = 0| and at a non|white_space| character. @= while (scan_char <> right_str_delim) do {we're at |bib_brace_level = 0|} if (scan_char = left_brace) then begin incr(bib_brace_level); incr(buf_ptr2); {skip over the |left_brace|} eat_bib_white_and_eof_check; while (bib_brace_level > 0) do @ 0|@>; end else if (scan_char = right_brace) then bib_unbalanced_braces_err else begin incr(buf_ptr2); {skip over some other character} if (not scan3 (right_str_delim, left_brace, right_brace)) then eat_bib_white_and_eof_check; end @ This module does the same as above but, because |bib_brace_level > 0|, it doesn't have to look for a |right_str_delim|. @ 0|@>= {top part of the |while| loop---we're always at a nonwhite character} if (scan_char = right_brace) then begin decr(bib_brace_level); incr(buf_ptr2); {skip over the |right_brace|} eat_bib_white_and_eof_check; end else if (scan_char = left_brace) then begin incr(bib_brace_level); incr(buf_ptr2); {skip over the |left_brace|} eat_bib_white_and_eof_check; end else begin incr(buf_ptr2); {skip over some other character} if (not scan2 (right_brace, left_brace)) then eat_bib_white_and_eof_check; end @ This module actually copies characters into |brace_bal_str|; since it's so low level, it's implemented as a macro. 
@d copy_char(#) == begin brace_bal_str[field_end] := #; incr(field_end); end @ This \.{.bib}-specific scanning function skips over |white_space| characters within an entry until hitting a nonwhite character; in fact, it does everything |eat_bib_white_space| does, but it also adds a |space| to |brace_bal_str| and checks that any new input line will also fit in |brace_bal_str| in the worst case when all its characters go into |brace_bal_str|---this worst case assumption could possibly result in a needless error message, but it saves a fair amount of time---besides, for reasonable settings of |buf_size|, this will almost never happen due to this (more likely, it will happen due to a syntax error). This procedure is never called if there are no |white_space| characters (or end-of-lines) to be scanned. It returns |false| if there is a serious syntax error. @d check_for_and_compress_bib_white_space == begin if ((lex_class[scan_char]=white_space) or (buf_ptr2=last)) then if (not compress_bib_white) then return; end @= function compress_bib_white : boolean; label exit; begin compress_bib_white := false; {now it's easy to exit if necessary} copy_char (space); while (not scan_white_space) do {no characters left; read another line} begin if (not input_ln(cur_bib_file)) then {end-of-file; complain} begin eat_bib_print; return; end; incr(bib_line_num); buf_ptr2 := 0; if (field_end+last > buf_size) then bib_field_too_long_err; end; compress_bib_white := true; exit: end; @ This module scans over a brace-balanced string, compressing multiple |white_space| characters into a single |space|, removing a trailing |space| if there is one (there can be at most one), and storing the corresponding text string in the hash table. It starts with |bib_brace_level = 0| and at a non|white_space| character. 
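As an illustrative aside (not part of the program), the scan just described can be sketched in Python. This is a simplification under stated assumptions: it works on one in-memory string rather than reading the \.{.bib} file line by line, it omits the |buf_size| check, and the name |scan_balanced| is hypothetical. It starts just past the left delimiter, drops leading white space, compresses each run of white space to one space, removes a single trailing space, and stops just past the matching right delimiter at brace level 0.

```python
def scan_balanced(text, start, right_delim):
    """Scan text from start (just past the left delimiter).

    Returns (value, next_pos), where next_pos is just past right_delim.
    Raises ValueError on unbalanced braces or if the text ends too soon.
    """
    out = []
    level = 0  # brace nesting depth, not counting the outer delimiters
    pos = start
    while pos < len(text):
        ch = text[pos]
        if level == 0 and ch == right_delim:
            value = "".join(out)
            if value.endswith(" "):  # there can be at most one
                value = value[:-1]
            return value, pos + 1
        if ch.isspace():
            # Skip leading white space; compress any later run to one space.
            if out and out[-1] != " ":
                out.append(" ")
        else:
            if ch == "{":
                level += 1
            elif ch == "}":
                if level == 0:
                    raise ValueError("unbalanced braces")
                level -= 1
            out.append(ch)
        pos += 1
    raise ValueError("illegal end of database file")


value, next_pos = scan_balanced("  The   {TeX}book  } rest", 0, "}")
```

Note that a double quote inside braces is copied like any other character, since the right delimiter is recognized only at brace level 0.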
@= begin field_end := 0; if (field_end+last-buf_ptr2 > buf_size) then bib_field_too_long_err; while (scan_char <> right_str_delim) do @; if (field_end > 1) then if (brace_bal_str[field_end-1] = space) then {remove trailing |space|} decr(field_end); field_val_loc := str_lookup(brace_bal_str,0,field_end,text_ilk,do_insert);@/ fn_type[field_val_loc] := str_literal; {set the |fn_class|} trace trace_pr ('"'); trace_pr_pool_str (hash_text[field_val_loc]); trace_pr_ln ('" is a brace-balanced string'); ecart@/ end @ This is a |while| loop body that stores characters in the buffer |brace_bal_str| while |bib_brace_level = 0|. @= if (scan_char = left_brace) then begin incr(bib_brace_level); copy_char (left_brace); incr(buf_ptr2); {skip over the |left_brace|} check_for_and_compress_bib_white_space; while (bib_brace_level > 0) do @ 0|@>; end else if (scan_char = right_brace) then bib_unbalanced_braces_err else while ((scan_char <> right_str_delim) and (scan_char <> left_brace) and (scan_char <> right_brace)) do begin copy_char (scan_char); incr(buf_ptr2); {skip over some other character} check_for_and_compress_bib_white_space; end @ This |while| loop body does the same thing while |bib_brace_level > 0|.
@ 0|@>= {top part of the |while| loop---we're always at a nonwhite character} if (scan_char = right_brace) then begin decr(bib_brace_level); copy_char (right_brace); incr(buf_ptr2); {skip over the |right_brace|} check_for_and_compress_bib_white_space; end else if (scan_char = left_brace) then begin incr(bib_brace_level); copy_char (left_brace); incr(buf_ptr2); {skip over the |left_brace|} check_for_and_compress_bib_white_space; end else while ((scan_char <> right_brace) and (scan_char <> left_brace)) do begin copy_char (scan_char); incr(buf_ptr2); {skip over some other character} check_for_and_compress_bib_white_space; end @ @:this can't happen}{\quad a digit disappeared@> This module scans a nonnegative number and stores the corresponding text string in the hash table if |store_field = true|. @= begin if (not scan_nonneg_integer) then print_ln ('this can''t happen---a digit disappeared'); if (store_field) then begin trace trace_pr_token; trace_pr_ln (' is a number'); ecart@/ field_val_loc := str_lookup(buffer,buf_ptr1,token_len,text_ilk,do_insert); fn_type[field_val_loc] := str_literal; {set the |fn_class|} end; end @ This module scans a macro name and finds it in the hash table if |store_field = true|. If the macro is undefined, this field value is not stored and the program complains. 
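As an illustrative aside (not part of the program), the macro lookup can be sketched in Python with a dictionary standing in for the hash table. The name |resolve_macro| and the use of |None| to mean that the field is not stored are assumptions of this sketch.

```python
def resolve_macro(name, macros, warnings):
    """Return the macro's definition, or None (with a warning) if undefined."""
    key = name.lower()  # case differences are ignored
    if key not in macros:
        warnings.append(f'the string name "{key}" is undefined')
        return None  # the caller should not store this field
    return macros[key]


macros = {"jan": "January", "tugboat": "TUGboat"}
warnings = []
month = resolve_macro("JAN", macros, warnings)
journal = resolve_macro("nosuch", macros, warnings)
```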
@= begin scan_identifier (comma,right_outer_delim); bib_identifier_scan_check ('a string name'); if (store_field) then begin trace trace_pr_token; trace_pr_ln (' is a macro'); ecart@/ lower_case (buffer, buf_ptr1, token_len); {ignore case differences} macro_name_loc := str_lookup( buffer,buf_ptr1,token_len,macro_ilk,dont_insert); if (not hash_found) then begin print ('the string name "'); print_token; print ('" for database entry "'); print_pool_str (cite_list[ilk_info[lc_cite_loc]]); print_ln ('" is undefined'); bib_warn_print; store_field := false; end else field_val_loc := fn_info[macro_name_loc]; {the field value is the \.{macro} definition} end; end @ When |store_field = true|, this module computes the offset into the |field_info| array and stores there a pointer to the field value. @= if (store_field) then begin field_ptr := ilk_info[lc_cite_loc] * num_fields + fn_info[field_name_loc]; if (field_info[field_ptr] <> missing) then begin print ('I''m overwriting a field for database entry "'); print_pool_str (cite_list[ilk_info[lc_cite_loc]]); print_ln ('"'); bib_warn_print; end; field_info[field_ptr] := field_val_loc; end; @ We must give a warning for each cite key without a corresponding database entry. @= cite_ptr := 0; while (cite_ptr < num_cites) do begin if (type_list[cite_ptr] = empty) then begin print ('I didn''t find a database entry for \cite key "'); print_pool_str (cur_cite_str); print_ln ('"--warning'); end; incr(cite_ptr); end; @* Executing the style file. This part of the program produces the output by executing the \.{.bst}-file commands \.{execute}, \.{iterate}, \.{reverse}, and \.{sort}. To do this it uses a stack (consisting of the two arrays |lit_stack| and |lit_stk_type|) for storing literals, a buffer |ex_buf| for manipulating strings, and an array |sorted_cites| containing pointers to the sorted cite keys. 
@= @!lit_stack : array[lit_stk_loc] of integer; {the literal function stack} @!lit_stk_type : array[lit_stk_loc] of stk_type; {their corresponding types} @!lit_stk_ptr : lit_stk_loc; {points just above the top of the stack} @!cmd_str_ptr : str_number; {stores value of |str_ptr| during execution} @!ent_chr_ptr : 0..ent_str_size; {points at a |str_entry_var| character} @!glob_chr_ptr : 0..glob_str_size; {points at a |str_global_var| character} @!ex_buf : buf_type; {a buffer for manipulating strings} @!ex_buf_ptr : buf_pointer; {general |ex_buf| location} @!ex_buf_length : buf_pointer; {the length of the current string in |ex_buf|} @!out_buf : buf_type; {the \.{.bbl} output buffer} @!out_buf_ptr : buf_pointer; {general |out_buf| location} @!out_buf_length : buf_pointer; {the length of the current string in |out_buf|} @!mess_with_entries : boolean; {|true| if functions can use entry info} @!sorted_cites : array[cite_number] of cite_number; {ptrs to sorted cite keys} @!sort_cite_ptr : cite_number; {a |for|-loop index for the sorted cite keys} @!sort_key_num : str_ent_loc; {index for the |str_entry_var| \.{sort.key\$}} @!brace_level : integer; {the brace nesting depth within a string} @ Where |lit_stk_loc| is a stack location, and where |stk_type| gives one of the three types of literals (an integer, a string, or a function) or a special marker. If a |lit_stk_type| element is a |stk_int| then the corresponding |lit_stack| element is an integer; if a |stk_str|, then a pointer to a |str_pool| string; and if a |stk_fn|, then a pointer to the function's hash-table location. However, if the literal should have been a |stk_str| that was the value of a field that happened to be |missing|, then the special value |stk_field_missing| goes on the stack instead; its corresponding |lit_stack| element is a pointer to the field-name's string. Finally, |stk_empty| is the type of a literal popped from an empty stack. 
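As an illustrative aside (not part of the program), the two parallel arrays can be sketched in Python. The class name |LiteralStack| is hypothetical, the overflow test is made before storing (a simplification), and the value 0 returned along with |STK_EMPTY| on underflow is an assumption of this sketch.

```python
STK_INT, STK_STR, STK_FN, STK_FIELD_MISSING, STK_EMPTY = range(5)


class LiteralStack:
    """A stack kept as two parallel lists: the literals and their types."""

    def __init__(self, size):
        self.size = size
        self.values = [0] * size
        self.types = [STK_EMPTY] * size
        self.ptr = 0  # points just above the top of the stack

    def push(self, value, lit_type):
        if self.ptr == self.size:
            raise OverflowError("literal-stack size exceeded")
        self.values[self.ptr] = value
        self.types[self.ptr] = lit_type
        self.ptr += 1

    def pop(self):
        if self.ptr == 0:
            return 0, STK_EMPTY  # popped from an empty stack
        self.ptr -= 1
        return self.values[self.ptr], self.types[self.ptr]


stack = LiteralStack(2)
stack.push(5, STK_INT)
stack.push(7, STK_STR)  # 7 stands for a string-pool pointer
top = stack.pop()
```

Returning a marker type on underflow, rather than stopping, lets the caller complain and carry on.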
@d stk_int = 0 {an integer literal} @d stk_str = 1 {a string literal} @d stk_fn = 2 {a function literal} @d stk_field_missing = 3 {a special marker: a field value was missing} @d stk_empty = 4 {another: the stack was empty when this was popped} @d last_lit_type = 4 {the same number as on the line above} @= @!lit_stk_loc = 0..lit_stk_size; {the stack range} @!stk_type = 0..last_lit_type; {the literal types} @ When there's an error while executing \.{.bst} functions, what we do depends on whether the function is messing with the entries. These warnings (i.e., from |bst_ex_warn|) are meant both for the user and for the style designer while debugging. @d bst_ex_warn(#) == begin {error while executing some function} print (#); bst_ex_warn_print; end @= procedure bst_ex_warn_print; begin if (mess_with_entries) then begin print (' for entry '); print_pool_str (cur_cite_str); end; print_ln ('--warning'); bst_warn (' while executing'); end; @ This module executes a single specified function once. It can't do anything with the entries. @= begin init_command_execution; mess_with_entries := false; execute_fn (fn_loc); check_command_execution; end @ This module iterates a single specified function for all entries specified by |cite_list|. @= begin init_command_execution; mess_with_entries := true; sort_cite_ptr := 0; while (sort_cite_ptr < num_cites) do begin cite_ptr := sorted_cites[sort_cite_ptr]; trace trace_pr_pool_str (hash_text[fn_loc]); trace_pr (' to be iterated on '); trace_pr_pool_str (cur_cite_str); trace_pr_newline; ecart@/ execute_fn (fn_loc); check_command_execution; incr(sort_cite_ptr); end; end @ This module iterates a single specified function for all entries specified by |cite_list|, but does it in reverse order. 
@= begin init_command_execution; mess_with_entries := true; if (num_cites > 0) then begin sort_cite_ptr := num_cites; repeat decr(sort_cite_ptr); cite_ptr := sorted_cites[sort_cite_ptr]; trace trace_pr_pool_str (hash_text[fn_loc]); trace_pr (' to be iterated in reverse on '); trace_pr_pool_str (cur_cite_str); trace_pr_newline; ecart@/ execute_fn (fn_loc); check_command_execution; until (sort_cite_ptr = 0); end; end @ This module sorts the entries based on \.{sort.key\$}. @= begin trace trace_pr_ln ('sorting the entries'); ecart@/ if (num_cites > 1) then quick_sort (0, num_cites-1); trace trace_pr_ln ('done sorting'); ecart@/ end @ These next three procedures (actually, two procedures and a function, but who's counting) are subroutines for |quick_sort|, which follows. The |swap| procedure exchanges the two elements pointed to by its arguments. @= procedure swap (@!swap1,@!swap2 : cite_number); var innocent_bystander : cite_number; {the temporary element in an exchange} begin innocent_bystander := sorted_cites[swap2]; sorted_cites[swap2] := sorted_cites[swap1]; sorted_cites[swap1] := innocent_bystander; end; @ The |cycle| procedure is similar, but it does a cyclic shift of the three elements to which its arguments point. @= procedure cycle (@!cycle1,@!cycle2,@!cycle3 : cite_number); var innocent_bystander : cite_number; {the temporary element in an exchange} begin innocent_bystander := sorted_cites[cycle3]; sorted_cites[cycle3] := sorted_cites[cycle2]; sorted_cites[cycle2] := sorted_cites[cycle1]; sorted_cites[cycle1] := innocent_bystander; end; @ @↑system-independent dependencies@> The function |less_than| compares the two \.{sort.key\$}s indirectly pointed to by its arguments and returns |true| if and only if the first argument's \.{sort.key\$} is lexicographically less than the second's (that is, alphabetically earlier). This function uses |ASCII_code|s to compare, so it is system-independent dependent. 
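As an illustrative aside (not part of the program), the comparison can be sketched in Python on strings that end with a sentinel character. The sentinel value and the helper |terminated| are assumptions of this sketch. Testing the second key's terminator first makes two equal keys compare as not less than.

```python
END_OF_STRING = "\x7f"  # assumed sentinel, for this sketch only


def terminated(text):
    """Append the sentinel, as the stored sort keys are assumed to have."""
    return text + END_OF_STRING


def less_than(key1, key2):
    """True if and only if key1 is lexicographically less than key2."""
    i = 0
    while True:
        c1, c2 = key1[i], key2[i]
        if c2 == END_OF_STRING:
            return False  # key2 ended first, or the keys are equal
        if c1 == END_OF_STRING:
            return True   # key1 is a proper prefix of key2
        if c2 < c1:
            return False
        if c1 < c2:
            return True
        i += 1


prefix_first = less_than(terminated("knuth"), terminated("knuth84"))
```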
@d compare_return(#) == begin {the compare is finished} less_than := #; return; end @= function less_than (@!arg1,@!arg2 : cite_number) : boolean; label exit; var char_ptr : 0..ent_str_size; {character index into compared strings} @!ptr1,@!ptr2 : str_ent_loc; {the two \.{sort.key\$} pointers} @!char1,@!char2 : ASCII_code; {the two characters being compared} begin ptr1 := arg1*num_ent_strs + sort_key_num; ptr2 := arg2*num_ent_strs + sort_key_num; char_ptr := 0; loop begin char1 := entry_strs[ptr1][char_ptr]; char2 := entry_strs[ptr2][char_ptr]; if (char2 = end_of_string) then compare_return (false) else if (char1 = end_of_string) then compare_return (true) else if (char2 < char1) then compare_return (false) else if (char1 < char2) then compare_return (true); incr(char_ptr); end; exit: end; @ The recursive procedure |quick_sort| sorts the entries indirectly pointed to by the |sorted_cites| elements between |left_end| and |right_end|, inclusive, based on the value of the |str_entry_var| \.{sort.key\$}. It's a fairly standard quicksort (for example, see Algorithm 5.2.2Q in {\sl The Art of Computer Programming\/}), but uses the median-of-three method to choose the partition element just in case the entries are already sorted (or nearly sorted---humans and ASCII might have different ideas on lexicographic ordering). This code generally prefers clarity to assembler-type execution-time efficiency since |cite_list|s will rarely be huge. The value |short_list|, which must be at least 2 for this code to work, tells us the list-length at which the list is small enough to warrant switching over to straight insertion sort from the recursive quicksort. The value here is just a guess at the optimal value.
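As an illustrative aside (not part of the program), the overall shape of such a sort can be sketched in Python: an index list is sorted indirectly by key, short ranges go to straight insertion sort, and longer ranges are partitioned around a median-of-three pivot. This is a simplification, not a transcription. The partition loop below is a common textbook variant rather than the one in this program, and all the Python names are hypothetical.

```python
SHORT_LIST = 10  # the same guess as in the text; must be at least 2


def insertion_sort(order, keys, lo, hi):
    """Sort order[lo..hi] (inclusive) by key, one insertion at a time."""
    for i in range(lo + 1, hi + 1):
        j = i
        while j > lo and keys[order[j]] < keys[order[j - 1]]:
            order[j], order[j - 1] = order[j - 1], order[j]
            j -= 1


def quick_sort(order, keys, lo, hi):
    """Sort order[lo..hi] (inclusive) so that keys[order[i]] ascends."""
    if hi - lo < SHORT_LIST:
        insertion_sort(order, keys, lo, hi)
        return
    mid = (lo + hi) // 2
    # Median of three: the middle key among the two ends and the midpoint.
    pivot = sorted(keys[order[i]] for i in (lo, mid, hi))[1]
    i, j = lo, hi
    while i <= j:
        while keys[order[i]] < pivot:
            i += 1
        while keys[order[j]] > pivot:
            j -= 1
        if i <= j:
            order[i], order[j] = order[j], order[i]
            i += 1
            j -= 1
    quick_sort(order, keys, lo, j)
    quick_sort(order, keys, i, hi)


keys = ["m", "c", "x", "a", "q", "b", "z", "k", "d", "w", "e", "p", "f"]
order = list(range(len(keys)))
quick_sort(order, keys, 0, len(keys) - 1)
```

Only |order| is permuted; |keys| stays where it is, just as the entries themselves are never moved.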
@d short_list = 10 {use straight insertion sort at or below this length} @# @d next_insert = 24 {now insert the next element} @d left_found = 42 {found a to-be-|swap|ped element during partitioning} @d right_found = 43 {found the other} @d partition_done = 34 {ready to recurse} @= procedure quick_sort (@!left_end,@!right_end : cite_number); label next_insert,@!left_found,@!right_found,@!partition_done; var left,@!right : cite_number; {two general |sorted_cites| pointers} @!insert_ptr : cite_number; {the to-be-(straight)-inserted element} @!middle : cite_number; {the |(left_end+right_end) div 2| element} @!partition : cite_number; {the median-of-three partition element} begin trace trace_pr_ln ('sorting ',left_end:0,' through ',right_end:0); ecart@/ if (right_end - left_end < short_list) then @ else begin @; @; end; end; @ This code sorts the entries between |left_end| and |right_end| when the difference is less than |short_list|. Each iteration of the outer loop inserts the element indicated by |insert_ptr| into its proper place among the (sorted) elements from |left_end| through |insert_ptr-1|. @= begin for insert_ptr := left_end+1 to right_end do begin for right := insert_ptr downto left_end+1 do begin if (not less_than (sorted_cites[right], sorted_cites[right-1])) then goto next_insert; swap (right, right-1); end; next_insert: end; end @ Now we find the median of the three \.{sort.key\$}s to which the three elements |sorted_cites[left_end]|, |sorted_cites[right_end]|, and |sorted_cites[(left_end+right_end) div 2]| point. This code merely determines which of the six possible permutations we're dealing with and permutes accordingly. The comments after the actions |swap|, |cycle|, and |do_nothing| give the known orderings of the corresponding elements of |sorted_cites| before the action.
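As an illustrative aside (not part of the program), the six-way decision can be sketched in Python and checked against every ordering of three keys. The name |order_three| and the nested helpers are hypothetical; the helper |cycle| performs the same cyclic shift as the procedure of that name above.

```python
def order_three(order, keys, left, middle, right):
    """Permute order[left], order[middle], order[right] so their keys ascend."""

    def less(i, j):
        return keys[order[i]] < keys[order[j]]

    def swap(i, j):
        order[i], order[j] = order[j], order[i]

    def cycle(i, j, k):
        # Slot k gets slot j's element, j gets i's, and i gets k's old one.
        order[i], order[j], order[k] = order[k], order[i], order[j]

    if less(middle, left):
        if less(right, left):
            if less(right, middle):
                swap(right, left)            # right < middle < left
            else:
                cycle(left, right, middle)   # middle <= right < left
        else:
            swap(middle, left)               # middle < left <= right
    elif less(right, left):
        cycle(left, middle, right)           # right < left <= middle
    elif less(right, middle):
        swap(right, middle)                  # left <= right < middle
    # otherwise left <= middle <= right already holds


keys = ["a", "b", "c"]
order = [2, 0, 1]  # keys at left, middle, right are "c", "a", "b"
order_three(order, keys, 0, 1, 2)
```

At most three comparisons and one |swap| or |cycle| are needed for any of the six orderings.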
@= middle := (left_end+right_end) div 2; if (less_than (sorted_cites[middle], sorted_cites[left_end])) then if (less_than (sorted_cites[right_end], sorted_cites[left_end])) then if (less_than (sorted_cites[right_end], sorted_cites[middle])) then swap(right_end,left_end) {|right_end < middle < left_end|} else cycle(left_end,right_end,middle) {|middle <= right_end < left_end|} else swap(middle,left_end) {|middle < left_end <= right_end|} else if (less_than (sorted_cites[right_end], sorted_cites[left_end])) then cycle(left_end,middle,right_end) {|right_end < left_end <= middle|} else if (less_than (sorted_cites[right_end], sorted_cites[middle])) then swap(right_end,middle) {|left_end <= right_end < middle|} else do_nothing; {|left_end <= middle <= right_end|} @ This module uses the median-of-three computed above to partition the elements into those at most and those at least the median (the set of elements equal to the median may itself be nontrivially partitioned). We assume two \.{sort.key\$}s will so rarely be equal that it's not worth trying to do anything fancy for equalities. 
@= partition := sorted_cites[middle]; left := left_end; right := right_end; loop begin loop {at the top of this loop we know |sorted_cites[left] <= partition|,} begin {|sorted_cites[right] >= partition|, and |left < right|} if (left+1 = right) then goto partition_done; incr(left); if (less_than (partition, sorted_cites[left])) then goto left_found; end; left_found: loop {at the top of this loop we know |sorted_cites[left] > partition|,} begin {|sorted_cites[right] >= partition|, and |left < right|} if (left+1 = right) then begin decr(left); {so now |sorted_cites[left] <= partition|} decr(right); {so now |sorted_cites[right] > partition|} goto partition_done; end; decr(right); if (less_than (sorted_cites[right], partition)) then goto right_found; end; right_found: {|sorted_cites[left]>partition|, |sorted_cites[right] @:BibTeX capacity exceeded}{\quad literal-stack size@> Ok, that's it for sorting; now we'll play with the literal stack. This procedure pushes a literal onto the stack, checking for stack overflow. 
@= procedure push_lit_stk (@!push_lt:integer; @!push_type:stk_type); trace var dum_ptr : lit_stk_loc; {used just as an index variable} ecart@/ begin lit_stack[lit_stk_ptr] := push_lt; lit_stk_type[lit_stk_ptr] := push_type; trace for dum_ptr := 0 to lit_stk_ptr do trace_pr (' '); trace_pr ('pushing '); case lit_stk_type[lit_stk_ptr] of stk_int : trace_pr_ln (lit_stack[lit_stk_ptr]:0); stk_str : begin trace_pr ('"'); trace_pr_pool_str (lit_stack[lit_stk_ptr]); trace_pr_ln ('"'); end; stk_fn : begin trace_pr ('`'); trace_pr_pool_str (hash_text[lit_stack[lit_stk_ptr]]); trace_pr_ln (''''); end; stk_field_missing : begin trace_pr ('missing field `'); trace_pr_pool_str (lit_stack[lit_stk_ptr]); trace_pr_ln (''''); end; stk_empty : trace_pr_ln ('a bad literal--popped from an empty stack'); othercases print_ln ('this can''t happen---unknown literal type') endcases; ecart@/ if (lit_stk_ptr = lit_stk_size) then overflow('literal-stack size ',lit_stk_size); incr(lit_stk_ptr); end; @ @:this can't happen}{\quad nontop top of string stack@> This procedure pops the stack, checking for, and trying to recover from, stack underflow. (Actually, this procedure is really a function, since it returns the two values through its |var| parameters.) Also, if the literal being popped is a |stk_str| that's been created during the execution of the current \.{.bst} command, pop it from |str_pool| as well (it will be the string corresponding to |str_ptr-1|). Note that when this happens, the string is no longer `officially' available so that it must be used before anything else is added to |str_pool|. 
@= procedure pop_lit_stk (var pop_lit:integer; var pop_type:stk_type); begin if (lit_stk_ptr = 0) then begin bst_ex_warn ('you can''t pop an empty literal stack');@/ pop_type := stk_empty; {this is an error recovery attempt} end else begin decr(lit_stk_ptr); pop_lit := lit_stack[lit_stk_ptr]; pop_type := lit_stk_type[lit_stk_ptr]; if (pop_type = stk_str) then if (pop_lit >= cmd_str_ptr) then begin if (pop_lit <> str_ptr-1) then print_ln ('this can''t happen---nontop top of string stack'); flush_string; end; end; end; @ @:this can't happen}{\quad unknown literal type@> Occasionally we'll want to know what's on the literal stack. Here we print out a stack literal, giving its type. @= procedure print_stk_lit (@!stk_lt:integer; @!stk_tp:stk_type); begin print (' : '); case stk_tp of stk_int : print (stk_lt:0,' is an integer literal'); stk_str : begin print ('"'); print_pool_str (stk_lt); print ('" is a string literal'); end; stk_fn : begin print ('`'); print_pool_str (hash_text[stk_lt]); print (''' is a function literal'); end; stk_field_missing : begin print ('`'); print_pool_str (stk_lt); print (''' is a missing field'); end; stk_empty : print ('empty stack when popping this--bad literal'); othercases print_ln ('this can''t happen---unknown literal type') endcases; end; @ @:this can't happen}{\quad an illegal literal type@> @:this can't happen}{\quad unknown literal type@> This procedure appropriately chastises the style designer. 
@= procedure print_wrong_stk_lit (@!stk_lt:integer; @!stk_tp1,@!stk_tp2:stk_type); begin print_stk_lit (stk_lt, stk_tp1); case stk_tp2 of stk_int : print_ln (', not an integer'); stk_str : print_ln (', not a string'); stk_fn : print_ln (', not a function'); stk_field_missing, stk_empty : print_ln ('---this can''t happen---an illegal literal type'); othercases print_ln ('---this can''t happen---unknown literal type') endcases; bst_ex_warn (' '); end; @ @:this can't happen}{\quad unknown literal type@> This is similar to |print_stk_lit|, but here we don't give the literal's type. @= procedure print_lit (@!stk_lt:integer; @!stk_tp:stk_type); begin print (' : '); case stk_tp of stk_int : print_ln (stk_lt:0); stk_str : begin print_pool_str (stk_lt); print_newline; end; stk_fn : begin print_pool_str (hash_text[stk_lt]); print_newline; end; stk_field_missing : begin print_pool_str (stk_lt); print_newline; end; stk_empty : print_ln ('the stack was empty when this was popped'); othercases print_ln ('this can''t happen---unknown literal type') endcases; end; @ This procedure pops and prints the top of the stack. @= procedure pop_top_and_print; var stk_lt : integer; @!stk_tp : stk_type; begin pop_lit_stk (stk_lt,stk_tp); print_lit (stk_lt,stk_tp); end; @ This procedure pops and prints the whole stack. @= procedure pop_whole_stack; begin while (lit_stk_ptr > 0) do pop_top_and_print; end; @ At the beginning of a \.{.bst}-command execution we make the stack empty and record how much of |str_pool| has been used. @= procedure init_command_execution; begin lit_stk_ptr := 0; {make the stack empty} cmd_str_ptr := str_ptr; {we'll check this when we finish command execution} end; @ @:this can't happen}{\quad nonempty empty string stack@> At the end of a \.{.bst} command-execution we check that the stack and |str_pool| are still in good shape. 
@= procedure check_command_execution; begin if (lit_stk_ptr<>0) then begin print_ln (lit_stk_ptr:0,'=ptr---the literal stack isn''t empty---'); pop_whole_stack; bst_ex_warn (' '); end; if (cmd_str_ptr<>str_ptr) then begin print_ln ('this can''t happen---nonempty empty string stack---'); print_ln ('pointer is ',str_ptr:0,' but should be ',cmd_str_ptr:0); bst_ex_warn (' '); end; end; @ This procedure adds to |str_pool| the string from |ex_buf[0]| through |ex_buf[ex_buf_length-1]| if it will fit. It assumes the global variable |ex_buf_length| gives the length of the current string in |ex_buf|. It then pushes this string onto the literal stack. @= procedure add_pool_buf_and_push; begin str_room (ex_buf_length); {make sure this string will fit} ex_buf_ptr := 0; while (ex_buf_ptr < ex_buf_length) do begin append_char (ex_buf[ex_buf_ptr]); incr (ex_buf_ptr); end; make_string; {make it an official string} push_lit_stk (str_ptr-1, stk_str); {and push it onto the stack} end; @ @:BibTeX capacity exceeded}{\quad execution buffer size@> These macros append a character to |ex_buf|. Which is called depends on whether the character is known to fit. @d append_ex_buf_char(#) == begin ex_buf[ex_buf_ptr] := #; incr(ex_buf_ptr); end @# @d append_ex_buf_char_and_check(#) == begin if (ex_buf_ptr = buf_size) then overflow('execution buffer size ',buf_size); ex_buf[ex_buf_ptr] := #; incr(ex_buf_ptr); end @ @:BibTeX capacity exceeded}{\quad execution buffer size@> This procedure does the reverse---it adds to the execution buffer the given string in |str_pool| if it will fit. It assumes the global variable |ex_buf_length| gives the length of the current string in |ex_buf|, and thus also gives the location of the next character. 
@= procedure add_buf_pool (@!p_str : str_number); begin p_ptr1 := str_start[p_str]; p_ptr2 := str_start[p_str+1]; if (ex_buf_length+(p_ptr2-p_ptr1) > buf_size) then overflow('execution buffer size ',buf_size); ex_buf_ptr := ex_buf_length; while (p_ptr1 < p_ptr2) do begin {copy characters into the buffer} append_ex_buf_char (str_pool[p_ptr1]); incr(p_ptr1); end; ex_buf_length := ex_buf_ptr; end; @ This procedure actually writes onto the \.{.bbl}~file a line of output (the characters from |out_buf[0]| to |out_buf[out_buf_length-1]|, after removing trailing |white_space| characters). It also updates |bbl_line_num|, the line counter. It writes a blank line if and only if |out_buf| is empty. The program uses this procedure in such a way that |out_buf| will be nonempty if there have been characters put in it since the most recent \.{newline\$}. @= procedure output_bbl_line; label loop_exit, exit; begin if (out_buf_length <> 0) then {the buffer's not empty} begin while (out_buf_length > 0) do {remove trailing |white_space|} if (lex_class[out_buf[out_buf_length-1]] = white_space) then decr(out_buf_length) else goto loop_exit; loop_exit: if (out_buf_length = 0) then {ignore a line of just |white_space|} return; out_buf_ptr := 0; while (out_buf_ptr < out_buf_length) do begin write (bbl_file, xchr[out_buf[out_buf_ptr]]); incr(out_buf_ptr); end; end; write_ln (bbl_file); incr (bbl_line_num); {update line number} out_buf_length := 0; {make the next line empty} exit: end; @ @:BibTeX capacity exceeded}{\quad output buffer size@> This procedure adds to the output buffer the given string in |str_pool|. It assumes the global variable |out_buf_length| gives the length of the current string in |out_buf|, and thus also gives the location for the next character. If there are enough characters present in the output buffer, it writes one or more lines out to the \.{.bbl} file. 
It may break a line at any |white_space| character it likes, but if it does, it will add two |space|s to the next output line. @= procedure add_out_pool (@!p_str : str_number); var break_ptr : buf_pointer; {the first character following the line break} @!end_ptr : buf_pointer; {temporary end-of-buffer pointer} begin p_ptr1 := str_start[p_str]; p_ptr2 := str_start[p_str+1]; if (out_buf_length+(p_ptr2-p_ptr1) > buf_size) then overflow('output buffer size ',buf_size); out_buf_ptr := out_buf_length; while (p_ptr1 < p_ptr2) do begin {copy characters into the buffer} out_buf[out_buf_ptr] := str_pool[p_ptr1]; incr(p_ptr1); incr(out_buf_ptr); end; out_buf_length := out_buf_ptr; while (out_buf_length > max_print_line) do @ end; @ Here we break the line by looking for a |white_space| character, backward from |out_buf[max_print_line]| until |out_buf[min_print_line]|; if there isn't one, we break the line just before |out_buf[max_print_line]| and complain. @= begin end_ptr := out_buf_length; out_buf_ptr := max_print_line; while ((lex_class[out_buf[out_buf_ptr]] <> white_space) and (out_buf_ptr >= min_print_line)) do decr (out_buf_ptr); if (out_buf_ptr = min_print_line-1) then {no |white_space| character} begin print ('unbreakable output line---line ',bbl_line_num:0,' of file '); print_pool_str (top_lev_str); print_pool_str (s_bbl_extension); print_newline; bst_ex_warn (' this error is'); out_buf_length := max_print_line; break_ptr := out_buf_length; end else begin {hit a |white_space| character} out_buf_length := out_buf_ptr; break_ptr := out_buf_length + 1; end; output_bbl_line; {output what we can,} out_buf[0] := space; {add two |space|s to start the next line,} out_buf[1] := space; {and slide the rest down} out_buf_ptr := 2; while (out_buf_ptr <= end_ptr-break_ptr+1) do begin out_buf[out_buf_ptr] := out_buf[out_buf_ptr+break_ptr-2]; incr(out_buf_ptr); end; out_buf_length := end_ptr - break_ptr + 2; end @ @↑Tuesdays@> @↑windows@> @:this can't happen}{\quad unknown function 
class@> This procedure executes a single specified function; it is the single execution-primitive that does everything (except windows, and it takes Tuesdays off). @<|execute_fn| itself@>= procedure execute_fn (@!ex_fn_loc : hash_loc); @ @!wiz_ptr : wiz_fn_loc; {general |wiz_functions| location} begin trace trace_pr ('execute_fn `'); trace_pr_pool_str (hash_text[ex_fn_loc]); trace_pr_ln (''''); ecart@/ case fn_type[ex_fn_loc] of built_in : @; macro : push_lit_stk (hash_text[fn_info[ex_fn_loc]], stk_str); wiz_defined : @; int_literal : push_lit_stk (fn_info[ex_fn_loc], stk_int); str_literal : push_lit_stk (hash_text[ex_fn_loc], stk_str); field : @; int_entry_var : @; str_entry_var : @; int_global_var : push_lit_stk (fn_info[ex_fn_loc], stk_int); str_global_var : @; othercases print_ln ('this can''t happen---unknown function class') endcases; end; @ To execute a |wiz_defined| function, we just execute all those functions in its definition, except that the special marker |quote_next_fn| means we push the next function onto the stack. @= begin wiz_ptr := fn_info[ex_fn_loc]; while (wiz_functions[wiz_ptr] <> end_of_def) do begin if (wiz_functions[wiz_ptr] <> quote_next_fn) then execute_fn (wiz_functions[wiz_ptr]) else begin incr(wiz_ptr); push_lit_stk (wiz_functions[wiz_ptr], stk_fn); end; incr(wiz_ptr); end; end @ This module pushes the string given by the field onto the literal stack unless it's |missing|, in which case it pushes a special value onto the stack. @= begin if (not mess_with_entries) then bst_ex_warn ('you can''t mess with entries here') else if (field_info[cite_ptr*num_fields+fn_info[ex_fn_loc]] = missing) then push_lit_stk (hash_text[ex_fn_loc], stk_field_missing) else push_lit_stk (hash_text[field_info[cite_ptr*num_fields+fn_info[ex_fn_loc]]], stk_str); end @ This module pushes the integer given by an |int_entry_var| onto the literal stack. 
@= begin if (not mess_with_entries) then bst_ex_warn ('you can''t mess with entries here') else push_lit_stk (entry_ints[cite_ptr*num_ent_ints+fn_info[ex_fn_loc]], stk_int); end @ This module adds the string given by a |str_entry_var| to |str_pool| via the execution buffer and pushes it onto the literal stack. @= begin if (not mess_with_entries) then bst_ex_warn ('you can''t mess with entries here') else begin str_ent_ptr := cite_ptr*num_ent_strs + fn_info[ex_fn_loc];@/ ex_buf_ptr := 0; {also serves as |ent_chr_ptr|} while (entry_strs[str_ent_ptr][ex_buf_ptr] <> end_of_string) do {copy characters into the buffer} append_ex_buf_char (entry_strs[str_ent_ptr][ex_buf_ptr]); ex_buf_length := ex_buf_ptr; add_pool_buf_and_push; {push this string onto the stack} end; end @ This module adds the string given by a |str_global_var| to |str_pool| via the execution buffer and pushes it onto the literal stack. @= begin str_glb_ptr := fn_info[ex_fn_loc]; ex_buf_ptr := 0; {also serves as |glob_chr_ptr|} while (global_strs[str_glb_ptr][ex_buf_ptr] <> end_of_string) do {copy characters into the buffer} append_ex_buf_char (global_strs[str_glb_ptr][ex_buf_ptr]); ex_buf_length := ex_buf_ptr; add_pool_buf_and_push; {push this string onto the stack} end @* The built-in functions. This section gives all the code for all the |built_in| functions (including pre-defined |str_entry_var|s, although those aren't classified as |built_in|). To modify or add one, we needn't go anywhere else (with one exception: The constant |max_pop|, which gives the maximum number of literals that any of these functions pops off the stack, is defined earlier because it's needed earlier; thus, if we need to update it, which will happen if some new |built_in| function uses more than |max_pop| literals from the stack, we'll have to go outside this section).
These variables all begin with |b_| and specify the hash-table locations of the |built_in| functions, except that |b_default| is pseudo-|built_in|---either it will point to the no-op \.{skip\$} or to the \.{.bst}-defined function \.{default.type}; it's used when an entry has a type that's not defined in the \.{.bst} file. @= @!b_equals : hash_loc; {\.{=}} @!b_greater_than : hash_loc; {\.{>}} @!b_less_than : hash_loc; {\.{<}} @!b_plus : hash_loc; {\.{+} (this may be changed to an |a_minus|)} @!b_minus : hash_loc; {\.{-}} @!b_concatenate : hash_loc; {\.{*}} @!b_gets : hash_loc; {\.{:=} (formerly, |b_gat|)} @!b_add_period : hash_loc; {\.{add.period\$}} @!b_call_type : hash_loc; {\.{call.type\$}} @!b_change_case : hash_loc; {\.{change.case\$}} @!b_chr_to_int : hash_loc; {\.{chr.to.int\$}} @!b_cite : hash_loc; {\.{cite\$}} @!b_duplicate : hash_loc; {\.{duplicate\$}} @!b_format_name : hash_loc; {\.{format.name\$}} @!b_if : hash_loc; {\.{if\$}} @!b_int_to_chr : hash_loc; {\.{int.to.chr\$}} @!b_int_to_str : hash_loc; {\.{int.to.str\$}} @!b_missing : hash_loc; {\.{missing\$}} @!b_newline : hash_loc; {\.{newline\$}} @!b_num_names : hash_loc; {\.{num.names\$}} @!b_pop : hash_loc; {\.{pop\$}} @!b_purify : hash_loc; {\.{purify\$}} @!b_quote : hash_loc; {\.{quote\$}} @!b_skip : hash_loc; {\.{skip\$}} @!b_stack : hash_loc; {\.{stack\$}} @!b_substring : hash_loc; {\.{substring\$}} @!b_swap : hash_loc; {\.{swap\$}} @!b_top_stack : hash_loc; {\.{top\$}} @!b_type : hash_loc; {\.{type\$}} @!b_while : hash_loc; {\.{while\$}} @!b_width : hash_loc; {\.{width\$}} @!b_write : hash_loc; {\.{write\$}} @!b_default : hash_loc; {either \.{skip\$} or \.{default.type}} @# stat @!blt_in_loc : array[blt_in_range] of hash_loc; {for execution counts} @!execution_count : array[blt_in_range] of integer; {the same} @!blt_in_ptr : blt_in_range; {a pointer into |blt_in_loc|} tats@/ @ Where |blt_in_range| gives the legal |built_in| function numbers. 
@= @!blt_in_range = 0..num_blt_in_fns; @ These constants all begin with |n_| and are used for the |case| statement that determines which |built_in| function to execute. @d n_equals = 0 {\.{=}} @d n_greater_than = 1 {\.{>}} @d n_less_than = 2 {\.{<}} @d n_plus = 3 {\.{+}} @d n_minus = 4 {\.{-}} @d n_concatenate = 5 {\.{*}} @d n_gets = 6 {\.{:=}} @d n_add_period = 7 {\.{add.period\$}} @d n_call_type = 8 {\.{call.type\$}} @d n_change_case = 9 {\.{change.case\$}} @d n_chr_to_int = 10 {\.{chr.to.int\$}} @d n_cite = 11 {\.{cite\$} (this may start a riot)} @d n_duplicate = 12 {\.{duplicate\$}} @d n_format_name = 13 {\.{format.name\$}} @d n_if = 14 {\.{if\$}} @d n_int_to_chr = 15 {\.{int.to.chr\$}} @d n_int_to_str = 16 {\.{int.to.str\$}} @d n_missing = 17 {\.{missing\$}} @d n_newline = 18 {\.{newline\$}} @d n_num_names = 19 {\.{num.names\$}} @d n_pop = 20 {\.{pop\$}} @d n_purify = 21 {\.{purify\$}} @d n_quote = 22 {\.{quote\$}} @d n_skip = 23 {\.{skip\$}} @d n_stack = 24 {\.{stack\$}} @d n_substring = 25 {\.{substring\$}} @d n_swap = 26 {\.{swap\$}} @d n_top_stack = 27 {\.{top\$}} @d n_type = 28 {\.{type\$}} @d n_while = 29 {\.{while\$}} @d n_width = 30 {\.{width\$}} @d n_write = 31 {\.{write\$}} @= @!num_blt_in_fns = 32; {one more than the previous number} @ @↑important note@> It's time for us to insert more pre-defined strings into |str_pool| (and thus the hash table) and to insert the |built_in| functions into the hash table. The strings corresponding to these functions should contain no upper-case letters. The |build_in| routine (to appear shortly) does the work. Important note: These pre-definitions must not have any glitches or the program may bomb because the |log_file| hasn't been opened yet. 
@= {these pre-defined strings must all be exactly |longest_pds| long}
build_in('=           ',1,b_equals,n_equals);
build_in('>           ',1,b_greater_than,n_greater_than);
build_in('<           ',1,b_less_than,n_less_than);
build_in('+           ',1,b_plus,n_plus);
build_in('-           ',1,b_minus,n_minus);
build_in('*           ',1,b_concatenate,n_concatenate);
build_in(':=          ',2,b_gets,n_gets);
build_in('add.period$ ',11,b_add_period,n_add_period);
build_in('call.type$  ',10,b_call_type,n_call_type);
build_in('change.case$',12,b_change_case,n_change_case);
build_in('chr.to.int$ ',11,b_chr_to_int,n_chr_to_int);
build_in('cite$       ',5,b_cite,n_cite);
build_in('duplicate$  ',10,b_duplicate,n_duplicate);
build_in('format.name$',12,b_format_name,n_format_name);
build_in('if$         ',3,b_if,n_if);
build_in('int.to.chr$ ',11,b_int_to_chr,n_int_to_chr);
build_in('int.to.str$ ',11,b_int_to_str,n_int_to_str);
build_in('missing$    ',8,b_missing,n_missing);
build_in('newline$    ',8,b_newline,n_newline);
build_in('num.names$  ',10,b_num_names,n_num_names);
build_in('pop$        ',4,b_pop,n_pop);
build_in('purify$     ',7,b_purify,n_purify);
build_in('quote$      ',6,b_quote,n_quote);
build_in('skip$       ',5,b_skip,n_skip);
build_in('stack$      ',6,b_stack,n_stack);
build_in('substring$  ',10,b_substring,n_substring);
build_in('swap$       ',5,b_swap,n_swap);
build_in('top$        ',4,b_top_stack,n_top_stack);
build_in('type$       ',5,b_type,n_type);
build_in('while$      ',6,b_while,n_while);
build_in('width$      ',6,b_width,n_width);
build_in('write$      ',6,b_write,n_write);

@ This procedure inserts a |built_in| function into the hash table and initializes the corresponding pre-defined string (of length at most |longest_pds|). The array |fn_info| contains a number from 0 through the number of |built_in| functions minus 1 (i.e., |num_blt_in_fns - 1| if we're keeping statistics); this number is used by a |case| statement to execute this function and is used for keeping execution counts when keeping statistics.
@= procedure build_in (@!pds:pds_type; @!len:pds_len; var fn_hash_loc:hash_loc; @!blt_in_num:blt_in_range); begin pre_define (pds,len,bst_fn_ilk);@/ fn_hash_loc := pre_def_loc; {the |pre_define| routine sets |pre_def_loc|} fn_type[fn_hash_loc] := built_in; fn_info[fn_hash_loc] := blt_in_num; stat blt_in_loc[blt_in_num] := fn_hash_loc;@/ execution_count[blt_in_num] := 0; {initialize the function-execution count} tats@/ end; @ These variables all begin with |s_| and specify the locations in |str_pool| for certain often-used strings that the \.{.bst} commands need. @= @!s_null : str_number; {the null string} @!s_default : str_number; {\.{default.type}; for unknown entry types} @!s_ul : str_number; {\.{ul}; for |first_upper| case conversion} @!s_ll : str_number; {\.{ll}; for |all_lowers| case conversion} @!s_uu : str_number; {\.{uu}; for |all_uppers| case conversion} @!s_lu : str_number; {\.{lu}; for |first_lower| case conversion} @!s_period : str_number; {\.{.}; for adding a period} @ Now we pre-define any built-in |str_entry_var|s, whose argument strings must all be exactly |longest_pds| long. Note that although these are built-in functions, we classify them as |str_entry_var|s. We also pre-define the null string, which is sometimes pushed onto the stack, and a string used for default entry types. |text_ilk|s should be pre-defined here, not earlier, for \.{.bst}-function-execution purposes. 
@= pre_define('sort.key$   ',9,bst_fn_ilk);
fn_type[pre_def_loc] := str_entry_var;
fn_info[pre_def_loc] := num_ent_strs; {give this |str_entry_var| a number}
sort_key_num := num_ent_strs; {and remember it for sorting purposes}
incr(num_ent_strs);
pre_define('            ',0,text_ilk);
s_null := hash_text[pre_def_loc];
fn_type[pre_def_loc] := str_literal;
pre_define('default.type',12,text_ilk);
s_default := hash_text[pre_def_loc];
fn_type[pre_def_loc] := str_literal;@/
b_default := b_skip; {this may be changed to the \.{default.type} function}
pre_define('ul          ',2,text_ilk);
s_ul := hash_text[pre_def_loc];
fn_type[pre_def_loc] := str_literal;
pre_define('ll          ',2,text_ilk);
s_ll := hash_text[pre_def_loc];
fn_type[pre_def_loc] := str_literal;
pre_define('uu          ',2,text_ilk);
s_uu := hash_text[pre_def_loc];
fn_type[pre_def_loc] := str_literal;
pre_define('lu          ',2,text_ilk);
s_lu := hash_text[pre_def_loc];
fn_type[pre_def_loc] := str_literal;
pre_define('.           ',1,text_ilk);
s_period := hash_text[pre_def_loc];
fn_type[pre_def_loc] := str_literal;

@ @:this can't happen}{\quad unknown built-in function@> This module branches to the code for the appropriate |built_in| function. Only three do a recursive call.
@= begin stat {update this function's execution count} incr(execution_count[fn_info[ex_fn_loc]]); tats@/ case fn_info[ex_fn_loc] of n_equals : x_equals; n_greater_than : x_greater_than; n_less_than : x_less_than; n_plus : x_plus; n_minus : x_minus; n_concatenate : x_concatenate; n_gets : x_gets; n_add_period : x_add_period; n_call_type : @<|execute_fn|({\.{call.type\$}})@>; n_change_case : x_change_case; n_chr_to_int : x_chr_to_int; n_cite : x_cite; n_duplicate : x_duplicate; n_format_name : x_format_name; n_if : @<|execute_fn|({\.{if\$}})@>; n_int_to_chr : x_int_to_chr; n_int_to_str : x_int_to_str; n_missing : x_missing; n_newline : @<|execute_fn|({\.{newline\$}})@>; n_num_names : x_num_names; n_pop : @<|execute_fn|({\.{pop\$}})@>; n_purify : x_purify; n_quote : x_quote; n_skip : @<|execute_fn|({\.{skip\$}})@>; n_stack : @<|execute_fn|({\.{stack\$}})@>; n_substring : x_substring; n_swap : x_swap; n_top_stack : @<|execute_fn|({\.{top\$}})@>; n_type : x_type; n_while : @<|execute_fn|({\.{while\$}})@>; n_width : x_width; n_write : x_write; othercases print_ln ('this can''t happen---unknown built-in function') endcases; end @ @↑gymnastics@> This extra level of module-pointing allows a uniformity of module names for the |built_in| functions, regardless of whether they do a recursive call to |execute_fn| or are trivial (a single statement). Those that do a recursive call are left as part of |execute_fn|, avoiding \PASCAL's forward procedure mechanism, and those that don't (except for the single-statement ones) are made into procedures so that |execute_fn| doesn't get too large. 
@= @<|execute_fn|({\.{=}})@>@; @<|execute_fn|({\.{>}})@>@; @<|execute_fn|({\.{<}})@>@; @<|execute_fn|({\.{+}})@>@; @<|execute_fn|({\.{-}})@>@; @<|execute_fn|({\.{*}})@>@; @<|execute_fn|({\.{:=}})@>@; @<|execute_fn|({\.{add.period\$}})@>@; @<|execute_fn|({\.{change.case\$}})@>@; @<|execute_fn|({\.{chr.to.int\$}})@>@; @<|execute_fn|({\.{cite\$}})@>@; @<|execute_fn|({\.{duplicate\$}})@>@; @<|execute_fn|({\.{format.name\$}})@>@; @<|execute_fn|({\.{int.to.chr\$}})@>@; @<|execute_fn|({\.{int.to.str\$}})@>@; @<|execute_fn|({\.{missing\$}})@>@; @<|execute_fn|({\.{num.names\$}})@>@; @<|execute_fn|({\.{purify\$}})@>@; @<|execute_fn|({\.{quote\$}})@>@; @<|execute_fn|({\.{substring\$}})@>@; @<|execute_fn|({\.{swap\$}})@>@; @<|execute_fn|({\.{type\$}})@>@; @<|execute_fn|({\.{width\$}})@>@; @<|execute_fn|({\.{write\$}})@>@; @<|execute_fn| itself@> @ Now it's time to declare some things for executing |built_in| functions only. The variables here (and only these) are used recursively, so they can't be global. @d end_while = 51 {stop executing the \.{while\$} function} @= label end_while; var r_pop_lt1,@!r_pop_lt2 : integer; {stack literals for \.{while\$}} @!r_pop_tp1,@!r_pop_tp2 : stk_type; {stack types for \.{while\$}} @ These are nonrecursive variables that |execute_fn| uses. Declaring them here (instead of in the previous module) saves execution time and stack space on most machines. 
@d name_buf == sv_buffer {an alias, a buffer for manipulating names} @= @!pop_lit1,@!pop_lit2,@!pop_lit3 : integer; {stack literals} @!pop_typ1,@!pop_typ2,@!pop_typ3 : stk_type; {stack types} @!sp_ptr : pool_pointer; {for manipulating |str_pool| strings} @!sp_xptr1,@!sp_xptr2 : pool_pointer; {more of the same} @!sp_end : pool_pointer; {marks the end of a |str_pool| string} @!sp_brace_level : integer; {for scanning |str_pool| strings} @!ex_buf_xptr : buf_pointer; {an xtra |ex_buf| location} @!preceding_white : boolean; {used in scanning strings} @!and_found : boolean; {to stop the loop that looks for an ``and''} @!num_names : integer; {for counting names} @!name_bf_ptr : buf_pointer; {general |name_buf| location} @!name_bf_xptr : buf_pointer; {another} @!nm_brace_level : integer; {for scanning |name_buf| strings} @!name_tok : packed array[buf_pointer] of buf_pointer; {name-token ptr list} @!num_tokens : buf_pointer; {this counts name tokens} @!token_starting : boolean; {used in scanning name tokens} @!alpha_found : boolean; {used in scanning the format string} @!double_letter,@!end_of_group,@!to_be_written : boolean; {the same} @!first_start : buf_pointer; {start-ptr into |name_tok| for the first name} @!first_end : buf_pointer; {end-ptr into |name_tok| for the first name} @!last_end : buf_pointer; {end-ptr into |name_tok| for the last name} @!von_start : buf_pointer; {start-ptr into |name_tok| for the von name} @!von_end : buf_pointer; {end-ptr into |name_tok| for the von name} @!jr_end : buf_pointer; {end-ptr into |name_tok| for the jr name} @!cur_token,@!last_token : buf_pointer; {|name_tok| ptrs for outputting tokens} @!use_default : boolean; {for the inter-token intra-name part string} @!num_commas : buf_pointer; {used to determine the name syntax} @!comma1,@!comma2 : buf_pointer; {ptrs into |name_tok|} @ The |built_in| function {\.{=}} pops the top two (integer or string) literals, compares them, and pushes the integer 1 if they're equal, 0 otherwise. 
If they aren't both strings or both integers, it complains and pushes the integer 0. @<|execute_fn|({\.{=}})@>= procedure x_equals; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> pop_typ2) then begin print_stk_lit (pop_lit1,pop_typ1); print_stk_lit (pop_lit2,pop_typ2); print_newline; bst_ex_warn ('--they aren''t the same literal types'); push_lit_stk (0, stk_int); end else if ((pop_typ1 <> stk_int) and (pop_typ1 <> stk_str)) then begin print_stk_lit (pop_lit1,pop_typ1); print_ln (', not an integer or a string'); bst_ex_warn (' '); push_lit_stk (0, stk_int); end else if (pop_typ1 = stk_int) then if (pop_lit2 = pop_lit1) then push_lit_stk (1, stk_int) else push_lit_stk (0, stk_int) else if (str_eq_str (pop_lit2,pop_lit1)) then push_lit_stk (1, stk_int) else push_lit_stk (0, stk_int); end; @ The |built_in| function {\.{>}} pops the top two (integer) literals, compares them, and pushes the integer 1 if the second is greater than the first, 0 otherwise. If either isn't an integer literal, it complains and pushes the integer 0. @<|execute_fn|({\.{>}})@>= procedure x_greater_than; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); push_lit_stk (0, stk_int); end else if (pop_typ2 <> stk_int) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_int); push_lit_stk (0, stk_int); end else if (pop_lit2 > pop_lit1) then push_lit_stk (1, stk_int) else push_lit_stk (0, stk_int); end; @ The |built_in| function {\.{<}} pops the top two (integer) literals, compares them, and pushes the integer 1 if the second is less than the first, 0 otherwise. If either isn't an integer literal, it complains and pushes the integer 0.
@<|execute_fn|({\.{<}})@>= procedure x_less_than; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); push_lit_stk (0, stk_int); end else if (pop_typ2 <> stk_int) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_int); push_lit_stk (0, stk_int); end else if (pop_lit2 < pop_lit1) then push_lit_stk (1, stk_int) else push_lit_stk (0, stk_int); end; @ The |built_in| function {\.{+}} pops the top two (integer) literals and pushes their sum. If either isn't an integer literal, it complains and pushes the integer 0. @<|execute_fn|({\.{+}})@>= procedure x_plus; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); push_lit_stk (0, stk_int); end else if (pop_typ2 <> stk_int) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_int); push_lit_stk (0, stk_int); end else push_lit_stk (pop_lit2+pop_lit1, stk_int); end; @ The |built_in| function {\.{-}} pops the top two (integer) literals and pushes their difference (the first subtracted from the second). If either isn't an integer literal, it complains and pushes the integer 0. @<|execute_fn|({\.{-}})@>= procedure x_minus; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); push_lit_stk (0, stk_int); end else if (pop_typ2 <> stk_int) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_int); push_lit_stk (0, stk_int); end else push_lit_stk (pop_lit2-pop_lit1, stk_int); end; @ The |built_in| function {\.{*}} pops the top two (string) literals, concatenates them (in reverse order, that is, the order in which pushed), and pushes the resulting string back onto the stack. If either isn't a string literal, it complains and pushes the null string. 
@<|execute_fn|({\.{*}})@>= procedure x_concatenate; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (s_null, stk_str); end else if (pop_typ2 <> stk_str) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_str); push_lit_stk (s_null, stk_str); end else begin ex_buf_length := 0; add_buf_pool (pop_lit2); add_buf_pool (pop_lit1);@/ add_pool_buf_and_push; {push this string onto the stack} end; end; @ The |built_in| function {\.{:=}} pops the top two literals and assigns to the second (which must be an |int_entry_var|, a |str_entry_var|, an |int_global_var|, or a |str_global_var|) the value of the first; it complains if the value isn't of the appropriate type. @<|execute_fn|({\.{:=}})@>= procedure x_gets; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ2 <> stk_fn) then print_wrong_stk_lit (pop_lit2,pop_typ2,stk_fn) else if ((not mess_with_entries) and ((fn_type[pop_lit2] = str_entry_var) or (fn_type[pop_lit2] = int_entry_var))) then bst_ex_warn ('you can''t mess with entries here') else case fn_type[pop_lit2] of int_entry_var : @; str_entry_var : @; int_global_var : @; str_global_var : @; othercases begin print ('you can''t assign to type '); print_fn_class (pop_lit2); print_ln (', a nonvariable function class'); bst_ex_warn (' '); end endcases; end; @ This module checks that what we're about to assign is really an integer, and then assigns. @= if (pop_typ1 <> stk_int) then print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int) else entry_ints[cite_ptr*num_ent_ints+fn_info[pop_lit2]] := pop_lit1 @ This module checks that what we're about to assign is really a string, and then assigns. 
@= if (pop_typ1 <> stk_str) then print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str) else begin str_ent_ptr := cite_ptr*num_ent_strs + fn_info[pop_lit2]; ent_chr_ptr := 0; sp_ptr := str_start[pop_lit1]; sp_xptr1 := str_start[pop_lit1+1]; if (sp_xptr1-sp_ptr > ent_str_size) then begin bst_ex_warn ('you''ve exceeded entry-string-size ',ent_str_size:0); sp_xptr1 := sp_ptr + ent_str_size; end; while (sp_ptr < sp_xptr1) do begin {copy characters into |entry_strs|} entry_strs[str_ent_ptr][ent_chr_ptr] := str_pool[sp_ptr]; incr(ent_chr_ptr); incr(sp_ptr); end; entry_strs[str_ent_ptr][ent_chr_ptr] := end_of_string; end @ This module checks that what we're about to assign is really an integer, and then assigns. @= if (pop_typ1 <> stk_int) then print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int) else fn_info[pop_lit2] := pop_lit1 @ This module checks that what we're about to assign is really a string, and then assigns. @= if (pop_typ1 <> stk_str) then print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str) else begin str_glb_ptr := fn_info[pop_lit2]; glob_chr_ptr := 0; sp_ptr := str_start[pop_lit1]; sp_xptr1 := str_start[pop_lit1+1]; if (sp_xptr1-sp_ptr > glob_str_size) then begin bst_ex_warn ('you''ve exceeded global-string-size ',glob_str_size:0); sp_xptr1 := sp_ptr + glob_str_size; end; while (sp_ptr < sp_xptr1) do begin {copy characters into |global_strs|} global_strs[str_glb_ptr][glob_chr_ptr] := str_pool[sp_ptr]; incr(glob_chr_ptr); incr(sp_ptr); end; global_strs[str_glb_ptr][glob_chr_ptr] := end_of_string; end @ The |built_in| function {\.{add.period\$}} pops the top (string) literal, adds a |period| to it if the last non-|right_brace| character isn't a |period|, |question_mark|, or |exclamation_mark|, and pushes this resulting string back onto the stack. If the literal isn't a string, it complains and pushes the null string. 
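As a hedged illustration of the rule just stated for {\.{add.period\$}}, here is a small Python model. It follows the prose description rather than transcribing the Pascal, so edge cases may differ from the program; the function name `add_period` is hypothetical, and the type check and stack handling are left out.

```python
def add_period(s):
    """Append a period unless the last non-right-brace character
    is already a period, question mark, or exclamation mark."""
    stripped = s.rstrip("}")          # ignore trailing right braces
    if stripped and stripped[-1] in ".?!":
        return s                      # already ends a sentence
    return s + "."                    # the period goes after any braces
```

For example, `add_period("Title")` gives `"Title."`, while `add_period("{\\em Done.}")` is returned unchanged because the last character before the closing brace is a period.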
@<|execute_fn|({\.{add.period\$}})@>= procedure x_add_period; label loop_exit; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (s_null, stk_str); end else begin ex_buf_length := 0; add_buf_pool (pop_lit1); @; add_pool_buf_and_push; {push this string onto the stack} end; end; @ Here we scan backwards from the end of the string, skipping |right_brace| characters, to find the last non-|right_brace| character. @= ex_buf_ptr := ex_buf_length; while (ex_buf_ptr > 0) do begin decr(ex_buf_ptr); if (ex_buf[ex_buf_ptr] <> right_brace) then goto loop_exit; end; loop_exit: if (ex_buf_ptr = 0) then add_buf_pool (s_period) else case ex_buf[ex_buf_ptr] of period, question_mark, exclamation_mark : do_nothing; othercases add_buf_pool (s_period) endcases; @ The |built_in| function {\.{call.type\$}} executes the function specified in |type_list| for this entry unless it's |undefined|, in which case it executes the default function \.{default.type} defined in the \.{.bst} file, or unless it's |empty|, in which case it does nothing. @<|execute_fn|({\.{call.type\$}})@>= begin if (not mess_with_entries) then bst_ex_warn ('you can''t mess with entries here') else if (type_list[cite_ptr] = undefined) then execute_fn (b_default) else if (type_list[cite_ptr] = empty) then do_nothing else execute_fn (type_list[cite_ptr]); end @ The |built_in| function {\.{change.case\$}} pops the top two (string) literals; it changes the case of the second according to the specifications of the first, as follows. (Note: The word `letters' in the next sentence refers only to those at brace-level~0, the top-most brace level; no other characters are changed.)
If the first literal is the string \.{ul}, it converts all letters to lower case except the very first character in the string, which it converts to upper case; if it's the string \.{uu}, it converts all letters to upper case; if it's the string \.{ll}, it converts all letters to lower case; if it's the string \.{lu}, it converts all letters to upper case except the very first character in the string, which it converts to lower case; and if it's anything else, it complains and does no conversion. It then pushes this resulting string. If either type is incorrect, it complains and pushes the null string; however, if both types are correct but the specification string (i.e., the first string) isn't one of the legal ones, it merely pushes the second back onto the stack, after complaining. (Another note: It ignores case differences in the specification string; for example, the strings \.{uL} and \.{ul} are equivalent for the purposes of this |built_in| function.) @<|execute_fn|({\.{change.case\$}})@>= procedure x_change_case; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (s_null, stk_str); end else if (pop_typ2 <> stk_str) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_str); push_lit_stk (s_null, stk_str); end else begin ex_buf_length := 0; add_buf_pool (pop_lit1); @; ex_buf_length := 0; add_buf_pool (pop_lit2); @; add_pool_buf_and_push; {push this string onto the stack} end; end; @ First we define a variable to be used in |case| statements. These are in order of probable frequency. 
@d first_upper = 0 {representing the string \.{ul}} @d all_lowers = 1 {representing the string \.{ll}} @d all_uppers = 2 {representing the string \.{uu}} @d first_lower = 3 {representing the string \.{lu}} @d bad_conversion = 4 {representing any illegal case-conversion string} @= @!conversion_type : 0..bad_conversion; {the possible cases} @ Now we determine which of the four case-conversion types we're dealing with: \.{ul}, \.{ll}, \.{uu}, or \.{lu}. @= lower_case (ex_buf, 0, length(pop_lit1)); {ignore case differences} if (str_eq_buf (s_ul, ex_buf, 0, length(pop_lit1))) then conversion_type := first_upper else if (str_eq_buf (s_ll, ex_buf, 0, length(pop_lit1))) then conversion_type := all_lowers else if (str_eq_buf (s_uu, ex_buf, 0, length(pop_lit1))) then conversion_type := all_uppers else if (str_eq_buf (s_lu, ex_buf, 0, length(pop_lit1))) then conversion_type := first_lower else begin conversion_type := bad_conversion; print_pool_str (pop_lit1); bst_ex_warn (' is an illegal case-conversion string'); end; @ This procedure, used in name scanning, decrements |brace_level| and gives an error message if it has just become negative. @= procedure decr_brace_level (@!pop_lit_var : str_number); begin decr(brace_level); if (brace_level = -1) then begin print_pool_str (pop_lit_var); print_ln (' is not a brace-balanced string'); bst_ex_warn (' '); brace_level := 0; {try to recover} end; end; @ This one, also used in name scanning, makes sure that |brace_level=0| (it's called at a point in a string where braces must be balanced). @= procedure check_brace_level (@!pop_lit_var : str_number); begin if (brace_level > 0) then begin print_pool_str (pop_lit_var); print_ln (' is not a brace-balanced string'); bst_ex_warn (' '); end; end; @ Here's where we actually go through the string and do the case conversion. 
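The case-conversion pass can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the program's code: the name `change_case` is hypothetical, only ASCII letters are assumed, and the complaints about illegal specifications or unbalanced braces are reduced to comments.

```python
def change_case(text, spec):
    """Model of change.case$: convert letters at brace level 0 only."""
    spec = spec.lower()                       # case of the spec is ignored
    if spec not in ("ul", "ll", "uu", "lu"):
        return text                           # illegal spec: no conversion
    first_conv = str.upper if spec[0] == "u" else str.lower
    rest_conv = str.upper if spec[1] == "u" else str.lower
    out = []
    level = 0                                 # current brace level
    for i, ch in enumerate(text):
        if ch == "{":
            level += 1
        elif ch == "}":
            level = max(level - 1, 0)         # recover from an extra brace
        elif level == 0:
            # the very first character may be treated differently
            ch = first_conv(ch) if i == 0 else rest_conv(ch)
        out.append(ch)
    return "".join(out)
```

For instance, `change_case("the {TeX}book", "ul")` yields `"The {TeX}book"`: the braced group is protected, and only the first character is raised.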
@= brace_level := 0; {this is the top level} ex_buf_ptr := 0; {we start with the string's first character} if (ex_buf_ptr < ex_buf_length) then if (ex_buf[ex_buf_ptr] = left_brace) then incr(brace_level) else if (ex_buf[ex_buf_ptr] = right_brace) then decr_brace_level (pop_lit2) {here, an error} else @; {|brace_level=0|} incr(ex_buf_ptr); while (ex_buf_ptr < ex_buf_length) do begin if (ex_buf[ex_buf_ptr] = right_brace) then decr_brace_level (pop_lit2) else if (ex_buf[ex_buf_ptr] = left_brace) then incr(brace_level) else if (brace_level = 0) then @; incr(ex_buf_ptr); end; check_brace_level (pop_lit2); @ @:this can't happen}{\quad unknown type of case conversion@> The very first character of a string may be converted differently from the others. This code won't touch nonletters. @= begin case conversion_type of first_upper, all_uppers : upper_case (ex_buf, ex_buf_ptr, 1); all_lowers, first_lower : lower_case (ex_buf, ex_buf_ptr, 1); bad_conversion : do_nothing; othercases print_ln ('this can''t happen---unknown type of case conversion') endcases; end @ @:this can't happen}{\quad unknown type of case conversion@> This code does any needed conversion for characters other than the first. This code won't touch nonletters. @= begin case conversion_type of first_upper, all_lowers : lower_case (ex_buf, ex_buf_ptr, 1); all_uppers, first_lower : upper_case (ex_buf, ex_buf_ptr, 1); bad_conversion : do_nothing; othercases print_ln ('this can''t happen---unknown type of case conversion') endcases; end @ The |built_in| function {\.{chr.to.int\$}} pops the top (string) literal, makes sure it's a single character, converts it to the corresponding |ASCII_code| integer, and pushes this integer. If the literal isn't an appropriate string, it complains and pushes the integer~0. 
@<|execute_fn|({\.{chr.to.int\$}})@>= procedure x_chr_to_int; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (0, stk_int); end else if (length(pop_lit1) <> 1) then begin print ('"'); print_pool_str (pop_lit1); bst_ex_warn ('" isn''t a single character'); push_lit_stk (0, stk_int); end else push_lit_stk (str_pool[str_start[pop_lit1]], stk_int); {push the (|ASCII_code|) integer} end; @ The |built_in| function {\.{cite\$}} pushes the appropriate string from |cite_list| onto the stack. @<|execute_fn|({\.{cite\$}})@>= procedure x_cite; begin if (not mess_with_entries) then bst_ex_warn ('you can''t mess with entries here') else push_lit_stk (cur_cite_str, stk_str); end; @ The |built_in| function {\.{duplicate\$}} pops the top literal from the stack and pushes two copies of it. @<|execute_fn|({\.{duplicate\$}})@>= procedure x_duplicate; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_str) then begin push_lit_stk (pop_lit1, pop_typ1); push_lit_stk (pop_lit1, pop_typ1); end else begin ex_buf_length := 0; add_buf_pool (pop_lit1);@/ add_pool_buf_and_push; {push this string onto the stack} add_pool_buf_and_push; {do it again, do it again} end; end; @ The |built_in| function {\.{format.name\$}} pops the top three literals (they are a string, an integer, and a string literal, in that order). The last string literal represents a name list (each name corresponding to a person), the integer literal specifies which name to pick from this list, and the first string literal specifies how to format this name, as described in the \BibTeX\ documentation. Finally, this function pushes the formatted name. If any of the types is incorrect, it complains and pushes the null string. 
@<|execute_fn|({\.{format.name\$}})@>= procedure x_format_name; label loop1_exit,@!loop2_exit; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); pop_lit_stk (pop_lit3,pop_typ3); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (s_null, stk_str); end else if (pop_typ2 <> stk_int) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_int); push_lit_stk (s_null, stk_str); end else if (pop_typ3 <> stk_str) then begin print_wrong_stk_lit (pop_lit3,pop_typ3,stk_str); push_lit_stk (s_null, stk_str); end else begin ex_buf_length := 0; add_buf_pool (pop_lit3); @; @; @; ex_buf_length := 0; add_buf_pool (pop_lit1); @; add_pool_buf_and_push; {push the formatted string onto the stack} end; end; @ This module skips over undesired names in |pop_lit3| and it throws away the ``and'' from the end of the name if it exists. When it's done, |ex_buf_xptr| points to its first character and |ex_buf_ptr| points just past its last. @= ex_buf_ptr := 0; num_names := 0; while ((num_names < pop_lit2) and (ex_buf_ptr < ex_buf_length)) do begin incr(num_names); ex_buf_xptr := ex_buf_ptr; name_scan_for_and (pop_lit3); end; if (ex_buf_ptr < ex_buf_length) then {remove the ``and''} ex_buf_ptr := ex_buf_ptr - 4; if (num_names < pop_lit2) then begin print ('there aren''t ',pop_lit2:0,' names in "'); print_pool_str (pop_lit3); print_ln ('"'); bst_ex_warn (' '); end @ This module, starting at |ex_buf_ptr|, looks in |ex_buf| for an ``and'' surrounded by nonnull |white_space|. It stops either at |ex_buf_length| or just past the ``and'', whichever comes first, setting |ex_buf_ptr| accordingly. Its parameter |pop_lit_var| is either |pop_lit3| or |pop_lit1|, depending on whether {\.{format.name\$}} or {\.{num.names\$}} calls it. 
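The scan for a separating ``and'' can be sketched in Python as follows. This is an illustrative model only: the names `scan_for_and` and `count_names` are hypothetical, Python's `isspace` stands in for the |white_space| class, and the brace-balance complaints are omitted.

```python
def scan_for_and(buf, pos):
    """Return the index of the white space following the next top-level
    'and' (any case) that is preceded by white space, or len(buf)."""
    preceding_white = False
    while pos < len(buf):
        ch = buf[pos]
        pos += 1
        if ch in "aA":
            if (preceding_white and pos <= len(buf) - 3
                    and buf[pos] in "nN" and buf[pos + 1] in "dD"
                    and buf[pos + 2].isspace()):
                return pos + 2                # stop at the trailing white space
            preceding_white = False
        elif ch == "{":
            level = 1                         # skip the whole braced group
            while level > 0 and pos < len(buf):
                if buf[pos] == "}":
                    level -= 1
                elif buf[pos] == "{":
                    level += 1
                pos += 1
            preceding_white = False
        else:
            preceding_white = ch.isspace()
    return pos


def count_names(names):
    """Count names by scanning repeatedly until the string is exhausted."""
    pos = count = 0
    while pos < len(names):
        pos = scan_for_and(names, pos)
        count += 1
    return count
```

An ``and'' inside braces is never a separator, so `count_names("{Barnes and Noble} and Smith")` is 2.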
@= procedure name_scan_for_and (@!pop_lit_var : str_number); begin brace_level := 0; preceding_white := false; and_found := false; while ((not and_found) and (ex_buf_ptr < ex_buf_length)) do case ex_buf[ex_buf_ptr] of "a", "A" : begin incr(ex_buf_ptr); if (preceding_white) then @; {if so, |and_found := true|} preceding_white := false; end; left_brace : begin incr(brace_level); incr(ex_buf_ptr); @ 0|@>; preceding_white := false; end; right_brace : begin decr_brace_level(pop_lit_var); {this will give an error} incr(ex_buf_ptr); preceding_white := false; end; othercases if (lex_class[ex_buf[ex_buf_ptr]] = white_space) then begin incr(ex_buf_ptr); preceding_white := true; end else begin incr(ex_buf_ptr); preceding_white := false; end endcases; check_brace_level (pop_lit_var); end; @ When we come here |ex_buf_ptr| is just past the |left_brace|, and when we leave it's either at |ex_buf_length| or just past the matching |right_brace|. @ 0|@>= while ((brace_level > 0) and (ex_buf_ptr < ex_buf_length)) do begin if (ex_buf[ex_buf_ptr] = right_brace) then decr(brace_level) else if (ex_buf[ex_buf_ptr] = left_brace) then incr(brace_level); incr(ex_buf_ptr); end; @ When we come here |ex_buf_ptr| is just past the ``a'' or ``A'', and when we leave it's either at the same place or, if we found an ``and'', at the following |white_space| character. @= if (ex_buf_ptr <= (ex_buf_length - 3)) then {enough characters are left} if ((ex_buf[ex_buf_ptr] = "n") or (ex_buf[ex_buf_ptr] = "N")) then if ((ex_buf[ex_buf_ptr+1] = "d") or (ex_buf[ex_buf_ptr+1] = "D")) then if (lex_class[ex_buf[ex_buf_ptr+2]] = white_space) then begin ex_buf_ptr := ex_buf_ptr + 2; and_found := true; end; @ When we arrive here, the desired name is in |ex_buf[ex_buf_xptr]| through |ex_buf[ex_buf_ptr-1]|. This module does its thing for characters only at |brace_level = 0|; the rest get processed verbatim. It removes trailing |white_space| and |comma|s, complaining for each trailing |comma|. 
It then copies the name into |name_buf|, removing all |white_space| and |comma|s, counting |comma|s, and constructing a list of name tokens, which are sequences of characters separated by |white_space| or |comma|s (at |brace_level=0|). It complains if there are too many (more than two) |comma|s. @= @; name_bf_ptr := 0; num_commas := 0; num_tokens := 0;@/ token_starting := true; {to indicate that a name token is starting} while (ex_buf_xptr < ex_buf_ptr) do case ex_buf[ex_buf_xptr] of comma : @; left_brace : @; right_brace : @; othercases if (lex_class[ex_buf[ex_buf_xptr]] = white_space) then @ else @ endcases; name_tok[num_tokens] := name_bf_ptr; {this is an end-marker} @ This module removes all trailing |comma|s and |white_space|. It complains for each trailing |comma|. @= while (ex_buf_ptr > ex_buf_xptr) do {remove trailing |white_space|} if (lex_class[ex_buf[ex_buf_ptr-1]] = white_space) then decr(ex_buf_ptr) else if (ex_buf[ex_buf_ptr-1] = comma) then begin print ('name ',pop_lit2:0,' in "'); print_pool_str (pop_lit3); print_ln ('" has a comma at the end'); bst_ex_warn (' '); decr(ex_buf_ptr); end else goto loop1_exit; loop1_exit: @ Here we mark where this comma has occurred in relation to the token number. @= begin if (num_commas = 2) then begin print ('too many commas in name ',pop_lit2:0,' of "'); print_pool_str (pop_lit3); print_ln ('"'); bst_ex_warn (' '); end else begin incr(num_commas); if (num_commas = 1) then comma1 := num_tokens else comma2 := num_tokens; {|num_commas = 2|} end; incr(ex_buf_xptr); token_starting := true; end @ We copy the stuff up through the matching |right_brace| verbatim. 
@= begin incr(brace_level); if (token_starting) then begin name_tok[num_tokens] := name_bf_ptr; incr(num_tokens); end; name_buf[name_bf_ptr] := ex_buf[ex_buf_xptr]; incr(name_bf_ptr); incr(ex_buf_xptr); while ((brace_level > 0) and (ex_buf_xptr < ex_buf_ptr)) do begin if (ex_buf[ex_buf_xptr] = right_brace) then decr(brace_level) else if (ex_buf[ex_buf_xptr] = left_brace) then incr(brace_level); name_buf[name_bf_ptr] := ex_buf[ex_buf_xptr]; incr(name_bf_ptr); incr(ex_buf_xptr); end; token_starting := false; end @ We don't copy an extra |right_brace|; this code will almost never be executed. @= begin if (token_starting) then begin name_tok[num_tokens] := name_bf_ptr; incr(num_tokens); end; print ('name ',pop_lit2:0,' of "'); print_pool_str (pop_lit3); print_ln ('" is not brace balanced'); bst_ex_warn (' '); incr(ex_buf_xptr); token_starting := false; end @ A token will be starting soon in a theater near you. @= begin incr(ex_buf_xptr); token_starting := true; end @ We just copy the character. @= begin if (token_starting) then begin name_tok[num_tokens] := name_bf_ptr; incr(num_tokens); end; name_buf[name_bf_ptr] := ex_buf[ex_buf_xptr]; incr(name_bf_ptr); incr(ex_buf_xptr); token_starting := false; end @ @:this can't happen}{\quad illegal number of comma,s@> Here we set all the pointers for the various parts of the name, depending on which of the three possible syntaxes this name uses. 
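To make the three syntaxes concrete, here is a hedged Python model of how the token list is divided into first, von, last, and jr parts, using the von-token rule given in the modules that follow. The names `is_von` and `split_name` are hypothetical, tokens are plain strings, and `commas` holds the token counts recorded at each comma (the roles of |comma1| and |comma2|).

```python
def is_von(token):
    """True if the first brace-level-0 ASCII letter is lower case."""
    level = 0
    for ch in token:
        if ch == "{":
            level += 1
        elif ch == "}":
            level = max(level - 1, 0)
        elif level == 0 and ("A" <= ch <= "Z" or "a" <= ch <= "z"):
            return "a" <= ch <= "z"
    return False


def split_name(tokens, commas):
    n = len(tokens)
    if len(commas) == 0:                       # First von Last
        last_end = jr_end = n
        von_start = 0
        while von_start < last_end - 1 and not is_von(tokens[von_start]):
            von_start += 1
        first_start, first_end = 0, von_start
    elif len(commas) == 1:                     # von Last, First
        von_start, last_end = 0, commas[0]
        jr_end = last_end
        first_start, first_end = jr_end, n
    else:                                      # von Last, Jr, First
        von_start, last_end, jr_end = 0, commas[0], commas[1]
        first_start, first_end = jr_end, n
    if von_start >= last_end - 1:              # there is no von name
        von_end = von_start
    else:                                      # back up to the last von token
        von_end = last_end - 1
        while von_end > von_start and not is_von(tokens[von_end - 1]):
            von_end -= 1
    return {"first": tokens[first_start:first_end],
            "von": tokens[von_start:von_end],
            "last": tokens[von_end:last_end],
            "jr": tokens[last_end:jr_end]}
```

The last name always receives at least one token when any are available, which is why `von_end` never advances past `last_end - 1`.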
@= if (num_commas = 0) then begin first_start := 0; last_end := num_tokens; jr_end := last_end; @; end else if (num_commas = 1) then begin von_start := 0; last_end := comma1; jr_end := last_end; first_start := jr_end; first_end := num_tokens; von_name_ends_and_last_name_starts_stuff; end else if (num_commas = 2) then begin von_start := 0; last_end := comma1; jr_end := comma2; first_start := jr_end; first_end := num_tokens; von_name_ends_and_last_name_starts_stuff; end else print_ln ('this can''t happen---illegal number of comma,s'); @ When there are no brace-level-0 |comma|s in the name, the von name starts with the first nonlast token whose first |nm_brace_level = 0| letter is in lower case. A module following this one determines where the von name ends and the last starts. @= von_start := 0; while (von_start < last_end-1) do begin name_bf_ptr := name_tok[von_start]; name_bf_xptr := name_tok[von_start+1]; if (von_token_found) then goto loop2_exit; incr(von_start); end; loop2_exit: first_end := von_start; von_name_ends_and_last_name_starts_stuff; @ It's a von token if the first |nm_brace_level = 0| letter is in lower case, in which case we return |true|. The token is in |name_buf|, starting at |name_bf_ptr| and ending just before |name_bf_xptr|. @= function von_token_found : boolean; label exit; begin nm_brace_level := 0; von_token_found := false; {now it's easy to exit if necessary} while (name_bf_ptr < name_bf_xptr) do if ((name_buf[name_bf_ptr] >= "A") and (name_buf[name_bf_ptr] <= "Z")) then return else if ((name_buf[name_bf_ptr] >= "a") and (name_buf[name_bf_ptr] <= "z")) then begin von_token_found := true; return; end else if (name_buf[name_bf_ptr] = left_brace) then begin incr(nm_brace_level); incr(name_bf_ptr); @ 0|@>; end else incr(name_bf_ptr); exit: end; @ When we come here |name_bf_ptr| is just past the |left_brace|; when we leave it's either at |name_bf_xptr| or just past the matching |right_brace|. 
@ 0|@>= while ((nm_brace_level > 0) and (name_bf_ptr < name_bf_xptr)) do begin if (name_buf[name_bf_ptr] = right_brace) then decr(nm_brace_level) else if (name_buf[name_bf_ptr] = left_brace) then incr(nm_brace_level); incr(name_bf_ptr); end; @ @↑Tuesdays@> The last name starts just past the last token, before the first |comma| (if there is no |comma|, there is deemed to be one at the end of the string), whose first |brace_level = 0| letter is in lower case, unless this last token is also the last token before the |comma|, in which case the last name starts with this token (except on Tuesdays $\ldots$). That is, if there are any tokens in either the von or last names, then the last name has at least one, even if it starts with a lower-case letter. @= procedure von_name_ends_and_last_name_starts_stuff; label exit; begin if (von_start >= last_end - 1) then {there is no von name} von_end := von_start else begin {there may or may not be a von name} von_end := last_end - 1; while (von_end > von_start) do begin name_bf_ptr := name_tok[von_end-1]; name_bf_xptr := name_tok[von_end]; if (von_token_found) then return; decr(von_end); end; end; exit: end; @ This module uses the information in |pop_lit1| to format the name. Everything at |sp_brace_level = 0| is copied verbatim to the formatted string; the rest is described in the succeeding modules. @= ex_buf_ptr := 0; sp_brace_level := 0; sp_ptr := str_start[pop_lit1]; sp_end := str_start[pop_lit1+1]; while (sp_ptr < sp_end) do if (str_pool[sp_ptr] = left_brace) then begin incr(sp_brace_level); incr(sp_ptr); @; end else if (str_pool[sp_ptr] = right_brace) then begin braces_unbalanced_complaint; incr(sp_ptr); end else begin append_ex_buf_char_and_check (str_pool[sp_ptr]); incr(sp_ptr); end; if (sp_brace_level > 0) then braces_unbalanced_complaint; ex_buf_length := ex_buf_ptr; @ When we arrive here we're at |sp_brace_level = 1|, just past the |left_brace|. 
Letters at this |sp_brace_level| other than those denoting the parts of the name (i.e., the first letters of `first,' `last,' `von,' and `jr,' ignoring case) are illegal. We do two passes over this group; the first determines whether we're to output anything, and, if we are, the second actually outputs it. @= sp_xptr1 := sp_ptr; alpha_found := false; double_letter := false; end_of_group := false; to_be_written := true; while ((not end_of_group) and (sp_ptr < sp_end)) do if (lex_class[str_pool[sp_ptr]] = alpha) then begin incr(sp_ptr); @
; end else if (str_pool[sp_ptr] = right_brace) then begin decr(sp_brace_level); incr(sp_ptr); end_of_group := true; end else if (str_pool[sp_ptr] = left_brace) then begin incr(sp_brace_level); incr(sp_ptr); skip_stuff_at_sp_brace_level_greater_than_one; end else incr(sp_ptr); if ((end_of_group) and (to_be_written)) then {do the second pass} @; @ When we come here |sp_ptr| is just past the |left_brace|, and when we leave it's either at |sp_end| or just past the matching |right_brace|. @= procedure skip_stuff_at_sp_brace_level_greater_than_one; begin while ((sp_brace_level > 1) and (sp_ptr < sp_end)) do begin if (str_pool[sp_ptr] = right_brace) then decr(sp_brace_level) else if (str_pool[sp_ptr] = left_brace) then incr(sp_brace_level); incr(sp_ptr); end; end; @ We won't output anything for this part of the name if this is a second occurrence of an |sp_brace_level = 1| letter, if it's an illegal letter, or if there are no tokens corresponding to this part. We also determine whether we're to output complete tokens (indicated by a double letter). @
= begin if (alpha_found) then begin brace_lvl_one_letters_complaint; to_be_written := false; end else case str_pool[sp_ptr-1] of "f","F" : @
; "v","V" : @
; "l","L" : @
; "j","J" : @
; othercases begin brace_lvl_one_letters_complaint; to_be_written := false; end endcases; if (double_letter) then incr(sp_ptr); alpha_found := true; end @ At most one of the important letters, perhaps doubled, may appear at |sp_brace_level = 1|. @= procedure brace_lvl_one_letters_complaint; begin print ('the format string "'); print_pool_str (pop_lit1); print_ln ('" has an illegal brace-level-1 letter'); bst_ex_warn (' '); end; @ Here we set pointers into |name_tok| and note whether we'll be dealing with full first-name tokens (|double_letter = true|) or abbreviations (|double_letter = false|). @
= begin cur_token := first_start; last_token := first_end; if (cur_token = last_token) then to_be_written := false; if ((str_pool[sp_ptr] = "f") or (str_pool[sp_ptr] = "F")) then double_letter := true; end @ The same as above but for von-name tokens. @
= begin cur_token := von_start; last_token := von_end; if (cur_token = last_token) then to_be_written := false; if ((str_pool[sp_ptr] = "v") or (str_pool[sp_ptr] = "V")) then double_letter := true; end @ The same as above but for last-name tokens. @
= begin cur_token := von_end; last_token := last_end; if (cur_token = last_token) then to_be_written := false; if ((str_pool[sp_ptr] = "l") or (str_pool[sp_ptr] = "L")) then double_letter := true; end @ The same as above but for jr-name tokens. @
= begin cur_token := last_end; last_token := jr_end; if (cur_token = last_token) then to_be_written := false; if ((str_pool[sp_ptr] = "j") or (str_pool[sp_ptr] = "J")) then double_letter := true; end @ This is the second pass over this part of the name; here we actually write stuff out to |ex_buf|. @= begin sp_ptr := sp_xptr1; sp_brace_level := 1; while (sp_brace_level > 0) do if ((lex_class[str_pool[sp_ptr]] = alpha) and (sp_brace_level = 1)) then begin incr(sp_ptr); @
; end else if (str_pool[sp_ptr] = right_brace) then begin decr(sp_brace_level); incr(sp_ptr); if (sp_brace_level > 0) then append_ex_buf_char_and_check (right_brace); end else if (str_pool[sp_ptr] = left_brace) then begin incr(sp_brace_level); incr(sp_ptr); append_ex_buf_char_and_check (left_brace); end else begin append_ex_buf_char_and_check (str_pool[sp_ptr]); incr(sp_ptr); end; end @ When we come here, |sp_ptr| is just past the letter indicating the part of the name for which we're about to output tokens. When we leave, it's at the first character of the rest of the group. @
= if (double_letter) then incr(sp_ptr); use_default := true; sp_xptr2 := sp_ptr; if (str_pool[sp_ptr] = left_brace) then {find the inter-token string} begin use_default := false; incr(sp_brace_level); incr(sp_ptr); sp_xptr1 := sp_ptr; skip_stuff_at_sp_brace_level_greater_than_one; sp_xptr2 := sp_ptr - 1; end; @; if (not use_default) then sp_ptr := sp_xptr2 + 1; @ Here, for each token in this part, we output either a full or an abbreviated token and the inter-token string for all but the last token of this part. @= while (cur_token < last_token) do begin if (double_letter) then @ else @; incr(cur_token); if (cur_token < last_token) then @; end @ @:BibTeX capacity exceeded}{\quad execution buffer size@> Here we output all the characters in the token, verbatim. @= begin name_bf_ptr := name_tok[cur_token]; name_bf_xptr := name_tok[cur_token+1]; if (ex_buf_length+(name_bf_xptr-name_bf_ptr) > buf_size) then overflow('execution buffer size ',buf_size); while (name_bf_ptr < name_bf_xptr) do begin append_ex_buf_char (name_buf[name_bf_ptr]); incr(name_bf_ptr); end; end @ Here we output the first alphabetic character of the token; brace level is irrelevant. @= begin name_bf_ptr := name_tok[cur_token]; while ((lex_class[name_buf[name_bf_ptr]] <> alpha) and (name_bf_ptr < name_tok[cur_token+1])) do incr(name_bf_ptr); if (name_bf_ptr < name_tok[cur_token+1]) then append_ex_buf_char_and_check (name_buf[name_bf_ptr]); end @ @:BibTeX capacity exceeded}{\quad execution buffer size@> Here we output either the default string or the given one. A |tie| is the default space character. 
@= begin if (use_default) then if (double_letter) then append_ex_buf_char_and_check (tie) else begin if (ex_buf_length+2 > buf_size) then overflow('execution buffer size ',buf_size); append_ex_buf_char (period); append_ex_buf_char (tie); end else begin if (ex_buf_length+(sp_xptr2-sp_xptr1) > buf_size) then overflow('execution buffer size ',buf_size); sp_ptr := sp_xptr1; while (sp_ptr < sp_xptr2) do begin append_ex_buf_char (str_pool[sp_ptr]); incr(sp_ptr); end end; end @ This complaint arises because the style designer has to type lots of braces. @= procedure braces_unbalanced_complaint; begin print ('the format string "'); print_pool_str (pop_lit1); print_ln ('" is not brace balanced'); bst_ex_warn (' '); end; @ The |built_in| function {\.{if\$}} pops the top three literals (they are two function literals and an integer literal, in that order); if the integer is greater than 0, it executes the second literal, else it executes the first. If any of the types is incorrect, it complains but does nothing else. @<|execute_fn|({\.{if\$}})@>= begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); pop_lit_stk (pop_lit3,pop_typ3); if (pop_typ1 <> stk_fn) then print_wrong_stk_lit (pop_lit1,pop_typ1,stk_fn) else if (pop_typ2 <> stk_fn) then print_wrong_stk_lit (pop_lit2,pop_typ2,stk_fn) else if (pop_typ3 <> stk_int) then print_wrong_stk_lit (pop_lit3,pop_typ3,stk_int) else if (pop_lit3 > 0) then execute_fn (pop_lit2) else execute_fn (pop_lit1); end @ The |built_in| function {\.{int.to.chr\$}} pops the top (integer) literal, interpreted as the |ASCII_code| of a single character, converts it to the corresponding single-character string, and pushes this string. If the literal isn't an appropriate integer, it complains and pushes the null string. 
@<|execute_fn|({\.{int.to.chr\$}})@>= procedure x_int_to_chr; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); push_lit_stk (s_null, stk_str); end else if ((pop_lit1 < 0) or (pop_lit1 > 127)) then begin bst_ex_warn (pop_lit1:0,' isn''t valid ASCII'); push_lit_stk (s_null, stk_str); end else begin ex_buf[0] := pop_lit1; ex_buf_length := 1;@/ add_pool_buf_and_push; {push this string onto the stack} end; end; @ The |built_in| function {\.{int.to.str\$}} pops the top (integer) literal, converts it to its (unique) string equivalent, and pushes this string. If the literal isn't an integer, it complains and pushes the null string. @<|execute_fn|({\.{int.to.str\$}})@>= procedure x_int_to_str; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); push_lit_stk (s_null, stk_str); end else begin int_to_ASCII (pop_lit1, ex_buf, 0, ex_buf_length);@/ add_pool_buf_and_push; {push this string onto the stack} end; end; @ The |built_in| function {\.{missing\$}} pops the top literal and pushes the integer 1 if it's a missing field, 0 otherwise. If the literal isn't a missing field or a string, it complains and pushes 0. @<|execute_fn|({\.{missing\$}})@>= procedure x_missing; begin pop_lit_stk (pop_lit1,pop_typ1); if (not mess_with_entries) then bst_ex_warn ('you can''t mess with entries here') else if ((pop_typ1 <> stk_str) and (pop_typ1 <> stk_field_missing)) then begin print_stk_lit (pop_lit1,pop_typ1); print_ln (', not a string or missing-field'); bst_ex_warn (' '); push_lit_stk (0, stk_int); end else if (pop_typ1 = stk_field_missing) then push_lit_stk (1, stk_int) else push_lit_stk (0, stk_int); end; @ The |built_in| function {\.{newline\$}} writes whatever has accumulated in the output buffer |out_buf| onto the \.{.bbl} file. 
@<|execute_fn|({\.{newline\$}})@>= begin output_bbl_line; end @ The |built_in| function {\.{num.names\$}} pops the top (string) literal; it pushes the number of names the string represents---one plus the number of occurrences of the substring ``and'' surrounded by nonnull |white_space| (ignoring case differences) at the top brace level. If the literal isn't a string, it complains and pushes the value 0. @<|execute_fn|({\.{num.names\$}})@>= procedure x_num_names; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (0, stk_int); end else begin ex_buf_length := 0; add_buf_pool (pop_lit1); @; push_lit_stk (num_names, stk_int); end; end; @ This module counts the occurrences of ``and'' surrounded by nonnull |white_space| (ignoring case differences) while scanning the list of names, and adds 1. @= ex_buf_ptr := 0; num_names := 0; while (ex_buf_ptr < ex_buf_length) do begin name_scan_for_and (pop_lit1); incr(num_names); end; @ The |built_in| function {\.{pop\$}} pops the top of the stack but doesn't print it. @<|execute_fn|({\.{pop\$}})@>= begin pop_lit_stk (pop_lit1,pop_typ1); end @ The |built_in| function {\.{purify\$}} pops the top (string) literal, removes nonalphanumeric characters except for |white_space| characters (these get converted to a |space|), and pushes the resulting string. If the literal isn't a string, it complains and pushes the null string. @<|execute_fn|({\.{purify\$}})@>= procedure x_purify; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (s_null, stk_str); end else begin ex_buf_length := 0; add_buf_pool (pop_lit1); @; add_pool_buf_and_push; {push this string onto the stack} end; end; @ The resulting string has nonalphanumeric characters removed, and each |white_space| character converted to a |space|. 
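The purification rule can be pictured with a small Python sketch, given here purely as an illustration outside the \.{WEB} program. It assumes that |white_space| covers the usual ASCII blank characters and that |alpha| and |numeric| mean ASCII letters and digits; the exact membership of these classes is set by |lex_class| elsewhere in the program, so the constant |WHITE_SPACE| and the function name |purify| are hypothetical.

```python
WHITE_SPACE = " \t\n\r\f"  # assumed membership of the white_space class


def purify(text):
    """Drop nonalphanumeric characters; turn each white-space char into a space."""
    kept = []
    for ch in text:
        if ch in WHITE_SPACE:
            kept.append(" ")          # one space per white-space character
        elif ch.isascii() and ch.isalnum():
            kept.append(ch)           # letters and digits pass through
        # every other character is silently dropped
    return "".join(kept)
```

Note that runs of |white_space| are not collapsed: each such character yields its own |space|, just as in the loop that follows.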
@= ex_buf_xptr := 0; ex_buf_ptr := 0; while (ex_buf_ptr < ex_buf_length) do begin if (lex_class[ex_buf[ex_buf_ptr]] = white_space) then begin ex_buf[ex_buf_xptr] := space; incr(ex_buf_xptr); end else if ((lex_class[ex_buf[ex_buf_ptr]] = alpha) or (lex_class[ex_buf[ex_buf_ptr]] = numeric)) then begin ex_buf[ex_buf_xptr] := ex_buf[ex_buf_ptr]; incr(ex_buf_xptr); end; incr(ex_buf_ptr); end; ex_buf_length := ex_buf_xptr; @ The |built_in| function {\.{quote\$}} pushes the string consisting of the |double_quote| character. @<|execute_fn|({\.{quote\$}})@>= procedure x_quote; begin ex_buf[0] := double_quote; ex_buf_length := 1;@/ add_pool_buf_and_push; {push this string onto the stack} end; @ The |built_in| function {\.{skip\$}} is a no-op. @<|execute_fn|({\.{skip\$}})@>= begin do_nothing; end @ The |built_in| function {\.{stack\$}} pops and prints the whole stack; it's meant to be used by style designers while debugging. @<|execute_fn|({\.{stack\$}})@>= begin pop_whole_stack; end @ The |built_in| function {\.{substring\$}} pops the top three literals (they are the two integer literals |pop_lit1| and |pop_lit2| and a string literal, in that order). It pushes the substring of the (at most) |pop_lit1| consecutive characters starting at the |pop_lit2|th character (assuming 1-based indexing) if |pop_lit2| is positive, and ending at the |-pop_lit2|th character from the end if |pop_lit2| is negative (where the first character from the end is the last character). If any of the types is incorrect, it complains and pushes the null string.
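The indexing rule for \.{substring\$} is easy to get wrong, so here is a hedged Python model of it, offered as an illustration and not as part of the \.{WEB} program. The parameter |length| plays the role of |pop_lit1| and |start| that of |pop_lit2|; the function name |substring| and its argument order are assumptions made for the sketch.

```python
def substring(text, start, length):
    """Model of substring$: `start` is 1-based; a negative `start` counts from the end.

    For start > 0 the result begins at character `start`.
    For start < 0 the result ends at character `-start` from the end.
    At most `length` characters are returned in either case.
    """
    n = len(text)
    if length <= 0 or start == 0 or abs(start) > n:
        return ""                      # out-of-range requests give the null string
    if start > 0:
        begin = start - 1              # convert to 0-based indexing
        return text[begin:begin + length]
    end = n - (-start - 1)             # 0-based position just past the last character
    return text[max(0, end - length):end]
```

For the string \.{abcdef}, a |start| of 2 with |length| 3 gives \.{bcd}, while a |start| of $-2$ with |length| 3 gives \.{cde}, the three characters ending at the second character from the end.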
@<|execute_fn|({\.{substring\$}})@>= procedure x_substring; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); pop_lit_stk (pop_lit3,pop_typ3); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); push_lit_stk (s_null, stk_str); end else if (pop_typ2 <> stk_int) then begin print_wrong_stk_lit (pop_lit2,pop_typ2,stk_int); push_lit_stk (s_null, stk_str); end else if (pop_typ3 <> stk_str) then begin print_wrong_stk_lit (pop_lit3,pop_typ3,stk_str); push_lit_stk (s_null, stk_str); end else begin ex_buf_length := 0; add_buf_pool (pop_lit3); @
; add_pool_buf_and_push; {push this string onto the stack} end; end; @ This module finds the substring as described in the last section. Note that although the integer literals assume 1-based indexing, the |ex_buf| array uses 0-based indexing. @= if ((pop_lit1 <= 0) or (pop_lit2 = 0) or (pop_lit2 > ex_buf_length) or (pop_lit2 < -ex_buf_length)) then ex_buf_length := 0 {that is, make it the null string} else if (pop_lit2 > 0) then begin if (pop_lit1 > ex_buf_length-(pop_lit2-1)) then pop_lit1 := ex_buf_length - (pop_lit2-1); ex_buf_ptr := 0; while (ex_buf_ptr < pop_lit1) do begin {shift characters down in the buffer} ex_buf[ex_buf_ptr] := ex_buf[ex_buf_ptr+(pop_lit2-1)]; incr(ex_buf_ptr); end; ex_buf_length := ex_buf_ptr; {set the new length} end else begin {|-ex_buf_length <= pop_lit2 < 0|} pop_lit2 := -pop_lit2; if (pop_lit1 > ex_buf_length-(pop_lit2-1)) then pop_lit1 := ex_buf_length - (pop_lit2-1); ex_buf_ptr := 0; while (ex_buf_ptr < pop_lit1) do begin {shift characters down in the buffer} ex_buf[ex_buf_ptr] := ex_buf[ex_buf_ptr+ ex_buf_length-(pop_lit2-1)-pop_lit1]; incr(ex_buf_ptr); end; ex_buf_length := ex_buf_ptr; end @ The |built_in| function {\.{swap\$}} pops the top two literals from the stack and pushes them back swapped. @<|execute_fn|({\.{swap\$}})@>= procedure x_swap; begin pop_lit_stk (pop_lit1,pop_typ1); pop_lit_stk (pop_lit2,pop_typ2); if (pop_typ1 <> stk_str) then begin push_lit_stk (pop_lit1, pop_typ1); if (pop_typ2 <> stk_str) then push_lit_stk (pop_lit2, pop_typ2) else begin ex_buf_length := 0; add_buf_pool (pop_lit2);@/ add_pool_buf_and_push; {push 2nd string onto the stack} end; end else if (pop_typ2 <> stk_str) then begin ex_buf_length := 0; add_buf_pool (pop_lit1);@/ add_pool_buf_and_push; {push 1st string onto the stack} push_lit_stk (pop_lit2, pop_typ2); end else @; {bummer, both are strings} end; @ We have to do some buffer juggling to get this correct. 
@= begin ex_buf_length := 0; add_buf_pool (pop_lit2); sv_ptr2 := ex_buf_length; {save} tmp_ptr := 0; while (tmp_ptr < sv_ptr2) do begin sv_buffer[tmp_ptr] := ex_buf[tmp_ptr]; incr(tmp_ptr); end; ex_buf_length := 0; add_buf_pool (pop_lit1);@/ add_pool_buf_and_push; {push 1st string onto the stack} ex_buf_length := sv_ptr2; {restore} tmp_ptr := 0; while (tmp_ptr < ex_buf_length) do begin ex_buf[tmp_ptr] := sv_buffer[tmp_ptr]; incr(tmp_ptr); end; add_pool_buf_and_push; {push 2nd string onto the stack} end @ The |built_in| function {\.{top\$}} pops and prints the top of the stack. @<|execute_fn|({\.{top\$}})@>= begin pop_top_and_print; end @ The |built_in| function {\.{type\$}} pushes the appropriate string from |type_list| onto the stack (unless it's either |undefined| or |empty|, in which case it pushes the null string). @<|execute_fn|({\.{type\$}})@>= procedure x_type; begin if (not mess_with_entries) then bst_ex_warn ('you can''t mess with entries here') else if ((type_list[cite_ptr] = undefined) or (type_list[cite_ptr] = empty)) then push_lit_stk (s_null, stk_str) else push_lit_stk (hash_text[type_list[cite_ptr]], stk_str); end; @ The |built_in| function {\.{while\$}} pops the top two (function) literals, and keeps executing the second-pushed one (the loop body, which is on top of the stack) as long as the (integer) value left on the stack by executing the first-pushed one (the test, just beneath it) is greater than 0. If either type is incorrect, it complains but does nothing else.
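The control flow of \.{while\$} can be sketched in Python as follows; this is an illustration outside the \.{WEB} program, under the assumption that function literals are modelled as Python callables that take the stack as their only argument. The names |while_loop|, |stack| and |warnings| are hypothetical. The literal on top of the stack is the loop body and the one beneath it is the test, matching the comments in the code below.

```python
def while_loop(stack, warnings):
    """Model of while$: run `test`, pop its integer result, run `body` while it is > 0."""
    body = stack.pop()                 # top of the stack: the loop body
    test = stack.pop()                 # just beneath it: the loop test
    if not callable(body) or not callable(test):
        warnings.append("while$ needs two function literals")
        return
    while True:
        test(stack)                    # the test must leave an integer on the stack
        flag = stack.pop()
        if not isinstance(flag, int):
            warnings.append(f"{flag!r} is not an integer literal")
            return
        if flag <= 0:
            return                     # test failed: leave the loop
        body(stack)
```

For instance, if |test| pushes the current value of a counter and |body| decrements that counter, a counter that starts at 3 makes the body run exactly three times and leaves the stack as it was.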
@<|execute_fn|({\.{while\$}})@>= begin pop_lit_stk (r_pop_lt1,r_pop_tp1); pop_lit_stk (r_pop_lt2,r_pop_tp2); if (r_pop_tp1 <> stk_fn) then print_wrong_stk_lit (r_pop_lt1,r_pop_tp1,stk_fn) else if (r_pop_tp2 <> stk_fn) then print_wrong_stk_lit (r_pop_lt2,r_pop_tp2,stk_fn) else loop begin execute_fn (r_pop_lt2); {this is the \.{while\$} test} pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_int) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_int); goto end_while; end else if (pop_lit1 > 0) then execute_fn (r_pop_lt1) {this is the \.{while\$} body} else goto end_while; end; end_while: {justifies this |mean_while|} end @ The |built_in| function {\.{width\$}} pops the top (string) literal and pushes the integer that represents its width in units specified by the |char_width| array. If the literal isn't a string, it complains and pushes 0. @<|execute_fn|({\.{width\$}})@>= procedure x_width; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_str) then begin print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str); push_lit_stk (0, stk_int); end else begin {add up the |char_width|s in this string} string_width := 0; sp_ptr := str_start[pop_lit1]; while (sp_ptr < str_start[pop_lit1+1]) do begin string_width := string_width + char_width[str_pool[sp_ptr]]; incr(sp_ptr) end; push_lit_stk (string_width, stk_int) end end; @ The |built_in| function {\.{write\$}} pops the top (string) literal and writes it onto the output buffer |out_buf| (which will result in stuff being written onto the \.{.bbl} file if the buffer fills up). If the literal isn't a string, it complains but does nothing else. @<|execute_fn|({\.{write\$}})@>= procedure x_write; begin pop_lit_stk (pop_lit1,pop_typ1); if (pop_typ1 <> stk_str) then print_wrong_stk_lit (pop_lit1,pop_typ1,stk_str) else add_out_pool (pop_lit1); end; @* Cleaning up. @↑clich\'es-\`a-trois@> This section does any last-minute printing and ends the program. 
@= trace_and_stat_printing; a_close (log_file); {turn out the lights, the fat lady has sung; it's over, Yogi} @ Here we print |trace| and/or |stat| information, if desired. @= procedure trace_and_stat_printing; begin @# trace @; @; @; @; ecart@/ @# stat @; tats@/ @# end; @ This prints information obtained from the \.{.aux} file about the other files. @= begin trace_pr_ln ('the ',num_bib_files:0,' database files are'); if (num_bib_files = 0) then trace_pr_ln (' undefined') else begin bib_ptr := 0; while (bib_ptr < num_bib_files) do begin trace_pr (' '); trace_pr_pool_str (cur_bib_str); trace_pr_pool_str (s_bib_extension); trace_pr_newline; incr(bib_ptr); end; end; trace_pr ('the style file is '); if (bst_str = 0) then trace_pr_ln ('undefined') else begin trace_pr_pool_str (bst_str); trace_pr_pool_str (s_bst_extension); trace_pr_newline; end; end @ In entry-sorted order, this prints an entry's |cite_list| string and, indirectly, its entry type and entry variables. @= begin trace_pr_ln ('the ',num_cites:0,' \cite keys:'); if (num_cites = 0) then trace_pr_ln (' undefined') else begin sort_cite_ptr := 0; while (sort_cite_ptr < num_cites) do begin if (not read_performed) then {we didn't even read a \.{.bst} file} cite_ptr := sort_cite_ptr else cite_ptr := sorted_cites[sort_cite_ptr]; trace_pr_pool_str (cur_cite_str); if (read_performed) then @ else trace_pr_newline; incr(sort_cite_ptr); end; end; end @ This prints information gathered while reading the \.{.bst} and \.{.bib} files. @= begin trace_pr (' of entry-type '); if (type_list[cite_ptr] = undefined) then undefined : trace_pr ('unknown,') else if (type_list[cite_ptr] = empty) then trace_pr ('--- no type found,') else trace_pr_pool_str (hash_text[type_list[cite_ptr]]); trace_pr_ln (' has entry strings'); @; trace_pr ('and has entry integers'); @; end @ This prints, for the current entry, the strings declared by the \.{entry} command. 
@= begin if (num_ent_strs = 0) then trace_pr_ln (' undefined') else begin str_ent_ptr := cite_ptr * num_ent_strs; while (str_ent_ptr < (cite_ptr+1)*num_ent_strs) do begin ent_chr_ptr := 0; trace_pr (' "'); while (entry_strs[str_ent_ptr][ent_chr_ptr] <> end_of_string) do begin trace_pr (xchr[entry_strs[str_ent_ptr][ent_chr_ptr]]); incr(ent_chr_ptr); end; trace_pr_ln ('"'); incr(str_ent_ptr); end; end; end @ This prints, for the current entry, the integers declared by the \.{entry} command. @= begin if (num_ent_ints = 0) then trace_pr (' undefined') else begin int_ent_ptr := cite_ptr*num_ent_ints; while (int_ent_ptr < (cite_ptr+1)*num_ent_ints) do begin trace_pr (' ',entry_ints[int_ent_ptr]:0); incr(int_ent_ptr); end; end; trace_pr_newline; end @ This gives all the |wiz_defined| functions that appeared in the \.{.bst} file. @= begin trace_pr_ln ('the wiz-defined functions are'); if (wiz_def_ptr = 0) then trace_pr_ln (' nonexistent') else begin wiz_fn_ptr := 0; while (wiz_fn_ptr < wiz_def_ptr) do begin if (wiz_functions[wiz_fn_ptr] = end_of_def) then trace_pr_ln (wiz_fn_ptr:0,'--end-of-def--') else if (wiz_functions[wiz_fn_ptr] = quote_next_fn) then trace_pr (wiz_fn_ptr:0,' quote_next_function ') else begin trace_pr (wiz_fn_ptr:0,' `'); trace_pr_pool_str (hash_text[wiz_functions[wiz_fn_ptr]]); trace_pr_ln (''''); end; incr(wiz_fn_ptr); end; end; end @ This includes all the `static' strings (that is, those that are also in the hash table), but none of the dynamic strings (that is, those put on the stack while executing \.{.bst} commands). @= begin trace_pr_ln ('the string pool is'); str_num := 1; while (str_num < str_ptr) do begin trace_pr (str_num:4, str_start[str_num]:6,' "'); trace_pr_pool_str (str_num); trace_pr_ln ('"'); incr(str_num); end; end @ @↑statistics@> These statistics can help determine how large some of the constants should be and can tell how useful certain |built_in| functions are. They are written to the same files as tracing information. 
@d stat_pr_ln == trace_pr_ln @d stat_pr_pool_str == trace_pr_pool_str @= begin stat_pr_ln ('you''ve used ',num_cites:0,' \cite keys,'); stat_pr_ln (' ',wiz_def_ptr:0,' wiz_defined-function locations,'); stat_pr_ln (' ',str_ptr:0,' strings with ',str_start[str_ptr]:0, ' characters,'); stat_pr_ln ('and the built_in function-call counts are:'); blt_in_ptr := 0; while (blt_in_ptr < num_blt_in_fns) do begin stat_pr_pool_str (hash_text[blt_in_loc[blt_in_ptr]]); stat_pr_ln ('---',execution_count[blt_in_ptr]:0); incr(blt_in_ptr); end; end @* System-dependent changes. @↑system dependencies@> This section should be replaced, if necessary, by changes to the program that are necessary to make \BibTeX\ work at a particular installation. It is usually best to design your change file so that all changes to previous sections preserve the section numbering; then everybody's version will be consistent with the printed program. More extensive changes, which introduce new sections, can be inserted here; then only the index itself will get a new section number. @* Index. @.this can't happen@> Here is where you can find all uses of each identifier in the program, with underlined entries pointing to where the identifier was defined. If the identifier is only one letter long, however, you get to see only the underlined entries. All references are to section numbers instead of page numbers. This index also lists error messages and other aspects of the program that you might want to look up some day. For example, the entry for ``system dependencies'' lists all sections that should receive special attention from people who are installing \BibTeX\ in a new operating environment. A list of various things that can't happen appears under ``this can't happen''.