texpretty [ --? ] [ --texinfo ] [ --author ] [ --copyright ] [ --displaymath-all ] [ --emacs-mode modename ] [ --filename filename ] [ --help ] [ --indent nnn ] [ --logfile filename ] [ --math-conversions nnn ] [ --no-comment-banner ] [ --outfile filename ] [ --quick ] [ --stylefile filename ] [ --tabular-as-verbatim ] [ --version ] [ --width nnn ]
< infile > outfile or file(s) > outfile
texpretty filters its TeX input from stdin, or from one or more files named on the command line, and prettyprints it to stdout. Most formatting systems based on TeX, including AmSTeX, AmSLaTeX, ETeX (K. Berry's Extended plain TeX), LAmSTeX, LaTeX, and SliTeX, are handled reasonably well. texpretty also includes support for the Free Software Foundation's GNU Project TeXinfo, whose markup syntax resembles that of scribe(1) rather than that of TeX.
LaTeXinfo is similar enough to LaTeX that it can be handled by texpretty, provided a suitable supplementary style file is supplied; an example is given in the STYLE FILES section below.
Although prettyprinters of necessity impose a certain style that may not be universally agreed upon, they have nevertheless proven useful for many programming languages for heuristic syntax checking, and for generating a consistent appearance in files that may have been prepared by many authors (human, or computer programs), or even by a single author with lax file preparation discipline.
Because of their low-level markup, plain TeX and AmSTeX files offer fewer opportunities for useful prettyprinting than do LaTeX, LAmSTeX, and SliTeX. Nevertheless, the heuristic error checking provided by texpretty may still be useful for catching brace, dollar, and environment balance errors.
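To make the idea of a balance check concrete, here is a minimal Python sketch of a stack-based checker for braces and \begin ... \end environments. It is an illustration only, not texpretty's actual algorithm; the function name check_balance and the message wording are hypothetical. Comments are deliberately scanned too, in line with the advice elsewhere in this manual that matching delimiters can be hidden inside comments.

```python
import re

# \begin{name} / \end{name}, any escaped character (so \{ and \} are
# skipped), or a bare brace.  Dollar signs are not handled in this sketch.
TOKEN = re.compile(r"\\(begin|end)\{([^{}]*)\}|\\.|[{}]", re.S)


def check_balance(tex):
    """Return a list of problem descriptions; an empty list means balanced."""
    stack, problems = [], []
    for m in TOKEN.finditer(tex):
        tok = m.group(0)
        if m.group(1) == "begin":
            stack.append(m.group(2))
        elif m.group(1) == "end":
            if stack and stack[-1] == m.group(2):
                stack.pop()
            else:
                problems.append(f"unexpected \\end{{{m.group(2)}}}")
        elif tok == "{":
            stack.append("{")
        elif tok == "}":
            if stack and stack[-1] == "{":
                stack.pop()
            else:
                problems.append("unexpected }")
        # any other escaped character is ignored
    problems.extend(f"unclosed {name}" for name in stack)
    return problems
```

For example, check_balance("{a") reports one unclosed brace, while a well-nested itemize environment yields an empty list.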
LaTeX users are advised to make regular use of lacheck(1), which warns about many other kinds of likely errors that LaTeX itself cannot detect. texpretty will automatically repair some of these types of errors.
texpretty may be less useful for files consisting only of macro definitions (e.g., plain.tex, or LaTeX style files), because
- they may already have stylized formatting,
- macro definitions might include unclosed environments, and
- there is usually little structure in such files to be exposed by prettyprinting.
Because TeX commands can be arcane, and do unexpected things, users of texpretty are urged to check the output carefully, and not replace input files with prettyprinted output files until the latter have been typeset and verified.
texpretty does not examine included files that would be read by TeX during processing of commands such as \bibliography, \include, \input, \listoffigures, \listoftables, \printindex, and \tableofcontents. You must provide these files to texpretty explicitly if you want them to be prettyprinted.
Letter case in option names is not significant, although it may be in option values.
GNU- and POSIX-style long options of the form --name are also recognized: they begin with one or two option prefix characters. Long option names may be abbreviated to any unique leading prefix, unless a shorter prefix is documented.
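The abbreviation rule can be sketched in Python as follows. This is an assumption-laden illustration, not texpretty's own code: the option list is copied from the synopsis, the function name resolve_option is hypothetical, and the single documented short form shown (-t for the tabular option) is inferred from the way this manual uses -t later on.

```python
import re

# Long option names, taken from the synopsis.
OPTIONS = [
    "?", "author", "copyright", "displaymath-all", "emacs-mode",
    "filename", "help", "indent", "logfile", "math-conversions",
    "no-comment-banner", "outfile", "quick", "stylefile",
    "tabular-as-verbatim", "texinfo", "version", "width",
]

# Shorter prefixes that are documented even though they are not unique.
# Assumption: only -t needs this, since "t" also begins "texinfo".
DOCUMENTED_SHORT = {"t": "tabular-as-verbatim"}


def resolve_option(arg):
    """Map -name or --name (any letter case, any unique prefix) to its full name."""
    m = re.fullmatch(r"--?(.+)", arg)
    if m is None:
        raise ValueError(f"not an option: {arg}")
    name = m.group(1).lower()          # letter case is not significant
    if name in DOCUMENTED_SHORT:
        return DOCUMENTED_SHORT[name]
    matches = [opt for opt in OPTIONS if opt.startswith(name)]
    if len(matches) != 1:
        raise ValueError(f"unrecognized or ambiguous option: {arg}")
    return matches[0]
```

With this sketch, --wid resolves to width and -OUT to outfile, while an unknown name raises an error, mirroring the failure behavior described for unrecognized arguments.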
Any argument that begins with a hyphen is expected to be an option, and will raise an error if it is not recognized. If a filename begins with a hyphen, you therefore need to disguise it by supplying a leading directory path. For example, ./-foo represents the file named -foo in the current directory in UNIX.
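A script that passes arbitrary filenames to texpretty could apply this disguise automatically. The following Python sketch shows one way to do so on UNIX; the helper name disguise is hypothetical.

```python
def disguise(filename):
    """Prefix ./ to a filename that would otherwise look like an option."""
    if filename.startswith("-"):
        return "./" + filename   # UNIX path syntax, as in the ./-foo example
    return filename
```

Here disguise("-foo") gives ./-foo, and names that do not begin with a hyphen are returned unchanged.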
Unrecognized arguments, or arguments lacking an expected value string, result in an error message on stderr, and immediate termination with a failure status code.
Because TeXinfo is a continually evolving language, all TeXinfo commands that do not require special handling are enumerated in texpretty; any others that it finds will raise a warning message. Such new commands can probably be handled without modifying texpretty; see the STYLE FILES section below.
If the file cannot be opened for output, texpretty will terminate silently (because the internal attempted redirection required the closure of stderr) with a non-zero exit code.
- 1
- Inline mathematics is coded as $ ... $.
- 2
- Inline mathematics is coded as \( ... \) (LaTeX).
- 4
- Inline mathematics is coded as \begin{math} ... \end{math} (LaTeX).
- 8
- Display mathematics is coded as $$ ... $$.
- 16
- Display mathematics is coded as \[ ... \] (LaTeX).
- 32
- Display mathematics is coded as \begin{displaymath} ... \end{displaymath} (LaTeX).
texpretty recognizes all of these forms in the input stream.
This option may be helpful in assisting in the conversion between LaTeX and plain TeX markup, in standardizing the markup of mathematics in LaTeX, and in improving the possibilities for detection of begin-end imbalances (when the dollar forms are eliminated, and an editor or other software capable of delimiter balance checking is employed).
Although the double-dollar markup style for display mathematics is frequently found in LaTeX documents, strictly speaking, it should be replaced by either of the alternatives above.
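The numeric codes listed above can be tabulated as in the Python sketch below. The table itself restates the list; the decode function rests on an assumption that this excerpt does not confirm, namely that a value nnn is formed by adding codes together (which the powers of two suggest). The names MATH_FORMS and decode are hypothetical.

```python
# code -> (kind, opening delimiter, closing delimiter), from the list above
MATH_FORMS = {
    1: ("inline", "$", "$"),
    2: ("inline", r"\(", r"\)"),
    4: ("inline", r"\begin{math}", r"\end{math}"),
    8: ("display", "$$", "$$"),
    16: ("display", r"\[", r"\]"),
    32: ("display", r"\begin{displaymath}", r"\end{displaymath}"),
}


def decode(nnn):
    """List the forms contained in nnn, ASSUMING codes are combined by addition."""
    return [MATH_FORMS[code] for code in sorted(MATH_FORMS) if nnn & code]
```

Under that assumption, decode(18) would select \( ... \) for inline mathematics and \[ ... \] for display mathematics.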
If the file cannot be opened for output, texpretty will terminate with an error message and a non-zero exit code.
Otherwise, texpretty carries out its normal prettyprinting activities inside tabular environments, except that there, it lines up ampersand column separators on line positions that are multiples of 8, to improve vertical alignment for better readability. This is about the best it can do with only a single pass over the environment.
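The single-pass alignment can be pictured with the following Python sketch. It is an illustration under stated assumptions rather than texpretty's code: positions are counted from zero, at least one space is kept before each ampersand, and the row is split naively on every ampersand, so an escaped \& would be mishandled. The function name align_row is hypothetical.

```python
def align_row(row, stop=8):
    """Pad each cell so the following & lands on a multiple of `stop` (0-based)."""
    cells = [cell.strip() for cell in row.split("&")]
    out = ""
    for cell in cells[:-1]:
        out += cell + " "                 # keep one space before the separator
        out += " " * (-len(out) % stop)   # pad up to the next multiple of stop
        out += "& "
    return out + cells[-1]
```

Because each row is padded independently, separators line up across rows only when the corresponding cells fit within the same stop, which is consistent with the remark that this is about the best a single pass can do.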
- Long lines are wrapped to obey the requested (or default) maximum output line width.
- Text inside LaTeX \begin ... \end environments is indented according to the environment nesting level, except for the outer document environment, which does not cause indentation.
- Brace, bracket, environment, mathematics mode, and parenthesis balance are checked for inconsistencies, except inside verbatim environments. Apart from environments, none of these are expected to contain empty lines signifying a paragraph break.
If multiple paragraphs really are intended between opening and closing delimiters, you can suppress the warning messages by inserting \par on the empty lines to ensure that TeX sees the paragraph breaks, but texpretty does not. However, remember that TeX forbids paragraph breaks inside mathematics mode, so if you inserted blank lines there to improve readability, just change them to empty comment lines.
For those rare cases where unmatched delimiters are intended, you can eliminate the warning messages by hiding matching delimiters inside comments in the same line or paragraph.
- Newlines are inserted before and/or after important control sequences to improve their visibility.
- Comment percent characters are inserted after open braces at end-of-line, to avoid unwanted space creeping into macro arguments.
- Ties before literature citations are removed; their use is a common error.
- Redundant consecutive newlines are reduced to just two, indicating a paragraph break.
- Whitespace (tabs, formfeeds, line breaks) other than literal space (ISO 8859/ASCII decimal 32) is converted to literal space.
- Redundant consecutive spaces are reduced to just one, or two after sentence-ending punctuation.
- End-of-line spaces are discarded.
- Spaces between certain control sequences and their arguments are discarded. These are cases where the arguments are generally short, and should appear on the same line as the control sequence.
- Backslash-newline is converted to backslash-space-percent-newline. The reason for this change is that automatic line wrapping and filling in text editors can break a backslash-space control sequence at a line boundary, which can potentially change the meaning of a document if backslash-newline is defined differently than backslash-space. Thus, you should not use texpretty on files where these two control sequences have different meanings.
- Control sequences (footnotes, glossary, index, label, and cross-reference) that must be tightly bound to the preceding word to avoid the possibility of an intervening space, line break, or page break, are output on a new line, with the preceding line ending with a comment percent character. TeX ignores text from the comment character up to just before the first non-blank character on the next line, so the control sequence is still tightly bound to the preceding word, but is more readable.
- In tabular environments, additional whitespace is produced to line up ampersand column separators at column positions that are multiples of 8, except when the -t command-line option has been specified to force verbatim output.
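Several of the whitespace rules in the list above can be approximated with a few regular expressions. The Python sketch below is a simplified illustration, not texpretty's implementation: it leaves single line breaks alone instead of refilling paragraphs, it treats only period, question mark, and exclamation mark as sentence-ending punctuation, and the function name normalize_whitespace is hypothetical.

```python
import re


def normalize_whitespace(text):
    """Apply a simplified subset of the whitespace rules listed above."""
    text = re.sub(r"[\t\f\v]", " ", text)           # tabs, formfeeds -> space
    text = re.sub(r" +\n", "\n", text)              # discard end-of-line spaces
    text = re.sub(r"\n{3,}", "\n\n", text)          # at most one empty line
    text = re.sub(r"([.?!]) {2,}", r"\1  ", text)   # two spaces after a sentence
    text = re.sub(r"(?<![.?!]) {2,}", " ", text)    # otherwise a single space
    return text
```

For instance, a tab between words becomes one space, three spaces after a period become two, and four consecutive newlines collapse to a single paragraph break.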
AmSLaTeX and LaTeX files that adhere to the markup defined in the LaTeX User's Guide and Reference Manual by Leslie Lamport (Addison-Wesley, 1985 (ISBN 0-201-15790-X), 1994 (ISBN 0-201-52983-1)), and the LaTeX Companion by Michel Goossens, Frank Mittelbach, and Alexander Samarin (Addison-Wesley, 1994 (ISBN 0-201-54199-8)), will benefit most from texpretty's processing.
In order to allow the user to control the formatting of these new features, texpretty supports a simple style file mechanism. At startup, it processes a style file in the user's home directory, and another in the current directory. Neither of these need exist. During command-line argument processing, additional style files can be provided with the -s option. These style files support user-specific, directory-specific, and file-specific prettyprinting control.
The default name of the first two style files is system dependent: .texprettyrc (UNIX), texpty.ini (IBM PC DOS), and texpretty.ini (DEC VMS and OpenVMS).
The line length limit in style files is system-dependent, but guaranteed to be at least 1024 characters.
texpretty's formatting actions group control sequences into the following style classes:
Style class names, like TeX control sequences and LaTeX environments, are case sensitive. All of the ones recognized by texpretty must be spelled with lowercase letters.
The style file is expected to contain lines of the form:
    style-class : envname1 envname2 ... \command1 \command2 ...
Blank lines, leading and trailing whitespace, and text from the TeX comment character (%) to end of line, are ignored. Whitespace separates items, and can be omitted around the colon. There is no significance to the order of items on a line, or lines in the file, except that later settings can override earlier ones. The same style class name may occur on multiple lines.
For example, suppose you have defined new sectional division commands named \Kapitel and \Teil, a new tabular environment named SuperTabular, and two new display math environments named EasyMath and HardMath. Your style file might then look something like this:
    % additional texpretty style specifications
    % [02-Jun-1995]
    chapter : \Kapitel \Teil
    tabular : SuperTabular
    displaymath : EasyMath HardMath
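To show how little machinery the style file syntax needs, here is a minimal Python sketch of a reader for it. It follows the rules stated above (comments from % to end of line, optional whitespace around the colon, later settings overriding earlier ones), but it is only an illustration: the function name parse_style is hypothetical, and each whitespace-separated item is stored as written, with no special treatment of argument patterns.

```python
def parse_style(text):
    """Return a dict mapping each command or environment name to its style class."""
    classes = {}
    for raw in text.splitlines():
        line = raw.split("%", 1)[0].strip()   # drop comment and outer whitespace
        if not line:
            continue                          # blank or comment-only line
        style_class, _, items = line.partition(":")
        for name in items.split():
            classes[name] = style_class.strip()   # a later setting overrides
    return classes
```

Applied to the example style file above, this would map \Kapitel and \Teil to chapter, SuperTabular to tabular, and EasyMath and HardMath to displaymath.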
LaTeXinfo is, sadly, less widely used than TeXinfo; it supports most of the standard LaTeX commands, plus a few others: some additional sectioning and indexing commands, two comment-start macros (\c and \comment), a hypertext link macro (\node) and a menu environment in which line breaks are significant. Here is a suitable texpretty style file for LaTeXinfo files (there are additional LaTeXinfo control sequences not listed here, but they do not require any particular special formatting):
    % LaTeXinfo style file for texpretty
    % [08-Jun-1995]
    chapter : \unnumbered \unnumberedsec \unnumberedsubsec
    chapter : \unnumberedsubsubsec
    comment : \c \comment \node
    index : \cindex \cpindexbold \cpsubindex \findex
    index : \kindex \pindex \tindex \vindex
    newline-after : \* \br
    newline-before : \copyright \newindex \setfilename \synindex
    verbatim : ifinfo ignore menu
The \node macro can be handled by the comment class because all of its arguments follow on the same line, up to the end of the line. Menu environments are usually laid out neatly, because their formatting is preserved exactly in the ASCII output used for online info documentation; prettyprinting them in verbatim mode ensures that the layout will be retained. The \* macro will actually not be recognized in the current version of texpretty, because macro names definable in style files may contain only letters after the leading backslash. In this case, no harm will arise, since the default formatting of control sequences containing special characters is adequate.
There is no built-in support in texpretty for LaTeXinfo, because it has not achieved widespread use; however, the style file above should be sufficient for texpretty to prettyprint LaTeXinfo files correctly.
The last style class attached to command or environment name is the one that is used, so specifications in a command-line style file can override those in the current directory style file, and those in turn override settings from the home directory style file.
The -d and -t command-line options affect the prettyprinting of all commands in the math and tabular classes.
Don't use the -m math mode translation option if you specify the math class in a style file; if you do, those commands and environments will be renamed. When math mode translation is selected, it may also be advisable to specify the -q option, and avoid -s options, to eliminate all style file input.
For the purposes of matching TeX control words, texpretty assumes that they begin with a backslash followed by one or more letters or at-sign; the latter is commonly used inside macro packages to create command names that are supposed to be hidden from the end user. There is no provision in style files for modifying this assumption.
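The matching rule just described corresponds to a simple regular expression. The Python sketch below illustrates it; the names CONTROL_WORD and control_words are hypothetical, and the pattern covers only ASCII letters.

```python
import re

# A backslash followed by one or more letters or at-signs.
CONTROL_WORD = re.compile(r"\\[A-Za-z@]+")


def control_words(tex):
    """Return every control word found in tex, in order of appearance."""
    return CONTROL_WORD.findall(tex)
```

Under this rule \@startsection is a single control word, \alpha1 yields \alpha, and a control symbol such as \* is not matched at all.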
Occasionally, it may be desirable to have a control sequence and its arguments handled together as an indivisible unit. To support this, control sequences in style files may be followed by zero or more of the following patterns, in whatever order is required:
- *
- Match an optional literal asterisk; LaTeX uses this for variant forms.
- []
- Match an optional argument in balanced brackets (LaTeX).
- ""
- Match an optional argument in quotes (AmSTeX and LAmSTeX).
- ()
- Match a required argument in balanced parentheses (LaTeX).
- {}
- Match a required argument in balanced braces (any TeX).
- \
- Match an alphabetic control sequence (LAmSTeX).
These patterns are ignored for index, verb, and verbatim style classes, because they have their own specialized formatting requirements.
Here is a sample style file that illustrates the use of argument patterns:
    default : \makebox()[]{} % LaTeX
    list-item : \item"" % AmSTeX and LAmSTeX
    standalone : \Reset\ % LAmSTeX: e.g., \Reset \list
When argument patterns are processed, whitespace before and between arguments in the input stream is discarded as long as an argument match is found. Arguments themselves are copied verbatim, even if they include line breaks or comments. The only requirement is that braced, bracketed, or parenthesized arguments have balanced delimiters.
Control sequence name matching against style file specifications does not include any argument patterns, so if the same control sequence name is specified more than once in a style file, as in
    list-item : \myitem"" \myitem() \myitem[] \myitem{}
only the last one will be effective, in this case, a required braced argument. This should not normally be a serious limitation, because TeX control sequence definitions that include argument delimiter characters also have this behavior. However, it is possible with special programming to use one-character lookahead to distinguish between argument types, and LaTeX does this internally for optional bracketed arguments, and asterisked variants.
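The separation of a control sequence name from its argument patterns, and the resulting last-one-wins behavior, can be sketched in Python as follows. This is an illustration of the rule as described, not texpretty's code; the names split_item and effective_patterns are hypothetical.

```python
import re

# One argument pattern: *, [], "", (), {}, or a lone backslash.
PATTERN = r'\*|\[\]|""|\(\)|\{\}|\\'
ITEM = re.compile(r"(\\[A-Za-z@]+)((?:" + PATTERN + r")*)")


def split_item(item):
    """Split a style file item such as \\makebox()[]{} into (name, patterns)."""
    m = ITEM.fullmatch(item)
    if m is None:
        raise ValueError(f"not a control sequence with patterns: {item}")
    return m.group(1), re.findall(PATTERN, m.group(2))


def effective_patterns(items):
    """Key on the bare name only, so a later entry replaces an earlier one."""
    table = {}
    for item in items:
        name, patterns = split_item(item)
        table[name] = patterns
    return table
```

Feeding the four \myitem entries from the example to effective_patterns leaves a single entry for \myitem whose only pattern is {}, the required braced argument.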
User-defined control sequences for which whitespace is significant, or whose use is idiosyncratically formatted, will likely conflict with prettyprinting.
The plain TeX \obeylines and \obeyspaces commands, and the ETeX \obeywhitespace command, would be similarly mishandled, except that texpretty watches for them, and once they are seen, copies the remainder of the file in verbatim mode, effectively suppressing further prettyprinting. It has to do this, because it has no reliable way to detect the end of scope for these commands.
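The switch to verbatim copying can be pictured with the Python sketch below, which splits the input at the first of these commands. It is a hedged illustration: the function name split_at_trigger is hypothetical, and whether texpretty's verbatim copying begins exactly at the command itself is an assumption.

```python
import re

# Any of the three commands, not followed by a further letter or at-sign.
TRIGGER = re.compile(r"\\(?:obeylines|obeyspaces|obeywhitespace)(?![A-Za-z@])")


def split_at_trigger(tex):
    """Return (part to prettyprint, remainder to copy verbatim)."""
    m = TRIGGER.search(tex)
    if m is None:
        return tex, ""
    return tex[:m.start()], tex[m.start():]
```

Text without any of the three commands comes back whole in the first element, with an empty verbatim remainder.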
For a small number of control sequences, there are formatting conflicts between two or more macro packages. In such a case, preference is first given to LaTeX (and AmSLaTeX and SliTeX), then to AmSTeX, then to LAmSTeX, and finally to plain TeX and ETeX.
There are also cases in some of these macro packages where the same control sequence has environment-dependent meaning, so formatting irregularities may appear.
Nelson H. F. Beebe
Center for Scientific Computing
University of Utah
Department of Mathematics, 322 INSCC
155 S 1400 E RM 233
Salt Lake City, UT 84112-0090
USA
Email: beebe@math.utah.edu, beebe@acm.org, beebe@computer.org, beebe@ieee.org (Internet)
WWW URL: http://www.math.utah.edu/~beebe
Telephone: +1 801 581 5254
FAX: +1 801 585 1640, +1 801 581 4148
texpretty's master distribution can be found at
    ftp://ftp.math.utah.edu/pub/misc/
    http://www.math.utah.edu/pub/misc/
in the file texpretty-x.yz.tar.gz where x.yz is the current version. Additional distribution formats are usually available at the same location.
That site is mirrored to several other Internet archives, so you may also be able to find it elsewhere on the Internet; try searching for the string texpretty at one or more of the popular Web search sites, such as
    http://altavista.digital.com/
    http://search.microsoft.com/us/default.asp
    http://www.dejanews.com/
    http://www.dogpile.com/index.html
    http://www.euroseek.net/page?ifl=uk
    http://www.excite.com/
    http://www.go2net.com/search.html
    http://www.google.com/
    http://www.hotbot.com/
    http://www.infoseek.com/
    http://www.inktomi.com/
    http://www.lycos.com/
    http://www.northernlight.com/
    http://www.snap.com/
    http://www.stpt.com/
    http://www.yahoo.com/