.TH prep "" "" Command
.PC "Produce a word list"
\fBprep [ \-dfp ] [ \-i \fIifile \fB] [ \-o \fIofile \fB] [ \fIfile ... \fB]\fR
.PP
.HS
.SH Options:
.IC \fB\-d\fR
Print number of each word output
.IC \fB\-f\fR
Fold upper case to lower case
.IC "\fB\-i \fIfile\fR"
Ignore all words in \fIfile\fR on output
.IC "\fB\-o \fIfile\fR"
Output only words from \fIfile\fR
.IC \fB\-p\fR
Print punctuation marks on separate lines (not numbered)
.Pp
Text is taken from each input \fIfile\fR or stdin if none.
Words consist of alphabetic characters and apostrophes.
.HE
The command
.B prep
prepares a word list that is useful for statistical processing
from the textual data found in each input
.IR file .
If no
.I file
is given,
.B prep
reads the standard input for text.
.PP
For the purposes of
.BR prep ,
a word consists of a string of alphabetic letters and apostrophes.
Words are written, one per line, to the standard output.
Hyphenated words are treated as two words.
However, any word hyphenated between two lines is rejoined as one word.
.PP
.B prep
recognizes the following options:
.RS
.IP \fB\-d\fR
Print a sequence number (of words in the input text)
before each output word.
.IP \fB\-f\fR
Fold upper-case letters into lower case.
This is sometimes useful for producing unique lists of words.
.IP "\fB\-i \fIifile\fR"
Ignore words found in
.IR ifile .
.I ifile
has words one per line that are matched against each
input word, independent of case.
.IP "\fB\-o \fIofile\fR"
Print only words found in
.IR ofile .
Only one of
.B \-i
or
.B \-o
may be specified.
.IP \fB\-p\fR
In addition to printing words,
also print each punctuation
character (printable, non-numeric characters
that separate words), one per line.
These lines are not counted for
.B \-d.
.RE
.SH "See Also"
.Xr "commands," commands
.Xr "deroff," deroff
.Xr "ksh," ksh
.Xr "sh," sh
.Xr "sort," sort
.Xr "spell," spell
.Xr "typo," typo
.Xr "wc" wc
.SH Notes
What constitutes a
.I word
is different in
.BR deroff ,
.BR prep ,
and
.BR wc .
