% mmarticle.mm -- article about Malayalam TeX
% (c) 1993 Jeroen Hellingman
% last edit: 02-APR-1993

\input mmmacs
\input mmtrmacs

\def\q{\hfill\quad}
\def\os{\oldstyle}
\font\logo=logo10
\def\MF{{\logo METAFONT}}

\parindent=0pt

\def\today{\number\day\space\ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \space\number\year}

\headline={{\it preliminary version of \today \hfil Typesetting Malayalam with \TeX}}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\centerline{\twelvebf Typesetting Malayalam with \TeX}\bigskip
\centerline{Jeroen Hellingman}\bigskip

{\narrower
\noindent{\bf Abstract}\medskip

\noindent
Malayalam, a language spoken in the Indian state of Kerala, employs a
beautiful but complicated script. This script uses numerous ligatures
and places glyphs on top of each other and out of phonetic order. To be
able to typeset Malayalam, a \MF\ font containing all necessary letter
forms, a pre-processor, and a collection of \TeX\ macros were designed.
Together they form a package called Malayalam-\TeX. This package enables
the user to type Malayalam in an adaptable roman transliteration, which,
after being processed by the pre-processor, can be typeset with \TeX.
During the design and development of the package, several shortcomings
of the \TeX\ system became apparent, which made a straightforward
implementation of the Malayalam script difficult. Several additions to
\TeX\ would have made the pre-processor unnecessary. Still, \MF\ and
\TeX\ together proved to be a powerful tool that made the development of
the package an enjoyable task.\par}

\beginsection Introduction

The Malayalam language ($malayaaLam$), the first language of an
estimated {\os30}~million people in the South Indian state of Kerala,
uses its own script, which is characterised by curly shaped characters.
It is believed that these rounded shapes developed because the script
was written on palm leaves with a metal nib, which would have pricked
through the leaf if the script had contained sharp angles. The same
rounded shapes can be found in the other South Indian and Sri~Lankan
scripts, which are all closely related to the Malayalam script.

Through the centuries, Malayalam has been written in various scripts and
variations of scripts. Until a few centuries ago, the Malayalam language
was normally written in the Old Tamil or $^mappiLLa$ ($$^mappiLLa$$)
alphabet, also known as $vaTTezhuttu$ ($$vaTTezhuttu$$), the `rounded
characters'. This script, however, lacked many letters needed for
writing Sanskrit. Those letters became necessary for writing Malayalam
as the language absorbed many Sanskrit loanwords, which the old script
could render only ambiguously. To overcome those problems, the
{\os17}th~century poet $^tun~jettu ^ezhuttacchan_$
($$^tun~jettu ^ezhuttacchan_$$)\footnote*{Literally ``Father of
Letters.''} adopted the modern Malayalam script, the so-called
$^aaryavezhuttu$ ($$^aaryavezhuttu$$) or `Aryan characters', from the
Tamil Grantha alphabet to render Sanskrit poetry in Malayalam. Tamil
Grantha, in turn, was derived from the ancient Brahmi script. His
alphabet quickly became the standard for writing Malayalam. (However, a
variant of the old script, called $koolezhuttu$ ($$koolezhuttu$$),
remained in use well into this century for keeping records in the Raja's
palaces.) The Malayalam script still looks very similar to the Tamil
script, but, unlike Tamil, it uses complex conjunct characters, with
letters subscribed to others and out of phonetic order, as can be seen
in most Indian scripts.

With the introduction of the printing press in Kerala, sometime in the
{\os18}th~century, printing types for Malayalam were designed, modelled
after the then popular Latin Bodoni types, with large differences
between thin and thick strokes.
Printing Malayalam, however, proved to be quite complicated; in
traditional typography, several hundred distinct lead glyphs were
needed to typeset the language.

To ease learning and typesetting, the government of the state of Kerala
adopted a reformed script in {\os1974}. It abandons most of the
subscribed conjuncts and all the irregular vowel signs. Although this
script is now taught in all schools in Kerala, and newspapers are
printed in it, the original script is still in widespread use, and
mixtures of the two styles in one document can be seen regularly. In the
following section I will explain the reformed script.

\beginsection The Malayalam Script

Like all scripts derived from Brahmi, the Malayalam script is neither
purely syllabic nor purely alphabetic. The modern alphabet consists of
{\os51}~syllabic characters, {\os15}~for the pure vowel sounds, and
{\os36}~for the consonants followed by an `inherent'~$$a$$. To write a
syllable with another vowel, a special sign is applied to the normal
consonant character. When two consonants appear next to each other
without an intermediate vowel, they are joined to each other in some
way.

Traditionally, the Malayalam syllabary has been ordered, following the
Sanskrit order, on phonetic principles, putting related sounds together.
For the sounds not used in Sanskrit, new letters were added to the end
of the syllabary.

\noindent
The vowels are:\footnote*{The traditional alphabet also includes the
Sanskrit vowels $.r.r$, $.l$, and $.l.l$ for $$.r.r$$, $$.l$$, and
$$.l.l$$ respectively.}

\bigskip
\centerline{\vbox{\twelvemmc\halign{#\q&#\q&#\q&#\q&#\q&#\q&#\cr
$a$ & $aa$ & $i$ & $ii$ & $u$ & $uu$ & $.r$ \cr
$$a$$ & $$aa$$ & $$i$$ & $$ii$$ & $$u$$ & $$uu$$ & $$.r$$ \cr
$e$ & $ee$ & $ai$ & $o$ & $oo$ & $au$ \cr
$$e$$ & $$ee$$ & $$ai$$& $$o$$ & $$oo$$& $$au$$ \cr
}}}
\bigskip

\noindent
The modifiers $M$, $$M$$, and $H$, $$H$$, are normally listed after the
vowels.
They are called {\it anusvaram} and {\it visargam} and represent a nasal
sound and an aspiration respectively. The consonants are:

\bigskip
\centerline{\vbox{\twelvemmc\halign{#\q&#\q&#\q&#\q&#\q\q\q&#\q&#\q&#\q&#\cr
$ka$ & $kha$ & $ga$ & $gha$ & $n"a$ & $ya$ & $ra$ & $la$ & $va$ \cr
$$ka$$ & $$kha$$ & $$ga$$ & $$gha$$ & $$n"a$$ & $$ya$$ & $$ra$$ & $$la$$ & $$va$$ \cr
$ca$ & $cha$ & $ja$ & $jha$ & $n~a$ & $sha$ & $Sa$ & $sa$ \cr
$$ca$$ & $$cha$$ & $$ja$$ & $$jha$$ & $$n~a$$ & $$sha$$ & $$Sa$$ & $$sa$$ \cr
$Ta$ & $Tha$ & $Da$ & $Dha$ & $Na$ & $ha$ & $La$ & $zha$ & $Ra$ \cr
$$Ta$$ & $$Tha$$ & $$Da$$ & $$Dha$$ & $$Na$$ & $$ha$$ & $$La$$ & $$zha$$ & $$Ra$$ \cr
$ta$ & $tha$ & $da$ & $dha$ & $na$ \cr
$$ta$$ & $$tha$$ & $$da$$ & $$dha$$ & $$na$$ \cr
$pa$ & $pha$ & $ba$ & $bha$ & $ma$ \cr
$$pa$$ & $$pha$$ & $$ba$$ & $$bha$$ & $$ma$$ \cr
}}}
\bigskip

\noindent
When a vowel follows a consonant, it is written as a vowel sign; some of
these signs stand to the left of, or on both sides of, the consonant to
which they belong. This can be quite confusing when one starts learning
the script. Here the vowel signs are given with the letter $ka$,
{\it$$ka$$}:

\bigskip
\centerline{\vbox{\twelvemmc\halign{#\q&#\q&#\q&#\q&#\q&#\q&#\cr
$ka$ & $kaa$ & $ki$ & $kii$ & $ku$ & $kuu$ & $k.r$ \cr
$$ka$$ & $$kaa$$ & $$ki$$ & $$kii$$ & $$ku$$ & $$kuu$$ & $$k.r$$ \cr
$ke$ & $kee$ & $kai$ & $ko$ & $koo$ & $kau$ \cr
$$ke$$ & $$kee$$ & $$kai$$& $$ko$$ & $$koo$$& $$kau$$ \cr
}}}
\bigskip

\noindent
The biggest complexities arise when two consonants follow each other. In
that case there are several alternatives. First, the consonants can be
joined with each other to form a new conjunct, as in $jja$, $nta$, and
$NTa$; second, the second consonant can be written as a subscript to the
first, as in $ppa$ and $NNa$; third, a special symbol, $+$, called
$candrakala$, {\it$$candrakala$$}, can be used to indicate the absence
of the inherent vowel in a consonant.
This symbol is also used when a word ends in a consonant. Several letters form a ligature with this sign: $k$ for $ka+$, $N$ for $Na+$, $t$ for $ta+$ or $la+$, $n$ for $na+$, $m$ for $ma+$, $r$ for $ra+$, and $L$ for $La+$. Some consonants have a special sign when they occur as the last element of a consonant cluster: $ya$ becomes ${}<