This is guile-ref.info, produced by Makeinfo version 3.12a from guile-ref.texi. INFO-DIR-SECTION Scheme Programming START-INFO-DIR-ENTRY * guile-ref: (guile-ref). The Guile Reference Manual. END-INFO-DIR-ENTRY Guile Reference Manual Copyright (C) 1996 Free Software Foundation Copyright (C) 1997 Free Software Foundation Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Free Software Foundation.  File: guile-ref.info, Node: Characters, Next: Property Lists, Prev: Strings, Up: Top Characters ********** Most of the characters in the ASCII character set may be referred to by name: for example, `#\tab', `#\esc', `#\stx', and so on. The following table describes the ASCII names for each character. 0 = `#\nul§ 1 = `#\soh§ 2 = `#\stx§ 3 = `#\etx§ 4 = `#\eot§ 5 = `#\enq§ 6 = `#\ack§ 7 = `#\bel§ 8 = `#\bs§ 9 = `#\ht§ 10 = `#\nl§ 11 = `#\vt§ 12 = `#\np§ 13 = `#\cr§ 14 = `#\so§ 15 = `#\si§ 16 = `#\dle§ 17 = `#\dc1§ 18 = `#\dc2§ 19 = `#\dc3§ 20 = `#\dc4§ 21 = `#\nak§ 22 = `#\syn§ 23 = `#\etb§ 24 = `#\can§ 25 = `#\em§ 26 = `#\sub§ 27 = `#\esc§ 28 = `#\fs§ 29 = `#\gs§ 30 = `#\rs§ 31 = `#\us§ 32 = `#\sp§ The `delete' character (octal 177) may be referred to with the name `#\del'. Several characters have more than one name: * #\space, #\sp * #\newline, #\nl * #\tab, #\ht * #\backspace, #\bs * #\return, #\cr * #\page, #\np * #\null, #\nul  File: guile-ref.info, Node: Property Lists, Next: Ports, Prev: Characters, Up: Top Property Lists ************** Every object in the system can have a "property list" that may be used for information about that object. For example, a function may have a property list that includes information about the source file in which it is defined. Property lists are implemented as assq lists (*note Association Lists::.). Currently, property lists are implemented differently for procedures and closures than for other kinds of objects. Therefore, when manipulating a property list associated with a procedure object, use the `procedure' functions; otherwise, use the `object' functions. -- primitive: object-properties obj -- primitive: procedure-properties obj Return OBJ's property list. -- primitive: set-object-properties! obj alist -- primitive: set-procedure-properties! obj alist Set OBJ's property list to ALIST. -- primitive: object-property obj key -- primitive: procedure-property obj key Return the property of OBJ with name KEY. -- primitive: set-object-property! obj key value -- primitive: set-procedure-property! obj key value In OBJ's property list, set the property named KEY to VALUE. [Interface bug: there should be a second level of interface in which the user provides a "property table" that is possibly private.]  File: guile-ref.info, Node: Ports, Next: Bitwise Operations, Prev: Property Lists, Up: Top Ports ***** Input and output devices in Scheme are represented by "ports". All input and output in Scheme programs is accomplished by operating on a port: characters are read from an input port and written to an output port. This chapter explains the operations that Guile provides for working with ports. The formal definition of a port is very generic: an input port is simply ``an object which can deliver characters on command,'' and an output port is ``an object which can accept characters.'' Because this definition is so loose, it is easy to write functions that simulate ports in software. "Soft ports" and "string ports" are two interesting and powerful examples of this technique. * Menu: * File Ports:: * Soft Ports:: * String Ports:: * Closing Ports:: * Miscellaneous: Port Operations. * Extended I/O (fseek and line read/write)::  File: guile-ref.info, Node: File Ports, Next: Soft Ports, Up: Ports File Ports ========== The following procedures are used to open file ports. See also *Note open: Ports and File Descriptors, for an interface to the Unix `open' system call. -- primitive: open-file string mode Open the file whose name is STRING, and return a port representing that file. Whether an input or output port is allocated is determined by the MODE string. This is interpreted in the traditional Unix manner: use `r' for reading, `w' for writing, and `a' for appending. In addition, `0' can be used to specifiy an unbuffered port. See the `stdio' documentation for your system for more I/O mode options. If a file cannot be opened, `open-file' throws an exception. -- procedure: open-input-file filename Open FILENAME for input. Equivalent to (open-file FILENAME "r") -- procedure: open-output-file filename Open FILENAME for output. Equivalent to (open-file FILENAME "w") -- primitive: port-mode port Return the mode flags from the open PORT. -- primitive: port-filename [port] Return the filename associated with PORT. This function returns the strings "standard input", "standard output" and "standard error" when called on the current input, output and error ports respectively. -- primitive: set-port-filename! [port] filename Change the filename associated with PORT, using the current input port if none is specified. Note that this does not change the port's source of data, but only the value that is returned by `port-filename' and reported in diagnostic output. -- primitive: %make-void-port mode Create and return a new void port. The MODE argument describes the permissions to use for this port; for a description, see the documentation for `open-file' in *Note File Ports::.  File: guile-ref.info, Node: Soft Ports, Next: String Ports, Prev: File Ports, Up: Ports Soft Ports ========== A "soft-port" is a port based on a vector of procedures capable of accepting or delivering characters. It allows emulation of I/O ports. -- primitive: make-soft-port vector modes Returns a port capable of receiving or delivering characters as specified by the MODES string (*note open-file: File Ports.). VECTOR must be a vector of length 6. Its components are as follows: 0. procedure accepting one character for output 1. procedure accepting a string for output 2. thunk for flushing output 3. thunk for getting one character 4. thunk for closing port (not by garbage collection) For an output-only port only elements 0, 1, 2, and 4 need be procedures. For an input-only port only elements 3 and 4 need be procedures. Thunks 2 and 4 can instead be `#f' if there is no useful operation for them to perform. If thunk 3 returns `#f' or an `eof-object' (*note eof-object?: (r4rs)Input.) it indicates that the port has reached end-of-file. For example: (define stdout (current-output-port)) (define p (make-soft-port (vector (lambda (c) (write c stdout)) (lambda (s) (display s stdout)) (lambda () (display "." stdout)) (lambda () (char-upcase (read-char))) (lambda () (display "@" stdout))) "rw")) (write p p) => #  File: guile-ref.info, Node: String Ports, Next: Closing Ports, Prev: Soft Ports, Up: Ports String Ports ============ -- primitive: call-with-output-string proc PROC must be a procedure of one argument. This procedure calls PROC with one argument: a (newly created) output port. When the function returns, the string composed of the characters written into the port is returned. -- primitive: call-with-input-string string proc PROC must be a procedure of one argument. This procedure calls PROC with one argument: an (newly created) input port from which STRING's contents may be read. When PROC returns, the port is closed and the value yielded by the procedure PROC is returned. -- primitive: eval-string string Evaluate STRING as the text representation of a Scheme form or forms, and return whatever value they produce. [FIXME: figure out where this should go... eval-string is not really closely related to other string port functions. It just happens to appear in strports.c because it uses a string port to read from the string. -twp]  File: guile-ref.info, Node: Closing Ports, Next: Port Operations, Prev: String Ports, Up: Ports Closing Ports ============= The following procedures return `#t' if they successfully close a port or `#f' if it was already closed. They can also raise exceptions if an error occurs: some errors arising from writing output may be delayed until close. See also *Note close: Ports and File Descriptors, for a procedure which can close file descriptors. -- primitive: close-port port Close the specified port object. -- primitive: close-input-port port Close the specified input port object. This procedure is required by R4RS. -- primitive: close-output-port port Close the specified output port object. This procedure is required by R4RS.  File: guile-ref.info, Node: Port Operations, Next: Extended I/O (fseek and line read/write), Prev: Closing Ports, Up: Ports Port Operations =============== These procedures obtain and modify information about ports, but are not specific to one kind of port. -- primitive: current-error-port Return the port to which errors and warnings should be sent (the "standard error" in Unix and C terminology). -- primitive: set-current-input-port port -- primitive: set-current-output-port port -- primitive: set-current-error-port port Change the ports returned by `current-input-port', `current-output-port' and `current-error-port', respectively, so that they use the supplied PORT for input or output. -- primitive: port-column [input-port] -- primitive: port-line [input-port] Return the current column number or line number of INPUT-PORT, using the current input port if none is specified. If the number is unknown, the result is #f. Otherwise, the result is a 0-origin integer - i.e. the first character of the first line is line 0, column 0. (However, when you display a file position, for example in an error message, we recommand you add 1 to get 1-origin integers. This is because lines and column numbers traditionally start with 1, and that is what non-programmers will find most natural.) -- primitive: set-port-column! [input-port] column -- primitive: set-port-line! [input-port] line Set the current column or line number of INPUT-PORT, using the current input port if none is specified.  File: guile-ref.info, Node: Extended I/O (fseek and line read/write), Prev: Port Operations, Up: Ports Extended I/O (fseek and line read/write) ======================================== Extended I/O procedures are available which read or write lines of text, read text delimited by a specified set of characters, or report or set the current position of a port. Interfaces to `read'/`fread' and `write'/`fwrite' are also available, as `uniform-array-read!' and `uniform-array-write!', *Note Uniform Array::. -- procedure: read-line [port] [handle-delim] Return a line of text from PORT if specified, otherwise from the value returned by `(current-input-port)'. Under Unix, a line of text is terminated by the first end-of-line character or by end-of-file. If HANDLE-DELIM is specified, it should be one of the following symbols: `trim' Discard the terminating delimiter. This is the default, but it will be impossible to tell whether the read terminated with a delimiter or end-of-file. `concat' Append the terminating delimiter (if any) to the returned string. `peek' Push the terminating delimiter (if any) back on to the port. `split' Return a pair containing the string read from the port and the terminating delimiter or end-of-file object. NOTE: if the scsh module is loaded then multiple values are returned instead of a pair. -- procedure: read-line! buf [port] Read a line of text into the supplied string BUF and return the number of characters added to BUF. If BUF is filled, then `#f' is returned. Read from PORT if specified, otherwise from the value returned by `(current-input-port)'. -- procedure: read-delimited delims [port] [handle-delim] Read text until one of the characters in the string DELIMS is found or end-of-file is reached. Read from PORT if supplied, otherwise from the value returned by `(current-input-port)'. HANDLE-DELIM takes the same values as described for `read-line'. NOTE: if the scsh module is loaded then DELIMS must be an scsh char-set, not a string. -- procedure: read-delimited! delims buf [port] [handle-delim] [start] [end] Read text into the supplied string BUF and return the number of characters added to BUF (subject to HANDLE-DELIM, which takes the same values specified for `read-line'. If BUF is filled, `#f' is returned for both the number of characters read and the delimiter. Also terminates if one of the characters in the string DELIMS is found or end-of-file is reached. Read from PORT if supplied, otherwise from the value returned by `(current-input-port)'. NOTE: if the scsh module is loaded then DELIMS must be an scsh char-set, not a string. -- primitive: write-line obj [port] Display OBJ and a newline character to PORT. If PORT is not specified, `(current-output-port)' is used. This function is equivalent to: (display obj [port]) (newline [port]) -- primitive: ftell fd/port Returns an integer representing the current position of FD/PORT, measured from the beginning. If FD/PORT is a file descriptor, the underlying system call is `lseek'. -- primitive: fseek fd/port offset whence Sets the current position of FD/PORT to the integer OFFSET, which is interpreted according to the value of WHENCE. One of the following variables should be supplied for WHENCE: -- Variable: SEEK_SET Seek from the beginning of the file. -- Variable: SEEK_CUR Seek from the current position. -- Variable: SEEK_END Seek from the end of the file. If FD/PORT is a file descriptor the underlying system call is `lseek'. The return value is unspecified. Some of the abovementioned I/O functions rely on the following C primitives. These will mainly be of interest to people hacking Guile internals. -- primitive: %read-delimited! delims buf gobble? [port [start [end]]] Read characters from PORT into BUF until one of the characters in the DELIMS string is encountered. If GOBBLE? is true, store the delimiter character in BUF as well; otherwise, discard it. If PORT is not specified, use the value of `(current-input-port)'. If START or END are specified, store data only into the substring of BUF bounded by START and END (which default to the beginning and end of the buffer, respectively). Return a pair consisting of the delimiter that terminated the string and the number of characters read. If reading stopped at the end of file, the delimiter returned is the EOF-OBJECT; if the buffer was filled without encountering a delimiter, this value is #F. -- primitive: %read-line [port] Read a newline-terminated line from PORT, allocating storage as necessary. The newline terminator (if any) is removed from the string, and a pair consisting of the line and its delimiter is returned. The delimiter may be either a newline or the EOF-OBJECT; if `%read-line' is called at the end of file, it returns the pair `(# . #)'.  File: guile-ref.info, Node: Bitwise Operations, Next: Regular Expressions, Prev: Ports, Up: Top Bitwise Operations ****************** -- primitive: logand n1 n2 Returns the integer which is the bit-wise AND of the two integer arguments. Example: (number->string (logand #b1100 #b1010) 2) => "1000" -- primitive: logior n1 n2 Returns the integer which is the bit-wise OR of the two integer arguments. Example: (number->string (logior #b1100 #b1010) 2) => "1110" -- primitive: logxor n1 n2 Returns the integer which is the bit-wise XOR of the two integer arguments. Example: (number->string (logxor #b1100 #b1010) 2) => "110" -- primitive: lognot n Returns the integer which is the 2s-complement of the integer argument. Example: (number->string (lognot #b10000000) 2) => "-10000001" (number->string (lognot #b0) 2) => "-1" -- primitive: logtest j k (logtest j k) == (not (zero? (logand j k))) (logtest #b0100 #b1011) => #f (logtest #b0100 #b0111) => #t -- primitive: logbit? index j (logbit? index j) == (logtest (integer-expt 2 index) j) (logbit? 0 #b1101) => #t (logbit? 1 #b1101) => #f (logbit? 2 #b1101) => #t (logbit? 3 #b1101) => #t (logbit? 4 #b1101) => #f -- primitive: ash int count Returns an integer equivalent to `(inexact->exact (floor (* INT (expt 2 COUNT))))'. Example: (number->string (ash #b1 3) 2) => "1000" (number->string (ash #b1010 -1) 2) => "101" -- primitive: logcount n Returns the number of bits in integer N. If integer is positive, the 1-bits in its binary representation are counted. If negative, the 0-bits in its two's-complement binary representation are counted. If 0, 0 is returned. Example: (logcount #b10101010) => 4 (logcount 0) => 0 (logcount -2) => 1 -- primitive: integer-length n Returns the number of bits neccessary to represent N. Example: (integer-length #b10101010) => 8 (integer-length 0) => 0 (integer-length #b1111) => 4 -- primitive: integer-expt n k Returns N raised to the non-negative integer exponent K. Example: (integer-expt 2 5) => 32 (integer-expt -3 3) => -27 -- primitive: bit-extract n start end Returns the integer composed of the START (inclusive) through END (exclusive) bits of N. The STARTth bit becomes the 0-th bit in the result. Example: (number->string (bit-extract #b1101101010 0 4) 2) => "1010" (number->string (bit-extract #b1101101010 4 9) 2) => "10110"  File: guile-ref.info, Node: Regular Expressions, Next: Keywords, Prev: Bitwise Operations, Up: Top Regular Expressions ******************* A "regular expression" (or "regexp") is a pattern that describes a whole class of strings. A full description of regular expressions and their syntax is beyond the scope of this manual; an introduction can be found in the Emacs manual (*note Syntax of Regular Expressions: (emacs)Regexps., or in many general Unix reference books. If your system does not include a POSIX regular expression library, and you have not linked Guile with a third-party regexp library such as Rx, these functions will not be available. You can tell whether your Guile installation includes regular expression support by checking whether the `*features*' list includes the `regex' symbol. * Menu: * Regexp Functions:: Functions that create and match regexps. * Match Structures:: Finding what was matched by a regexp. * Backslash Escapes:: Removing the special meaning of regexp metacharacters. * Rx Interface:: Tom Lord's Rx library does things differently. [FIXME: it may be useful to include an Examples section. Parts of this interface are bewildering on first glance.]  File: guile-ref.info, Node: Regexp Functions, Next: Match Structures, Up: Regular Expressions Regexp Functions ================ By default, Guile supports POSIX extended regular expressions. That means that the characters `(', `)', `+' and `?' are special, and must be escaped if you wish to match the literal characters. This regular expression interface was modeled after that implemented by SCSH, the Scheme Shell. It is intended to be upwardly compatible with SCSH regular expressions. -- procedure: string-match pattern str [start] Compile the string PATTERN into a regular expression and compare it with STR. The optional numeric argument START specifies the position of STR at which to begin matching. `string-match' returns a "match structure" which describes what, if anything, was matched by the regular expression. *Note Match Structures::. If STR does not match PATTERN at all, `string-match' returns `#f'. Each time `string-match' is called, it must compile its PATTERN argument into a regular expression structure. This operation is expensive, which makes `string-match' inefficient if the same regular expression is used several times (for example, in a loop). For better performance, you can compile a regular expression in advance and then match strings against the compiled regexp. -- primitive: make-regexp str [flag ...] Compile the regular expression described by STR, and return the compiled regexp structure. If STR does not describe a legal regular expression, `make-regexp' throws a `regular-expression-syntax' error. The FLAG arguments change the behavior of the compiled regexp. The following flags may be supplied: `regexp/icase' Consider uppercase and lowercase letters to be the same when matching. `regexp/newline' If a newline appears in the target string, then permit the `^' and `$' operators to match immediately after or immediately before the newline, respectively. Also, the `.' and `[^...]' operators will never match a newline character. The intent of this flag is to treat the target string as a buffer containing many lines of text, and the regular expression as a pattern that may match a single one of those lines. `regexp/basic' Compile a basic (``obsolete'') regexp instead of the extended (``modern'') regexps that are the default. Basic regexps do not consider `|', `+' or `?' to be special characters, and require the `{...}' and `(...)' metacharacters to be backslash-escaped (*note Backslash Escapes::.). There are several other differences between basic and extended regular expressions, but these are the most significant. `regexp/extended' Compile an extended regular expression rather than a basic regexp. This is the default behavior; this flag will not usually be needed. If a call to `make-regexp' includes both `regexp/basic' and `regexp/extended' flags, the one which comes last will override the earlier one. -- primitive: regexp-exec regexp str [start] Match the compiled regular expression REGEXP against `str'. If the optional integer START argument is provided, begin matching from that position in the string. Return a match structure describing the results of the match, or `#f' if no match could be found. -- primitive: regexp? obj Return `#t' if OBJ is a compiled regular expression, or `#f' otherwise. Regular expressions are commonly used to find patterns in one string and replace them with the contents of another string. -- procedure: regexp-substitute port match [item...] Write to the output port PORT selected contents of the match structure MATCH. Each ITEM specifies what should be written, and may be one of the following arguments: * A string. String arguments are written out verbatim. * An integer. The submatch with that number is written. * The symbol `pre'. The portion of the matched string preceding the regexp match is written. * The symbol `post'. The portion of the matched string following the regexp match is written. PORT may be `#f', in which case nothing is written; instead, `regexp-substitute' constructs a string from the specified ITEMs and returns that. -- procedure: regexp-substitute/global port regexp target [item...] Similar to `regexp-substitute', but can be used to perform global substitutions on STR. Instead of taking a match structure as an argument, `regexp-substitute/global' takes two string arguments: a REGEXP string describing a regular expression, and a TARGET string which should be matched against this regular expression. Each ITEM behaves as in REGEXP-SUBSTITUTE, with the following exceptions: * A function may be supplied. When this function is called, it will be passed one argument: a match structure for a given regular expression match. It should return a string to be written out to PORT. * The `post' symbol causes `regexp-substitute/global' to recurse on the unmatched portion of STR. This _must_ be supplied in order to perform global search-and-replace on STR; if it is not present among the ITEMs, then `regexp-substitute/global' will return after processing a single match.  File: guile-ref.info, Node: Match Structures, Next: Backslash Escapes, Prev: Regexp Functions, Up: Regular Expressions Match Structures ================ A "match structure" is the object returned by `string-match' and `regexp-exec'. It describes which portion of a string, if any, matched the given regular expression. Match structures include: a reference to the string that was checked for matches; the starting and ending positions of the regexp match; and, if the regexp included any parenthesized subexpressions, the starting and ending positions of each submatch. In each of the regexp match functions described below, the `match' argument must be a match structure returned by a previous call to `string-match' or `regexp-exec'. Most of these functions return some information about the original target string that was matched against a regular expression; we will call that string TARGET for easy reference. -- procedure: regexp-match? obj Return `#t' if OBJ is a match structure returned by a previous call to `regexp-exec', or `#f' otherwise. -- procedure: match:substring match [n] Return the portion of TARGET matched by subexpression number N. Submatch 0 (the default) represents the entire regexp match. If the regular expression as a whole matched, but the subexpression number N did not match, return `#f'. -- procedure: match:start match [n] Return the starting position of submatch number N. -- procedure: match:end match [n] Return the ending position of submatch number N. -- procedure: match:prefix match Return the unmatched portion of TARGET preceding the regexp match. -- procedure: match:suffix match Return the unmatched portion of TARGET following the regexp match. -- procedure: match:count match Return the number of parenthesized subexpressions from MATCH. Note that the entire regular expression match itself counts as a subexpression, and failed submatches are included in the count. -- procedure: match:string match Return the original TARGET string.  File: guile-ref.info, Node: Backslash Escapes, Next: Rx Interface, Prev: Match Structures, Up: Regular Expressions Backslash Escapes ================= Sometimes you will want a regexp to match characters like `*' or `$' exactly. For example, to check whether a particular string represents a menu entry from an Info node, it would be useful to match it against a regexp like `^* [^:]*::'. However, this won't work; because the asterisk is a metacharacter, it won't match the `*' at the beginning of the string. In this case, we want to make the first asterisk un-magic. You can do this by preceding the metacharacter with a backslash character `\'. (This is also called "quoting" the metacharacter, and is known as a "backslash escape".) When Guile sees a backslash in a regular expression, it considers the following glyph to be an ordinary character, no matter what special meaning it would ordinarily have. Therefore, we can make the above example work by changing the regexp to `^\* [^:]*::'. The `\*' sequence tells the regular expression engine to match only a single asterisk in the target string. Since the backslash is itself a metacharacter, you may force a regexp to match a backslash in the target string by preceding the backslash with itself. For example, to find variable references in a TeX program, you might want to find occurrences of the string `\let\' followed by any number of alphabetic characters. The regular expression `\\let\\[A-Za-z]*' would do this: the double backslashes in the regexp each match a single backslash in the target string. -- procedure: regexp-quote str Quote each special character found in STR with a backslash, and return the resulting string. *Very important:* Using backslash escapes in Guile source code (as in Emacs Lisp or C) can be tricky, because the backslash character has special meaning for the Guile reader. For example, if Guile encounters the character sequence `\n' in the middle of a string while processing Scheme code, it replaces those characters with a newline character. Similarly, the character sequence `\t' is replaced by a horizontal tab. Several of these "escape sequences" are processed by the Guile reader before your code is executed. Unrecognized escape sequences are ignored: if the characters `\*' appear in a string, they will be translated to the single character `*'. This translation is obviously undesirable for regular expressions, since we want to be able to include backslashes in a string in order to escape regexp metacharacters. Therefore, to make sure that a backslash is preserved in a string in your Guile program, you must use _two_ consecutive backslashes: (define Info-menu-entry-pattern (make-regexp "^\\* [^:]*")) The string in this example is preprocessed by the Guile reader before any code is executed. The resulting argument to `make-regexp' is the string `^\* [^:]*', which is what we really want. This also means that in order to write a regular expression that matches a single backslash character, the regular expression string in the source code must include _four_ backslashes. Each consecutive pair of backslashes gets translated by the Guile reader to a single backslash, and the resulting double-backslash is interpreted by the regexp engine as matching a single backslash character. Hence: (define tex-variable-pattern (make-regexp "\\\\let\\\\=[A-Za-z]*")) The reason for the unwieldiness of this syntax is historical. Both regular expression pattern matchers and Unix string processing systems have traditionally used backslashes with the special meanings described above. The POSIX regular expression specification and ANSI C standard both require these semantics. Attempting to abandon either convention would cause other kinds of compatibility problems, possibly more severe ones. Therefore, without extending the Scheme reader to support strings with different quoting conventions (an ungainly and confusing extension when implemented in other languages), we must adhere to this cumbersome escape syntax.  File: guile-ref.info, Node: Rx Interface, Prev: Backslash Escapes, Up: Regular Expressions Rx Interface ============ [FIXME: this is taken from Gary and Mark's quick summaries and should be reviewed and expanded. Rx is pretty stable, so could already be done!] Guile includes an interface to Tom Lord's Rx library (currently only to POSIX regular expressions). Use of the library requires a two step process: compile a regular expression into an efficient structure, then use the structure in any number of string comparisons. For example, given the regular expression `abc.' (which matches any string containing `abc' followed by any single character): guile> (define r (regcomp "abc.")) guile> r # guile> (regexec r "abc") #f guile> (regexec r "abcd") #((0 . 4)) guile> The definitions of `regcomp' and `regexec' are as follows: -- primitive: regcomp pattern [flags] Compile the regular expression pattern using POSIX rules. Flags is optional and should be specified using symbolic names: -- Variable: REG_EXTENDED use extended POSIX syntax -- Variable: REG_ICASE use case-insensitive matching -- Variable: REG_NEWLINE allow anchors to match after newline characters in the string and prevents `.' or `[^...]' from matching newlines. The `logior' procedure can be used to combine multiple flags. The default is to use POSIX basic syntax, which makes `+' and `?' literals and `\+' and `\?' operators. Backslashes in PATTERN must be escaped if specified in a literal string e.g., `"\\(a\\)\\?"'. -- primitive: regexec regex string [match-pick] [flags] Match STRING against the compiled POSIX regular expression REGEX. MATCH-PICK and FLAGS are optional. Possible flags (which can be combined using the logior procedure) are: -- Variable: REG_NOTBOL The beginning of line operator won't match the beginning of STRING (presumably because it's not the beginning of a line) -- Variable: REG_NOTEOL Similar to REG_NOTBOL, but prevents the end of line operator from matching the end of STRING. If no match is possible, regexec returns #f. Otherwise MATCH-PICK determines the return value: `#t' or unspecified: a newly-allocated vector is returned, containing pairs with the indices of the matched part of STRING and any substrings. `""': a list is returned: the first element contains a nested list with the matched part of STRING surrounded by the the unmatched parts. Remaining elements are matched substrings (if any). All returned substrings share memory with STRING. `#f': regexec returns #t if a match is made, otherwise #f. vector: the supplied vector is returned, with the first element replaced by a pair containing the indices of the matched portion of STRING and further elements replaced by pairs containing the indices of matched substrings (if any). list: a list will be returned, with each member of the list specified by a code in the corresponding position of the supplied list: a number: the numbered matching substring (0 for the entire match). `#\<': the beginning of STRING to the beginning of the part matched by regex. `#\>': the end of the matched part of STRING to the end of STRING. `#\c': the "final tag", which seems to be associated with the "cut operator", which doesn't seem to be available through the posix interface. e.g., `(list #\< 0 1 #\>)'. The returned substrings share memory with STRING. Here are some other procedures that might be used when using regular expressions: -- primitive: compiled-regexp? obj Test whether obj is a compiled regular expression. -- primitive: regexp->dfa regex [flags] -- primitive: dfa-fork dfa -- primitive: reset-dfa! dfa -- primitive: dfa-final-tag dfa -- primitive: dfa-continuable? dfa -- primitive: advance-dfa! dfa string  File: guile-ref.info, Node: Keywords, Next: Exceptions, Prev: Regular Expressions, Up: Top Keywords ******** [FIXME: need some initial schmooze about keywords; this is all taken from the NEWS file, and is accurate but not very useful to someone who has not used keywords before.] Guile supports a new R4RS--compliant syntax for keywords. A token of the form #:NAME, where NAME has the same syntax as a Scheme symbol, is the external representation of the keyword named NAME. Keyword objects print using this syntax as well, so values containing keyword objects can be read back into Guile. When used in an expression, keywords are self-quoting objects. Guile suports this read syntax, and uses this print syntax, regardless of the current setting of the `keyword' read option. The `keyword' read option only controls whether Guile recognizes the `:NAME' syntax, which is incompatible with R4RS. (R4RS says such token represent symbols.) The default behaviour is the R4RS--compliant one (`#:xxx' instead of `:xxx'). To change between the two keyword syntaxes you use the `read-options' procedure documented in *Note General option interface:: and *Note Reader options::. To use the `:xxx' keyword syntax, use (read-set! keywords 'prefix) To make keyword syntax R4RS compliant, with the `#:xxx' syntax, use: (read-set! keywords #f) [FIXME: in state of flux right now; here are the Scheme primitives defined in `kw.c':] -- primitive: keyword? kw `keyword?' returns `#t' if the argument KW is a keyword; it returns `#f' otherwise. -- primitive: keyword-dash-symbol arg `keyword-dash-symbol' [FIXME: have no idea what this does; it is not commented.]  File: guile-ref.info, Node: Exceptions, Next: Modules, Prev: Keywords, Up: Top Exceptions ********** -- primitive: catch key thunk handler Invoke THUNK in the dynamic context of HANDLER for exceptions matching KEY. If thunk throws to the symbol KEY, then HANDLER is invoked this way: (handler key args ...) KEY is a symbol or #t. THUNK takes no arguments. If THUNK returns normally, that is the return value of `catch'. Handler is invoked outside the scope of its own `catch'. If HANDLER again throws to the same key, a new handler from further up the call chain is invoked. If the key is `#t', then a throw to _any_ symbol will match this call to `catch'. -- primitive: throw key &rest args ... Invoke the catch form matching KEY, passing ARGS to the HANDLER. KEY is a symbol. It will match catches of the same symbol or of #t. If there is no handler at all, an error is signaled. -- procedure: error msg args ... Raise an error with key `misc-error' and a message constructed by displaying MSG and writing ARGS. -- primitive: scm-error key subr message args data Raise an error with key KEY. SUBR can be a string naming the procedure associated with the error, or `#f'. MESSAGE is the error message string, possibly containing `%S' and `%s' escapes. When an error is reported, these are replaced by formating the corresponding members of ARGS: `%s' formats using `display' and `%S' formats using `write'. DATA is a list or `#f' depending on KEY: if KEY is `system-error' then it should be a list containing the Unix `errno' value; If KEY is `signal' then it should be a list containing the Unix signal number; otherwise it will usually be `#f'. -- primitive: strerror errno Returns the Unix error message corresponding to ERRNO, an integer. -- syntax: false-if-exception expr Returns the result of evaluating its argument; however if an exception occurs then `#f' is returned instead. It is traditional in Scheme to implement exception systems using `call-with-current-continuation', but his has not been done, for performance reasons. The implementation of `call-with-current-continuation' is a stack copying implementation. This allows it to interact well with ordinary C code. Unfortunately, a stack-copying implementation can be slow -- creating a new continuation involves a block copy of the stack. Instead of using `call-with-current-continuation', the exception primitives are implemented as built-ins that take advantage of the _upward only_ nature of exceptions.  File: guile-ref.info, Node: Modules, Next: Module Internals, Prev: Exceptions, Up: Top Modules ******* [FIXME: somewhat babbling; should be reviewed by someone who understands modules, once the new module system is in place] When programs become large, naming conflicts can occur when a function or global variable defined in one file has the same name as a function or global variable in another file. Even just a _similarity_ between function names can cause hard-to-find bugs, since a programmer might type the wrong function name. The approach used to tackle this problem is called _information encapsulation_, which consists of packaging functional units into a given name space that is clearly separated from other name spaces. The language features that allow this are usually called _the module system_ because programs are broken up into modules that are compiled separately (or loaded separately in an interpreter). Older languages, like C, have limited support for name space manipulation and protection. In C a variable or function is public by default, and can be made local to a module with the `static' keyword. But you cannot reference public variables and functions from another module with different names. More advanced module systems have become a common feature in recently designed languages: ML, Python, Perl, and Modula 3 all allow the _renaming_ of objects from a foreign module, so they will not clutter the global name space. * Menu: * Scheme and modules:: * The Guile module system::  File: guile-ref.info, Node: Scheme and modules, Next: The Guile module system, Up: Modules Scheme and modules ================== Scheme, as defined in R4RS, does _not_ have a module system at all. Aubrey Jaffer, mostly to support his portable Scheme library SLIB, implemented a provide/require mechanism for many Scheme implementations. Library files in SLIB _provide_ a feature, and when user programs _require_ that feature, the library file is loaded in. For example, the file `random.scm' in the SLIB package contains the line (provide 'random) so to use its procedures, a user would type (require 'random) and they would magically become available, _but still have the same names!_ So this method is nice, but not as good as a full-featured module system.  File: guile-ref.info, Node: The Guile module system, Prev: Scheme and modules, Up: Modules The Guile module system ======================= In 1996 Tom Lord implemented a full-featured module system for Guile which allows loading Scheme source files into a private name space. This module system is regarded as being rather idiosyncratic, and will probably change to something more like the ML module system, so for now I will simply descrive how it works for a couple of simple cases. First of all, the Guile module system sets up a hierarchical name space, and that name space can be represented like Unix pathnames preceded by a <#> character. The root name space for all Guile-supplied modules is called `ice-9'. So for example, the SLIB interface, contained in `$srcdir/ice-9/slib.scm', starts out with (define-module (ice-9 slib)) and a user program can use (use-modules (ice-9 slib)) to have access to all procedures and variables defined within the slib module with `(define-public ...)'. So here are the functions involved: -- syntax: define-module module-specification MODULE-SPECIFICATION is of the form `(hierarchy file)'. One example of this is (use-modules (ice-9 slib)) define-module makes this module available to Guile programs under the given MODULE-SPECIFICATION. -- syntax: define-public ... Makes a procedure or variable available to programs that use the current module. -- syntax: use-modules module-specification MODULE-SPECIFICATION is of the form `(hierarchy file)'. One example of this is (use-modules (ice-9 slib)) use-modules allows the current Guile program to use all publicly defined procedures and variables in the module denoted by MODULE-SPECIFICATION. [FIXME: must say more, and explain, and also demonstrate a private name space use, and demonstrate how one would do Python's "from Tkinter import *" versus "import Tkinter". Must also add something about paths and standards for contributed modules.] Some modules are included in the Guile distribution; here are references to the entries in this manual which describe them in more detail: *boot-9* boot-9 is Guile's initialization module, and it is always loaded when Guile starts up. *(ice-9 debug)* Mikael Djurfeldt's source-level debugging support for Guile (*note debugger user interface::.). *(ice-9 threads)* Guile's support for multi threaded execution (*note Threads and Dynamic Roots::.). *(ice-9 slib)* This module contains hooks for using Aubrey Jaffer's portable Scheme library SLIB from Guile (*note SLIB::.). *(ice-9 jacal)* This module contains hooks for using Aubrey Jaffer's symbolic math packge Jacal from Guile (*note JACAL::.).  File: guile-ref.info, Node: Module Internals, Next: Dynamic Linking from Marius, Prev: Modules, Up: Top Module Internals **************** * Menu: * First-class Variables:: * First-class Modules:: * Dynamic Libraries:: Loading libraries of compiled code at run time.  File: guile-ref.info, Node: First-class Variables, Next: First-class Modules, Up: Module Internals First-class Variables =====================  File: guile-ref.info, Node: First-class Modules, Next: Dynamic Libraries, Prev: First-class Variables, Up: Module Internals First-class Modules ===================