.\" t '\" vim:set syntax=groff: .\" Copyright (C) 2009-2024 Kaz Kylheku . .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions are met: .\" .\" 1. Redistributions of source code must retain the above copyright notice, this .\" list of conditions and the following disclaimer. .\" .\" 2. Redistributions in binary form must reproduce the above copyright notice, .\" this list of conditions and the following disclaimer in the documentation .\" and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE .\" DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER .\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE .\" OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" Useful groff definitions. .\" .\" Some constants that depend on troff/nroff mode: .ie n \{\ .ds vspc 1 .\} .el \{\ .ds vspc 0.5 .\} .\" Mount numeric fonts when not running under man2html .if !\n(M2 \{\ . fp 4 CR . fp 5 CI .\} .\" Base font .nr fsav 1 .\" start of code block: switch to monospace font and no format .de verb . ft 4 . nf .. .\" end of code block: restore font and formatting .de brev . fi . ft 1 .. .\" switch to mono font .de mono . ft 4 .. .\" switch back from mon font .de onom . ft 1 .. 
.\" typeset argument in monospace .\" .code x -> \f[CR]x\f[] .de code \f[4]\\$1\f[] .. .\" like .code typesets meta-syntax .\" which is done in angle brackets + italic in nroff or oblique .\" courier in PDF/HTML. .de meta . ie n \{\ \fI<\\$1>\fP . \} . el \{\ \f[5]\\$1\f[] . \} .. .\" like .meta but tack on second argument with no space. .de metn . ie n \{\ \fI<\\$1>\fP\\$2 . \} . el \{\ \f[5]\\$1\f[]\\$2 . \} .. .\" like .code but wraps in quotes .\" .str x y z -> \f[CR]"x y z"\f[]. .de str \f[4]"\\$*"\f[] .. .\" wrap first argument in quotes, tack no second one with no space .\" .strn x y -> \f[CR]"x"\f[]y. .de strn \f[4]"\\$1"\f[]\\$2 .. .\" like .IP but use monospace too .de coIP . IP "\\f[4]\\$*\\f[]" .. .\" Directive heading .de dir . NP* The \f[4]\\$1\f[] Directive .. .\" Multiple directive heading .de dirs . ds s " . while (\\n[.$]>2) \{\ . as s \f[4]\\$1\f[], . shift . \} . if (\\n[.$]>1) \{\ . as s \f[4]\\$1\f[] . shift . \} . if (\\n[.$]>0) \{\ . as s and \f[4]\\$1\f[] . \} . NP* The \\*s Directives .. .\" heading with code in position 1 .de c1NP . ds s \\f[4]\\$1\\f[] . shift . as s " \\$* . NP* \\*s .. .\" utility macro for gathering material into "s" string. .\" a pair of arguments "@ arg" becomes arg set in code .\" a pair of arguments "@, arg" becomes "arg," where .\" arg is set in code, followed by comma not in code. .de gets . ds s " . while (\\n[.$]>0) \{\ . ie "\\$1"@" \{\ . shift . as s \f[4]\\$1\f[] . shift . \} . el \{\ . ie "\\$1"@," \{\ . shift . as s \f[4]\\$1\f[], . shift . \} . el \{\ . as s \\$1 . shift . \} . \} . \} .. .\" a macro for gathering material into "s" .\" a pair of arguments "< arg" is typeset like .\" .meta arg. "<< arg arg" is like .metn arg arg. .ie n \{\ . de getm . ds s " . while (\\n[.$]>0) \{\ . ie "\\$1"<" \{\ . shift . as s \fI<\\$1>\fP . shift . \} . el \{\ . ie "\\$1"<<" \{\ . shift . as s \fI<\\$1>\fP\\$2 . shift . shift . \} . el \{\ . ie "\\$1">>" \{\ . shift . as s \\$1\fI<\\$2>\fP . shift . shift . \} . 
el \{\ . ie "\\$1"<>" \{\ . shift . as s \\$1\fI<\\$2>\fP\\$3 . shift . shift . shift . \} . el \{\ . ie "\\$1"><" \{\ . shift . as s \fI<\\$1>\fP\\$2\fI<\\$3>\fP . shift . shift . shift . \} . el \{\ . ie "\\$1"<2>" \{\ . shift . as s \\$1\fI<\\$2>\fP\\$3\fI<\\$4>\fP\\$5 . shift . shift . shift . shift . shift . \} . el \{\ . as s \\$1 . shift . \} . \} . \} . \} . \} . \} . \} . . .\} .el \{\ . de getm . ds s " . while (\\n[.$]>0) \{\ . ie "\\$1"<" \{\ . shift . as s \\f5\\$1\\f4 . shift . \} . el \{\ . ie "\\$1"<<" \{\ . shift . as s \\f5\\$1\\f4\\$2 . shift . shift . \} . el \{\ . ie "\\$1">>" \{\ . shift . as s \\$1\\f5\\$2\\f4 . shift . shift . \} . el \{\ . ie "\\$1"<>" \{\ . shift . as s \\$1\\f5\\$2\\f4\\$3 . shift . shift . shift . \} . el \{\ . ie "\\$1"<2>" \{\ . shift . as s \\$1\\f5\\$2\\f4\\$3\\f5\\$4\\f4\\$5 . shift . shift . shift . shift . shift . \} . el \{\ . ie "\\$1"><" \{\ . shift . as s \\f5\\$1\\f4\\$2\f5\\$3\\f4 . shift . shift . shift . \} . el \{\ . as s \\$1 . shift . \} . \} . \} . \} . \} . \} . \} . . .\} .\" typeset left argument in monospace, then right one .\" in previous font, with no space between. .\" .codn x y \f[CR]x\f[]y .de codn \f[4]\\$1\f[]\\$2 .. .\" .cod1 a b c -> abc where a is typeset as code .de cod1 \&\\$1\f[4]\\$2\f[]\$3 .. .\" .cod2 a b -> ab where b is typeset as code .de cod2 \&\\$1\f[4]\\$2\f[] .. .\" .cod3 a b c -> abc where a and c are typeset as code .de cod3 \f[4]\\$1\f[]\\$2\f[4]\\$3\f[] .. .\" Syntax section markup .de synb . TP* Syntax: . mono .. .de syne . onom .. .\" Used for meta-variables in syntax blocks .de mets . nr fsav \\n[.f] . getm \\$* . \"workaround for man2html: . as s \\f\\n[fsav] \\*s . ft \\n[fsav] .. .\" Used for meta-variables in inline blocks .de meti . nr fsav \\n[.f] . getm \\$* . \"workaround for man2html: . as s \\f\\n[fsav] \&\\*s . ft \\n[fsav] .. .\" Used for meta-variables in .coIP .de meIP . nr fsav \\n[.f] . getm \\$* . \"workaround for man2html: . 
as s \\f\\n[fsav] .coIP \\*s . ft \\n[fsav] .. .\" Description section .de desc . TP* Description: .. .\" Section counters: heading, section, paragraph. .nr shco 0 1 .nr ssco 0 1 .nr spco 0 1 .\" wrapper for .SH .de SH* . SH \\n+[shco] \\$* . rs . nr ssco 0 . nr spco 0 . sp \*[vspc] . ns .. .\" wrapper for .SS .de SS* . SS \\n[shco].\\n+[ssco] \\$* . rs . nr spco 0 . sp \*[vspc] . ns .. .\" wrapper for .TP .de TP* . ds s \\$1 . shift . TP \\$* \&\\*s . sp \*[vspc] . ns .. .\" numbered paragraph .de NP* . ie \n(M2 \{\ . M2SS 2 h4 "\\n[shco].\\n[ssco].\\n+[spco] \\$*" . \} . el \{\ . TP* "\f[B]\\n[shco].\\n[ssco].\\n+[spco] \\$*\f[]" . \} . PP .. .\" process arguments using .gets so that some material .\" is typeset as code. Then pass to .SS* section macro. .de coSS . gets \\$* . SS* \\*s .. .\" like coSS but targeting NP* .de coNP . gets \\$* . NP* \\*s .. .\" like coSS but use monospace IP .de ccIP . gets \\$* . IP "\\*s" .. .\" keystrokes .ie \n(M2 \{\ .de key .M2HT \\$1 .. .de keyn .M2HT \\$1\\$2 .. .\} .el \{\ . ie n \{\ . de key [\\$1] . . . de keyn [\\$1]\\$2 . . . \} . el \{\ . \" Box macro from Groff manual with $2 added . de box . nr @wd \w'\\$1' \h'.2m'\ \h'-.2m'\v'(.2m - \\n[rsb]u)'\ \D'l 0 -(\\n[rst]u - \\n[rsb]u + .4m)'\ \D'l (\\n[@wd]u + .4m) 0'\ \D'l 0 (\\n[rst]u - \\n[rsb]u + .4m)'\ \D'l -(\\n[@wd]u + .4m) 0'\ \h'.2m'\v'-(.2m - \\n[rsb]u)'\ \\$1\ \h'.2m'\\$2 . . . de key . box "\\$1" "" . . . de keyn . box "\\$1" "\\$2" . . . \} .\} .\" TXR name .ds TX \f[B]TXR\f[] .ds TL \f[B]TXR Lisp\f[] .\" Start of man page: .TH TXR 1 2024-12-15 "Utility Commands" "TXR Programming Language" "Kaz Kylheku" .SH* NAME \*(TX \- Programming Language (Version 297) .SH* SYNOPSIS .mono .meti txr [ < options ] [ < script-file [ < arguments ... ]] .onom .SH* DESCRIPTION \*(TX is a general-purpose, multi-paradigm programming language. 
It comprises two languages integrated into a single tool: a text scanning and extraction language referred to as the \*(TX Pattern Language (sometimes just "TXR"), and a general-purpose dialect of Lisp called \*(TL. \*(TX can be used for everything from "one liner" data transformation tasks at the command line, to data scanning and extracting scripts, to full application development in a wide range of areas. A script written in the \*(TX Pattern Language, also referred to in this document as a .IR query , specifies a pattern which matches one or more sources of inputs, such as text files. Patterns can consist of large chunks of multiline free-form text, which is matched literally against material in the input sources. Free variables occurring in the pattern (denoted by the .code @ symbol) are bound to the pieces of text occurring in the corresponding positions. Patterns can be arbitrarily complex, and can be broken down into named pattern functions, which may be mutually recursive. In addition to embedded variables which implicitly match text, the \*(TX pattern language supports a number of directives, for matching text using regular expressions, for continuing a match in another file, for searching through a file for the place where an entire subquery matches, for collecting lists, and for combining subqueries using logical conjunction, disjunction and negation, and numerous others. Patterns can contain actions which transform data and generate output. These actions can be embedded anywhere within the pattern-matching logic. A common structure for small \*(TX scripts is to perform a complete matching session at the top of the script, and then deal with processing and reporting at the bottom. The \*(TL language can be used from within \*(TX scripts as an embedded language, or completely standalone. 
It supports functional, imperative and object-oriented programming, and provides numerous data types such as symbols, strings, vectors, hash tables with weak reference support, lazy lists, and arbitrary-precision ("bignum") integers. It has an expressive foreign function interface (FFI) for calling into libraries and other software components that support C-language-style calls. \*(TL source files as well as individual functions can be optionally compiled for execution on a virtual machine that is built into \*(TX. Compiled files execute and load faster, and resist reverse-engineering. Standalone application delivery is possible. \*(TX is free software offered under the two-clause BSD license which places almost no restrictions on redistribution, and allows every conceivable use, of the whole software or any constituent part, royalty-free, free of charge, and free of any restrictions. .SH* ARGUMENTS AND OPTIONS If \*(TX is given no arguments, it will enter into an interactive mode. See the INTERACTIVE LISTENER section for a description of this mode. When \*(TX enters interactive mode this way, it prints a one-line banner announcing the program name and version, and one line of help text instructing the user how to exit. If \*(TX is invoked under the name .codn txrlisp , it behaves as if the .code --lisp option had been specified before any other option. Similarly, if \*(TX is invoked under the name .codn txrvm , it behaves as if the .code --compiled option had been given. Unless the .code -c or .code -f options are present, the first non-option argument is treated as a .meta script-file which is executed. This is described after the following descriptions of all of the options. Any additional arguments have no fixed meaning; they are available to the \*(TX query or \*(TL application for specifying input files to be processed, or other meanings under the control of the application. Options which don't take an argument may be combined together. 
The .code -v and .code -q options are mutually exclusive. Of these two, the one which occurs in the rightmost position in the argument list dominates. The .code -c and .code -f options are also mutually exclusive; if both are specified, it is a fatal error. .meIP >> -D var=value Bind the variable .meta var to the value .meta value prior to processing the query. The name is in scope over the entire query, so that all occurrences of the variable are substituted and match the equivalent text. If the value contains commas, these are interpreted as separators, which give rise to a list value. For instance .code -Dvar=a,b,c binds .code var to the list of the strings .strn "a" , .str "b" and .strn "c" . (See the .code @(collect) directive.) List variables provide a multiple match. That is to say, if a list variable occurs in a query, a successful match occurs if any of its values matches the text. If more than one value matches the text, the first one is taken. .meIP >> -D var Binds the variable .meta var to an empty string value prior to processing the query. .coIP -q Quiet operation during matching. Certain error messages are not reported on the standard error device (but if the situations occur, they still fail the query). This option does not suppress error generation during the parsing of the query, only during its execution. .coIP -i If this option is present, then \*(TX will enter into an interactive interpretation mode after processing all options, and the input query if one is present. See the INTERACTIVE LISTENER section for a description of this mode. .coIP -d .coIP --debugger Invoke the interactive \*(TX debugger. See the DEBUGGER section. Implies .codn --backtrace . .coIP --backtrace Turns on the establishment of backtrace frames for function calls so that a backtrace can be produced when an unhandled exception occurs, and in other situations. Backtraces are helpful in identifying the causes of errors, but require extra stack space and slow down execution. 
.coIP -n
.coIP --noninteractive
This option affects behavior related to \*(TX's
.code *stdin*
stream. It also has another, unrelated effect, on the behavior of the
interactive listener; see below.

Normally, if this stream is connected to a terminal device, it is
automatically marked as having the real-time property when \*(TX starts up
(see the functions
.code stream-set-prop
and
.codn real-time-stream-p ).
The
.code -n
option suppresses this behavior; the
.code *stdin*
stream remains ordinary.

The \*(TX pattern language reads standard input via a lazy list, created by
applying the
.code lazy-stream-cons
function to the
.code *stdin*
stream. If that stream is marked real-time, then the lazy list which is
returned by that function has behaviors that are better suited for scanning
interactive input. A more detailed explanation is given under the
description of this function.

If the
.code -n
option is in effect and \*(TX enters into the interactive listener,
the listener operates in
.I "plain mode"
instead of the
.IR "visual mode" .
The listener reads buffered lines from the operating system without any
character-based editing features or history navigation.
In plain mode, no prompts appear and no terminal control escape sequences
are generated. The only output is the results of evaluation, related
diagnostic messages, and any output generated by the evaluated expressions
themselves.
.coIP -v
Verbose operation. Detailed logging is enabled.
.coIP -b
This option binds a Lisp global lexical variable (as if by the
.code defparml
function) to an object described by Lisp syntax. It requires an argument
of the form
.meta sym=value
where
.meta sym
must be, syntactically, a token denoting a bindable symbol, and
.meta value
is arbitrary \*(TL syntax. The
.meta sym
syntax is converted to the symbol it denotes, which is bound as a global
lexical variable, if it is not already a variable. The
.meta value
syntax is parsed to the Lisp object it denotes.
This object is not subject to evaluation; the object itself is stored into
the variable binding denoted by
.metn sym .
Note that if
.meta sym
already exists as a global variable, then it is simply overwritten. If
.meta sym
is marked special, then it stays special.
.coIP -B
If the query is successful, print the variable bindings as a sequence
of assignments in shell syntax that can be
.IR eval -ed
by a POSIX shell. If the query fails, print the word "false". Evaluation of
this word by the shell has the effect of producing an unsuccessful
termination status from the shell's
.I eval
command.
.coIP -l
.coIP --lisp-bindings
This option implies
.codn -B .
Print the variable bindings in Lisp syntax instead of shell syntax.
.meIP -a < num
This option implies
.codn -B .
The decimal integer argument
.meta num
specifies the maximum number of array dimensions to use for list-valued
variable bindings. The default is 1. Additional dimensions are expressed
using numeric suffixes in the generated variable names.
For instance, consider the three-dimensional list arising out of a triply
nested collect:
.mono
((("a" "b") ("c" "d")) (("e" "f") ("g" "h"))).
.onom
Suppose this is bound to a variable
.metn V .
With
.codn "-a 1" ,
this will be reported as:

.verb
  V_0_0[0]="a"
  V_0_1[0]="b"
  V_1_0[0]="c"
  V_1_1[0]="d"
  V_0_0[1]="e"
  V_0_1[1]="f"
  V_1_0[1]="g"
  V_1_1[1]="h"
.brev

With
.codn "-a 2" ,
it comes out as:

.verb
  V_0[0][0]="a"
  V_1[0][0]="b"
  V_0[0][1]="c"
  V_1[0][1]="d"
  V_0[1][0]="e"
  V_1[1][0]="f"
  V_0[1][1]="g"
  V_1[1][1]="h"
.brev

The leftmost bracketed index is the most major index. That is to say,
the dimension order is:
.codn "NAME_m_m+1_..._n[1][2]...[m-1]" .
.meIP -c < query
Specifies the query in the form of a command-line argument. If this option is
used, the
.meta script-file
argument is omitted. The first non-option argument, if there is one, now
specifies the first input source rather than a query.
Unlike queries read from a file, (nonempty) queries specified as arguments
using
.code -c
do not have to properly end in a newline. Internally, \*(TX adds the
missing newline before parsing the query. Thus
.code -c
.str @a
is a valid query which matches a line.

Example: Shell script which uses \*(TX to read two lines
.str 1
and
.str 2
from standard input, binding them to variables
.code a
and
.codn b .
Standard input is specified as
.code -
and the data comes from shell "here document" redirection:

.RS
.IP code:
.mono
\ #!/bin/sh
 txr -B -c "@a
 @b" - <<!
 1
 2
 !
.onom
.RE
.meIP -C < number
.meIP >> --compat= number
Requests \*(TX to behave in a manner that is compatible with the specified
version of \*(TX. This makes a difference in situations when a release of
\*(TX breaks backward compatibility. If some version N+1 deliberately
introduces a change which is backward incompatible, then
.code "-C N"
can be used to request the old behavior.

The requested value of N can be too low, in which case \*(TX will complain
and exit with an unsuccessful termination status. This indicates that \*(TX
refuses to be compatible with such an old version. Users requiring the
behavior of that version will have to install an older version of \*(TX
which supports that behavior, or even that exact version.

If the option is specified more than once, the behavior is not specified.

Compatibility can also be requested via the
.code TXR_COMPAT
environment variable instead of the
.code -C
option.

For more information, see the COMPATIBILITY section.
.meIP >> --gc-delta= number
The
.meta number
argument to this option must be a decimal integer. It represents a megabyte
value, the "GC delta": one megabyte is 1048576 bytes. The "GC delta"
controls an aspect of the garbage collector behavior.
See the
.code gc-set-delta
function for a description.
.meIP --debug-autoload
This option turns on debugging, like
.code --debugger
but also requests stepping into the autoload processing of \*(TL library code.
Normally, debugging through the evaluations triggered by autoloading is suppressed. Implies .codn --backtrace . .meIP --debug-expansion This option turns on debugging, like .code --debugger but also requests stepping into the parse-time macro-expansion of \*(TL code embedded in \*(TX queries. Normally, this is suppressed. Implies .codn --backtrace . .coIP --help Prints usage summary on standard output, and terminates successfully. .coIP --license Prints the software license. This depends on the software being installed such that the LICENSE file is in the data directory. Use of \*(TX implies agreement with the liability disclaimer in the license. .coIP --version Prints a message on standard output which includes the program version, and then immediately causes \*(TX to terminate with a successful status. .coIP --build-id If \*(TX was built with an embedded build ID string, this option prints that string. Otherwise nothing is printed. In either case, \*(TX then immediately terminates with a successful status. .coIP --args The .code --args option provides a way to encode multiple arguments as a single argument, which is useful on some systems which have limitations in their implementation of the hash-bang mechanism. For details about its special syntax, see Hash-Bang Support below. It is also useful in standalone application deployment. See the section STANDALONE APPLICATION SUPPORT, in which example uses of .code --args are shown. .coIP --eargs The .code --eargs option (extended .codn --args ) is like .code --args but must be followed by an argument. The argument is removed from the argument list and substituted in place of occurrences of .code {} among the arguments expanded from the .code --eargs syntax. .coIP --lisp .coIP --compiled These options influence the treatment of query files which do not have a recognized suffix indicating their type. 
The
.code --lisp
option causes a file with an unrecognized suffix, or no suffix, to be
treated as Lisp source;
.code --compiled
causes it to be treated as a compiled \*(TL file.

Moreover,
.code --lisp
and
.code --compiled
influence the suffix search. By default, when a query file name does not
have a recognizable suffix, and the file does not exist, \*(TX adds the
.str .txr
suffix to the name and tries opening that name, and in a similar way tries
.strn .tlo ,
.str .tlo.gz
and finally
.strn .tl .
In this situation, if either of these two options is specified, \*(TX tries
only the
.strn .tlo ,
.str .tlo.gz
and
.str .tl
suffixes, in that order, avoiding the
.str .txr
suffix. The search order is always
.str .tlo
first, then
.str .tl
regardless of whether
.code --lisp
or
.code --compiled
is specified.

Note that
.code --lisp
and
.code --compiled
influence how the argument of the
.code -f
option is treated, but only if they precede that option.

If the file has a recognized suffix:
.strn .tl ,
.strn .tlo ,
.strn .tlo.gz ,
.strn .txr ,
.str .txr-profile
or
.strn .txr_profile ,
then these options have no effect. The suffix determines the interpretation
of the content. Moreover, no suffix search takes place: only the given path
name is tried.
.coIP --reexec
On platforms which support the POSIX
.code exec
family of functions, this option causes \*(TX to re-execute itself.
The re-executed image receives the remaining arguments which follow the
.code --reexec
argument. Note: this option is useful for supporting setuid operation in
hash-bang scripts. On some platforms, the interpreter designated by a
hash-bang script runs without altered privilege, even if that interpreter
is installed setuid. If the interpreter is executed directly, then setuid
applies to it, but not if it is executed via hash bang. If the
.code --reexec
option is used in the interpreter command line of such a script, the
interpreter will re-execute itself, thereby gaining the setuid privilege.
The re-executed image will then obtain the script name from the arguments
which are passed to it and determine whether that script will run setuid.
See the section SETUID/SETGID OPERATION.
.coIP --noprofile
If entering the interactive listener, suppress the reading of the
.code .txr-profile
file in the home directory. See the Interactive Profile File subsection in
the INTERACTIVE LISTENER section of the manual.
.coIP --gc-debug
This option enables a behavior which stresses the garbage collector with
frequent garbage collection requests. The purpose is to make it more likely
to reproduce certain kinds of bugs. Use of this option severely degrades
the performance of \*(TX.
.coIP --vg-debug
If \*(TX is enabled with Valgrind support, then this option is available. It
enables code which uses the Valgrind API to integrate with the Valgrind
debugger, for more accurate tracking of garbage collected objects. For
example, objects which have been reclaimed by the garbage collector are
marked as inaccessible, and marked as uninitialized when they are allocated
again.
.coIP --free-all
This option specifies that all memory allocated by \*(TX should be freed
upon normal termination. This behavior is useful for debugging memory
leaks. An accurate leak detection tool, such as the one built into Valgrind,
should report zero leaked or still reachable memory if
.code --free-all
has been used and \*(TX has terminated normally. If such memory is
nevertheless reported, that indicates either a leak in \*(TX, a leak or
global object retention in a platform library, or else a leak introduced
due to misuse of FFI.
.coIP --dv-regex
If this option is used, then regular expressions are all treated using the
derivative-based back-end. The NFA-based regex implementation is disabled.
Normally, only regular expressions which require the intersection and
complement operators are handled using the derivative back-end. This option
makes it possible to test that back-end on test cases that it wouldn't
normally receive.
.meIP >> --in-package= name
This option changes to the specified package, by finding the package of the
specified
.meta name
and assigning that to the
.code *package*
special variable. If the package is not found, a diagnostic is issued, and
\*(TX terminates unsuccessfully. The package thus specified is visible to
the subsequent occurrences of the
.code -e
family of options as well as of the
.code --compile
option. It does not affect the value of
.code *package*
which is in effect when a
.meta script-file
is executed or when the interactive listener is entered.
.meIP <2> --compile= source-file [: target-file ]
This option invokes the
.code compile-update-file
function on
.metn source-file .
If
.meta target-file
is specified, it is passed to
.code compile-update-file
as the target argument; otherwise, that argument is defaulted.
The option can be used multiple times to process multiple files.
Unsuccessful compilation throws an exception, causing \*(TX to terminate
abnormally. Similarly to the
.code -e
option, if this option is used at least once, and all of the invocations are
successful, and there is no
.meta script-file
argument, then \*(TX terminates with a successful status instead of entering
the interactive listener. The
.code -i
option can be used to request the listener.
.coIP --
Signifies the end of the option list.
.coIP -
This argument is not interpreted as an option, but treated as a filename
argument. After the first such argument, no more options are recognized. Even
if another argument looks like an option, it is treated as a name.
This special argument
.code -
means "read from standard input" instead of a file. The
.metn script-file ,
or any of the data files, may be specified using this argument.
If two or more files are specified as
.codn - ,
the behavior is system-dependent. It may be possible to indicate EOF from
the interactive terminal, and then specify more input which is interpreted
as the second file, and so forth.
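.PP
As a hedged illustration of how some of the options described above combine,
consider the following shell commands. This is a sketch only: the file names
.code app.tl
and
.code app.tlo
are hypothetical, and no particular output is claimed.

.verb
  # Bind var to the list of strings "a", "b" and "c", supply the
  # query with -c, read data from standard input (-), and request
  # the bindings in shell syntax with -B.
  echo b | txr -B -Dvar=a,b,c -c "@var" -

  # Compile a TXR Lisp source file to an explicitly named target.
  txr --compile=app.tl:app.tlo
.brev

Going by the option descriptions, the first command should succeed because
one of the values of the list variable matches the input line, and the
second should terminate after compiling rather than enter the interactive
listener, since no
.meta script-file
is given.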
.PP After the options, the remaining arguments are treated as follows. If neither the .code -f nor the .code -c options were specified, then the first argument is treated as the .metn script-file . If no arguments are present, then \*(TX enters interactive mode, provided that none of the .codn -e , .codn -p , .code -P or .code -t options had been processed, in which case it instead terminates. The \*(TX Pattern Language has features for implicitly treating the subsequent command-line arguments as input files. It follows the convention that an argument consisting of a single .code - (dash) character specifies that standard input is to be used, instead of opening a file. If the query does not use the .code @(next) directive to select an alternative data source, and a pattern-matching construct is processed which demands data, then the first argument will be opened as a data source. Arguments not opened as data sources can be assigned alternative meanings and uses, or can be ignored entirely, under control of the query. Specifying standard input as a source with an explicit .code - argument is unnecessary. If no arguments are present, then \*(TX scans standard input by default. This was not true in versions of \*(TX prior to 171; see the COMPATIBILITY section. .PP \*(TX begins by reading the script, which is given as the contents of the argument of the .code -c option, or else as the contents of an input source specified by the .code -f option or by the .meta script-file argument. If .code -f or the .meta script-file argument specify .code - (dash) then the script is read from standard input. In the case of the \*(TX pattern language, the entire query is scanned, internalized, and then begins executing, if it is free of syntax errors. (\*(TL is processed differently, form by form.) On the other hand, the pattern language reads data files in a lazy manner. 
A file isn't opened until the query demands material from that file, and
then the contents are read on demand, not all at once.

The suffix of the
.meta script-file
is significant. If the name has no suffix, or if it has a
.str .txr
suffix, then it is assumed to be in the \*(TX pattern language. If it has
the
.str .tl
suffix, then it is assumed to be \*(TL. The
.code --lisp
and
.code --compiled
options change the treatment of unsuffixed script file names, causing them
to be interpreted as \*(TL source or compiled \*(TL, respectively.

If a file name is specified which does not have a recognized suffix, and
names a file which doesn't exist, then \*(TX adds the
.str .txr
suffix and tries again. If that doesn't exist, another attempt is made with
the
.str .tlo
suffix, which will be treated as a \*(TL compiled file.
If that doesn't exist, then
.str .tlo.gz
is tried, expected to be a file compressed in
.code gzip
format. Finally, if that doesn't exist, the
.str .tl
suffix is tried, which will be treated as containing \*(TL source.
If either the
.code --lisp
or
.code --compiled
option has been specified, then \*(TX skips trying the
.str .txr
suffix, and tries only
.str .tlo
followed by
.str .tlo.gz
and
.strn .tl .

A \*(TL file is processed as if by the
.code load
macro: forms from the file are read and evaluated. If the forms do not
terminate the \*(TX process or throw an exception, and there are no syntax
errors, then \*(TX terminates successfully after evaluating the last form.
If syntax errors are encountered in a form, then \*(TX terminates
unsuccessfully. \*(TL is documented in the section TXR LISP.

If a query file is specified, but no file arguments, it is up to the query
to open a file, pipe or standard input via the
.code @(next)
directive prior to attempting to make a match. If a query attempts to match
text, but has run out of files to process, the match fails.
.SH* STATUS AND ERROR REPORTING
\*(TX sends errors and verbose logs to the standard error device.
The following paragraphs apply when \*(TX is run without enabling verbose mode with .codn -v , or the printing of variable bindings with .code -B or .codn -a . If the command-line arguments are incorrect, \*(TX issues an error diagnostic and terminates with a failed status. If the .meta script-file specifies a query, and the query has a malformed syntax, \*(TX likewise issues error diagnostics and terminates with a failed status. If the query fails due to a mismatch, \*(TX terminates with a failed status. No diagnostics are issued. If the query is well-formed, and matches, then \*(TX issues no diagnostics, and terminates with a successful status. In verbose mode (option .codn -v ), \*(TX issues diagnostics on the standard error device even in situations which are not erroneous. In bindings-printing mode (options .code -B or .codn -a ), \*(TX prints the word .code false if the query fails, and exits with a failed termination status. If the query succeeds, the variable bindings, if any, are output on standard output. If the .meta script-file is \*(TL, then it is processed form by form. Each top-level Lisp form is evaluated after it is read. If any form is syntactically malformed, \*(TX issues diagnostics and terminates unsuccessfully. This is somewhat different from how the pattern language is treated: a script in the pattern language is parsed in its entirety before being executed. .SH* BASIC TXR SYNTAX .SS* Comments A query may contain comments which are delimited by the sequence .code @; and extend to the end of the line. Whitespace can occur between the .code @ and .codn ; . A comment which begins on a line swallows that entire line, as well as the newline which terminates it. In essence, the entire comment line disappears. If the comment follows some material in a line, then it does not consume the newline. Thus, the following two queries are equivalent: .IP 1. 
.mono \ @a@; comment: match whole line against variable @a @; this comment disappears entirely @b .onom .IP 2. .mono \ @a @b .onom .PP The comment after the .code @a does not consume the newline, but the comment which follows does. Without this intuitive behavior, line comments would give rise to empty lines that must match empty lines in the data, leading to spurious mismatches. Instead of the .code ; character, the .code # character can be used. This is an obsolescent feature. .SS* Hash-Bang Support \*(TX has several features which support use of the hash-bang convention for creating apparently standalone executable programs. .NP* Basic Hash Bang Special processing is applied to \*(TX query or \*(TL script files that are specified on the command line via the .code -f option or as the first non-option argument. If the first line of such a file begins with the characters .codn #! , that entire line is consumed and processed specially. This removal allows for \*(TX queries to be turned into standalone executable programs in the POSIX environment using the hash-bang mechanism. Unlike most interpreters, \*(TX applies special processing to the .code #! line, which is described below, in the section .BR "Argument Generation with the Null Hack" . Shell session example: create a simple executable program called .str "hello.txr" and run it. This assumes \*(TX is installed in .codn /usr/bin . .verb $ cat > hello.txr #!/usr/bin/txr @(bind a "Hey") @(output) Hello, world! @(end) $ chmod a+x hello.txr $ ./hello.txr Hello, world! .brev When this plain hash-bang line is used, \*(TX receives the name of the script as an argument. Therefore, it is not possible to pass additional options to \*(TX. For instance, if the above script is invoked like this .verb $ ./hello.txr -B .brev the .code -B option isn't processed by \*(TX, but treated as an additional argument, just as if .mono .meti txr < script-file -B .onom had been executed directly.
This behavior is useful if the script author wants not to expose the \*(TX options to the user of the script. However, the hash-bang line can use the .code -f option: .verb #!/usr/bin/txr -f .brev Now, the name of the script is passed as an argument to the .code -f option, and \*(TX will look for more options after that, so that the resulting program appears to accept \*(TX options. Now we can run .verb $ ./hello.txr -B Hello, world! a="Hey" .brev The .code -B option is honored. .coNP Argument Generation with @ --args and @ --eargs On some operating systems, it is not possible to pass more than one argument through the hash-bang mechanism. That is to say, this will not work. .verb #!/usr/bin/txr -B -f .brev To support systems like this, \*(TX supports the special argument .codn --args , as well as an extended version, .codn --eargs . With .codn --args , it is possible to encode multiple arguments into one argument. The .code --args option must be followed by a separator character, chosen by the programmer. The characters after that are split into multiple arguments on the separator character. The .code --args option is then removed from the argument list and replaced with these arguments, which are processed in its place. Example: .verb #!/usr/bin/txr --args:-B:-f .brev The above has the same behavior as .verb #!/usr/bin/txr -B -f .brev on a system which supports multiple arguments in the hash-bang line. The separator character is the colon, and so the remainder of that argument, .codn -B:-f , is split into the two arguments .codn "-B -f" . The .code --eargs option is similar to .codn --args , but must be followed by one more argument. After .code --eargs performs the argument splitting in the same manner as .codn --args , any of the arguments which it produces which are the two-character sequence .code {} are replaced with that following argument. Whether or not the replacement occurs, that following argument is then removed. 
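The splitting and substitution rules described above can be modeled in a few lines. The following Python sketch is an illustration only, not the \*(TX implementation: the function name expand_args is hypothetical, it assumes that generated arguments are not themselves re-expanded, and it assumes that .code --eargs is always followed by one more argument.

```python
def expand_args(argv):
    """Model of --args/--eargs expansion (hypothetical helper, not TXR code)."""
    out = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith("--eargs") and len(arg) > len("--eargs"):
            sep = arg[len("--eargs")]                  # separator chosen by the programmer
            fields = arg[len("--eargs") + 1:].split(sep)
            following = argv[i + 1]                    # assumed to be present
            # every field that is exactly {} is replaced by the following argument
            out.extend(following if f == "{}" else f for f in fields)
            i += 2                                     # following argument is removed either way
        elif arg.startswith("--args") and len(arg) > len("--args"):
            sep = arg[len("--args")]
            out.extend(arg[len("--args") + 1:].split(sep))
            i += 1
        else:
            out.append(arg)
            i += 1
    return out


print(expand_args(["--args:-B:-f"]))
print(expand_args(["--eargs:-B:{}:--foo:42", "script.txr", "a", "b", "c"]))
```

The second call mirrors the kind of command line that arises when a script carrying such a hash-bang line is run with the arguments a, b and c: the script name is consumed and substituted for the {} field.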
Example: .verb #!/usr/bin/txr --eargs:-B:{}:--foo:42 .brev This has an effect which cannot be replicated in any known implementation of the hash-bang mechanism. Suppose that this hash-bang line is placed in a script called .codn script.txr . When this script is invoked with arguments, as in: .verb script.txr a b c .brev then \*(TX is invoked similarly to: .verb /usr/bin/txr --eargs:-B:{}:--foo:42 script.txr a b c .brev Then, when .code --eargs processing takes place, firstly the argument sequence .verb -B {} --foo 42 .brev is produced by splitting into four fields using the .code : (colon) character as the separator. Then, within these four fields, all occurrences of .code {} are replaced with the following argument .codn script.txr , resulting in: .verb -B script.txr --foo 42 .brev Furthermore, that .code script.txr argument is removed from the remaining argument list. The four arguments are then substituted in place of the original .code --eargs:-B:{}:--foo:42 syntax. The resulting \*(TX invocation is, therefore: .verb /usr/bin/txr -B script.txr --foo 42 a b c .brev Thus, .code --eargs allows some arguments to be encoded into the interpreter script, such that the script name is inserted anywhere among them, possibly multiple times. Arguments for the interpreter can be encoded, as well as arguments to be processed by the script. .coNP Argument Generation with the Null Hack The .code --args and .code --eargs mechanisms do not solve the following problem: the POSIX .code env utility is often exploited for its .code PATH searching capability, and used to express hash-bang scripts in the following way: .verb #!/usr/bin/env txr .brev Here, the .code env utility searches for the .code txr program in the directories indicated by the .code PATH variable, which liberates the script from having to encode the exact location where the program is installed.
However, if the operating system allows only one argument in the hash-bang mechanism, then no arguments can be passed to the program. To mitigate this problem, \*(TX supports a special feature in its hash-bang support. If the hash-bang line contains a null byte, then the text from after the null byte until the end of the line is split into fields using the space character as a separator, and these fields are inserted into the command line. This manipulation happens during command-line processing, i.e. prior to the execution of the file. If this processing is applied to a file that is specified using the .code -f option, then the arguments which arise from the special processing are inserted after that option and its argument. If this processing is applied to the file which is the first non-option argument, then the options are inserted before that argument. However, care is taken not to process that argument a second time. In either situation, processing of the command-line options continues, and the arguments which are processed next are the ones which were just inserted. This is true even if the options had been inserted as a result of processing the first non-option argument, which would ordinarily signal the termination of option processing. In the following examples, it is assumed that the script is named, and invoked, as .codn /home/jenny/foo.txr , and is given arguments .codn "--bar abc" , and that .code txr resolves to .codn /usr/bin/txr . The .code <NUL> code indicates a literal ASCII NUL character (the zero byte). Basic example: .verb #!/usr/bin/env txr<NUL>-a 3 .brev Here, .code env searches for .codn txr , finding it in .codn /usr/bin . Thus, including the executable name, \*(TX receives this full argument list: .verb /usr/bin/txr /home/jenny/foo.txr --bar abc .brev The first non-option argument is the name of the script. \*(TX opens the script, and notices that it begins with a hash-bang line.
It consumes the hash-bang line and finds the null byte inside it, retrieving the character string after it, which is .strn "-a 3" . This is split into the two arguments .code -a and .codn 3 , which are then inserted into the command line ahead of the script name. The effective command line then becomes: .verb /usr/bin/txr -a 3 /home/jenny/foo.txr --bar abc .brev Command-line option processing continues, beginning with the .code -a option. After the option is processed, .code /home/jenny/foo.txr is encountered again. This time it is not opened a second time; it signals the end of option processing, exactly as it would immediately do if it hadn't triggered the insertion of any arguments. Advanced example: use .code env to invoke .codn txr , passing options to the interpreter and to the script: .verb #!/usr/bin/env txr<NUL>--eargs:-C:175:{}:--debug .brev This example shows how .code --eargs can be used in conjunction with the null hack. When .code txr begins executing, it receives the arguments .verb /usr/bin/txr /home/jenny/foo.txr .brev The script file is opened, and the arguments delimited by the null character in the hash-bang line are inserted, resulting in the effective command line: .verb /usr/bin/txr --eargs:-C:175:{}:--debug /home/jenny/foo.txr .brev Next, .code --eargs is processed in the ordinary way, transforming the command line into: .verb /usr/bin/txr -C 175 /home/jenny/foo.txr --debug .brev The name of the script file is encountered, and signals the end of option processing. Thus .code txr receives the .code -C option, instructing it to emulate some behaviors from version 175, and the .code /home/jenny/foo.txr script receives .code --debug as .B its argument: it executes with the .code *args* list containing one element, the character string .strn --debug . The hash-bang null-hack feature was introduced in \*(TX 177. Previous versions ignore the hash-bang line, performing no special processing.
Where a risk exists that programs which depend on the feature might be executed by an older version of \*(TX, care must be taken to detect and handle that situation, either by means of the .code txr-version variable, or else by some logic which infers that the processing of the hash-bang line hasn't been performed. .coNP Passing Options to \*(TX via Hash-Bang Null Hack It is possible to use the Hash-Bang Null Hack, such that the resulting executable program recognizes \*(TX options. This is made possible by a special behavior in the processing of the .code -f option. For instance, suppose that the effect of the following familiar hash-bang line is required: .verb #!/path/to/txr -f .brev However, suppose there is also a requirement to use the .code env utility to find \*(TX. Furthermore, the operating system allows only one hash-bang argument. Using the Null Hack, this is rewritten as: .verb #!/usr/bin/env txr<NUL>-f .brev then if the script is invoked with arguments .codn "-i a b c" , the command line will ultimately be transformed into: .verb /path/to/txr -f /path/to/scriptfile -i a b c .brev which allows \*(TX to process the .code -i option, leaving .codn a , .code b and .code c as arguments for the script. However, note that there is a subtle issue with the .code -f option that has been inserted via the Null Hack: namely, this insertion happens after \*(TX has opened the script file and read the hash-bang line from it. This means that when the inserted .code -f option is being processed, the script file is already open. A special behavior occurs. The .code -f option processing notices that the argument to .code -f is identical to the pathname of the script file that \*(TX has already opened for processing. The .code -f option and its argument are then skipped. .NP* Hash Bang and Setuid \*(TX supports setuid hash-bang scripting, even on platforms that do not support setuid and setgid attributes on hash-bang scripts.
On such platforms, \*(TX has to be installed setuid/setgid. See the section SETUID/SETGID OPERATION. On some platforms, it may also be necessary to use the .code --reexec option. .SS* Whitespace Outside of directives, whitespace is significant in \*(TX queries, and represents a pattern match for whitespace in the input. An extent of text consisting of an undivided mixture of tabs and spaces is a whitespace token. Whitespace tokens match a precisely identical piece of whitespace in the input, with one exception: a whitespace token consisting of precisely one space has a special meaning. It is equivalent to the regular expression .codn "@/[ ]+/" : match an extent of one or more spaces (but not tabs!). Multiple consecutive spaces do not have this meaning. Thus, the query line .str "a b" (one space between .code a and .codn b ) matches .str "a b" with any number of spaces between the two letters. For matching a single space, the syntax .code "@\e " can be used (backslash-escaped space). It is more often necessary to match multiple spaces than to match exactly one space, so this rule simplifies many queries and inconveniences only a few. In output clauses, string and character literals and quasiliterals, a space token denotes a space. .SS* Text Query material which is not escaped by the special character .code @ is literal text, which matches input character for character. Text which occurs at the beginning of a line matches the beginning of a line. Text which starts in the middle of a line, other than following a variable, must match exactly at the current position, where the previous match left off. Moreover, if the text is the last element in the line, its match is anchored to the end of the line. An empty query line matches an empty line in the input. Note that an empty input stream does not contain any lines, and therefore is not matched by an empty line.
An empty line in the input is represented by a newline character which is either the first character of the file, or follows a previous newline-terminated line. Input streams which end without terminating their last line with a newline are tolerated, and are treated as if they had the terminator. Text which follows a variable has special semantics, described in the section Variables below. A query may not leave a line of input partially matched. If any portion of a line of input is matched, it must be entirely matched, otherwise a matching failure results. However, a query may leave unmatched lines. Matching only four lines of a ten-line file is not a matching failure. The .code eof directive can be used to explicitly match the end of a file. In the following example, the query matches the text, even though the text has an extra line. .IP code: .mono \ Four score and seven years ago our .onom .IP data: .mono \ Four score and seven years ago our forefathers .onom .PP In the following example, the query .B fails to match the text, because the text has extra material on one line that is not matched: .IP code: .mono \ I can carry nearly eighty gigs in my head .onom .IP data: .mono \ I can carry nearly eighty gigs of data in my head .onom .PP Needless to say, if the text has insufficient material relative to the query, that is a failure also. To match arbitrary material from the current position to the end of a line, the "match any sequence of characters, including empty" regular expression .code @/.*/ can be used. Example: .IP code: .mono \ I can carry nearly eighty gigs@/.*/ .onom .IP data: .mono \ I can carry nearly eighty gigs of data .onom .PP In this example, the query matches, since the regular expression matches the string "of data". (See the Regular Expressions section below.) 
Another way to do this is: .IP code: .mono \ I can carry nearly eighty gigs@(skip) .onom .SS* Special Characters in Text Control characters may be embedded directly in a query (with the exception of newline characters). An alternative to embedding is to use escape syntax. The following escapes are supported: .meIP >> @\e newline A backslash immediately followed by a newline introduces a physical line break without breaking up the logical line. Material following this sequence continues to be interpreted as a continuation of the previous line, so that indentation can be introduced to show the continuation without appearing in the data. .meIP >> @\e space A backslash followed by a space encodes a space. This is useful in line continuations when it is necessary for some or all of the leading spaces to be preserved. For instance the two line sequence .verb abcd@\e @\e efg .brev is equivalent to the line .verb abcd efg .brev The two spaces before the .code @\e in the second line are consumed. The spaces after are preserved. .coIP @\ea Alert character (ASCII 7, BEL). .coIP @\eb Backspace (ASCII 8, BS). .coIP @\et Horizontal tab (ASCII 9, HT). .coIP @\en Line feed (ASCII 10, LF). Serves as abstract newline on POSIX systems. .coIP @\ev Vertical tab (ASCII 11, VT). .coIP @\ef Form feed (ASCII 12, FF). This character clears the screen on many kinds of terminals, or ejects a page of text from a line printer. .coIP @\er Carriage return (ASCII 13, CR). .coIP @\ee Escape (ASCII 27, ESC) .meIP >> @\ex hex-digits A .code @\ex immediately followed by a sequence of hex digits is interpreted as a hexadecimal numeric character code. For instance .code @\ex41 is the ASCII character A. If a semicolon character immediately follows the hex digits, it is consumed, and characters which follow are not considered part of the hex escape even if they are hex digits. 
.meIP >> @\e octal-digits A .code @\e immediately followed by a sequence of octal digits (0 through 7) is interpreted as an octal character code. For instance .code @\e010 is character 8, same as .codn @\eb . If a semicolon character immediately follows the octal digits, it is consumed, and subsequent characters are not treated as part of the octal escape, even if they are octal digits. .PP Note that if a newline is embedded into a query line with .codn @\en , this does not split the line into two; it's embedded into the line and thus cannot match anything. However, .code @\en may be useful in the .code @(cat) directive and in .codn @(output) . .SS* Character Handling and International Characters \*(TX represents text internally using wide characters, which are used to represent Unicode code points. Script source code, as well as all data sources, are assumed to be in the UTF-8 encoding. In \*(TX and \*(TL source, extended characters can be used directly in comments, literal text, string literals, quasiliterals and regular expressions. Extended characters can also be expressed indirectly using hexadecimal or octal escapes. On some platforms, wide characters may be restricted to 16 bits, so that \*(TX can only work with characters in the BMP (Basic Multilingual Plane) subset of Unicode. \*(TX does not use the localization features of the system library; its handling of extended characters is not affected by environment variables like .code LANG and .codn LC_CTYPE . The program reads and writes only the UTF-8 encoding. \*(TX deals with UTF-8 separately in its parser and in its I/O streams implementation. \*(TX's text streams perform UTF-8 conversion internally, such that \*(TX applications use Unicode code points. In text streams, invalid UTF-8 bytes are treated as follows.
When an invalid byte is encountered in the middle of a multibyte character, or if the input ends in the middle of a multibyte character, or if an invalid character is decoded, such as an overlong form, or code in the range U+DC00 through U+DCFF, the UTF-8 decoder returns to the starting byte of the ill-formed multibyte character, and extracts just one byte, mapping that byte to the Unicode character range U+DC00 through U+DCFF, producing that code point as the decoded result. The decoder is then reset to its initial state and begins decoding at the following byte, where the same algorithm is repeated. Furthermore, because \*(TX internally uses a null-terminated character representation of strings which easily interoperates with C language interfaces, when a null character is read from a stream, \*(TX converts it to the code U+DC00. On output, this code converts back to a null byte, as explained in the previous paragraph. By means of this representational trick, \*(TX can handle textual data containing null bytes. In contrast to the above, the \*(TX parser scans raw UTF-8 bytes from a binary stream, rather than using a text stream. The parser performs its own recognition of UTF-8 sequences in certain language constructs, using a UTF-8 decoder only when processing certain kinds of tokens. Comments are read without regard for encoding, so invalid encoding bytes in comments are not detected. A comment is simply a sequence of bytes terminated by a newline. Invalid UTF-8 encountered while scanning identifiers and character names in character literal (hash-backslash) syntax is diagnosed as a syntax error. UTF-8 in string literals is treated in the same way as UTF-8 in text streams. Invalid UTF-8 bytes are mapped into code points in the U+DC00 through U+DCFF range, and incorporated as such into the resulting string object which the literal denotes. The same remarks apply to regular-expression literals.
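The byte-to-code-point mapping described above for text streams resembles the surrogateescape error handler found in Python, which can be used to illustrate the idea. The sketch below is an analogy under stated assumptions, not \*(TX code: the two function names are hypothetical, Python maps only the invalid bytes 0x80 through 0xFF (to U+DC80 through U+DCFF), and the null-byte mapping to U+DC00 is added by hand to mimic the behavior described in this section.

```python
def decode_like_text_stream(raw: bytes) -> str:
    """Decode UTF-8, mapping each undecodable byte to U+DC80..U+DCFF.

    Hypothetical helper for illustration; relies on Python's
    'surrogateescape' handler rather than on TXR's own decoder.
    """
    text = raw.decode("utf-8", errors="surrogateescape")
    # Assumption modeled by hand: a null character read from a stream becomes U+DC00.
    return text.replace("\x00", "\udc00")


def encode_like_text_stream(text: str) -> bytes:
    """Inverse mapping: escaped code points turn back into the original bytes."""
    return text.replace("\udc00", "\x00").encode("utf-8", errors="surrogateescape")


samples = [b"a\xffb", b"\xc0\x80", b"x\x00y", b"\xe2\x82", "caf\u00e9".encode("utf-8")]
for raw in samples:
    decoded = decode_like_text_stream(raw)
    # Every sample survives a decode/encode round trip byte for byte.
    print(raw, [hex(ord(c)) for c in decoded], encode_like_text_stream(decoded) == raw)
```

The point of the mapping is that data which is not valid UTF-8, including data containing null bytes, passes through decoding and encoding without loss.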
.SS* Regular Expression Directives In place of a piece of text (see section Text above), a regular-expression directive may be used, which has the following syntax: .verb @/RE/ .brev where the RE part enclosed in slashes represents regular-expression syntax (described in the section Regular Expressions below). Long regular expressions can be broken into multiple lines using a backslash-newline sequence. Whitespace before the sequence or after the sequence is not significant, so the following two are equivalent: .verb @/reg \e ular/ @/regular/ .brev There may not be whitespace between the backslash and newline. Whereas literal text simply represents itself, a regular expression denotes a (potentially infinite) set of texts. The regular-expression directive matches the longest piece of text (possibly empty) which belongs to the set denoted by the regular expression. The match is anchored to the current position; thus if the directive is the first element of a line, the match is anchored to the start of a line. If the regular-expression directive is the last element of a line, it is anchored to the end of the line also: the regular expression must match the text from the current position to the end of the line. Even if the regular expression matches the empty string, the match will fail if the input is empty, or has run out of data. For instance suppose the third line of the query is the regular expression .codn @/.*/ , but the input is a file which has only two lines. This will fail: the data has no line for the regular expression to match. A line containing no characters is not the same thing as the absence of a line, even though both abstractions imply an absence of characters. Like text which follows a variable, a regular-expression directive which follows a variable has special semantics, described in the section Variables below. .SS* Variables Much of the query syntax consists of arbitrary text, which matches file data character for character.
Embedded within the query may be variables and directives which are introduced by a .code @ character. Two consecutive .code @@ characters encode a literal .codn @ . A variable-matching or substitution directive is written in one of several ways: .mono .mets >> @ sident .mets <> @{ bident } .mets >> @* sident .mets <> @*{ bident } .mets >> @{ bident <> / regex /} .mets >> @{ bident >> ( fun >> [ arg ...])} .mets >> @{ bident << number } .mets >> @{ bident << bident } .onom The forms with an .code * indicate a long match, see Longest Match below. The forms with the embedded regexp .mono .meti <> / regex / .onom or function or .meta number have special semantics; see Positive Match below. The identifier .code t cannot be used as a name; it is a reserved symbol which denotes the value true. An attempt to use the variable .code @t will result in an exception. The symbol .code nil can be used where a variable name is required syntactically, but it has special semantics, described in a section below. A .meta sident is a "simple identifier" form which is not delimited by braces. A .meta sident consists of any combination of one or more letters, numbers, and underscores. It may not look like a number, so that for instance .code 123 is not a valid .metn sident , but .code 12A is valid. Case is sensitive, so that .code FOO is different from .codn foo , which is different from .codn Foo . The braces around an identifier can be used when material which follows would otherwise be interpreted as being part of the identifier. When a name is enclosed in braces it is a .metn bident . The following additional characters may be used as part of a .meta bident which are not allowed in a .metn sident : .verb ! $ % & * + - < = > ? \e ~ .brev Moreover, most Unicode characters beyond U+007F may appear in a .metn bident , with certain exceptions. 
A character may not be used if it is any of the Unicode space characters, a member of the high or low surrogate region, a member of any Unicode private-use area, or is either of the two characters U+FFFE and U+FFFF. These situations produce a syntax error. Invalid UTF-8 in an identifier is also a syntax error. The rule still holds that a name cannot look like a number so .code +123 is not a valid .meta bident but these are valid: .codn a->b , .codn *xyz* , .codn foo-bar . The syntax .code @FOO_bar introduces the name .codn FOO_bar , whereas .code @{FOO}_bar means the variable named .str FOO followed by the text .strn _bar . There may be whitespace between the .code @ and the name, or opening brace. Whitespace is also allowed in the interior of the braces. It is not significant. If a variable has no prior binding, then it specifies a match. The match is determined from some current position in the data: the character which immediately follows all that has been matched previously. If a variable occurs at the start of a line, it matches some text at the start of the line. If it occurs at the end of a line, it matches everything from the current position to the end of the line. .SS* Negative Match If a variable is one of the plain forms .mono .mets >> @ sident .mets <> @{ bident } .mets >> @* sident .mets <> @*{ bident } .onom then this is a "negative match". The extent of the matched text (the text bound to the variable) is determined by looking at what follows the variable, and ranges from the current position to some position where the following material finds a match. This is why this is called a "negative match": the spanned text which ends up bound to the variable is that in which the match for the trailing material did not occur. A variable may be followed by a piece of text, a regular-expression directive, a function call, a directive, another variable, or nothing (i.e. occurs at the end of a line). These cases are described in detail below. 
.NP* Variable Followed by Nothing If the variable is followed by nothing, the negative match extends from the current position in the data, to the end of the line. Example: .IP code: .mono \ a b c @FOO .onom .IP data: .mono \ a b c defghijk .onom .IP result: .mono \ FOO="defghijk" .onom .NP* Variable Followed by Text For the purposes of determining the negative match, text is defined as a sequence of literal text and regular expressions, not divided by a directive. So for instance in this example: .verb @a:@/foo/bcd e@(maybe)f@(end) .brev .PP the variable .code a is considered to be followed by .strn ":@/foo/bcd e" . If a variable is followed by text, then the extent of the negative match is determined by searching for the first occurrence of that text within the line, starting at the current position. The variable matches everything between the current position and the matching position (not including the matching position). Any whitespace which follows the variable (and is not enclosed inside braces that surround the variable name) is part of the text. For example: .IP code: .mono \ a b @FOO e f .onom .IP data: .mono \ a b c d e f .onom .IP result: .mono \ FOO="c d" .onom .PP In the above example, the pattern text .str "a b " matches the data .strn "a b " . So when the .code @FOO variable is processed, the data being matched is the remaining .strn "c d e f" . The text which follows .code @FOO is .strn " e f" . This is found within the data .str "c d e f" at position 3 (counting from 0). So positions 0\(en2 .mono ("c d") .onom constitute the matching text which is bound to FOO. .NP* Variable Followed by a Function Call or Directive If the variable is followed by a function call, or a directive, the extent is determined by scanning the text for the first position where a match occurs for the entire remainder of the line. (For a description of functions, see Functions.) 
For example: .verb @foo@(bind a "abc")xyz .brev Here, .code @foo will match the text from the current position to where .str "xyz" occurs, even though there is a .code @(bind) directive. Furthermore, if more material is added after the .strn "xyz" , it is part of the search. Note the difference between the following two: .verb @foo@/abc/@(func) @foo@(func)@/abc/ .brev In the first example, .code @foo matches the text from the current position until the match for the regular expression .strn "abc" . .code @(func) is not considered when processing .codn @foo . In the second example, .code @foo matches the text from the current position until the position which matches the function call, followed by a match for the regular expression. The entire sequence .code @(func)@/abc/ is considered. .NP* Consecutive Variables If an unbound variable specifies a fixed-width match or a regular expression, then the issue of consecutive variables does not arise. Such a variable consumes text regardless of any context which follows it. However, what if an unbound variable with no modifier is followed by another variable? The behavior depends on the nature of the other variable. If the other variable is also unbound, and also has no modifier, this is a semantic error which will cause the query to fail. A diagnostic message will be issued, unless operating in quiet mode via .codn -q . The reason is that there is no way to bind two consecutive variables to an extent of text; this is an ambiguous situation, since there is no matching criterion for dividing the text between two variables. (In theory, a repetition of the same variable, like .codn @FOO@FOO , could find a solution by dividing the match extent in half, which would work only in the case when it contains an even number of characters. This behavior seems to have dubious value.) An unbound variable may be followed by one which is bound. 
The bound variable is effectively replaced by the text which it denotes, and the logic proceeds accordingly. It is possible for a variable to be bound to a regular expression. If .code x is an unbound variable and .code y is bound to a regular expression .codn RE , then .code @x@y means .codn @x@/RE/ . A variable .code v can be bound to a regular expression using, for example, .codn "@(bind v #/RE/)" . The .code @* syntax for longest match is available. Example: .IP code: .mono \ @FOO:@BAR@FOO .onom .IP data: .mono \ xyz:defxyz .onom .IP result: .mono \ FOO=xyz, BAR=def .onom .PP Here, .code FOO is matched with .strn "xyz" , based on the delimiting around the colon. The colon in the pattern then matches the colon in the data, so that .code BAR is considered for matching against .strn "defxyz" . .code BAR is followed by .codn FOO , which is already bound to .strn "xyz" . Thus .str "xyz" is located in the .str "defxyz" data following .strn "def" , and so BAR is bound to .strn "def" . If an unbound variable is followed by a variable which is bound to a list, or nested list, then each character string in the list is tried in turn to produce a match. The first match is taken. An unbound variable may be followed by another unbound variable which specifies a regular expression or function call match. This is a special case called a "double variable match". What happens is that the text is searched using the regular expression or function. If the search fails, then neither variable is bound: it is a matching failure. If the search succeeds, then the first variable is bound to the text which is skipped by the search. The second variable is bound to the text matched by the regular expression or function. 
Example: .IP code: .mono \ @foo@{bar /abc/} .onom .IP data: .mono \ xyz@#abc .onom .IP result: .mono \ foo="xyz@#", bar="abc" .onom .PP .NP* Consecutive Variables via Directive Two variables can be de facto consecutive in a manner shown in the following example: .verb @var1@(all)@var2@(end) .brev This is treated just like a variable followed by a directive. No semantic error is identified, even if both variables are unbound. Here, .code @var2 matches everything at the current position, and so .code @var1 ends up bound to the empty string. Example 1: .code b matches at position 0 and .code a binds the empty string: .IP code: .mono \ @a@(all)@b@(end) .onom .IP data: .mono \ abc .onom .IP result: .mono \ a="" b="abc" .onom .PP Example 2: .code *a specifies longest match (see Longest Match below), and so it takes everything: .IP code: .mono \ @*a@(all)@b@(end) .onom .IP data: .mono \ abc .onom .IP result: .mono \ a="abc" b="" .onom .PP .NP* Longest Match The closest-match behavior for the negative match can be overridden to longest match behavior. A special syntax is provided for this: an asterisk between the .code @ and the variable, e.g.: .IP code: .mono \ a @*{FOO}cd .onom .IP data: .mono \ a b cdcdcdcd .onom .IP result: .mono \ FOO="b cdcdcd" .onom .PP .IP code: .mono \ a @{FOO}cd .onom .IP data: .mono \ a b cdcdcd .onom .IP result: .mono \ FOO="b " .onom .PP In the former example, the match extends to the rightmost occurrence of .strn "cd" , and so .code FOO receives .strn "b cdcdcd" . In the latter example, the .code * syntax isn't used, and so a leftmost match takes place. The extent covers only the .strn "b " , stopping at the first .str "cd" occurrence.
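The difference between the leftmost and the longest match can be sketched outside of the pattern language. The following Python fragment is only an illustration of the search rule described above, not the actual implementation; the helper name bind_before and its parameters are hypothetical.

```python
def bind_before(line, start, delim, longest=False):
    """Model a variable followed by literal text.

    Returns (bound_text, position_after_delim), or None if delim
    does not occur at or after start.
    """
    # The leftmost search models @{FOO}cd; the rightmost search
    # models the longest-match form @*{FOO}cd.
    pos = line.rfind(delim, start) if longest else line.find(delim, start)
    if pos < 0:
        return None
    return line[start:pos], pos + len(delim)


data = "a b cdcdcdcd"
start = len("a ")  # the literal "a " has already been matched
print(bind_before(data, start, "cd"))                # ('b ', 6)
print(bind_before(data, start, "cd", longest=True))  # ('b cdcdcd', 12)
```

The only difference between the two modes is the direction of the search for the trailing material; the variable receives whatever text is skipped.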
.SS* Positive Match There are syntactic variants of variable syntax which have an embedded expression enclosed with the variable in braces: .mono .mets >> @{ bident <> / regex /} .mets >> @{ bident >> ( fun >> [ args ...])} .mets >> @{ bident << number } .mets >> @{ bident << bident } .onom These specify a variable binding that is driven by a positive match derived from a regular expression, function or character count, rather than from trailing material (which is regarded as a "negative" match, since the variable is bound to material which is .B skipped in order to match the trailing material). The positive match syntax is processed without considering any following syntax, and therefore may be followed by an unbound variable. In the .mono .meti >> @{ bident <> / regex /} .onom form, the match extends over all characters from the current position which match the regular expression .metn regex . (See the Regular Expressions section below.) If the variable already has a value, the text extracted by the regular expression must exactly match the variable. In the .mono .meti >> @{ bident >> ( fun >> [ args ...])} .onom form, the match extends over lines or characters which are matched by the call to the function, if the call succeeds. Thus .code "@{x (y z w)}" is just like .codn "@(y z w)" , except that the region of text skipped over by .code "@(y z w)" is also bound to the variable .codn x . Except in one special case, the matching takes place horizontally within the current line, and the spanned range of text is treated as a string. The exception is that if the .mono .meti >> @{ bident >> ( fun >> [ args ...])} .onom appears as the only element of a line, and .meta fun has a binding as a vertical function, then the function is invoked in the same manner as it would be by the .mono .meti >> @( fun >> [ args ...]) .onom syntax. Then the variable indicated by .meta bident is bound to the list of lines matched by the function call. 
Pattern functions are described in the Functions section below. The function is invoked even if the variable already has a value. The text matched by the function must match the variable. In the .mono .meti >> @{ bident << number } .onom form, the match processes a field of text which consists of the specified number of characters, which must be a nonnegative number. If the data line doesn't have that many characters starting at the current position, the match fails. A match for zero characters produces an empty string. The text which is actually bound to the variable is all text within the specified field, but excluding leading and trailing whitespace. If the field contains only spaces, then an empty string is extracted. This fixed-field extraction takes place whether or not the variable already has a binding. If it already has a binding, then it must match the extracted, trimmed text. The .mono .meti >> @{ bident << bident } .onom syntax allows the .meta number or .meta regex modifier to come from a variable. The variable must be bound and contain a nonnegative integer or regular expression. For example, .code "@{x y}" behaves like .code "@{x 3}" if .code y is bound to the integer 3. It is an error if .code y is unbound. .coSS Special Symbols @ nil and @ t Just like in the Common Lisp language, the names .code nil and .code t are special. .code nil symbol stands for the empty list object, an object which marks the end of a list, and Boolean false. It is synonymous with the syntax .code () which may be used interchangeably with .code nil in most constructs. In \*(TL, .code nil and .code t cannot be used as variables. When evaluated, they evaluate to themselves. In the \*(TX pattern language, .code nil can be used in the variable binding syntax, but does not create a binding; it has a special meaning. It allows the variable-matching syntax to be used to skip material, in ways similar to the .code skip directive. 
The .code nil symbol is also used as a .code block name, both in the \*(TX pattern language and in \*(TL. A block named .code nil is considered to be anonymous. .SS* Keyword Symbols Names beginning with the .code : (colon) character are keyword symbols. These also stand for themselves and may not be used as variables. Keywords are useful for labeling information and situations. .SS* Regular Expressions Regular expressions are a language for specifying sets of character strings. Through the use of pattern-matching elements, a regular expression is able to denote an infinite set of texts. \*(TX contains an original implementation of regular expressions, which supports the following syntax: .coIP . The period is a "wildcard" that matches any character. .coIP [] Character class: matches a single character, from the set specified by special syntax written between the square brackets. This supports basic regexp character class syntax. POSIX notation like .code [:digit:] is not supported. The regex tokens .codn \es , .code \ed and .code \ew are permitted in character classes, but not their complementing counterparts. These tokens simply contribute their characters to the class. The class .code [a-zA-Z] means match an uppercase or lowercase letter; the class .code [0-9a-f] means match a digit or a lowercase letter; the class .code [^0-9] means match a non-digit, and so forth. There are no locale-specific behaviors in \*(TX regular expressions; .code [A-Z] denotes an ASCII/Unicode range of characters. The class .code [\ed.] means match a digit or the period character. A .code ] or .code - can be used within a character class, but must be escaped with a backslash. A .code ^ in the first position denotes a complemented class, unless it is escaped by backslash. In any other position, it denotes itself. Two backslashes code for one backslash. 
So for instance .code [\e[\e-] means match a .code [ or .code - character, .code [^^] means match any character other than .codn ^ , and .code [\e^\e\e] means match either a .code ^ or a backslash. Regex operators such as .codn * , .code + and .code & appearing in a character class represent ordinary characters. The characters .codn - , .code ] and .code ^ occurring outside of a character class are ordinary. Unescaped .code / characters can appear within a character class. The empty character class .code [] matches no character at all, and its complement .code [^] matches any character, and is treated as a synonym for the .code . (period) wildcard operator. .ccIP @, \es @ \ew and @ \ed These regex tokens each match a single character. The .code \es regex token matches a wide variety of ASCII whitespace characters and Unicode spaces. The .code \ew token matches alphabetic word characters; it is equivalent to the character class .codn [A-Za-z_] . The .code \ed token matches a digit, and is equivalent to .codn [0-9] . .ccIP @, \eS @ \eW and @ \eD These regex tokens are the complemented counterparts of .codn \es , .code \ew and .codn \ed . The .code \eS token matches all those characters which .code \es does not match, .code \eW matches all characters that .code \ew does not match and .code \eD matches nondigits. .coIP empty An empty expression is a regular expression. It represents the set of strings consisting of the empty string; i.e. it matches just the empty string. The empty regex can appear alone as a full regular expression (for instance the \*(TX syntax .code @// with nothing between the slashes) and can also be passed as a subexpression to operators, though this may require the use of parentheses to make the empty regex explicit. For example, the expression .code a| means: match either .codn a , or nothing. 
The forms .code * and .code (*) are syntax errors; though not useful, the correct way to match the empty expression zero or more times is the syntax .codn ()* . .coIP nomatch The nomatch regular expression represents the empty set: it matches no strings at all, not even the empty string. There is no dedicated syntax to directly express nomatch in the regex language. However, the empty character class .code [] is equivalent to nomatch, and may be considered to be a notation for it. Other representations of nomatch are possible: for instance, the regex .code ~.* which is the complement of the regex that denotes the set of all possible strings, and thus denotes the empty set. A nomatch has uses; for instance, it can be used to temporarily "comment out" regular expressions. The regex .code ([]abc|xyz) is equivalent to .codn (xyz) , since the .code []abc branch cannot match anything. Using .code [] to "block" a subexpression allows you to leave it in place, then enable it later by removing the "block". .coIP (R) If .code R is a regular expression, then so is .codn (R) . The contents of parentheses denote one regular expression unit, so that for instance in .codn (RE)* , the .code * operator applies to the entire parenthesized group. The syntax .code () is valid and equivalent to the empty regular expression. .coIP R? Optionally match the preceding regular expression .codn R . .coIP R* Match the expression .code R zero or more times. This operator is sometimes called the "Kleene star", or "Kleene closure". The Kleene closure favors the longest match. Roughly speaking, if there are two or more ways in which .code R1*R2 can match, then that match occurs in which .code R1* matches the longest possible text. .coIP R+ Match the preceding expression .code R one or more times. Like .codn R* , this favors the longest possible match: .code R+ is equivalent to .codn RR* . .coIP R1%R2 Match .code R1 zero or more times, then match .codn R2 . 
If this match can occur in more than one way, then it occurs such that .code R1 is matched the fewest times, which is the opposite of the behavior of .codn R1*R2 . Repetitions of .code R1 terminate at the earliest point in the text where a nonempty match for .code R2 occurs. Because it favors shorter matches, .code % is termed a non-greedy operator. If .code R2 is the empty expression, or equivalent to it, then .code R1%R2 reduces to .codn R1* . So for instance .code (R%) is equivalent to .codn (R*) , since the missing right operand is interpreted as the empty regex. Note that whereas the expression .code (R1*R2) is equivalent to .codn (R1*)R2 , the expression .code (R1%R2) is .B not equivalent to .codn (R1%)R2 . Also note that .code A(XY%Z)B is equivalent to .codn AX(Y%Z)B . This is because the precedence of .code % is higher than that of catenation on its left side; this rule prevents the given syntax from expressing the .code XY catenation. The expression may be understood as: .code A(X(Y%Z))B where the inner parentheses clarify how the syntax surrounding the .code % operator is being parsed, and the outer parentheses are superfluous. The correct way to assert catenation of .code XY as the left operand of .code % is .codn A(XY)%ZB . To specify .code XY as the left operand, and limit the right operand to just .codn Z , the correct syntax is .codn A((XY)%Z)B . By contrast, the expression .code A(X%YZ)B is not equivalent to .code A(X%Y)ZB because the precedence of .code % is lower than that of catenation on its right side. The operator is effectively "bi-precedential". .coIP ~R Match the opposite of the following expression .codn R ; that is, match exactly those texts that .code R does not match. This operator is called complement, or logical not. .coIP R1R2 Two consecutive regular expressions denote catenation: the left expression must match, and then the right. .coIP R1|R2 Match either the expression .code R1 or .codn R2 .
This operator is known by a number of names: union, logical or, disjunction, branch, or alternative. .coIP R1&R2 Match both the expression .code R1 and .code R2 simultaneously; i.e. the matching text must be one of the texts which are in the intersection of the set of texts matched by .code R1 and the set matched by .codn R2 . This operator is called intersection, logical and, or conjunction. .PP Any character which is not a regular-expression operator, a backslash escape, or the slash delimiter, denotes a one-position match of that character itself. Any of the special characters, including the delimiting .codn / , and the backslash, can be escaped with a backslash to suppress its meaning and denote the character itself. Furthermore, all of the same escapes that are described in the section Special Characters in Text above are supported \(em the difference is that in regular expressions, the .code @ character is not required, so for example a tab is coded as .code \et rather than .codn @\et . Octal and hex character escapes can be optionally terminated by a semicolon, which is useful if the following characters are octal or hex digits not intended to be part of the escape. Only the above escapes are supported. Unlike in some other regular-expression implementations, if a backslash appears before a character which isn't a regex special character or one of the supported escape sequences, it is an error. This wasn't true of historic versions of \*(TX. See the COMPATIBILITY section. .IP "Precedence table, highest to lowest:"
.TS
tab(!);
l l l.
Operators!Class!Associativity
\f[4](R) []\f[]!primary!
\f[4]R? R+ R* R%...\f[]!postfix!left-to-right
\f[4]R1R2\f[]!catenation!left-to-right
\f[4]~R ...%R\f[]!unary!right-to-left
\f[4]R1&R2\f[]!intersection!left-to-right
\f[4]R1|R2\f[]!union!left-to-right
.TE
.PP The .code % operator is like a postfix operator with respect to its left operand, but like a unary operator with respect to its right operand.
Thus .code a~b%c~d is .codn a(~(b%(c(~d)))) , demonstrating right-to-left associativity, where all of .code b% may be regarded as a unary operator being applied to .codn c~d . Similarly, .code a?*+%b means .codn (((a?)*)+)%b , where the trailing .code %b behaves like a postfix operator. In \*(TX, regular expression matches do not span multiple lines. The regex language has no feature for multiline matching. However, the .code @(freeform) directive allows the remaining portion of the input to be treated as one string in which line terminators appear as explicit characters. Regular expressions may freely match through this sequence. It's possible for a regular expression to match an empty string. For instance, if the next input character is .codn z , facing the regular expression .codn /a?/ , there is a zero-character match: the regular expression's state machine can reach an acceptance state without consuming any characters. Examples: .IP code: .mono \ @A@/a?/@/.*/ .onom .IP data: .mono \ zzzzz .onom .IP result: .mono \ A="" .onom .PP .IP code: .mono \ @{A /a?/}@B .onom .IP data: .mono \ zzzzz .onom .IP result: .mono \ A="", B="zzzzz" .onom .PP .IP code: .mono \ @*A@/a?/ .onom .IP data: .mono \ zzzzz .onom .IP result: .mono \ A="zzzzz" .onom .PP In the first example, variable .code @A is followed by a regular expression which can match an empty string. The expression faces the letter .code "z" at position 0 in the data line. A zero-character match occurs there; therefore the variable .code A takes on the empty string. The .code @/.*/ regular expression then consumes the line. Similarly, in the second example, the .code /a?/ regular expression faces a .codn "z" , and thus yields an empty string which is bound to .codn A . Variable .code @B consumes the entire line. The third example requests the longest match for the variable binding. Thus, a search takes place for the rightmost position where the regular expression matches.
The regular expression matches anywhere, including the empty string after the last character, which is the rightmost place. Thus variable .code A fetches the entire line. For additional information about the advanced regular-expression operators, see NOTES ON EXOTIC REGULAR EXPRESSIONS below. .SS* Compound Expressions If the .code @ escape character is followed by an open parenthesis or square bracket, this is taken to be the start of a \*(TL compound expression. The \*(TX language has the unusual property that its syntactic elements, so-called .IR directives , are Lisp compound expressions. These expressions not only enclose syntax, but expressions which begin with certain symbols de facto behave as tokens in a phrase structure grammar. For instance, the expression .code @(collect) begins a block which must be terminated by the expression .codn @(end) , otherwise there is a syntax error. The .code collect expression can contain arguments which modify the behavior of the construct, for instance .codn "@(collect :gap 0 :vars (a b))" . In some ways, this situation might be compared to HTML, in which an element such as .code <a> must be terminated by .code </a> and can have attributes such as .codn "<a href=\(dq...\(dq>" . Compound expressions contain subexpressions which are other compound expressions or literal objects of various kinds. Among these are: symbols, numbers, string literals, character literals, quasiliterals and regular expressions. These are described in the following sections. Additional kinds of literal objects exist, which are discussed in the TXR LISP section of the manual. Some examples of compound expressions are: .verb (banana) (a b c (d e f)) ( a (b (c d) (e ) )) ("apple" #\eb #\espace 3) (a #/[a-z]*/ b) (_ `@file.txt`) .brev Symbols occurring in a compound expression follow a slightly more permissive lexical syntax than the .meta bident in the syntax .mono .meti <> @{ bident } .onom introduced earlier.
The .code / (slash) character may be part of an identifier, or even constitute an entire identifier. In fact a symbol inside a directive is a .metn lident . This is described in the Symbol Tokens section under TXR LISP. A symbol must not be a number; tokens that look like numbers are treated as numbers and not symbols. .SS* Character Literals Character literals are introduced by the .code #\e (hash-backslash) syntax, which is either followed by a character name, the letter .code x followed by hex digits, the letter .code o followed by octal digits, or a single character. Valid character names are: .verb nul linefeed return alarm newline esc backspace vtab space tab page pnul .brev For instance .code #\eesc denotes the escape character. This convention for character literals is similar to that of the Scheme language. Note that .code #\elinefeed and .code #\enewline are the same character. The .code #\epnul character is specific to \*(TX and denotes the .code U+DC00 code in Unicode; the name stands for "pseudo-null", which is related to its special function. For more information about this, see the section "Character Handling and International Characters". .SS* String Literals String literals are delimited by double quotes. A double quote within a string literal is encoded using .mono \e" .onom and a backslash is encoded as .codn \e\e . Backslash escapes like .code \en and .code \et are recognized, as are hexadecimal escapes like .code \exFF or .code \exabc and octal escapes like .codn \e123 . Ambiguity between an escape and subsequent text can be resolved by adding a semicolon delimiter after the escape: .str "\exabc;d" is a string consisting of the character .code "U+0ABC" followed by .strn "d" . The semicolon delimiter disappears. To write a literal semicolon immediately after a hex or octal escape, write two semicolons, the first of which will be interpreted as a delimiter. Thus, .str "\ex21;;" represents .strn "!;" . 
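The semicolon rule for hexadecimal escapes can be sketched in Python. This is a rough illustration only, under stated assumptions: the function name decode_hex_escapes is hypothetical, the sketch assumes that an escape consumes every hex digit that follows it, and it handles hexadecimal escapes only, ignoring octal escapes, escaped backslashes and all other escape sequences.

```python
import re

# Assumption: backslash, x, one or more hex digits, then at most one
# terminating semicolon which is consumed together with the escape.
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]+);?")


def decode_hex_escapes(body):
    """Replace each hex escape in body with the code point it denotes."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), body)


# Without the semicolon, the trailing "d" is absorbed into the escape.
print(len(decode_hex_escapes(r"\xabcd")))   # 1
# With the semicolon, the escape ends after "abc" and "d" is plain text.
print(len(decode_hex_escapes(r"\xabc;d")))  # 2
# Only the first semicolon is a delimiter; the second is literal.
print(decode_hex_escapes(r"\x21;;"))        # !;
```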
Note that the source code syntax of \*(TX string literals is specified in UTF-8, which is decoded into an internal string representation consisting of code points. The numeric escape sequences are an abstract syntax for specifying code points, not for specifying bytes to be inserted into the UTF-8 representation, even if they lie in the 8-bit range. Bytes cannot be directly specified, other than literally. However, when a \*(TX string object is encoded to UTF-8, every code point lying in the range U+DC00 through U+DCFF is converted to a single byte by taking the low-order eight bits of its value. By manipulating code points in this special range, \*(TX programs can reproduce arbitrary byte sequences in text streams. Also note that the .code \eu escape sequence for specifying code points found in some languages is unnecessary and absent, since the existing hexadecimal and octal escapes satisfy this requirement. More detailed information is given in the earlier section Character Handling and International Characters. If the line ends in the middle of a literal, it is an error, unless the last character is a backslash. This backslash is a special escape which does not denote a character; rather, it indicates that the string literal continues on the next line. The backslash is deleted, along with whitespace which immediately precedes it, as well as leading whitespace in the following line. The escape sequence .str "\e " (backslash space) can be used to encode a significant space. Example: .verb "foo \e bar" "foo \e \e bar" "foo\e \e bar" .brev The first string literal is the string .strn "foobar" . The second two are .strn "foo bar" . .SS* Word List Literals A word list literal (WLL) provides a convenient way to write a list of strings when such a list can be given as whitespace-delimited words. 
There are two flavors of the WLL: the regular WLL which begins with .mono #" .onom (hash, double quote) and the splicing list literal which begins with .mono #*" .onom (hash, star, double quote). Both types are terminated by a double quote, which may be escaped as .mono \e" .onom in order to include it as a character. All the escaping conventions used in string literals can be used in word literals. Unlike in string literals, whitespace (tabs and spaces) is not significant in word literals: it separates words. A whitespace character may be escaped with a backslash in order to include it as a literal character. Just like in string literals, an unescaped newline character is not allowed. A newline preceded by a backslash is permitted. Such an escaped backslash, together with any leading and trailing unescaped whitespace, is removed and replaced with a single space. Example: .verb #"abc def ghi" --> notates ("abc" "def" "ghi") #"abc def \e ghi" --> notates ("abc" "def" "ghi") #"abc\e def ghi" --> notates ("abc def" "ghi") #"abc\e def\e \e \e ghi" --> notates ("abc def " " ghi") .brev A splicing word literal differs from a word literal in that it does not produce a list of string literals, but rather it produces a sequence of string literals that is merged into the surrounding syntax. Thus, the following two notations are equivalent: .verb (1 2 3 #*"abc def" 4 5 #"abc def") (1 2 3 "abc" "def" 4 5 ("abc" "def")) .brev The regular WLL produced a single list object, but the splicing WLL expanded into multiple string literal objects. .SS* String Quasiliterals Quasiliterals are similar to string literals, except that they may contain variable references denoted by the usual .code @ syntax. The quasiliteral represents a string formed by substituting the values of those variables into the literal template. 
If .code a is bound to .str "apple" and .code b to .strn "banana" , the quasiliteral .code "`one @a and two @{b}s`" represents the string .strn "one apple and two bananas" . A backquote escaped by a backslash represents itself. Unlike in directive syntax, two consecutive .code @ characters do not code for a literal .codn @ , but cause a syntax error. The reason for this is that compounding of the .code @ syntax is meaningful. Instead, there is a .code \e@ escape for encoding a literal .code @ character. Quasiliterals support the full output variable syntax. Expressions within variable substitutions follow the evaluation rules of \*(TL. This hasn't always been the case: see the COMPATIBILITY section. Quasiliterals can be split into multiple lines in the same way as ordinary string literals. .SS* Quasiword List Literals The quasiword list literals (QLLs) are to quasiliterals what WLLs are to ordinary literals. (See the above section Word List Literals.) A QLL combines the convenience of the WLL with the power of quasistrings. Just as in the case of WLLs, there are two flavors of the QLL: the regular QLL which begins with .code #` (hash, backquote) and the splicing QLL which begins with .code #*` (hash, star, backquote). Both types are terminated by a backquote, which may be escaped as .code \e` in order to include it as a character. All the escaping conventions used in quasiliterals can be used in QLLs. Unlike in quasiliterals, whitespace (tabs and spaces) is not significant in QLLs: it separates words. A whitespace character may be escaped with a backslash in order to include it as a literal character. A newline is not permitted unless escaped. An escaped newline works exactly the same way as it does in WLLs. Note that the delimiting into words is done before the variable substitution. If the variable .code a contains spaces, then .code #`@a` nevertheless expands into a list of one item: the string derived from .codn a . 
Examples: .verb #`abc @a ghi` --> notates (`abc` `@a` `ghi`) #`abc @d@e@f \e ghi` --> notates (`abc` `@d@e@f` `ghi`) #`@a\e @b @c` --> notates (`@a @b` `@c`) .brev A splicing QLL differs from an ordinary QLL in that it does not produce a list of quasiliterals, but rather it produces a sequence of quasiliterals that is merged into the surrounding syntax. .SS* Numbers \*(TX supports integers and floating-point numbers. An integer literal is made up of digits .code 0 through .codn 9 , optionally preceded by a .code + or .code - sign. The character .code , (comma) may appear between digits, as a visual separator of no semantic significance. The digit sequence must start and end with a digit. Runs of consecutive commas are permitted. Commas outside of the digit sequence are interpreted as the Lisp unquote syntax. Compatibility note: support for separator commas appeared in \*(TX 283. Older \*(TX versions interpret commas in the middle of numeric constants as instances of the unquote syntax.
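The rules for separator commas in decimal integer tokens can be summarized in a small Python sketch. It is an illustration of the rules as stated here, not the actual lexical analyzer; the name parse_int_token is hypothetical, and the sketch simply rejects a token where the real reader would go on to treat a stray comma as unquote syntax.

```python
import re

# Optional sign, a leading digit, then optionally any mix of digits and
# commas that ends in a digit. Runs of consecutive commas are allowed.
_INT_TOKEN = re.compile(r"[+-]?[0-9](?:[0-9,]*[0-9])?")


def parse_int_token(token):
    """Return the integer denoted by token, or None if it is not one."""
    if _INT_TOKEN.fullmatch(token) is None:
        return None
    # The commas carry no meaning; drop them before conversion.
    return int(token.replace(",", ""))


print(parse_int_token("-1,000,000,001"))  # -1000000001
print(parse_int_token("1,2,3,,4"))        # 1234
print(parse_int_token(",123"))            # None
print(parse_int_token("123,"))            # None
```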
Examples: .verb 123 -34 +0 -0 +234483527304983792384729384723234 -1,000,000,001 1,2,3,,4 ;; equivalent to 1234 .brev Examples that are not integer tokens: .verb ,123 ;; equivalent to (sys:unquote 123) 123,a ;; equivalent to 123, followed by (sys:unquote a) -,1 ;; symbol - followed by (sys:unquote 1) .brev An integer constant can also be specified in hexadecimal using the prefix .code #x followed by an optional sign, followed by hexadecimal digits: .code 0 through .code 9 and the uppercase or lowercase letters .code A through .codn F : .verb #xFF ;; 255 #x-ABC ;; -2748 .brev These digits may contain separator commas, just as in the case of the decimal integer: .verb #xFFFF,FFFF,FFFF .brev Similarly, octal numbers are supported with the prefix .code #o followed by octal digits: .verb #o777 ;; 511 #o123,456 ;; 42797 .brev and binary numbers can be written with a .code #b prefix: .verb #b1110 ;; 14 #b1111,1111 ;; 255 .brev A comma between the radix prefix and digits is a syntax error: .verb #x,DEF5,549C ;; Syntax error #b,1001,1101 ;; Likewise .brev Note that the .code #b prefix is also used for buffer literals. A floating-point literal is marked by the inclusion of a decimal point, the scientific E notation, or both. It is an optional sign, followed by a mantissa consisting of digits, a decimal point, more digits, and then an optional E notation consisting of the letter .code "e" or .codn "E" , an optional .code + or .code - sign, and then digits indicating the exponent value. In the mantissa, the digits are not optional. At least one digit must either precede the decimal point or follow it. That is to say, a decimal point by itself is not a floating-point constant. The digits of the mantissa may include separator commas, in the same manner as decimal integer literals, in both the integer and fractional part. The digits of the exponent may not include separator commas. Examples: .verb .123 123. 
1E-3 20E40 .9E1 9.E19 -.5 +3E+3 1.E5 1,123,456.935,342E+013 .brev Examples which are not floating-point constant tokens: .verb . ;; dot token, not a number 123E ;; the symbol 123E 1.0E- ;; syntax error: invalid floating point constant 1.0E ;; syntax error: invalid floating point constant 1.E ;; syntax error: invalid floating point literal .e ;; syntax error: dot token followed by symbol ,1.0 ;; equivalent to (sys:unquote 1.0) .brev In \*(TX there is a special "dotdot" token consisting of two consecutive periods. An integer constant followed immediately by dotdot is recognized as such; it is not treated as a floating constant followed by a dot. That is to say, .code 123.. does not mean .code "123. ." (floating point .code 123.0 value followed by dot token). It means .code "123 .." (integer .code 123 followed by .code .. token). Dialect Note: unlike in Common Lisp, .code 123. is not an integer, but the floating-point number .codn 123.0 . Integers within a certain small range centered on zero have .code fixnum type. Values in the .code fixnum range fit into a Lisp value directly, not requiring heap allocation. A value which is implemented as a reference to a heap-allocated object is called .IR boxed , whereas a self-contained value not referencing any storage elsewhere is called .IR unboxed . Thus values in the .code fixnum range are unboxed; those outside of the range have .code bignum type instead, and are boxed. The variables .code fixnum-min and .code fixnum-max indicate the range. Floating-point values are all unboxed if \*(TX is built with "NaN boxing" enabled; otherwise they are all boxed. The Lisp expression .code "(eq (read \(dq0.0\(dq) (read \(dq0.0\(dq))" returns .code t under NaN boxing, indicating that the two instances of 0.0 are the same object. In the absence of NaN boxing, the two .code read calls produce distinct, boxed representations of 0.0, which compare unequal under .codn eq .
(The expression .code "(eq 0.0 0.0)" may not be relied upon if it is compiled, since compilation may deduplicate identical boxed literals, leading to a false positive.) .SS* Comments Comments of the form .code @; were introduced earlier. Inside compound expressions, another convention for comments exists: Lisp comments, which are introduced by the .code ; (semicolon) character and span to the end of the line. Example: .verb @(foo ; this is a comment bar ; this is another comment ) .brev This is equivalent to .codn "@(foo bar)" . .SH* DIRECTIVES .SS* Overview When a \*(TL compound expression occurs in \*(TX preceded by a .codn @ , it is a .IR directive . Directives which are based on certain symbols are, additionally, involved in a phrase-structure syntax which uses Lisp expressions as if they were tokens. For instance, the directive .verb @(collect) .brev not only denotes a compound expression with the .code collect symbol in its head position, but it also introduces a syntactic phrase which requires a matching .code @(end) directive. In other words, .code @(collect) is not only an expression, but serves as a kind of token in a higher-level, phrase-structure grammar. Effectively, .code collect is a reserved symbol in the \*(TX language. A \*(TX program cannot use this symbol as the name of a pattern function due to its role in the syntax. The symbol has no reserved role in \*(TL. Usually if this type of directive occurs alone in a line, not preceded or followed by other material, it is involved in a "vertical" (or line-oriented) syntax. If such a directive is embedded in a line (has preceding or trailing material) then it is in a horizontal syntactic and semantic context (character-oriented). There is an exception: the definition of a horizontal function looks like this: .verb @(define name (arg))body material@(end) .brev Yet, this is considered one vertical item, which means that it does not match a line of data. 
(This is necessary because all horizontal syntax matches something within a line of data, which is undesirable for definitions.) Many directives exhibit both horizontal and vertical syntax, with different but closely related semantics. Some are vertical only, some are horizontal only. A summary of the available directives follows: .coIP @(eof) Explicitly match the end of file. Fails if unmatched data remains in the input stream. Can capture or match the termination status of a pipe. .coIP @(eol) Explicitly match the end of line. Fails if the current position is not the end of a line. Also fails if no data remains (there is no current line). .coIP @(next) Continue matching in another file or data source. .coIP @(block) Groups together a sequence of directives into a logical named block, which can be explicitly terminated from within by using the .code @(accept) and .code @(fail) directives. Blocks are described in the section Blocks below. .coIP @(skip) Treat the remaining query as a subquery unit, and search the lines (or characters) of the input file until that subquery matches somewhere. A .code skip is also an anonymous block. .coIP @(trailer) Treat the remaining query or subquery as a match for a trailing context. That is to say, if the remainder matches, the data position is not advanced. .coIP @(freeform) Treat the remainder of the input as one big string, and apply the following query line to that string. The newline characters (or custom separators) appear explicitly in that string. .coIP @(fuzz) The .code fuzz directive, inspired by the patch utility, specifies a partial match for some lines. .ccIP @ @(line) and @ @(chr) These directives match a variable or expression against the current line number or character position. .coIP @(name) Match a variable against the name of the current data source. .coIP @(data) Match a variable against the remaining data (a lazy list of strings). .coIP @(some) Multiple clauses are each applied to the same input.
Succeeds if at least one of the clauses matches the input. The bindings established by earlier successful clauses are visible to the later clauses. .coIP @(all) Multiple clauses are applied to the same input. Succeeds if and only if each one of the clauses matches. The clauses are applied in sequence, and evaluation stops on the first failure. The bindings established by earlier successful clauses are visible to the later clauses. .coIP @(none) Multiple clauses are applied to the same input. Succeeds if and only if none of them match. The clauses are applied in sequence, and evaluation stops on the first success. No bindings are ever produced by this construct. .coIP @(maybe) Multiple clauses are applied to the same input. No failure occurs if none of them match. The bindings established by earlier successful clauses are visible to the later clauses. .coIP @(cases) Multiple clauses are applied to the same input. Evaluation stops on the first successful clause. .coIP @(require) The .code require directive is similar to the .code do directive in that it evaluates one or more \*(TL expressions. If the result of the rightmost expression is .codn nil , then .code require triggers a match failure. See the TXR LISP section far below. .ccIP @, @(if) @, @(elif) and @ @(else) The .code if directive with optional .code elif and .code else clauses allows one of multiple bodies of pattern-matching directives to be conditionally selected by testing the values of Lisp expressions. It is also available inside .code @(output) for conditionally selecting output clauses. .coIP @(choose) Multiple clauses are applied to the same input. The one whose effect persists is the one which maximizes or minimizes the length of a particular variable. .coIP @(empty) The .code @(empty) directive matches the empty string. It is useful in certain situations, such as expressing an empty match in a directive that doesn't accept an empty clause. 
The .code @(empty) syntax has another meaning in .code @(output) clauses, in conjunction with .codn @(repeat) . .meIP @(define < name >> ( args ...)) Introduces a function. Functions are described in the Functions section below. .meIP @(call < expr << arg *) Performs function indirection. Evaluates .metn expr , which must produce a symbol that names a pattern function. Then that pattern function is invoked. .coIP @(gather) Searches text for matches for multiple clauses which may occur in arbitrary order. For convenience, lines of the first clause are treated as separate clauses. .coIP @(collect) Search the data for multiple matches of a clause. Collect the bindings in the clause into lists, which are output as array variables. The .code @(collect) directive is line-oriented. It works with a multiline pattern and scans line by line. A similar directive called .code @(coll) works within one line. A collect is an anonymous block. .coIP @(and) Separator of clauses for .codn @(some) , .codn @(all) , .codn @(none) , .code @(maybe) and .codn @(cases) . Equivalent to .codn @(or) . The choice is stylistic. .coIP @(or) Separator of clauses for .codn @(some) , .codn @(all) , .codn @(none) , .code @(maybe) and .codn @(cases) . Equivalent to .codn @(and) . The choice is stylistic. .coIP @(end) Required terminator for .codn @(some) , .codn @(all) , .codn @(none) , .codn @(maybe) , .codn @(cases) , .codn @(if) , .codn @(collect) , .codn @(coll) , .codn @(output) , .codn @(repeat) , .codn @(rep) , .codn @(try) , .code @(block) and .codn @(define) . .coIP @(fail) Terminate the processing of a block, as if it were a failed match. Blocks are described in the section Blocks below. .coIP @(accept) Terminate the processing of a block, as if it were a successful match. What bindings emerge may depend on the kind of block: .code collect has special semantics. Blocks are described in the section Blocks below. 
.coIP @(try) Indicates the start of a try block, which is related to exception handling, described in the Exceptions section below. .ccIP @ @(catch) and @ @(finally) Special clauses within .codn @(try) . See Exceptions below. .ccIP @ @(defex) and @ @(throw) Define custom exception types; throw an exception. See Exceptions below. .coIP @(assert) The .code assert directive requires the following material to match, otherwise it throws an exception. It is useful for catching mistakes or omissions in parts of a query that are surefire matches. .coIP @(flatten) Normalizes a set of specified variables to one-dimensional lists. Those variables which have a scalar value are reduced to lists of that value. Those which are lists of lists (to an arbitrary level of nesting) are converted to flat lists of their leaf values. .coIP @(merge) Binds a new variable which is the result of merging two or more other variables. Merging has somewhat complicated semantics. .coIP @(cat) Decimates a list (any number of dimensions) to a string, by catenating its constituent strings, with an optional separator string between all of the values. .coIP @(bind) Binds one or more variables against a value using a structural pattern match. A limited form of unification takes place which can cause a match to fail. .coIP @(set) Destructively assigns one or more existing variables using a structural pattern, using syntax similar to bind. Assignment to unbound variables triggers an error. .coIP @(rebind) Evaluates an expression in the current binding environment, and then creates new bindings for the variables in the structural pattern. Useful for temporarily overriding variable values in a scope. .coIP @(forget) Removes variable bindings. .coIP @(local) Synonym of .codn @(forget) . .ccIP @ @(output) and @ @(push) A directive which encloses an output clause in the query.
An output section does not match text, but produces text which can be directed to various destinations, the default being standard output. Most directives cannot be used inside an output clause. The .code @(push) clause is a variant of .code @(output) which produces text that is implicitly pushed back into the input stream to be matched. .coIP @(repeat) A directive understood within an .code @(output) section, for repeating multiline text, with successive substitutions pulled from lists. The directive .code @(rep) produces iteration over lists horizontally within one line. These directives have a different meaning in matching clauses, providing a shorthand notation for .code "@(collect :vars nil)" and .codn "@(coll :vars nil)" , respectively. .coIP @(deffilter) The .code deffilter directive is used for defining named filters, which are useful for filtering variable substitutions in output blocks. Filters are useful when data must be translated between different representations that have different special characters or other syntax, requiring escaping or similar treatment. Note that it is also possible to use a function as a filter. See Function Filters below. Named filters are stored in the hash table held in the Lisp special variable .codn *filters* . .coIP @(filter) The .code filter directive passes one or more variables through a given filter or chain of filters, updating them with the filtered values. .ccIP @ @(load) and @ @(include) The .code load and .code include directives allow \*(TX programs to be modularized. They bring in code from a file, in two different ways. .coIP @(do) The .code do directive is used to evaluate \*(TL expressions, discarding their result values. See the TXR LISP section far below. .coIP @(mdo) The .code mdo (macro .codn do ) directive evaluates \*(TL expressions immediately, during the parsing of the \*(TX syntax in which it occurs. .coIP @(in-package) The .code in-package directive is used to switch to a different symbol package.
It mirrors the \*(TL macro of the same name. .PP .SS* Subexpression Evaluation Some directives contain subexpressions which are evaluated. Two distinct styles of evaluations occur in \*(TX: bind expressions and Lisp expressions. Which semantics applies to an expression depends on the syntactic context in which it occurs: which position in which directive. The evaluation of \*(TL expressions is described in the TXR LISP section of the manual. Bind expressions are so named because they occur in the .code @(bind) directive. \*(TX pattern function invocations also treat argument expressions as bind expressions. The .codn @(rebind) , .codn @(set) , .codn @(merge) , and .code @(deffilter) directives also use bind expression evaluation. Bind expression evaluation also occurs in the argument position of the .code :tlist keyword in the .code @(next) directive. Unlike Lisp expressions, bind expressions do not support operators. If a bind expression is a nested list structure, it is a template denoting that structure. Any symbol in any position of that structure is interpreted as a variable. When the bind expression is evaluated, those corresponding positions in the template are replaced by the values of the variables. Anywhere where a variable can appear in a bind expression's nested list structure, a Lisp expression can appear preceded by the .code @ character. That Lisp expression is evaluated and its value is substituted into the bind expression's template. Moreover, a Lisp expression preceded by .code @ can be used as an entire bind expression. The value of that Lisp expression is then taken as the bind expression value. Any object in a bind expression which is not a nested list structure containing Lisp expressions or variables denotes itself literally. .TP* Examples: In the following examples, the variables .code a and .code b are assumed to have the string values .str foo and .strn bar , respectively. The .code -> notation indicates the value of each expression. 
.verb a -> "foo" (a b) -> ("foo" "bar") ((a) ((b) b)) -> (("foo") (("bar") "bar")) (list a b) -> error: unbound variable list @(list a b) -> ("foo" "bar") ;; Lisp expression (a @[b 1..:]) -> ("foo" "ar") ;; Lisp eval of [b 1..:] (a @(+ 2 2)) -> ("foo" 4) ;; Lisp eval of (+ 2 2) #(a b) -> (a b) ;; Vector literal, not list. [a b] -> error: unbound variable dwim .brev The last example above .code "[a b]" is a notation equivalent to .code "(dwim a b)" and so follows similarly to the example involving .codn list . .SS* Input Scanning and Data Manipulation .dir next The .code next directive indicates that the remaining directives in the current block are to be applied against a new input source. It can only occur by itself as the only element in a query line, and takes various arguments, according to these possibilities: .mono .mets @(next) .mets @(next < source <> [ :nothrow ] <> [ :noclose ]) .mets @(next :args) .mets @(next :env) .mets @(next :list << lisp-expr ) .mets @(next :tlist << bind-expr ) .mets @(next :string << lisp-expr ) .mets @(next :var << var ) .mets @(next nil) .onom The lone .code @(next) without arguments specifies that subsequent directives will match inside the next file in the argument list which was passed to \*(TX on the command line. If .meta source is given, it must be a \*(TL expression which denotes an input source. Its value may be a string or an input stream. For instance, if variable .code A contains the text .strn "data" , then .code "@(next A)" means switch to the file called .strn "data" , and .code "@(next `@A.txt`)" means to switch to the file .strn "data.txt" . The directive .code "@(next (open-command `git log`))" switches to the input stream connected to the output of the .code "git log" command. If the input source cannot be opened for whatever reason, \*(TX throws an exception (see Exceptions below). An unhandled exception will terminate the program. 
Often, such a drastic measure is inconvenient; if .code @(next) is invoked with the .code :nothrow keyword, then if the input source cannot be opened, the situation is treated as a simple match failure. The .code :nothrow keyword also ensures that when the stream is later closed, which occurs when the lazy list reads all of the available data, the implicit call to the .code close-stream function specifies .code nil as the argument value to that function's .meta throw-on-error-p parameter. This .code :nothrow mechanism does not suppress all exceptions related to the processing of that stream; unusual conditions encountered during the reading of data from the stream may throw exceptions. When the subsequent directives which follow .code @(next) are processed, the directive terminates, and any stream which had been opened for .meta source is closed. If the .code :noclose keyword is present, then this is prevented; the stream remains open. Note: keeping the stream open may be necessary if the .code @(data) directive is used to capture the input list into a variable whose value is used after the .code @(next) directive terminates, because the input list is lazy, and may depend on the stream continuing to be open. The variant .code "@(next :args)" means that the remaining command-line arguments are to be treated as a data source. For this purpose, each argument is considered to be a line of text. The argument list does include that argument which specifies the file that is currently being processed or was most recently processed. As the arguments are matched, they are consumed. This means that if a .code @(next) directive without arguments is executed in the scope of .codn "@(next :args)" , it opens the file named by the first unconsumed argument. To process arguments, and then continue with the original file and argument list, wrap the argument processing in a .codn @(block) . 
When the block terminates, the input source and argument list are restored to what they were before the block. The variant .code "@(next :env)" means that the list of process environment variables is treated as a source of data. It looks like a text file stream consisting of lines of the form .strn "name=value" . If this feature is not available on a given platform, an exception is thrown. The syntax .mono .meti @(next :list << lisp-expr ) .onom treats \*(TL expression .meta lisp-expr as a source of text. The value of .meta lisp-expr is flattened to a simple list in a way similar to the .code @(flatten) directive. The resulting list is treated as if it were the lines of a text file: each element of the list must be a string, which represents a line. If the strings happen to contain embedded newline characters, they are a visible constituent of the line, and do not act as line separators. The syntax .mono .meti @(next :tlist << bind-expr ) .onom is similar to .code "@(next :list ...)" except that .meta bind-expr is not a \*(TL expression, but a \*(TX bind expression. The syntax .mono .meti @(next :var << var ) .onom requires .meta var to be a previously bound variable. The value of the variable is retrieved and treated like a list, in the same manner as under .codn "@(next :list ...)" . Note that .code "@(next :var x)" is not always the same as .codn "@(next :tlist x)" , because .code ":var x" strictly requires .code x to be a \*(TX variable, whereas the .code x in .code ":tlist x" is an expression which can potentially refer to a Lisp variable. The syntax .mono .meti @(next :string << lisp-expr ) .onom treats expression .meta lisp-expr as a source of text. The value of the expression must be a string. Newlines in the string are interpreted as line terminators. A string which is not terminated by a newline is tolerated, so that: .verb @(next :string "abc") @a .brev binds .code a to .strn "abc" .
Likewise, this is also the case with input files and other streams whose last line is not terminated by a newline. However, watch out for empty strings, which are analogous to a correctly formed empty file which contains no lines: .verb @(next :string "") @a .brev This will not bind .code a to .strn "" ; it is a matching failure. The behavior of .code :list is different. The query .verb @(next :list "") @a .brev binds .code a to .strn "" . The reason is that under .code :list the string .str "" is flattened to the list .mono ("") .onom which is not an empty input stream, but a stream consisting of one empty line. The .code "@(next nil)" variant indicates that the following subquery is applied to empty data, and the list of data sources from the command line is considered empty. This directive is useful in front of \*(TX code which doesn't process data sources from the command line, but takes command-line arguments. The .code "@(next nil)" incantation absolutely prevents \*(TX from trying to open the first command-line argument as a data source. Note that the .code @(next) directive only redirects the source of input over the scope of the subquery in which that directive appears. For example, the following query looks for the line starting with .str "xyz" at the top of the file .strn "foo.txt" , within a .code some directive. After the .code @(end) which terminates the .codn @(some) , the .str "abc" is matched in the previous input stream which was in effect before the .code @(next) directive: .verb @(some) @(next "foo.txt") xyz@suffix @(end) abc .brev However, if the .code @(some) subquery successfully matched .str "xyz@suffix" within the file .codn foo.txt , there is now a binding for the .code suffix variable, which is visible to the remainder of the entire query. The variable bindings survive beyond the clause, but the data stream does not. .dir skip The .code skip directive considers the remainder of the query as a search pattern.
The remainder is no longer required to strictly match at the current line in the current input stream. Rather, the current stream is searched, starting with the current line, for the first line where the entire remainder of the query will successfully match. If no such line is found, the .code skip directive fails. If a matching position is found, the remainder of the query is processed from that point. The remainder of the query can itself contain .code skip directives. Each such directive performs a recursive subsearch. Skip comes in vertical and horizontal flavors. For instance, skip and match the last line: .verb @(skip) @last @(eof) .brev Skip and match the last character of the line: .verb @(skip)@{last 1}@(eol) .brev The .code skip directive has two optional arguments, which are evaluated as \*(TL expressions. If the first argument evaluates to an integer, its value limits the range of lines scanned for a match. Judicious use of this feature can improve the performance of queries. Example: scan until .str "size: @SIZE" matches, which must happen within the next 15 lines: .verb @(skip 15) size: @SIZE .brev Without the range limitation, .code skip will keep searching until it consumes the entire input source. In a horizontal .codn skip , the range-limiting numeric argument is expressed in characters, so that .verb abc@(skip 5)def .brev means: there must be a match for .str "abc" at the start of the line, and then within the next five characters, there must be a match for .strn "def" . Sometimes a skip is nested within a .codn collect , or following another skip. For instance, consider: .verb @(collect) begin @BEG_SYMBOL @(skip) end @BEG_SYMBOL @(end) .brev The above .code collect iterates over the entire input. But, potentially, so does the embedded .codn skip . Suppose that .str "begin x" is matched, but the data has no matching .strn "end x" . 
The skip will search in vain all the way to the end of the data, and then the collect will try another iteration back at the beginning, just one line down from the original starting point. If it is a reasonable expectation that an .code "end x" occurs within 15 lines of a .strn "begin x" , this can be specified instead: .verb @(collect) begin @BEG_SYMBOL @(skip 15) end @BEG_SYMBOL @(end) .brev If the symbol .code nil is used in place of a number, it means to scan an unlimited range of lines; thus, .code "@(skip nil)" is equivalent to .codn @(skip) . If the symbol .code :greedy is used, it changes the semantics of the skip to longest match semantics. For instance, match the last three space-separated tokens of the line: .verb @(skip :greedy) @a @b @c .brev Without .codn :greedy , the variable .code @c may match multiple tokens, and end up with spaces in it, because nothing follows .code @c and so it matches from any position which follows a space to the end of the line. Also note the space in front of .codn @a . Without this space, .code @a will get an empty string. A line-oriented example of greedy skip: match the last line without using .codn @(eof) : .verb @(skip :greedy) @last_line .brev There may be a second numeric argument. This specifies a minimum number of lines to skip before looking for a match. For instance, skip 15 lines and then search indefinitely for .codn "begin ..." : .verb @(skip nil 15) begin @BEG_SYMBOL .brev The two arguments may be used together. For instance, the following matches if and only if the 15th line of input starts with .codn "begin " : .verb @(skip 1 15) begin @BEG_SYMBOL .brev Essentially, .mono .meti @(skip 1 << n ) .onom means "hard skip by .meta n lines". .code "@(skip 1 0)" is the same as .codn "@(skip 1)" , which is a noop, because it means: "the remainder of the query must match starting on the next line", or, more briefly, "skip exactly zero lines", which is the behavior if the .code skip directive is omitted altogether.
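
As a further illustration of combining the two arguments, the following
sketch skips a fixed-size header and then bounds the search that follows.
The three-line header, the ten-line range and the
.code total:
line format are hypothetical, chosen only for this example:

.verb
  @(skip 10 3)
  total: @amount
.brev

Going by the argument descriptions above, this should first skip three lines,
and then require a line beginning with
.str "total: "
to be found within the range of ten lines scanned from that point, failing
otherwise.
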
Here is one trick for grabbing the fourth line from the bottom of the input: .verb @(skip) @fourth_from_bottom @(skip 1 3) @(eof) .brev Or using greedy skip: .verb @(skip :greedy) @fourth_from_bottom @(skip 1 3) .brev Non-greedy skip with the .code @(eof) directive has a slight advantage because the greedy skip will keep scanning even though it has found the correct match, then backtrack to the last good match once it runs out of data. The regular skip with explicit .code @(eof) will stop when the .code @(eof) matches. .NP* Reducing Backtracking with Blocks The .code skip directive can consume considerable CPU time when multiple skips are nested. Consider: .verb @(skip) A @(skip) B @(skip) C .brev This is actually nesting: the second and third skips occur within the body of the first one, and thus this creates nested iteration. \*(TX is searching for the combination of skips which match the pattern of lines .codn A , .code B and .code C with backtracking behavior. The outermost skip marches through the data until it finds .code A followed by a pattern match for the second skip. The second skip iterates to find .code B followed by the third skip, and the third skip iterates to find .codn C . If .code A and .code B are only one line each, then this is reasonably fast. But suppose there are many lines matching .code A and .codn B , giving rise to a large number of combinations of skips which match .code A and .codn B , and yet do not find a match for .codn C , triggering backtracking. The nested stepping which tries the combinations of .code A and .code B can give rise to a considerable running time. One way to deal with the problem is to unravel the nesting with the help of blocks. For example: .verb @(block) @ (skip) A @(end) @(block) @ (skip) B @(end) @(skip) C .brev Now the scope of each skip is just the remainder of the block in which it occurs. The first skip finds .codn A , and then the block ends. 
Control passes to the next block, and backtracking will not take place to a block which completed (unless all these blocks are enclosed in some larger construct which backtracks, causing the blocks to be re-executed). This rewrite is not equivalent, and cannot be used for instance in backreferencing situations such as: .verb @; @; Find three lines anywhere in the input which are identical. @; @(skip) @line @(skip) @line @(skip) @line .brev This example depends on the nested search-within-search semantics. .dir trailer The .code trailer directive introduces a trailing portion of a query or subquery which matches input material normally, but in the event of a successful match, does not advance the current position. This can be used, for instance, to cause .code @(collect) to match partially overlapping regions. Trailer can be used in vertical context: .mono .mets @(trailer) .mets < directives .mets ... .onom or horizontal: .mono .mets @(trailer) < directives ... .onom A vertical .code trailer prevents the vertical input position from advancing as it is matched by .metn directives , whereas a horizontal .code trailer prevents the horizontal position from advancing. In other words, .code trailer performs matching without consuming the input, providing a lookahead mechanism. Example: .verb @(collect) @line @(trailer) @(skip) @line @(end) .brev This script collects each line which has a duplicate somewhere later in the input. Without the .code @(trailer) directive, this does not work properly for inputs like: .verb 111 222 111 222 .brev Without .codn @(trailer) , the first duplicate pair constitutes a match which spans over the .codn 222 . After that pair is found, the matching continues after the second .codn 111 . With the .code @(trailer) directive in place, the collect body, on each iteration, only consumes the lines matched prior to .codn @(trailer) .
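
As another sketch of the lookahead behavior, the following query, which is
not taken from any real program, pairs each line with its successor.
The variable names
.code first
and
.code second
are arbitrary:

.verb
  @(collect)
  @first
  @(trailer)
  @second
  @(end)
.brev

Because the line matched by
.code @second
is not consumed, the next iteration of the
.code collect
is expected to begin at that same line. For a three-line input consisting of
.codn a ,
.code b
and
.codn c ,
this should collect
.code a
and
.code b
into
.code first
and
.code b
and
.code c
into
.codn second ;
the final line has no successor, so no pair is collected for it.
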
.dir freeform The .code freeform directive provides a useful alternative to \*(TX's line-oriented matching discipline. The .code freeform directive treats all remaining input from the current input source as one big line. The query line which immediately follows freeform is applied to that line. The syntax variations are: .verb @(freeform) ... query line .. .mets @(freeform << number ) ... query line .. .mets @(freeform << string ) ... query line .. .mets @(freeform < number << string ) ... query line .. .brev where .meta number and .meta string denote \*(TL expressions which evaluate to an integer or string value, respectively. If .meta number and .meta string are both present, they may be given in either order. If the .meta number argument is given, its value limits the range of lines which are combined together. For instance .code "@(freeform 5)" means to only consider the next five lines to be one big line. Without this argument, .code freeform is "bottomless". It can match the entire file, which creates the risk of allocating a large amount of memory. If the .meta string argument is given, it specifies a custom line terminator. The default terminator is .strn "\en" . The terminator does not have to be one character long. Freeform does not convert the entire remainder of the input into one big line all at once, but does so in a dynamic, lazy fashion, which takes place as the data is accessed. So at any time, only some prefix of the data exists as a flat line in which newlines are replaced by the terminator string, and the remainder of the data still remains as a list of lines. After the subquery is applied to the virtual line, the unmatched remainder of that line is broken up into multiple lines again, by looking for and removing all occurrences of the terminator string within the flattened portion. Care must be taken if the terminator is other than the default .strn "\en" . 
All occurrences of the terminator string are treated as line terminators in the flattened portion of the data, so extra line breaks may be introduced. Likewise, in the yet unflattened portion, no breaking takes place, even if the text contains occurrences of the terminator string. The extent of data which is flattened, and the amount of it which remains, depends entirely on the query line underneath .codn @(freeform) . In the following example, lines of data are flattened using $ as the line terminator. .IP code: .mono \ @(freeform "$") @a$@b: @c @d .onom .IP data: .mono \ 1 2:3 4 .onom .IP "output (\f[4]-B\f[]):" .mono \ a="1" b="2" c="3" d="4" .onom .PP The data is turned into the virtual line .codn 1$2:3$4$ . The .code @a$@b: subquery matches the .code 1$2: portion, binding .code a to .strn 1 , and .code b to .strn 2 . The remaining portion .code 3$4$ is then split into separate lines again according to the line terminator .codn $ : .verb 3 4 .brev Thus the remainder of the query .verb @c @d .brev faces these lines, binding .code c to .code 3 and .code d to .codn 4 . Note that since the data does not contain dollar signs, there is no ambiguity; the meaning may be understood in terms of the entire data being flattened and split again. In the following example, .code freeform is used to solve a tokenizing problem. The Unix password file has fields separated by colons. Some fields may be empty. Using freeform, we can join the password file using .str ":" as a terminator. By restricting freeform to one line, we can obtain each line of the password file with a terminating .strn ":" , allowing for a simple tokenization, because now the fields are colon-terminated rather than colon-separated. Example: .verb @(next "/etc/passwd") @(collect) @(freeform 1 ":") @(coll)@{token /[^:]*/}:@(end) @(end) .brev .dir fuzz The .code fuzz directive allows for an imperfect match spanning a set number of lines.
It takes two arguments, both of which are \*(TL expressions that should evaluate to integers: .mono .meti @(fuzz m n) ... .onom This expresses that over the next .meta n query lines, the matching strictness is relaxed a little bit. Only .meta m out of those .meta n lines have to match. Afterward, the rest of the query follows normal, strict processing. In the degenerate situation where there are fewer than .meta n query lines following the .code fuzz directive, then .meta m of them must succeed anyway. (If there are fewer than .metn m , then this is impossible.) .dirs line chr The .code line and .code chr directives perform binding between the current input line number or character position within a line, against an expression or variable: .verb @(line 42) @(line x) abc@(chr 3)def@(chr y) .brev The directive .code "@(line 42)" means "match the current input line number against the integer 42". If the current line is 42, then the directive matches, otherwise it fails. .code line is a vertical directive which doesn't consume a line of input. Thus, the following matches at the beginning of an input stream, and .code x ends up bound to the first line of input: .verb @(line 1) @(line 1) @(line 1) @x .brev The directive .code "@(line x)" binds variable .code x to the current input line number, if .code x is an unbound variable. If .code x is already bound, then the value of .code x must match the current line number, otherwise the directive fails. The .code chr directive is similar to .code line except that it's a horizontal directive, and matches the character position rather than the line position. Character positions are measured from zero, rather than one. .code chr does not consume a character. Hence the two occurrences of .code chr in the following example both match, and .code x takes the entire line of input: .verb @(chr 0)@(chr 0)@x .brev The argument of .code line or .code chr may be an .codn @ -delimited Lisp expression. 
This is useful for matching computed lines or character positions: .verb @(line @(+ a (* b c))) .brev .dir name The .code name directive performs a binding between the name of the current data source and a variable or bind expression: .verb @(name na) @(name "data.txt") .brev If .code na is an unbound variable, it is bound and takes on the name of the data source, such as a file name. If .code na is bound, then it has to match the name of the data source, otherwise the directive fails. The directive .mono @(name "data.txt") .onom fails unless the current data source has that name. .dir data The .code data directive performs a binding between the unmatched data at the current position, and a variable or bind expression. The unmatched data takes the form of a list of strings: .verb @(data d) .brev The binding is performed on object equality. If .code d is already bound, a matching failure occurs unless .code d contains the current unmatched data. Matching the current data has various uses. For instance, two branches of pattern matching can, at some point, bind the current data into different variables. When those paths join, the variables can be bound together to create the assertion that the current data had been the same at those points: .verb @(all) @ (skip) foo @ (skip) bar @ (data x) @(or) @ (skip) xyzzy @ (skip) bar @ (data y) @(end) @(require (eq x y)) .brev Here, two branches of the .code @(all) match some material which ends in the line .codn "bar" . However, it is possible that this is a different line. The .code data directives are used to create an assertion that the data regions matched by the two branches are identical. That is to say, the unmatched data .code x captured after the first .code "bar" and the unmatched data .code y captured after the second .code "bar" must be the same object in order for .code "@(require (eq x y))" to succeed, which implies that the same .code "bar" was matched in both branches of the .codn @(all) .
Another use of .code data is simply to gain access to the trailing remainder of the unmatched input in order to print it, or do some special processing on it. The .code tprint Lisp function is useful for printing the unmatched data as newline-terminated lines: .verb @(data remainder) @(do (tprint remainder)) .brev .dir eof The .code eof directive, if not given any argument, matches successfully when no more input is available from the current input source. In the following example, the .meta line variable captures the text .str "One-line file" and then since that is the last line of input, the .code eof directive matches: .IP code: .mono \ @line @(eof) .onom .IP data: .mono \ One-line file .onom .PP If the data consisted of two or more lines, .code eof would fail. The .code eof directive may be given a single argument, which is a pattern that matches the termination status of the input source. This is useful when the input source is a process pipe. For the purposes of .codn eof , sources which are not process pipes have the symbol .code t as their termination status. In the following example, which assumes the availability of a POSIX shell command interpreter in the host system, the variable .meta a captures the string .str a and the .meta status variable captures the integer value .codn 5 , which is the termination status of the command: .verb @(next (open-command "echo a; exit 5")) @a @(eof status) .brev .dirs some all none maybe cases choose These directives, called the parallel directives, combine multiple subqueries, which are applied at the same input position, rather than to consecutive input. They come in vertical (line mode) and horizontal (character mode) flavors. In horizontal mode, the current position is understood to be a character position in the line being processed. The clauses advance this character position by moving it to the right. In vertical mode, the current position is understood to be a line of text within the stream. 
A clause advances the position by some whole number of lines. The syntax of these parallel directives follows this example: .verb @(some) subquery1 . . . @(and) subquery2 . . . @(and) subquery3 . . . @(end) .brev And in horizontal mode: .verb @(some)subquery1...@(and)subquery2...@(and)subquery3...@(end) .brev Long horizontal lines can be broken up with line continuations, allowing the above example to be written like this, which is considered a single logical line: .verb @(some)@\e subquery1...@\e @(and)@\e subquery2...@\e @(and)@\e subquery3...@\e @(end) .brev The .codn @(some) , .codn @(all) , .codn @(none) , .codn @(maybe) , .code @(cases) or .code @(choose) must be followed by at least one subquery clause, and be terminated by .codn @(end) . If there are two or more subqueries, these additional clauses are indicated by .code @(and) or .codn @(or) , which are interchangeable. The separator and terminator directives also must appear as the only element in a query line. The .code choose directive requires keyword arguments. See below. The syntax supports arbitrary nesting. For example: .verb QUERY: SYNTAX TREE: @(all) all -+ @ (skip) +- skip -+ @ (some) | +- some -+ it | | +- TEXT @ (and) | | +- and @ (none) | | +- none -+ was | | | +- TEXT @ (end) | | | +- end @ (end) | | +- end a dark | +- TEXT @(end) *- end .brev Nesting can be indicated using whitespace between .code @ and the directive expression. Thus, the above is an .code @(all) query containing a .code @(skip) clause which applies to a .code @(some) that is followed by the text line .strn "a dark" . The .code @(some) clause combines the text line .strn it , and a .code @(none) clause which contains just one clause consisting of the line .strn was . The semantics of the parallel directives is: .coIP @(all) Each of the clauses is matched at the current position. If any of the clauses fails to match, the directive fails (and thus does not produce any variable bindings).
Clauses following the failed clause are not evaluated. Bindings extracted by a successful clause are visible to the clauses which follow, and if the directive succeeds, all of the combined bindings emerge. .meIP @(some [ :resolve >> ( var ...) ]) Each of the clauses is matched at the current position. If any of the clauses succeed, the directive succeeds, retaining the bindings accumulated by the successfully matching clauses. Evaluation does not stop on the first successful clause. Bindings extracted by a successful clause are visible to the clauses which follow. The .code :resolve parameter is for situations when the .code @(some) directive has multiple clauses that need to bind some common variables to different values: for instance, output parameters in functions. Resolve takes a list of variable name symbols as an argument. This is called the resolve set. If the clauses of .code @(some) bind variables in the resolve set, those bindings are not visible to later clauses. However, those bindings do emerge out of the .code @(some) directive as a whole. This creates a conflict: what if two or more clauses introduce different bindings for a variable in the resolve set? This is why it is called the resolve set: conflicts for variables in the resolve set are automatically resolved in favor of later clauses. Example: .verb @(some :resolve (x)) @ (bind a "a") @ (bind x "x1") @(or) @ (bind b "b") @ (bind x "x2") @(end) .brev Here, the two clauses both introduce a binding for .codn x . Without the .code :resolve parameter, this would mean that the second clause fails, because .code x comes in with the value .strn x1 , which does not bind with .strn x2 . But because .code x is placed into the resolve set, the second clause does not see the .str x1 binding. Both clauses establish their bindings independently, creating a conflict over .codn x .
The conflict is resolved in favor of the second clause, and so the bindings which emerge from the directive are: .verb a="a" b="b" x="x2" .brev .coIP @(none) Each of the clauses is matched at the current position. The directive succeeds only if all of the clauses fail. If any clause succeeds, the directive fails, and subsequent clauses are not evaluated. Thus, this directive never produces variable bindings, only matching success or failure. .coIP @(maybe) Each of the clauses is matched at the current position. The directive always succeeds, even if all of the clauses fail. Whatever bindings are found in any of the clauses are retained. Bindings extracted by any successful clause are visible to the clauses which follow. .coIP @(cases) Each of the clauses is matched at the current position. The clauses are matched, in order, at the current position. If any clause matches, the matching stops and the bindings collected from that clause are retained. Any remaining clauses after that one are not processed. If no clause matches, the directive fails, and produces no bindings. .meIP @(choose [ :longest < var | :shortest < var ]) Each of the clauses is matched at the current position in order. In this construct, bindings established by an earlier clause are not visible to later clauses. Although any or all of the clauses can potentially match, the clause which succeeds is the one which maximizes or minimizes the length of the text bound to the specified variable. The other clauses have no effect. For all of the parallel directives other than .code @(none) and .codn @(choose) , the query advances the input position by the greatest number of lines that match in any of the successfully matching subclauses that are evaluated. The .code @(none) directive does not advance the input position. 
For instance, if there are two subclauses, and one of them matches three lines, but the other one matches five lines, then the overall clause is considered to have made a five line match at its position. If more directives follow, they begin matching five lines down from that position. .dir require The syntax of .code @(require) is: .mono .mets @(require << lisp-expression ) .onom The .code require directive evaluates a \*(TL expression. (See TXR LISP far below.) If the expression yields a true value, then it succeeds, and matching continues with the directives which follow. Otherwise the directive fails. In the context of the .code require directive, the expression should not be introduced by the .code @ symbol; it is expected to be a Lisp expression. Example: .verb @; require that 4 is greater than 3 @; This succeeds; therefore, @a is processed @(require (> (+ 2 2) 3)) @a .brev .dir if The .code if directive allows for conditional selection of pattern-matching clauses, based on the Boolean results of Lisp expressions. A variant of the .code if directive is also available for use inside .code output clauses, where it similarly allows for the conditional selection of output clauses. The syntax of the .code if directive can be exemplified as follows: .mono .mets @(if << lisp-expr ) . . . .mets @(elif << lisp-expr ) . . . .mets @(elif << lisp-expr ) . . . @(else) . . . @(end) .onom The .code @(elif) and .code @(else) clauses are all optional. If .code @(else) is present, it must be last, before .codn @(end) , after any .code @(elif) clauses. Any of the clauses may be empty. .TP* "Example:" .verb @(if (> (length str) 42)) foo: @a @b @(else) {@c} @(end) .brev In this example, if the length of the variable .code str is greater than .codn 42 , then matching continues with .strn "foo: @a @b" , otherwise it proceeds with .codn {@c} . .PP More precisely, how the .code if directive works is as follows.
The Lisp expressions are evaluated in order, starting with the .code if expression, then the .code elif expressions if any are present. If any Lisp expression yields a true result (any value other than .codn nil ) then evaluation of Lisp expressions stops. The corresponding clause of that Lisp expression is selected and pattern matching continues with that clause. The result of that clause (its success or failure, and any newly bound variables) is then taken as the result of the .code if directive. If none of the Lisp expressions yield true, and an .code else clause is present, then that clause is processed and its result determines the result of the .code if directive. If none of the Lisp expressions yield true, and there is no .code else clause, then the .code if directive is deemed to have trivially succeeded, allowing matching to continue with whatever directive follows it. .coNP The Lisp @ if versus TXR @ if The .code @(output) directive supports the embedding of Lisp expressions, whose values are interpolated into the output. In particular, Lisp .code if expressions are useful. For instance .code "@(if expr \(dqA\(dq \(dqB\(dq)" reproduces .code A if .code expr yields a true value, otherwise .codn B . Yet the .code @(if) directive is also supported in .codn @(output) . How the apparent conflict between the two is resolved is that the two take different numbers of arguments. An .code @(if) which has no arguments at all is a syntax error. One that has one argument is the head of the .code if directive syntax which must be terminated by .code @(end) and which takes the optional .code @(elif) and .code @(else) clauses. An .code @(if) which has two or more arguments is parsed as a self-contained Lisp expression. .dir gather Sometimes text is structured as items that can appear in an arbitrary order. 
When multiple matches need to be extracted, there is a combinatorial explosion of possible orders, making it impractical to write pattern matches for all the possible orders. The .code gather directive is for these situations. It specifies multiple clauses which all have to match somewhere in the data, but in any order. For further convenience, the lines of the first clause of the .code gather directive are implicitly treated as separate clauses. The syntax follows this pattern: .verb @(gather) one-line-query1 one-line-query2 . . . one-line-queryN @(and) multi line query1 . . . @(and) multi line query2 . . . @(end) .brev The multiline clauses are optional. The .code gather directive takes keyword parameters, see below. .coNP The @ until / @ last clause in @ gather Similarly to .codn collect , .code gather has an optional .cod3 until / last clause: .verb @(gather) ... @(until) ... @(end) .brev How .code gather works is that the text is searched for matches for the single-line and multiline queries. The clauses are applied in the order in which they appear. Whenever one of the clauses matches, any bindings it produces are retained and it is removed from further consideration. Multiple clauses can match at the same text position. The position advances by the longest match from among the clauses which matched. If no clauses match, the position advances by one line. The search stops when all clauses are eliminated, and then the cumulative bindings are produced. If the data runs out, but unmatched clauses remain, the directive fails. Example: extract several environment variables, which do not appear in a particular order: .verb @(next :env) @(gather) USER=@USER HOME=@HOME SHELL=@SHELL @(end) .brev If the .code until or .code last clause is present and a match occurs, then the matches from the other clauses are discarded and the .code gather terminates. 
The difference between .cod3 until / last is that any bindings established in .code last are retained, and the input position is advanced past the matching material. The .cod3 until / last clause has visibility to bindings established in the previous clauses in that same iteration, even though those bindings end up thrown away. For consistency, the .code :mandatory keyword is supported in the .cod3 until / last clause of .codn gather . The semantics of using .code :mandatory in this situation is tricky. In particular, if it is in effect, and the .code gather terminates successfully by collecting all required matches, it will trigger a failure. On the other hand, if the .code until or .code last clause activates before all required matches are gathered, a failure also occurs, whether or not the clause is .codn :mandatory . Meaningful use of .code :mandatory requires that the gather be open-ended; it must allow some (or all) variables not to be required. The presence of the option means that for .code gather to succeed, all required variables must be gathered first, but then termination must be achieved via the .cod3 until / last clause before all .code gather clauses are satisfied. .coNP Keyword Parameters in @ gather The .code gather directive accepts the keyword parameter .codn :vars . The argument to .code :vars is a list of required and optional variables. A required variable is specified as a symbol. An optional variable is specified as a two element list which pairs a symbol with a Lisp expression. That Lisp expression is evaluated and specifies the default value for the variable. Example: .verb @(gather :vars (a b c (d "foo"))) ... @(end) .brev Here, .codn a , .code b and .code c are required variables, and .code d is optional, with the default value given by the Lisp expression .strn foo . The presence of .code :vars changes the behavior in three ways.
Firstly, even if all the clauses in the .code gather match successfully and are eliminated, the directive will fail if the required variables do not have bindings. It doesn't matter whether the bindings are existing, or whether they are established by .codn gather . Secondly, if some of the clauses of .code gather did not match, but all of the required variables have bindings, then the directive succeeds. Without the presence of .codn :vars , it would fail in this situation. Thirdly, if .code gather succeeds (all required variables have bindings), then all of the optional variables which do not have bindings are given bindings to their default values. The expressions which give the default values are evaluated whenever the .code gather directive is evaluated, whether or not their values are used. .dir collect The syntax of the .code collect directive is: .verb @(collect) ... lines of subquery @(end) .brev or with an .code until or .code last clause: .verb @(collect) ... lines of subquery: main clause @(until) ... lines of subquery: until clause @(end) @(collect) ... lines of subquery: main clause @(last) ... lines of subquery: last clause @(end) .brev The .code repeat symbol may be specified instead of .codn collect , which changes the meaning: .verb @(repeat) ... lines of subquery @(end) .brev The .code @(repeat) syntax is equivalent to .code "@(collect :vars nil)" and doesn't take the .code :vars clause. It accepts other .code collect parameters. The subquery is matched repeatedly, starting at the current line. If it fails to match, it is tried starting at the subsequent line. If it matches successfully, it is tried at the line following the entire extent of matched data, if there is one. Thus, the collected regions do not overlap. (Overlapping behavior can be obtained: see the .code @(trailer) directive.) 
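.PP
As a hedged illustration of the search-and-skip behavior just described,
consider the following sketch. The field names and the data are invented
for the example. The query is:
.verb
@(collect)
name: @n
value: @v
@(end)
.brev
and the data is:
.verb
name: a
value: 1
noise
name: b
value: 2
.brev
The two-line body should match at the first line, be retried at the line
following the matched pair, fail at the
.code noise
line, and then match again one line further down. Under that reading, the
bindings which emerge should be along these lines:
.verb
n[0]="a"
n[1]="b"
v[0]="1"
v[1]="2"
.brev
.PP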
Unless certain keywords are specified, or unless the collection is explicitly failed with .codn @(fail) , it always succeeds, even if it collects nothing, and even if the .cod3 until / last clause never finds a match. If no .cod3 until / last clause is specified, and the .code collect is not limited using parameters, the collection is unbounded: it consumes the entire data file. .coNP The @ until / @ last clause in @ collect If an .cod3 until / last clause is specified, the collection stops when that clause matches at the current position. If an .code until clause terminates .codn collect , no bindings are collected at that position, even if the main clause matches at that position also. Moreover, the position is not advanced. The remainder of the query begins matching at that position. If a .code last clause terminates .codn collect , the behavior is different. Any bindings captured by the main clause are thrown away, just like with the .code until clause. However, the bindings in the .code last clause itself survive, and the position is advanced to skip over that material. Example: .IP code: .mono \ @(collect) @a @(until) 42 @b @(end) @c .onom .IP data: .mono \ 1 2 3 42 5 6 .onom .IP result: .mono \ a[0]="1" a[1]="2" a[2]="3" c="42" .onom .PP The line .code 42 is not collected, even though it matches .codn @a . Furthermore, the .code @(until) does not advance the position, so variable .code c takes .codn 42 . If the .code @(until) is changed to .code @(last) the output will be different: .IP result: .mono \ a[0]="1" a[1]="2" a[2]="3" b="5" c="6" .onom .PP The .code 42 is not collected into a list, just like before. But now the binding captured by .code @b emerges. Furthermore, the position advances so variable .code c now takes .codn 6 . The binding variables within the clause of a .code collect are treated specially. The multiple matches for each variable are collected into lists, which then appear as array variables in the final output.
Example: .IP code: .mono \ @(collect) @a:@b:@c @(end) .onom .IP data: .mono \ John:Doe:101 Mary:Jane:202 Bob:Coder:313 .onom .IP result: .mono \ a[0]="John" a[1]="Mary" a[2]="Bob" b[0]="Doe" b[1]="Jane" b[2]="Coder" c[0]="101" c[1]="202" c[2]="313" .onom .PP The query matches the data in three places, so each variable becomes a list of three elements, reported as an array. Variables with list bindings may be referenced in a query. They denote a multiple match. The .code -D command-line option can establish a one-dimensional list binding. The clauses of .code collect may be nested. Variable matches collated into lists in an inner .code collect are again collated into nested lists in the outer .codn collect . Thus an unbound variable wrapped in N nestings of .code @(collect) will be an N-dimensional list. A one-dimensional list is a list of strings; a two-dimensional list is a list of lists of strings, etc. It is important to note that the variables which are bound within the main clause of a .codn collect , that is, the variables which are subject to collection, appear, within the .codn collect , as normal one-value bindings. The collation into lists happens outside of the .codn collect . So for instance in the query: .mono @(collect) @x=@x @(end) .onom The left .code @x establishes a binding for some material preceding an equal sign. The right .code @x refers to that binding. The value of .code @x is different in each iteration, and these values are collected. What finally comes out of the .code collect clause is a single variable called .code x which holds a list containing each value that was ever instantiated under that name within the .code collect clause. Also note that the .code until clause has visibility over the bindings established in the main clause. This is true even in the terminating case when the .code until clause matches, and the bindings of the main clause are discarded. 
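.PP
The visibility of main-clause bindings in the
.code until
clause can be sketched as follows. This is an illustrative example, not one
taken from a real query: the sentinel string
.str stop
and the data are assumptions, and the data is assumed to contain the
sentinel line.
.verb
@(collect)
@a
@(until)
@(require (equal a "stop"))
@(end)
.brev
Given the four data lines
.codn 1 ,
.codn 2 ,
.code stop
and
.codn 3 ,
the main clause binds
.code a
on every line, and the
.code until
clause tests that single-valued binding. When
.code a
holds
.strn stop ,
the
.code until
clause matches, the binding from that iteration is discarded, and the
collection ends without advancing the position, so the expected result is:
.verb
a[0]="1"
a[1]="2"
.brev
.PP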
.coNP Keyword Parameters in @ collect By default, .code collect searches the rest of the input indefinitely, or until the .cod3 until / last clause matches. It skips arbitrary amounts of nonmatching material before the first match, and between matches. Within the .code @(collect) syntax, it is possible to specify keyword parameters for additional control of the behavior. A keyword parameter consists of a keyword symbol followed by an argument, enclosed within the .code @(collect) syntax. The following are the supported keywords. .meIP :maxgap < n The .code :maxgap keyword takes a numeric argument .metn n , which is a Lisp expression. It causes .code collect to terminate if it fails to find a match after skipping .meta n lines from the starting position, or more than .meta n lines since any successful match. For example, .verb @(collect :maxgap 5) .brev specifies that the gap between the current position and the first match for the body of the .codn collect , or between consecutive matches can be no longer than five lines. A .code :maxgap value of .code 0 means that the collected regions must be adjacent and must match right from the starting position. For instance: .verb @(collect :maxgap 0) M @a @(end) .brev means: from here, collect consecutive lines of the form .strn "M ..." . This will not search for the first such line, nor will it skip lines which do not match this form. .meIP :mingap < n The .code :mingap keyword complements .codn :maxgap , though not exactly. Its argument .metn n , a Lisp expression, specifies a minimum number of lines which must separate consecutive matches. However, it has no effect on the distance from the starting position to the first match. .meIP :gap < n The .code :gap keyword effectively specifies .code :mingap and .code :maxgap at the same time, and can only be used if these other two are not used. Thus: .verb @(collect :gap 1) @a @(end) .brev means: collect every other line starting with the current line.
.meIP :times < n This shorthand means the same thing as if .meIP :mintimes < n :maxtimes < n were specified. This means that exactly .meta n matches must occur. If fewer occur, then .code collect fails. The .code collect stops once it achieves .meta n matches. .meIP :mintimes < n The argument .meta n of the .code :mintimes keyword is a Lisp expression which specifies that at least .meta n matches must occur, or else .code collect fails. .meIP :maxtimes < n The Lisp argument expression .meta n of the .code :maxtimes keyword specifies that at most .meta n matches are collected. .meIP :lines < n The argument .meta n of the .code :lines keyword parameter is a Lisp expression which specifies the upper bound on how many lines should be scanned by .codn collect , measuring from the starting position. The extent of the .code collect body is not counted. Example: .verb @(collect :lines 2) foo: @a bar: @b baz: @c @(end) .brev The above .code collect will look for a match only twice: at the current position, and one line down. .meIP :vars >> ({ variable | >> ( variable << default-value)}*) The .code :vars keyword specifies a restriction on what variables will emanate from the .codn collect . Its argument is a list of variable names. An empty list may be specified using empty parentheses or, equivalently, the symbol .codn nil . The .meta default-value element of the syntax is a Lisp expression. The behavior of the .code :vars keyword is specified in the following section, "Specifying variables in .codn collect \(dq. .meIP :lists <> ( variable *) The .code :lists keyword indicates a list of variables. After the .code collect terminates, each .meta variable in the list which does not have a binding is bound to the empty list symbol .codn nil . Unlike .code :vars the .code :lists mechanism doesn't assert that only the listed variables may emanate from the .codn collect . It also doesn't assert that each iteration of the .code collect must bind each of those variables.
.meIP :counter >> { variable | >> ( variable << starting-value )} The .code :counter keyword's argument is a variable name symbol, or a compound expression consisting of a variable name symbol and the \*(TL expression .metn starting-value . If this keyword argument is specified, then a binding for .meta variable is established prior to each repetition of the .code collect body, to an integer value representing the repetition count. By default, repetition counts begin at zero. If .meta starting-value is specified, it must evaluate to a number. This number is then added to each repetition count, and .meta variable takes on the resulting displaced value. If there is an existing binding for .meta variable prior to the processing of the .codn collect , then the variable is shadowed. The binding is collected in the same way as other bindings that are established in the .code collect body. The repetition count only increments after a successful match. The .code variable is visible to the .codn collect 's .cod3 until / last clause. If that clause is being processed after a successful match of the body, then .meta variable holds an integer value. If the body fails to match, then the .cod3 until / last clause sees a binding for .code variable with a value of .codn nil . .PP .coNP Specifying Variables in @ collect Normally, any variable for which a new binding occurs in a .code collect block is collected. A .code collect clause may be "sloppy": it can neglect to collect some variables on some iterations, or bind some variables which are intended to behave like local temporaries, but end up collated into lists. Another issue is that the .code collect clause might not match anything at all, and then none of the variables are bound. The .code :vars keyword allows the query writer to add discipline to the .code collect body. The argument to .code :vars is a list of variable specs.
A variable spec is either a symbol, denoting a required variable, or a .mono .meti >> ( symbol << default-value ) .onom pair, where .meta default-value is a Lisp expression whose value specifies a default value for the variable, which is optional. When a .code :vars list is specified, it means that only the given variables can emerge from the successful .codn collect . Any newly introduced bindings for other variables do not propagate. More precisely, whenever the .code collect body matches successfully, the following three rules apply: .IP 1. If .code :vars specifies required variables, the .code collect body must bind all of them, or else must not bind any variable at all, whether listed in .code :vars or not, otherwise an exception of type .code query-error is thrown. .IP 2. If .code :vars specifies required variables, and also specifies default variables, and the .code collect body binds no variable at all, then the default variables are not bound to their default values. .IP 3. If .code :vars specifies optional variables, and all required variables are bound by the .code collect body, then all those optional variables that are not bound by the .code collect body are bound to their default values. Under this rule, if .code :vars specifies no required variables, that is deemed to be logically equivalent to all required variables being bound. .PP In the event that .code collect does not match anything, the variables specified in .codn :vars , whether required or optional, are all bound to empty lists. These bindings are established after the processing of the .cod3 until / last clause, if present. Example: .verb @(collect :vars (a b (c "foo"))) @a @c @(end) .brev Here, if the body .str @a @c matches, an error will be thrown because one of the mandatory variables is .codn b , and the body neglects to produce a binding for .codn b . 
Example: .verb @(collect :vars (a (c "foo"))) @a @b @(end) .brev Here, if .str @a @b matches, only .code a will be collected, but not .codn b , because .code b is not in the variable list. Furthermore, because there is no binding for .code c in the body, a binding is created with the value .strn foo , exactly as if .code c matched such a piece of text. In the following example, the assumption is that .code "THIS NEVER MATCHES" is not found anywhere in the input but the line .code "THIS DOES MATCH" is found and has a successor which is bound to .codn a . Because the body did not match, the .code :vars .code a and .code b should be bound to empty lists. But .code a is bound by the last clause to some text, so this takes precedence. Only .code b is bound to an empty list. .verb @(collect :vars (a b)) THIS NEVER MATCHES @(last) THIS DOES MATCH @a @(end) .brev The following means: do not allow any variables to propagate out of any iteration of the .code collect and therefore collect nothing: .verb @(collect :vars nil) ... @(end) .brev Instead of writing .codn "@(collect :vars nil)" , it is possible to write .codn @(repeat) . .code @(repeat) takes all .code collect keywords, except for .codn :vars . There is a .code @(repeat) directive used in .code @(output) clauses; that is a different directive. .coNP Mandatory @ until and @ last The .cod3 until / last clause supports the option keyword .codn :mandatory , exemplified by the following: .verb @(collect) ... @(last :mandatory) ... @(end) .brev This means that the .code collect .B must be terminated by a match for the .cod3 until / last clause, or else by an explicit .codn @(accept) . Specifically, the .code collect cannot terminate due to simply running out of data, or exceeding a limit on the number of matches that may be collected. In those situations, if an .code until or .code last clause is present with .codn :mandatory , the .code collect is deemed to have failed. 
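.PP
A minimal sketch of the difference this makes, using an invented terminator
line
.codn END :
.verb
@(collect)
@a
@(last :mandatory)
END
@(end)
.brev
If the data consists of the lines
.codn 1 ,
.code 2
and
.codn END ,
the
.code last
clause matches at the third line, so the
.code collect
should succeed with
.code a
holding the first two lines. If the
.code END
line is absent, the
.code collect
runs out of data without the
.code last
clause ever matching and, by the rule above, is expected to fail rather than
succeed with whatever it had collected.
.PP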
.dir coll The .code coll directive is the horizontal version of .codn collect . Whereas .code collect works with multiline clauses on line-oriented material, .code coll works within a single line. With .codn coll , it is possible to recognize repeating regularities within a line and collect lists. Regular-expression-based Positive Match variables work well with .codn coll . Example: collect a comma-separated list, terminated by a space. .IP code: .mono \ @(coll)@{A /[^, ]+/}@(until) @(end)@B .onom .IP data: .mono \ foo,bar,xyzzy blorch .onom .IP result: .mono \ A[0]="foo" A[1]="bar" A[2]="xyzzy" B=blorch .onom .PP Here, the variable .code A is bound to tokens which match the regular expression .codn "/[^, ]+/" : a nonempty sequence of characters other than commas or spaces. Like .codn collect , .code coll searches for matches. If no match occurs at the current character position, it tries at the next character position. Whenever a match occurs, it continues at the character position which follows the last character of the match, if such a position exists. If not bounded by an until clause, it will exhaust the entire line. If the until clause matches, then the collection stops at that position, and any bindings from that iteration are discarded. Like collect, coll also supports an .cod3 until / last clause, which propagates variable bindings and advances the position. The .code :mandatory keyword is supported. .code coll clauses nest, and variables bound within a coll are available to clauses within the rest of the .code coll clause, including the .cod3 until / last clause, and appear as single values. The final list aggregation is only visible after the .code coll clause. The behavior of .code coll leads to difficulties when delimited variables are used to match material which is delimiter separated rather than terminated. For instance, entries in comma-separated files usually do not appear as .str a,b,c, but rather .strn a,b,c .
So for instance, the following result is not satisfactory: .IP code: .mono \ @(coll)@a @(end) .onom .IP data: .mono \ 1 2 3 4 5 .onom .IP result: .mono \ a[0]="1" a[1]="2" a[2]="3" a[3]="4" .onom .PP The .code 5 is missing because it isn't followed by a space, which the text-delimited variable match .str "@a " looks for. After matching "4 ", coll continues to look for matches, and doesn't find any. It is tempting to try to fix it like this: .IP code: .mono \ @(coll)@a@/ ?/@(end) .onom .IP data: .mono \ 1 2 3 4 5 .onom .IP result: .mono \ a[0]="" a[1]="" a[2]="" a[3]="" a[4]="" a[5]="" a[6]="" a[7]="" a[8]="" .onom .PP The problem now is that the regular expression .code "/ ?/" (match either a space or nothing), matches at any position. So when it is used as a variable delimiter, it matches at the current position, which binds the empty string to the variable, the extent of the match being zero. In this situation, the .code coll directive proceeds character by character. The solution is to use positive matching: specify the regular expression which matches the item, rather than trying to match whatever follows. The .code coll directive will recognize all items which match the regular expression: .IP code: .mono \ @(coll)@{a /[^ ]+/}@(end) .onom .IP data: .mono \ 1 2 3 4 5 .onom .IP result: .mono \ a[0]="1" a[1]="2" a[2]="3" a[3]="4" a[4]="5" .onom .PP The .code until clause can specify a pattern which, when recognized, terminates the collection. So for instance, suppose that the list of items may or may not be terminated by a semicolon. We must exclude the semicolon from being a valid character inside an item, and add an until clause which recognizes a semicolon: .IP code: .mono \ @(coll)@{a /[^ ;]+/}@(until);@(end); .onom .IP data: .mono \ 1 2 3 4 5; .onom .IP result: .mono \ a[0]="1" a[1]="2" a[2]="3" a[3]="4" a[4]="5" .onom .PP Whether followed by the semicolon or not, the items are collected properly. Note that the .code @(end) is followed by a semicolon.
That's because when the .code @(until) clause meets a match, the matching material is not consumed. This repetition can be avoided by using .code @(last) instead of .code @(until) since .code @(last) consumes the terminating material. Instead of the above regular-expression-based approach, this extraction problem can also be solved with .codn cases : .IP code: .mono \ @(coll)@(cases)@a @(or)@a@(end)@(end) .onom .IP data: .mono \ 1 2 3 4 5 .onom .IP result: .mono \ a[0]="1" a[1]="2" a[2]="3" a[3]="4" a[4]="5" .onom .PP .coNP Keyword Parameters in @ coll The .code @(coll) directive takes most of the same parameters as .codn @(collect) . See the section Keyword parameters in .code collect above. So for instance .code "@(coll :gap 0)" means that the collects must be consecutive, and .code "@(coll :maxtimes 2)" means that at most two matches will be collected. The .code :lines keyword does not exist, but there is an analogous .code :chars keyword. The .code @(coll) directive takes the .code :vars keyword. The shorthand .code @(rep) may be used instead of .codn "@(coll :vars nil)" . .code @(rep) takes all keywords, except .codn :vars . .dir flatten The .code flatten directive can be used to convert variables to one-dimensional lists. Variables which have a scalar value are converted to lists containing that value. Variables which are multidimensional lists are flattened to one-dimensional lists. 
Example (without .codn @(flatten) ): .IP code: .mono \ @b @(collect) @(collect) @a @(end) @(end) .onom .IP data: .mono \ 0 1 2 3 4 5 .onom .IP result: .mono \ b="0" a_0[0]="1" a_1[0]="2" a_2[0]="3" a_3[0]="4" a_4[0]="5" .onom .PP Example (with .codn @(flatten) ): .IP code: .mono \ @b @(collect) @(collect) @a @(end) @(end) @(flatten a b) .onom .IP data: .mono \ 0 1 2 3 4 5 .onom .IP result: .mono \ b="0" a[0]="1" a[1]="2" a[2]="3" a[3]="4" a[4]="5" .onom .PP .dir merge The syntax of .code merge follows the pattern: .mono .meti @(merge < destination >> [ sources ...]) .onom .meta destination is a variable, which receives a new binding. .meta sources are bind expressions. The .code merge directive provides a way of combining collected data from multiple nested lists in a way which normalizes different nesting levels among the sources. This directive is useful for combining the results from collects at different levels of nesting into a single nested list such that parallel elements are at equal depth. A new binding is created for the .meta destination variable, which holds the result of the operation. The .code merge directive performs its special function if invoked with at least three arguments: a destination and two sources. The one-argument case .code "@(merge x)" binds a new variable .code x and initializes it with the empty list and is thus equivalent to .codn "@(bind x)" . Likewise, the two-argument case .code "@(merge x y)" is equivalent to .codn "@(bind x y)" , establishing a binding for .code x which is initialized with the value of .codn y . To understand what merge does when two sources are given, as in .codn "@(merge C A B)" , we first have to define a property called depth. The depth of an atom such as a string is defined as .codn 1 . The depth of an empty list is .codn 0 . The depth of a nonempty list is one plus the depth of its deepest element. 
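.PP
The depth property just defined, together with the binary merge and the left-associative reduction described in the remainder of this section, can be modelled in a few lines. The following is a minimal Python sketch, not \*(TX code and not the actual implementation. It assumes that Python lists stand in for Lisp lists and that any non-list value, such as a string, is an atom. The function names depth, merge2 and merge are hypothetical.
.verb
```python
from functools import reduce


def depth(obj):
    """Depth as defined above: atom 1, empty list 0, else 1 + deepest element."""
    if not isinstance(obj, list):
        return 1
    if not obj:
        return 0
    return 1 + max(depth(item) for item in obj)


def merge2(a, b):
    """Binary merge: normalize both values to equal depth, then append."""
    # An atom is first converted to a one-element list.
    if not isinstance(a, list):
        a = [a]
    if not isinstance(b, list):
        b = [b]
    # Wrap the shallower value in lists until the depths agree.
    while depth(a) < depth(b):
        a = [a]
    while depth(b) < depth(a):
        b = [b]
    # Append the two equal-depth lists.
    return a + b


def merge(*sources):
    """Left-associative reduction over any number of sources."""
    if not sources:
        return []
    return reduce(merge2, sources)


print(depth(["foo", ["bar"]]))
print(merge(["a"], [[[["b", "c"], ["d", "e"]]]]))
```
.brev
With no sources the model yields the empty list, and with a single source it yields that source unchanged, which corresponds to the one-argument and two-argument forms of the directive described above.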
So for instance .str foo has depth 1, .mono ("foo") .onom has depth 2, and .mono ("foo" ("bar")) .onom has depth 3. We can now define a binary (two-argument) merge(A, B) function as follows. First, merge(A, B) normalizes the values A and B to produce a pair of values which have equal depth, as defined above. If either value is an atom it is first converted to a one-element list containing that atom. After this step, both values are lists; and the only way an argument has depth zero is if it is an empty list. Next, if either value has a smaller depth than the other, it is wrapped in a list as many times as needed to give it equal depth. For instance if A is .code ("a") and B is .code "((((\(dqb\(dq \(dqc\(dq) (\(dqd\(dq \(dqe\(dq))))" then A is converted to .codn "((((\(dqa\(dq))))" . Finally, the list values are appended together to produce the merged result. In the case of the preceding two example values, the result is: .codn "((((\(dqa\(dq))) (((\(dqb\(dq \(dqc\(dq) (\(dqd\(dq \(dqe\(dq))))" . The result is stored into the newly bound destination variable .codn C . If more than two source arguments are given, these are merged by a left-associative reduction, which is to say that a three-argument .code "merge(X, Y, Z)" is defined as .codn "merge(merge(X, Y), Z)" . The leftmost two values are merged, and then this result is merged with the third value, and so on. .dir cat The .code cat directive converts a list variable into a single piece of text. The syntax is: .mono .mets @(cat < var <> [ sep ]) .onom The .meta sep argument is a Lisp expression whose value specifies a separating piece of text. If it is omitted, then a single space is used as the separator.
Example: .IP code: .mono \ @(coll)@{a /[^ ]+/}@(end) @(cat a ":") .onom .IP data: .mono \ 1 2 3 4 5 .onom .IP result: .mono \ a="1:2:3:4:5" .onom .PP .dir bind The syntax of the .code bind directive is: .mono .mets @(bind < pattern < bind-expression >> { keyword << value }*) .onom The .code bind directive is a kind of pattern match, which matches one or more variables given in .meta pattern against a value produced by the .meta bind-expression on the right. Variable names occurring in the .meta pattern expression may refer to bound or unbound variables. All variable references occurring in .meta bind-expression must have a value. Binding occurs as follows. The tree structure of .meta pattern and the value of .meta bind-expression are considered to be parallel structures. Any variables in .meta pattern which are unbound receive a new binding, which is initialized with the structurally corresponding piece of the object produced by .metn bind-expression . Any variables in .meta pattern which are already bound must match the corresponding part of the value of .metn bind-expression , or else the .code bind directive fails. Variables which are already bound are not altered, retaining their current values even if the matching is inexact. The simplest .code bind is of one variable against itself, for instance binding .code A against .codn A : .verb @(bind A A) .brev This will throw an exception if .code A is not bound. If .code A is bound, it succeeds, since .code A matches itself. The next simplest .code bind binds one variable to another: .verb @(bind A B) .brev Here, if .code A is unbound, it takes on the same value as .codn B . If .code A is bound, it has to match .codn B , or the .code bind fails. Matching means that one of the following holds: .IP - .code A and .code B are the same text. .IP - .code A is text, .code B is a list, and .code A occurs within .codn B . .IP - vice versa: .code B is text, .code A is a list, and .code B occurs within .codn A .
.IP - .code A and .code B are lists and are either identical, or one is found as a substructure within the other. .PP The right-hand side does not have to be a variable. It may be some other object, like a string, quasiliteral, regexp, or list of strings, etc. For instance, .verb @(bind A "ab\etc") .brev will bind the string .str ab\etc to the variable .code A if .code A is unbound. If .code A is bound, this will fail unless .code A already contains an identical string. However, the right-hand side of a .code bind cannot be an unbound variable, nor a complex expression that contains unbound variables. The left-hand side of .code bind can be a nested list pattern containing variables. The last item of a list at any nesting level can be preceded by a .code . (dot), which means that the variable matches the rest of the list from that position. .TP* "Example 1:" Suppose that the list .code A contains .mono ("how" "now" "brown" "cow"). .onom Then the directive .codn "@(bind (H N . C) A)" , assuming that .codn H , .code N and .code C are unbound variables, will bind .code H to .strn how , .code N to .strn now , and .code C to the remainder of the list .mono ("brown" "cow"). .onom Example: suppose that the list .code A is nested to two dimensions and contains .mono (("how" "now") ("brown" "cow")). .onom Then .code "@(bind ((H N) (B C)) A)" binds .code H to .strn how , .code N to .strn now , .code B to .str brown and .code C to .strn cow . The dot notation may be used at any nesting level. It must be followed by an item. The forms .code (.) and .code "(X .)" are invalid, but .code "(. X)" is valid and equivalent to .codn X . The number of items in a left pattern match must match the number of items in the corresponding right side object. So the pattern .code () only matches an empty list. The notations .code () and .code nil mean exactly the same thing. The symbols .codn nil , .code t and keyword symbols may be used on either side. They represent themselves.
For example .code "@(bind :foo :bar)" fails, but .code "@(bind :foo :foo)" succeeds since the two sides denote the same keyword symbol object. .TP* "Example 2:" In this example, suppose .code A contains .str foo and .code B contains .strn bar . Then .code "@(bind (X (Y Z)) (A (B \(dqhey\(dq)))" binds .code X to .strn foo , .code Y to .str bar and .code Z to .strn hey . This is because the .meta bind-expression produces the object .mono ("foo" ("bar" "hey")) .onom which is then structurally matched against the pattern .codn "(X (Y Z))" , and the variables receive the corresponding pieces. .coNP Keywords in The @ bind Directive The .code bind directive accepts these keywords: .coIP :lfilt The argument to .code :lfilt is a filter specification. When the left side pattern contains a binding which is therefore matched against its counterpart from the right side expression, the left side is filtered through the filter specified by .code :lfilt for the purposes of the comparison. For example: .verb @(bind "a" "A" :lfilt :upcase) .brev produces a match, since the left side is the same as the right after filtering through the .code :upcase filter. .coIP :rfilt The argument to .code :rfilt is a filter specification. The specified filter is applied to the right-hand-side material prior to matching it against the left side. The filter is not applied if the left side is a variable with no binding. It is only applied to determine a match. Binding takes place with the unmodified right-hand-side object. For example, the following produces a match: .verb @(bind "A" "a" :rfilt :upcase) .brev .coIP :filter This keyword is a shorthand to specify both filters to the same value. For instance .code ":filter :upcase" is equivalent to .codn ":lfilt :upcase :rfilt :upcase" . For a description of filters, see Output Filtering below. Compound filters like .code "(:fromhtml :upcase)" are supported with all these keywords. The filters apply across arbitrary patterns and nested data.
Example: .verb @(bind (a b c) ("A" "B" "C")) @(bind (a b c) (("z" "a") "b" "c") :rfilt :upcase) .brev Here, the first bind establishes the values for .codn a , .code b and .codn c , and the second bind succeeds, because the value of a matches the second element of the list .mono ("z" "a") .onom if it is upcased, and likewise .code b matches .str b and .code c matches .str c if these are upcased. .coNP Lisp Forms in The @ bind Directive \*(TL forms, introduced by .code @ may be used in the .meta bind-expression argument of .codn bind , or as the entire form. This is consistent with the rules for bind expressions. \*(TL forms can be used in the .meta pattern expression also. Example: .verb @(bind a @(+ 2 2)) @(bind @(+ 2 2) @(* 2 2)) .brev Here, .code a is bound to the integer .codn 4 . The second .code bind then succeeds because the forms .code "(+ 2 2)" and .code "(* 2 2)" produce equal values. .dir set The syntax of the .code set directive is: .mono .mets @(set < pattern << bind-expression ) .onom The .code set directive syntactically resembles .codn bind , but is not a pattern match. It overwrites the previous values of variables with new values from the right-hand side. Each variable that is assigned must have an existing binding: .code set will not induce binding. Examples follow. Store the value of .code A back into .codn A , an operation with no effect: .verb @(set A A) .brev Exchange the values of .code A and .codn B : .verb @(set (A B) (B A)) .brev Store a string into .codn A : .verb @(set A "text") .brev Store a list into .codn A : .verb @(set A ("line1" "line2")) .brev Destructuring assignment. .code A ends up with .strn A , .code B ends up with .mono ("B1" "B2") .onom and .code C binds to .mono ("C1" "C2"). .onom .verb @(bind D ("A" ("B1" "B2") "C1" "C2")) @(bind (A B C) (() () ())) @(set (A B . 
C) D) .brev Note that .code set does not support a \*(TL expression on the left side, so the following are invalid syntax: .verb @(set @(+ 1 1) @(* 2 2)) @(set @b @(list "a")) .brev The second one is erroneous even though there is a variable on the left. Because it is preceded by the .code @ escape, it is a Lisp variable, and not a pattern variable. The .code set directive also doesn't support Lisp expressions in the .metn pattern , which must consist only of variables. .dir rebind The syntax of the .code rebind directive is: .mono .mets @(rebind < pattern << bind-expression ) .onom The .code rebind directive resembles .codn bind . It combines the semantics of .code local and .code bind into a single directive. The .meta bind-expression is evaluated in the current environment, and its value remembered. Then a new environment is produced in which all the variables specified in .meta pattern are absent. Then, the pattern is newly bound in that environment against the previously produced value, as if using .codn bind . The old environment with the previous variables is not modified; it continues to exist. This is in contrast with the .code set directive, which mutates existing bindings. .code rebind makes it easy to create temporary bindings based on existing bindings. .verb @(define pattern-function (arg)) @;; inside a pattern function: @(rebind recursion-level @(+ recursion-level 1)) @;; ... @(end) .brev When the function terminates, the previous value of .code recursion-level is restored. The effect is less verbose and more efficient than the following equivalent: .verb @(define pattern-function (arg)) @;; inside a pattern function: @(local temp) @(set temp recursion-level) @(local recursion-level) @(set recursion-level @(+ temp 1)) @;; ... @(end) .brev Like .codn bind , .code rebind supports nested patterns, such as .verb @(rebind (a (b c)) (1 (2 3))) .brev but it does not support any keyword arguments.
The filtering features of .code bind do not make sense in .code rebind because the variables are always reintroduced into an environment in which they don't exist, whereas filtering applies in situations when bound variables are matched against values. The .code rebind directive also doesn't support Lisp expressions in the .metn pattern , which must consist only of variables. .dir forget The .code forget directive has two spellings: .code @(forget) and .codn @(local) . The arguments are one or more symbols, for example: .verb @(forget a) @(local a b c) .brev Because the two spellings are interchangeable, this can also be written .verb @(local a) @(local a b c) .brev Directives which follow the .code forget or .code local directive no longer see any bindings for the symbols mentioned in that directive, and can establish new bindings. It is not an error if the bindings do not exist. It is strongly recommended to use the .code @(local) spelling in functions, because the forgetting action simulates local variables: for the given symbols, the machine forgets any earlier variables from outside of the function, and consequently, any new bindings for those variables belong to the function. (Furthermore, functions suppress the propagation of variables that are not in their parameter list, so these locals will be automatically forgotten when the function terminates.) .dir do The syntax of .code @(do) is: .mono .mets @(do << lisp-expression *) .onom The .code do directive evaluates zero or more \*(TL expressions. (See TXR LISP far below.) The values of the expressions are ignored, and matching continues with the directives which follow the .code do directive, if any. In the context of the .code do directive, an expression should not be introduced by the .code @ symbol; it is expected to be a Lisp expression.
Example: .verb @; match text into variables a and b, then insert into hash table h @(bind h @(hash)) @a:@b @(do (set [h a] b)) .brev .dir mdo The syntax of .code @(mdo) is: .mono .mets @(mdo << lisp-expression *) .onom Like the .code do directive, .code mdo (macro-time .codn do ) evaluates zero or more \*(TL expressions. Unlike .codn do , .code mdo performs this evaluation immediately upon being parsed. Then it disappears from the syntax. The effect of .code "@(mdo e0 e1 e2 ...)" is exactly like .code "@(do (macro-time e0 e1 e2 ...))" except that .code do doesn't disappear from the syntax. Another difference is that .code do can be used as a horizontal or vertical directive, whereas .code mdo is only vertical. .dir in-package The .code in-package directive shares the same syntax and semantics as the \*(TL macro of the same name: .mono .mets (in-package << name ) .onom The .code in-package directive is evaluated immediately upon being parsed, leaving no trace in the syntax tree of the surrounding \*(TX query. It causes the .code *package* special variable to take on the package denoted by .metn name . The directive requires that .meta name be either a string or a symbol. An error exception is thrown if this isn't the case. Otherwise it searches for the package. If the package is not found, an error exception is thrown. .SS* Blocks .NP* Overview Blocks are sections of a query which are either denoted by a name, or are anonymous. They may nest: blocks can occur within blocks and other constructs. Blocks are useful for terminating parts of a pattern-matching search prematurely, and escaping to a higher level. This makes blocks not only useful for simplifying the semantics of certain pattern matches, but also an optimization tool. Judicious use of blocks and escapes can reduce or eliminate the amount of backtracking that \*(TX performs. .dir block The .mono .meti @(block << name ) .onom directive introduces a named block, except when .meta name is the symbol .codn nil .
The .code @(block) directive introduces an unnamed block, equivalent to .codn "@(block nil)" . The .code @(skip) and .code @(collect) directives introduce implicit anonymous blocks, as do function bodies. Blocks must be terminated by .code "@(end)" and can be vertical: .mono .mets @(block <> [ name ]) ... .mets @(end) .onom or horizontal: .mono .mets @(block <> [ name ])...@(end) .onom .NP* Block Scope The names of blocks are in a distinct namespace from the variable binding space. So .code "@(block foo)" is unrelated to the variable .codn @foo . A block extends from the .code "@(block ...)" directive which introduces it, until the matching .codn @(end) , and may be empty. For instance: .verb @(some) abc @(block foo) xyz @(end) @(end) .brev Here, the block foo occurs in a .code @(some) clause, and so it extends to the .code @(end) which terminates the block. After that .codn @(end) , the name foo is not associated with a block (is not "in scope"). The second .code @(end) terminates the .code @(some) block. The implicit anonymous block introduced by .code @(skip) has the same scope as the .codn @(skip) : it extends over all of the material which follows the skip, to the end of the containing subquery. .NP* Block Nesting Blocks may nest, and nested blocks may have the same names as blocks in which they are nested. For instance: .verb @(block) @(block) ... @(end) @(end) .brev is a nesting of two anonymous blocks, and .verb @(block foo) @(block foo) @(end) @(end) .brev is a nesting of two named blocks which happen to have the same name. When a nested block has the same name as an outer block, it creates a block scope in which the outer block is "shadowed"; that is to say, directives which refer to that block name within the nested block refer to the inner block, and not to the outer one. .NP* Block Semantics A block normally does nothing. The query material in the block is evaluated normally. 
However, a block serves as a termination point for .code @(fail) and .code @(accept) directives which are in scope of that block and refer to it. The precise meaning of these directives is: .meIP @(fail << name ) Immediately terminate the enclosing query block called .metn name , as if that block failed to match anything. If more than one block by that name encloses the directive, the innermost block is terminated. No bindings emerge from a failed block. .coIP @(fail) Immediately terminate the innermost enclosing anonymous block, as if that block failed to match. The .code @(fail) directive has a vertical and horizontal form. If the implicit block introduced by .code @(skip) is terminated in this manner, this has the effect of causing .code skip itself to fail. In other words, the behavior is as if .codn @(skip) 's search did not find a match for the trailing material, except that it takes place prematurely (before the end of the available data source is reached). If the implicit block associated with a .code @(collect) is terminated this way, then the entire .code collect fails. This is a special behavior, because a .code collect normally does not fail, even if it matches nothing and collects nothing! To prematurely terminate a .code collect by means of its anonymous block, without failing it, use .codn @(accept) . .meIP @(accept << name ) Immediately terminate the enclosing query block called .metn name , as if that block successfully matched. If more than one block by that name encloses the directive, the innermost block is terminated. .coIP @(accept) Immediately terminate the innermost enclosing anonymous block, as if that block successfully matched. .code @(accept) communicates the current bindings and input position to the terminated block. These bindings and current position may be altered by special interactions between certain directives and .codn @(accept) , described in the following section. 
Communicating the current bindings and input position means that the block which is terminated by .code @(accept) exhibits the bindings which were collected just prior to the execution of that .code @(accept) and the input position which was in effect at that time. .code @(accept) has a vertical and horizontal form. In the horizontal form, it communicates a horizontal input position. A horizontal input position thus communicated will only take effect if the block being terminated had been suspended on the same line of input. If the implicit block introduced by .code @(skip) is terminated by .codn @(accept) , this has the effect of causing the skip itself to succeed, as if all of the trailing material had successfully matched. If the implicit block associated with a .code @(collect) is terminated by .codn @(accept) , then the collection stops. All bindings collected in the current iteration of the collect are discarded. Bindings collected in previous iterations are retained, and collated into lists in accordance with the semantics of collect. Example: alternative way to achieve .code @(until) termination: .verb @(collect) @ (maybe) --- @ (accept) @ (end) @LINE @(end) .brev This query will collect entire lines into a list called .codn LINE . However, if the line .code --- is matched (by the embedded .codn @(maybe) ), the collection is terminated. Only the lines up to, and not including the .code --- line, are collected. The effect is identical to: .verb @(collect) @LINE @(until) --- @(end) .brev The difference (not relevant in these examples) is that the until clause has visibility into the bindings set up by the main clause. However, the following example has a different meaning: .verb @(collect) @LINE @ (maybe) --- @ (accept) @ (end) @(end) .brev Now, lines are collected until the end of the data source, or until a line is found which is followed by a .code --- line. If such a line is found, the collection stops, and that line is not included in the collection! 
The .code @(accept) terminates the processing of the collect body, and so the action of collecting the last .code @LINE binding into the list is not performed. .PP Example: communication of bindings and input position: .IP code: .mono \ @(some) @(block foo) @first @(accept foo) @ignored @(end) @(end) @second .onom .IP data: .mono \ 1 2 3 .onom .IP result: .mono \ first="1" second="2" .onom .PP At the point where the .code accept occurs, the foo block has matched the first line, bound the text .str 1 to the variable .codn @first . The block is then terminated. Not only does the .code @first binding emerge from this terminated block, but what also emerges is that the block advanced the data past the first line to the second line. Next, the .code @(some) directive ends, and propagates the bindings and position. Thus the .code @second which follows then matches the second line and takes the text .strn 2 . Example: abandonment of .code @(some) clause by .codn @(accept) : In the following query, the foo block occurs inside a maybe clause. Inside the foo block there is a .code @(some) clause. Its first subclause matches variable .code @first and then terminates block foo. Since block foo is outside of the .code @(some) directive, this has the effect of terminating the .code @(some) clause: .IP code: .mono \ @(maybe) @(block foo) @ (some) @first @ (accept foo) @ (or) @one @two @three @four @ (end) @(end) @(end) @second .onom .IP data: .mono \ 1 2 3 4 5 .onom .IP result: .mono \ first="1" second="2" .onom .PP The second clause of the .code @(some) directive, namely: .verb @one @two @three @four .brev is never processed. The reason is that subclauses are processed in top to bottom order, but the processing was aborted within the first clause by the .codn "@(accept foo)" . The .code @(some) construct never gets the opportunity to match four lines.
If the .code "@(accept foo)" line is removed from the above query, the output is different: .IP code: .mono \ @(maybe) @(block foo) @ (some) @first @# <-- @(accept foo) removed from here!!! @ (or) @one @two @three @four @ (end) @(end) @(end) @second .onom .IP data: .mono \ 1 2 3 4 5 .onom .IP result: .mono \ first="1" one="1" two="2" three="3" four="4" second="5" .onom .PP Now, all clauses of the .code @(some) directive have the opportunity to match. The second clause grabs four lines, which is the longest match. And so, the next line of input available for matching is .codn 5 , which goes to the .code @second variable. .coNP Interaction Between The @ trailer and @ accept Directives If one of the clauses which follow a .code @(trailer) requests a successful termination to an outer block via .codn @(accept) , then .code @(trailer) intercepts the escape and adjusts the data extent to the position that it was given. Example: .IP code: .mono \ @(block) @(trailer) @line1 @line2 @(accept) @(end) @line3 .onom .IP data: .mono \ 1 2 3 .onom .IP result: .mono \ line1="1" line2="2" line3="1" .onom .PP The variable .code line3 is bound to .str 1 because although .code @(accept) yields a data position which has advanced to the third line, this is intercepted by .code @(trailer) and adjusted back to the first line. Neglecting to do this adjustment would violate the semantics of .codn trailer . .coNP Interaction Between The @ next and @ accept Directives When the clauses under a .code next directive are terminated by an .codn accept , such that control passes to a block which surrounds that .codn next , the .code accept is intercepted by .codn next . The input position being communicated by the .code accept is replaced with the original input position in the original stream which is in effect prior to the .code next directive. The .code accept transfer is then resumed. In other words, .code accept cannot be used to "leak" the new stream out of a .code next scope.
However, .code next has no effect on the bindings being communicated. Example: .mono \ @(next "file-x") @(block b) @(next "file-y") @line @(accept b) @(end) .onom Here, the variable .code line matches the first line of the file .strn file-y , after which an .code accept transfer is initiated, targeting block .codn b . This transfer communicates the .code line binding, as well as the position within .codn file-y , pointing at the second line. However, the .code accept traverses the .code next directive, causing it to be abandoned. The special unwinding action within that directive detects this transfer and rewrites the input position to be the original one within the stream associated with .strn file-x . Note that this special handling exists in order for the behavior to be consistent with what would happen if the .code "@(accept b)" were removed, and the block .code b terminated normally: because the inner .code next is nested within that block, \*(TX would backtrack to the previous input position within .strn file-x . .coNP Interaction Between Functions and the @ accept Directive If a pattern function is terminated due to .codn accept , the function return mechanism intercepts the .codn accept . The bindings being communicated by that .code accept are then subject to the special resolution with respect to the function parameters, exactly as if the bindings were being returned normally out of the function. The resolved bindings then replace those being communicated by the .code accept and the .code accept transfer is resumed. Example: .mono \ @(define fun (a)) @ (bind a "a") @ (bind b "b") @ (accept blk) @(end) @(block blk) @(fun x) this line is skipped by accept @(end) .onom Here, the .code accept initiates a control transfer which communicates the .code a and .code b variable bindings which are visible in that scope. 
This transfer is intercepted by the function, and the treatment of the bindings follows the same rules as a normal return (which, in the given function, would readily take place if the .code accept directive were removed). The .code b variable is suppressed, because .code b isn't a parameter of the function. Because .code a .B is a parameter, and the argument to that parameter is the unbound variable .codn x , the effect is that .code x is bound to the value of .codn a . When the accept transfer reaches block .code blk and terminates it, all that emerges is the .code x binding carrying .strn a . If the .code accept invocation is removed from .codn fun , then the function returns normally, producing the .code x binding. In that case, the line .code "this line is skipped by accept" isn't skipped since the block isn't being terminated; that line must match something. .coNP Interaction Between @ finally and the @ accept directive If the protected body of the exception handling .code try directive is terminated by an .code accept transfer, and if that .code try has a .code finally block, then there is a special interaction between the .code finally block and the .code accept transfer. The processing of the .code finally block detects that it has been triggered by an .code accept transfer. Consequently, it retrieves the current input position and bindings from that transfer, and uses that position and those bindings for the processing of the .code finally clauses. If the .code finally clauses succeed, then the new input position and new bindings are installed into the .code accept control transfer and that transfer resumes. If the .code finally clauses fail, then the .code accept transfer is converted to a .codn fail , with exactly the same block as its destination. .coNP Vertical-Horizontal Mismatch Between @ block and @ accept The .codn block , .code accept and .code fail directives come in horizontal and vertical forms.
This creates the possibility that an .code accept in horizontal context targets a vertical .code block or vice versa, raising the question of how the input position is treated. The semantics of this is defined. If a horizontal-context .code accept targets a vertical block, the current position at the target block will be the following line. That is to say, when the horizontal .code accept occurs, there is a current input line which may have unconsumed material past the current position. If the .code accept communicates its input position to a vertical context, that unconsumed material is skipped, as if it had been matched and the vertical position is advanced to the next line. If a horizontal block catches a vertical accept, it rejects that .codn accept 's position and stays at the current backtracking position for that block. Only the bindings from the .code accept are retained. .coNP Horizontal-Horizontal Mismatch between @ block and @ accept It is possible for a horizontal .code accept to terminate in a horizontal block which is processing a different line of input (or even a different input stream). This situation is treated the same way as vertical accept terminating in a horizontal block: the position communicated by .code accept is ignored, and only the bindings are taken. .SS* Functions .NP* Overview \*(TX functions allow a query to be structured to avoid repetition. On a theoretical note, because \*(TX functions support recursion, functions enable \*(TX to match some kinds of patterns which exhibit self-embedding, or nesting, and thus cannot be matched by a regular language. Functions in \*(TX are not exactly like functions in mathematics or functional languages, and are not like procedures in imperative programming languages. They are not exactly like macros either. What it means for a \*(TX function to take arguments and produce a result is different from the conventional notion of a function. A \*(TX function may have one or more parameters. 
When such a function is invoked, an argument must be specified for each parameter. However, a special behavior is at play here. Namely, some or all of the argument expressions may be unbound variables. In that case, the corresponding parameters behave like unbound variables also. Thus \*(TX function calls can transmit the "unbound" state from argument to parameter. It should be mentioned that functions have access to all bindings that are visible in the caller; functions may refer to variables which are not mentioned in their parameter list. With regard to returning, \*(TX functions are also unconventional. If the function fails, then the function call is considered to have failed. The function call behaves like a kind of match; if the function fails, then the call is like a failed match. When a function call succeeds, then the bindings emanating from that function are processed specially. Firstly, any bindings for variables which do not correspond to one of the function's parameters are thrown away. Functions may internally bind arbitrary variables in order to get their job done, but only those variables which are named in the function argument list may propagate out of the function call. Thus, a function with no arguments can only indicate matching success or failure, but not produce any bindings. Secondly, variables do not propagate out of the function directly, but undergo a renaming. For each parameter which went into the function as an unbound variable (because its corresponding argument was an unbound variable), if that parameter now has a value, that value is bound onto the corresponding argument. Example: .verb @(define collect-words (list)) @(coll)@{list /[^ \et]+/}@(end) @(end) .brev The above function .code collect-words contains a query which collects words from a line (sequences of characters other than space or tab), into the list variable called .codn list . 
This variable is named in the parameter list of the function; therefore, its value, if it has one, is permitted to escape from the function call. Suppose the input data is: .verb Fine summer day .brev and the function is called like this: .verb @(collect-words wordlist) .brev The result (with .codn "txr -B" ) is: .verb wordlist[0]=Fine wordlist[1]=summer wordlist[2]=day .brev How it works is that in the function call .codn "@(collect-words wordlist)" , .code wordlist is an unbound variable. The parameter corresponding to that unbound variable is the parameter .codn list . Therefore, that parameter is unbound over the body of the function. The function body collects the words of .str Fine summer day into the variable .codn list , and then yields that binding. Then the function call completes by noticing that the function parameter .code list now has a binding, and that the corresponding argument .code wordlist has no binding. The binding is thus transferred to the .code wordlist variable. After that, the bindings produced by the function are thrown away. The only enduring effects are: .IP - the function matched and consumed some input; and .IP - the function succeeded; and .IP - the .code wordlist variable now has a binding. .PP Another way to understand the parameter behavior is that function parameters behave like proxies which represent their arguments. If an argument is an established value, such as a character string or bound variable, the parameter is a proxy for that value and behaves just like that value. If an argument is an unbound variable, the function parameter acts as a proxy representing that unbound variable. The effect of binding the proxy is that the variable becomes bound, an effect which is settled when the function goes out of scope. Within the function, both the original variable and the proxy are visible simultaneously, and are independent. What if a function binds both of them?
Suppose a function has a parameter called .codn P , and is called with an argument .codn A , which is an unbound variable, and then, in the function, both .code A and .code P are bound. This is permitted, and they can even be bound to different values. However, when the function terminates, the local binding of .code A simply disappears (because the symbol .code A is not among the parameters of the function). Only the value bound to .code P emerges, and is bound to .codn A , which still appears unbound at that point. The .code P binding disappears also, and the net effect is that .code A is now bound. The "proxy" binding of .code A through the parameter .code P "wins" the conflict with the direct binding. .NP* Definition Syntax Function definition syntax comes in two flavors: vertical and horizontal. Horizontal definitions actually come in two forms, the distinction between which is hardly noticeable, and the need for which is made clear below. A function definition begins with a .code "@(define ...)" directive. For vertical functions, this is the only element in a line. The .code define symbol must be followed by a symbol, which is the name of the function being defined. After the symbol, there is an optional parenthesized argument list. If there is no such list, or if the list is specified as .code () or the symbol .code nil then the function has no parameters. Examples of valid .code define syntax are: .verb @(define foo) @(define bar ()) @(define match (a b c)) .brev If the .code define directive is followed by more material on the same line, then it defines a horizontal function: .verb @(define match-x)x@(end) .brev If the define is the sole element in a line, then it is a vertical function, and the function definition continues below: .verb @(define match-x) x @(end) .brev The difference between the two is that a horizontal function matches characters within a line, whereas a vertical function matches lines within a stream.
The former .code match-x matches the character .codn x , advancing to the next character position. The latter .code match-x matches a line consisting of the character .codn x , advancing to the next line. Material between .code @(define) and .code @(end) is the function body. The define directive may be followed directly by the .code @(end) directive, in which case the function has an empty body. Functions may be nested within function bodies. Such local functions have dynamic scope. They are visible in the function body in which they are defined, and in any functions invoked from that body. The body of a function is an anonymous block. (See Blocks above.) .NP* Two Forms of The Horizontal Function If a horizontal function is defined as the only element of a line, it may not be followed by additional material. The following construct is erroneous: .verb @(define horiz (x))@foo:@bar@(end)lalala .brev This kind of definition is actually considered to be in the vertical context, and like other directives that have special effects and that do not match anything, it does not consume a line of input. If the above syntax were allowed, it would mean that the line would not only define a function but also match .codn "lalala" . This would, in turn, mean that the .code @(define)...@(end) is actually in horizontal mode, and so it matches a span of zero characters within a line (which means that it would require a line of input to match: a surprising behavior for a nonmatching directive!) A horizontal function can be defined in an actual horizontal context. This occurs if it is in a line where it is preceded by other material. For instance: .verb X@(define fun)...@(end)Y .brev This is a query line which must match the text .codn XY . It also defines the function .codn fun .
The main use of this form is for nested horizontal functions: .verb @(define fun)@(define local_fun)...@(end)@(end) .brev .NP* Vertical-Horizontal Overloading A function of the same name may be defined as both vertical and horizontal. Both functions are available at the same time. Which one is used by a call is resolved by context. See the section Vertical Versus Horizontal Calls below. .NP* Call Syntax A function is invoked by a compound directive whose first symbol is the name of that function. Additional elements in the directive are the arguments. Arguments may be symbols, or other objects like string and character literals, quasiliterals or regular expressions. Example: .IP code: .mono \ @(define pair (a b)) @a @b @(end) @(pair first second) @(pair "ice" cream) .onom .IP data: .mono \ one two ice milk .onom .IP result: .mono \ first="one" second="two" cream="milk" .onom .PP The first call to the function takes the line .strn "one two" . The parameter .code a takes .str one and parameter .code b takes .strn two . These are rebound to the arguments .code first and .codn second . The second call to the function binds the .code a parameter to the word .strn ice , and the .code b is unbound, because the corresponding argument .code cream is unbound. Thus inside the function, .code a is forced to match .codn "ice" . Then a space is matched and .code b collects the text .strn milk . When the function returns, the unbound .code cream variable gets this value. If a symbol occurs multiple times in the argument list, it constrains both parameters to bind to the same value. That is to say, all parameters which, in the body of the function, bind a value, and which are all derived from the same argument symbol must bind to the same value. This is settled when the function terminates, not while it is matching.
Example: .IP code: .mono \ @(define pair (a b)) @a @b @(end) @(pair same same) .onom .IP data: .mono \ one two .onom .IP result: .mono \ [query fails] .onom .PP Here the query fails because .code a and .code b are effectively proxies for the same unbound variable .code same and are bound to different values, creating a conflict which constitutes a match failure. .NP* Vertical Versus Horizontal Calls A function call which is the only element of the query line in which it occurs is ambiguous. It can go either to a vertical function or to a horizontal one. If both are defined, then it goes to the vertical one. Example: .IP code: .mono \ @(define which (x))@(bind x "horizontal")@(end) @(define which (x)) @(bind x "vertical") @(end) @(which fun) .onom .IP result: .mono \ fun="vertical" .onom .PP Not only does this call go to the vertical function, but it is in a vertical context. If only a horizontal function is defined, then that is the one which is called, even if the call is the only element in the line. This takes place in a horizontal character-matching context, which requires a line of input which can be traversed: Example: .IP code: .mono \ @(define which (x))@(bind x "horizontal")@(end) @(which fun) .onom .IP data: .mono \ ABC .onom .IP result: .mono \ [query fails] .onom .PP The query fails because .code "@(which fun)" is in horizontal mode, and so it matches characters in a line. Since the function body consists only of .code "@(bind ...)" which doesn't match any characters, the function call requires an empty line to match. The line .code ABC is not empty, and so there is a matching failure. The following example corrects this: Example: .IP code: .mono \ @(define which (x))@(bind x "horizontal")@(end) @(which fun) .onom .IP data: .mono \ [empty line] .onom .IP result: .mono \ fun="horizontal" .onom .PP A call made in a clearly horizontal context will prefer the horizontal function, and only fall back on the vertical one if the horizontal one doesn't exist.
(In this fallback case, the vertical function is called with empty data; it is useful for calling vertical functions which process arguments and produce values.) In the next example, the call is followed by trailing material, placing it in a horizontal context. Leading material will do the same thing: Example: .IP code: .mono \ @(define which (x))@(bind x "horizontal")@(end) @(define which (x)) @(bind x "vertical") @(end) @(which fun)B .onom .IP data: .mono \ B .onom .IP result: .mono \ fun="horizontal" .onom .PP .NP* Local Variables As described earlier, variables bound in a function body which are not parameters of the function are discarded when the function returns. However, that, by itself, doesn't make these variables local, because pattern functions have visibility to all variables in their calling environment. If a variable .code x exists already when a function is called, then an attempt to bind it inside a function may result in a failure. The .code local directive must be used in a pattern function to list which variables are local. Example: .verb @(define path (path))@\e @(local x y)@\e @(cases)@\e (@(path x))@(path y)@(bind path `(@x)@y`)@\e @(or)@\e @{x /[.,;'!?][^ \et\ef\ev]/}@(path y)@(bind path `@x@y`)@\e @(or)@\e @{x /[^ .,;'!?()\et\ef\ev]/}@(path y)@(bind path `@x@y`)@\e @(or)@\e @(bind path "")@\e @(end)@\e @(end) .brev This is a horizontal function which matches a path, which falls into four recursive cases. A path can be a parenthesized path followed by a path; it can be a certain character followed by a path; or it can be empty. This function ensures that the variables it uses internally, .code x and .codn y , do not have anything to do with any inherited bindings for .code x and .codn y . Note that the function is recursive, which cannot work without .code x and .code y being local, even if no such bindings exist prior to the top-level invocation of the function.
The invocation .code "@(path x)" causes .code x to be bound, which is visible inside the invocation .codn "@(path y)" , but that invocation needs to have its own binding of .code x for local use. .NP* Nested Functions Function definitions may appear in a function. Such definitions are visible in all functions which are invoked from the body (and not necessarily enclosed in the body). In other words, the scope is dynamic, not lexical. Inner definitions shadow outer definitions. This means that a caller can redirect the function calls that take place in a callee, by defining local functions which capture the references. Example: .IP code: .mono \ @(define which) @ (fun) @(end) @(define fun) @ (output) top-level fun! @ (end) @(end) @(define callee) @ (define fun) @ (output) local fun! @ (end) @ (end) @ (which) @(end) @(callee) @(which) .onom .IP output: .mono \ local fun! top-level fun! .onom .PP Here, a function .code which is defined, which calls .codn fun . A top-level definition of .code fun is introduced which outputs .strn "top-level fun!" . The function .code callee provides its own local definition of .code fun which outputs .str "local fun!" before calling .codn which . When .code callee is invoked, it calls .codn which , whose .code @(fun) call is routed to callee's local definition. When .code which is called directly from the top level, its .code fun call goes to the top-level definition. .NP* Indirect Calls Function indirection may be performed using the .code call directive. If .meta fun-expr is a Lisp expression which evaluates to a symbol, and that symbol names a function which takes no arguments, then .verb @(call fun-expr) .brev may be used to invoke the function. Additional expressions may be supplied which specify arguments. Example 1: .mono \ @(define foo (arg)) @(bind arg "abc") @(end) @(call 'foo b) .onom In this example, the effect is that .code foo is invoked, and .code b ends up bound to .strn abc .
The .code call directive here uses the .code 'foo expression to calculate the name of the function to be invoked. (See the .code quote operator.) This particular .code call expression can just be replaced by the direct invocation syntax .codn "@(foo b)" . The power of .code call lies in being able to specify the function as a value which comes from elsewhere in the program, as in the following example. .mono \ @(define foo (arg)) @(bind arg "abc") @(end) @(bind f @'foo) @(call f b) .onom Here the .code call directive obtains the name of the function from the .code f variable. Note that function names are resolved to functions in the environment that is apparent at the point in execution where the .code call takes place. The directive .code "@(call f args ...)" is precisely equivalent to .code "@(s args ...)" if, at the point of the call, .code f is a variable which holds the symbol .code s and symbol .code s is defined as a function. Otherwise it is erroneous. .SS* Modularization .dirs load include The syntax of the .code load and .code include directives is: .mono .mets @(load << expr ) .mets @(include << expr ) .onom Where .meta expr is a Lisp expression that evaluates to a string giving the path of the file to load. Firstly, the path given by .meta expr is converted to an effective path, as follows. If the .code *load-path* variable has a current value which is not .code nil and the path given in .meta expr is pure relative according to the .code pure-rel-path-p function, then the effective path is taken relative to the directory portion of the path which is stored in .codn *load-path* . If .code *load-path* is .codn nil , or the load path is not pure relative, then the path is taken as-is as the effective path. Next, an attempt is made to open the file for processing, in almost exactly the same manner as by the \*(TL function .codn load .
The difference is that if the effective path is unsuffixed, then the .code .txr suffix is added to it, and that resulting path is tried first, and if it succeeds, then the file is treated as \*(TX Pattern Language syntax. If that fails, then the suffix .code .tlo is tried, and so forth, as described for the .code load function. If these initial attempts to find the file fail, and the failure is due to the file not being found rather than some other problem such as a permission error, and .meta expr isn't an absolute path according to .codn abs-path-p , then additional attempts are made by searching for the file in the list of directories given in the .code *load-search-dirs* variable. Details are given in the description of the \*(TL .code load function. Both the .code load and .code include directives bind the .code *load-path* variable to the path of the loaded file just before parsing syntax from it. The .code *package* variable is also given a new dynamic binding, whose value is the same as the existing binding. These bindings are removed when the load operation completes, restoring the prior values of these variables. The .code *load-hooks* variable is given a new dynamic binding, with a .code nil value. If the file opened for processing is \*(TL source, or a compiled \*(TL file, then it is processed in the manner described for the .code load function. Different requirements apply to the processing of the file under the .code load and .code include directives. The .code include directive performs the processing of the file at parse time. If the file being processed is \*(TX Pattern Language, then it is parsed, and then its syntax replaces the .code include directive, as if it had originally appeared in its place. If a \*(TL source or a compiled \*(TL file is processed by .code include then the .code include directive is removed from the syntax. The .code load directive performs the processing of the file at evaluation time.
Evaluation time occurs after a \*(TX program is read from beginning to end and parsed. That is to say, when a \*(TX query is parsed, any embedded .code "@(load ...)" forms in it are parsed and constitute part of its syntax tree. They are executed when that query is executed, whenever its execution reaches those .code load directives. When the .code load directive processes \*(TX Pattern Language syntax, it parses the file in its entirety and then executes that file's directives against the current input position. Repeated executions of the same .code load directive result in repeated processing of the file. Note: the .code include directive is useful for loading \*(TX files which contain Lisp macros which are needed by the parent program. The parent program cannot use .code load to bring in macros because macros are required during expansion, which takes place prior to evaluation time, whereas .code load doesn't execute until evaluation time. Note: the .code load directive doesn't provide access to the value propagated by a .code return via the .code load block. See also: the .code load function, and the .codn self-path , .code stdlib and .code *load-path* variables in \*(TL. .SS* Output .NP* Introduction A \*(TX query may perform custom output. Output is performed by .code output clauses, which may be embedded anywhere in the query, or placed at the end. Output occurs as a side effect of processing a part of a query which contains an .code @(output) directive, and is executed even if that part of the query ultimately fails to find a match. Thus output can be useful for debugging. An .code output clause specifies that its output goes to a file, pipe, or (by default) standard output. If any output clause is executed whose destination is standard output, \*(TX makes a note of this, and later, just prior to termination, suppresses the usual printing of the variable bindings or the word false.
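.PP
As a minimal sketch of the above, consider a query which matches one line and then reports it through an
.code output
clause whose destination defaults to standard output. The variable name
.code first
and the sample data are assumptions chosen purely for illustration:
.verb
@first
@(output)
first line is: @first
@(end)
.brev
Given the single line of input data
.verb
hello
.brev
the query prints
.verb
first line is: hello
.brev
Because this clause writes to standard output, the usual printing of the variable bindings is suppressed, as described above.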
.dir output The syntax of the .code @(output) directive is: .mono .mets @(output [ < destination ] { < bool-keyword | < keyword < value }* ) . . one or more output directives or lines . @(end) .onom If the directive has arguments, then the first one is evaluated. If it is an object other than a keyword symbol, then it specifies the optional .metn destination . Any remaining arguments after the optional destination are the keyword list. If the destination is missing, then the entire argument list is a keyword list. The .meta destination argument, if present, is treated as a \*(TL expression and evaluated. The resulting value is taken as the output destination. The value may be a string which gives the pathname of a file to open for output. Otherwise, the destination must be a stream object. The keyword list consists of a mixture of Boolean keywords which do not have an argument, or keywords with arguments. The following Boolean keywords are supported: .coIP :nothrow The .code output directive throws an exception if the output destination cannot be opened, unless the .code :nothrow keyword is present, in which case the situation is treated as a match failure. Note that since command pipes are processes that report errors asynchronously, a failing command will not throw an immediate exception that can be suppressed with .codn :nothrow . This is for synchronous errors, like trying to open a destination file, but not having permissions, etc. .coIP :append This keyword is meaningful for files, specifying append mode: the output is to be added to the end of the file rather than overwriting the file. .PP The following value keywords are supported: .coIP :filter The argument can be a symbol, which specifies a filter to be applied to the variable substitutions occurring within the .code output clause. The argument can also be a list of filter symbols, which specifies that multiple filters are to be applied, in left-to-right order. 
See the later sections Output Filtering below, and The Deffilter Directive. .coIP :into The argument of .code :into is a symbol which denotes a variable. The output will go into that variable. If the variable is unbound, it will be created. Otherwise, its contents are overwritten unless the .code :append keyword is used. If .code :append is used, then the new content will be appended to the previous content of the variable, after flattening the content to a list, as if by the .code flatten directive. .coIP :named The argument of .code :named is a symbol which denotes a variable. The file or pipe stream which is opened for the output is stored in this variable, and is not closed at the end of the output block. This allows a subsequent output block to continue output on the same stream, which is possible using the next two keywords, .code :continue or .codn :finish . A new binding is established for the variable, even if it already has an existing binding. .coIP :continue A destination should not be specified if .code :continue is used. The argument of .code :continue is an expression, such as a variable name, that evaluates to a stream object. That stream object is used for the output block. At the end of the output block, the stream is flushed, but not closed. A usage example is given in the documentation for the Close Directive below. .coIP :finish A destination should not be specified if .code :finish is used. The argument of .code :finish is an expression, such as a variable name, that evaluates to a stream object. That stream object is used for the output block. At the end of the output block, the stream is closed. An example is given in the documentation for the Close Directive below. .dir push The .code @(push) directive is a variant of .code @(output) which produces lines of text that are pushed back into the input stream. This directive supports only the .code :filter keyword argument. 
This directive doesn't take any of the keyword arguments supported by .code @(output) except for the .code :filter keyword. After the execution of a .codn @(push) , the next pattern matching syntax that is evaluated now faces the material produced by that .code @(push) followed by the original input. In order to preserve the line numbering of the original input, .code @(push) adjusts the line number for the synthetic input by subtracting the number of synthetic lines from the original input's line number. For instance if the original input is line 5, and 7 lines are prepended by .codn @(push) , then those lines are numbered -2 to 4. The input-synthesizing effect of .code @(push) is visible to a subsequent form in exactly those situations in which an input-consuming effect of a pattern matching directive would also be visible. For instance, a .code @(push) occurring in the body of a .code @(collect) can produce input that is visible to the next iteration. The .code @(push) directive interacts with the parallel matching directives such as .codn @(some) . When multiple parallel clauses match, the input position is advanced by the longest match. Lines pushed into the input by .code @(push) look like negative advancement. If one clause advances in the input, while another one pushes into it, the push will lose to the advancement and its effect will disappear. If two clauses push varying amounts of material, the shorter push will win. .TP* Example: Swap the first two lines if they start with a colon, changing the colon to a period: .IP Code: .verb @(maybe) :@a :@b @ (push) .@b .@a @ (end) @(end) @(data capture) @(do (tprint capture)) .brev .IP Data: .verb :hello :there rest of data .brev .IP Output: .verb .there .hello rest of data .brev .NP* Output Text Text in an output clause is not matched against anything, but is output verbatim to the destination file, device or command pipe. 
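.PP
For instance, in the following sketch (the wording of the two text lines is arbitrary and serves only as an illustration), nothing is matched against the input. The lines between
.code @(output)
and
.code @(end)
are simply copied to the destination, which here defaults to standard output:
.verb
@(output)
Summary of the run
All checks completed.
@(end)
.brev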
.NP* Output Variables
Variables occurring in an output clause do not match anything; instead
their contents are output.

A variable being output can be any object. If it is of a type other
than a list or string, it will be converted to a string as if by the
.code tostring
function in \*(TL.

A value which is a sequence is converted to a string in a special way:
the elements are individually converted to strings and then they are catenated
together. The default separator string for most sequences is a single space:
an alternate separation can be specified as an argument in the brace
substitution syntax. Empty sequences turn into an empty string.
More details are given in the
.B "Output Variables: Separation"
section below.

Lists may be output within
.code @(repeat)
or
.code @(rep)
clauses. Each nesting of these constructs removes one level of
nesting from the list variables that it contains.

In an output clause, the
.mono
.meti >> @{ name << number }
.onom
variable syntax generates a fixed-width field, which contains the
variable's text.
The absolute value of the number specifies the field width. For instance
.code -20
and
.code 20
both specify a field width of twenty. If the text is longer than the field,
then it overflows the field. If the text is shorter than the field, then it is
left-adjusted within that field, if the width is specified as a positive number,
and right-adjusted if the width is specified as negative.

An output variable may specify a filter which overrides any filter established
for the output clause. The syntax for this is
.mono
.meti @{NAME :filter << filterspec }.
.onom
The filter specification syntax is the same as in the output clause.
See Output Filtering below.
.NP* Output Variables: Buffer Objects
When the value of an output variable is a buffer (object of type
.codn buf ),
it is rendered as a sequence of hexadecimal digit pairs, with no line breaks.
The digits
.code a
through
.code f
are rendered in lower case.
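.PP
The field-width and buffer rules above can be sketched together.
This is an illustration only: the variable names
.code name
and
.code bytes
are hypothetical, and the square brackets are ordinary literal text
added to make the padding visible.

.verb
  @(bind name "abc")
  @(bind bytes #b'cafef00d')
  @(output)
  [@{name 6}]
  [@{name -6}]
  @bytes
  @(end)
.brev

Going by the rules described above, the first line should show the text
left-adjusted in a six-character field, the second line should show it
right-adjusted, and the third line should show the buffer as the lowercase
hexadecimal digits
.codn cafef00d .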
.NP* Output Variables: Separation As mentioned in the previous section, the value of a variable can be a sequence. The individual elements of a sequence are turned into strings, and then catenated together with the separator, which may be specified as a string modifier in the variable syntax. For most sequences, the default separator is a space. When the value of a variable is a character string, and the separator is not specified, the string is output as-is. Effectively, the string is treated as a sequence but with an empty default separator. When the value of a variable is a buffer, it is rendered in hexadecimal, as described in the previous section. If a separator string modifier is specified, it separates pairs of digits, rather than individual digits. Example: .verb @(bind str "string") @(bind buf #b'cafef00d') @(output) @{str[0..3] "--"} @{buf[0..2] ":"} @{buf[2..4] "/"} @(end) .brev The above example produces the output .verb s--t--r ca:fe f0/0d .brev .NP* Output Variables: Indexing Additional syntax is supported in output variables that does not appear in pattern-matching variables. A square bracket index notation may be used to extract elements or ranges from a variable, which works with strings, vectors and lists. Elements are indexed from zero. This notation is only available in brace-enclosed syntax, and looks like this: .meIP <> @{name[ expr ]} Extract the element at the position given by .metn expr . .meIP <> @{name[ expr1..expr2 ]} Extract a range of elements from the position given by .metn expr1 , up to one position less than the position given by .metn expr2 . If the variable is a list, it is treated as a list substitution, exactly as if it were the value of an unsubscripted list variable. The elements of the list are converted to strings and catenated together with a separator string between them, the default one being a single space. An alternate character may be given as a string argument in the brace notation. 
.PP
Example:

.verb
  @(bind a ("a" "b" "c" "d"))
  @(output)
  @{a[1..3] "," 10}
  @(end)
.brev

The above produces the text
.str b,c
in a field
.code 10
spaces wide. The
.code [1..3]
argument extracts a range of
.codn a ;
the
.str ,
argument specifies an alternate separator string, and
.code 10
specifies the field width.

When a variable includes indexing, separation and a field width, the
indexing operation is first applied to select a subsequence. Then
separation is applied to produce a textual representation. Finally,
the representation is rendered in the specified field width.

.NP* Output Substitutions
The brace syntax has another syntactic and semantic extension in
.code output
clauses. In place of the symbol, an expression may appear. The value of that
expression is substituted.

Example:

.mono
  @(bind a "foo")
  @(output)
  @{`@a:` -10}
  @(end)
.onom

Here, the quasiliteral expression
.code `@a:`
is evaluated, producing the string
.strn foo: .
This string is printed right-adjusted in a
.code 10
character field.

.dir repeat
The
.code repeat
directive generates repeated text from a "boilerplate",
by taking successive elements from lists. The syntax of repeat is
like this:

.verb
  @(repeat)
  .
  .
  main clause material, required
  .
  .
  special clauses, optional
  .
  .
  @(end)
.brev

.code repeat
has several types of special clauses, any of which may be
specified with empty contents, or omitted entirely. They are described
below.

.code repeat
takes arguments, also described below.

All of the material in the main clause and optional clauses
is examined for the presence of variables. If none of the variables
hold lists which contain at least one item, then no output is performed
(unless the repeat specifies an
.code @(empty)
clause; see below).
Otherwise, among those variables which contain nonempty lists, repeat finds
the length of the longest list. The length of this list determines the number
of repetitions, R.

If the
.code repeat
contains only a main clause, then the lines of this clause are
output R times. Over the first repetition, all of the variables which, outside
of the repeat, contain lists are locally rebound to just their first item. Over
the second repetition, all of the list variables are bound to their second
item, and so forth. Any variables which hold shorter lists than the longest
list eventually end up with empty values over some repetitions.

Example: if the list
.code A
holds
.strn 1 ,
.str 2
and
.strn 3 ;
the list
.code B
holds
.strn A ,
.strn B ;
and the variable
.code C
holds
.strn X ,
then

.verb
  @(repeat)
  >> @C
  >> @A @B
  @(end)
.brev

will produce three repetitions (since there are two lists, the longest
of which has three items). The output is:

.verb
  >> X
  >> 1 A
  >> X
  >> 2 B
  >> X
  >> 3 
.brev

The last line has a trailing space, since it is produced by
.strn "@A @B" ,
where
.code B
has an empty value. Since
.code C
is not a list variable, it
produces the same value in each repetition.

The special clauses are:
.coIP @(single)
If the
.code repeat
produces exactly one repetition, then the contents of this clause
are processed for that one and only repetition, instead of the main clause
or any other clause which would otherwise be processed.
.coIP @(first)
The body of this clause specifies an alternative body to be used for the first
repetition, instead of the material from the main clause.
.coIP @(last)
The body of this clause is used instead of the main clause for the last
repetition.
.coIP @(empty)
If the repeat produces no repetitions, then the body of this clause is output.
If this clause is absent or empty, the repeat produces no output.
.coIP "@(mod n m)"
The forms
.code n
and
.code m
are Lisp expressions that evaluate to integers. The value of
.code m
should be nonzero. The clause denoted this way is active if the repetition
modulo
.code m
is equal to
.codn n .
The first repetition is numbered zero.
For instance the clause headed by .code "@(mod 0 2)" will be used on repetitions 0, 2, 4, 6, ... and .code "@(mod 1 2)" will be used on repetitions 1, 3, 5, 7, ... .coIP "@(modlast n m)" The meaning of .code n and .code m is the same as in .codn "@(mod n m)" , but one more condition is imposed. This clause is used if the repetition modulo .code m is equal to .codn n , and if it is the last repetition. .PP The precedence among the clauses which take an iteration is: .codn "single > first > modlast > last > mod > main" . That is, whenever two or more of these clauses can apply to a repetition, then the leftmost one in this precedence list will be selected. It is possible for all these clauses to be viable for processing the same repetition. If a .code repeat occurs which has only one repetition, then that repetition is simultaneously the first, only and last repetition. Moreover, it also matches .code "(mod 0 m)" and, because it is the last repetition, it matches .codn "(modlast 0 m)" . In this situation, if there is a .code @(single) clause present, then the repetition shall be processed using that clause. Otherwise, if there is a .code @(first) clause present, that clause is activated. Failing that, .code @(modlast) is used if there is such a clause, featuring an .code n argument of zero. If there isn't, then the .code @(last) clause is considered, if present. Otherwise, the .code @(mod) clause is considered if present with an .code n argument of zero. Otherwise, none of these clauses are present or applicable, and the repetition is processed using the main clause. The .code @(empty) clause does not appear in the above precedence list because it is mutually exclusive with respect to the others: it is processed only when there are no iterations, in which case even the main clause isn't active. The .code @(repeat) clause supports arguments. 
.mono .mets @(repeat .mets \ \ \ [:counter >> { symbol | >> ( symbol << expr )}] .mets \ \ \ [:vars >> ({ symbol | >> ( symbol << expr )}*)]) .onom The .code :counter argument designates a symbol which will behave as an integer variable over the scope of the clauses inside the repeat. The variable provides access to the repetition count, starting at zero, incrementing with each repetition. If the argument is given as .mono .meti >> ( symbol << expr ) .onom then .meta expr is a Lisp expression whose value is taken as a displacement value which is added to each iteration of the counter. For instance .code ":counter (c 1)" specifies a counter .code c which counts from 1. The .code :vars argument specifies a list of variable name symbols .meta symbol or else pairs of the form .mono .meti >> ( symbol << init-form ) .onom consisting of a variable name and Lisp expression. Historically, the former syntax informed .code repeat about references to variables contained in Lisp code. This usage is no longer necessary as of \*(TX 243, since the .code repeat construct walks Lisp code, identifying all free variables. The latter syntax introduces a new pattern variable binding for .meta symbol over the scope of the .code repeat construct. The .meta init-form specifies a Lisp expression which is evaluated to produce the binding's value. The .code repeat directive then processes the list of variables, selecting from it those which have a binding, either a previously existing binding or the one just introduced. For each selected variable, repeat will assume that the variable occurs in the repeat block and contains a list to be iterated. The variable binding syntax supported by .code :vars of the form .mono .meti >> ( symbol << init-form ) .onom provides a solution for situations when it is necessary to iterate over some list, but that list is the result of an expression, and not stored in any variable. 
A repeat block iterates only over lists emanating from variables; it does not
iterate over lists pulled from arbitrary expressions.

Example: output all file names matching the
.code *.txr
pattern in the current directory:

.verb
  @(output)
  @(repeat :vars ((name (glob "*.txr"))))
  @name
  @(end)
  @(end)
.brev

Prior to \*(TX 243, the simple variable-binding syntax supported by
.code :vars
of the form
.meta symbol
was needed for situations in which \*(TL expressions which referenced
variables were embedded in
.code @(repeat)
blocks. Variable references embedded in Lisp code were not identified in
.codn @(repeat) .
For instance, the following produced no output, because no variables
were found in the
.code repeat
body:

.verb
  @(bind trigraph ("abc" "def" "ghi"))
  @(output)
  @(repeat)
  @(reverse trigraph)
  @(end)
  @(end)
.brev

There is a reference to
.meta trigraph
but it's inside the
.code "(reverse trigraph)"
Lisp expression that was not processed by
.codn repeat .
The solution was to mention
.meta trigraph
in the
.code :vars
construct:

.verb
  @(bind trigraph ("abc" "def" "ghi"))
  @(output)
  @(repeat :vars (trigraph))
  @(reverse trigraph)
  @(end)
  @(end)
.brev

Then the
.code repeat
block would iterate over
.metn trigraph ,
producing the output

.verb
  cba
  fed
  ihg
.brev

This workaround is no longer required as of \*(TX 243; the output
is produced by the first example, without
.codn :vars .

.coNP Nested @ repeat directives
If a
.code repeat
clause encloses variables which hold multidimensional lists,
those lists require additional nesting levels of
.code repeat
(or
.codn rep ).
It is an error to attempt to output a list variable which has not been
decimated into primary elements via a
.code repeat
construct.

Suppose that a variable
.code X
is two-dimensional (contains a list of lists).
.code X
must be nested twice in a
.codn repeat .
The outer
.code repeat
will traverse the lists contained in
.codn X .
The inner
.code repeat
will traverse the elements of each of these lists.

A nested
.code repeat
may be embedded in any of the clauses of a
.codn repeat ,
not only in the main clause.

.dir rep
The
.code rep
directive is similar to
.codn repeat .
Whereas
.code repeat
is line-oriented,
.code rep
generates material within a line. It has all the same clauses,
but everything is specified within one line:

.verb
  @(rep)... main material ... .... special clauses ...@(end)
.brev

More than one
.code @(rep)
can occur within a line, mixed with other material.
A
.code @(rep)
can be nested within a
.code @(repeat)
or within another
.codn @(rep) .

Also,
.code @(rep)
accepts the same
.code :counter
and
.code :vars
arguments.

.coNP @ repeat and @ rep Examples
Example 1: show the list
.code L
in parentheses, with spaces between
the elements, or the word
.code EMPTY
if the list is empty:

.verb
  @(output)
  @(rep)@L @(single)(@L)@(first)(@L @(last)@L)@(empty)EMPTY@(end)
  @(end)
.brev

Here, the
.code @(empty)
clause specifies
.codn EMPTY .
So if there are no repetitions, the text
.code EMPTY
is produced. If there is a single item in the list
.codn L ,
then
.code @(single)(@L)
produces that item between parentheses. Otherwise if there are two or more
items, the first item is produced with a leading parenthesis followed by a
space by
.code @(first)(@L
and the last item is produced with a closing parenthesis:
.codn @(last)@L) .
All items in between are emitted with a trailing space by
the main clause:
.codn @(rep)@L .

Example 2: show the list
.code L
like Example 1 above, but the empty list is
.codn () .

.verb
  @(output)
  (@(rep)@L @(last)@L@(end))
  @(end)
.brev

This is simpler. The parentheses are part of the text which
surrounds the
.code @(rep)
construct, produced unconditionally. If the list
.code L
is empty, then
.code @(rep)
produces no output, resulting in
.codn () .
If the list
.code L
has one or more items, then they are produced with
a space after each one, except the last which has no space.
If the list has exactly one item, then the
.code @(last)
applies to it instead of the main clause: it is produced with no trailing space.

.dir close
The syntax of the
.code close
directive is:

.mono
.mets @(close << expr )
.onom

Where
.meta expr
evaluates to a stream. The
.code close
directive can be
used to explicitly close streams created using
.mono
.meti @(output ... :named << var )
.onom
syntax, as an alternative to
.mono
.meti @(output :finish << expr ).
.onom

Examples:

Write two lines to
.str foo.txt
over two output blocks using a single stream:

.verb
  @(output "foo.txt" :named foo)
  Hello,
  @(end)
  @(output :continue foo)
  world!
  @(end)
  @(close foo)
.brev

The same as above, using
.code :finish
rather than
.code :continue
so that the stream is closed at the end of the second block:

.verb
  @(output "foo.txt" :named foo)
  Hello,
  @(end)
  @(output :finish foo)
  world!
  @(end)
.brev

.NP* Output Filtering
Often it is necessary to transform the output to preserve its meaning
under the convention of a given data format. For instance, if a piece of
text contains the characters
.code <
or
.codn > ,
then if that text is being substituted into HTML, these should be
replaced by
.code &lt;
and
.codn &gt; .
This is what filtering is for. Filtering is applied to the contents of output
variables, not to any template text.

\*(TX implements named filters. Built-in filters are named by keywords,
given below. User-defined filters are possible, however. See notes on the
deffilter directive below.

Instead of a filter name, the syntax
.mono
.meti (:fun << name )
.onom
can be used. This denotes that the function called
.meta name
is to be used as a filter. This is described in the next section
Function Filters below.

Built-in filters named by keywords:
.coIP :tohtml
Filter text to HTML, representing special characters using HTML
ampersand sequences. For instance
.code >
is replaced by
.codn &gt; .
.coIP :tohtml*
Filter text to HTML, representing special characters using HTML ampersand
sequences.
Unlike
.codn :tohtml ,
this filter doesn't treat the single and double quote characters.
It is not suitable for preparing HTML fragments which end up
inserted into HTML tag attributes.
.coIP :fromhtml
Filter text with HTML codes into text in which the codes are replaced by the
corresponding characters. For instance
.code &gt;
is replaced by
.codn > .
.coIP :upcase
Convert the 26 lowercase letters of the English alphabet to uppercase.
.coIP :downcase
Convert the 26 uppercase letters of the English alphabet to lowercase.
.coIP :frompercent
Decode percent-encoded text. Character triplets consisting
of the
.code %
character followed by a pair of hexadecimal digits (case insensitive)
are converted to bytes having the value represented by the hexadecimal
digits (most significant nybble first). Sequences of one or more such bytes are
treated as UTF-8 data and decoded to characters.
.coIP :topercent
Convert to percent encoding according to RFC 3986. The text is first converted
to UTF-8 bytes. The bytes are then converted back to text as follows.
Bytes in the range 0 to 32, and 127 to 255 (note: including the ASCII DEL),
bytes whose values correspond to ASCII characters which are listed by RFC 3986
as being in the "reserved set", and the byte value corresponding to the
ASCII
.code %
character are encoded as a three-character sequence consisting
of the
.code %
character followed by two hexadecimal digits derived from the
byte value (most significant nybble first, upper case). All other bytes
are converted directly to characters of the same value without any such
encoding.
.coIP :fromurl
Decode from URL encoding, which is like percent encoding, except that
if the unencoded
.code +
character occurs, it is decoded to a space character. The
.code %20
sequence still decodes to space, and
.code %2B
to the
.code +
character.
.coIP :tourl
Encode to URL encoding, which is like percent encoding except that
a space maps to
.code +
rather than
.codn %20 .
The
.code +
character, being in the reserved set, encodes to
.codn %2B .
.coIP :frombase64
Decode from the Base 64 encoding described in RFC 4648, section 4.
.coIP :tobase64
Encode to the Base 64 encoding described in RFC 4648, section 4.
.coIP :frombase64url
Decode from the Base 64 encoding described in RFC 4648, section 5.
This uses the URL and filename safe alphabet, in which the
.code +
(plus) and
.code /
(slash) characters used in regular Base 64 are
respectively replaced with
.code -
(minus) and
.code _
(underscore).
.coIP :tobase64url
Encode to the Base 64 encoding described in RFC 4648, section 5. See
.code :frombase64url
above.
.coIP :tonumber
Converts strings to numbers. Strings that contain a period,
.code e
or
.code E
are converted to floating point as if by the Lisp function
.codn flo-str .
Otherwise they are converted to integer as if using
.code int-str
with a radix of 10.
Non-numeric junk results in the object
.codn nil .
.coIP :toint
Converts strings to integers as if using
.code int-str
with a radix of 10.
Non-numeric junk results in the object
.codn nil .
.coIP :tofloat
Converts strings to floating-point values as if using the function
.codn flo-str .
Non-numeric junk results in the object
.codn nil .
.coIP :hextoint
Converts strings to integers as if using
.code int-str
with a radix of 16.
Non-numeric junk results in the object
.codn nil .
.PP
Examples:

To escape HTML characters in all variable substitutions occurring in an
output clause, specify
.code ":filter :tohtml"
in the directive:

.verb
  @(output :filter :tohtml)
  ...
  @(end)
.brev

To filter an individual variable, add the syntax to the variable spec:

.verb
  @(output)
  @{x :filter :tohtml}
  @(end)
.brev

Multiple filters can be applied at the same time. For instance:

.verb
  @(output)
  @{x :filter (:upcase :tohtml)}
  @(end)
.brev

This will fold the contents of
.code x
to uppercase, and then encode any special
characters into HTML. Beware of combinations that do not make sense.
For instance, suppose the original text is HTML, containing codes
like
.codn &quot; .
The compound filter
.code "(:upcase :fromhtml)"
will not work because
.code &quot;
will turn to
.code &QUOT;
which will no longer be recognized by the
.code :fromhtml
filter, since the entity names in HTML codes
are case-sensitive.

Capture some numeric variables and convert to numbers:

.verb
  @date @time @temperature @pressure
  @(filter :tofloat temperature pressure)
  @;; temperature and pressure can now be used in calculations
.brev

.NP* Function Filters
A function can be used as a filter. For this to be possible, the function must
conform to certain rules:
.IP 1.
The function must take two special arguments, which may be followed
by additional arguments.
.IP 2.
When the function is called, the first argument will be bound to a string,
and the second argument will be unbound. The function must produce a
value by binding it to the second argument. If the filter is to be used
as the final filter in a chain, it must produce a string.
.PP
For instance, the following is a valid filter function:

.verb
  @(define foo_to_bar (in out))
  @  (next :string in)
  @  (cases)
  foo
  @    (bind out "bar")
  @  (or)
  @    (bind out in)
  @  (end)
  @(end)
.brev

This function binds the
.code out
parameter to
.str bar
if the
.code in
parameter is
.strn foo ,
otherwise it binds the
.code out
parameter to a copy of the
.code in
parameter. This is a simple filter.

To use the filter, use the syntax
.code "(:fun foo_to_bar)"
in place of a filter name.
For instance in the
.code bind
directive:

.verb
  @(bind "foo" "bar" :lfilt (:fun foo_to_bar))
.brev

The above should succeed since the left side is filtered from
.str foo
to
.strn bar ,
so that there is a match.

Function filters can be used in a chain:

.verb
  @(output :filter (:downcase (:fun foo_to_bar) :upcase))
  ...
  @(end)
.brev

Here is a split function which takes an extra argument
which specifies the separator:

.verb
  @(define split (in out sep))
  @  (next :list in)
  @  (coll)@(maybe)@token@sep@(or)@token@(end)@(end)
  @  (bind out token)
  @(end)
.brev

Furthermore, note that it produces a list rather than a string.
This function separates the argument
.code in
into tokens according to the
separator text carried in the variable
.codn sep .

Here is another function,
.codn join ,
which catenates a list:

.verb
  @(define join (in out sep))
  @  (output :into out)
  @  (rep)@in@sep@(last)@in@(end)
  @  (end)
  @(end)
.brev

Now here are these two being used in a chain:

.verb
  @(bind text "how,are,you")
  @(output :filter ((:fun split ",") (:fun join "-")))
  @text
  @(end)
.brev

Output:

.verb
  how-are-you
.brev

When the filter invokes a function, it generates the first two arguments
internally to pass in the input value and capture the output. The remaining
arguments from the
.code "(:fun ...)"
construct are also passed to the function.
Thus the string objects
.str ","
and
.str "-"
are passed as the
.code sep
argument to
.code split
and
.codn join .

Note that
.code split
puts out a list, which
.code join
accepts. So the overall filter
chain operates on a string: a string goes into split, and a string comes out of
join.

.dir deffilter
The
.code deffilter
directive allows a query to define a custom filter, which
can then be used in
.code output
clauses to transform substituted data.

The syntax of
.code deffilter
is illustrated in this example:
.IP code:
.mono
\ @(deffilter rot13
     ("a" "n") ("b" "o") ("c" "p") ("d" "q") ("e" "r")
     ("f" "s") ("g" "t") ("h" "u") ("i" "v") ("j" "w")
     ("k" "x") ("l" "y") ("m" "z") ("n" "a") ("o" "b")
     ("p" "c") ("q" "d") ("r" "e") ("s" "f") ("t" "g")
     ("u" "h") ("v" "i") ("w" "j") ("x" "k") ("y" "l")
     ("z" "m"))
 @(collect)
 @line
 @(end)
 @(output :filter rot13)
 @(repeat)
 @line
 @(end)
 @(end)
.onom
.IP data:
.mono
\ hey there!
.onom
.IP output:
.mono
\ url gurer!
.onom .PP The .code deffilter symbol must be followed by the name of the filter to be defined, followed by bind expressions which evaluate to lists of strings. Each list must be at least two elements long and specifies one or more texts which are mapped to a replacement text. For instance, the following specifies a telephone keypad mapping from uppercase letters to digits. .verb @(deffilter alpha_to_phone ("E" "0") ("J" "N" "Q" "1") ("R" "W" "X" "2") ("D" "S" "Y" "3") ("F" "T" "4") ("A" "M" "5") ("C" "I" "V" "6") ("B" "K" "U" "7") ("L" "O" "P" "8") ("G" "H" "Z" "9")) @(deffilter foo (`@a` `@b`) ("c" `->@d`)) @(bind x ("from" "to")) @(bind y ("---" "+++")) @(deffilter sub x y) .brev The last .code deffilter has the same effect as the .mono @(deffilter sub ("from" "to") ("---" "+++")) .onom directive. Filtering works using a longest match algorithm. The input is scanned from left to right, and the longest piece of text is identified at every character position which matches a string on the left-hand side, and that text is replaced with its associated replacement text. The scanning then continues at the first character after the matched text. If none of the strings matches at a given character position, then that character is passed through the filter untranslated, and the scan continues at the next character in the input. Filtering is not in-place but rather instantiates a new text, and so replacement text is not re-scanned for more replacements. If a filter definition accidentally contains two or more repetitions of the same left-hand string with different right-hand translations, the later ones take precedence. No warning is issued. .dir filter The syntax of the .code filter directive is: .verb @(filter FILTER { VAR }+ ) .brev A filter is specified, followed by one or more variables whose values are filtered and stored back into each variable. 
Example: convert .codn a , .codn b , and .code c to uppercase and HTML encode: .verb @(filter (:upcase :tohtml) a b c) .brev .SS* Exceptions .NP* Introduction The exceptions mechanism in \*(TX is another disciplined form of nonlocal transfer, in addition to the blocks mechanism (see Blocks above). Like blocks, exceptions provide a construct which serves as the target for a dynamic exit. Both blocks and exceptions can be used to bail out of deep nesting when some condition occurs. However, exceptions provide more complexity. Exceptions are useful for error handling, and \*(TX in fact maps certain error situations to exception control transfers. However, exceptions are not inherently an error-handling mechanism; they are a structured dynamic control transfer mechanism, one of whose applications is error handling. An exception control transfer (simply called an exception) is always identified by a symbol, which is its type. Types are organized in a subtype-supertype hierarchy. For instance, the .code file-error exception type is a subtype of the .code error type. This means that a file error is a kind of error. An exception handling block which catches exceptions of type .code error will catch exceptions of type .codn file-error , but a block which catches .code file-error will not catch all exceptions of type .codn error . A .code query-error is a kind of error, but not a kind of .codn file-error . The symbol .code t is the supertype of every type: every exception type is considered to be a kind of .codn t . (Mnemonic: .code t stands for type, as in any type). Exceptions are handled using .code @(catch) clauses within a .code @(try) directive. In addition to being useful for exception handling, the .code @(try) directive also provides unwind protection by means of a .code @(finally) clause, which specifies query material to be executed unconditionally when the .code try clause terminates, no matter how it terminates. 
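.PP
The subtype relationship can be sketched with the
.code @(try)
and
.code @(catch)
syntax described in the following sections. The sketch assumes that no
file named
.str nonexistent-file
exists, so that
.code @(next)
throws an exception of type
.code file-error
(or a subtype of it); the message text is arbitrary.

.verb
  @(try)
  @(next "nonexistent-file")
  @(catch error)
  @(output)
  caught via the error supertype
  @(end)
  @(end)
.brev

Because
.code file-error
is a kind of
.codn error ,
the clause is expected to be selected and the message printed.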
.dir try The general syntax of the .code try directive is .verb @(try) ... main clause, required ... ... optional catch clauses ... ... optional finally clause @(end) .brev A .code catch clause looks like: .verb @(catch TYPE [ PARAMETERS ]) . . . .brev and also this simple form: .verb @(catch) . . . .brev which catches all exceptions, and is equivalent to .codn "@(catch t)" . A .code finally clause looks like: .verb @(finally) ... . . .brev The main clause may not be empty, but the catch and finally may be. A try clause is surrounded by an implicit anonymous block (see Blocks section above). So for instance, the following is a no-op (an operation with no effect, other than successful execution): .verb @(try) @(accept) @(end) .brev The .code @(accept) causes a successful termination of the implicit anonymous block. Execution resumes with query lines or directives which follow, if any. .code try clauses and blocks interact. For instance, an .code accept from within a try clause invokes a .codn finally . .IP code: .mono \ @(block foo) @ (try) @ (accept foo) @ (finally) @ (output) bye! @ (end) @ (end) .onom .IP output: .mono \ bye! .onom .PP How this works: the .code try block's main clause is .codn "@(accept foo)" . This causes the enclosing block named .code foo to terminate, as a successful match. Since the .code try is nested within this block, it too must terminate in order for the block to terminate. But the try has a .code finally clause, which executes unconditionally, no matter how the try block terminates. The .code finally clause performs some output, which is seen. Note that .code finally interacts with .code accept in subtle ways not revealed in this example; they are documented in the description of .code accept under the .code block directive documentation. .coNP The @ finally clause A .code try directive can terminate in one of three ways. The main clause may match successfully, and possibly yield some new variable bindings. 
The main clause may fail to match. Or the main clause may be terminated by a nonlocal control transfer, like an exception being thrown or a block return (like the block foo example in the previous section). No matter how the .code try clause terminates, the .code finally clause is processed. The .code finally clause is itself a query which binds variables, which leads to questions: what happens to such variables? What if the .code finally block fails as a query? As well as: what if a .code finally clause itself initiates a control transfer? Answers follow. Firstly, a .code finally clause will contribute variable bindings only if the main clause terminates normally (either as a successful or failed match). If the main clause of the .code try block successfully matches, then the .code finally block continues matching at the next position in the data, and contributes bindings. If the main clause fails, then the .code finally block tries to match at the same position where the main clause failed. The overall .code try directive succeeds as a match if either the main clause or the .code finally clause succeed. If both fail, then the .code try directive is a failed match. Example: .IP code: .mono \ @(try) @a @(finally) @b @(end) @c .onom .IP data: .mono \ 1 2 3 .onom .IP result: .mono \ a="1" b="2" c="3" .onom .PP In this example, the main clause of the .code try captures line .str 1 of the data as variable .codn a , then the finally clause captures .str 2 as .codn b , and then the query continues with the .code @c line after try block, so that .code c captures .strn "3" . Example: .IP code: .mono \ @(try) hello @a @(finally) @b @(end) @c .onom .IP data: .mono \ 1 2 .onom .IP result: .mono \ b="1" c="2" .onom .PP In this example, the main clause of the .code try fails to match, because the input is not prefixed with .strn "hello " . However, the .code finally clause matches, binding .code b to .strn "1" . 
This means that the try block is a successful match, and so processing continues with .code @c which captures .strn "2" . When .code finally clauses are processed during a nonlocal return, they have no externally visible effect if they do not bind variables. However, their execution makes itself known if they perform side effects, such as output. A .code finally clause guards only the main clause and the .code catch clauses. It does not guard itself. Once the finally clause is executing, the .code try block is no longer guarded. This means if a nonlocal transfer, such as a block accept or exception, is initiated within the finally clause, it will not re-execute the .code finally clause. The .code finally clause is simply abandoned. The disestablishment of blocks and .code try clauses is properly interleaved with the execution of .code finally clauses. This means that all surrounding exit points are visible in a .code finally clause, even if the .code finally clause is being invoked as part of a transfer to a distant exit point. The finally clause can make a control transfer to an exit point which is more near than the original one, thereby "hijacking" the control transfer. Also, the anonymous block established by the .code try directive is visible in the .code finally clause. Example: .verb @(try) @ (try) @ (next "nonexistent-file") @ (finally) @ (accept) @ (end) @(catch file-error) @ (output) file error caught @ (end) @(end) .brev In this example, the .code @(next) directive throws an exception of type .codn file-error , because the given file does not exist. The exit point for this exception is the .code "@(catch file-error)" clause in the outermost .code try block. The inner block is not eligible because it contains no catch clauses at all. However, the inner try block has a finally clause, and so during the processing of this exception which is headed for .codn "@(catch file-error)" , the .code finally clause performs an anonymous .codn accept . 
The exit point for that .code accept is the anonymous block surrounding the inner .codn try . So the original transfer to the .code catch clause is thereby abandoned. The inner .code try terminates successfully due to the .codn accept , and since it constitutes the main clause of the outer try, that also terminates successfully. The .str "file error caught" message is never printed. .c1NP catch clauses .code catch clauses establish their associated .code try blocks as potential exit points for exception-induced control transfers (called "throws"). A .code catch clause specifies an optional list of symbols which represent the exception types which it catches. The .code catch clause will catch exceptions which are a subtype of any one of those exception types. If a .code try block has more than one .code catch clause which can match a given exception, the first one will be invoked. When a .code catch is invoked, it is understood that the main clause did not terminate normally, and so the main clause could not have produced any bindings. .code catch clauses are processed prior to .codn finally . If a .code catch clause itself throws an exception, that exception cannot be caught by that same clause or its siblings in the same try block. The .code catch clauses of that block are no longer visible at that point. Nevertheless, the .code catch clauses are still protected by the finally block. If a catch clause throws, or otherwise terminates, the .code finally block is still processed. If a .code finally block throws an exception, then it is simply aborted; the remaining directives in that block are not processed. So the success or failure of the .code try block depends on the behavior of the .code catch clause or the .code finally clause, if there is one. If either of them succeed, then the try block is considered a successful match. 
Example: .IP code: .mono \ @(try) @ (next "nonexistent-file") @ x @ (catch file-error) @a @(finally) @b @(end) @c .onom .IP data: .mono \ 1 2 3 .onom .IP result: .mono \ a="1" b="2" c="3" .onom .PP Here, the .code try block's main clause is terminated abruptly by a .code file-error exception from the .code @(next) directive. This is handled by the .code catch clause, which binds variable .code a to the input line .strn 1 . Then the .code finally clause executes, binding .code b to .strn 2 . The .code try block then terminates successfully, and so .code @c takes .strn "3" . .coNP @ catch Clauses with Parameters A .code catch clause may have parameters following the type name, like this: .verb @(catch pair (a b)) .brev To write a catch-all with parameters, explicitly write the master supertype t: .verb @(catch t (arg ...)) .brev Parameters are useful in conjunction with .codn throw . The built-in .code error exceptions carry one argument, which is a string containing the error message. Using .codn throw , arbitrary parameters can be passed from the throw site to the catch site. .dir throw The .code throw directive generates an exception. A type must be specified, followed by optional arguments, which are bind expressions. For example, .verb @(throw pair "a" `@file.txt`) .brev throws an exception of type .codn pair , with two arguments, being .str a and the expansion of the quasiliteral .codn `@file.txt` . The selection of the target .code catch is performed purely using the type name; the parameters are not involved in the selection. Binding takes place between the arguments given in .code throw and the target .codn catch . If any .code catch parameter, for which a .code throw argument is given, is a bound variable, it has to be identical to the argument, otherwise the catch fails. (Control still passes to the .codn catch , but the catch is a failed match). 
.IP code: .mono \ @(bind a "apple") @(try) @(throw e "banana") @(catch e (a)) @(end) .onom .IP result: .mono \ [query fails] .onom .PP If any argument is an unbound variable, the corresponding parameter in the .code catch is left alone: if it is an unbound variable, it remains unbound, and if it is bound, it stays as is. .IP code: .mono \ @(try) @(throw e "honda" unbound) @(catch e (car1 car2)) @car1 @car2 @(end) .onom .IP data: .mono \ honda toyota .onom .IP result: .mono \ car1="honda" car2="toyota" .onom .PP If a .code catch has fewer parameters than there are throw arguments, the excess arguments are ignored: .IP code: .mono \ @(try) @(throw e "banana" "apple" "pear") @(catch e (fruit)) @(end) .onom .IP result: .mono \ fruit="banana" .onom .PP If a .code catch has more parameters than there are throw arguments, the excess parameters are left alone. They may be bound or unbound variables. .IP code: .mono \ @(try) @(throw e "honda") @(catch e (car1 car2)) @car1 @car2 @(end) .onom .IP data: .mono \ honda toyota .onom .IP result: .mono \ car1="honda" car2="toyota" .onom .PP A .code throw argument passing a value to a .code catch parameter which is unbound causes that parameter to be bound to that value. .code throw arguments are evaluated in the context of the .codn throw , and the bindings which are available there. Consideration of what parameters are bound is done in the context of the catch. .IP code: .mono \ @(bind c "c") @(try) @(forget c) @(bind (a c) ("a" "lc")) @(throw e a c) @(catch e (b a)) @(end) .onom .IP result: .mono \ c="c" b="a" a="lc" .onom .PP In the above example, .code c has a top-level binding to the string .strn "c" , but then becomes unbound via .code forget within the .code try construct, and rebound to the value .strn lc . Since the .code try construct is terminated by a .codn throw , these modifications of the binding environment are discarded. Hence, at the end of the query, variable .code c ends up bound to the original value .strn c . 
The .code throw still takes place within the scope of the bindings set up by the .code try clause, so the values of .code a and .code c that are thrown are .str a and .strn lc . However, at the .code catch site, variable .code a does not have a binding. At that point, the binding to .str a established in the .code try has disappeared already. Being unbound, the .code catch parameter .code a can take whatever value the corresponding throw argument provides, so it ends up with .strn lc . There is a horizontal form of .codn throw . For instance: .verb abc@(throw e 1) .brev throws exception .code e if .code abc matches. If .code throw is used to generate an exception derived from type .code error and that exception is not handled, \*(TX will issue diagnostics on the .code *stderr* stream and terminate. If an exception derived from .code warning is not handled, \*(TX will generate diagnostics on the .code *stderr* stream, after which control returns to the .code throw directive, and proceeds with the next directive. If an exception not derived from .code error is thrown, control returns to the .code throw directive and proceeds with the next directive. .dir defex The .code defex directive allows the query writer to invent custom exception types, which are arranged in a type hierarchy (meaning that some exception types are considered subtypes of other types). Subtyping means that if an exception type .code B is a subtype of .codn A , then every exception of type .code B is also considered to be of type .codn A . So a catch for type .code A will also catch exceptions of type .codn B . Every type is a supertype of itself: an .code A is a kind of .codn A . This implies that every type is a subtype of itself also. Furthermore, every type is a subtype of the type .codn t , which has no supertype other than itself. Type .code nil is a subtype of every type, including itself. The subtyping relationship is transitive also. 
If .code A is a subtype of .codn B , and .code B is a subtype of .codn C , then .code A is a subtype of .codn C . .code defex may be invoked with no arguments, in which case it does nothing: .verb @(defex) .brev It may be invoked with one argument, which must be a symbol. This introduces a new exception type. Strictly speaking, such an introduction is not necessary; any symbol may be used as an exception type without being introduced by .codn @(defex) : .verb @(defex a) .brev Therefore, this also does nothing, other than document the intent to use a as an exception. If two or more argument symbols are given, the symbols are all introduced as types, engaged in a subtype-supertype relationship from left to right. That is to say, the first (leftmost) symbol is a subtype of the next one, which is a subtype of the next one and so on. The last symbol, if it had not been already defined as a subtype of some type, becomes a direct subtype of the master supertype .codn t . Example: .verb @(defex d e) @(defex a b c d) .brev The first directive defines .code d as a subtype of .codn e , and .code e as a subtype of .codn t . The second defines .code a as a subtype of .codn b , .code b as a subtype of .codn c , and .code c as a subtype of .codn d , which is already defined as a subtype of .codn e . Thus .code a is now a subtype of .codn e . 
The above can be condensed to: .verb @(defex a b c d e) .brev Example: .IP code: .mono \ @(defex gorilla ape primate) @(defex monkey primate) @(defex human primate) @(collect) @(try) @(skip) @(cases) gorilla @name @(throw gorilla name) @(or) monkey @name @(throw monkey name) @(or) human @name @(throw human name) @(end)@#cases @(catch primate (name)) @kind @name @(output) we have a primate @name of kind @kind @(end)@#output @(end)@#try @(end)@#collect .onom .IP data: .mono \ gorilla joe human bob monkey alice .onom .IP output: .mono \ we have a primate joe of kind gorilla we have a primate bob of kind human we have a primate alice of kind monkey .onom .PP Exception types have a pervasive scope. Once a type relationship is introduced, it is visible everywhere. Moreover, the .code defex directive is destructive, meaning that the supertype of a type can be redefined. This is necessary so that something like the following works right: .verb @(defex gorilla ape) @(defex ape primate) .brev These directives are evaluated in sequence. So after the first one, the .code ape type has the type .code t as its immediate supertype. But in the second directive, .code ape appears again, and is assigned the .code primate supertype, while retaining .code gorilla as a subtype. This situation could be diagnosed as an error, forcing the programmer to reorder the statements, but instead \*(TX obliges. However, there are limitations. It is an error to define a subtype-supertype relationship between two types if they are already connected by such a relationship, directly or transitively. So the following definitions are in error: .verb @(defex a b) @(defex b c) @(defex a c)@# error: a is already a subtype of c, through b @(defex x y) @(defex y x)@# error: circularity; y is already a supertype of x. .brev .dir assert The .code assert directive requires the remaining query or subquery which follows it to match. If the remainder fails to match, the .code assert directive throws an exception. 
If the directive is simply .verb @(assert) .brev then it throws an exception of type .codn assert , which is a subtype of .codn error . The .code assert directive also takes arguments similar to the .code throw directive: an exception symbol and additional arguments which are bind expressions, and may be unbound variables. The following assert directive, if it triggers, will throw an exception of type .codn foo , with arguments .code 1 and .strn 2 : .verb @(assert foo 1 "2") .brev Example: .verb @(collect) Important Header ---------------- @(assert) Foo: @a, @b @(end) .brev Without the assertion in place, if the .code "Foo: @a, @b" part does not match, then the entire interior of the .code @(collect) clause fails, and the collect continues searching for another match. With the assertion in place, if the text .str "Important Header" and its underline match, then the remainder of the collect body must match, otherwise an exception is thrown. Now the program will not silently skip over any Important Header sections due to a problem in its matching logic. This is particularly useful when the matching is varied with numerous cases, and they must all be handled. There is a horizontal .code assert directive also. For instance: .verb abc@(assert)d@x .brev asserts that if the prefix .str abc is matched, then it must be followed by a successful match for .strn "d@x" , or else an exception is thrown. If the exception is not handled, and is derived from .code error then \*(TX issues diagnostics on the .code *stderr* stream and terminates. If the exception is derived from .code warning and not handled, \*(TX issues a diagnostic on .code *stderr* after which control returns to the .code assert directive. Control silently returns to the .code assert directive if an exception of any other kind is not handled. When control returns to .code assert due to an unhandled exception, it behaves like a failed match, similarly to the .code require directive.
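.PP
The following is a minimal sketch, not a tested query, of how an assertion
failure might be intercepted instead of being left to terminate the program.
It wraps the
.code collect
example above in a
.code try
whose
.code catch
clause names the
.code assert
exception type. The wording of the output line is purely illustrative:
.verb
  @(try)
  @(collect)
  Important Header
  ----------------
  @(assert)
  Foo: @a, @b
  @(end)
  @(catch assert)
  @(output)
  malformed Important Header section
  @(end)
  @(end)
.brev
Because
.code assert
is a subtype of
.codn error ,
a
.code "@(catch error)"
clause would be expected to intercept the same exception.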
.SH* TXR LISP The \*(TX language contains an embedded Lisp dialect called \*(TL. This language is exposed in \*(TX in a number of ways. In any situation that calls for an expression, a Lisp expression can be used, if it is preceded by the .code @ character. The Lisp expression is evaluated and its value becomes the value of that expression. Thus, \*(TX directives are embedded in literal text using .codn @ , and Lisp expressions are embedded in directives using .code @ also. Furthermore, certain directives evaluate Lisp expressions without requiring .codn @ . These are .codn @(do) , .codn @(require) , .codn @(assert) , .code @(if) and .codn @(next) . \*(TL code can be placed into files. On the command line, \*(TX treats files with a .strn .tl , .str .tlo or .str .tlo.gz suffix as \*(TL source or compiled code, and the .code @(load) directive does also. \*(TX also provides an interactive listener for Lisp evaluation. Lastly, \*(TL expressions can be evaluated via the command line, using the .code -e and .code -p options. .B Examples: Bind variable .code a to the integer 4: .verb @(bind a @(+ 2 2)) .brev Bind variable .code b to the standard input stream. Note that .code @ is not required on a Lisp variable: .verb @(bind b *stdin*) .brev Define several Lisp functions inside .codn @(do) : .verb @(do (defun add (x y) (+ x y)) (defun occurs (item list) (cond ((null list) nil) ((atom list) (eql item list)) (t (or (eq (first list) item) (occurs item (rest list))))))) .brev Trigger a failure unless previously bound variable .code answer is greater than 42: .verb @(require (> (int-str answer) 42)) .brev .SS* Overview \*(TL is a small and simple dialect, like Scheme, but much more similar to Common Lisp than Scheme. It has separate value and function binding namespaces, like Common Lisp (and thus is a Lisp-2 type dialect), and represents Boolean .B true and .B false with the symbols .code t and .code nil (note the case sensitivity of identifiers denoting symbols!).
Furthermore, the symbol .code nil is also the empty list, which terminates nonempty lists. \*(TL has lexically scoped local variables and dynamic global variables, similarly to Common Lisp, including the convention that .code defvar marks symbols for dynamic binding in local scopes. Lexical closures are supported. \*(TL also supports global lexical variables via .codn defvarl . Functions are lexically scoped in \*(TL; they can be defined in the pervasive global environment using .code defun or in local scopes using .code flet and .codn labels . .SS* Additional Syntax Much of the \*(TL syntax has been introduced in the previous sections of the manual, since directive forms are based on it. There is some additional syntax that is useful in \*(TL programming. .NP* Symbol Tokens A symbol token in \*(TL, called a .meta lident (Lisp identifier), has a similar syntax to the .meta bident (braced identifier) in the \*(TX pattern language. It may consist of all the same characters, as well as the .code / (slash) character which may not be used in a .metn bident . Thus a .meta lident may consist of these characters, in addition to letters, numbers and underscores: .mono ! $ % & * + - < = > ? \e ~ / .onom and may not look like a number. A .meta lident may also include all of the Unicode characters which are permitted in a .metn bident . The one character which is allowed in a .meta lident but not in a .meta bident is .code / (forward slash). A lone .code / is a valid .meta lident and consequently a symbol token in \*(TL. The token .code /abc/ is also a symbol, and, unlike in a braced expression, is not a regular expression. In \*(TL expressions, regular expressions are written with a leading .codn # . .NP* Package Prefixes If a symbol name contains a colon, the .I lident characters, if any, before that colon constitute the package prefix. For example, the syntax .code foo:bar denotes the .code bar symbol in the .code foo package.
It is a syntax error to read a symbol whose package doesn't exist. If the package exists, but the symbol name doesn't exist in that package, then the symbol is interned in that package. If the package name is an empty string (the colon is preceded by nothing), the package is understood to be the .code keyword package. The symbol is interned in that package. The syntax .code :test denotes the symbol .code test in the .code keyword package, the same as .codn keyword:test . Symbols in the keyword package are self-evaluating. This means that when a keyword symbol is evaluated as a form, the value of that form is the keyword symbol itself. Exactly two non-keyword symbols also have this special self-evaluating behavior: the symbols .code t and .code nil in the user package, whose fully qualified names are .code usr:t and .codn usr:nil . The syntax .code @foo:bar denotes the meta prefix .code @ being applied to the .code foo:bar symbol, not to a symbol in the .code @foo package. The syntax .code #:bar denotes an uninterned symbol named .codn bar , described in the next section. .TP* "Dialect Note:" In ANSI Common Lisp, the .code foo:bar syntax does not intern the symbol .code bar in the .code foo package; the symbol must exist and be an exported symbol, or else the syntax is erroneous. In ANSI Common Lisp, the syntax .code foo::bar does intern .code bar in the .code foo package. \*(TX's package system has no double-colon syntax, and lacks the concept of exported symbols. .NP* Uninterned Symbols Uninterned symbols are written with the .code #: prefix, followed by zero or more .I lident characters. When an uninterned symbol is read, a new, unique symbol is constructed, with the specified name. Even if two uninterned symbols have the same name, they are different objects. The .code make-sym and .code gensym functions produce uninterned symbols. "Uninterned" means "not entered into a package".
Interning refers to a process which combines package lookup with symbol creation, which ensures that multiple occurrences of a symbol name in written syntax are all converted to the same object: the first occurrence creates the symbol and associates it with its name in a package. Subsequent occurrences do not create a new symbol, but retrieve the existing one. .NP* Meta-Atoms and Meta-Expressions An expression may be preceded by the .code @ (at sign) character. If the expression is an .codn atom , then this is a meta-atom, otherwise it is a meta-expression. When the atom is a symbol, this is also called a meta-symbol and in situations when such a symbol behaves like a variable, it is also referred to as a meta-variable. When the atom is an integer, the meta-atom expression is called a meta-number. Meta-atom and meta-expression expressions have no evaluation semantics; evaluating them throws an exception. They play a syntactic role in the .code op operator, which makes use of meta-variables and meta-numbers, and in structural pattern matching, which uses meta-variables as pattern variables and whose operator vocabulary is based on meta-expressions. Meta-expressions also appear in the quasiliteral notation. In other situations, application code may assign meaning to meta syntax as the programmer sees fit. Meta syntax is defined as a shorthand notation, as follows: If .code X is the syntax of an atom, such as a symbol, string or vector, then .code @X is a shorthand for the expression .codn "(sys:var X)" . Here, .code sys:var refers to the .code var symbol in the .codn system-package . If .code X is a compound expression, either .code "(...)" or .codn "[...]" , then .code @X is a shorthand for the expression .codn "(sys:expr X)" . The behavior of .code @ followed by the syntax of a floating-point constant introduced by a leading decimal point, not preceded by digits, is unspecified. Examples of this are .code "@.123" and .codn "@.123E+5" . 
The behavior of .code @ followed by the syntax of a floating-point expression in E notation, which lacks a decimal point, is also unspecified. An example of this is .codn @12E5 . It is a syntax error for .code @ to be followed by what appears to be a floating-point constant consisting of a decimal point flanked by digits on both sides. For instance .code @1.2 is rejected. A meta-expression followed by a period, and the syntax of another object is otherwise interpreted as a referencing dot expression. For instance .code @1.E3 denotes .code "(qref @1 E3)" which, in turn, denotes .codn "(qref (sys:var 1) E3)" , even though the unprefixed character sequence .code 1.E3 is otherwise a floating-point constant. .NP* Consing Dot Unlike other major Lisp dialects, \*(TL allows a consing dot with no forms preceding it. This construct simply denotes the form which follows the dot. That is to say, the parser implements the following transformation: .verb (. expr) -> expr .brev This is convenient in writing function argument lists that only take variable arguments. Instead of the syntax: .verb (defun fun args ...) .brev the following syntax can be used: .verb (defun fun (. args) ...) .brev When a .code lambda form is printed, it is printed in the following style. .verb (lambda nil ...) -> (lambda () ...) (lambda sym ...) -> (lambda (. sym) ...) (lambda (sym) ...) -> (lambda (sym) ...) .brev In no other circumstances is .code nil printed as .codn () , or an atom .meta sym as .codn "(. sym)" . This notation is implemented for the square brackets, according to this transformation: .verb [. expr] -> (dwim . expr) .brev This is useful in Structural Pattern Matching, allowing a pattern like .verb [. @args] .brev to match a .code dwim expression and capture all of its arguments in a variable, without having to resort to the internal notation: Compatibility Note: support for .code "[. expr]" was introduced in \*(TX 282. Older versions do not read the syntax, but do print .code "(dwim . 
@var)" as .code "[. @var]" which is then unreadable in those versions, breaking read-print consistency. .NP* Referencing Dot A dot token which is flanked by expressions on both sides, without any intervening whitespace, is the referencing dot, and not the consing dot. The referencing dot is a syntactic sugar which translates to the .code qref syntax ("quoted ref"). When evaluated as a form, this syntax denotes structure access; see Structures. However, it is possible to put this syntax to use for other purposes, in other contexts. .verb ;; a.b may be almost any expressions a.b <--> (qref a b) a.b.c <--> (qref a b c) a.(qref b c) <--> (qref a b c) (qref a b).c <--> (qref (qref a b) c) .brev That is to say, this dot operator constructs a .code qref expression out of its left and right arguments. If the right argument of the dot is already a qref expression (whether produced by another instance of the dot operator, or expressed directly) it is merged. This requires the qref dot operator to be right-to-left associative, so that .code a.b.c works by first translating .code b.c to .codn "(qref b c)" , and then adjoining .code a to produce .codn "(qref a b c)" . If the referencing dot is immediately followed by a question mark, it forms a single token, which produces the following syntactic variation, in which the preceding item is annotated as a list headed by the symbol .codn t : .verb a.?b <--> (t a).b <--> (qref (t a) b) a.?b.?c <--> (t a).(t b).c <--> (qref (t a) (t b) c) a.?(b) <--> (t a).(b) <--> (qref (t a) (b)) (a).?b <--> (t (a)).b <--> (qref (t (a)) b) .brev This syntax denotes .I null-safe access to structure slots and methods. .code a.?b means that .code a may evaluate to .codn nil , in which case the expression yields .codn nil ; otherwise, .code a must evaluate to a .code struct which has a slot .codn b , and the expression denotes access to that slot.
Similarly, .code "a.?(b 1)" means that if .code a evaluates to .codn nil , the expression yields .codn nil ; otherwise, .code a is treated as a struct object whose method .code b is invoked with argument .codn 1 , and the value returned by that method becomes the value of the expression. Integer tokens cannot be involved in this syntax, because they form floating-point constants when juxtaposed with a dot. Such ambiguous uses of floating-point tokens are diagnosed as syntax errors: .verb (a.4) ;; error: cramped floating-point literal (a .4) ;; good: a followed by 0.4 .brev .NP* Unbound Referencing Dot Closely related to the referencing dot syntax is the unbound referencing dot. This is a dot which is flanked by an expression on the right, without any intervening whitespace, but is not preceded by an expression. Rather, it is preceded by whitespace, or some punctuation such as .codn [ , .code ( or .codn ' . This is a syntactic sugar which translates to .code uref syntax: .verb .a <--> (uref a) .a.b <--> (uref a b) .a.?b <--> (uref (t a) b) .brev If the unbound referencing dot is itself combined with a question mark to form the .code .? token, then the translation to .code uref is as follows: .verb .?a <--> (uref t a) .?a.b <--> (uref t a b) .?a.?b <--> (uref t a (t b)) .brev When the unbound referencing dot is applied to a dotted expression, this can be understood as a conversion of .code qref to .codn uref . Indeed, this is exactly what happens if the unbound dot is applied to an explicit .code qref expression: .verb .(qref a b) <--> (uref a b) .brev The unbound referencing dot takes its name from the semantics of the .code uref macro, which produces a function that implements late binding of an object to a method slot. Whereas the expression .code obj.a.b denotes accessing object .code obj to retrieve slot .code a and then accessing slot .code b of the object from that slot, the expression .code .a.b.
represents a "disembodied" reference: it produces a function which takes an object as an argument and then performs the implied slot referencing on that argument. When the function is called, it is said to bind the referencing to the object. Hence that referencing is "unbound". Whereas the expression .code .a produces a function whose argument must be an object, .code .?a produces a function whose argument may be .codn nil . The function detects this case and returns .codn nil . .NP* Quote and Quasiquote .RS .meIP >> ' expr The quote character in front of an expression is used for suppressing evaluation, which is useful for forms that evaluate to something other than themselves. For instance if .code "'(+ 2 2)" is evaluated, the value is the three-element list .codn "(+ 2 2)" , whereas if .code "(+ 2 2)" is evaluated, the value is .codn 4 . Similarly, the value of .code 'a is the symbol .code a itself, whereas the value of .code a is the contents of the variable .codn a . .meIP >> ^ qq-template The caret in front of an expression is a quasiquote. A quasiquote is like a quote, but with the possibility of substitution of material. Under a quasiquote, form is considered to be a quasiquote template. The template is considered to be a literal structure, except that it may contain the notations .mono .meti >> , expr .onom and .mono .meti >> ,* expr .onom which denote non-constant parts. A quasiquote gets translated into code which, when evaluated, constructs the structure implied by .metn qq-template , taking into account the unquotes and splices. A quasiquote also processes nested quasiquotes specially. If .meta qq-template does not contain any unquotes or splices (which match its level of nesting), or is simply an atom, then .mono .meti >> ^ qq-template .onom is equivalent to .mono .meti >> ' qq-template . .onom in other words, it is like an ordinary quote. For instance .code "^(a b ^(c ,d))" is equivalent to .codn "'(a b ^(c ,d))" . 
Although there is an unquote ,d it belongs to the inner quasiquote .codn "^(c ,d)" , and the outer quasiquote does not have any unquotes of its own, making it equivalent to a quote. Dialect Note: in Common Lisp and Scheme, .code ^form is written .codn `form , and quasiquotes are also informally known as backquotes. In \*(TX, the backquote character .code ` is used for quasistring literals. .meIP >> , expr The comma character is used within a .meta qq-template to denote an unquote. Whereas the quasiquote suppresses evaluation, similarly to the quote, the comma introduces an exception: an element of a form which is evaluated. For example, the value of .code "^(a b c ,(+ 2 2) (+ 2 2))" is the list .codn "(a b c 4 (+ 2 2))" . Everything in the quasiquote stands for itself, except for the .code ",(+ 2 2)" which is evaluated. Note: if a variable is called .codn *x* , then the syntax .code ,*x* means .codn ",* x*" : splice the value of .codn x* . In this situation, whitespace between the comma and the variable name must be used: .codn ", *x*" . .meIP >> ,* expr The comma-star operator is used within a quasiquote list to denote a splicing unquote. The form which follows .code ,* must evaluate to a list. That list is spliced into the structure which the quasiquote denotes. For example: .code "^(a b c ,*(list (+ 3 3) (+ 4 4)) d)" evaluates to .codn "(a b c 6 8 d)" . The expression .code "(list (+ 3 3) (+ 4 4))" is evaluated to produce the list .codn "(6 8)" , and this list is spliced into the quoted template. .meIP >> @,* expr This syntax is not a distinct quasiquoting operator, but rather the combination of an unquote occurring as a meta-expression, denoting the structure .codn "(sys:expr ,expr)" . This structure is treated specially by the quasiquote expander. Code is generated for it such that if .meta expr evaluates to a value .meta val which is an .codn atom , then the result will be the .mono .meti (sys:var << val ) .onom structure.
If .meta val is a .code cons rather than an .codn atom , then the result is the .mono .meti (sys:expr << val ) .onom structure. In other words, when quasiquoting is used to insert a value under the .code @ meta prefix, the expander generates code to analyze the type of the value, and produce the form which is most likely intended. .RE .TP* "Dialect Notes:" In other Lisp dialects, like Scheme and ANSI Common Lisp, the equivalent of the .code ,* splicing syntax is usually .code ,@ (comma at). The .code @ character already has an assigned meaning in \*(TX, so .code * is used. However, .code * is also a character that may appear in a symbol name, which creates a potential for ambiguity. The syntax .code ,*abc denotes the application of the .code ,* splicing operator to the symbolic expression .codn abc ; to apply the ordinary non-splicing unquote to the symbol .codn *abc , whitespace must be used: .codn ", *abc" . In \*(TX, the unquoting and splicing forms may freely appear outside of a quasiquote template. If they are evaluated as forms, however, they throw an exception: .verb ,(+ 2 2) ;; error! ',(+ 2 2) --> ,(+ 2 2) .brev In other Lisp dialects, a comma not enclosed by backquote syntax is treated as a syntax error by the reader. \*(TX's quasiquote supports splicing multiple items into a .codn quote , if that quote is itself evaluated via an unquote. Concretely, these two examples produce the same result: .verb (eval (eval (let ((args '(a b c))) ^^(let ((a 1) (b 2) (c 3)) (list ,',*args))))) -> (1 2 3) (eval (eval (let ((args '(a b c))) ^^(let ((a 1) (b 2) (c 3)) (list ,*',args))))) -> (1 2 3) .brev The only difference is that the former example uses .code ",',*args" whereas the latter uses .codn ",*',args" . Thus the former example splices .code args into the quote as if by .code "(quote ,*args)" which is invalid .code quote syntax if .code args doesn't expand to exactly one element.
This invalid quote syntax is accepted by the quasiquote expander when it occurs in the above unquoting and splicing situation. Effectively, it behaves as if the splice distributes across the quoted unquote, such that all the arguments of the .code quote end up individually quoted, and spliced into the surrounding list. The Common Lisp equivalent of this combination, .codn ",',@args" , works in some Common Lisp implementations, such as CLISP. .NP* Quasiquoting non-List Objects Quasiquoting is supported over hash table and vector literals (see Vectors and Hashes below). A hash table or vector literal can be quoted, like any object, for instance: .verb '#(1 2 3) .brev The .code "#(1 2 3)" literal is turned into a vector atom right in the \*(TX parser, and this atom is being quoted: this is .mono .meti (quote << atom ) .onom syntactically, which evaluates to .metn atom . When a vector is quasiquoted, this is a case of .mono .meti >> ^ atom .onom which evaluates to .metn atom . A vector can be quasiquoted, for example: .verb ^#(1 2 3) .brev Unquotes can occur within a quasiquoted vector: .verb (let ((a 42)) ^#(1 ,a 3)) ; value is #(1 42 3) .brev In this situation, the .code ^#(...) notation produces code which constructs a vector. The vector in the following example is also a quasivector. It contains unquotes, and though the quasiquote is not directly applied to it, it is embedded in a quasiquote: .verb (let ((a 42)) ^(a b c #(d ,a))) ; value is (a b c #(d 42)) .brev Hash-table literals have two parts: the list of hash construction arguments and the key-value pairs. For instance: .verb #H((:eql-based) (a 1) (b 2)) .brev where .code (:eql-based) indicates that this hash table's keys are treated using .code eql equality, and .code "(a 1)" and .code "(b 2)" are the key/value entries. Hash literals may be quasiquoted. In quasiquoting, the arguments and pairs are treated as separate syntax; it is not one big list.
So the following is not a possible way to express the above hash: .verb ;; not supported: splicing across the entire syntax (let ((hash-syntax '((:eql-based) (a 1) (b 2)))) ^#H(,*hash-syntax)) .brev This is correct: .verb ;; fine: splicing hash arguments and contents separately (let ((hash-args '(:eql-based)) (hash-contents '((a 1) (b 2)))) ^#H(,hash-args ,*hash-contents)) .brev .NP* Quasiquoting combined with Quasiliterals When a quasiliteral is embedded in a quasiquote, it is possible to use splicing to insert material into the quasiliteral. Example: .verb (eval (let ((a 3)) ^`abc @,a @{,a} @{(list 1 2 ,a)}`)) -> "abc 3 3 1 2 3" .brev .NP* Vector Literals .coIP "#(...)" A hash token followed by a list denotes a vector. For example .code "#(1 2 a)" is a three-element vector containing the numbers .code 1 and .codn 2 , and the symbol .codn a . .NP* Struct Literals .meIP >> #S( name >> { slot << value }*) The notation .code #S followed by a nested list syntax denotes a struct literal. The first item in the syntax is a symbol denoting the struct type name. This must be the name of a struct type, otherwise the literal is erroneous. Followed by the struct type are slot names interleaved with their values. The values are literal expressions, not subject to evaluation. Each slot name which is present in the literal must name a slot in the struct type, though not all slots in the struct type must be present in the literal. When a struct literal is read, the denoted struct type is constructed as if by a call to .code make-struct with an empty .meta plist argument, followed by a sequence of assignments which store into each .meta slot the corresponding .meta value expression. .NP* Hash Literals .meIP <> #H(( hash-argument *) >> ( key << value )*) The notation .code #H followed by list syntax denotes a hash-table literal. The first item in the syntax is a list of keywords. These are the same keywords as are used when calling the function hash to construct a hash table. 
Allowed keywords are: .codn :equal-based , .codn :eql-based , .codn :eq-based , .codn :weak-keys , .codn :weak-vals , and .codn :userdata . If the .code :userdata keyword is present, it must be followed by an object; that object specifies the hash table's user data, which can be retrieved using the .code hash-userdata function. The .codn :equal-based , .code :eql-based and .code :eq-based keywords are mutually exclusive. An empty list can be specified as .code nil or .codn () , which defaults to a hash table based on the .code equal function, with no weak semantics or user data. The entire syntax following .code #H may be an empty list; however, that empty list may not be specified as .codn nil ; the empty parentheses notation is required. The hash table's key-value contents are specified as zero or more two-element lists, whose first element specifies the .meta key and whose second specifies the .metn value . Both expressions are literal objects, not subject to evaluation. .NP* Range Literals .meIP >> #R( from << to ) The notation .code #R followed by a two-element list syntax denotes a range literal. It combines .meta from and .meta to expressions, themselves literals not subject to evaluation, producing the range object whose corresponding .code from and .code to fields are the objects denoted by these expressions. .NP* Buffer Literals .meIP <> #b' hex-data ' The notation .code #b' introduces a buffer object: a data representation for a block of bytes. This .code #b' prefix must be followed by a data section and a closing quote. The data section consists of hexadecimal digits, among which may be interspersed whitespace: tabs, spaces and newlines. There must be an even number of digits, or else the notation is ill-formed. The whitespace is ignored, and pairs of successive hex digits specify bytes. If there are no hex digits, then a zero length buffer is specified.
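As a brief sketch of these rules, the following forms measure two buffer literals using the
.code length-buf
function. The particular byte values are chosen only for illustration:
.verb
  ;; whitespace among the hex digits is ignored: four bytes
  (length-buf #b'DEAD BEEF')   -> 4
  ;; no hex digits: a zero length buffer
  (length-buf #b'')            -> 0
.brev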
Buffers may be constructed by the .code make-buf function, and other means such as the .code ffi-get function. Note that the .code #b prefix is also used for binary numbers. In that syntax, it is followed by an optional sign, and then a mixture of one or more of the digits .code 0 or .codn 1 . .NP* Tree Node Literals .meIP >> #N([ key >> [ left <> [ right ]]]) The notation .code #N followed by list syntax denotes a tree node literal. The list syntax must be a proper list that has up to three elements. If the list is empty, it may not be written as .codn nil . A tree node is an object of type .codn tnode . Every .code tnode has three elements: a .metn key , a .meta left link and a .meta right link. They may be objects of any type. If the tree node literal syntax omits any of these, they default to .codn nil . .NP* Tree Literals .meIP >> #T([([ keyfun >> [ lessfun <> [ equalfun ]]]) << item *]) The notation .code #T followed by list syntax denotes a tree literal, which specifies an object of type .codn tree . Objects of type .code tree are search trees. The list syntax which follows .code #T may be empty. If so, it cannot be written as .codn nil . The first element of the .code #T syntax, if present, must be a list of zero to three elements. These elements are symbols giving the names of the .code tree object's .IR "key abstraction functions" . .meta keyfun specifies the key function which is applied to each element to retrieve its key. If it is omitted, the object shall use the .code identity function as its key. The .meta lessfun specifies the name of the comparison function by which keys are compared for inequality. It defaults to .codn less . The .meta equalfun specifies the function by which keys are compared for equality. It defaults to .codn equal . 
A symbol which is specified as the name of any of these three special functions must be an element of the list stored in the special variable .codn *tree-fun-whitelist* , otherwise the tree literal is diagnosed as erroneous. Note: this is due to security considerations, since these three functions are executed during the processing of tree syntax. A tree object is constructed from a tree literal by first creating an empty tree endowed with the three key abstraction functions that are indicated in the syntax, either explicitly or as defaults. Then, every .meta item object is constructed from its respective literal syntax and inserted into the tree. Duplicate objects are preserved. For instance the tree literal .code "#T(() 1 1 1)" specifies a tree with three nodes which have the same key. Duplicates appear in the tree in the order that they appear in the literal. .NP* JSON Literals .meIP >> #J json-syntax Introduces a JSON literal. .meIP >> #J^ json-syntax Introduces a JSON quasiquote, allowing unquoting and splicing of Lisp expressions. The implementation of JSON syntax is based on, and intended to conform with the IETF RFC 8259 document. Only \*(TX's extensions to JSON syntax are described in this manual, as well as the correspondence between JSON syntax and Lisp. The .meta json-syntax is translated into a \*(TL object as follows. A JSON string corresponds to a Lisp string. A JSON number corresponds to a Lisp floating-point number. A JSON array corresponds to a Lisp vector. A JSON object corresponds to an .codn equal -based hash table. The JSON Boolean symbols .code true and .code false translate to the Lisp symbols .code t and .codn nil , respectively, those being the standard ones in the .code usr package. The JSON symbol .code null maps to the .code null symbol in the .code usr package.
The .mono .meti >> #J json-syntax .onom expression produces the object: .mono .mets (json quote << lisp-object ) .onom where .meta lisp-object is the Lisp value which corresponds to the .metn json-syntax . Similarly, but with a key difference, the .mono .meti >> #J^ json-syntax .onom expression produces the object: .mono .mets (json sys:qquote << lisp-object ) .onom in which .code quote has been replaced with .codn sys:qquote . The .code json symbol is bound as a macro, which is expanded when a .code #J expression is evaluated. The following remarks indicate special treatment and extensions in the processing of JSON. Similar remarks regarding the production of JSON are given under the .code put-json function. When an invalid UTF-8 byte is encountered inside a JSON string, its value is mapped into the code point range U+DC01 to U+DCFF. That byte is consumed, and decoding continues with the next byte. This treatment is consistent with the treatment of invalid UTF-8 bytes in \*(TL literals and I/O streams. If the valid UTF-8 byte U+0000 (ASCII NUL) occurs in a JSON string, it is also mapped to U+DC00, \*(TX's pseudo-null character. This treatment is consistent with \*(TX string literals and I/O streams. The JSON escape sequence .code "\eu0000" denoting the U+0000 NUL character is also converted to U+DC00. \*(TL does not impose the restriction that the keys in a JSON object must be strings: .code "#J{1:2,true:false}" is accepted. \*(TL allows the circle notation to occur within JSON syntax. See the section Notation for Circular and Shared Structure. \*(TL supports the extension of Lisp comments in JSON. When the .code ; character (semicolon) occurs in the middle of JSON syntax, outside of a token, that character and all characters until the end of the line constitute a comment that is discarded. \*(TL never produces comments when printing JSON. \*(TL allows for JSON syntax to be quasiquoted, and provides two extensions for writing unquotes and splicing unquotes. 
Within a JSON quasiquote, the .code ~ (tilde) character introduces a Lisp expression whose value is to be substituted at that point. Thus, the tilde serves the role of the unquoting comma used in Lisp quasiquotes. Splicing is indicated by the character sequence .codn ~* , which introduces a Lisp expression that is expected to produce a list, whose elements are interpolated into the JSON value. Note: quasiquoting allows Lisp values to be introduced into the resulting object which are outside of the JSON type system, such as integers, characters, symbols or structures. These objects have no representation in JSON syntax. .TP* Examples: .verb ;; Basic JSON: #Jtrue -> t #Jfalse -> nil (list #J true #Jtrue #Jfalse) -> (t t nil) #J[1, 2, 3.14] -> #(1.0 2.0 3.14) #J{"foo":"bar"} -> #H(() ("foo" "bar")) ;; Quoting JSON shows the json expression '#Jfalse -> (json quote ()) '#Jtrue -> (json quote t) '#J["a", true, 3.0] -> (json quote #("a" t 3.0)) '#J^[~(+ 2 2), 3] -> (json sys:qquote #(,(+ 2 2) 3.0)) ;; Circle notation: #J[#1="abc", #1#, #1#] -> #("abc" "abc" "abc") ;; JSON Quasiquote: #J^[~*(list 1.0 2.0 3.0), ~(* 2.0 2), 5.0] --> #(1.0 2.0 3.0 4.0 5.0) ;; Lisp quasiquote around JSON quote: requires evaluation round. ^#J[~*(list 1.0 2.0 3.0), ~(* 2.0 2), 5.0] --> (json quote #(1.0 2.0 3.0 4.0 5.0)) (eval ^#J[~*(list 1.0 2.0 3.0), ~(* 2.0 2), 5.0]) --> #(1.0 2.0 3.0 4.0 5.0) ;; Comment extension #J[1, ; Comment inside JSON. 2, ; Another one. 3] ; Lisp comment outside of JSON. --> #(1.0 2.0 3.0) .brev .coNP The @ .. notation In \*(TL, there is a special "dotdot" notation consisting of a pair of dots. This can be written between successive atoms or compound expressions, and is a shorthand for .codn rcons . That is to say, .code "A .. B" translates to .codn "(rcons A B)" , and so for instance .code "(a b .. (c d) e .. f . g)" means .codn "(a (rcons b (c d)) (rcons e f) . g)" . The .code rcons function constructs a range object, which denotes a pair of values.
Range objects are most commonly used for referencing subranges of sequences. For instance, if .code L is a list, then .code "[L 1 .. 3]" computes a sublist of .code L consisting of elements 1 through 2 (counting from zero). Note that if this notation is used in the dot position of an improper list, the transformation still applies. That is, the syntax .code "(a . b .. c)" is valid and produces the object .code "(a . (rcons b c))" which is another way of writing .codn "(a rcons b c)" , which is quite probably nonsense. The notation's .code .. operator associates right to left, so that .code a..b..c denotes .codn "(rcons a (rcons b c))" . Note that range objects are not printed using the dotdot notation. A range literal has the syntax of a two-element list, prefixed by .codn #R . (See Range Literals above.) In any context where the dotdot notation may be used, and where it is evaluated to its value, a range literal may also be specified. If an evaluated dotdot notation specifies two constant expressions, then an equivalent range literal can replace it. For instance the form .code "[L 1 .. 3]" can also be written .codn "[L #R(1 3)]" . The two are syntactically different, and so if these expressions are being considered for their syntax rather than value, they are not the same. .NP* The DWIM Brackets \*(TL has a square bracket notation. The syntax .code [...] is a shorthand way of writing .codn "(dwim ...)" . The .code [] syntax is useful for situations where the expressive style of a Lisp-1 dialect is useful. For instance if .code foo is a variable which holds a function object, then .code "[foo 3]" can be used to call it, instead of .codn "(call foo 3)" . If foo is a vector, then .code "[foo 3]" retrieves the fourth element, like .codn "(vecref foo 3)" . Indexing over lists, strings and hash tables is possible, and the notation is assignable. Furthermore, any arguments enclosed in .code [] which are symbols are treated according to a modified namespace lookup rule. 
More details are given in the documentation for the .code dwim operator. .NP* Compound Forms In \*(TL, there are two types of compound forms: the Lisp-2 style compound forms, denoted by ordinary lists that are expressed with parentheses. There are Lisp-1 style compound forms denoted by the DWIM Brackets, described in the previous section. The first position of an ordinary Lisp-2 style compound form, is expected to have a function or operator name. Then arguments follow. There may also be an expression in the dotted position, if the form is a function call. If the form is a function call then the arguments are evaluated. If any of the arguments are symbols, they are treated according to Lisp-2 namespacing rules. A function name may be a symbol, or else any of the syntactic forms given in the description of the function .codn func-get-name . .NP* Dot Position in Function Calls If there is an expression in the dotted position of a function call expression, it is also evaluated, and the resulting value is involved in the function call in a special way. Firstly, note that a compound form cannot be used in the dot position, for obvious reasons, namely that .code "(a b c . (foo z))" does not mean that there is a compound form in the dot position, but denotes an alternate spelling for .codn "(a b c foo z)" , where foo behaves as a variable. If the dot position of a compound form is an atom, then the behavior may be understood according to the following transformations: .verb (f a b c ... . x) --> (apply (fun f) a b c ... x) [f a b c ... . x] --> [apply f a b c ... x] .brev In addition to atoms, meta-expressions and meta-symbols can appear in the dot position, even though their underlying syntax is actually a compound expression. This is made to work according to a transformation pattern which superficially resembles the above one for atoms: .verb (f a b c ... . @x) --> (apply (fun f) a b c ... 
@x) .brev However, in this situation, the .code @x is a notation denoting the expression .code "(sys:var x)" and thus the entire form is a proper list, not a dotted list. With the underlying syntax revealed, the transformation looks like this: .verb (f a b c ... sys:var x) --> (apply (fun f) a b c ... (sys:var x)) .brev That is to say, the \*(TL form expander reacts to the presence of a .code sys:var or .code sys:expr symbol embedded in the form. That symbol and the items which follow it are wrapped in an additional level of nesting, converted into a single compound form element. Effectively, in all these cases, the dot notation constitutes a shorthand for .codn apply . Examples: .verb ;; a contains 3 ;; b contains 4 ;; c contains #(5 6 7) ;; s contains "xyz" (foo a b . c) ;; calls (foo 3 4 5 6 7) (foo a) ;; calls (foo 3) (foo . s) ;; calls (foo #\ex #\ey #\ez) (list . a) ;; yields 3 (list a . b) ;; yields (3 . 4) (list a . c) ;; yields (3 5 6 7) (list* a c) ;; yields (3 . #(5 6 7)) (cons a . b) ;; error: cons isn't variadic. (cons a b . c) ;; error: cons requires exactly two arguments. [foo a b . c] ;; calls (foo 3 4 5 6 7) [c 1] ;; indexes into vector #(5 6 7) to yield 6 (call (op list 1 . @1) 2) ;; yields (1 . 2) .brev Note that the atom in the dot position of a function call may be a symbol macro. Since the semantics works as if by transformation to an apply form in which the original dot position atom is an ordinary argument, the symbol macro may produce a compound form. Thus: .verb (symacrolet ((x 2)) (list 1 . x)) ;; yields (1 . 2) (symacrolet ((x (list 1 2))) (list 1 . x)) ;; yields (1 1 2) .brev That is to say, the expansion of .code x is not substituted into the form .code "(list 1 . x)" but rather the transformation to .code apply syntax takes place first, and so the substitution of .code x takes place in a form resembling .codn "(apply (fun list) 1 x)" .
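The equivalence between the dot notation and
.code apply
can be checked directly by evaluating both spellings side by side. In this minimal sketch, the variable name
.code args
and its contents are chosen only for illustration:
.verb
  ;; args is a hypothetical variable holding the trailing arguments
  (let ((args '(1 2 3)))
    (list (+ . args)              ;; dot notation
          (apply (fun +) args)))  ;; explicit apply
  -> (6 6)
.brev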
Dialect Note: In some other Lisp dialects like ANSI Common Lisp, the improper list syntax may not be used as a function call; a function called apply (or similar) must be used for application even if the expression which gives the trailing arguments is a symbol. Moreover, applying sequences other than lists is not supported. .NP* Improper Lists as Macro Calls \*(TL allows macros to be called using forms which are improper lists. These forms are simply destructured by the usual macro parameter list destructuring. To be callable this way, the macro must have an argument list which specifies a parameter match in the dot position. This dot position must either match the terminating atom of the improper list form, or else match the trailing portion of the improper list form. For instance if a macro .code mac is defined as .verb (defmacro mac (a b . c) ...) .brev then it may not be invoked as .code "(mac 1 . 2)" because the required argument .code b is not satisfied, and so the .code 2 argument cannot match the dot position .code c as required. The macro may be called as .code "(mac 1 2 . 3)" in which case .code c receives the form .codn 3 . If it is called as .code "(mac 1 2 3 . 4)" then .code c receives the improper list form .codn "(3 . 4)" . .NP* Regular-Expression Literals In \*(TL, the .code / character can occur in symbol names, and the .code / token is a symbol. Therefore the .code /regex/ syntax is not used for denoting regular expressions; rather, the .code #/regex/ syntax is used. .NP* Notation for Circular and Shared Structure \*(TL supports a printed notation called .I "circle notation" which accurately articulates the representation of objects which contain shared substructures as well as circular references. The notation is supported as a means of input, and is also optionally produced as output, controlled by the .code *print-circle* variable.
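Both directions can be seen in the following minimal sketch, in which the variable names
.code shared
and
.code x
are arbitrary. The label syntax which appears in it is explained in the paragraphs which follow:
.verb
  ;; output: with *print-circle* enabled, sharing is made visible
  (let ((shared (list 1 2))
        (*print-circle* t))
    (prinl (list shared shared)))    ;; prints (#1=(1 2) #1#)

  ;; input: both elements are read as one and the same object
  (let ((x '(#1=(1 2) #1#)))
    (eq (car x) (cadr x)))           -> t
.brev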
Ordinarily, shared substructure in printed objects is not evident, except in the case of multiple occurrences of interned symbols, in whose semantics it is implicit that they refer to the same object. Other shared structure is printed as separate copies which look like distinct objects. For instance, the object produced by .code "(let ((shared '(1 2))) (list shared shared))" is printed as .codn "((1 2) (1 2))" , where it is not clear that the two occurrences of .code "(1 2)" are actually the same object. Under the circle notation, this object can be represented as .codn "(#5=(1 2) #5#)" . The .code #5= part introduces a reference label, associating the arbitrarily chosen nonnegative integer 5 with the object which follows. The subsequent notation .code #5# simply refers to the object labeled by 5, reproducing that object by reference. The result is a two-element list which has the same .code "(1 2)" in two places. Circular structure presents a greater challenge to printing: namely, if it is printed by a naive recursive descent, it results in infinite output, and possibly stack exhaustion due to recursion. The circle notation detects and handles circular references. For instance, the object produced by .code "(let ((c (list 1))) (rplacd c c))" produces a circular list which looks like an infinite list of 1's: .codn "(1 1 1 1 ...)" . This cannot be printed. However, under the circle notation, it can be represented as .codn "#1=(1 . #1#)" . The entire object itself is labeled by the integer 1. Then, enclosed within the syntax of that labeled object itself, a reference occurs to the label. This circular label reference represents the corresponding circular reference in the object. A detailed description of the notational elements follows: .meIP <> # digits = < object The .code #= syntax introduces an object label which denotes the object whose printed representation follows. 
The label is identified by the integer value arising from digits .meta digits which are one or more decimal digits. Note: the value zero is permitted; even though when the notation is produced by the \*(TL printer, labeling begins at 1. Negative values are not possible because a leading sign is not part of the syntax. There may be no more than one definition for a given label within the syntactic scope being parsed, otherwise a syntax error occurs. In \*(TX pattern language code, an entire source file is parsed as one unit, and so scope for the circular notation's references is the entire source file. Files processed by .code @(include) have their own scope. The scope for labels in \*(TL source code is the top-level expression in which they appear. Consequently, references in one \*(TL top-level expression cannot reach definitions in another. .meIP <> # digits # The .code ## syntax denotes a label reference: the repetition of an object that was previously labeled by the integer given by .metn digits . If no such label had been introduced in the syntactic scope, a syntax error occurs. An object was previously labeled by .meta digits if a .code #= definition occurs in the same syntactic scope as the reference, and is applied to an object which either encloses the reference, or lexically precedes the reference. Forward references such as .code "(#1# #1=(1 2))" are not supported. .PP Note: Circular notation can span hash-table literals. The syntax .code "#1=#H((:eql-based) (#1# #1#))" denotes an .codn eql -based hash table which contains one entry, in which that same table itself is both the key and value. This kind of circularity is not supported for .codn equal -based hash tables. The analogous syntax .code "#1=#H(() (#1# #1#))" produces a hash table in an inconsistent state. Dialect Note: Circle notation is taken from Common Lisp, intended to be unsurprising to users familiar with that language. 
The implementation is based on descriptions in the ANSI Common Lisp document, judiciously taking into account the content of the X3J13 Cleanup Issues named PRINT-CIRCLE-STRUCTURE:USER-FUNCTIONS-WORK and PRINT-CIRCLE-SHARED:RESPECT-PRINT-CIRCLE. .NP* Notation for Erasing Objects .meIP #; < expr The \*(TL notation .code #; indicates that the expression .meta expr is to be read and then discarded, as if it were replaced by whitespace. This is useful for temporarily "commenting out" an expression. .PP Notes: Whereas it is valid for a \*(TL source file to be empty, it is a syntax error if a \*(TL source file contains nothing but one or more objects which are each suppressed by a preceding .codn #; . In the interactive listener, an input line consisting of nothing but commented-out objects is similarly a syntax error. The notation does not cascade; consecutive occurrences of .code #; trigger a syntax error. The notation interacts with the circle notation. Firstly, if an object which is erased by .code #; contains circular-referencing instances of the label notation, those instances refer to .codn nil . Secondly, commented-out objects may introduce labels which are subsequently referenced in .metn expr . An example of the first situation occurs in: .verb #;(#1=(#1#)) .brev Here the .code #1# label is a circular reference because it refers to an object which is a parent of the object which contains that reference. Such a reference is only satisfied by a "backpatching" process once the entire surrounding syntax is processed to the top level. The erasure perpetrated by .code #; causes the .code #1# label reference to be replaced by .codn nil , and therefore the labeled object is the object .codn (nil) .
An example of the second situation is .verb #;(#2=(a b c)) #2# .brev Here, even though the expression .code "(#2=(a b c))" is suppressed, the label definition which it has introduced persists into the following object, where the label reference .code #2# resolves to .codn "(a b c)" . A combination of the two situations occurs in .verb #;(#1=(#1#)) #1# .brev which yields .codn "(nil)" . This is because the .code #1= label is available; but the earlier .code #1# reference, being a circular reference inside an erased object, had lapsed to .codn nil . .SS* Generalization of List Accessors In ancient Lisp in the 1960's, it was not possible to apply the operations .code car and .code cdr to the .code nil symbol (empty list), because it is not a .code cons cell. In the InterLisp dialect, this restriction was lifted: these operations were extended to accept .code nil (and return .codn nil ). The convention was adopted in other Lisp dialects such as MacLisp and eventually in Common Lisp. Thus there exists an object which is not a cons, yet which takes .code car and .codn cdr . In \*(TL, this relaxation is extended further. For the sake of convenience, the operations .code car and .code cdr are made to work with strings and vectors: .verb (cdr "") -> nil (car "") -> nil (car "abc") -> #\ea (cdr "abc") -> "bc" (cdr #(1 2 3)) -> #(2 3) (car #(1 2 3)) -> 1 .brev Moreover, structure types which define the methods .codn car , .code cdr and .code nullify can also be treated in the same way. The .code ldiff function is also extended in a special way. When the right parameter is a non-list sequence, then it uses the .code equal equality test rather than .code eq for detecting the tail of the list. .verb (ldiff "abcd" "cd") -> (#\ea #\eb) .brev The .code ldiff operation starts with .str "abcd" and repeatedly applies .code cdr to produce .str "bcd" and .strn "cd" , until the suffix is equal to the second argument: .mono (equal "cd" "cd") .onom yields true.
Operations based on .codn car , .code cdr and .codn ldiff , such as .code keep-if and .code remq extend to strings and vectors. Most derived list processing operations such as .code remq or .code mapcar obey the following rule: the returned object follows the type of the leftmost input list object. For instance, if one or more sequences are processed by .codn mapcar , and the leftmost one is a character string, the function is expected to return characters, which are converted to a character string. However, in the event that the objects produced cannot be assembled into that type of sequence, a list is returned instead. For example .mono [mapcar list "ab" "12"] .onom returns .codn "((#\ea #\e1) (#\eb #\e2))" , because a string cannot hold lists of characters. However .mono [mappend list "ab" "12"] .onom returns .strn "a1b2" . The lazy versions of these functions such as .code mapcar* do not have this behavior; they produce lazy lists. .SS* Generalization of Iteration \*(TL implements a unified paradigm for iterating over sequence-like container structures and abstract spaces such as bounded and unbounded ranges of integers. This concept is based around an iterator abstraction which is directly compatible with Lisp cons-cell traversal in the sense that when iteration takes place over lists, the iterator instance is nothing but a cons cell. An iterator is created using the constructor function .code iter-begin which takes a single argument. The argument denotes a space to be traversed; the iterator provides the means for that traversal. When the .code iter-begin function is applied to a list (a .code cons cell or the .code nil object), the return value is that object itself. The remaining functions in the iterator API then behave like aliases for list processing functions. The .code iter-more function behaves like .codn identity , .code iter-item behaves like .code car and .code iter-step behaves like .codn cdr .
For example, the following loops not only produce identical behavior, but the .code iter variable steps through the .code cons cells in the same manner in both: .verb ;; print all symbols in the list (a b c d): (let ((iter '(a b c d))) (while iter (prinl (car iter)) (set iter (cdr iter)))) ;; likewise: (let ((iter (iter-begin '(a b c d)))) (while (iter-more iter) (prinl (iter-item iter)) (set iter (iter-step iter)))) .brev There are three important differences. Firstly, both examples will still work if the list .code "(a b c d)" is replaced by a different kind of sequence, such as the string .str abcd or the vector .codn "#(a b c d)" . However, the former example will not execute efficiently on these objects. The reason is that the .code cdr function will construct successive suffixes of the string or vector object. That not only requires the allocation of memory, but also changes the running time complexity of the loop from linear to quadratic. Secondly, the former example with .cod3 car / cdr will not work correctly if the sequence is an empty non-list sequence, like the null string or empty vector. Rectifying this problem requires the .code nullify function to be used: .verb ;; print all characters of the string "abcd": (let ((iter (nullify "abcd"))) (while iter (prinl (car iter)) (set iter (cdr iter)))) .brev The .code nullify function converts empty sequences of all kinds into the empty list .codn nil . Thirdly, the second example will work even if the input list is replaced with certain objects which are not sequences at all: .verb ;; Print the integers from 0 to 3 (let ((iter (iter-begin 0..4))) (while (iter-more iter) (prinl (iter-item iter)) (set iter (iter-step iter)))) ;; Print incrementing integers starting at 1, ;; breaking out of the loop after 100.
(let ((iter (iter-begin 1))) (while (iter-more iter) (if (eql 100 (prinl (iter-item iter))) (return)) (set iter (iter-step iter)))) .brev In \*(TL, numerous functions that appear as list processing functions in other contemporary Lisp dialects, and historically, are actually sequence processing functions based on the above iterator paradigm. .SS* Callable Objects In \*(TL, sequences (strings, vectors and lists) as well as hashes and regular expressions can be used as functions everywhere, not just with the DWIM brackets. Sequences work as one- or two-argument functions. With a single argument, an element is selected by position and returned. With two arguments, a range is extracted and returned. Moreover, when a sequence is used as a function of one argument, and the argument is a range object rather than an integer, then the call is equivalent to the two-argument form. This is the basis for array slice syntax like .mono ["abc" 0..1] . .onom Hashes also work as one or two argument functions, corresponding to the arguments of the gethash function. A regular expression behaves as a one, two, or three argument function, which operates on a string argument. It returns the leftmost matching substring, or else .codn nil . Structure objects are callable if they implement the .code lambda method. Integers and ranges are callable like functions. They take one argument, which must be a sequence or hash. An integer selects the corresponding element position from the sequence, and a range extracts a slice of its argument. .B Example 1: .verb (mapcar "abc" '(2 0 1)) -> (#\ec #\ea #\eb) .brev Here, .code mapcar treats the string .str abc as a function of one argument (since there is one list argument). This function maps the indices .codn 0 , .code 1 and .code 2 to the corresponding characters of string .strn abc . Through this function, the list of integer indices .code "(2 0 1)" is taken to the list of characters .codn "(#\ec #\ea #\eb)" . 
.B Example 2: .verb (call '(1 2 3 4) 1..3) -> (2 3) .brev Here, the shorthand .code "1..3" denotes .codn "(rcons 1 3)" . A range used as an argument to a sequence performs range extraction: taking a slice starting at index 1, up to and not including index 3, as if by the call .codn "(sub '(1 2 3 4) 1 3)" . .B Example 3: .verb (call '(1 2 3 4) '(0 2)) -> (1 3) .brev A sequence applied to a list of index arguments is equivalent to using the .code select function, as if .code "(select '(1 2 3 4) '(0 2))" were called. .B Example 4: .verb (call #/b./ "abcd") -> "bc" .brev Here, the regular expression, called as a function, finds the matching substring .str bc within the argument .strn abcd . .B Example 5: .verb [1 "abcd"] -> #\eb ["abcd" 1] -> #\eb .brev An integer used as a function indexes into a sequence. This produces the same result as when the sequence is used as a function with an integer argument. .B Example 6: .verb [1..3 '(a b c d)] -> (b c) ['(a b c d) 1..3] -> (b c) .brev A range used as a function extracts a slice of its argument. .SS* Special Variables Similarly to Common Lisp, \*(TL is lexically scoped by default, but also has dynamically scoped (a.k.a. "special") variables. When a variable is defined with .code defvar or .codn defparm , a binding for the symbol is introduced in the global namespace, regardless of the scope in which the .code defvar form occurs. Furthermore, at the time the .code defvar form is evaluated, the symbol which names the variable is tagged as special. When a symbol is tagged as special, it behaves differently when it is used in a lexical binding construct like .codn let , and all other such constructs such as function parameter lists. Such a binding is not the usual lexical binding, but a "rebinding" of the global variable. Over the dynamic scope of the form, the global variable takes on the value given to it by the rebinding. When the form terminates, the prior value of the variable is restored.
(This is true no matter how the form terminates; even if by an exception.) Because of this "pervasive special" behavior of a symbol that has been used as the name of a global variable, a good practice is to make global variables have visually distinct names via the "earmuffs" convention: beginning and ending the name with an asterisk. .TP* "Example:" .verb (defvar *x* 42) ;; *x* has a value of 42 (defun print-x () (format t "~a\en" *x*)) (let ((*x* "abc")) ;; this overrides *x* (print-x)) ;; *x* is now "abc" and so that is printed (print-x) ;; *x* is 42 again and so "42" is printed .brev .TP* "Dialect Note 1:" The terms .I bind and .I binding are used differently in \*(TL compared to ANSI Common Lisp. In \*(TL binding is an association between a symbol and an abstract storage location. The association is registered in some namespace, such as the global namespace or a lexical scope. That storage location, in turn, contains a value. In ANSI Lisp, a binding of a dynamic variable is the association between the symbol and a value. It is possible for a dynamic variable to exist, and not have a value. A value can be assigned, which creates a binding. In \*(TL, an assignment is an operation which transfers a value into a binding, not one which creates a binding. In ANSI Lisp, a dynamic variable can exist which has no value. Accessing the value signals a condition, but storing a value is permitted; doing so creates a binding. By contrast, in \*(TL a global variable cannot exist without a value. If a .code defvar form doesn't specify a value, and the variable doesn't exist, it is created with a value of .codn nil . .TP* "Dialect Note 2:" Unlike ANSI Common Lisp, \*(TL has global lexical variables in addition to special variables. These are defined using .code defvarl and .codn defparml . The only difference is that when variables are introduced by these macros, the symbols are not marked special, so their binding in lexical scopes is not altered to dynamic binding. 
Many variables in \*(TL's standard library are global lexicals. Those which are special variables obey the "earmuffs" convention in their naming. For instance .codn s-ifmt , .code log-emerg and .code sig-hup are global lexicals, because they provide constant values for which overriding doesn't make sense. On the other hand the standard output stream variable .code *stdout* is special. Overriding it over a dynamic scope is useful, as a means of redirecting the output of functions which write to the .code *stdout* stream. .TP* "Dialect Note 3:" In Common Lisp, .code defparm is known as .codn defparameter . .SS* Syntactic Places and Accessors The \*(TL feature known as .I syntactic places allows programs to use the syntax of a form which is used to .I access a value from an environment or object, as an expression which denotes a .I place where a value may be .I stored. They are almost exactly the same concept as "generalized references" in Common Lisp, and are related to "lvalues" in languages in the C family, or "designators" in Pascal. .NP* Symbolic Places A symbol is a syntactic place if it names a variable. If .code a is a variable, then it may be assigned using the .code set operator: the form .code "(set a 42)" causes .code a to have the integer value 42. .NP* Compound Places A compound expression can be a syntactic place, if its leftmost constituent is a symbol which is specially registered, and if the form has the correct syntax for that kind of place, and suitable semantics. Such an expression is a compound place. An example of a compound place is a .code car form. If .code c is an expression denoting a .code cons cell, then .code "(car c)" is not only an expression which retrieves the value of the .code car field of the cell. It is also a syntactic place which denotes that field as a storage location. Consequently, the expression .mono (set (car c) "abc") .onom stores the character string .str "abc" in that location.
Although the same effect can be obtained with .mono (rplaca c "abc") .onom the syntactic place frees the programmer from having to remember different update functions for different kinds of places. There are various other advantages. \*(TL provides a plethora of operators for modifying a place in addition to .codn set . Subject to certain usage restrictions, these operators work uniformly on all places. For instance, the expression .code "(rotate (car x) [str 3] y)" causes three different kinds of places to exchange contents, while the three expressions denoting those places are evaluated only once. New kinds of place update macros like .code rotate are quite easily defined, as are new kinds of compound places. .NP* Accessor Functions When a function call form such as the above .code "(car x)" is a syntactic place, then the function is called an .IR accessor . This term is used throughout this document to denote functions which have associated syntactic places. .NP* Macro Call Syntactic Places Syntactic places can be macros (global and lexical), including symbol macros. So for instance in .code "(set x 42)" the .code x place can actually be a symbolic macro which expands to, say, .codn "(cdr y)" . This means that the assignment is effectively .codn "(set (cdr y) 42)" . .NP* User-Defined Syntactic Places and Place Operators Syntactic places, as well as operators upon syntactic places, are both open-ended. Code can be written quite easily in \*(TL to introduce new kinds of places, as well as new place-mutating operators. New places can be introduced with the help of the .codn defplace , .code define-accessor or .code defset macros, or possibly the .code define-place-macro macro in simple cases when a new syntactic place can be expressed as a transformation to the syntax of an existing place. Three ways exist for developing new place update macros (place operators). 
They can be written using the ordinary macro definer .codn defmacro , with the help of special utility macros called .codn with-update-expander , .codn with-clobber-expander , and .codn with-delete-expander . They can also be written using .code defmacro in conjunction with the operators .code placelet or .codn placelet* . Simple update macros similar to .code inc and .code push can be written compactly using .codn define-modify-macro . .NP* Deletable Places Unlike generalized references in Common Lisp, \*(TL syntactic places support the concept of deletion. Some kinds of places can be deleted, which is an action distinct from (but does not preclude) being overwritten with a value. What exactly it means for a place to be deleted, or whether that is even permitted, depends on the kind of place. For instance a place which denotes a lexical variable may not be deleted, whereas a global variable may be. A place which denotes a hash-table entry may be deleted, which results in the entry being removed from the hash table. Deleting a place in a list causes the trailing items, if any, or else the terminating atom, to move in to close the gap. Users may define new kinds of places which support deletion semantics. .NP* Evaluation of Places To bring about their effect, place operators must evaluate one or more places. Moreover, some of them evaluate additional forms which are not places. Which arguments of a place operator form are places and which are ordinary forms depends on its specific syntax. For all the built-in place operators, the position of an argument in the syntax determines whether it is treated as (and consequently required to be) a syntactic place, or whether it is an ordinary form. All built-in place operators perform the evaluation of place and non-place argument forms in strict left-to-right order. Place forms are evaluated not in order to compute a value, but in order to determine the storage location.
In addition to determining a storage location, the evaluation of a place form may possibly give rise to side effects. Once a place is fully evaluated, the storage location can then be accessed. Access to the storage location is not considered part of the evaluation of a place. To determine a storage location means to compute some hidden referential object which provides subsequent access to that location without the need for a reevaluation of the original place form. (The subsequent access to the place through this referential object may still require a multi-step traversal of a data structure; minimizing such steps is a matter of optimization.) Place forms may themselves be compounds, which contain subexpressions that must be evaluated. All such evaluation for the built-in places takes place in left to right order. Certain place operators, such as .code shift and .codn rotate , exhibit an unspecified behavior with regard to the timing of the access of the prior value of a place, relative to the evaluation of places which occur later in the same place operator form. Access to the prior values may be delayed until the entire form is evaluated, or it may be interleaved into the evaluation of the form. For example, in the form .codn "(shift a b c 1)" , the prior value of .code a can be accessed and saved as soon as .code a is evaluated, prior to the evaluation of .codn b . Alternatively, .code a may be accessed and saved later, after the evaluation of .code b or after the evaluation of all the forms. This issue affects the behavior of place-modifying forms whose subforms contain side effects. It is recommended that such forms not be used in programs. .NP* Nested Places Certain place forms are required to have one or more arguments which are themselves places. The prime example of this, and the only example from among built-in syntactic places, are DWIM forms. 
A DWIM form has the syntax .mono .mets (dwim < obj-place < index <> [ alt ]) .onom and the square-bracket-notation equivalent: .mono .mets >> [ obj-place < index <> [ alt ]] .onom Note that not only is the entire form a place, denoting some element or element range of .metn obj-place , but there is the added constraint that .meta obj-place must also itself be a syntactic place. This requirement is necessary, because it supports the behavior that when the element or element range is updated, then .meta obj-place is also potentially updated. After the assignment .mono (set [obj 0..3] '("forty" "two")) .onom not only is the range of places denoted by .code "[obj 0..3]" replaced by the list of strings .mono ("forty" "two") .onom but .code obj may also be overwritten with a new value. This behavior is necessary because the DWIM brackets notation maintains the illusion of an encapsulated array-like container over several dissimilar types, including Lisp lists. But Lisp lists do not behave as fully encapsulated containers. Some mutations on Lisp lists return new objects, which then have to be stored (or otherwise accepted) in place of the original objects in order to maintain the array-like container illusion. .NP* Built-In Syntactic Places The following is a summary of the built-in place forms, in addition to symbolic places denoting variables. New syntactic place forms can be defined by \*(TX programs. .mono .mets (car << object ) .mets (first << object ) .mets (rest << object ) .mets (second << object ) .mets (third << object ) .mets ... .mets (tenth << object ) .mets (last < object <> [ num ]) .mets (butlast < object <> [ num ]) .mets (cdr << object ) .mets (caar << object ) .mets (cadr << object ) .mets (cdar << object ) .mets (cddr << object ) .mets ...
.mets (cdddddr << object ) .mets (nthcdr < index << obj ) .mets (nthlast < index << obj ) .mets (butlastn < num << obj ) .mets (nth < index << obj ) .mets (ref < seq << idx ) .mets (sub < sequence >> [ from <> [ to ]]) .mets (vecref < vec << idx ) .mets (chr-str < str << idx ) .mets (gethash < hash < key <> [ alt ]) .mets (hash-userdata << hash ) .mets (dwim < obj-place < index <> [ alt ]) .mets (dwim < integer < obj-place ) ;; integers are callable .mets (dwim < range < obj-place ) ;; ranges are callable .mets (sub-list < obj >> [ from <> [ to ]]) .mets (sub-vec < obj >> [ from <> [ to ]]) .mets (sub-str < str >> [ from <> [ to ]]) .mets >> [ obj-place < index <> [ alt ]] ;; equivalent to dwim .mets >> [ integer < obj-place ] .mets >> [ range < obj-place ] .mets (symbol-value << symbol-valued-form ) .mets (symbol-function << function-name-valued-form ) .mets (symbol-macro << symbol-valued-form ) .mets (fun << function-name ) .mets (force << promise ) .mets (errno) .mets (slot < struct-obj << slot-name-valued-form ) .mets (qref < struct-obj << slot-name ) ;; by macro-expansion to (slot ...) .mets >< struct-obj . slot-name ;; equivalent to qref .mets (sock-peer << socket ) .mets (sock-opt < socket < level < option <> [ ffi-type ]) .mets (carray-sub < carray >> [ from <> [ to ]]) .mets (sub-buf < buf >> [ from <> [ to ]]) .mets (left << node ) .mets (right << node ) .mets (key << node ) .mets (read-once << node ) .onom .NP* Built-In Place-Mutating Operators The following is a summary of the built-in place mutating macros. They are described in detail in their own sections. .meIP (set >> { place << new-value }*) Assigns the values of expressions to places, performing assignments in left-to-right order, returning the value assigned to the rightmost place. .meIP (pset >> { place << new-value }*) Assigns the values of expressions to places, performing the determination of places and evaluation of the expressions left to right, but the assignment in parallel. 
Returns the value assigned to the rightmost place. .meIP (zap < place <> [ new-value ]) Assigns .meta new-value to place, defaulting to .codn nil , and returns the prior value. .meIP (flip << place ) Logically toggles the Boolean value of .metn place , and returns the new value. .meIP (test-set << place ) If .meta place contains .codn nil , stores .code t into the place and returns .code t to indicate that the store took place. Otherwise does nothing and returns .codn nil . .meIP (test-clear << place ) If .meta place contains a Boolean true value, stores .code nil into the place and returns .code t to indicate that the store took place. Otherwise does nothing and returns .codn nil . .meIP (compare-swap < place < cmp-fun < cmp-val << store-val ) Examines the value of .meta place and compares it to .meta cmp-val using the comparison function given by the function name .metn cmp-fun . If the comparison is false, returns .codn nil . Otherwise, stores the .meta store-val value into .meta place and returns .codn t . .meIP (ensure < place << init-expr ) If the place is .codn nil , evaluates .codn init-expr , stores that value into .meta place and returns it. Otherwise, returns the value of .meta place without changing its value or evaluating .codn init-expr . .meIP (inc < place <> [ delta ]) Increments .meta place by .metn delta , which defaults to 1, and returns the new value. .meIP (dec < place <> [ delta ]) Decrements .meta place by .metn delta , which defaults to 1, and returns the new value. .meIP (pinc < place <> [ delta ]) Increments .meta place by .metn delta , which defaults to 1, and returns the old value. .meIP (pdec < place <> [ delta ]) Decrements .meta place by .metn delta , which defaults to 1, and returns the old value. .meIP (test-inc < place >> [ delta <> [ from-val ]]) Increments .meta place by .meta delta and returns .code t if the previous value was .code eql to .metn from-val , where .meta delta defaults to 1 and .meta from-val defaults to zero. 
.meIP (test-dec < place >> [ delta <> [ to-val ]]) Decrements .meta place by .meta delta and returns .code t if the new value is .code eql to .metn to-val , where .meta delta defaults to 1 and .meta to-val defaults to 0. .meIP (swap < left-place << right-place ) Exchanges the values of .meta left-place and .metn right-place . .meIP (push < item << place ) Adds .meta item to the front of the list which is currently stored in .metn place , then stores the extended list back into .meta place and returns it. .meIP (pop << place ) Pops the list stored in .meta place and returns the popped value. .meIP (shift << place + << shift-in-value) Treats one or more places as a "multi-place shift register". Values are shifted to the left among the places. The rightmost place receives .metn shift-in-value , and the value of the leftmost place emerges as the return value. .meIP (rotate << place *) Treats zero or more places as a "multi-place rotate register". The places exchange values among themselves, by a rotation by one place to the left. The value of the leftmost place goes to the rightmost place, and that value is returned. .meIP (del << place ) Deletes a place which supports deletion, and returns the value which existed in that place prior to deletion. .meIP (lset <> { place }+ << sequence ) Sets multiple places to values obtained from successive elements of .metn sequence . .meIP (upd < place << opip-arg *) Applies an .codn opip -style operational pipeline to the value of .meta place and stores the result back into .metn place . .meIP (set-mask < place << integer *) Sets to 1 the bits in .meta place corresponding to bits that are equal to 1 in the mask made up of the .meta integer arguments (by combining them together with the inclusive or operation).
.meIP (clear-mask < place << integer *) Clears (sets to 0) the bits in .meta place corresponding to bits that are equal to 1 in the mask made up of the .meta integer arguments (by combining them together with the inclusive or operation). .PP .SS* Namespaces and Environments \*(TL is a Lisp-2 dialect: it features separate namespaces for functions and variables. .NP* Global Functions and Operator Macros In \*(TL, global functions and operator macros coexist, meaning that the same symbol can be defined as both a macro and a function. There is a global namespace for functions, into which functions can be introduced with the .code defun macro. The global function environment can be inspected and modified using the .code symbol-function accessor. There is a global namespace for macros, into which macros are introduced with the .code defmacro macro. The global macro environment can be inspected and modified using the .code symbol-macro accessor. If a name .code x is defined as both a function and a macro, then an expression of the form .code "(x ...)" is expanded by the macro, whereas an expression of the form .code "[x ...]" refers to the function. Moreover, the macro can produce a call to the function. The expression .code "(fun x)" will retrieve the function object. .NP* Global and Dynamic Variables There is a global namespace for variables also. The operators .code defvar and .code defparm introduce bindings into this namespace. These operators have the side effect of marking a symbol as a special variable, so that bindings of the symbol are treated as dynamic variables, subject to rebinding. The global variable namespace together with the special dynamic rebinding is called the dynamic environment. The dynamic environment can be inspected and modified using the .code symbol-value accessor. The operators .code defvarl and .code defparml introduce bindings into the global namespace without marking symbols as special variables. Such bindings are called global lexical variables.
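.PP
The distinction between special variables and global lexical variables can be
illustrated by the following sketch, in which all of the variable and function
names are hypothetical:
.verb
  (defvar *dyn* 1)   ;; special: marked for dynamic rebinding
  (defvarl lex 1)    ;; global lexical: not marked special

  (defun get-dyn () *dyn*)
  (defun get-lex () lex)

  (let ((*dyn* 2)    ;; rebinds the global variable dynamically
        (lex 2))     ;; ordinary lexical binding, invisible to get-lex
    (list (get-dyn) (get-lex)))
  -> (2 1)

  ;; the prior value is restored once the let terminates
  (symbol-value '*dyn*) -> 1
.brev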
.NP* Global Symbol Macros Symbol macros may be defined over the global variable namespace using .codn defsymacro . Note that whereas a symbol may simultaneously have both a function and macro binding in the global namespace, a symbol may not simultaneously have a variable and symbol macro binding. .NP* Lexical Environments In addition to global and dynamic namespaces, \*(TL provides lexically scoped binding for functions, variables, macros, and symbol macros. Lexical variable bindings are introduced with .codn let , .code let* or various binding macros derived from these. Lexical functions are bound with .code flet and .codn labels . Lexical macros are established with .code macrolet and lexical symbol macros with .codn symacrolet . Macros receive an environment parameter with which they may expand forms in their correct environment, and perform some limited introspection over that environment in order to determine the nature of bindings, or the classification of forms in those environments. This introspection is provided by .codn lexical-var-p , .codn lexical-fun-p , and .codn lexical-lisp1-binding . Lexical operator macros and lexical functions can also coexist in the following way. A lexical function shadows a global or lexical macro completely. However, the reverse is not the case. A lexical macro shadows only those uses of a function which look like macro calls. This is succinctly demonstrated by the following form: .verb (flet ((foo () 43)) (macrolet ((foo () 44)) (list (fun foo) (foo) [foo]))) -> (# 44 43) .brev The .code "(fun foo)" and .code [foo] expressions are oblivious to the macro; the macro expansion process leaves the symbol .code foo alone in those contexts. However, the form .code (foo) is subject to macro-expansion and replaced with .codn 44 .
If the .code flet and .code macrolet are reversed, the behavior is different: .verb (macrolet ((foo () 44)) (flet ((foo () 43)) (list (fun foo) (foo) [foo]))) -> (# 43 43) .brev All three forms refer to the function, which lexically shadows the macro. .NP* Pattern Language and Lisp Scope Nesting \*(TL expressions can be embedded in the \*(TX pattern language in various ways. Likewise, the pattern language can be invoked from \*(TL. This brings about the possibility that Lisp code attempts to access pattern variables bound in the pattern language. The \*(TX pattern language can also attempt to access \*(TL variables. The rules are as follows, but they have undergone historic changes. See the COMPATIBILITY section, in particular notes under 138 and 121, and also 124. A Lisp expression evaluated from the \*(TX pattern language executes in a null lexical environment. The current set of pattern variables captured up to that point by the pattern language are installed as dynamic variables. They shadow any Lisp global variables (whether those are defined by .code defvar or .codn defvarl ). In the reverse direction, a variable reference from the \*(TX pattern language searches the pattern variable space first. If a variable doesn't exist there, then the lookup refers to the \*(TL global variable space. The pattern language doesn't see Lisp lexical variables. When Lisp code is evaluated from the pattern language, the pattern variable bindings are not only installed as dynamic variables for the sake of their visibility from Lisp, but they are also specially stored in a dynamic environment frame. When \*(TX pattern code is reentered from Lisp, these bindings are picked up from the closest such environment frame, allowing the nested invocation of pattern code to continue with the bindings captured by outer pattern code. 
Concisely, in any context in which a symbol has both a binding as a Lisp global variable as well as a pattern variable, that symbol refers to the pattern variable. Pattern variables are propagated through Lisp evaluation into nested invocations of the pattern language. The pattern language can also reference Lisp variables using the .code @ prefix, which is a consequence of that prefix introducing an expression that is evaluated as Lisp, the name of a variable being such an expression. .SH* LISP OPERATOR, FUNCTION AND MACRO REFERENCE .SS* Conventions The following sections list all of the special operators, macros and functions in \*(TL. In these sections, syntax is indicated using these conventions: .meIP < word .ie n \{\ A symbol in angle brackets .\} .el \{\ A symbol in .meta fixed-width-italic font .\} denotes some syntactic unit: it may be a symbol or compound form. The syntactic unit is explained in the corresponding Description section. .meIP {syntax}* << word * This indicates a repetition of zero or more of the given syntax enclosed in the braces or syntactic unit. The curly braces may be omitted if the scope of the .code * is clear. .meIP {syntax}+ << word + This indicates a repetition of one or more of the given syntax enclosed in the braces or syntactic unit. The curly braces may be omitted if the scope of the .code + is clear. .coIP {syntax | syntax | ...} This indicates a single, mandatory element, which is selected from among the indicated alternatives. May be combined with .code + or .code * repetition. .meIP [syntax] <> [ word ] Square brackets indicate optional syntax. .meIP [syntax | syntax | ...] Square brackets containing piped elements indicate an optional element, which, if present, must be chosen from among the indicated alternatives. .coIP '[' ']' The quoted square brackets indicate literal brackets which appear in the syntax, which they do without quotes. 
For instance .code "'['foo [ bar ]']'" is a pattern which denotes the two possible expressions .code "[foo]" and .codn "[foo bar]" . .meIP syntax -> < result The arrow notation is used in examples to indicate that the evaluation of the given syntax produces a value, whose printed representation is .metn result . .SS* Form Evaluation A compound expression with a symbol as its first element, if intended to be evaluated, denotes either an operator invocation or a function call. This depends on whether the symbol names an operator or a function. When the form is an operator invocation, the interpretation of the meaning of that form is under the complete control of that operator. If the compound form is a function call, the remaining forms, if any, denote argument expressions to the function. They are evaluated in left-to-right order to produce the argument values, which are passed to the function. An exception is thrown if there are not enough arguments, or too many. Programs can define named functions using .codn defun . Some operators are macros. There exist predefined macros in the library, and macro operators can also be user-defined using the macro-defining operator .codn defmacro . Operators that are not macros are called special operators. Macro operators work as functions which are given the source code of the form. They analyze the form, and translate it to another form which is substituted in their place. This happens during a code walking phase called the expansion phase, which is applied to each top-level expression prior to evaluation. All macros occurring in a form are expanded in the expansion phase, and subsequent evaluation takes place on a structure which is devoid of macros. All that remains are the executable forms of special operators, function calls, symbols denoting either variables or themselves, and atoms such as numeric and string literals.
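.PP
The translation performed by a macro can be observed with the
.code macroexpand
function. The following is a minimal sketch, in which the macro name
.code swap-args
is hypothetical:
.verb
  ;; hypothetical macro: rewrites (swap-args f a b) into (f b a)
  (defmacro swap-args (f a b)
    ^(,f ,b ,a))

  ;; the macro receives the source code of the form and returns new code
  (macroexpand '(swap-args - 1 10)) -> (- 10 1)

  ;; evaluation then takes place on the expanded, macro-free form
  (swap-args - 1 10) -> 9
.brev
.PP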
Special operators can also perform code transformations during the expansion phase, but that is not considered macroexpansion, but rather an adjustment of the representation of the operator into a required executable form. In effect, it is a post-macro compilation phase. Note that Lisp forms occurring in the \*(TX pattern language are not individual top-level forms. Rather, the entire \*(TX query is parsed at the same time, and the macros occurring in its Lisp forms are expanded at that time. .coNP Operator @ quote .synb .mets (quote << form ) .syne .desc The .code quote operator, when evaluated, suppresses the evaluation of .metn form , and instead returns .meta form itself as an object. For example, if .meta form is a symbol .metn sym , then the value of .mono .meti (quote << sym ) .onom is .meta sym itself. Without .codn quote , .meta sym would evaluate to the value held by the variable which is named .metn sym , or else throw an error if there is no such variable. The .code quote operator never raises an error, if it is given exactly one argument, as required. The notation .mono .meti >> ' obj .onom is translated to the object .mono .meti (quote << obj ) .onom providing a shorthand for quoting. Likewise, when an object of the form .mono .meti (quote << obj ) .onom is printed, it appears as .codn 'obj . .TP* Example: .verb ;; yields symbol a itself, not value of variable a (quote a) -> a ;; yields three-element list (+ 2 2), not 4. (quote (+ 2 2)) -> (+ 2 2) .brev .SS* Variable Binding Variables are associations between symbols and storage locations which hold values. These associations are called .IR bindings . Bindings are held in a context called an .IR environment . .I Lexical environments hold local variables, and nest according to the syntactic structure of the program. Lexical bindings are always introduced by some form known as a .IR "binding construct" , and the corresponding environment is instantiated during the evaluation of that construct.
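As a small sketch of how lexical environments nest, consider two
.code let
constructs, one inside the other. The variable name
.code x
and the values are arbitrary choices for this illustration. The inner
binding is visible only within the inner construct, and the outer binding
is visible again once that construct has been left.
.verb
  (let ((x 1))
    (list x
          (let ((x 2)) x)  ;; inner binding shadows the outer one
          x))              ;; outer binding is visible again
  -> (1 2 1)
.brev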
There also exist bindings outside of any binding construct, in the so-called .IR "global environment" . Bindings in the global environment can be temporarily shadowed by lexically-established binding in the .IR "dynamic environment" . See the Special Variables section above. Certain special symbols cannot be used as variable names, namely the symbols .code t and .codn nil , and all of the keyword symbols (symbols in the keyword package), which are denoted by a leading colon. When any of these symbols is evaluated as a form, the resulting value is that symbol itself. It is said that these special symbols are self-evaluating or self-quoting, similarly to all other atom objects such as numbers or strings. When a form consisting of a symbol, other than the above special symbols, is evaluated, it is treated as a variable, and yields the value of the variable's storage location. If the variable doesn't exist, an exception is thrown. Note: symbol forms may also denote invocations of symbol macros. (See the operators .code defsymacro and .codn symacrolet ). All macros, including symbol macros, which occur inside a form are fully expanded prior to the evaluation of a form, therefore evaluation does not consider the possibility of a symbol being a symbol macro. .coNP Operator @ defvar and Macro @ defparm .synb .mets (defvar < sym <> [ value ]) .mets (defparm < sym << value ) .syne .desc The .code defvar operator binds a name in the variable namespace of the global environment. Binding a name means creating a binding: recording, in some namespace of some environment, an association between a name and some named entity. In the case of a variable binding, that entity is a storage location for a value. The value of a variable is that which has most recently been written into the storage location, and is also said to be a value of the binding, or stored in the binding. 
If the variable named .meta sym already exists in the global environment, the form has no effect; the .meta value form is not evaluated, and the value of the variable is unchanged. If the variable does not exist, then a new binding is introduced, with a value given by evaluating the .meta value form. If the form is absent, the variable is initialized to .codn nil . The .meta value form is evaluated in the environment in which the .code defvar form occurs, not necessarily in the global environment. The symbols .code t and .code nil may not be used as variables, nor can keyword symbols (symbols denoted by a leading colon). In addition to creating a binding, the .code defvar operator also marks .meta sym as the name of a special variable. This changes what it means to bind that symbol in a lexical binding construct such as the .code let operator, or a function parameter list. See the section "Special Variables" far above. The .code defparm macro behaves like .code defvar when a variable named .meta sym doesn't already exist. If .meta sym already denotes a variable binding in the global namespace, .code defparm evaluates the .meta value form and assigns the resulting value to the variable. The following equivalence holds: .verb (defparm x y) <--> (prog1 (defvar x) (set x y)) .brev The .code defvar and .code defparm forms return .metn sym . .coNP Macros @ defvarl and @ defparml .synb .mets (defvarl < sym <> [ value ]) .mets (defparml < sym << value ) .syne .desc The .code defvarl and .code defparml macros behave, respectively, almost exactly like .code defvar and .codn defparm . The difference is that these operators do not mark .meta sym as special. If a global variable .meta sym does not previously exist, then after the evaluation of either of these forms .mono .meti (boundp << sym ) .onom is true, but .mono .meti (special-var-p << sym ) .onom isn't.
If .meta sym had been already introduced as a special variable, it stays that way after the evaluation of .code defvarl or .codn defparml . .coNP Operators @ let and @ let* .synb .mets (let >> ({ sym | >> ( sym << init-form )}*) << body-form *) .mets (let* >> ({ sym | >> ( sym << init-form )}*) << body-form *) .syne .desc The .code let and .code let* operators introduce a new scope with variables and evaluate forms in that scope. The operator symbol, either .code let or .codn let* , is followed by a list which can contain any mixture of .meta sym or .mono .meti >> ( sym << init-form ) .onom pairs. Each .meta sym must be a symbol, and specifies the name of a variable to be instantiated and initialized. The .mono .meti >> ( sym << init-form ) .onom variant specifies that the new variable .meta sym receives an initial value from the evaluation of .metn init-form . The plain .meta sym variant specifies a variable which is initialized to .codn nil . The .metn init-form s are evaluated in order, by both .code let and .codn let* . The symbols .code t and .code nil may not be used as variables, and neither can keyword symbols: symbols denoted by a leading colon. The difference between .code let and .code let* is that in .codn let* , later .metn init-form s are in the scope of the variables established earlier in the same .code let* construct. In plain .codn let , the .metn init-form s are evaluated in a scope which does not include any of the variables. When the variables are established, the .metn body-form s are evaluated in order. The value of the last .meta body-form becomes the return value of the .codn let . If there are no .metn body-form s, then the return value .code nil is produced. The list of variables may be empty. The list of variables may contain duplicate .metn sym s if the operator is .codn let* . In that situation, a given .meta init-form has in scope the rightmost duplicate of any given .meta sym that has been previously established.
The .metn body-form s have in scope the rightmost duplicate of any .meta sym in the construct. Therefore, the following form calculates the value 3: .verb (let* ((a 1) (a (succ a)) (a (succ a))) a) .brev Each duplicate is a separately instantiated binding, and may be independently captured by a lexical closure placed in a subsequent .codn init-form : .verb (let* ((a 0) (f1 (lambda () (inc a))) (a 0) (f2 (lambda () (inc a)))) (list [f1] [f1] [f1] [f2] [f2] [f2])) --> (1 2 3 1 2 3) .brev The preceding example shows that there are two mutable variables named .code a in independent scopes, each respectively captured by the separate closures .code f1 and .codn f2 . Three calls to .code f1 increment the first .code a while the second .code a retains its initial value. Under .codn let , the behavior of duplicate variables is unspecified. Implementation note: the \*(TX compiler diagnoses and rejects duplicate symbols in .code let whereas the interpreter ignores the situation. When the name of a special variable is specified in .code let or .codn let* , a new binding is created for it in the dynamic environment, rather than the lexical environment. In .codn let* , later .metn init-form s are evaluated in a dynamic scope in which previous dynamic variables are established, and later dynamic variables are not yet established. A special variable may appear multiple times in a .codn let* , just like a lexical variable. Each duplicate occurrence extends the dynamic environment with a new dynamic binding. All these dynamic environments are removed when the .code let or .code let* form terminates. Dynamic environments aren't captured by lexical closures, but are captured in delimited continuations.
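The dynamic binding of a special variable by
.code let
can be sketched as follows. The names
.code *depth*
and
.code show-depth
are hypothetical, chosen only for this illustration. Because
.code *depth*
is introduced by
.codn defvar ,
the
.code let
rebinds it in the dynamic environment, so that a function called from the
body sees the new value even though it is defined outside of the
.codn let .
.verb
  ;; hypothetical special variable and a function which reads it
  (defvar *depth* 0)
  (defun show-depth () *depth*)

  ;; the binding is dynamic: visible to functions called from the body
  (let ((*depth* 1))
    (show-depth)) -> 1

  ;; the dynamic binding is removed when the let terminates
  (show-depth) -> 0
.brev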
.TP* Examples: .verb (let ((a 1) (b 2)) (list a b)) -> (1 2) (let* ((a 1) (b (+ a 1))) (list a b (+ a b))) -> (1 2 3) (let ()) -> nil (let (:a nil)) -> error, :a and nil can't be used as variables .brev .TP* "Rationale:" \*(TL follows ANSI Common Lisp in making .code let the parallel binding construct, and .code let* the sequential one. In that language, the situation exists for historic reasons: mainly that .code let was initially understood as being a macro for an immediately-called .code lambda where the parameters come into existence simultaneously, receiving the evaluated values of all the argument expressions. The need for sequential binding was recognized later, by which time .code let was cemented as a parallel binding construct. There are very good arguments for, in a new design, using the .code let name for the construct which has sequential semantics. Nevertheless, in this matter, \*(TL remains compatible with dialects like ANSI CL and Emacs Lisp. .coNP Operator @ progv .synb .mets (progv < symbols-expr < values-expr << body-form *) .syne .desc The .code progv operator binds dynamic variables, and evaluates the .metn body-form s in the dynamic scope of those bindings. The bindings are removed when the form terminates. The result value is that of the last .meta body-form or else .code nil if there are no forms. The .meta symbols-expr and .meta values-expr are expressions which are evaluated. Their values are expected to be lists, of bindable symbols and arbitrary values, respectively. The symbols coming from one list are bound to the values coming from the other list. If there are more symbols than values, then the extra symbols will appear unbound, as if they were first bound and then hidden using the .code makunbound function. If there are more values than symbols, the extra values are ignored. Note that dynamic binding takes place for the symbols even if they have not been introduced as special variables via .code defvar or .codn defparm . 
However, if those symbols appear as expressions denoting variables inside the .metn body-form s, they will not necessarily be treated as dynamic variables. If they have lexical definitions in scope, those will be referenced. Furthermore, the compiler treats undefined variables as global references, and not dynamic. .TP* Examples: .verb (progv '(a b) '(1 2) (cons a b)) -> (1 . 2) (progv '(x) '(1) (let ((x 4)) (symbol-value 'x))) -> 1 (let ((x 'lexical) (vars (list 'x)) (vals (list 'dynamic))) (progv vars vals (list x (symbol-value 'x)))) --> (lexical dynamic) .brev .SS* Functions .coNP Operator @ defun .synb .mets (defun < name <> ( param * [: << opt-param *] [. << rest-param ]) .mets \ \ << body-form ) .syne .desc The .code defun operator introduces a new function in the global function namespace. The function is similar to a lambda, and has the same parameter syntax and semantics as the .code lambda operator. Note that the above syntax synopsis describes only the canonical parameter syntax which remains after parameter list macros are expanded. See the section Parameter List Macros. Unlike in .codn lambda , the .metn body-form s of a .code defun are surrounded by a block. The name of this block is the same as the name of the function, making it possible to terminate the function and return a value using .mono .meti (return-from < name << value ). .onom For more information, see the definition of the block operator. A function may call itself by name, allowing for recursion. The special symbols .code t and .code nil may not be used as function names. Neither can keyword symbols. It is possible to define methods as well as macros with .codn defun , as an alternative to the .code defmeth and .code defmacro forms. To define a method, the syntax .mono .meti (meth < type << name ) .onom should be used as the argument to the .meta name parameter. 
This gives rise to the syntax .mono .meti (defun (meth < type << name ) < args << form *) .onom which is equivalent to the .mono .meti (defmeth < type < name < args << form *) .onom syntax. Macros can be defined using .mono .meti (macro << name ) .onom as the .meta name parameter of .codn defun . This way of defining a macro doesn't support destructuring; it defines the expander as an ordinary function with an ordinary argument list. To work, the function must accept two arguments: the entire macro call form that is to be expanded, and the macro environment. Thus, the macro definition syntax is .mono .meti (defun (macro << name ) < form < env << form *) .onom which is equivalent to the .mono .meti (defmacro < name (:form < form :env << env ) << form *) .onom syntax. .TP* "Dialect Note:" In ANSI Common Lisp, keywords may be used as function names. In TXR Lisp, they may not. .TP* "Dialect Note:" A function defined by .code defun may coexist with a macro defined by .codn defmacro . This is not permitted in ANSI Common Lisp. .coNP Operator @ lambda .synb .mets (lambda <> ( param * [: << opt-param *] [. << rest-param ]) .mets \ \ << body-form ) .mets (lambda < rest-param .mets \ \ << body-form ) .syne .desc The .code lambda operator produces a value which is a function. Like in most other Lisps, functions are objects in \*(TL. They can be passed to functions as arguments, returned from functions, aggregated into lists, stored in variables, etc. Note that the above syntax synopsis describes only the canonical parameter syntax which remains after parameter list macros are expanded. See the section Parameter List Macros. The first argument of .code lambda is the list of parameters for the function. It may be empty, and it may also be an improper list (dot notation) where the terminating atom is a symbol other than .codn nil . It can also be a single symbol. The second and subsequent arguments are the forms making up the function body. The body may be empty. 
When a function is called, the parameters are instantiated as variables that are visible to the body forms. The variables are initialized from the values of the argument expressions appearing in the function call. The dotted notation can be used to write a function that accepts a variable number of arguments. There are two ways to write a function that accepts only a variable argument list and no required arguments: .mono .mets (lambda (. << rest-param ) ...) .mets (lambda < rest-param ...) .onom (These notations are syntactically equivalent because the list notation .code "(. X)" actually denotes the object .meta X which isn't wrapped in any list). The keyword symbol .code : (colon) can appear in the parameter list. This is the symbol in the keyword package whose name is the empty string. This symbol is treated specially: it serves as a separator between required parameters and optional parameters. Furthermore, the .code : symbol has a role to play in function calls: it can be specified as an argument value to an optional parameter by which the caller indicates that the optional argument is not being specified. It will be processed exactly that way. An optional parameter can also be written in the form .mono .meti >> ( name < expr <> [ sym ]). .onom In this situation, if the call does not specify a value for the parameter, or specifies a value as the .code : (colon) keyword symbol, then the parameter takes on the value of the expression .metn expr . This expression is only evaluated when its value is required. If .meta sym is specified, then .meta sym will be introduced as an additional binding with a Boolean value which indicates whether or not the optional parameter had been specified by the caller. Each .meta expr that is evaluated is evaluated in an environment in which all of the previous parameters are visible, in addition to the surrounding environment of the .codn lambda .
For instance: .verb (let ((default 0)) (lambda (str : (end (length str)) (counter default)) (list str end counter))) .brev In this .codn lambda , the initializing expression for the optional parameter end is .codn "(length str)" , and the .meta str variable it refers to is the previous argument. The initializer for the optional variable counter is the expression default, and it refers to the binding established by the surrounding let. This reference is captured as part of the .codn lambda 's lexical closure. Keyword symbols, and the symbols .code t and .code nil may not be used as parameter names. The behavior is unspecified if the same symbol is specified more than once anywhere in the parameter list, whether as a parameter name or as the indicator .meta sym in an optional parameter or any combination. Implementation note: the \*(TX compiler diagnoses and rejects duplicate symbols in .code lambda whereas the interpreter ignores the situation. Note: it is not always necessary to use the .code lambda operator directly in order to produce an anonymous function. In situations when .code lambda is being written in order to simulate partial evaluation, it may be possible to instead make use of the .code op macro. For instance the function .code "(lambda (. args) [apply + a args])" which adds the values of all of its arguments together, and to the lexically captured variable .code a can be written more succinctly as .codn "(op + a)" . The .code op operator is the main representative of a family of operators: .codn lop , .codn ap , .codn ip , .codn do , .codn ado , .code opip and .codn oand . In situations when functions are simply combined together, the effect may be achieved using some of the available functional combinators, instead of a .codn lambda . For instance chaining together functions as in .code "(lambda (x) (square (cos x)))" is achievable using the .code chain function: .codn "[chain cos square]" . 
The .code opip operator can also be used: .codn "(opip cos square)" . Numerous combinators are available; see the section Partial Evaluation and Combinators. When a function is needed which accesses an object, there are also alternatives. Instead of .code "(lambda (obj) obj.slot)" and .codn "(lambda (obj arg) obj.(slot arg))" , it is simpler to use the .code ".slot" and .code ".(slot arg)" notations. See the section Unbound Referencing Dot. Also see the functions .code umethod and .code uslot as well as the related convenience macros .code umeth and .codn usl . If a function is needed which partially applies, to some arguments, a method invoked on a specific object, the .code method function or .code meth macro may be used. For instance, instead of .codn "(lambda (arg) obj.(method 3 arg))" , it is possible to write .code "(meth obj 3)" except that the latter produces a variadic function. .TP* Examples: The following expression returns a function which captures the variable .codn counter . Whenever the returned function is called, it increments .code counter by one, and returns the incremented value. .verb (let ((counter 0)) (lambda () (inc counter))) .brev The following produces a variadic function which requires at least two arguments. The third and subsequent arguments are aggregated into a list passed as the single parameter .codn z : .verb (lambda (x y . z) (list 'my-arguments-are x y z)) .brev A variadic function with no required arguments. The parameter name for the received arguments appears alone in place of the parameter list. .verb (lambda args (list 'my-list-of-arguments args)) .brev Same as the previous example, using a dotted notation specific to \*(TL. .verb (lambda (. args) (list 'my-list-of-arguments args)) .brev Note that .code "(. args)" is just a written notation equivalent to .code args and not a different object structure. 
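The alternatives to
.code lambda
mentioned in the note above can be sketched in the same way. The variable
.code a
and its value are arbitrary choices for this illustration. No result is
shown for the last three expressions; they are written only to show that
the three notations denote the same composition.
.verb
  ;; (op + a) stands in for (lambda (. args) [apply + a args])
  (let ((a 10))
    [(op + a) 1 2]) -> 13

  ;; one composition written three ways: each calls cos, then square
  [(lambda (x) (square (cos x))) 0]
  [[chain cos square] 0]
  [(opip cos square) 0]
.brev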
Optional arguments: .verb [(lambda (x : y) (list x y)) 1] -> (1 nil) [(lambda (x : y) (list x y)) 1 2] -> (1 2) .brev Passing .code : (colon symbol) to request default value of optional parameter: .verb [(lambda (x : (y 42) z) (list x y z)) 1 2 3] -> (1 2 3) [(lambda (x : (y 42) z) (list x y z)) 1 : 3] -> (1 42 3) [(lambda (x : (y 42) z) (list x y z)) 1] -> (1 42 nil) .brev Presence-indicating variable accompanying optional parameter: .verb [(lambda (x : (y 42 have-y)) (list x y have-y)) 1 2] -> (1 2 t) [(lambda (x : (y 42 have-y)) (list x y have-y)) 1] -> (1 42 nil) ;; defaulting via : is indistinguishable from missing [(lambda (x : (y 42 have-y)) (list x y have-y)) 1 :] -> (1 42 nil) .brev .coNP Macros @ flet and @ labels .synb .mets (flet >> ({( name < param-list << function-body-form *)}*) .mets \ \ << body-form *) .mets (labels >> ({( name < param-list << function-body-form *)}*) .mets \ \ << body-form *) .syne .desc The .code flet and .code labels macros bind local, named functions in the lexical scope. Note that the above syntax synopsis describes only the canonical parameter syntax which remains after parameter list macros are expanded. See the section Parameter List Macros. The difference between .code flet and .code labels is that a function defined by .code labels can see itself, and therefore recurse directly by name. Moreover, if multiple functions are defined by the same labels construct, they all have each other's names in scope of their bodies. By contrast, a .codn flet -defined function does not have itself in scope and cannot recurse. Multiple functions in the same .code flet do not have each other's names in their scopes. More formally, the .metn function-body-form s and .meta param-list of the functions defined by .code labels are in a scope in which all of the function names being defined by that same .code labels construct are visible. 
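The scoping difference can be sketched with two small forms. The function
names
.code greet
and
.code count-down
are hypothetical, chosen only for this illustration. Under
.codn flet ,
a call inside the local function's body to its own name reaches the outer,
global function. Under
.codn labels ,
the same kind of call is a recursive self-reference.
.verb
  ;; hypothetical global function
  (defun greet () 'outer)

  ;; flet: the inner (greet) call refers to the global function
  (flet ((greet () (list 'inner (greet))))
    (greet)) -> (inner outer)

  ;; labels: the inner (count-down ...) call is recursive
  (labels ((count-down (n)
             (if (zerop n)
               nil
               (cons n (count-down (- n 1))))))
    (count-down 3)) -> (3 2 1)
.brev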
Under both .code labels and .codn flet , the local functions that are defined are lexically visible to the main .metn body-form s. Note that .code labels and .code flet are properly scoped with regard to macros. During macro expansion, function bindings introduced by these macro operators shadow macros defined by .code macrolet and .codn defmacro . Furthermore, function bindings introduced by .code labels and .code flet also shadow symbol macros defined by .codn symacrolet , when those symbol macros occur as arguments of a .code dwim form. See also: the .code macrolet operator. .TP* "Dialect Note:" The .code flet and .code labels macros do not establish named blocks around the body forms of the local functions which they bind. This differs from ANSI Common Lisp, whose local functions have implicit named blocks, allowing for .code return-from to be used. .TP* Examples: .verb ;; Wastefully slow algorithm for determining evenness. ;; Note: ;; - mutual recursion between labels-defined functions ;; - inner is-even bound by labels shadows the outer ;; one bound by defun so the (is-even n) call goes ;; to the local function. (defun is-even (n) (labels ((is-even (n) (if (zerop n) t (is-odd (- n 1)))) (is-odd (n) (if (zerop n) nil (is-even (- n 1))))) (is-even n))) .brev .coNP Function @ call .synb .mets (call < function << argument *) .syne .desc The .code call function invokes .metn function , passing it the given arguments, if any. .meta function need not be a function; other kinds of objects can be used in place of functions with various semantics. The details are given in the description of the .code dwim operator.
.TP* Examples: Apply .code lambda to .code "1 2" arguments, adding them to produce .codn 3 : .verb (call (lambda (a b) (+ a b)) 1 2) .brev Useless use of .code call on a named function; equivalent to .codn "(list 1 2)" : .verb (call (fun list) 1 2) .brev .coNP Functions @ apply and @ iapply .synb .mets (apply < function <> [ arg * << trailing-args ]) .mets (iapply < function <> [ arg * << trailing-args ]) .syne .desc The .code apply function invokes .metn function , optionally passing to it an argument list. The return value of the .code apply call is that of .metn function . If no arguments are present after .metn function , then .meta function is invoked without arguments. If one argument is present after .metn function , then it is interpreted as .metn trailing-args . If this is a sequence (a list, vector or string), then the elements of the sequence are passed as individual arguments to .metn function . If .meta trailing-args is not a sequence, then .meta function is invoked with an improper argument list, terminated by the .meta trailing-args atom. If two or more arguments are present after .metn function , then the last of these arguments is interpreted as .metn trailing-args . The previous arguments represent leading arguments. When the argument list is formed to which .meta function is applied, the leading arguments become individual arguments presented in the same order, followed by arguments taken from the .meta trailing-args list. Note that if the .meta trailing-args value is an atom or an improper list, the function is then invoked with an improper argument list. Only a variadic function may be invoked with an improper argument list. Moreover, all of the function's required and optional parameters must be satisfied by elements of the improper list, such that the terminating atom either matches the .meta rest-param directly (see the .code lambda operator) or else the .meta rest-param receives an improper list terminated by that atom.
To treat the terminating atom of an improper list as an ordinary element which can satisfy a required or optional function parameter, the .code iapply function may be used, described next. The .code iapply function ("improper apply") is similar to .codn apply , except with regard to the treatment of .metn trailing-args . Firstly, under .codn iapply , if .meta trailing-args is an atom other than .code nil (possibly a sequence, such as a vector or string), then it is treated as an ordinary argument: .meta function is invoked with a proper argument list, whose last element is .metn trailing-args . Secondly, if .meta trailing-args is a list, but an improper list, then the terminating atom of .meta trailing-args becomes an individual argument. This terminating atom is not split into multiple arguments, even if it is a sequence. Thus, in all possible cases, .code iapply treats an extra .cod2 non- nil atom as an argument, and never calls .meta function with an improper argument list. .TP* Examples: .verb ;; '(1 2 3) becomes arguments to list, thus (list 1 2 3). (apply (fun list) '(1 2 3)) -> (1 2 3) ;; this effectively invokes (list 1 2 3 4) (apply (fun list) 1 2 '(3 4)) -> (1 2 3 4) ;; this effectively invokes (list 1 2 . 3) (apply (fun list) 1 2 3) -> (1 2 . 3) ;; "abc" is separated into characters ;; which become arguments of list (apply (fun list) "abc") -> (#\ea #\eb #\ec) .brev .TP* "Dialect Note:" Note that some uses of this function that are necessary in other Lisp dialects are not necessary in \*(TL. The reason is that in \*(TL, improper list syntax is accepted as a compound form, and performs application: .verb (foo a b . x) .brev Here, the variables .code a and .code b supply the first two arguments for .codn foo . In the dotted position, .code x must evaluate to a list or vector. The list or vector's elements are pulled out and treated as additional arguments for .codn foo . This syntax can only be used if .code x is a symbolic form or an atom.
It cannot be a compound form, because .code "(foo a b . (x))" and .code "(foo a b x)" are equivalent structures. .coNP Operator @ fun .synb .mets (fun << function-name ) .syne .desc The .code fun operator retrieves the function object corresponding to a named function in the current lexical environment. The .meta function-name may be a symbol denoting a named function: a built in function, or one defined by .codn defun . The .meta function-name may also take any of the forms specified in the description of the .code func-get-name function. If such a .meta function-name refers to a function which exists, then the .code fun operator yields that function. Note: the .code fun operator does not see macro bindings via their symbolic names with which they are defined by .codn defmacro . However, the name syntax .mono .meti (macro << name ) .onom may be used to refer to macros. This syntax is documented in the description of .codn func-get-name . It is also possible to retrieve a global macro expander using the function .codn symbol-macro . .coNP Operator @ dwim .synb .mets (dwim << argument *) .mets (set (dwim < obj-place < index <> [ alt ]) << new-value ) .mets (set (dwim >> { integer | << range } << obj-place ) << new-value ) .mets <> '[' argument *']' .mets (set >> '[' obj-place < index <> [ alt ]']' << new-value ) .mets (set >> '[{' integer | << range } << obj-place ']' << new-value ) .syne .desc The .code dwim operator's name is an acronym: DWIM may be taken to mean "Do What I Mean", or alternatively, "Dispatch, in a Way that is Intelligent and Meaningful". The notation .code [...] is a shorthand which denotes .codn "(dwim ...)" . Note that since the .code [ and .code ] are used in this document for indicating optional syntax, in the above Syntax synopsis the quoted notation .code '[' and .code ']' denotes bracket tokens which literally appear in the syntax. 
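As a brief sketch of the two notations just described, the
.code fun
operator can supply a function object to an ordinary function call, and a
bracketed form can be written out as the equivalent
.code dwim
form. The argument values are arbitrary choices for this illustration.
.verb
  ;; fun retrieves the function object named succ
  (mapcar (fun succ) '(1 2 3)) -> (2 3 4)

  ;; the bracket notation and the dwim form denote the same invocation
  [(fun list) 1 2] -> (1 2)
  (dwim (fun list) 1 2) -> (1 2)
.brev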
The .code dwim operator takes a variable number of arguments, which are treated as expressions to be individually macro-expanded and evaluated, using the same rules. This means that the first argument isn't a function name, but an ordinary expression which can simply compute a function object (or, more generally, a callable object). Furthermore, for those arguments of .code dwim which are symbols (after all macro-expansion is performed), the evaluation rules are altered. For the purposes of resolving symbols to values, the function and variable binding namespaces are considered to be merged into a single space, creating a situation that is similar to a Lisp-1 style dialect. This special Lisp-1 evaluation is not recursively applied. All arguments of .code dwim which, after macro expansion, are not symbols are evaluated using the normal Lisp-2 evaluation rules. Thus, the DWIM operator must be used in every expression where the Lisp-1 rules for reducing symbols to values are desired. If a symbol has bindings both in the variable and function namespace in scope, and is referenced by a dwim argument, this constitutes a conflict which is resolved according to two rules. When nested scopes are concerned, then an inner binding shadows an outer binding, regardless of their kind. An inner variable binding for a symbol shadows an outer or global function binding, and vice versa. If a symbol is bound to both a function and variable in the global namespace, then the variable binding is favored. Macros do not participate in the special scope conflation, with one exception. What this means is that the space of symbol macros is not folded together with the space of operator macros. An argument of .code dwim that is a symbol might be a symbol macro, variable or function, but it cannot be interpreted as the name of an operator macro. The exception is this: from the perspective of a .code dwim form, function bindings can shadow symbol macros.
If a function binding is defined in an inner scope relative to a symbol macro for the same symbol, using .code flet or .codn labels , the function hides the symbol macro. In other words, when macro expansion processes an argument of a .code dwim form, and that argument is a symbol, it is treated specially in order to provide a consistent name lookup behavior. If the innermost binding for that symbol is a function binding, it refers to that function binding, even if a more outer symbol macro binding exists, and so the symbol is not expanded using the symbol macro. By contrast, in an ordinary form, a symbolic argument never resolves to a function binding. The symbol refers to either a symbol macro or a variable, whichever is nested closer. If, after macro expansion, the leftmost argument of the .code dwim is the name of a special operator or macro, the .code dwim form doesn't denote an invocation of that operator or macro. A .code dwim form is an invocation of the .code dwim operator, and the leftmost argument of that operator, if it is a symbol, is treated as a binding to be resolved in the variable or function namespace, like any other argument. Thus .code "[if x y]" is an invocation of the .code if function, not the .code if operator. How many arguments are required by the .code dwim operator depends on the type of object to which the first argument expression evaluates. The possibilities are: .RS .meIP >> [ function << argument *] Call the given function object with the given arguments. .meIP >> [ symbol << argument *] If the first expression evaluates to a symbol, that symbol is resolved in the function namespace, and then the resulting function, if found, is called with the given arguments. .meIP >> [ sequence << index ] Retrieve an element from .metn sequence , at the specified .metn index , which is a nonnegative integer. This form is also a syntactic place. If a value is stored to this place, it replaces the element. 
The place may also be deleted, which has the effect of removing the element from the sequence, shifting the elements at higher indices, if any, down one element position, and shortening the sequence by one. If the place is deleted, and if .meta sequence is a list, then the .meta sequence form itself must be a place. This form is implemented using the .code ref accessor such that, except for the argument evaluation semantics of the DWIM brackets, it is equivalent to using the .mono .meti (ref < sequence << index ) .onom syntax. .meIP >> [ sequence << from-index..to-below-index ] Retrieve the specified range of elements. The range of elements is specified in the .code from and .code to fields of a range object. The .code .. (dotdot) syntactic sugar denotes the construction of the range object via the .code rcons function. See the section on Range Indexing below. This form is also a syntactic place. Storing a value in this place has the effect of replacing the subsequence with a new subsequence. Deleting the place has the effect of removing the specified subsequence from .metn sequence . If .meta sequence is a list, then the .meta sequence form must itself be a place. The .meta new-value argument in a range assignment can be a string, vector or list, regardless of whether the target is a string, vector or list. If the target is a string, the replacement sequence must be a string, or a list or vector of characters. The semantics is implemented using the .code sub accessor, such that the following equivalence holds: .verb [seq from..to] <--> (sub seq from..to) .brev For this reason, .meta sequence may be any object that is iterable by .codn iter-begin . .meIP >> [ sequence << index-seq ] Elements of .meta sequence specified by elements of .metn index-seq , are extracted and returned as a sequence of the same kind as .metn sequence . This form is equivalent to .mono .meti (select < sequence << where-index ) .onom except when the target of an assignment operation. 
This form is a syntactic place if .meta sequence is one. If a sequence is assigned to this place, then elements of the sequence are distributed to the specified locations. The following equivalences hold between index-sequence-based indexing and the .code select and .code replace functions, except that .code set always returns the value assigned, whereas .code replace returns its first argument: .verb [seq idx-seq] <--> (select seq idx-seq) (set [seq idx-seq] new) <--> (replace seq new idx-seq) .brev Note that unlike the .code select function, this does not support .mono .meti >> [ hash << index-seq ] .onom because hash keys may be lists, which makes that syntax indistinguishable from a simple hash lookup where .meta index-seq is the key. .meIP >> [ hash < key <> [ alt ]] Retrieve a value from the hash table corresponding to .metn key , or else return .meta alt if there is no such entry. The expression .meta alt is always evaluated, whether or not its value is used. .meIP >> [ search-tree << key ] Retrieves an element from the search tree as if by applying the .code tree-lookup function to .metn key . .meIP >> [ search-tree << from-key..to-below-key ] Retrieves a list of elements from the search tree as if by evaluating the .mono .meti (sub-tree < search-tree < from-key << to-below-key ) .onom expression. .meIP >> [ regex >> [ start <> [ from-end ]] << string ] Determine whether regular expression .meta regex matches .metn string , and in that case return the (possibly empty) leftmost matching substring. Otherwise, return .codn nil . If .meta start is specified, it gives the starting position where the search begins, and if .meta from-end is given, and has a value other than .codn nil , it specifies a search from right to left. These optional arguments have the same conventions and semantics as their equivalents in the .code search-regst function. Note that .meta string is always required, and is always the rightmost argument.
.meIP >> [ struct << arg *] The structure instance .meta struct is inquired whether it supports a method named by the symbol .metn lambda . If so, that method is invoked on the object. The method receives .meta struct followed by the value of every .metn arg . If this form is used as a place, then the object must support a .code lambda-set method. .meIP >> [ carray << index ] .meIP >> [ carray << from-index..to-below-index ] Element and range indexing is possible on objects of type .code carray which manipulate arrays in a foreign ("C language") representation, and are closely associated with the Foreign Function Interface (FFI). Just like in the case of sequences, the semantics of referencing .code carray objects with the bracket notation is based on the functions .codn ref , .codn refset , .code sub and .codn replace . These, in turn, rely on the specialized functions .codn carray-ref , .codn carray-refset , .code carray-sub and .codn carray-replace . .meIP >> [ buf << index ] Indexing is supported for objects of type .codn buf . This provides a way to access and store the individual bytes of a buffer. .meIP >> [ integer << sequence ] If the left argument is an integer, it denotes selection of an element from .metn sequence . The .meta integer value acts as the index into a vector-like or list-like sequence, or a key into a hash table. .meIP >> [ range >> { seq | << ind }] If the left argument is a range, and there is one argument, the semantics is that of the .code rangeref function: either the selection of a point from the range by an integer index .metn ind , or the selection of a subrange of sequence .meta seq according to the endpoints of .metn range . .RE .PP Note that the various above forms are not actually cases of the .code dwim operator but are due to the semantics of the left argument objects being used as functions.
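.PP
Several of the cases listed above can be sketched with small expressions. The data values and the variable
.code h
are hypothetical, chosen only for illustration, and the results in the comments assume the semantics described above:
.verb
  ;; element and range indexing of sequences
  ['(a b c d) 2]            ;; -> c
  [#(10 20 30 40) 1..3]     ;; -> #(20 30)
  ["abcdef" 2..4]           ;; -> "cd"
  ;; selection by a sequence of indices
  ['(a b c d) '(0 2)]       ;; -> (a c)
  ;; an integer in the leftmost position indexes its argument
  [2 '(a b c d)]            ;; -> c
  ;; hash lookup, without and with an alternative value;
  ;; h is a hypothetical variable holding an empty hash table
  (let ((h (hash)))
    (set [h 'x] 42)
    (list [h 'x] [h 'y] [h 'y 0]))   ;; -> (42 nil 0)
.brev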
All of the semantics described above is available in any situation in which an object is used as a function: for instance, as an argument of the .code call or .code apply operators, or the functional argument in .codn mapcar . .TP* "Range Indexing:" Vector and list range indexing is based from zero, meaning that the first element is numbered zero, the second one, and so on. Negative values are allowed; the value .code -1 refers to the last element of the vector or list, and .code -2 to the second last and so forth. Thus the range .code "1 .. -2" means "everything except for the first element and the last two". The symbol .code t represents the position one past the end of the vector, string or list, so .code 0..t denotes the entire list or vector, and the range .code t..t represents the empty range just beyond the last element. It is possible to assign to .codn t..t . For instance: .verb (defvar list '(1 2 3)) (set [list t..t] '(4)) ;; list is now (1 2 3 4) .brev The value zero has a "floating" behavior when used as the end of a range. If the start of the range is a negative value, and the end of the range is zero, the zero is interpreted as being the position past the end of the sequence, rather than the first element. For instance the range .code -1..0 means the same thing as .codn -1..t . Zero at the start of a range always means the first element, so that .code 0..-1 refers to all the elements except for the last one. .TP* Notes: The dwim operator allows for a Lisp-1 flavor of programming in \*(TL, which is principally a Lisp-2 dialect. A Lisp-1 dialect is one in which an expression like .code "(a b)" treats both a and b as expressions subject to the same evaluation rules\(emat least, when .code a isn't an operator or an operator macro. This means that the symbols .code a and .code b are resolved to values in the same namespace. The form denotes a function call if the value of variable .code a is a function object.
Thus in a Lisp-1, named functions do not exist as such: they are just variable bindings. In a Lisp-1, the form .code "(car 1)" means that there is a variable called .codn car , which holds a function, which is retrieved from that variable and applied to the .code 1 argument. In the expression .codn "(car car)" , both occurrences of .code car refer to the variable, and so this form applies the .code car function to itself. It is almost certainly meaningless. In a Lisp-2 .code "(car 1)" means that there is a function called .codn car , in the function namespace. In the expression .code "(car car)" the two occurrences refer to different bindings: one is a function and the other a variable. Thus there can exist a variable .code car which holds a cons-cell object, rather than the .code car function, and the form makes sense. The Lisp-1 approach is useful for functional programming, because it eliminates cluttering occurrences of the call and fun operators. For instance: .verb ;; regular notation (call foo (fun second) '((1 a) (2 b))) ;; [] notation [foo second '((1 a) (2 b))] .brev Lisp-1 dialects can also provide useful extensions by giving a meaning to objects other than functions in the first position of a form, and the .code dwim/[...] syntax does exactly this. \*(TL is a Lisp-2 because Lisp-2 also has advantages. Lisp-2 programs which use macros naturally achieve hygiene because lexical variables do not interfere with the function namespace. If a Lisp-2 program has a local variable called .codn list , this does not interfere with the hidden use of the function .code list in a macro expansion in the same block of code. Lisp-1 dialects have to provide hygienic macro systems to attack this problem. Furthermore, even when not using macros, Lisp-1 programmers have to avoid using the names of functions as lexical variable names, if the enclosing code might use them. The two namespaces of a Lisp-2 also naturally accommodate symbol macros and operator macros. 
Whereas functions and variables can be represented in a single namespace readily, because functions are data objects, this is not so with symbol macros and operator macros, the latter of which are distinguished syntactically by their position in a form. In a Lisp-1 dialect, given .codn "(foo bar)" , either of the two symbols could be a symbol macro, but only .code foo can possibly be an operator macro. Yet, having only a single namespace, a Lisp-1 doesn't permit .codn "(foo foo)" , where .code foo is simultaneously a symbol macro and an operator macro, though the situation is unambiguous by syntax even in Lisp-1. In other words, Lisp-1 dialects do not entirely remove the special syntactic recognition given to the leftmost position of a compound form, yet at the same time they prohibit the user from taking full advantage of it by providing only one namespace. \*(TL provides the "best of both worlds": the DWIM brackets notation provides a model of Lisp-1 computation that is purer than Lisp-1 dialects (since the leftmost argument is not given any special syntactic treatment at all) while the Lisp-2 foundation provides a traditional Lisp environment with its "natural hygiene". .coNP Function @ functionp .synb .mets (functionp << obj ) .syne .desc The .code functionp function returns .code t if .meta obj is a function, otherwise it returns .codn nil . .coNP Function @ copy-fun .synb .mets (copy-fun << function ) .syne .desc The .code copy-fun function produces and returns a duplicate of .metn function , which must be a function. A duplicate of a function is a distinct function object not .code eq to the original function, yet which accepts the same arguments and behaves exactly the same way as the original. If a function contains no captured environment, then a copy made of that function by .code copy-fun is indistinguishable from the original function in every regard, except for being a distinct object that compares unequal to the original under the .code eq function. 
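.PP
The properties stated so far can be sketched as follows. The variables
.code f
and
.code g
are hypothetical names introduced only for this illustration:
.verb
  (let* ((f (lambda (x) (+ x 1)))   ;; hypothetical original function
         (g (copy-fun f)))          ;; its duplicate
    (list (functionp g)    ;; the copy is a function
          (eq f g)         ;; but a distinct object
          [f 1] [g 1]))    ;; with the same behavior
  ;; -> (t nil 2 2)
.brev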
If a function contains a captured environment, then a copy of that function made by .code copy-fun has its own copy of that environment. If the copied function changes the values of captured lexical variables, the original function is not affected by these changes and vice versa. The entire lexical environment is copied; the copy and original function do not share any portion of the environment at any level of nesting. .SS* Sequencing, Selection and Iteration .coNP Operators/Functions @ progn and @ prog1 .synb .mets (progn << form *) .mets (prog1 << form *) .syne .desc The .code progn operator evaluates each .meta form in left-to-right order, and returns the value of the last form. The value of the form .code (progn) is .codn nil . The .code prog1 operator evaluates each .meta form in left-to-right order, and returns the value of the first form. The value of the form .code (prog1) is .codn nil . Various other operators such as .code let also arrange for the evaluation of a body of forms, the value of the last of which is returned. These operators are said to feature an implicit .codn progn . These special operators are also functions. The .code progn function accepts zero or more arguments. It returns its last argument, or .code nil if called with no arguments. The .code prog1 function likewise accepts zero or more arguments. It returns its first argument, or .code nil if called with no arguments. .TP* "Dialect Notes:" In ANSI Common Lisp, .code prog1 requires at least one argument. Neither .code progn nor .code prog1 exist as functions. .coNP Macro/Function @ prog2 .synb .mets (prog2 << form *) .syne .desc The .code prog2 macro evaluates each .meta form in left-to-right order. The value is that of the second form, if present, otherwise it is .codn nil . The form .code "(prog2 1 2 3)" yields .codn 2 . The value of .code "(prog2 1 2)" is also .codn 2 ; .code "(prog2 1)" and .code "(prog2)" yield .codn nil . The .code prog2 symbol also has a function binding.
The .code prog2 function accepts any number of arguments. If invoked with at least two arguments, it returns the second one. Otherwise it returns .codn nil . .TP* "Dialect Notes:" In ANSI Common Lisp, .code prog2 requires at least two arguments. It does not exist as a function. .coNP Operator @ cond .synb .mets (cond >> {( test << form *)}*) .syne .desc The .code cond operator provides a multi-branching conditional evaluation of forms. Enclosed in the cond form are groups of forms expressed as lists. Each group must be a list of at least one form. The forms are processed from left to right as follows: the first form, .metn test , in each group is evaluated. If it evaluates true, then the remaining forms in that group, if any, are also evaluated. Processing then terminates and the result of the last form in the group is taken as the result of .codn cond . If .meta test is the only form in the group, then the result of .meta test is taken as the result of .codn cond . If the first form of a group yields .codn nil , then processing continues with the next group, if any. If all form groups yield .codn nil , then the cond form yields .codn nil . This holds in the case that the syntax is empty: .code (cond) yields .codn nil . .coNP Macros @, caseq @ caseql and @ casequal .synb .mets (caseq < test-form << normal-clause * <> [ else-clause ]) .mets (caseql < test-form << normal-clause * <> [ else-clause ]) .mets (casequal < test-form << normal-clause * <> [ else-clause ]) .syne .desc These three macros arrange for the evaluation of .metn test-form , whose value is then compared against the key or keys in each .metn normal-clause . When the value matches a key, then the remaining forms of .meta normal-clause are evaluated, and the value of the last form is returned; subsequent clauses are not evaluated. If no .meta normal-clause matches, and there is no .metn else-clause , then the value .code nil is returned.
Otherwise, the forms in the .meta else-clause are evaluated, and the value of the last one is returned. If there are no forms, then .code nil is returned. If duplicate keys are present in such a way that the value of the .meta test-form matches multiple .metn normal-clause s, it is unspecified which of those clauses is evaluated. The syntax of a .meta normal-clause takes on these two forms: .mono .mets >> ( key << form *) .onom where .meta key may be an atom which denotes a single key, or else a list of keys. There is a restriction that the symbol .code t may not be used as .metn key . The form .code (t) may be used as a key to match that symbol. The syntax of an .meta else-clause is: .mono .mets (t << form *) .onom which resembles a form that is often used as the final clause in the .code cond syntax. The three forms of the case construct differ in what type of test they apply between the value of .meta test-form and the keys. The .code caseq macro generates code which uses the .code eq function's equality. The .code caseql macro uses .codn eql , and .code casequal uses .codn equal . .TP* Example .verb (let ((command-symbol (casequal command-string (("q" "quit") 'quit) (("a" "add") 'add) (("d" "del" "delete") 'delete) (t 'unknown)))) ...) .brev .coNP Macros @, caseq* @ caseql* and @ casequal* .synb .mets (caseq* < test-form << normal-clause * <> [ else-clause ]) .mets (caseql* < test-form << normal-clause * <> [ else-clause ]) .mets (casequal* < test-form << normal-clause * <> [ else-clause ]) .syne .desc The .codn caseq* , .codn caseql* , and .code casequal* macros are similar to the macros .codn caseq , .codn caseql , and .codn casequal , differing from them in only the following regard.
The .meta normal-clause of these macros has the form .mono .meti >> ( evaluated-key << form *) .onom where .meta evaluated-key is either an atom, which is evaluated to produce a key, or else a compound form, whose elements are evaluated as forms, producing multiple keys. This evaluation takes place at macro-expansion time, in the global environment. The .meta else-clause works the same way under these macros as under .code caseq et al. Note that although in a .metn normal-clause , .meta evaluated-key must not be the atom .codn t , there is no restriction against it being an atom which evaluates to .codn t . In this situation, the value .code t has no special meaning. The .meta evaluated-key expressions are evaluated in the order in which they appear in the construct, at the time the .codn caseq* , .code caseql* or .code casequal* macro is expanded. Note: these macros allow the use of variables and global symbol macros as case keys. .TP* Example: .verb (defvarl red 0) (defvarl green 1) (defvarl blue 2) (let ((color blue)) (caseql* color (red "hot") ((green blue) "cool"))) --> "cool" .brev .coNP Macros @, ecaseq @, ecaseql @, ecasequal @, ecaseq* @ ecaseql* and @ ecasequal* .synb .mets (ecaseq < test-form << normal-clause * <> [ else-clause ]) .mets (ecaseql < test-form << normal-clause * <> [ else-clause ]) .mets (ecasequal < test-form << normal-clause * <> [ else-clause ]) .mets (ecaseq* < test-form << normal-clause * <> [ else-clause ]) .mets (ecaseql* < test-form << normal-clause * <> [ else-clause ]) .mets (ecasequal* < test-form << normal-clause * <> [ else-clause ]) .syne .desc These macros are error-catching variants of, respectively, .codn caseq , .codn caseql , .codn casequal , .codn caseq* , .code caseql* and .codn casequal* . If the .meta else-clause is present in the invocation of an error-catching case macro, then the invocation is precisely equivalent to the corresponding non-error-trapping variant.
If the .meta else-clause is missing in the invocation of an error-catching variant, then a default .meta else-clause is inserted which throws an exception of type .codn case-error , derived from .codn error . After this insertion, the semantics follows that of the non-error-trapping variant. For instance, .codn "(ecaseql 3)" , which has no .metn else-clause , is equivalent to .mono .meti (caseql 3 (t << expr )) .onom where .meta expr indicates the inserted expression which throws .codn case-error . However, .code "(ecaseql 3 (t 42))" is simply equivalent to .codn "(caseql 3 (t 42))" , since it has an .metn else-clause . Note: the error-catching case macros are intended for situations in which it is a matter of program correctness that every possible value of .meta test-form matches a .metn normal-clause , such that if a failure to match occurs, it indicates a software defect. The error-throwing .meta else-clause helps to ensure that the error situation is noticed. Without this clause, the case macro terminates with a value of .codn nil , which may conceal the defect and delay its identification. .coNP Operator/Function @ if .synb .mets (if < cond < t-form <> [ e-form ]) .mets '['if < cond < then <> [ else ]']' .syne .desc There exist both an .code if operator and an .code if function. A list form with the symbol .code if in the first position is interpreted as an invocation of the .code if operator. The function can be accessed using the DWIM bracket notation and in other ways. The .code if operator provides a simple two-way-selective evaluation control. The .meta cond form is evaluated. If it yields true then .meta t-form is evaluated, and that form's return value becomes the return value of the .codn if . If .meta cond yields false, then .meta e-form is evaluated and its return value is taken to be that of .codn if . If .meta e-form is omitted, then the behavior is as if .meta e-form were specified as .codn nil . 
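.PP
The behavior of the
.code if
operator described above can be sketched with a few expressions. The variable
.code n
is a hypothetical counter, introduced only to show that the unselected form is not evaluated:
.verb
  (if (> 3 2) 'yes 'no)    ;; -> yes
  (if (> 2 3) 'yes 'no)    ;; -> no
  (if (> 2 3) 'yes)        ;; -> nil, e-form omitted
  ;; only the selected form is evaluated, so n stays at zero
  (let ((n 0))
    (if t 'kept (inc n))
    n)                     ;; -> 0
.brev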
The .code if function provides no evaluation control. All of its arguments are evaluated from left to right. If the .meta cond argument is true, then it returns the .meta then argument, otherwise it returns the value of the .meta else argument if present, otherwise it returns .codn nil . .coNP Operator/Function @ and .synb .mets (and << form *) .mets '['and << arg *']' .syne .desc There exist both an .code and operator and an .code and function. A list form with the symbol .code and in the first position is interpreted as an invocation of the operator. The function can be accessed using the DWIM bracket notation and in other ways. The .code and operator provides three functionalities in one. It computes the logical "and" function over several forms. It controls evaluation (a.k.a. "short-circuiting"). It also provides an idiom for the convenient substitution of a value in place of .code nil when some other values are all true. The .code and operator evaluates as follows. First, a return value is established and initialized to the value .codn t . The .metn form s, if any, are evaluated from left to right. The return value is overwritten with the result of each .metn form . Evaluation stops when all .metn form s are exhausted, or when .code nil is stored in the return value. When evaluation stops, the operator yields the return value. The .code and function provides no evaluation control: it receives all of its arguments fully evaluated. If it is given no arguments, it returns .codn t . If it is given one or more arguments, and any of them are .codn nil , it returns .codn nil . Otherwise, it returns the value of the last argument. .TP* Examples: .verb (and) -> t (and (> 10 5) (stringp "foo")) -> t (and 1 2 3) -> 3 ;; shorthand for (if (and 1 2) 3). .brev .coNP Macro/Function @ nand .synb .mets (nand << form *) .mets '['nand << arg *']' .syne .desc There exist both a .code nand macro and a .code nand function. 
A list form with the symbol .code nand in the first position is interpreted as an invocation of the macro. The function can be accessed using the DWIM bracket notation and in other ways. The .code nand macro and function are the logical negation of the .code and operator and function. They are related according to the following equivalences: .verb (nand f0 f1 f2 ...) <--> (not (and f0 f1 f2 ...)) [nand f0 f1 f2 ...] <--> (not [and f0 f1 f2 ...]) .brev .coNP Operator/Function @ or .synb .mets (or << form *) .mets '['or << arg *']' .syne .desc There exist both an .code or operator and an .code or function. A list form with the symbol .code or in the first position is interpreted as an invocation of the operator. The function can be accessed using the DWIM bracket notation and in other ways. The .code or operator provides three functionalities in one. It computes the logical "or" function over several forms. It controls evaluation (a.k.a. "short-circuiting"). The behavior of .code or also provides an idiom for the selection of the first .cod2 non- nil value from a sequence of forms. The .code or operator evaluates as follows. First, a return value is established and initialized to the value .codn nil . The .metn form s, if any, are evaluated from left to right. The return value is overwritten with the result of each .metn form . Evaluation stops when all .metn form s are exhausted, or when a true value is stored into the return value. When evaluation stops, the operator yields the return value. The .code or function provides no evaluation control: it receives all of its arguments fully evaluated. If it is given no arguments, it returns .codn nil . If all of its arguments are .codn nil , it also returns .codn nil . Otherwise, it returns the value of the first argument which isn't .codn nil . 
.TP* Examples: .verb (or) -> nil (or 1 2) -> 1 (or nil 2) -> 2 (or (> 10 20) (stringp "foo")) -> t .brev .coNP Macro/Function @ nor .synb .mets (nor << form *) .mets '['nor << arg *']' .syne .desc There exist both a .code nor macro and a .code nor function. A list form with the symbol .code nor in the first position is interpreted as an invocation of the macro. The function can be accessed using the DWIM bracket notation and in other ways. The .code nor macro and function are the logical negation of the .code or operator and function. They are related according to the following equivalences: .verb (nor f0 f1 f2 ...) <--> (not (or f0 f1 f2 ...)) [nor f0 f1 f2 ...] <--> (not [or f0 f1 f2 ...]) .brev .coNP Macros @ when and @ unless .synb .mets (when < expression << form *) .mets (unless < expression << form *) .syne .desc The .code when macro operator evaluates .metn expression . If .meta expression yields true, and there are additional forms, then each .meta form is evaluated. The value of the last form becomes the result value of the .code when form. If there are no forms, then the result is .codn nil . The .code unless operator is similar to .codn when , except that it reverses the logic of the test. The forms, if any, are evaluated if and only if .meta expression is false. .coNP Macros @ while and @ until .synb .mets (while < expression << form *) .mets (until < expression << form *) .syne .desc The .code while macro operator provides a looping construct. It evaluates .metn expression . If .meta expression yields .codn nil , then the evaluation of the .code while form terminates, producing the value .codn nil . Otherwise, if there are additional forms, then each .meta form is evaluated. Next, evaluation returns to .metn expression , repeating all of the previous steps. The .code until macro operator is similar to .codn while , except that the .code until form terminates when .meta expression evaluates true, rather than false. 
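.PP
A minimal sketch of both loops follows. The variables
.code i
and
.code acc
are hypothetical, chosen only for illustration:
.verb
  ;; collect 0, 1, 2 by pushing onto a list; the result is reversed
  (let ((i 0) (acc nil))
    (while (< i 3)
      (push i acc)
      (inc i))
    acc)                   ;; -> (2 1 0)
  ;; the same count expressed with the inverted test
  (let ((i 0))
    (until (>= i 3)
      (inc i))
    i)                     ;; -> 3
.brev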
These operators arrange for the evaluation of all their enclosed forms in an anonymous block. Any of the .metn form s, or .metn expression , may use the .code return operator to terminate the loop, and optionally to specify a result value for the form. The only way these forms can yield a value other than .code nil is if the .code return operator is used to terminate the implicit anonymous block, and is given an argument, which becomes the result value. .coNP Macros @ while* and @ until* .synb .mets (while* < expression << form *) .mets (until* < expression << form *) .syne .desc The .code while* and .code until* macros are similar, respectively, to the macros .code while and .codn until . They differ in one respect: they begin by evaluating the .metn form s one time unconditionally, without first evaluating .metn expression . After this evaluation, the subsequent behavior is like that of .code while or .codn until . Another way to regard the behavior is that these forms execute one iteration unconditionally, without evaluating the termination test prior to the first iteration. Yet another view is that these constructs relocate the test from the top of the loop to the bottom of the loop. .coNP Macro @ whilet .synb .mets (whilet >> ({ sym | >> ( sym << init-form )}+) .mets \ \ << body-form *) .syne .desc The .code whilet macro provides a construct which combines iteration with variable binding. The evaluation of the form takes place as follows. First, fresh bindings are established for .metn sym s as if by the .code let* operator. It is an error for the list of variable bindings to be empty. After the establishment of the bindings, the value of the last .meta sym is tested. If the value is .codn nil , then .code whilet terminates. Otherwise, .metn body-form s are evaluated in the scope of the variable bindings, and then .code whilet iterates from the beginning, again establishing fresh bindings for the .metn sym s, and testing the value of the last .metn sym .
All evaluation takes place in an anonymous block, which can be terminated with the .code return operator. Doing so terminates the loop. If the .code whilet loop is thus terminated by an explicit .codn return , a return value can be specified. Under normal termination, the return value is .codn nil . In the syntax, a small convenience is permitted. Instead of the last .mono .meti >> ( sym << init-form ) .onom it is permissible for the syntax .mono .meti <> ( init-form ) .onom to appear, the .meta sym being omitted. A machine-generated variable is substituted in place of the missing .meta sym and that variable is then initialized from .meta init-form and used as the basis of the test. .TP* Examples: .verb ;; read lines of text from *stdin* and print them, ;; until the end-of-stream condition: (whilet ((line (get-line))) (put-line line)) ;; read lines of text from *stdin* and print them, ;; until the end-of-stream condition occurs or ;; a line is identical to the character string "end". (whilet ((line (get-line)) (more (and line (nequal line "end")))) (put-line line)) .brev .coNP Macros @ iflet and @ whenlet .synb .mets (iflet >> {({ sym | >> ( sym << init-form )}+) | << atom-form } .mets \ \ < then-form <> [ else-form ]) .mets (whenlet >> {({ sym | >> ( sym << init-form )}+) | << atom-form } .mets \ \ << body-form *) .syne .desc The .code iflet and .code whenlet macros combine the variable binding of .code let* with conditional evaluation of .code if and .codn when , respectively. In either construct's syntax, a non-compound form .meta atom-form may appear in place of the variable binding list. In this case, .meta atom-form is evaluated as a form, and the construct is equivalent to its respective ordinary .code if or .code when counterpart. If the list of variable bindings is empty, it is interpreted as the atom .code nil and treated as an .metn atom-form . 
If one or more bindings are specified rather than .metn atom-form , then the evaluation of these forms takes place as follows. First, fresh bindings are established for .metn sym s as if by the .code let* operator. Then, the last variable's value is tested. If it is not .code nil then the test is true, otherwise false. In the syntax, a small convenience is permitted. Instead of the last .mono .meti >> ( sym << init-form ) .onom it is permissible for the syntax .mono .meti <> ( init-form ) .onom to appear, the .meta sym being omitted. A machine-generated variable is substituted in place of the missing .meta sym and that variable is then initialized from .meta init-form and used as the basis of the test. This is intended to be useful in situations in which .meta then-form or .meta else-form do not require access to the tested value. In the case of the .code iflet operator, if the test is true, the operator evaluates .meta then-form and yields its value. Otherwise the test is false, and if the optional .meta else-form is present, that is evaluated instead and its value is returned. If this form is missing, then .code nil is returned. In the case of the .code whenlet operator, if the test is true, then the .metn body-form s, if any, are evaluated. The value of the last one is returned, otherwise .code nil if the forms are missing. If the test is false, then evaluation of .metn body-form s is skipped, and .code nil is returned. 
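
The following sketch illustrates the binding-list case of
.codn iflet .
The variable names are illustrative only, and the results shown are derived
from the rules above rather than taken from a recorded session.

.verb
  ;; last variable y is nil, so else-form is evaluated
  (iflet ((x 1) (y nil)) 'yes 'no) -> no

  ;; bindings are sequential, as with let*:
  ;; the init-form of y sees x
  (iflet ((x 5) (y (> x 3))) (list x y) 'no) -> (5 t)
.brev
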
.TP* Examples: .verb ;; dispose of foo-resource if present (whenlet ((foo-res (get-foo-resource obj))) (foo-shutdown foo-res) (set-foo-resource obj nil)) ;; Contrast with: above, using when and let (let ((foo-res (get-foo-resource obj))) (when foo-res (foo-shutdown foo-res) (set-foo-resource obj nil))) ;; print frobosity value if it exceeds 150 (whenlet ((fv (get-frobosity-value)) (exceeds-p (> fv 150))) (format t "frobosity value ~a exceeds 150\en" fv)) ;; same as above, taking advantage of the ;; last variable being optional: (whenlet ((fv (get-frobosity-value)) ((> fv 150))) (format t "frobosity value ~a exceeds 150\en" fv)) ;; yield 4: 3 interpreted as atom-form (whenlet 3 4) ;; yield 4: nil interpreted as atom-form (iflet () 3 4) .brev .coNP Macro @ condlet .synb .mets (condlet .mets \ ([({ sym | >> ( sym << init-form )}+) | << atom-form ] .mets \ \ << body-form *)*) .syne .desc The .code condlet macro generalizes .codn iflet . Each argument is a compound consisting of at least one item: a list of bindings or .metn atom-form . This item is followed by zero or more .metn body-form s. If there are no .metn body-form s then the situation is treated as if there were a single .meta body-form specified as .codn nil . The arguments of .code condlet are considered in sequence, starting with the leftmost. If the argument's left item is an .meta atom-form then the form is evaluated. If it yields true, then the .metn body-form s next to it are evaluated in order, and the .code condlet form terminates, yielding the value obtained from the last .metn body-form . If .meta atom-form yields false, then the next argument is considered, if there is one. If the argument's left item is a list of bindings, then it is processed with exactly the same logic as under the .code iflet macro. 
If the last binding contains a true value, then the adjoining .metn body-form s are evaluated in a scope in which all of the bindings are visible, and .code condlet terminates, yielding the value of the last .metn body-form . Otherwise, the next argument of .code condlet is considered (processed in a scope in which the bindings produced by the current item are no longer visible). If .code condlet runs out of arguments, it terminates and returns .codn nil . .TP* Example: .verb (let ((l '(1 2 3))) (condlet ;; first arg (((a (first l)) ;; a binding gets 1 (b (second l)) ;; b binding gets 2 (g (> a b))) ;; last variable g is nil 'foo) ;; not evaluated ;; second arg (((b (second l)) ;; b gets 2 (c (third l)) ;; c gets 3 (g (< b c))) ;; last variable g is true 'bar))) ;; condlet terminates --> bar ;; result is bar .brev .coNP Macro @ ifa .synb .mets (ifa < cond < then <> [ else ]) .syne .desc The .code ifa macro provides an anaphoric conditional operator resembling the .code if operator. Around the evaluation of the .meta then and .meta else forms, the symbol .code it is implicitly bound to a subexpression of .metn cond , a subexpression which is thereby identified as the .IR it-form . This .code it alias provides a convenient reference to that place or value, similar to the word "it" in the English language, and similar anaphoric pronouns in other languages. If .code it is bound to a place form, the binding is established as if using the .code placelet operator: the form is evaluated only once, even if the .code it alias is used multiple times in the .meta then or .meta else expressions. Furthermore, the place form is implicitly surrounded with .code read-once so that the place's value is accessed only once, and multiple references to .code it refer to a copy of the value cached in a hidden variable, rather than generating multiple accesses to the place. 
Otherwise, if the form is not a syntactic place, .code it is bound as an ordinary lexical variable to the form's value. An .I it-candidate is an expression viable for having its value or storage location bound to the .code it symbol. An it-candidate is any expression which is not a constant expression according to the .code constantp function, and not a symbol. The .code ifa macro applies several rules to the .meta cond expression: .RS .IP 1. The .meta cond expression must be either an atom, a function call form, or a .code dwim form. Otherwise the .code ifa expression is ill-formed, and throws an exception at macro-expansion time. For the purposes of these rules, a .code dwim form is considered as a function call expression, whose first argument is the second element of the form. That is to say, .code "[f x]" which is equivalent to .code "(dwim f x)" is treated similarly to .code "(f x)" as a one-argument call. .IP 2. If the .meta cond expression is a function call with two or more arguments, at most one of them may be an it-candidate. If two or more arguments are it-candidates, the situation is ambiguous. The .code ifa expression is ill-formed and throws an exception at macro-expansion time. .IP 3. If .meta cond is an atom, or a function call expression with no arguments, then the .code it symbol is not bound. Effectively, the .code ifa macro then behaves like the ordinary .code if operator. .IP 4. If .meta cond is a function call or .code dwim expression with exactly one argument, then the .code it variable is bound to the argument expression, except when the function being called is .codn not , .codn null , or .codn false . This binding occurs regardless of whether the expression is an it-candidate. .IP 5. 
If .meta cond is a function call with exactly one argument to the Boolean negation function which goes by one of the three names .codn not , .codn null , or .codn false , then that situation is handled by a rewrite according to the following pattern: .mono .mets (ifa (not << expr ) < then << else ) -> (ifa < expr < else << then ) .onom which applies likewise for .code null or .code false substituted for .codn not . The Boolean inverse function is removed, and the .meta then and .meta else expressions are exchanged. .IP 6. If .meta cond is a function call with two or more arguments, then it is only well-formed if at most one of those arguments is an it-candidate. If there is one such argument, then the .code it variable is bound to it. .IP 7. Otherwise the variable is bound to the leftmost argument expression, regardless of whether that argument expression is an it-candidate. .RE .IP In all other regards, the .code ifa macro behaves similarly to .codn if . The .meta cond expression is evaluated, and, if applicable, the value of, or storage location denoted by the appropriate argument is captured and bound to the variable .code it whose scope extends over the .meta then form, as well as over .metn else , if present. If .meta cond yields a true value, then .meta then is evaluated and the resulting value is returned, otherwise .meta else is evaluated if present and its value is returned. A missing .meta else is treated as if it were the .code nil form. .TP* Examples: .verb (ifa t 1 0) -> 1 ;; Rule 6: it binds to (* x x), which is ;; the only it-candidate. (let ((x 6) (y 49)) (ifa (> y (* x x)) ;; it binds to (* x x) (list it))) -> (36) ;; Rule 4: it binds to argument of evenp, ;; even though 4 isn't an it-candidate. 
(ifa (evenp 4) (list it)) -> (4) ;; Rule 5: (ifa (not (oddp 4)) (list it)) -> (4) ;; Rule 7: no candidates: choose leftmost (let ((x 6) (y 49)) (ifa (< 1 x y) (list it))) -> (1) ;; Violation of Rule 1: ;; while is not a function (ifa (while t (print 42)) (list it)) --> exception! ;; Violation of Rule 2: (let ((x 6) (y 49)) (ifa (> (* y y y) (* x x)) (list it))) --> exception! .brev .coNP Macro @ conda .synb .mets (conda >> {( test << form *)}*) .syne .desc The .code conda operator provides a multi-branching conditional evaluation of forms, similarly to the .code cond operator. Enclosed in the .code conda form are groups of forms expressed as lists. Each group must be a list of at least one form. The .code conda operator is anaphoric: it expands into a nested structure of zero or more .code ifa invocations, according to these patterns: .verb (conda) -> nil (conda (x y ...) ...) -> (ifa x (progn y ...) (conda ...)) .brev Thus, .code conda inherits all the restrictions on the .meta test expressions from .codn ifa , as well as the anaphoric .code it variable feature. .coNP Macro @ whena .synb .mets (whena < test << form *) .syne .desc The .code whena macro is similar to the .code when macro, except that it is anaphoric in exactly the same manner as the .code ifa macro. It may be understood as conforming to the following equivalence: .verb (whena x f0 f2 ...) <--> (ifa x (progn f0 f2 ...)) .brev .coNP Macro @ dotimes .synb .mets (dotimes >> ( var < count-form <> [ result-form ]) .mets \ \ << body-form *) .syne .desc The .code dotimes macro implements a simple counting loop. .meta var is established as a variable, and initialized to zero. .meta count-form is evaluated one time to produce a limiting value, which should be a number. 
Then, if the value of .meta var is less than the limiting value, the .metn body-form s are evaluated, .meta var is incremented by one, and the process repeats with a new comparison of .meta var against the limiting value possibly leading to another evaluation of the forms. If .meta var is found to equal or exceed the limiting value, then the loop terminates. When the loop terminates, its return value is .code nil unless a .meta result-form is present, in which case the value of that form specifies the return value. .metn body-form s as well as .meta result-form are evaluated in the scope in which the binding of .meta var is visible. .coNP Operators @, each @, each* @, collect-each @, collect-each* @ append-each and @ append-each* .synb .mets (each >> ({( sym << init-form )}*) << body-form *) .mets (each* >> ({( sym << init-form )}*) << body-form *) .mets (collect-each >> ({( sym << init-form )}*) << body-form *) .mets (collect-each* >> ({( sym << init-form )}*) << body-form *) .mets (append-each >> ({( sym << init-form )}*) << body-form *) .mets (append-each* >> ({( sym << init-form )}*) << body-form *) .syne .desc These operators establish a loop for iterating over the elements of one or more sequences. Each .meta init-form must evaluate to an iterable object that is suitable as an argument for the .code iter-begin function. The sequences are then iterated in parallel over repeated evaluations of the .metn body-form s, with each .meta sym variable being assigned to successive elements of its sequence. The shortest list determines the number of iterations, so if any of the .metn init-form s evaluate to an empty sequence, the body is not executed. If the list of .mono .meti >> ( sym << init-form ) .onom pairs itself is empty, then an infinite loop is specified. The body forms are enclosed in an anonymous block, allowing the .code return operator to terminate the loop prematurely and optionally specify the return value. 
The .code collect-each and .code collect-each* variants are like .code each and .codn each* , except that for each iteration, the resulting value of the body is collected into a list. When the iteration terminates, the return value of the .code collect-each or .code collect-each* operator is this collection. The .code append-each and .code append-each* variants are like .code each and .codn each* , except that for each iteration other than the last, the resulting value of the body must be a list. The last iteration may produce either an atom or a list. The objects produced by the iterations are combined together as if they were arguments to the append function, and the resulting value is the value of the .code append-each or .code append-each* operator. The alternate forms denoted by the adorned symbols .codn each* , .code collect-each* and .codn append-each* , differ from .codn each , .code collect-each and .code append-each in the following way. The plain forms evaluate the .metn init-form s in an environment in which none of the .meta sym variables are yet visible. By contrast, the alternate forms evaluate each .meta init-form in an environment in which bindings for the previous .meta sym variables are visible. In this phase of evaluation, .meta sym variables are list-valued: one by one they are each bound to the list object emanating from their corresponding .metn init-form . Just before the first loop iteration, however, the .meta sym variables are assigned the first item from each of their lists. 
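
The following sketch illustrates the behaviors described above: parallel
iteration limited by the shortest sequence, collection by
.code collect-each
and
.codn append-each ,
and early termination via
.codn return .
The data values are illustrative only, and the results shown are derived
from the description rather than taken from a recorded session.

.verb
  ;; iteration stops when the shorter sequence runs out
  (collect-each ((x '(1 2 3)) (y '(10 20 30 40)))
    (+ x y)) -> (11 22 33)

  ;; each body value is a list; the lists are appended
  (append-each ((x '(1 2 3)))
    (list x x)) -> (1 1 2 2 3 3)

  ;; return terminates the implicit anonymous block
  (each ((x '(1 2 3 4)))
    (if (> x 2) (return x))) -> 3
.brev
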
.TP* Note: The semantics of .code collect-each may be understood in terms of an equivalence to a code pattern involving .codn mapcar : .mono (collect-each ((x xinit) (mapcar (lambda (x y) (y yinit)) <--> body) body) xinit yinit) .onom The .code collect-each* variant may be understood in terms of the following equivalence involving .code let* for sequential binding and .codn mapcar : .mono (collect-each* ((x xinit) (let* ((x xinit) (y yinit)) <--> (y yinit)) body) (mapcar (lambda (x y) body) x y)) .onom However, note that the .code let* as well as each invocation of the .code lambda binds fresh instances of the variables, whereas these operators are permitted to bind a single instance of the variables, which are first initialized with the initializing expressions, and then reused as iteration variables which are stepped by assignment. The other operators may be understood likewise, with the substitution of the .code mapdo function in the case of .code each and .code each* and of the .code mappend function in the case of .code append-each and .codn append-each* . 
.TP* Example: .mono ;; print numbers from 1 to 10 and whether they are even or odd (each* ((n 1..11) ;; n is just a range object in this scope (even (collect-each ((m n)) (evenp m)))) ;; n is an integer in this scope (format t "~s is ~s\en" n (if even "even" "odd"))) .onom .TP* Output: .mono 1 is "odd" 2 is "even" 3 is "odd" 4 is "even" 5 is "odd" 6 is "even" 7 is "odd" 8 is "even" 9 is "odd" 10 is "even" .onom .coNP Macros @ for and @ for* .synb .mets ({for | for*} >> ({ sym | >> ( sym << init-form )}*) .mets \ \ \ \ \ \ \ \ \ \ \ \ \ >> ([ test-form << result-form *]) .mets \ \ \ \ \ \ \ \ \ \ \ \ \ <> [( inc-form *)] .mets \ \ << body-form *) .mets "" .mets ({for | for*} >> ({ sym | >> ( sym << init-form )}*) .mets \ \ \ \ \ \ \ \ \ \ \ \ \ >> ([ test-form << result-form *])) .mets "" .mets ({for | for*} >> ({ sym | >> ( sym << init-form )}*)) .syne .desc The macros .code for and .code for* combine variable binding with loop iteration. The first argument is a list of variables with optional initializers, exactly the same as in the .code let and .code let* operators. Furthermore, the difference between .code for and .code for* is like that between .code let and .code let* with regard to this list of variables. The second variant in the above syntax synopsis shows that when .metn body-form s are absent, then a list of .metn inc-form s which is empty may be omitted from the syntax. The .code for and .code for* macros execute these steps: .RS .IP 1. Establish an anonymous block over the entire form, allowing the .code return operator to be used to terminate the loop. .IP 2. Establish bindings for the specified variables similarly to .code let and .codn let* . The variable bindings are visible over the .metn test-form , each .metn result-form , each .meta inc-form and each .metn body-form . .IP 3. Evaluate .metn test-form . If .meta test-form yields .codn nil , then the loop terminates. 
Each .meta result-form is evaluated, and the value of the last of these forms is the result value of the loop. If there are no .metn result-form s then the result value is .codn nil . If the .meta test-form is omitted, then the test is taken to be true, and the loop does not terminate. .IP 4. Otherwise, if .meta test-form yields true, then each .meta body-form is evaluated in turn. Then, each .meta inc-form is evaluated in turn and processing resumes at step 3. .RE .coNP Macros @ doloop and @ doloop* .synb .mets ({doloop | doloop*} .mets \ \ ({ sym | >> ( sym >> [ init-form <> [ step-form ])}*) .mets \ \ >> ([ test-form << result-form *]) .mets \ \ << tagbody-form *) .syne .desc The .code doloop and .code doloop* macros provide an iteration construct inspired by the ANSI Common Lisp .code do and .code do* macros. Each .meta sym element in the form must be a symbol suitable for use as a variable name. The .metn tagbody-form s are placed into an implicit .codn tagbody , meaning that a .meta tagbody-form which is an integer, character or symbol is interpreted as a .code tagbody label which may be the target of a control transfer via the .code go macro. The .code doloop macro binds each .meta sym to the value produced by evaluating the adjacent .metn init-form . Then, in the environment in which these variables now exist, .meta test-form is evaluated. If that form yields .codn nil , then the loop terminates. The .metn result-form s are evaluated, and the value of the last one is returned. If .metn result-form s are absent, then .code nil is returned. If .meta test-form is also absent, then the loop terminates and returns .codn nil . If .meta test-form produces a true value, then .metn result-form s are not evaluated. Instead, the implicit .code tagbody consisting of the .metn tagbody-form s is evaluated. If that evaluation terminates normally, the loop variables are then updated by assigning to each .meta sym the value of .metn step-form . 
The following defaulting behaviors apply in regard to the variable syntax. For each .meta sym which has an associated .meta init-form but no .metn step-form , the .meta init-form is duplicated and taken as the .metn step-form . Thus a variable specification like .code "(x y)" is equivalent to .codn "(x y y)" . If both forms are omitted, then the .meta init-form is taken to be .codn nil , and the .meta step-form is taken to be .metn sym . This means that the variable form .code "(x)" is equivalent to .code "(x nil x)" which has the effect that .code x retains its current value when the next loop iteration begins. Lastly, the .meta sym variant is equivalent to .mono .meti <> ( sym ) .onom so that .code x is also equivalent to .codn "(x nil x)" . The differences between .code doloop and .code doloop* are: .code doloop binds the variables in parallel, similarly to .codn let , whereas .code doloop* binds sequentially, like .codn let* ; moreover, .code doloop performs the .meta step-form assignments in parallel as if using a single .mono .meti (pset < sym0 < step-form-0 < sym1 < step-form-1 ...) .onom form, whereas .code doloop* performs the assignment sequentially as if using .code set rather than .codn pset . The .code doloop and .code doloop* macros establish an anonymous .codn block , allowing early return from the loop, with a value, via the .code return operator. .TP* "Dialect Note:" These macros are substantially different from the ANSI Common Lisp .code do and .code do* macros. Firstly, the termination logic is inverted; effectively they implement "while" loops, whereas their ANSI CL counterparts implement "until" loops. Secondly, in the ANSI CL macros, the defaulting of the missing .meta step-form is different. Variables with no .meta step-form are not updated. 
In particular, this means that the form .code "(x y)" is not equivalent to .codn "(x y y)" ; the ANSI CL macros do not feature the automatic replication of .meta init-form into the .meta step-form position. .coNP Macros @, sum-each @, sum-each* @ mul-each and @ mul-each* .synb .mets (sum-each >> ({( sym << init-form )}*) << body-form *) .mets (sum-each* >> ({( sym << init-form )}*) << body-form *) .mets (mul-each >> ({( sym << init-form )}*) << body-form *) .mets (mul-each* >> ({( sym << init-form )}*) << body-form *) .syne .desc The macros .code sum-each and .code mul-each behave very similarly to the .code each operator. Whereas the .code each operator form returns .code nil as its result, the .code sum-each and .code mul-each forms, if they execute to completion and return normally, return an accumulated value. The .code sum-each macro initializes a newly instantiated, hidden accumulator variable to the value .codn 0 . For each iteration of the loop, the .metn body-form s are evaluated, and are expected to produce a value. This value is added to the current value of the hidden accumulator using the .code + function, and the result is stored into the accumulator. If .code sum-each returns normally, then the value of this accumulator becomes its resulting value. The .code mul-each macro similarly initializes a hidden accumulator to the value .codn 1 . The value from each iteration of the body is multiplied with the accumulator using the .code * function, and the result is stored into the accumulator. If .code mul-each returns normally, then the value of this accumulator becomes its resulting value. The .code sum-each* and .code mul-each* variants of the macros implement the sequential scoping rule for the variable bindings, exactly the way .code each* alters the semantics of .codn each . The .metn body-form s are enclosed in an implicit anonymous block. 
If the forms terminate by returning from the anonymous block then these macros terminate with the specified value. When .code sum-each* and .code sum-each are specified with variables whose values specify zero iterations, or with no variables at all, the form terminates with a value of .codn 0 . In this situation, .code mul-each and .code mul-each* terminate with .codn 1 . Note that this behavior differs from .codn each , and its closely-related operators, which loop infinitely when no variables are specified. It is unspecified whether .code mul-each and .code mul-each* continue iterating when the accumulator takes on a value satisfying the .code zerop predicate. .coNP Macros @, each-true @, some-true @ each-false and @ some-false .synb .mets (each-true >> ({( sym << init-form )}*) << body-form *) .mets (some-true >> ({( sym << init-form )}*) << body-form *) .mets (each-false >> ({( sym << init-form )}*) << body-form *) .mets (some-false >> ({( sym << init-form )}*) << body-form *) .syne .desc These macros iterate zero or more variables over sequences, similarly to the .code each operator, and calculate logical results, with short-circuiting semantics. The .code each-true macro initializes an internal result variable to the .code t value. It then evaluates the .metn body-form s for each tuple of variable values, replacing the result variable with the value produced by these forms. If that value is .codn nil , the iteration stops. When the iteration terminates normally, the value of the result variable is returned. If no variables are specified, termination occurs immediately. Note that this is different from the .code each operator, which iterates indefinitely if no variables are specified. The .metn body-form s are surrounded by an implicit anonymous block, making it possible to terminate via .code return or .codn return-from . In these cases, the form terminates with .code nil or the specified return value. The internal result is ignored. 
The .code some-true macro is similar to .codn each-true , with the following differences. The internal result variable is initialized to .code nil rather than .codn t . The iteration stops whenever the .metn body-form s produce a true value, and that value is returned. The .code each-false and .code some-false macros are, respectively, similar to .code each-true and .codn some-true , with one difference. After each iteration, the value produced by the .metn body-form s is logically inverted using the .code not function prior to being assigned to the result variable. .TP* Examples: .verb (each-true ()) -> t (each-true ((a ()))) -> t (each-true ((a '(1 2 3))) a) -> 3 (each-true ((a '(1 2 3)) (b '(4 5 6))) (< a b)) -> t (each-true ((a '(1 2 3)) (b '(4 0 6))) (< a b)) -> nil (some-true ((a '(1 2 3))) a) -> 1 (some-true ((a '(nil 2 3))) a) -> 2 (some-true ((a '(nil nil nil))) a) -> nil (some-true ((a '(1 2 3)) (b '(4 0 6))) (< a b)) -> t (some-true ((a '(1 2 3)) (b '(0 1 2))) (< a b)) -> nil (each-false ((a '(1 2 3)) (b '(4 5 6))) (> a b)) -> t (each-false ((a '(1 2 3)) (b '(4 0 6))) (> a b)) -> nil (some-false ((a '(1 2 3)) (b '(4 0 6))) (> a b)) -> t (some-false ((a '(1 2 3)) (b '(0 1 2))) (> a b)) -> nil .brev .coNP Macros @, each-prod @ collect-each-prod and @ append-each-prod .synb .mets (each-prod >> ({( sym << init-form )}*) << body-form *) .mets (collect-each-prod >> ({( sym << init-form )}*) << body-form *) .mets (append-each-prod >> ({( sym << init-form )}*) << body-form *) .syne .desc The macros .codn each-prod , .code collect-each-prod and .code append-each-prod have a similar syntax to .codn each , .code collect-each and .codn append-each . However, instead of iterating over sequences in parallel, they iterate over the Cartesian product of the elements from the sequences. The difference between .code collect-each and .code collect-each-prod is analogous to that between the functions .code mapcar and .codn maprod . 
Like in the .code each operator family, the .metn body-form s are surrounded by an anonymous block. If these forms execute a return from this block, then these macros terminate with the specified return value. When no iterations are performed, including in the case when an empty list of variables is specified, all these macro forms terminate and return .codn nil . Note that this behavior differs from .codn each , and its closely-related operators, which loop infinitely when no variables are specified. With one caveat noted below, these macros can be understood as providing syntactic sugar according to the pattern established by the following equivalences: .mono (each-prod (block nil ((x xinit) (let ((#:gx xinit) (#:gy yinit)) (y yinit)) <--> (maprodo (lambda (x y) body) body) #:gx #:gy)) (collect-each-prod (block nil ((x xinit) (let ((#:gx xinit) (#:gy yinit)) (y yinit)) <--> (maprod (lambda (x y) body) body) #:gx #:gy)) (append-each-prod (block nil ((x xinit) (let ((#:gx xinit) (#:gy yinit)) (y yinit)) <--> (maprend (lambda (x y) body) body) #:gx #:gy)) .onom However, note that each invocation of the .code lambda binds fresh instances of the variables, whereas these operators are permitted to bind a single instance of the variables, which are then stepped by assignment. .TP* Example: .mono (collect-each-prod ((a '(a b c)) (n #(1 2))) (cons a n)) --> ((a . 1) (a . 2) (b . 1) (b . 2) (c . 1) (c . 2)) .onom .coNP Macros @, each-prod* @ collect-each-prod* and @ append-each-prod* .synb .mets (each-prod* >> ({( sym << init-form )}*) << body-form *) .mets (collect-each-prod* >> ({( sym << init-form )}*) << body-form *) .mets (append-each-prod* >> ({( sym << init-form )}*) << body-form *) .syne .desc The macros .codn each-prod* , .code collect-each-prod* and .code append-each-prod* are variants of .codn each-prod , .code collect-each-prod and .code append-each-prod with sequential binding. 
These macros can be understood as providing syntactic sugar according to the pattern established by the following equivalences: .mono (each-prod* (let* ((x xinit) ((x xinit) (y yinit)) (y yinit)) <--> (maprodo (lambda (x y) body) body) x y) (collect-each-prod* (let* ((x xinit) ((x xinit) (y yinit)) (y yinit)) <--> (maprod (lambda (x y) body) body) x y) (append-each-prod* (let* ((x xinit) ((x xinit) (y yinit)) (y yinit)) <--> (maprend (lambda (x y) body) body) x y) .onom However, note that the .code let* as well as each invocation of the .code lambda binds fresh instances of the variables, whereas these operators are permitted to bind a single instance of the variables, which are first initialized with the initializing expressions, and then reused as iteration variables which are stepped by assignment. .TP* Example: .mono (collect-each-prod* ((a "abc") (b (upcase-str a))) `@a@b`) --> ("aA" "aB" "aC" "bA" "bB" "bC" "cA" "cB" "cC") .onom .coNP Macros @, sum-each-prod @, sum-each-prod* @ mul-each-prod and @ mul-each-prod* .synb .mets (sum-each-prod >> ({( sym << init-form )}*) << body-form *) .mets (sum-each-prod* >> ({( sym << init-form )}*) << body-form *) .mets (mul-each-prod >> ({( sym << init-form )}*) << body-form *) .mets (mul-each-prod* >> ({( sym << init-form )}*) << body-form *) .syne .desc The macros .code sum-each-prod and .code mul-each-prod have a similar syntax to .code sum-each and .codn mul-each . However, instead of iterating over sequences in parallel, they iterate over the Cartesian product of the elements from the sequences. The .code sum-each-prod* and .code mul-each-prod* variants perform sequential variable binding when establishing the initial values of the variables, similarly to the .code each* operator. The .metn body-form s are surrounded by an implicit anonymous block. If these forms execute a return from this block, then these macros terminate with the specified return value. 
When no iterations are specified, including in the case when an empty list of variables is specified, the summing macros terminate, yielding .codn 0 , and the multiplicative macros terminate with .codn 1 . Note that this behavior differs from .codn each , and its closely-related operators, which loop infinitely when no variables are specified. .TP* Examples: .verb ;; Inefficiently calculate (* (+ 1 2 3) (+ 4 3 2)). ;; Every value from (1 2 3) is paired with every value ;; from (4 3 2) to form partial products, and ;; sum-each-prod adds these together implicitly: (sum-each-prod ((x '(1 2 3)) (y '(4 3 2))) (* x y)) -> 54 .brev .coNP Operators @ block and @ block* .synb .mets (block < name << body-form *) .mets (block* < name-form << body-form *) .syne .desc The .code block operator introduces a named block around the execution of some forms. The .meta name argument may be any object, though block names are usually symbols. Two block .meta name objects are considered to be the same name according to .code eq equality. Since a block name is not a variable binding, keyword symbols are permitted, and so are the symbols .code t and .codn nil . A block named by the symbol nil is slightly special: it is understood to be an anonymous block. The .code block* operator differs from .code block in that it evaluates .metn name-form , which is expected to produce a symbol. The resulting symbol is used for the name of the block. A named or anonymous block establishes an exit point for the .code return-from or .code return operator, respectively. These operators can be invoked within a block to cause its immediate termination with a specified return value. A block also establishes a prompt for a .IR "delimited continuation" . Anywhere in a block, a continuation can be captured using the .code sys:capture-cont function. Delimited continuations are described in the section Delimited Continuations. 
A delimited continuation allows an apparently abandoned block to be restarted at the capture point, with the entire call chain and dynamic environment between the prompt and the capture point intact. Blocks in \*(TL have dynamic scope. This means that the following situation is allowed:
.verb
  (defun func ()
    (return-from foo 42))

  (block foo (func))
.brev
The function can return from the .code foo block even though the .code foo block does not lexically surround .codn func . It is because blocks are dynamic that the .code block* variant exists; for lexically scoped blocks, it would make little sense to support a dynamically computed name. Thus blocks in \*(TL provide dynamic nonlocal returns, as well as returns out of lexical nesting. It is permitted for blocks to be aggressively .codn progn -converted by compilation. This means that a .code block form which meets certain criteria is converted to a .code progn form which surrounds the .metn body-form s and thus no longer establishes an exit point. A .code block form will be spared from .codn progn -conversion by the compiler if it meets the following rules.
.RS
.IP 1.
Any .meta body-form references the block's .meta name in a .codn return , .codn return-from , .code sys:abscond-from or .code sys:capture-cont expression.
.IP 2.
The form contains at least one direct call to a function not in the standard \*(TL library.
.IP 3.
The form contains at least one direct call to the functions .codn sys:capture-cont , .codn return* , .codn sys:abscond* , .codn match-fun , .codn eval , .codn load , .codn compile , .code compile-file or .codn compile-toplevel .
.IP 4.
The form references any of the functions in rules 2 and 3 as a function binding via the .code dwim operator (or the DWIM brackets notation) or via the .code fun operator.
.IP 5.
The form is a .code block* form; these are spared from the optimization.
.RE .IP Removal of blocks under the above rules means that some use of blocks which works in interpreted code will not work in compiled programs. Programs which adhere to the rules are not affected by such a difference. Additionally, the compiler may .codn progn -convert blocks in contravention of the above rules, but only if doing so makes no difference to visible program behavior. .TP* Examples: .verb (defun helper () (return-from top 42)) ;; defun implicitly defines a block named top (defun top () (helper) ;; function returns 42 (prinl 'notreached)) ;; never printed (defun top2 () (let ((h (fun helper))) (block top (call h)) ;; may progn-convert (block top (call 'helper)) ;; may progn-convert (block top (helper)))) ;; not removed .brev In the above examples, the block containing .code "(call h)" may be converted to .code progn because it doesn't express a .B direct call to the .code helper function. The block which calls .code helper using .code "(call 'helper)" is also not considered to be making a direct call. .TP* "Dialect Note:" In Common Lisp, blocks are lexical. A separate mechanism consisting of catch and throw operators performs nonlocal transfer based on symbols. The \*(TL example: .verb (defun func () (return-from foo 42)) (block foo (func)) .brev is not allowed in Common Lisp, but can be transliterated to: .verb (defun func () (throw 'foo 42)) (catch 'foo (func)) .brev Note that foo is quoted in CL. This underscores the dynamic nature of the construct. .code throw itself is a function and not an operator. Also note that the CL example, in turn, is even more closely transcribed back into \*(TL simply by replacing its .code throw and .code catch with .code return* and .codn block* : .verb (defun func () (return* 'foo 42)) (block* 'foo (func)) .brev Common Lisp blocks also do not support delimited continuations. 
.coNP Operators @ return and @ return-from
.synb
.mets (return <> [ value ])
.mets (return-from < name <> [ value ])
.syne
.desc
The .code return operator must be dynamically enclosed within an anonymous block (a block named by the symbol .codn nil ). It immediately terminates the evaluation of the innermost anonymous block which encloses it, causing it to return the specified value. If the value is omitted, the anonymous block returns .codn nil . The .code return-from operator must be dynamically enclosed within a named block whose name matches the .meta name argument. It immediately terminates the evaluation of the innermost such block, causing it to return the specified value. If the value is omitted, that block returns .codn nil .
.TP* Example:
.verb
  (block foo
    (let ((a "abc\en")
          (b "def\en"))
      (pprint a *stdout*)
      (return-from foo 42)
      (pprint b *stdout*)))
.brev
Here, the output produced is .strn "abc" . The value of .code b is not printed because .code return-from terminates block .codn foo , and so the second .code pprint form is not evaluated.
.coNP Function @ return*
.synb
.mets (return* < name <> [ value ])
.syne
.desc
The .code return* function is similar to the .code return-from operator, except that .meta name is an ordinary function parameter, and so when .code return* is used, an argument expression must be specified which evaluates to a symbol. Thus .code return* allows the target block of a return to be dynamically computed. The following equivalence holds between the operator and function:
.verb
  (return-from a b)  <-->  (return* 'a b)
.brev
Expressions used as .meta name arguments to .code return* which do not simply quote a symbol have no equivalent in .codn return-from .
.coNP Macros @ tagbody and @ go
.synb
.mets (tagbody >> { form | << label }*)
.mets (go << label )
.syne
.desc
The .code tagbody macro provides a form of the "go to" control construct. The arguments of a .code tagbody form are a mixture of zero or more forms and .IR "go labels" .
The latter consist of those arguments which are symbols, integers or characters. Labels are not considered by .code tagbody and .code go to be forms, and are not subject to macro expansion or evaluation. The .code go macro is available inside .codn tagbody . It is erroneous for a .code go form to occur outside of a .codn tagbody . This situation is diagnosed by a global macro called .codn go , which unconditionally throws an error. In the absence of invocations of .code go or other control transfers, the .code tagbody macro evaluates each .meta form in left-to-right order. The .code go labels are ignored. After the last .meta form is evaluated, the .code tagbody form terminates, and yields .codn nil . Any .meta form itself, or else any of its subforms, may be the form
.mono
.meti (go << label )
.onom
where .meta label matches one of the .code go labels of a surrounding .codn tagbody . When this .code go form is evaluated, then the evaluation of .meta form is immediately abandoned, and control transfers to the specified label. The forms are then evaluated in left-to-right order starting with the form immediately after that label. If the label is not followed by any forms, then the .code tagbody terminates. If .meta label doesn't match any label in any surrounding .codn tagbody , the .code go form is erroneous. The abandonment of a .meta form by invocation of .code go is a dynamic transfer. All necessary unwinding inside .meta form takes place. The .code go labels are lexically scoped, but dynamically bound. Their scope being lexical means that the labels are not visible to forms which are not enclosed within the .codn tagbody , even if their evaluation is invoked from that .codn tagbody . The dynamic binding means that the labels of a .code tagbody form are established when it begins evaluating, and removed when that form terminates.
Once a label is removed, it is not available to be the target of a .code go control transfer, even if that .code go form has the label in its lexical scope. Such an attempted transfer is erroneous. It is permitted for .code tagbody forms to nest arbitrarily. The labels of an inner .code tagbody are not visible to an outer .codn tagbody . However, the reverse is true: a .code go form in an inner .code tagbody may branch to a label in an outer .codn tagbody , in which case the entire inner .code tagbody terminates. In cases where the same objects are used as labels by an inner and outer .codn tagbody , the inner labels shadow the outer labels. There is no restriction on what kinds of symbols may be labels. Symbols in the .code keyword package as well as the symbols .code t and .code nil are valid .code tagbody labels. .TP* "Dialect Note:" ANSI Common Lisp .code tagbody supports only symbols and integers as labels (which are called "go tags"); characters are not supported. .TP* Examples: .verb ;; print the numbers 1 to 10 (let ((i 0)) (tagbody (go skip) ;; forward goto skips 0 again (prinl i) skip (when (<= (inc i) 10) (go again)))) ;; Example of erroneous usage: by the time func is invoked ;; by (call func) the tagbody has already terminated. The ;; lambda body can still "see" the label, but it doesn't ;; have a binding. (let (func) (tagbody (set func (lambda () (go label))) (go out) label (prinl 'never-reached) out) (call func)) ;; Example of unwinding when the unwind-protect ;; form is abandoned by (go out). 
  ;; Output is:
  ;;   reached
  ;;   cleanup
  ;;   out
  (tagbody
    (unwind-protect
      (progn
        (prinl 'reached)
        (go out)
        (prinl 'notreached))
      (prinl 'cleanup))
   out
    (prinl 'out))
.brev
.coNP Macros @ prog and @ prog*
.synb
.mets (prog >> ({ sym | >> ( sym << init-form )}*)
.mets \ \ >> { body-form | << label }*)
.mets (prog* >> ({ sym | >> ( sym << init-form )}*)
.mets \ \ >> { body-form | << label }*)
.syne
.desc
The .code prog and .code prog* macros combine the features of .code let and .codn let* , respectively, with those of an anonymous block and .codn tagbody . The .code prog macro treats the .meta sym and .meta init-form expressions similarly to .codn let , establishing variable bindings in parallel. The .code prog* macro treats these expressions in a similar way to .codn let* . The forms enclosed are treated like the argument forms of the .code tagbody macro: labels are permitted, along with use of .codn go . Finally, an anonymous block is established around all of the enclosed forms (both the .metn init-form s and .metn body-form s) allowing the use of .code return to terminate evaluation with a value. The .code prog macro may be understood according to the following equivalence:
.verb
  (prog vars forms ...)  <-->  (block nil
                                 (let vars
                                   (tagbody forms ...)))
.brev
Likewise, the .code prog* macro follows an analogous equivalence, with .code let replaced by .codn let* .
.SS* Evaluation
.coNP Function @ eval
.synb
.mets (eval < form >> [ env <> [ menv ]])
.syne
.desc
The .code eval function treats the .meta form object as a Lisp expression, which is expanded and evaluated. The side effects implied by the form are performed, and the value which it produces is returned. The optional .meta env argument specifies an environment for resolving the function and variable references encountered in .metn form . If this argument is omitted, then evaluation takes place in the global environment. The optional .meta menv object specifies a macro environment for expanding macros encountered in .metn form .
If this argument is omitted, then .meta form may refer to only global macros. If both .meta menv and .meta env are specified, then .meta env takes precedence over .metn menv , behaving like a more nested scope. Definitions contained in .meta env shadow same-named definitions in .metn menv . The .meta form is not expanded all at once. Rather, it is treated by the following algorithm:
.RS
.IP 1.
First, if .meta form is a macro, it is macro-expanded as if by an application of the function .code macroexpand (with a suitable environment argument, calculated by a combination of .meta env and .metn menv ).
.IP 2.
If the resulting expanded form is a .codn progn , .codn compile-only , or .code eval-only form, then .code eval iterates over that form's argument expressions, passing each expression to a recursive call to .code eval using the same .metn env .
.IP 3.
Otherwise, if the expanded form isn't one of the above three kinds of expressions, it is subject to a full expansion and evaluation.
.RE
.IP
This algorithm allows a sequence of top-level forms to be combined into a single top-level form, even when the expansion of forms occurring later in the sequence depends on the evaluation effects of forms earlier in the sequence. For instance, a form like .code "(progn (defmacro foo ()) (foo))" may be processed with .codn eval , because the above algorithm ensures that the .code "(defmacro foo ())" expression is fully evaluated first, thereby providing the macro definition required by .codn "(foo)" . This expansion and evaluation order is important because the semantics of .code eval forms the reference model for how the .code load function processes top-level forms. Moreover, file compilation performs a similar treatment of top-level forms and incremental macro compilation. The result is that the behavior is consistent between source files and compiled files. See the sections Top-Level Forms and File Compilation Model.
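
As a minimal illustrative sketch of this ordering (the macro name
.code twice
is hypothetical, chosen only for this example), the following call relies on the
.code defmacro
form being fully evaluated before the
.code "(twice 21)"
form is expanded:

.verb
  ;; twice is a made-up macro used only for illustration
  (eval '(progn (defmacro twice (x) ^(* 2 ,x))
                (twice 21)))
  -> 42
.brev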
Note that, according to these rules, the constituent body forms of a .code macrolet or .code symacrolet top-level form are not individual top-level forms, even if the expansion of the construct combines the expanded versions of those forms with .codn progn . The form .code "(macrolet () (defmacro foo ()) (foo))" will therefore not work correctly. However, the specific problem in this situation can be resolved by rewriting .code foo as a .code macrolet macro: .codn "(macrolet ((foo ())) (foo))" . See also: the .code make-env function.
.coNP Function @ constantp
.synb
.mets (constantp < form <> [ env ])
.syne
.desc
The .code constantp function determines whether .meta form is a constant form, with respect to environment .metn env . If .meta env is absent, the global environment is used. The .meta env argument is used for fully expanding .meta form prior to analyzing. Currently, .code constantp returns true for any form which, after macro-expansion, is any of the following: a compound form with the symbol .code quote in its first position; a non-symbolic atom; or one of the symbols which evaluate to themselves and cannot be bound as variables. These symbols are the keyword symbols, and the symbols .code t and .codn nil . Additionally, .code constantp returns true for a compound form, or a DWIM form, whose symbol is a member of a large set of constant-foldable library functions, and whose arguments are, recursively, .code constantp expressions for the same environment. The arithmetic functions are members of this set. For all other inputs, .code constantp returns .codn nil . Note: some uses of .code constantp require manual expansion.
.TP* Examples: .verb (constantp nil) -> t (constantp t) -> t (constantp :key) -> t (constantp :) -> t (constantp 'a) -> nil (constantp 42) -> t (constantp '(+ 2 2 [* 3 (/ 4 4)])) -> t ;; symacrolet form expands to 42, which is constant (constantp '(symacrolet ((a 42)) a)) (defmacro cp (:env e arg) (constantp arg e)) ;; macro call (cp 'a) is replaced by t because ;; the symbol a expands to (+ 2 2) in the given environment, ;; and so (* a a) expands to (* (+ 2 2) (+ 2 2)) which is constantp. (symacrolet ((a (+ 2 2))) (cp '(* a a))) -> t .brev .coNP Function @ make-env .synb .mets (make-env >> [ var-bindings >> [ fun-bindings <> [ next-env ]]]) .syne .desc The .code make-env function creates an environment object suitable as the .code env parameter. The .meta var-bindings and .meta fun-bindings parameters, if specified, should be association lists, mapping symbols to objects. The objects in .meta fun-bindings should be functions, or objects callable as functions. The .meta next-env argument, if specified, should be an environment. Note: bindings can also be added to an environment using the .code env-vbind and .code env-fbind functions. .coNP Functions @ env-vbind and @ env-fbind .synb .mets (env-vbind < env < symbol << value ) .mets (env-fbind < env < symbol << value ) .syne .desc These functions bind a symbol to a value in either the function or variable space of environment .codn env . Values established in the function space should be functions or objects that can be used as functions such as lists, strings, arrays or hashes. If .meta symbol already exists in the environment, in the given space, then its value is updated with .codn value . If .meta env is specified as .codn nil , then the binding takes place in the global environment. 
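.TP* Example:

The following is a minimal sketch rather than a definitive usage pattern; the variable names
.codn x ,
.code y
and
.code z
are hypothetical. It assumes that an environment built by
.code make-env
is passed as the
.meta env
argument of
.codn eval ,
so that the free variables of the evaluated form resolve in that environment:

.verb
  ;; construct an environment with two variable bindings
  (let ((e (make-env '((x . 3) (y . 4)))))
    ;; add a third binding to the variable space of e
    (env-vbind e 'z 5)
    ;; evaluate a form whose variables are looked up in e
    (eval '(+ x y z) e))
.brev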
.coNP Functions @, env-vbindings @ env-fbindings and @ env-next
.synb
.mets (env-vbindings << env )
.mets (env-fbindings << env )
.mets (env-next << env )
.syne
.desc
These functions retrieve the components of .metn env , which must be an environment. The .code env-vbindings function retrieves the association list representing variable bindings. Similarly, the .code env-fbindings function retrieves the association list of function bindings. The .code env-next function retrieves the next environment, if .meta env has one, otherwise .codn nil . If .code e is an environment constructed by the expression .codn "(make-env v f n)" , then .code "(env-vbindings e)" retrieves .codn v , .code "(env-fbindings e)" retrieves .code f and .code "(env-next e)" returns .codn n .
.SS* Global Environment
.coNP Accessors @, symbol-function @ symbol-macro and @ symbol-value
.synb
.mets (symbol-function >> { symbol | < method-name | << lambda-expr })
.mets (symbol-macro << symbol )
.mets (symbol-value << symbol )
.mets (set (symbol-function >> { symbol | << method-name }) << new-value )
.mets (set (symbol-macro << symbol ) << new-value )
.mets (set (symbol-value << symbol ) << new-value )
.syne
.desc
If given a .meta symbol argument, the .code symbol-function function retrieves the value of the global function binding of the given .meta symbol if it has one: that is, the function object bound to the .metn symbol . If .meta symbol has no global function binding, then .code nil is returned. The .code symbol-function function supports method names of the form
.mono
.meti (meth < struct << slot )
.onom
where .meta struct names a struct type, and .meta slot is either a static slot or one of the keyword symbols .code :init or .code :postinit which refer to special functions associated with a structure type. Names in this format are returned by the .code func-get-name function. The .code symbol-function function also supports names of the form
.mono
.meti (macro << name )
.onom
which denote macros.
Thus, .code symbol-function provides unified access to functions, methods and macros. If a .code lambda expression is passed to .codn symbol-function , then the expression is macro-expanded and if that is successful, the function implied by that expression is returned. It is unspecified whether this function is interpreted or compiled. The .code symbol-macro function retrieves the value of the global macro binding of .meta symbol if it has one. Note: the name of this function has nothing to do with symbol macros; it is named for consistency with .code symbol-function and .codn symbol-value , referring to the "macro-expander binding of the symbol cell". The value of a macro binding is a function object. Intrinsic macros are C functions in the \*(TX kernel, which receive the entire macro call form and macro environment, performing their own destructuring. Currently, macros written in \*(TL are represented as curried C functions which carry the following list object in their environment cell: .mono .mets (# < macro-parameter-list << body-form *) .onom Local macros created by .code macrolet have .code nil in place of the environment object. This representation is likely to change or expand to include other forms in future \*(TX versions. The .code symbol-value function retrieves the value stored in the dynamic binding of .meta symbol that is apparent in the current context. If the variable has no dynamic binding, then .code symbol-value retrieves its value in the global environment. If .meta symbol has no variable binding, but is defined as a global symbol macro, then the value of that symbol macro binding is retrieved. The value of a symbol macro binding is simply the replacement form. Rather than throwing an exception, each of these functions returns .code nil if the argument symbol doesn't have the binding in the respective namespace or namespaces which that function searches. 
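
As a minimal sketch of the retrieval behavior described above (the names
.codn double-it ,
.code *limit*
and
.code no-such-fun
are hypothetical and serve only as illustration):

.verb
  ;; hypothetical global function and variable
  (defun double-it (x) (* 2 x))
  (defvar *limit* 10)

  ;; the function object is retrieved and then called
  (call (symbol-function 'double-it) 21) -> 42

  ;; the value of the global variable binding
  (symbol-value '*limit*) -> 10

  ;; no function binding exists: nil, not an exception
  (symbol-function 'no-such-fun) -> nil
.brev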
A .codn symbol-function , .codn symbol-macro , or .code symbol-value form denotes a place, if .meta symbol has a binding of the respective kind. This place may be assigned to or deleted. Assignment to the place causes the denoted binding to have a new value. Deleting a place with the .code del macro removes the binding, and returns the previous contents of that binding. A binding denoted by a .code symbol-function form is removed using .codn fmakunbound , one denoted by .code symbol-macro is removed using .code mmakunbound and a binding denoted by .code symbol-value is removed using .codn makunbound . Deleting a method via .code symbol-function is not possible; an attempt to do so has no effect. Storing a value, using any one of these three accessors, to a nonexistent variable, function or macro binding, is not erroneous. It has the effect of creating that binding. Using the .code symbol-function accessor to assign to a lambda expression is erroneous. Deleting a binding, using any of these three accessors, when the binding does not exist, also isn't erroneous. There is no effect and the .code del operator yields .code nil as the prior value, consistent with the behavior when accessors are used to retrieve a nonexistent value.
.TP* "Dialect Note:"
In ANSI Common Lisp, the .code symbol-function function retrieves a function, macro or special operator binding of a symbol. These are all in one space and may not coexist. In \*(TL, it retrieves a symbol's function binding only. Common Lisp has an accessor named .code macro-function similar to .codn symbol-macro .
.coNP Functions @, boundp @ fboundp and @ mboundp
.synb
.mets (boundp << symbol )
.mets (fboundp >> { symbol | < method-name | << lambda-expr })
.mets (mboundp << symbol )
.syne
.desc
The .code boundp function returns .code t if the .meta symbol is bound as a variable or symbol macro in the global environment, otherwise .codn nil .
.code fboundp returns .code t if the .meta symbol has a function binding in the global environment, the method specified by .meta method-name exists, or a lambda expression argument is given. Otherwise it returns .codn nil . .code mboundp returns .code t if the symbol has an operator macro binding in the global environment, otherwise .codn nil . .TP* "Dialect Notes:" The .code boundp function in ANSI Common Lisp doesn't report that global symbol macros have a binding. They are not considered bindings. In \*(TL, they are considered bindings. The ANSI Common Lisp .code fboundp yields true if its argument has a function, macro or operator binding, whereas the \*(TL .code fboundp does not consider operators or macros. The ANSI CL .code fboundp does not yield true for lambda expressions. Behavior similar to the Common Lisp expression .code "(fboundp x)" in Common Lisp can be obtained in \*(TL using the .verb (or (fboundp x) (mboundp x) (special-operator-p x)) .brev expression, except that this will also yield true when .code x is a lambda expression. The .code mboundp function doesn't exist in ANSI Common Lisp. .coNP Function @ makunbound .synb .mets (makunbound << symbol ) .syne .desc The function .code makunbound removes the binding of .meta symbol from either the dynamic environment or the global symbol macro environment. After the call to .codn makunbound , .meta symbol appears to be unbound. If the .code makunbound call takes place in a scope in which there exists a dynamic rebinding of .metn symbol , the information for restoring the previous binding is not affected by .codn makunbound . When that scope terminates, the previous binding will be restored. If the .code makunbound call takes place in a scope in which the dynamic binding for .code symbol is the global binding, then the global binding is removed. When the global binding is removed, then if .meta symbol was previously marked as special (for instance by .codn defvar ) this marking is removed. 
Otherwise if .meta symbol has a global symbol macro binding, that binding is removed. If .meta symbol has no apparent dynamic binding, and no global symbol macro binding, .code makunbound does nothing. In all cases, .code makunbound returns .metn symbol . .TP* "Dialect Note:" The behavior of .code makunbound differs from its counterpart in ANSI Common Lisp. The .code makunbound function in Common Lisp only removes a value from a dynamic variable. The dynamic variable does not cease to exist, it only ceases to have a value (because a binding is a value). In \*(TL, the variable ceases to exist. The binding of a variable isn't its value, it is the variable itself: the association between a name and an abstract storage location, in some environment. If the binding is undone, the variable disappears. The .code makunbound function in Common Lisp does not remove global symbol macros, which are not considered to be bindings in the variable namespace. That is to say, the Common Lisp .code boundp does not report true for symbol macros. The Common Lisp .code makunbound also doesn't remove the special attribute from a symbol. If a variable is introduced with .code defvar and then removed with .codn makunbound , the symbol continues to exhibit dynamic binding rather than lexical in subsequent scopes. In \*(TL, if a global binding is removed, so is the special attribute. .coNP Functions @ fmakunbound and @ mmakunbound .synb .mets (fmakunbound << symbol ) .mets (mmakunbound << symbol ) .syne .desc The function .code fmakunbound removes any binding for .meta symbol from the function namespace of the global environment. If .meta symbol has no such binding, it does nothing. In either case, it returns .metn symbol . The function .code mmakunbound removes any binding for .meta symbol from the operator macro namespace of the global environment. If .meta symbol has no such binding, it does nothing. In either case, it returns .metn symbol . 
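.TP* Examples:

The following minimal sketch illustrates the behavior described above. The names
.code tmp-fun
and
.code tmp-mac
are hypothetical definitions introduced only for this illustration:

.verb
  ;; a throwaway global function binding
  (defun tmp-fun () 42)
  (fboundp 'tmp-fun) -> t
  (fmakunbound 'tmp-fun) -> tmp-fun
  (fboundp 'tmp-fun) -> nil

  ;; a throwaway global macro binding
  (defmacro tmp-mac () 42)
  (mboundp 'tmp-mac) -> t
  (mmakunbound 'tmp-mac) -> tmp-mac
  (mboundp 'tmp-mac) -> nil
.brev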
.TP* "Dialect Note:" The behavior of .code fmakunbound differs from its counterpart in ANSI Common Lisp. The .code fmakunbound function in Common Lisp removes a function or macro binding, which do not coexist. The .code mmakunbound function doesn't exist in Common Lisp. .coNP Function @ func-get-form .synb .mets (func-get-form << func ) .syne .desc The .code func-get-form function retrieves a source code form of .metn func , which must be an interpreted function. The source code form has the syntax .mono .meti >> ( name < arglist << body-form *) . .onom .coNP Function @ func-get-name .synb .mets (func-get-name < func <> [ env ]) .syne .desc The .code func-get-name tries to resolve the function object .meta func to a name. If that is not possible, it returns .codn nil . The resolution is performed by an exhaustive search through up to three spaces. If an environment is specified by .metn env , then this is searched first. If a binding is found in that environment which resolves to the function, then the search terminates and the binding's symbol is returned as the function's name. If the search through environment .meta env fails, or if that argument is not specified, then the global environment is searched for a function binding which resolves to .metn func . If such a binding is found, then the search terminates, and the binding's symbol is returned. If two or more symbols in the global environment resolve to the function, it is not specified which one is returned. If the global function environment search fails, then the function is considered as a possible macro. The global macro environment is searched for a macro binding whose expander function is .metn func , similarly to the way the function environment was searched. If a binding is found, then the syntax .mono .meti (macro << name ) .onom is returned, where .meta name is the name of the global macro binding that was found which resolves to .metn func . 
If two or more global macro bindings share .metn func , it is not specified which of those bindings provides .metn name . If the global macro search fails, then .meta func is considered as a possible method. The static slot space of all struct types is searched for a slot which contains .metn func . If such a slot is found, then the method name is returned, consisting of the syntax .mono .meti (meth < type << name ) .onom where .meta type is a symbol denoting the struct type and .meta name is the static slot of the struct type which holds .metn func . A check is also performed whether .meta func might be equal to one of the two special functions of a structure type: its .meta initfun or .metn postinitfun , in which case it is returned as either the .mono .meti (meth < type :init) .onom or the .mono .meti (meth < type :postinit) .onom syntax. If .meta func is an interpreted function not found under any name, then a lambda expression denoting that function is returned in the syntax .mono .meti (lambda < args << form *) .onom If .meta func cannot be identified as a function, then .code nil is returned. .coNP Function @ func-get-env .synb .mets (func-get-env << func ) .syne .desc The .code func-get-env function retrieves the environment object associated with function .metn func . The environment object holds the captured bindings of a lexical closure. .coNP Functions @ fun-fixparam-count and @ fun-optparam-count .synb .mets (fun-fixparam-count << func ) .mets (fun-optparam-count << func ) .syne .desc The .code fun-fixparam-count reports .metn func 's number of fixed parameters. The fixed parameters consist of the required parameters and the optional parameters. Variadic functions have a parameter which captures the remaining arguments which are in excess of the fixed parameters. That parameter is not considered a fixed parameter and therefore doesn't contribute to this count. The .code fun-optparam-count reports .metn func 's number of optional parameters. 
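
For instance, as a minimal sketch using a hypothetical function
.code demo-fn
which has two required parameters, one optional parameter, and a trailing parameter which captures excess arguments:

.verb
  ;; demo-fn is a made-up function used only for illustration
  (defun demo-fn (a b : c . rest)
    (list a b c rest))

  ;; required plus optional parameters; rest is not counted
  (fun-fixparam-count (fun demo-fn)) -> 3

  ;; only c is optional
  (fun-optparam-count (fun demo-fn)) -> 1
.brev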
The .meta func argument must be a function. Note: if a function isn't variadic (see the .code fun-variadic function) then the value reported by .code fun-fixparam-count represents the maximum number of arguments which can be passed to the function. The minimum number of required arguments can be calculated for any function by subtracting the value of .code fun-optparam-count from the value of .codn fun-fixparam-count .
.coNP Function @ fun-variadic
.synb
.mets (fun-variadic << func )
.syne
.desc
The .code fun-variadic function returns .code t if .meta func is a variadic function, otherwise .codn nil . The .meta func argument must be a function.
.coNP Function @ interp-fun-p
.synb
.mets (interp-fun-p << obj )
.syne
.desc
The .code interp-fun-p function returns .code t if .meta obj is an interpreted function, otherwise it returns .codn nil .
.coNP Function @ vm-fun-p
.synb
.mets (vm-fun-p << obj )
.syne
.desc
The .code vm-fun-p function returns .code t if .meta obj is a function compiled for the virtual machine: a function representation produced by means of the functions .codn compile-file , .code compile-toplevel or .codn compile . If .meta obj is of any other type, the function returns .codn nil .
.coNP Function @ special-var-p
.synb
.mets (special-var-p << obj )
.syne
.desc
The .code special-var-p function returns .code t if .meta obj is a symbol marked for special variable binding, otherwise it returns .codn nil . Symbols are marked special by .code defvar and .codn defparm .
.coNP Function @ special-operator-p
.synb
.mets (special-operator-p << obj )
.syne
.desc
The .code special-operator-p function returns .code t if .meta obj is a symbol which names a special operator, otherwise it returns .codn nil .
.coNP Symbol Macro @ %fun%
.desc
The symbol macro .code %fun% indicates the current function name. There is a global .code %fun% symbol macro which expands to .codn nil .
Around certain kinds of named functions, a local binding for .code %fun% is established which provides the function name. The purpose of this name is for use in diagnostic messages; therefore it is an abbreviated name. The .code %fun% macro is established for .codn defun , .code defmacro and .code defmeth forms. It is also established for methods defined inside a .code defstruct form including the methods .codn :init , .codn :postinit , .code :fini and .codn :postfini . The .code %fun% macro is visible not only to its function's body, but also to the expressions inside the parameter list which compute the default values for optional parameters. The name provided by .code %fun% is intended for use in diagnostic messages and is therefore an informal name, and not the formal name which can be passed to .code symbol-function to retrieve the function. In the case of a .code defun function named .codn x , the .code %fun% name is that symbol, .codn x . Thus, in this case, the name is the same as the formal name. In the case of a .code defmacro named .codn x , .code %fun% also expands to the symbol .codn x , but that is not the formal name of the macro, which is .codn "(macro x)" . In the case of a method .code x of a structure type .codn s , .code %fun% is the two-element list .codn "(s x)" , rather than the formal name .codn "(meth s x)" . .TP* Example:
.verb
  ;; log a message naming the function
  (defun connect-to-host (addr)
    (format t "~s: connecting to host ~s" %fun% addr))
.brev
.SS* Object Type In \*(TL, objects obey the following type hierarchy. In this type hierarchy, the internal nodes denote abstract types: no object is an instance of an abstract type.
Nodes in square brackets indicate an internal structure in the type graph, invisible to programs, and angle brackets indicate a plurality of types which are not listed by name:
.verb
  t ----+--- [cobj types] ---+--- hash
        |                    |
        |                    +--- hash-iter
        |                    |
        |                    +--- stream
        |                    |
        |                    +--- random-state
        |                    |
        |                    +--- regex
        |                    |
        |                    +--- buf
        |                    |
        |                    +--- tree
        |                    |
        |                    +--- tree-iter
        |                    |
        |                    +--- seq-iter
        |                    |
        |                    +--- cptr
        |                    |
        |                    +--- dir
        |                    |
        |                    +--- struct-type
        |                    |
        |                    +--- <structures>
        |                    |
        |                    +--- ... others
        |
        +--- sequence ---+--- string ---+--- str
        |                |              |
        |                |              +--- lstr
        |                |              |
        |                |              +--- lit
        |                |
        |                +--- list ---+--- null
        |                |            |
        |                |            +--- cons
        |                |            |
        |                |            +--- lcons
        |                |
        |                +--- vec
        |                |
        |                +--- <structures with car or length methods>
        |
        +--- number ---+--- float
        |              |
        |              +--- integer ---+--- fixnum
        |                              |
        |                              +--- bignum
        |
        +--- chr
        |
        +--- sym
        |
        +--- env
        |
        +--- range
        |
        +--- tnode
        |
        +--- pkg
        |
        +--- fun
        |
        +--- args
.brev
In addition to the above hierarchy, the following relationships also exist:
.verb
  t ---+--- atom --- <any type other than cons> --- nil
       |
       +--- cons ---+--- lcons --- nil
                    |
                    +--- nil

  sym --- null

  struct ---- <all structures>
.brev
That is to say, the types are exhaustively partitioned into atoms and conses; an object is either a .code cons or else it isn't, in which case it is the abstract type .codn atom . The .code cons type is odd in that it is both an abstract type, serving as a supertype for the type .code lcons and it is also a concrete type in that regular conses are of this type. The type .code nil is an abstract type which is empty. That is to say, no object is of type .codn nil . This type is considered the abstract subtype of every other type, including itself. The type .code nil is not to be confused with the type .code null which is the type of the .code nil symbol. Because the type of .code nil is the type .code null and .code nil is also a symbol, the .code null type is a subtype of .codn sym . Lastly, the symbol .code struct serves as the supertype of all structures.
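.TP* Example:

The following is a brief illustrative sketch which queries a few of the relationships shown in the above diagrams, using the
.code typeof
and
.code subtypep
functions described below. The results shown are those implied by the diagrams and the accompanying text.

.verb
  (typeof nil) -> null
  (subtypep 'null 'sym) -> t
  (subtypep 'null 'list) -> t
  (subtypep 'lcons 'cons) -> t
  (subtypep 'cons 'atom) -> nil
  (subtypep 'fixnum 'integer) -> t
  (subtypep 'integer 'number) -> t
  (subtypep 'str 'sequence) -> t
.brev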
.coNP Function @ typeof .synb .mets (typeof << value ) .syne .desc The .code typeof function returns a symbol representing the type of .metn value . The core types are identified by the following symbols: .coIP cons Cons cell. .coIP str String. .coIP lit Literal string embedded in the \*(TX executable image. .coIP chr Character. .coIP fixnum Fixnum integer: an integer that fits into the value word, not having to be heap-allocated. .coIP bignum A bignum integer: arbitrary precision integer that is heap-allocated. .coIP float Floating-point number. .coIP sym Symbol. .coIP pkg Symbol package. .coIP fun Function. .coIP vec Vector. .coIP lcons Lazy cons. .coIP range Range object. .coIP lstr Lazy string. .coIP env Function/variable binding environment. .coIP hash Hash table. .coIP stream I/O stream of any kind. .coIP regex Regular-expression object. .coIP struct-type A structure type: the type of any one of the values which represents a structure type. .coIP tnode Binary search tree node. .coIP tree Binary search tree. .coIP args Function argument list represented as an object. .PP There are more kinds of objects, such as user-defined structures. .coNP Function @ subtypep .synb .mets (subtypep < left-type << right-type ) .syne .desc The .code subtypep function tests whether .meta left-type and .meta right-type name a pair of types, such that the left type is a subtype of the right type. The arguments are either type symbols, or structure type objects, as returned by the .code find-struct-type function. Thus, the symbol .codn time , which is the name of a predefined struct type, and the object returned by .code "(find-struct-type 'time)" are considered equivalent argument values. If either argument doesn't name a type, the behavior is unspecified. Each type is a subtype of itself. Most other type relationships can be inferred from the type hierarchy diagrams given in the introduction to this section. In addition, there are inheritance relationships among structures. 
If .meta left-type and .meta right-type are both structure types, then .code subtypep yields true if the types are the same struct type, or if the right type is a direct or indirect supertype of the left. The type symbol .code struct is a supertype of all structure types. .coNP Function @ typep .synb .mets (typep < object << type-symbol ) .syne .desc The .code typep function tests whether the type of .meta object is a subtype of the type named by .metn type-symbol . The following equivalence holds: .verb (typep a b) --> (subtypep (typeof a) b) .brev .coNP Macro @ typecase .synb .mets (typecase < test-form >> {( type-sym << clause-form *)}*) .syne .desc The .code typecase macro evaluates .meta test-form and then successively tests its type against each clause. Each clause consists of a type symbol .meta type-sym and zero or more .metn clause-form s. The first clause whose .meta type-sym is a supertype of the type of .metn test-form 's value is considered to be the matching clause. That clause's .metn clause-form s are evaluated, and the value of the last form is returned. If there is no matching clause, or there are no clauses present, or the matching clause has no .metn clause-form s, then .code nil is returned. Note: since .code t is the supertype of every type, a clause whose .meta type-sym is the symbol .code t always matches. If such a clause is placed as the last clause of a .codn typecase , it provides a fallback case, whose forms are evaluated if none of the previous clauses match. .coNP Macro @ etypecase .synb .mets (etypecase < test-form >> {( type-sym << clause-form *)}*) .syne .desc The .code etypecase macro is the error-catching variant of .codn typecase , similar to the relationship between the .code ecaseq and .code caseq families of macros. If one of the clauses has a .meta type-sym which is the symbol .codn t , then .code etypecase is precisely equivalent to .codn typecase . 
Otherwise, a clause with a .meta type-sym of .code t which throws an exception of type .codn case-error , derived from .codn error , is appended to the existing clauses, after which the semantics follows that of .codn typecase . .coNP Function @ built-in-type-p .synb .mets (built-in-type-p << object ) .syne .desc The .code built-in-type-p function returns .code t if .meta object is a symbol which is the name of a built-in type. For all other objects it returns .codn nil . .SS* Object Equivalence .coNP Functions @, identity @ identity* and @ use .synb .mets (identity << value ) .mets (identity* << value *) .mets (use << value ) .syne .desc The .code identity function returns its argument. If the .code identity* function is given at least one argument, then it returns its leftmost argument, otherwise it returns .codn nil . The .code use function is a synonym of .codn identity . .TP* Notes: The .code identity function is useful as a functional argument, when a transformation function is required, but no transformation is actually desired. In this role, the .code use synonym leads to readable code. For instance:
.verb
  ;; construct a function which returns its integer argument
  ;; if it is odd, otherwise it returns its successor.
  ;; "If it's odd, use it, otherwise take its successor".

  [iff oddp use succ]

  ;; Applications of the function:

  [[iff oddp use succ] 3] -> 3  ;; use applied to 3

  [[iff oddp use succ] 2] -> 3  ;; succ applied to 2
.brev
.coNP Functions @, null @ not and @ false .synb .mets (null << value ) .mets (not << value ) .mets (false << value ) .syne .desc The .codn null , .code not and .code false functions are synonyms. They test whether .meta value is the object .codn nil . They return .code t if this is the case, .code nil otherwise.
.TP* Examples:
.verb
  (null '()) -> t
  (null nil) -> t
  (null ()) -> t
  (false t) -> nil

  (if (null x) (format t "x is nil!"))

  (let ((list '(b c d)))
    (if (not (memq 'a list))
      (format t "list ~s does not contain the symbol a\en" list)))
.brev
.coNP Functions @ true and @ have .synb .mets (true << value ) .mets (have << value ) .syne .desc The .code true function is the complement of the .codn null , .code not and .code false functions. The .code have function is a synonym for .codn true . It returns .code t if the .meta value is any object other than .codn nil . If .meta value is .codn nil , it returns .codn nil . Note: programs should avoid explicitly testing values with .codn true . For instance .code "(if x ...)" should be favored over .codn "(if (true x) ...)" . However, the latter is useful with the .code ifa macro because .mono .meti (ifa (true << expr ) ...) .onom binds the .code it variable to the value of .metn expr , no matter what kind of form .meta expr is, which is not true in the .mono .meti (ifa < expr ...) .onom form. .TP* Example:
.verb
  ;; Compute indices where the list '(1 nil 2 nil 3)
  ;; has true values:
  [where '(1 nil 2 nil 3) true] -> (0 2 4)
.brev
.coNP Functions @, eq @ eql and @ equal .synb .mets (eq < left-obj << right-obj ) .mets (eql < left-obj << right-obj ) .mets (equal < left-obj << right-obj ) .syne .desc The principal equality test functions .codn eq , .code eql and .code equal test whether two objects are equivalent, using different criteria. They return .code t if the objects are equivalent, and .code nil otherwise. The .code eq function uses the strictest equivalence test, called implementation equality. The .code eq function returns .code t if and only if .meta left-obj and .meta right-obj are the same object. Two character values are .code eq if they are the same character, and two fixnum integers are .code eq if they have the same value. Whether two identical floating-point values are always .code eq depends on how \*(TX has been built.
On 64 bit systems, \*(TX is usually built to support unboxed floating-point numbers, which may be reliably compared with .codn eq . On 32 bit targets, floating-point values are pointers to heap-allocated values, and so two identical values might not be .codn eq . Note that even in a 64 bit build of \*(TX, the build configuration can override the selection so that floating-point values are heap-allocated. All other object representations are pointers to heap-allocated objects. Two such values are .code eq if and only if they point to the same object in memory. So, for instance, two bignum integers might not be .code eq even if they have the same numeric value, two lists might not be .code eq even if all their corresponding elements are .code eq and two strings might not be .code eq even if they hold identical text. The .code eql function is less strict than .codn eq . The difference between .code eql and .code eq is that if .meta left-obj and .meta right-obj are numbers which are of the same kind and have the same numeric value, .code eql returns .codn t , even if they are different objects. Note that an integer and a floating-point number are not .code eql even if one has a value which converts to the other: thus, .code "(eql 0.0 0)" yields .codn nil ; a comparison expression which finds these numbers equal is .codn "(= 0.0 0)" . The .code eql function also specially treats range objects. Two distinct range objects are .code eql if their corresponding .meta from and .meta to fields are .codn eql . For all other object types, .code eql behaves like .codn eq . The .code equal function is less strict still than .codn eql . In general, it recurses into some kinds of aggregate objects to perform a structural equivalence check. For struct types, it also supports customization via equality substitution. See the Equality Substitution section under Structures.
Firstly, if .meta left-obj and .meta right-obj are .code eql then they are also .codn equal , though the converse isn't necessarily the case. If two objects are both cons cells, then they are equal if their .code car fields are .code equal and their .code cdr fields are .codn equal . If two objects are vectors, they are .code equal if they have the same length, and their corresponding elements are .codn equal . If two objects are strings, they are .code equal if they are textually identical. If two objects are functions, they are .code equal if they have .code equal environments, and if they have the same code. Two compiled functions are considered to have the same code if and only if they are pointers to the same function. Two interpreted functions are considered to have the same code if their list structure is .codn equal . Two hashes are .code equal if they use the same equality (both are .codn :equal-based , or both are .code :eql-based or else both are .codn :eq-based ), if their associated user data elements are equal (see the function .codn hash-userdata ), if their sets of keys are identical, and if the data items associated with corresponding keys from each respective hash are .code equal objects. Two ranges are .code equal if their corresponding .meta to and .meta from fields are equal. For some aggregate objects, there is no special semantics. Two arguments which are symbols, packages, or streams are .code equal if and only if they are the same object. Certain object types have a custom .code equal function. .coNP Functions @, neq @ neql and @ nequal .synb .mets (neq < left-obj << right-obj ) .mets (neql < left-obj << right-obj ) .mets (nequal < left-obj << right-obj ) .syne .desc The functions .codn neq , .code neql and .code nequal are logically negated counterparts of, respectively, .codn eq , .code eql and .codn equal . If .code eq returns .code t for a given pair of arguments .meta left-obj and .metn right-obj , then .code neq returns .codn nil . 
Vice versa, if .code eq returns .codn nil , .code neq returns .codn t . The same relationship exists between .code eql and .codn neql , and between .code equal and .codn nequal . .coNP Functions @, meq @ meql and @ mequal .synb .mets (meq < left-obj << right-obj *) .mets (meql < left-obj << right-obj *) .mets (mequal < left-obj << right-obj *) .syne .desc The functions .codn meq , .code meql and .code mequal ("member equal" or "multi-equal") provide a particular kind of a generalization of the binary equality functions .codn eq , .code eql and .code equal to multiple arguments. The .meta left-obj value is compared to each .meta right-obj value using the corresponding binary equality function. If a match occurs, then .code t is returned, otherwise .codn nil . The traversal of the .meta right-obj argument values proceeds from left to right, and stops when a match is found. .coNP Function @ less .synb .mets (less < left-obj << right-obj ) .mets (less < obj << obj *) .syne .desc The .code less function, when called with two arguments, determines whether .meta left-obj compares less than .meta right-obj in a generic way which handles arguments of various types. The argument syntax of .code less is generalized. It can accept one argument, in which case it unconditionally returns .code t regardless of that argument's value. If more than two arguments are given, then .code less generalizes in a way which can be described by the following equivalence pattern, with the understanding that each argument expression is evaluated exactly once:
.verb
  (less a b c) <--> (and (less a b) (less b c))
  (less a b c d) <--> (and (less a b) (less b c) (less c d))
.brev
The .code less function is used as the default for the .meta lessfun argument of the functions .code sort and .codn merge , as well as the .meta testfun argument of the .code pos-min and .code find-min functions.
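.TP* Example:

As a brief sketch of the generalized argument syntax, and of the role of
.code less
as the default comparison in
.codn sort ,
consider the following expressions. The integer values are arbitrary and chosen only for illustration.

.verb
  (less 1) -> t
  (less 1 2 3) -> t
  (less 1 3 2) -> nil

  ;; no lessfun argument given: less is used
  (sort (list 3 1 2)) -> (1 2 3)
.brev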
The .code less function is capable of comparing numbers, characters, symbols, strings, as well as lists and vectors of these. It can also compare buffers. If both arguments are the same object so that .mono .meti (eq < left-obj << right-obj ) .onom holds true, then the function returns .code nil regardless of the type of .metn left-obj , even if the function doesn't handle comparing different instances of that type. In other words, no object is less than itself, no matter what it is. The .code less function pairs with the .code equal function. If values .code a and .code b are objects which are of suitable types for the .code less function, then exactly one of the following three expressions must be true: .codn "(equal a b)" , .code "(less a b)" or .codn "(less b a)" . The .code less relation is: antisymmetric, such that if .code "(less a b)" is true, then .code "(less b a)" is false; irreflexive, such that .code "(less a a)" is false; and transitive, such that .code "(less a b)" and .code "(less b c)" imply .codn "(less a c)" . The following are detailed criteria that .code less applies to arguments of different types and combinations thereof. If both arguments are numbers or characters, they are compared as if using the .code < function. If both arguments are strings, they are compared as if using the .code string-lt function. If both arguments are symbols, the following rules apply. If the symbols have names which are different, then the result is that of their names being compared by the .code string-lt function. If .code less is passed symbols which have the same name, and neither of these symbols has a home package, then the raw bit patterns of their values are compared as integers: effectively, the object with the lower machine address is considered lesser than the other. If only one of the two same-named symbols has no home package, then if that symbol is the left argument, .code less returns .codn t , otherwise .codn nil .
If both same-named symbols have home packages, then the result of .code less is that of .code string-lt applied to the names of their respective packages. Thus .code a:foo is less than .codn z:foo . If both arguments are conses, then they are compared as follows: .RS .IP 1. The .code less function is recursively applied to the .code car fields of both arguments. If it yields true, then .meta left-obj is deemed to be less than .metn right-obj . .IP 2. Otherwise, if the .code car fields are unequal under the .code equal function, .code less returns .codn nil . .IP 3. If the .code car fields are .code equal then .code less is recursively applied to the .code cdr fields of the arguments, and the result of that comparison is returned. .RE .IP This logic performs a lexicographic comparison on ordinary lists such that for instance .code "(1 1)" is less than .code "(1 1 1)" but not less than .code "(1 0)" or .codn (1) . Note that the empty list .code nil compared to a cons is handled by type-based precedence, described below. Two vectors are compared by .code less lexicographically, similarly to strings. Corresponding elements, starting with element 0, of the vectors are compared until an index position is found where corresponding elements of the two vectors are not .codn equal . If this differing position is beyond the end of one of the two vectors, then the shorter vector is considered to be lesser. Otherwise, the result of .code less is the outcome of comparing those differing elements themselves with .codn less . Two buffers are also compared by .code less lexicographically, as if they were vectors of integer byte values. Two ranges are compared by .code less using lexicographic logic similar to conses and vectors. The .code from fields of the ranges are first compared. If they are not .codn equal , then .code less is applied to those fields and the result is returned.
If the .code from fields are .codn equal , then .code less is applied to the .code to fields and that result is returned. If the two arguments are of the above types, but of different types from each other, then .code less resolves the situation based on the following precedence: numbers and characters are less than ranges, which are less than strings, which are less than symbols, which are less than conses, which are less than vectors, which are less than buffers. Note that since .code nil is a symbol, it is ranked lower than a cons. This interpretation ensures correct behavior when .code nil is regarded as an empty list, since the empty list is lexicographically prior to a nonempty list. If either argument is a structure for which the .code equal method is defined, the method is invoked on that argument, and the value returned is used in place of that argument for performing the comparison. Structures with no .code equal method cannot participate in a comparison, resulting in an error. See the Equality Substitution section under Structures. Finally, if either of the arguments has a type other than the above types, the situation is an error. .coNP Function @ greater .synb .mets (greater < left-obj << right-obj ) .mets (greater < obj << obj *) .syne .desc The .code greater function is equivalent to .code less with the arguments reversed. That is to say, the following equivalences hold: .verb (greater a) <--> (less a) <--> t (greater a b) <--> (less b a) (greater a b c ...) <--> (less ... c b a) .brev The .code greater function is used as the default for the .meta testfun argument of the .code pos-max and .code find-max functions. 
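.TP* Examples:

The following expressions are an illustrative sketch of the comparison rules described for
.code less
and
.codn greater .
The results follow from the lexicographic and type-based precedence rules given under the
.code less
function; the particular values are chosen only for illustration.

.verb
  ;; lexicographic comparison of lists
  (less '(1 1) '(1 1 1)) -> t
  (less '(1 1) '(1 0)) -> nil

  ;; nil is a symbol, ranked below a cons
  (less nil '(1)) -> t

  ;; type-based precedence
  (less 1 "a") -> t
  (less "a" 'a) -> t

  ;; greater is less with the arguments reversed
  (greater 3 2 1) -> t
  (greater 1 2) -> nil
.brev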
.coNP Functions @ lequal and @ gequal .synb .mets (lequal < obj << obj *) .mets (gequal < obj << obj *) .syne .desc The functions .code lequal and .code gequal are similar to .code less and .code greater respectively, but differ in the following respect: when called with two arguments which compare true under the .code equal function, the .code lequal and .code gequal functions return .codn t . When called with only one argument, both functions return .code t and both functions generalize to three or more arguments in the same way as do .code less and .codn greater . .coNP Function @ copy .synb .mets (copy << object ) .syne .desc The .code copy function duplicates objects of various supported types: sequences, hashes, structures and random states. If .meta object is .codn nil , it returns .codn nil . Otherwise, .code copy is equivalent to invoking a more specific copying function according to the type of the argument, as follows: .RS .coIP cons .mono .meti (copy-list << object ) .onom .coIP str .mono .meti (copy-str << object ) .onom .coIP vec .mono .meti (copy-vec << object ) .onom .coIP hash .mono .meti (copy-hash << object ) .onom .IP "struct type" If .meta object implements the special .code copy method, then that method is invoked and the return value of that method call is returned as the copy. 
Otherwise the object is copied as if by: .mono .meti (copy-struct << object ) .onom .coIP fun .mono .meti (copy-fun << object ) .onom .coIP buf .mono .meti (copy-buf << object ) .onom .coIP carray .mono .meti (copy-carray << object ) .onom .coIP random-state .mono .meti (make-random-state << object ) .onom .coIP tnode .mono .meti (copy-tnode << object ) .onom .coIP tree .mono .meti (copy-search-tree << object ) .onom .coIP hash-iter .mono .meti (copy-hash-iter << object ) .onom .coIP tree-iter .mono .meti (copy-tree-iter << object ) .onom .coIP cptr .mono .meti (copy-cptr << object ) .onom .coIP seq-iter .mono .meti (copy-iter << object ) .onom .coIP rng .mono .meti (vec-seq << object ) .onom .RE .IP For all other types of .metn object , the invocation is erroneous. Except in the case when .meta object is .codn nil , .code copy returns a value that is distinct from (not .code eq to) .metn object . When the object is a sequence, the elements of the returned sequence may be .code eq to elements of the original sequence. In other words, .code copy is not required to perform a deep copy. .SS* List Manipulation .coNP Function @ cons .synb .mets (cons < car-value << cdr-value ) .syne .desc The .code cons function allocates, initializes and returns a single cons cell. A cons cell has two fields called .code car and .codn cdr , which are accessed by functions of the same name, or by the functions .code first and .codn rest , which are synonyms for these. Lists are made up of conses. A (proper) list is either the symbol .code nil denoting an empty list, or a cons cell which holds the first item of the list in its .codn car , and the list of the remaining items in .codn cdr . The expression .code "(cons 1 nil)" allocates and returns a single cons cell which denotes the one-element list .codn (1) . The .code cdr is .codn nil , so there are no additional items. A cons cell whose .code cdr is an atom other than .code nil is printed with the dotted pair notation. 
For example the cell produced by .code "(cons 1 2)" is denoted .codn "(1 . 2)" . The notation .code "(1 . nil)" is perfectly valid as input, but the cell which it denotes will print back as .codn (1) . The notations are equivalent. The dotted pair notation can be used regardless of what type of object is the cons cell's .codn cdr , so that for instance .code "(a . (b c))" denotes the cons cell whose .code car is the symbol .code a and whose .code cdr is the list .codn "(b c)" . This is exactly the same thing as .codn "(a b c)" . In other words .code "(a b ... l m . (n o ... w . (x y z)))" is exactly the same as .codn "(a b ... l m n o ... w x y z)" . Every list, and more generally cons-cell tree structure, can be written in a "fully dotted" notation, such that there are as many dots as there are cells. For instance the cons structure of the nested list .code "(1 (2) (3 4 (5)))" can be made more explicit using .codn "(1 . ((2 . nil) . ((3 . (4 . ((5 . nil) . nil))) . nil)))" . The structure contains eight conses, and so there are eight dots in the fully dotted notation. The number of conses in a linear list like .code "(1 2 3)" is simply the number of items, so that list in particular is made of three conses. Additional nestings require additional conses, so for instance .code "(1 2 (3))" requires four conses. A visual way to count the conses from the printed representation is to count the atoms, then add the count of open parentheses, and finally subtract one. A list terminated by an atom other than .code nil is called an improper list, and the dot notation is extended to cover improper lists. For instance .code "(1 2 . 3)" is an improper list of two elements, terminated by .codn 3 , and can be constructed using .codn "(cons 1 (cons 2 3))" . The fully dotted notation for this list is .codn "(1 . (2 . 3))" . .coNP Function @ atom .synb .mets (atom << value ) .syne .desc The .code atom function tests whether .meta value is an atom.
It returns .code t if this is the case, .code nil otherwise. All values which are not cons cells are atoms. .code "(atom x)" is equivalent to .codn "(not (consp x))" . .TP* Examples: .verb (atom 3) -> t (atom (cons 1 2)) -> nil (atom "abc") -> t (atom '(3)) -> nil .brev .coNP Function @ consp .synb .mets (consp << value ) .syne .desc The .code consp function tests whether .meta value is a cons. It returns .code t if this is the case, .code nil otherwise. .code "(consp x)" is equivalent to .codn "(not (atom x))" . Nonempty lists test positive under .code consp because a list is represented as a reference to the first cons in a chain of one or more conses. Note that a lazy cons is a cons and satisfies the .code consp test. See the function .code make-lazy-cons and the macro .codn lcons . .TP* Examples: .verb (consp 3) -> nil (consp (cons 1 2)) -> t (consp "abc") -> nil (consp '(3)) -> t .brev .coNP Accessors @ car and @ first .synb .mets (car << object ) .mets (first << object ) .mets (set (car << object ) << new-value ) .mets (set (first << object ) << new-value ) .syne .desc The functions .code car and .code first are synonyms. If .meta object is a cons cell, these functions retrieve the .code car field of that cons cell. .code "(car (cons 1 2))" yields .codn 1 . For programming convenience, .meta object may be of several other kinds in addition to conses. .code "(car nil)" is allowed, and returns .codn nil . .meta object may also be a vector or a string. If it is an empty vector or string, then .code nil is returned. Otherwise the first character of the string or first element of the vector is returned. .meta object may be a structure. The .code car operation is possible if the object has a .code car method. If so, .code car invokes that method and returns whatever the method returns. If the structure has no .code car method, but has a .code lambda method, then the .code car function calls that method with one argument, that being the integer zero. 
Whatever the method returns, .code car returns. If neither method is defined, an error exception is thrown. A .code car form denotes a valid place whenever .meta object is a valid argument for the .code rplaca function. Modifying the place denoted by the form is equivalent to invoking .code rplaca with .meta object as the left argument, and the replacement value as the right argument. It takes place in the manner given under the description of the .code rplaca function, and obeys the same restrictions. A .code car form supports deletion. The following equivalence then applies:
.verb
  (del (car place)) <--> (pop place)
.brev
This implies that deletion requires the argument of the .code car form to be a place, rather than the whole form itself. In this situation, the argument place may have a value which is .codn nil , because .code pop is defined on an empty list. The abstract concept behind deleting a .code car is that physically deleting this field from a cons, thereby breaking it in half, would result in just the .code cdr remaining. Though fragmenting a cons in this manner is impossible, deletion simulates it by replacing the place which previously held the cons, with that cons' .code cdr field. This semantics happens to coincide with deleting the first element of a list by a .code pop operation. .coNP Accessors @ cdr and @ rest .synb .mets (cdr << object ) .mets (rest << object ) .mets (set (cdr << object ) << new-value ) .mets (set (rest << object ) << new-value ) .syne .desc The functions .code cdr and .code rest are synonyms. If .meta object is a cons cell, these functions retrieve the .code cdr field of that cons cell. .code "(cdr (cons 1 2))" yields .codn 2 . For programming convenience, .meta object may be of several other kinds in addition to conses. .code "(cdr nil)" is allowed, and returns .codn nil . .meta object may also be a vector or a string.
If it is a nonempty string or vector containing at least two items, then the remaining part of the object is returned, with the first element removed. For example .mono (cdr "abc") .onom yields .strn "bc" . If .meta object is a one-element vector or string, or an empty vector or string, then .code nil is returned. Thus .mono (cdr "a") .onom and .mono (cdr "") .onom both result in .codn nil . If .meta object is a structure, then .code cdr requires it to support either the .code cdr method or the .code lambda method. If both are present, .code cdr is used. When the .code cdr function uses the .code cdr method, it invokes it with no arguments. Whatever value the method returns becomes the return value of .codn cdr . When .code cdr invokes a structure's .code lambda method, it passes as the argument the range object .codn "#R(1 t)" . Whatever the .code lambda method returns becomes the return value of .codn cdr . The invocation syntax of a .code cdr or .code rest form is a syntactic place. The place is semantically correct if .meta object is a valid argument for the .code rplacd function. Modifying the place denoted by the form is equivalent to invoking .code rplacd with .meta object as the left argument, and the replacement value as the right argument. It takes place in the manner given under the description of the .code rplacd function, and obeys the same restrictions. A .code cdr place supports deletion, according to the following near equivalence:
.verb
  (del (cdr place)) <--> (prog1 (cdr place)
                                (set place (car place)))
.brev
The .code place expression is evaluated only once. Note that this is symmetric with the delete semantics of .code car in that the cons stored in .code place goes away, as does the .code cdr field, leaving just the .codn car , which takes the place of the original cons.
.TP* Example: Walk every element of the list .code "(1 2 3)" using a .code for loop: .verb (for ((i '(1 2 3))) (i) ((set i (cdr i))) (print (car i) *stdout*) (print #\enewline *stdout*)) .brev The variable .code i marches over the cons cells which make up the "backbone" of the list. The elements are retrieved using the .code car function. Advancing to the next cell is achieved using .codn "(cdr i)" . If .code i is the last cell in a (proper) list, .code "(cdr i)" yields .code nil and so .code i becomes .codn nil . The loop guard expression .code i then fails and the loop terminates. .coNP Functions @ rplaca and @ rplacd .synb .mets (rplaca < object << new-car-value ) .mets (rplacd < object << new-cdr-value ) .syne .desc If .meta object is a cons cell or lazy cons cell, then the .code rplaca and .code rplacd functions assign new values into the .code car and .code cdr fields of the .metn object . In addition, these functions are meaningful for other kinds of objects. Note that, except for the difference in return value, .code "(rplaca x y)" is the same as the more generic .codn "(set (car x) y)" , and likewise .code "(rplacd x y)" can be written as .codn "(set (cdr x) y)" . The .code rplaca and .code rplacd functions return .metn object . Note: In \*(TX versions 89 and earlier, these functions returned the new value. The behavior was undocumented. The .meta object argument does not have to be a cons cell. Both functions support meaningful semantics for vectors and strings. If .meta object is a string, it must be modifiable. The .code rplaca function replaces the first element of a vector or first character of a string. The vector or string must be at least one element long. The .code rplacd function replaces the suffix of a vector or string after the first element with a new suffix. The .meta new-cdr-value must be a sequence, and if the suffix of a string is being replaced, it must be a sequence of characters. 
The suffix here refers to the portion of the vector or string after the first element. It is permissible to use .code rplacd on an empty string or vector. In this case, .meta new-cdr-value specifies the contents of the entire string or vector, as if the operation were done on a nonempty vector or string, followed by the deletion of the first element. The .meta object argument may be a structure. In the case of .codn rplaca , the structure must have a defined .code rplaca method or else, failing that, a .code lambda-set method. The first of these methods which is available, in the given order, is used to perform the operation. Whatever the respective method returns, If the .code lambda-set method is used, it is called with two arguments (in addition to .metn object ): the integer zero, and .metn new-car-value . In the case of .codn rplacd , the structure must have a defined .code rplacd method or else, failing that, a .code lambda-set method. The first of these methods which is available, in the given order, is used to perform the operation. Whatever the respective method returns, If the .code lambda-set method is used, it is called with two arguments (in addition to .metn object ): the range value .code "#R(1 t)" and .metn new-cdr-value . .coNP Accessors @, first @, second @, third @, fourth @, fifth @, sixth @, seventh @, eighth @ ninth and @ tenth .synb .mets (first << object ) .mets (second << object ) .mets (third << object ) .mets (fourth << object ) .mets (fifth << object ) .mets (sixth << object ) .mets (seventh << object ) .mets (eighth << object ) .mets (ninth << object ) .mets (tenth << object ) .mets (set (first << object ) << new-value ) .mets (set (second << object ) << new-value ) .mets ... .mets (set (tenth << object ) << new-value ) .syne .desc Used as functions, these accessors retrieve the elements of a sequence by position. If the sequence is shorter than implied by the position, these functions return .codn nil . 
When used as syntactic places, these accessors denote the storage locations by position. The location must exist, otherwise an error exception results. The places support deletion. .TP* Examples: .verb (third '(1 2)) -> nil (second "ab") -> #\eb (third '(1 2 . 3)) -> **error, improper list* (let ((x (copy "abcd"))) (inc (third x)) x) -> "abce" .brev .coNP Functions @ append and @ nconc .synb .mets (append <> [ sequence *]) .mets (nconc <> [ sequence *]) .syne .desc The .code append function creates a new object which is a catenation of the .meta sequence arguments. All arguments are optional. If no arguments are given, .code append produces the empty list, and if a single argument is specified, that argument is returned. If two or more arguments are present, then the situation is identified as one or more .meta sequence arguments followed by .metn last-arg . The .meta sequence arguments must be sequences; .meta last-arg may be a sequence or atom. The .code append operation over three or more arguments is left-associative, such that .code "(append x y z)" is equivalent to both .code "(append (append x y) z)" and .codn "(append x (append y z))" . This allows the catenation of an arbitrary number of arguments to be understood in terms of a repeated application of the two-argument case, whose semantics is given by these rules: .RS .IP 1. .code nil catenates with .code nil to produce .codn nil : .verb (append nil nil) -> nil .brev .IP 2. .code nil catenates with a proper or improper list, producing that list itself: .verb (append nil '(1 2)) -> (1 2) (append nil '(1 2 . 3)) -> (1 2 . 3) .brev .IP 3. A proper list catenates with .codn nil , producing that list itself: .verb (append '(1 2) nil) -> (1 2) .brev .IP 4. A proper list catenates with an atom, producing an improper list terminated by that atom, whether or not that atom is a sequence: .verb (append '(1 2) #(3)) -> (1 2 . #(3)) (append '(1 2) 3) -> (1 2 . 3) .brev .IP 5. 
A non-list sequence catenates with another sequence into a sequence, producing a sequence which contains the elements of both, of the same kind as the left sequence. The elements must be compatible; a string can only catenate with a sequence of characters. .verb (append #(1 2) #(3 4)) -> #(1 2 3 4) (append "ab" "cd") -> "abcd" (append "ab" #(#\ec #\ed)) -> "abcd" (append "ab" #(3 4)) -> ;; error .brev .IP 6. A non-list sequence catenates with an atom if it is a suitable element type for that kind of sequence. The resulting sequence is of the same kind, and includes that atom: .verb (append #(1 2) 3) -> #(1 2 3) (append "ab" #\ec) -> "abc" (append "ab" 3) -> ;; error .brev .IP 7. If an improper list is catenated with any object, the catenation takes place between the terminating atom of that list and that object. This requires the terminating atom to be a sequence. If the catenation is possible, then the result is a new improper list which is a copy of the original, but with the terminating atom replaced by a catenation of that atom and the object: .verb (append '(1 2 . "ab") "c") -> (1 2 . "abc") (append '(1 2 . "ab") '(2 3)) -> ;; error .brev .IP 8. A non-sequence atom doesn't catenate; the situation is erroneous: .verb (append 1 2) -> ;; error (append '(1 . 2) 3) -> ;; error .brev .RE .IP If N arguments are specified, where N > 1, then the first N-1 arguments must be proper lists. Copies of these lists are catenated together. The last argument N, shown in the above syntax as .metn last-arg , may be any kind of object. It is installed into the .code cdr field of the last cons cell of the resulting list. Thus, if argument N is also a list, it is catenated onto the resulting list, but without being copied. Argument N may be an atom other than .codn nil ; in that case .code append produces an improper list. The .code nconc function works like .codn append , but may destructively manipulate any of the input objects. .TP* Examples: .verb ;; An atom is returned. 
(append 3) -> 3 ;; A list is also just returned: no copying takes place. ;; The eq function can verify that the same object emerges ;; from append that went in. (let ((list '(1 2 3))) (eq (append list) list)) -> t (append '(1 2 3) '(4 5 6) 7) -> (1 2 3 4 5 6 . 7) ;; the (4 5 6) tail of the resulting list is the original ;; (4 5 6) object, shared with that list. (append '(1 2 3) '(4 5 6)) -> (1 2 3 4 5 6) (append nil) -> nil ;; (1 2 3) is copied: it is not the last argument (append '(1 2 3) nil) -> (1 2 3) ;; empty lists disappear (append nil '(1 2 3) nil '(4 5 6)) -> (1 2 3 4 5 6) (append nil nil nil) -> nil ;; atoms and improper lists other than in the last position ;; are erroneous (append '(a . b) 3 '(1 2 3)) -> **error** ;; sequences other than lists can be catenated. (append "abc" "def" "g" #\eh) -> "abcdefgh" ;; lists followed by non-list sequences end with non-list ;; sequences catenated in the terminating atom: (append '(1 2) '(3 4) "abc" "def") -> (1 2 3 4 . "abcdef") .brev .coNP Function @ append* .synb .mets (append* <> [ list *]) .syne .desc The .code append* function lazily catenates lists. If invoked with no arguments, it returns .codn nil . If invoked with a single argument, it returns that argument. Otherwise, it returns a lazy list consisting of the elements of every .meta list argument from left to right. Arguments other than the last are treated as lists, and traversed using the .code car and .code cdr functions to visit their elements. The last argument isn't traversed: rather, that object itself becomes the .code cdr field of the last cons cell of the lazy list constructed from the previous arguments. .coNP Functions @ revappend and @ nreconc .synb .mets (revappend < list1 << list2 ) .mets (nreconc < list1 << list2 ) .syne .desc The .code revappend function returns a list consisting of .meta list2 appended to a reversed copy of .metn list1 . The returned object shares structure with .metn list2 , which is unmodified. 
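For instance, the following small sketch, which uses literal lists chosen
only for illustration, shows the elements of the first list appearing in
reverse order, followed by the elements of the second list:

.verb
  ;; (1 2 3) is reversed; (4 5) follows it unchanged
  (revappend '(1 2 3) '(4 5)) -> (3 2 1 4 5)
  (revappend nil '(4 5)) -> (4 5)
.brev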
The .code nreconc function behaves similarly, except that the returned object may share structure with not only .meta list2 but also .metn list1 , which is modified. .coNP Function @ list .synb .mets (list << value *) .syne .desc The .code list function creates a new list, whose elements are the argument values. .TP* Examples: .verb (list) -> nil (list 1) -> (1) (list 'a 'b) -> (a b) .brev .coNP Function @ list* .synb .mets (list* << value *) .syne .desc The .code list* function is a generalization of cons. If called with exactly two arguments, it behaves exactly like cons: .code "(list* x y)" is identical to .codn "(cons x y)" . If three or more arguments are specified, the leading arguments specify additional atoms to be consed to the front of the list. So for instance .code "(list* 1 2 3)" is the same as .code "(cons 1 (cons 2 3))" and produces the improper list .codn "(1 2 . 3)" . Generalizing in the other direction, .code list* can be called with just one argument, in which case it returns that argument, and can also be called with no arguments in which case it returns .codn nil . .TP* Examples: .verb (list*) -> nil (list* 1) -> 1 (list* 'a 'b) -> (a . b) (list* 'a 'b 'c) -> (a b . c) .brev .TP* "Dialect Note:" Note that unlike in some other Lisp dialects, the effect of .code "(list* 1 2 x)" can also be obtained using .codn "(list 1 2 . x)" . However, .code "(list* 1 2 (func 3))" cannot be rewritten as .code "(list 1 2 . (func 3))" because the latter is equivalent to .codn "(list 1 2 func 3)" . .coNP Accessor @ sub-list .synb .mets (sub-list < list >> [ from <> [ to ]]) .mets (set (sub-list < list >> [ from <> [ to ]]) << new-value ) .syne .desc The .code sub-list function has the same parameters and semantics as the .code sub function, except that it operates on its .meta list argument using list operations, and assumes that .meta list is terminated by .codn nil . 
If a .code sub-list form is used as a place, then the .meta list argument form must also be a place. The .code sub-list place denotes a subrange of .meta list as if it were a storage location. The previous value of this location, if needed, is fetched by a call to .codn sub-list . Storing .meta new-value to the place is performed by a call to .codn replace-list . The return value of .code replace-list is stored into .metn list . In an update operation which accesses the prior value and stores a new value, the arguments .metn list , .metn from , .meta to and .meta new-value are evaluated once. .coNP Function @ replace-list .synb .mets (replace-list < list < item-sequence >> [ from <> [ to ]]) .syne .desc The .code replace-list function is like the .code replace function, except that it operates on its .meta list argument using list operations. It assumes that .meta list is terminated by .codn nil , and that it is made of cells which can be mutated using .codn rplaca . .coNP Functions @ listp and @ proper-list-p .synb .mets (listp << value ) .mets (proper-list-p << value ) .syne .desc The .code listp and .code proper-list-p functions test, respectively, whether .meta value is a list, or a proper list, and return .code t or .code nil accordingly. The .code listp test is weaker, and executes without having to traverse the object. The value produced by the expression .code "(listp x)" is the same as that of .codn "(or (null x) (consp x))" , except that .code x is evaluated only once. The empty list .code nil is a list, and a cons cell is a list. The .code proper-list-p function returns .code t only for proper lists. A proper list is either .codn nil , or a cons whose .code cdr is a proper list. .code proper-list-p traverses the list, and its execution will not terminate if the list is circular. These functions return .code nil for list-like sequences that are not made of actual .code cons cells. 
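The following sketch illustrates the difference between the two tests.
The literal objects are chosen only for illustration, and the results
follow from the definitions given above:

.verb
  (listp nil) -> t
  ;; an improper list is a list, but not a proper list
  (listp '(1 2 . 3)) -> t
  (proper-list-p '(1 2 . 3)) -> nil
  (proper-list-p '(1 2 3)) -> t
  ;; list-like sequences which are not made of conses
  (listp "abc") -> nil
  (proper-list-p #(1 2 3)) -> nil
.brev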
Dialect Note: in \*(TX 137 and older, .code proper-list-p is called .codn proper-listp . The name was changed for adherence to conventions and compatibility with other Lisp dialects, like Common Lisp. However, the function continues to be available under the old name. Code that must run on \*(TX 137 and older installations should use .codn proper-listp , but its use going forward is deprecated. .coNP Function @ endp .synb .mets (endp << object ) .syne .desc The .code endp function returns .code t if .meta object is the object .codn nil . If .meta object is a cons cell, then .code endp returns .codn nil . Otherwise, .code endp throws an exception. .coNP Function @ length-list .synb .mets (length-list << list ) .syne .desc The .code length-list function returns the length of .metn list , which may be a proper or improper list. The length of a list is the number of conses in that list. .coNP Function @ copy-list .synb .mets (copy-list << list ) .syne .desc The .code copy-list function returns a list similar to .metn list , but with a newly allocated cons-cell structure. If .meta list is an atom, it is simply returned. Otherwise, .meta list is a cons cell, and .code copy-list returns the same object as the expression .mono .meti (cons (car << list ) (copy-list (cdr << list ))). .onom Note that the object .mono .meti (car << list ) .onom is not deeply copied, but only propagated by reference into the new list. .code copy-list produces a new list structure out of the same items that are in .metn list . .TP* "Dialect Note:" Common Lisp does not allow the argument to be an atom, except for the empty list .codn nil . .coNP Function @ length-list-< .synb .mets (length-list-< < list << len ) .syne .desc The .code length-list-< function determines whether the length of .meta list is less than the integer .metn len . 
The expression .verb (length-list-< x y) .brev is similar to, but usefully different from .verb (< (length-list x) y) .brev because .code length-list-< is required to traverse .meta list only far enough to be able to determine the return value. If the end of the list is reached before .meta len conses are encountered, the function returns .codn t , otherwise if .meta len conses are encountered, the function terminates immediately and returns .codn nil . The .code length-list-< function is therefore safe to use with infinite lazy lists and circular lists, for which .code length would not terminate. Note: there is a more generic function .code length-< which works efficiently with different kinds of sequences. Note: the .code length-list-< function is useful in situations when a decision must be made between two algorithms based on the length of one or more input lists. The decision can be made without wastefully performing a full pass over the input lists to measure their length. .coNP Function @ copy-cons .synb .mets (copy-cons << cons ) .syne .desc The .code copy-cons function creates and returns a new object that is a replica of .metn cons . The .meta cons argument must be either a .code cons cell, or else a lazy cons: an object of type .codn lcons . A new cell of the same type as .meta cons is created, and all of its fields are initialized by copying the corresponding fields from .metn cons . If .meta cons is lazy, the newly created object is in the same state as the original. If the original has not yet been updated and thus has an update function, the copy also has not yet been updated and has the same update function. .coNP Function @ copy-tree .synb .mets (copy-tree << obj ) .syne .desc The .code copy-tree function returns a copy of .meta obj which represents an arbitrary .codn cons -cell-based structure. The cell structure of .meta obj is traversed and a similar structure is constructed, but without regard for substructure sharing or circularity. 
More precisely, if .meta obj is an atom, then it is returned. If it is an ordinary .code cons cell, then .code copy-tree is recursively applied to the .code car and .code cdr fields to produce their individual replicas. A new .code cons cell is then produced from the replicated .code car and .codn cdr . If .meta obj is a lazy .codn cons , then just like in the ordinary .code cons case, the .code car and .code cdr fields are duplicated with a recursive call to .codn copy-tree . Then, a lazy .code cons is created from these replicated fields. If .meta obj has an update function, then the newly created lazy .code cons has the same update function; the function isn't copied. Like .codn copy-cons , the .code copy-tree function doesn't trigger the update of lazy conses. The copies of lazy conses which have not been updated are also conses which have not been updated. .coNP Functions @ reverse and @ nreverse .synb .mets (reverse << list ) .mets (nreverse << list ) .syne .desc The functions .code reverse and .code nreverse produce an object which contains the same items as proper list .metn list , but in reverse order. If .meta list is .codn nil , then both functions return .codn nil . The .code reverse function is non-destructive: it creates a new list. The .code nreverse function creates the structure of the reversed list out of the cons cells of the input list, thereby destructively altering it (if it contains more than one element). How .code nreverse uses the material from the original list is unspecified. It may rearrange the cons cells into a reverse order, or it may keep the structure intact, but transfer the .code car values among cons cells into reverse order. Other approaches are possible. .coNP Accessor @ nthlast .synb .mets (nthlast < index << list ) .mets (set (nthlast < index << list ) << new-value ) .syne .desc The .code nthlast function retrieves the n-th last cons cell of a list, indexed from one. 
The .meta index parameter must be an integer. If .meta index is positive and so large that it specifies a nonexistent cons beyond the beginning of the list, .code nthlast returns .metn list . Effectively, values of .meta index larger than the length of the list are clamped to the length. If .meta index is negative, then .code nthlast yields .codn nil . An .meta index value of zero retrieves the terminating atom of .meta list or else the value .meta list itself, if .meta list is an atom. The following equivalence holds: .verb (nthlast 1 list) <--> (last list) .brev An .code nthlast place designates the storage location which holds the n-th cell, as indicated by the value of .metn index . A negative .meta index doesn't denote a place. A positive .meta index greater than the length of the list is treated as if it were equal to the length of the list. If .meta list is itself a syntactic place, then the .meta index value .I n is permitted for a list of length .IR n . This index value denotes the .meta list place itself. Storing to this value overwrites .metn list . If .meta list isn't a syntactic place, then storing to position .I n isn't permitted. If .meta list is of length zero, or an atom (in which case its length is considered to be zero) then the above remarks about position .I n apply to an .meta index value of zero: if .meta list is a syntactic place, then the position denotes .meta list itself, otherwise the position doesn't exist as a place. If .meta list contains one or more elements, then an .meta index value of zero denotes the .code cdr field of its last cons cell. Storing a value to this place overwrites the terminating atom. .coNP Accessor @ butlastn .synb .mets (butlastn < num << list ) .mets (set (butlastn < num << list ) new-value ) .syne .desc The .code butlastn function calculates that initial portion of .meta list which excludes the last .meta num elements. 
Note: the .code butlastn function doesn't support non-list sequences as sequences; it treats them as the terminating atom of a zero-length improper list. The .code butlast sequence function supports non-list sequences. If .code x is a list, then the following equivalence holds: .verb (butlastn n x) <--> (butlast x n) .brev If .meta num is zero, or negative, then .code butlastn returns .metn list . If .meta num is positive, and meets or exceeds the length of .metn list , then .code butlastn returns .codn nil . If a .code butlastn form is used as a syntactic place, then .meta list must be a place. Assigning to the form causes .meta list to be replaced with a new list which is a catenation of the new value and the last .meta num elements of the original list, according to the following equivalence: .verb (set (butlastn n x) v) <--> (progn (set x (append v (nthlast n x))) v) .brev except that .codn n , .code x and .code v are evaluated only once, in left-to-right order. .coNP Accessor @ nth .synb .mets (nth < index << object ) .mets (set (nth < index << object ) << new-value ) .syne .desc The .code nth function performs random access on a list, retrieving the n-th element indicated by the zero-based index value given by .metn index . The .meta index argument must be a nonnegative integer. If .meta index indicates an element beyond the end of the list, then the function returns .codn nil . The following equivalences hold: .verb (nth 0 list) <--> (car list) <--> (first list) (nth 1 list) <--> (cadr list) <--> (second list) (nth 2 list) <--> (caddr list) <--> (third list) (nth x y) <--> (car (nthcdr x y)) .brev .coNP Accessor @ nthcdr .synb .mets (nthcdr < index << list ) .mets (set (nthcdr < index << list ) << new-value ) .syne .desc The .code nthcdr function retrieves the n-th cons cell of a list, indexed from zero. The .meta index parameter must be a nonnegative integer. If .meta index specifies a nonexistent cons beyond the end of the list, then .code nthcdr yields .codn nil . 
The following equivalences hold: .verb (nthcdr 0 list) <--> list (nthcdr 1 list) <--> (cdr list) (nthcdr 2 list) <--> (cddr list) (car (nthcdr x y)) <--> (nth x y) .brev An .code nthcdr place designates the storage location which holds the n-th cell, as indicated by the value of .metn index . Indices beyond the last cell of .meta list do not designate a valid place. If .meta list is itself a place, then the zeroth index is permitted and the resulting place denotes .metn list . Storing a value to .mono .meti (nthcdr < 0 << list ) .onom overwrites .metn list . Otherwise if .meta list isn't a syntactic place, then the zeroth index does not designate a valid place; .meta index must have a positive value. An .code nthcdr place does not support deletion. .TP* "Dialect Note:" In Common Lisp, .code nthcdr is only a function, not an accessor; .code nthcdr forms do not denote places. .coNP Function @ tailp .synb .mets (tailp < object << list ) .syne .desc The .code tailp function tests whether .meta object is a tail of .metn list . This means that .meta object is either .meta list itself, or else one of the .code cons cells of .meta list or else the terminating atom of .metn list . More formally, a recursive definition follows. If .meta object and .meta list are the same object (thus equal under the .code eq function) then .code tailp returns .codn t . If .meta list is an atom, and is not .metn object , then the function returns .codn nil . Otherwise, .meta list is a .code cons that is not .meta object and .code tailp yields the same value as the .mono .meti "(tailp < object (cdr << list ))" .onom expression. .coNP Accessors @, caar @, cadr @, cdar @, cddr ..., @ cdddddr .synb .mets (caar << object ) .mets (cadr << object ) .mets (cdar << object ) .mets (cddr << object ) .mets ... .mets (cdddddr << object ) .mets (set (caar << object ) << new-value ) .mets (set (cadr << object ) << new-value ) .mets ... 
.syne .desc The .I "a-d accessors" provide a shorthand notation for accessing two to five levels deep into a cons-cell-based tree structure. For instance, the equivalent of the nested function call expression .mono .meti (car (car (cdr << object ))) .onom can be achieved using the single function call .mono .meti (caadr << object ). .onom The symbol names of the a-d accessors are a generalization of the words "car" and "cdr". They encode the pattern of .code car and .code cdr traversal of the structure using a sequence of the letters .code a and .code d placed between .code c and .codn r . The traversal is encoded in right-to-left order, so that .code cadr indicates a traversal of the .code cdr link, followed by the .codn car . This order corresponds to the nested function call notation, which also encodes the traversal right-to-left. The following diagram illustrates the straightforward relationship: .verb (cdr (car (cdr x))) ^ ^ ^ | / | | / / | / ____/ || / (cdadr x) .brev \*(TL provides all possible a-d accessors up to five levels deep, from .code caar all the way through .codn cdddddr . Expressions involving a-d accessors are places. For example, .code "(caddr x)" denotes the same place as .codn "(car (cddr x))" , and .code "(cdadr x)" denotes the same place as .codn "(cdr (cadr x))" . The a-d accessor places support deletion, with semantics derived from the deletion semantics of the .code car and .code cdr places. For example, .code "(del (caddr x))" means the same as .codn "(del (car (cddr x)))" . .coNP Functions @ cyr and @ cxr .synb .mets (cyr < address << object ) .mets (cxr < address << object ) .syne .desc The .code cyr and .code cxr functions provide .cod3 car / cdr navigation of tree structure driven by a numeric address given by the .meta address argument. The .meta address argument can express any combination of the application of .code car and .code cdr functions, including none at all. 
The difference between .code cyr and .code cxr is the bit order of the encoding. Under .codn cyr , the most significant bit of the encoding given in .meta address indicates the initial .cod3 car / cdr navigation, and the least significant bit gives the final one. Under .codn cxr , it is opposite. Both functions require .meta address to be a positive integer. Any other argument raises an error. Under both functions, the .meta address value .code 1 encodes the .code identity operation: no .cod3 car / cdr .coNP Functions @ flatten and @ flatten* .synb .mets (flatten >> { list | << atom }) .mets (flatten* >> { list | << atom }) .syne .desc The .code flatten function recursively traverses a nested .metn list , returning a list whose elements are all of the .cod2 non- nil atoms contained in .metn list , at any level of nesting. If the argument is an .meta atom rather than a .metn list , then it is returned. Otherwise, the .meta list argument must be a proper list, as must all lists nested within it. The .code flatten* function calculates the same result as .codn flatten , except that it produces a lazy list. It can be used to lazily flatten an infinite lazy list. .TP* Examples: .verb (flatten 42) -> 42 (flatten '(1 2 () (3 4))) -> (1 2 3 4) ;; equivalent to previous, since ;; nil is the same thing as () (flatten '(1 2 nil (3 4))) -> (1 2 3 4) (flatten nil) -> nil (flatten '(((()) ()))) -> nil (flatten '(a (b . c))) -> ;; error .brev .coNP Functions @ flatcar and @ flatcar* .synb .mets (flatcar << tree ) .mets (flatcar* << tree ) .syne .desc The .code flatcar function produces a list of all the atoms contained in the tree structure .metn tree , in the order in which they appear, when the structure is traversed left to right. This list includes those .code nil atoms which appear in .code car fields. The list excludes .code nil atoms which appear in .code cdr fields. If the .meta tree argument is an atom, it is returned. 
The .code flatcar* function works like .code flatcar except that it produces a lazy list. It can be used to lazily flatten an infinite lazy structure. .TP* Examples: .verb (flatcar '(1 2 () (3 4))) -> (1 2 nil 3 4) (flatcar '(a (b . c) d (e) (((f)) . g) (nil . z) nil . h)) --> (a b c d e f g nil z nil h) .brev .coNP Functions @ tree-find and @ cons-find .synb .mets (tree-find < obj < tree <> [ test-function ]) .mets (cons-find < obj < tree <> [ test-function ]) .syne .desc The .code tree-find and .code cons-find functions search .meta tree for an occurrence of .metn obj . Tree can be any atom, or a cons. If .meta tree is a cons, it is understood to be a proper list whose elements are also trees. The equivalence test is performed by .meta test-function which must take two arguments, and has conventions similar to .codn eq , .code eql or .codn equal . If the argument is omitted, the default function is .codn equal . Under both .code tree-find and .codn cons-find , if .meta tree is equivalent to .meta obj under .metn test-function , then .code t is returned to announce a successful finding. Next, if the mismatched .meta tree is an atom, both functions return .code nil to indicate that the search failed. If none of the above cases occur, the semantics of the functions diverge, as follows. In the case of .codn tree-find , .meta tree is taken to be a proper list, and .code tree-find is recursively applied to each element of the list in turn, using the same .meta obj and .meta test-function arguments, stopping at the first element which returns a .cod2 non- nil value. In the case of .codn cons-find , .meta tree is taken to be a .codn cons -cell-based tree structure. The .code cons-find function is recursively applied to the .code car and .code cdr fields of .metn tree . Thus a match may be found in any position in the structure, including the dotted position of a list. 
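.TP* Examples:

The following sketch uses literal trees chosen only for illustration, and
relies on the default
.code equal
test. The results shown are those implied by the rules given above:

.verb
  ;; a match is found inside a nested element
  (tree-find 3 '(1 (2 3))) -> t
  (tree-find 4 '(1 (2 3))) -> nil
  ;; under equal, an entire subtree can match
  (tree-find '(2 3) '(1 (2 3))) -> t
  ;; cons-find also examines the dotted position
  (cons-find 3 '(1 2 . 3)) -> t
.brev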
.coNP Functions @, memq @ memql and @ memqual .synb .mets (memq < object << list ) .mets (memql < object << list ) .mets (memqual < object << list ) .syne .desc The .codn memq , .code memql and .code memqual functions search .meta list for a member which is, respectively, .codn eq , .code eql or .code equal to .metn object . (See the .codn eq , .code eql and .code equal functions above.) If no such element is found, .code nil is returned. Otherwise, that suffix of .meta list is returned whose first element is the matching object. .coNP Functions @ member and @ member-if .synb .mets (member < key < sequence >> [ testfun <> [ keyfun ]]) .mets (member-if < predfun < sequence <> [ keyfun ]) .syne .desc The .code member and .code member-if functions search through .meta sequence for an item which matches a key, or satisfies a predicate function, respectively. The .meta keyfun argument specifies a function which is applied to the elements of the sequence to produce the comparison key. If this argument is omitted, then the untransformed elements of the sequence themselves are examined. The .code member function's .meta testfun argument specifies the test function which is used to compare the comparison keys taken from the sequence to the search key. If this argument is omitted, then the .code equal function is used. If .code member does not find a matching element, it returns .codn nil . Otherwise it returns the suffix of .meta sequence which begins with the matching element. The .code member-if function's .meta predfun argument specifies a predicate function which is applied to the successive comparison keys pulled from the sequence by applying the key function to successive elements. If no match is found, then .code nil is returned, otherwise what is returned is the suffix of .meta sequence which begins with the matching element. 
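.TP* Examples:

The following sketch contrasts the membership functions described above.
The literal lists are chosen only for illustration; the
.code eql
and
.code car
functions are used as the test and key functions in one of the calls:

.verb
  ;; symbols are compared with eq
  (memq 'c '(a b c d)) -> (c d)
  (memq 'z '(a b c d)) -> nil
  ;; strings require the equal comparison
  (memqual "b" '("a" "b" "c")) -> ("b" "c")
  ;; member uses equal by default
  (member 3 '(1 2 3 4)) -> (3 4)
  ;; the key function extracts the car of each element
  (member 2 '((1 . a) (2 . b) (3 . c))
          (fun eql) (fun car)) -> ((2 . b) (3 . c))
  ;; member-if takes a predicate rather than a key
  (member-if (fun evenp) '(1 3 4 5)) -> (4 5)
.brev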
.coNP Functions @, rmemq @, rmemql @, rmemqual @ rmember and @ rmember-if .synb .mets (rmemq < object << list ) .mets (rmemql < object << list ) .mets (rmemqual < object << list ) .mets (rmember < key < sequence >> [ testfun <> [ keyfun ]]) .mets (rmember-if < predfun < sequence <> [ keyfun ]) .syne .desc These functions are counterparts to .codn memq , .codn memql , .codn memqual , .code member and .code member-if which look for the rightmost element which matches .metn object , rather than for the leftmost element. .coNP Functions @ conses and @ conses* .synb .mets (conses << list ) .mets (conses* << list ) .syne .desc These functions return a list whose elements are the conses which make up .metn list . The .code conses* function does this in a lazy way, avoiding the computation of the entire list: it returns a lazy list of the conses of .metn list . The .code conses function computes the entire list before returning. The input .meta list may be proper or improper. The first cons of .meta list is that .meta list itself. The second cons is the rest of the list, or .mono .meti (cdr << list ). .onom The third cons is .mono .meti (cdr (cdr << list )) .onom and so on. .TP* Example: .verb (conses '(1 2 3)) -> ((1 2 3) (2 3) (3)) .brev .TP* "Dialect Note:" These functions are useful for simulating the .code maplist function found in other dialects like Common Lisp. \*(TL's .code "(conses x)" can be expressed in Common Lisp as .codn "(maplist #'identity x)" . Conversely, the Common Lisp operation .code "(maplist function list)" can be computed in \*(TL as .codn "(mapcar function (conses list))" . More generally, the Common Lisp operation .verb (maplist function list0 list1 ... listn) .brev can be expressed as: .verb (mapcar function (conses list0) (conses list1) ... (conses listn)) .brev .coNP Function @ delcons .synb .mets (delcons < cons << list ) .syne .desc The .code delcons function destructively removes a cons cell from a list. 
The .meta list is searched to see whether one of its cons cells is the same object as .metn cons . If so, that cell is removed from the list. The .meta list argument may be a proper or improper list, possibly empty. It may also be an atom other than .codn nil , which is regarded as being, effectively, an empty improper list terminated by that atom. The operation of .code delcons is divided into the following three cases. If .meta cons is the first cons cell of .metn list , then the .code cdr of .meta list is returned. If .meta cons is the second or subsequent cons of .metn list , then .meta list is destructively altered to remove .meta cons and then returned. This means that the .code cdr field of the predecessor of .meta cons is altered from referencing .meta cons to referencing .mono .meti (cdr << cons ) .onom instead. The returned value is the same cons cell as .metn list . The third case occurs when .meta cons is not found in .metn list . In this situation, .meta list is returned unchanged. .TP* Examples: .verb (let ((x (list 1 2 3))) (delcons x x)) -> (2 3) (let ((x (list 1 2 . 3))) (delcons (cdr x) x)) -> (1 . 3) .brev .SS* Association Lists Association lists are ordinary lists formed according to a special convention. Firstly, any empty list is a valid association list. A nonempty association list contains only cons cells as the key elements. These cons cells are understood to represent key/value associations, hence the name "association list". .coNP Function @ assoc .synb .mets (assoc < key << alist ) .syne .desc The .code assoc function searches an association list .meta alist for a cons cell whose .code car field is equivalent to .meta key under the .code equal function. The first such cons is returned. If no such cons is found, .code nil is returned. 
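.TP* Examples:

A brief sketch of lookups with
.codn assoc ;
the association lists are illustrative assumptions.

.verb
  (assoc 'b '((a . 1) (b . 2) (c . 3))) -> (b . 2)

  ;; keys are compared with equal, so strings match by content
  (assoc "b" '(("a" . 1) ("b" . 2))) -> ("b" . 2)

  (assoc 'z '((a . 1) (b . 2))) -> nil
.brev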
.coNP Functions @ assq and @ assql .synb .mets (assq < key << alist ) .mets (assql < key << alist ) .syne .desc The .code assq and .code assql functions are very similar to .codn assoc , with the only difference being that they determine equality using, respectively, the .code eq and .code eql functions rather than .codn equal . .coNP Functions @, rassq @ rassql and @ rassoc .synb .mets (rassq < value << alist ) .mets (rassql < value << alist ) .mets (rassoc < value << alist ) .syne .desc The .codn rassq , .code rassql and .code rassoc functions are reverse lookup counterparts to .codn assq , .code assql and .codn assoc . When searching, they examine the .code cdr field of the pairs of .meta alist rather than the .code car field. The .code rassoc function searches association list .meta alist for a cons whose .code cdr field is equivalent to .meta value according to the .code equal function. If such a cons is found, it is returned. Otherwise .code nil is returned. The .code rassq and .code rassql functions search in the same way as .code rassoc but compare values using, respectively, .code eq and .codn eql . .coNP Function @ acons .synb .mets (acons < car < cdr << alist ) .syne .desc The .code acons function constructs a new alist by consing a new cons to the front of .metn alist . The following equivalence holds: .verb (acons car cdr alist) <--> (cons (cons car cdr) alist) .brev .coNP Function @ acons-new .synb .mets (acons-new < car < cdr << alist ) .syne .desc The .code acons-new function searches .metn alist , as if using the .code assoc function, for an existing cell which matches the key provided by the .meta car argument. If such a cell exists, then its .code cdr field is overwritten with the .meta cdr argument, and then the .meta alist is returned. If no such cell exists, then a new list is returned by adding a new cell to the input list consisting of the .meta car and .meta cdr values, as if by the .code acons function.
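.TP* Examples:

The following sketch illustrates the lookup and construction functions described above. The sample lists are assumptions; the arguments to
.code acons-new
are freshly allocated with
.code list
and
.code cons
because that function may overwrite a cell of its input.

.verb
  (assq 'b '((a . 1) (b . 2))) -> (b . 2)
  (rassql 2 '((a . 1) (b . 2))) -> (b . 2)
  (rassoc "two" '((a . "one") (b . "two"))) -> (b . "two")

  (acons 'a 1 '((b . 2))) -> ((a . 1) (b . 2))

  ;; existing key: the cdr of the matching cell is overwritten
  (acons-new 'a 10 (list (cons 'a 1) (cons 'b 2)))
  -> ((a . 10) (b . 2))

  ;; new key: a cell is added as if by acons
  (acons-new 'c 3 (list (cons 'a 1)))
  -> ((c . 3) (a . 1))
.brev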
.coNP Function @ aconsql-new .synb .mets (aconsql-new < car < cdr << alist ) .syne .desc The .code aconsql-new function has the same parameters and semantics as .codn acons-new , except that the .code eql function is used for equality testing. Thus, the list is searched for an existing cell as if using the .code assql function rather than .codn assoc . .coNP Function @ alist-remove .synb .mets (alist-remove < alist << key *) .syne .desc The .code alist-remove function takes association list .meta alist and produces a duplicate from which cells matching any of the specified .metn key s have been removed. .coNP Function @ alist-nremove .synb .mets (alist-nremove < alist << key *) .syne .desc The .code alist-nremove function is like .codn alist-remove , but potentially destructive. The input list .meta alist may be destroyed and its structural material reused to form the output list. The application should not retain references to the input list. .coNP Function @ copy-alist .synb .mets (copy-alist << alist ) .syne .desc The .code copy-alist function duplicates .metn alist . Unlike .codn copy-list , which only duplicates list structure, .code copy-alist also duplicates each cons cell of the input alist. That is to say, each element of the output list is produced as if by the .code copy-cons function applied to the corresponding element of the input list. .coNP Function @ pairlis .synb .mets (pairlis < keys < values <> [ alist ]) .syne .desc The .code pairlis function returns an association list consisting of pairs formed from the elements of .meta keys and .meta values prepended to the existing .metn alist . If an .meta alist argument is omitted, it defaults to .codn nil . Pairs of elements are formed by taking successive elements from the .meta keys and .meta values sequences in parallel. If the sequences are not of equal length, the excess elements from the longer sequence are ignored.
The pairs appear in the resulting list in the original order in which their constituents appeared in .meta keys and .metn values . .TP* "Dialect Note:" The ANSI CL .code pairlis requires .meta keys and .meta data to be lists, not sequences. The behavior of the ANSI CL .code pairlis is undefined if those lists are of different lengths. Finally, in ANSI CL the pairs are permitted to appear in either the original order or reverse order. .TP* Examples: .verb (pairlis nil nil) -> nil (pairlis "abc" #(1 2 3 4)) -> ((#\ea . 1) (#\eb . 2) (#\ec . 3)) (pairlis '(1 2 3) '(a b c) '((x . y) (z . w))) -> ((1 . a) (2 . b) (3 . c) (x . y) (z . w)) .brev .SS* Property Lists A .IR "property list", also referred to as a .IR plist , is a flat list of even length consisting of interleaved pairs of property names (usually symbols) and their values (arbitrary objects). An example property list is (:a 1 :b "two") which contains two properties, :a having value 1, and :b having value "two". An .I "improper plist" represents Boolean properties in a condensed way, as property indicators which are not followed by a value. Such properties only indicate their presence or absence, which is useful for encoding a Boolean value. If such a property is present, it is true; if it is absent, it is false. Correctly using an improper plist requires that the exact set of Boolean keys is established by convention. In this document, the unqualified terms .I "property list" and .I "plist" refer strictly to an ordinary plist, not to an improper plist. .TP* "Dialect Note:" Unlike in some other Lisp dialects, including ANSI Common Lisp, symbols do not have property lists in \*(TL. Improper plists aren't a concept in ANSI CL. .coNP Function @ prop .synb .mets (prop < plist << key ) .syne .desc The .code prop function searches property list .meta plist for key .metn key . If the key is found, then the value next to it is returned. Otherwise .code nil is returned.
It is ambiguous whether .code nil is returned due to the property not being found, or due to the property being present with a .code nil value. The indicators in .meta plist are compared with .meta key using .code eq equality, allowing them to be symbols, characters or .code fixnum integers. .coNP Function @ memp .synb .mets (memp < key << plist ) .syne .desc The .code memp function searches property list .meta plist for key .metn key , using .code eq equality. If the key is found, then the entire suffix of .meta plist beginning with the indicator is returned, such that the first element of the returned list is .meta key and the second element is the property value. Note the reversed argument convention relative to the .code prop function, harmonizing with functions in the .code member family. .coNP Functions @ plist-to-alist and @ improper-plist-to-alist .synb .mets (plist-to-alist << plist ) .mets (improper-plist-to-alist < imp-plist << bool-keys ) .syne .desc The functions .code plist-to-alist and .code improper-plist-to-alist convert, respectively, a property list and improper property list to an association list. The .code plist-to-alist function scans .meta plist and returns the indicator-property pairs as a list of cons cells, such that each .code car is the indicator, and each .code cdr is the value. The .code improper-plist-to-alist is similar, except that it handles the Boolean properties which, by convention, aren't followed by a value. The list of all such indicators is specified by the .code bool-keys argument. .TP* "Examples:" .verb (plist-to-alist '(a 1 b 2)) --> ((a . 1) (b . 2)) (improper-plist-to-alist '(:x 1 :blue :y 2) '(:blue)) --> ((:x . 1) (:blue) (:y . 2)) .brev .SS* List Sorting Note: these functions operate on lists. The principal sorting function in \*(TL is .codn sort , described under Sequence Manipulation. 
The .code merge function described here provides access to an elementary step of the algorithm used internally by .code sort when operating on lists. The .code multi-sort operation sorts multiple lists in parallel. It is implemented using .codn sort . .coNP Function @ merge .synb .mets (merge < seq1 < seq2 >> [ lessfun <> [ keyfun ]]) .syne .desc The .code merge function merges two sorted sequences .meta seq1 and .meta seq2 into a single sorted sequence. The semantics and defaulting behavior of the .meta lessfun and .meta keyfun arguments are the same as those of the sort function. The sequence which is returned is of the same kind as .metn seq1 . This function is destructive of any inputs that are lists. If the output is a list, it is formed out of the structure of the input lists. .coNP Function @ multi-sort .synb .mets (multi-sort < columns < less-funcs <> [ key-funcs ]) .syne .desc The .code multi-sort function regards a list of lists to be the columns of a database. The corresponding elements from each list constitute a record. These records are to be sorted, producing a new list of lists. The .meta columns argument supplies the list of lists which comprise the columns of the database. The lists should ideally be of the same length. If the lists are of different lengths, then the shortest list is taken to be the length of the database. Excess elements in the longer lists are ignored, and do not appear in the sorted output. The .meta less-funcs argument supplies a list of comparison functions which are applied to the columns. Successive functions correspond to successive columns. If .meta less-funcs is an empty list, then the sorted database will emerge in the original order. If .meta less-funcs contains exactly one function, then the rows of the database are sorted according to the first column. The remaining columns simply follow their row. 
If .meta less-funcs contains more than one function, then additional columns are taken into consideration if the items in the previous columns compare .codn equal . For instance, if two elements from column one compare .codn equal , then the corresponding second column elements are compared using the second column comparison function. The .meta less-funcs argument may be a function object, in which case it is treated as if it were a one-element list containing that function object. The optional .meta key-funcs argument supplies transformation functions through which column entries are converted to comparison keys, similarly to the single key function used in the .code sort function and others. If there are more key functions than less functions, the excess key functions are ignored. .SS* Lazy Lists and Lazy Evaluation .coNP Function @ make-lazy-cons .synb .mets (make-lazy-cons < function >> [ car <> [ cdr ]]) .syne .desc The function .code make-lazy-cons makes a special kind of cons cell called a lazy cons, whose type is .codn lcons . Lazy conses are useful for implementing lazy lists. Lazy lists are lists which are not allocated all at once. Rather, the elements of their structure materialize just before they are accessed. A lazy cons has .code car and .code cdr fields like a regular cons, and those fields are initialized to the values of the .meta car and .meta cdr arguments of .code make-lazy-cons when the lazy cons is created. These arguments default to .code nil if omitted. A lazy cons also has an update function, which is specified by the .meta function argument to .codn make-lazy-cons . The .meta function argument must be a function that may be called with exactly one argument. When either the .code car or .code cdr field of a lazy cons is accessed for the first time to retrieve its value, .meta function is automatically invoked first, and is given the lazy cons as its argument.
That function has the opportunity to store new values into the .code car and .code cdr fields. Once the function is called, it is removed from the lazy cons: the lazy cons no longer has an update function. If the update function itself attempts to retrieve the value of the lazy cons cell's .code car or .code cdr field, it will be recursively invoked. The functions .code lcons-car and .code lcons-cdr may be used to access the fields of a lazy cons without triggering the update function. Storing a value into either the .code car or .code cdr field does not have the effect of invoking the update function. If the function terminates by returning normally, the access to the value of the field then proceeds in the ordinary manner, retrieving whatever value has most recently been stored. The return value of the function is ignored. To perpetuate the growth of a lazy list, the function can make another call to .code make-lazy-cons and install the resulting cons as the .code cdr of the lazy cons. .TP* Example: .verb ;;; lazy list of integers between min and max (defun integer-range (min max) (let ((counter min)) ;; min is greater than max; just return empty list, ;; otherwise return a lazy list (if (> min max) nil (make-lazy-cons (lambda (lcons) ;; install next number into car (rplaca lcons counter) ;; now deal with cdr field (cond ;; max reached, terminate list with nil! ((eql counter max) (rplacd lcons nil)) ;; max not reached: increment counter ;; and extend with another lazy cons (t (inc counter) (rplacd lcons (make-lazy-cons (lcons-fun lcons)))))))))) .brev .coNP Function @ lconsp .synb .mets (lconsp << value ) .syne .desc The .code lconsp function returns .code t if .meta value is a lazy cons cell. Otherwise it returns .codn nil , even if .meta value is an ordinary cons cell. .coNP Function @ lcons-fun .synb .mets (lcons-fun << lazy-cons ) .syne .desc The .code lcons-fun function retrieves the update function of a lazy cons.
Once a lazy cons has been accessed, it no longer has an update function and .code lcons-fun returns .codn nil . While the update function of a lazy cons is executing, it is still accessible. This allows the update function to retrieve a reference to itself and propagate itself into another lazy cons (as in the example under .codn make-lazy-cons ). .coNP Functions @ lcons-car and @ lcons-cdr .synb .mets (lcons-car << lazy-cons ) .mets (lcons-cdr << lazy-cons ) .syne .desc The functions .code lcons-car and .code lcons-cdr retrieve the .code car and .code cdr fields of .metn lazy-cons , without triggering the invocation of its associated update function. The .meta lazy-cons argument must be an object of type .codn lcons . Unlike the functions .code car and .codn cdr , these functions cannot be applied to any other type of object. Note: these functions may be used by the update function to retrieve the values which were stored into .meta lazy-cons by the .code make-lazy-cons constructor, without triggering recursion. The function may then overwrite either or both of these values. This allows the fields of the lazy cons to store state information necessary for the propagation of a lazy list. If that state information consists of no more than two values, then no additional context object need be allocated. .coNP Function @ lcons-force .synb .mets (lcons-force << object ) .syne .desc The .code lcons-force function recursively forces a lazy cons. If the argument .meta object is of .code lcons type, and has not been previously forced, then it is forced: its associated update function is invoked. Then, .code lcons-force is recursively invoked on the .code car and .code cdr fields of the lazy cons. The .code lcons-force function returns its argument. .coNP Macro @ lcons .synb .mets (lcons < car-expression << cdr-expression ) .syne .desc The .code lcons macro simplifies the construction of structures based on lazy conses. Syntactically, it resembles the .code cons function.
However, the arguments are expressions rather than values. The macro generates code which, when evaluated, immediately produces a lazy cons. The expressions .meta car-expression and .meta cdr-expression are not immediately evaluated. Rather, when either the .code car or .code cdr field of the lazy cons cell is accessed, these expressions are both evaluated at that time, in the order that they appear in the .code lcons expression, and in the original lexical scope in which that expression was evaluated. The return values of these expressions are used, respectively, to initialize the corresponding fields of the lazy cons. Note: the .code lcons macro may be understood in terms of the following reference implementation, as a syntactic sugar combining the .code make-lazy-cons constructor with a lexical closure provided by a .code lambda function: .verb (defmacro lcons (car-form cdr-form) (let ((lc (gensym))) ^(make-lazy-cons (lambda (,lc) (rplaca ,lc ,car-form) (rplacd ,lc ,cdr-form))))) .brev .TP* Example: .verb ;; Given the following function ... (defun fib-generator (a b) (lcons a (fib-generator b (+ a b)))) ;; ... the following function call generates the Fibonacci ;; sequence as an infinite lazy list. (fib-generator 1 1) -> (1 1 2 3 5 8 13 ...) .brev .coNP Functions @ lazy-stream-cons and @ get-lines .synb .mets (lazy-stream-cons < stream <> [ no-throw-close-p ]) .mets (get-lines >> [ stream <> [ no-throw-close-p ]]) .syne .desc The .code lazy-stream-cons and .code get-lines functions are synonyms, except that the .meta stream argument is optional in .code get-lines and defaults to .codn *stdin* . Thus, the following description of .code lazy-stream-cons also applies to .codn get-lines . The .code lazy-stream-cons returns a lazy cons which generates a lazy list based on reading lines of text from input stream .metn stream , which form the elements of the list. The .code get-line function is called on demand to add elements to the list. 
The .code lazy-stream-cons function itself makes the first call to .code get-line on the stream. If this returns .codn nil , then the stream is closed and .code nil is returned. Otherwise, a lazy cons is returned whose update function will install that line into the .code car field of the lazy cons, and continue the lazy list by making another call to .codn lazy-stream-cons , installing the result into the .code cdr field. When this lazy list obtains an end-of-file indication from the stream, it closes the stream. .code lazy-stream-cons inspects the real-time property of a stream as if by the .code real-time-stream-p function. This determines which of two styles of lazy list are returned. For an ordinary (non-real-time) stream, the lazy list treats the end-of-file condition accurately: an empty file turns into the empty list .codn nil , a one line file into a one-element list which contains that line and so on. This accuracy requires one line of lookahead which is not acceptable in real-time streams, and so a different type of lazy list is used, which generates an extra .code nil item after the last line. Under this type of lazy list, an empty input stream translates to the list .codn (nil) ; a one-line stream translates to .mono ("line" nil) .onom and so forth. If and when .meta stream is closed by the function directly, or else by the returned lazy list, the .meta no-throw-close-p Boolean argument, defaulting to .codn nil , controls the .meta throw-on-error-p argument of the call to the .code close-stream function. These arguments have opposite polarity: if .meta no-throw-close-p is true, then .meta throw-on-error-p shall be false, and vice versa. Note: the .code lcons-force function may be used on the return value of .code get-lines to force the lazy list. 
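.TP* Example:

A minimal sketch of the ordinary (non-real-time) case. It assumes that a string input stream, created here with
.codn make-string-input-stream ,
stands in for a file; the sample text is an illustrative assumption.

.verb
  (get-lines (make-string-input-stream "alpha\enbeta\engamma"))
  -> ("alpha" "beta" "gamma")
.brev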
.coNP Macro @ close-lazy-streams .synb .mets (close-lazy-streams << body-form *) .syne .desc The .code close-lazy-streams macro establishes a dynamic environment in which zero or more .metn body-form s are evaluated, yielding the value of the last .metn body-form , or else .code nil if there are no .meta body-form arguments. In this regard, the macro operator resembles .codn progn . The environment established by .code close-lazy-streams sets up special monitoring of the functions .code lazy-stream-cons and .codn get-lines . Whenever these functions register an I/O stream with a lazy list, in the dynamic scope of this environment, that stream is recorded in a hidden list associated with the innermost enclosing .code close-lazy-streams form. When the form terminates, it invokes .code close-stream on each stream in the hidden list. Note: the .code close-lazy-streams macro provides a possible solution for situations in which a body of code, possibly consisting of nested functions, manipulates lazy lists of lines coming from I/O streams, such that these lists are not completely forced. Incompletely processed lazy lists will not close their associated streams until they are reclaimed by garbage collection, which could cause the application to run out of file descriptors. The .code close-lazy-streams macro allows the application to delineate a dynamic contour of code upon whose termination all such stream associations generated within that contour will be duly cleaned up.
.TP* Example: Collect list of names of .code .tl files which contain the string .strn "(cons " : .verb ;; Incorrect version: could run out of open files if there are many ;; files which contain a match processed, because find-if will stop ;; traversing the list of lines when it finds a match: (build (each ((file (glob "*.tl"))) (if (find-if #/\e(cons / (file-get-lines file)) (add file)))) ;; Addressed with close-lazy-streams: after each iteration, the ;; stream created by file-get-lines is closed. (build (each ((file (glob "*.tl"))) (close-lazy-streams (if (find-if #/\e(cons / (file-get-lines file)) (add file))))) .brev .coNP Macro @ delay .synb .mets (delay << expression ) .syne .desc The delay operator arranges for the delayed (or "lazy") evaluation of .metn expression . This means that the expression is not evaluated immediately. Rather, the delay expression produces a promise object. The promise object can later be passed to the .code force function (described later in this document). The force function will trigger the evaluation of the expression and retrieve the value. The expression is evaluated in the original scope, no matter where the .code force takes place. The expression is evaluated at most once, by the first call to .codn force . Additional calls to .code force only retrieve a cached value. .TP* Example: .verb ;; list is popped only once: the value is computed ;; just once when force is called on a given promise ;; for the first time. (defun get-it (promise) (format t "*list* is ~s\en" *list*) (format t "item is ~s\en" (force promise)) (format t "item is ~s\en" (force promise)) (format t "*list* is ~s\en" *list*)) (defvar *list* '(1 2 3)) (get-it (delay (pop *list*))) Output: *list* is (1 2 3) item is 1 item is 1 *list* is (2 3) .brev .coNP Accessor @ force .synb .mets (force << promise ) .mets (set (force << promise ) << new-value ) .syne .desc The .code force function accepts a promise object produced by the .code delay macro. 
The first time .code force is invoked, the .meta expression which was wrapped inside .meta promise by the .code delay macro is evaluated (in its original lexical environment, regardless of where in the program the .code force call takes place). The value of .meta expression is cached inside .meta promise and returned, becoming the return value of the .code force function call. If the .code force function is invoked additional times on the same promise, the cached value is retrieved. A .code force form is a syntactic place, denoting the value cache location within .metn promise . Storing a value in a .code force place causes future accesses to the .meta promise to return that value. If the promise had not yet been forced, then storing a value into it prevents that from ever happening. The delayed .meta expression will never be evaluated. If, while a promise is being forced, the evaluation of .meta expression itself causes an assignment to the promise, it is not specified whether the promise will take on the value of .meta expression or the assigned value. .coNP Function @ promisep .synb .mets (promisep << object ) .syne .desc The .code promisep function returns .code t if .meta object is a promise object: an object created by the .code delay macro. Otherwise it returns .codn nil . Note: promise objects are conses. The .code typeof function applied to a promise returns .codn cons . .coNP Macro @ mlet .synb .mets (mlet >> ({ sym | >> ( sym << init-form )}*) << body-form *) .syne .desc The .code mlet macro ("magic let" or "mutual let") implements a variable binding construct similar to .code let and .codn let* . Under .codn mlet , the scope of the bindings of the .meta sym variables extends over the .metn init-form s, as well as the .metn body-form s. Unlike the .code let* construct, each .meta init-form has each .meta sym in scope. That is to say, an .meta init-form can refer not only to previous variables, but also to later variables as well as to its own variable. 
The variables are not initialized until their values are accessed for the first time. Any .meta sym whose value is not accessed is not initialized. Furthermore, the evaluation of each .meta init-form does not take place until the time when its value is needed to initialize the associated .metn sym . This evaluation takes place once. If a given .meta sym is not accessed during the evaluation of the .code mlet construct, then its .meta init-form is never evaluated. The bound variables may be assigned. If, before initialization, a variable is updated in such a way that its prior value is not needed, it is unspecified whether initialization takes place, and thus whether its .meta init-form is evaluated. Direct circular references are erroneous and are diagnosed. This takes place when the macro-expanded form is evaluated, not during the expansion of .codn mlet . .TP* Examples: .verb ;; Dependent calculations in arbitrary order (mlet ((x (+ y 3)) (z (+ x 1)) (y 4)) (+ z 4)) --> 12 ;; Error: circular reference: ;; x depends on y, y on z, but z on x again. (mlet ((x (+ y 1)) (y (+ z 1)) (z (+ x 1))) z) ;; Okay: lazy circular reference because lcons is used (mlet ((list (lcons 1 list))) list) --> (1 1 1 1 1 ...) ;; circular list .brev In the last example, the .code list variable is accessed for the first time in the body of the .code mlet form. This causes the evaluation of the .code lcons form. This form evaluates its arguments lazily, which means that it is not a problem that .code list is not yet initialized. The form produces a lazy cons, which is then used to initialize .codn list . When the .code car or .code cdr fields of the lazy cons are accessed, the .code list expression in the .code lcons argument is accessed. By that time, the variable is initialized and holds the lazy cons itself, which creates the circular reference, and a circular list.
.coNP Functions @, generate @ giterate and @ ginterate .synb .mets (generate < while-fun << gen-fun ) .mets (giterate < while-fun < gen-fun <> [ value ]) .mets (ginterate < while-fun < gen-fun <> [ value ]) .syne .desc The .code generate function produces a lazy list which dynamically produces items according to the following logic. The arguments to .code generate are functions which do not take any arguments. The return value of .code generate is a lazy list. When the lazy list is accessed, for instance with the functions .code car and .codn cdr , it produces items on demand. Prior to producing each item, .meta while-fun is called. If it returns a true Boolean value (any value other than .codn nil ), then the .meta gen-fun function is called, and its return value is incorporated as the next item of the lazy list. But if .meta while-fun yields .codn nil , then the lazy list immediately terminates. Prior to returning the lazy list, .code generate invokes the .meta while-fun one time. If .meta while-fun yields .codn nil , then .code generate returns the empty list .code nil instead of a lazy list. Otherwise, it instantiates a lazy list, and invokes the .meta gen-fun to populate it with the first item. The .code giterate function is similar to .codn generate , except that .meta while-fun and .meta gen-fun are functions of one argument rather than functions of no arguments. The optional .meta value argument defaults to .code nil and is threaded through the function calls. That is to say, the lazy list returned is .mono .meti >> ( value >> [ gen-fun << value ] >> [ gen-fun >> [ gen-fun << value ]] ...). .onom The lazy list terminates when a value fails to satisfy .metn while-fun . That is to say, prior to generating each value, the lazy list tests the value using .metn while-fun . If that function returns .codn nil , then the item is not added, and the sequence terminates.
Note: .code giterate could be written in terms of .code generate like this: .verb (defun giterate (w g v) (generate (lambda () [w v]) (lambda () (prog1 v (set v [g v]))))) .brev The .code ginterate function is a variant of .code giterate which includes the test-failing item in the generated sequence. That is to say .code ginterate generates the next value and adds it to the lazy list. The value is then tested using .metn while-fun . If that function returns .codn nil , then the list is terminated, and no more items are produced. .TP* Example: .verb (giterate (op > 5) (op + 1) 0) -> (0 1 2 3 4) (ginterate (op > 5) (op + 1) 0) -> (0 1 2 3 4 5) .brev .coNP Function @ expand-right .synb .mets (expand-right < gen-fun << value ) .syne .desc The .code expand-right function is a complement to .codn reduce-right , with lazy semantics. The .meta gen-fun parameter is a function, which must accept a single argument, and return either a cons pair or .codn nil . The .meta value parameter is any value. The first call to .meta gen-fun receives .metn value . The return value is interpreted as follows. If .meta gen-fun returns a cons-cell pair .mono .meti >> ( elem . << next ) .onom then .meta elem specifies the element to be added to the lazy list, and .meta next specifies the value to be passed to the next call to .metn gen-fun . If .meta gen-fun returns .code nil then the lazy list ends. 
.TP* Examples: .verb ;; Count down from 5 to 1 using explicit lambda ;; for gen-fun: (expand-right (lambda (item) (if (zerop item) nil (cons item (pred item)))) 5) --> (5 4 3 2 1) ;; Using functional combinators: [expand-right [iff zerop nilf [callf cons identity pred]] 5] --> (5 4 3 2 1) ;; Include zero: [expand-right [iff null nilf [callf cons identity [iff zerop nilf pred]]] 5] --> (5 4 3 2 1 0) .brev .coNP Functions @ expand-left and @ nexpand-left .synb .mets (expand-left < gen-fun << value ) .mets (nexpand-left < gen-fun << value ) .syne .desc The .code expand-left function is a companion to .codn expand-right . Unlike .codn expand-right , it has eager semantics: it calls .meta gen-fun repeatedly and accumulates an output list, not returning until .meta gen-fun returns .codn nil . The semantics is as follows. .code expand-left initializes an empty accumulation list. Then .meta gen-fun is called, with .meta value as its argument. If .meta gen-fun returns a cons cell, then the .code car of that cons cell is pushed onto the accumulation list, and the procedure is repeated: .meta gen-fun is called again, with the .code cdr of that cons cell taking the place of .metn value . If .meta gen-fun returns .codn nil , then the accumulation list is returned. If the expression .code "(expand-right f v)" produces a terminating list, then the following equivalence holds: .verb (expand-left f v) <--> (reverse (expand-right f v)) .brev The equivalence cannot hold for arguments to .code expand-left which produce an infinite list. The .code nexpand-left function is a destructive version of .codn expand-left . The list returned by .code nexpand-left is composed of the cons cells returned by .meta gen-fun whereas the list returned by .code expand-left is composed of freshly allocated cons cells. .coNP Function @ repeat .synb .mets (repeat < list <> [ count ]) .syne .desc If .meta list is empty, then .code repeat returns an empty list. 
If .meta count is omitted, the .code repeat function produces an infinite lazy list formed by catenating together copies of .metn list . If .meta count is specified and is zero or negative, then an empty list is returned. Otherwise a list is returned consisting of .meta count repetitions of .meta list catenated together. .coNP Function @ pad .synb .mets (pad < sequence >> [ object <> [ count ]]) .syne .desc The .code pad function produces a lazy list which consists of all of the elements of .meta sequence followed by repetitions of .metn object . If .meta object is omitted, it defaults to .codn nil . If .meta count is omitted, then the repetition of .meta object is infinite. Otherwise the specified number of repetitions occur. Note that .meta sequence may be a lazy list which is infinite. In that case, the repetitions of .meta object will never occur. .coNP Function @ weave .synb .mets (weave <> { sequence }*) .syne .desc The .code weave function interleaves elements from the sequences given as arguments. If called with no arguments, it returns the empty list. If called with a single sequence, it returns the elements of that sequence as a new lazy list. When called with two or more sequences, .code weave returns a lazy list which draws elements from the sequences in a round-robin fashion, repeatedly scanning the sequences from left to right, and taking an item from each one, removing it from the sequence. Whenever a sequence runs out of items, it is deleted; the weaving then continues with the remaining sequences. The weaved sequence terminates when all sequences are eliminated. (If at least one of the sequences is an infinite lazy list, then the weaved sequence is infinite.) .TP* Examples: .verb ;; Weave negative integers with positive ones: (weave (range 1) (range -1 : -1)) -> (1 -1 2 -2 3 -3 ...) 
(weave "abcd" (range 1 3) '(x x x x x x x)) --> (#\ea 1 x #\eb 2 x #\ec 3 x #\ed x x x x) .brev .coNP Macros @ gen and @ gun .synb .mets (gen < while-expression << produce-item-expression ) .mets (gun << produce-item-expression ) .syne .desc The .code gen macro operator produces a lazy list, in a manner similar to the .code generate function. Whereas the .code generate function takes functional arguments, the .code gen operator takes two expressions, which is often more convenient. The return value of .code gen is a lazy list. When the lazy list is accessed, for instance with the functions .code car and .codn cdr , it produces items on demand. Prior to producing each item, the .meta while-expression is evaluated, in its original lexical scope. If the expression yields a .cod2 non- nil value, then .meta produce-item-expression is evaluated, and its return value is incorporated as the next item of the lazy list. If the expression yields .codn nil , then the lazy list immediately terminates. The .code gen operator itself immediately evaluates .meta while-expression before producing the lazy list. If the expression yields .codn nil , then the operator returns the empty list .codn nil . Otherwise, it instantiates the lazy list and invokes the .meta produce-item-expression to force the first item. The .code gun macro similarly creates a lazy list according to the following rules. Each successive item of the lazy list is obtained as a result of evaluating .metn produce-item-expression . However, when .meta produce-item-expression yields .codn nil , then the list terminates (without adding that .code nil as an item). 
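
For instance, the termination rule of
.code gun
can be sketched as follows. This example is not taken from a recorded
session: the result shown is worked out from the rule stated above, and the
variable name
.code items
is arbitrary.

.verb
  ;; Items are popped from a list until nil is encountered.
  ;; That nil is not added to the lazy list, and the
  ;; trailing 4 is never reached.
  (let ((items '(1 2 3 nil 4)))
    (gun (pop items)))
  --> (1 2 3)
.brev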
Note 1: the form .code gun can be implemented as a macro expanding to an instance of the .code gen operator, like this: .verb (defmacro gun (expr) (let ((var (gensym))) ^(let (,var) (gen (set ,var ,expr) ,var)))) .brev This exploits the fact that the .code set operator returns the value that is assigned, so the set expression is tested as a condition by .codn gen , while having the side effect of storing the next item temporarily in a hidden variable. In turn, .code gen can be implemented as a macro expanding to some .code lambda functions which are passed to the .code generate function: .verb (defmacro gen (while-expr produce-expr) ^(generate (lambda () ,while-expr) (lambda () ,produce-expr))) .brev Note 2: .code gen can be considered as an acronym for Generate, testing Expression before Next item, whereas .code gun stands for Generate Until Null. .TP* Example: .verb ;; Make a lazy list of integers up to 1000 ;; access and print the first three. (let* ((counter 0) (list (gen (< counter 1000) (inc counter)))) (format t "~s ~s ~s\en" (pop list) (pop list) (pop list))) Output: 1 2 3 .brev .coNP Functions @ range and @ range* .synb .mets (range >> [ from >> [ to <> [ step ]]]) .mets (range* >> [ from >> [ to <> [ step ]]]) .syne .desc The .code range and .code range* functions generate a lazy, potentially infinite list, according to several disciplines. There is a major division in behavior depending on whether or not the .code from argument, which specifies the initial item, is an arithmetic type according to the .code arithp function. The following remarks describe the arithmetic case. A description of the non-arithmetic behavior follows. The difference between .code range and .code range* is that .code range* excludes the endpoint. For instance .code "(range 0 3)" generates the list .codn "(0 1 2 3)" , whereas .code "(range* 0 3)" generates .codn "(0 1 2)" . All arguments are optional. 
If the .meta step argument is omitted, then it defaults to 1 if the .meta to argument is omitted, or else if it is greater than or equal to .meta from according to the .code > function. If .meta to is given, and is less than .metn from , then a missing .code step argument defaults to -1. Each value in the list is obtained from the previous by adding the .meta step value. Positive or negative .meta step values are allowed. There is no check for a step size of zero, or for a step direction which cannot meet the endpoint. The .meta step argument may be a function. The function must accept one argument. That argument is an element of the list, from which the function calculates the next element. The .meta to argument specifies the endpoint value, which, if it occurs in the list, is excluded from it by the .code range* function, but included by the range function. If .meta to is missing, or specified as .codn nil , then there is no endpoint, and the list which is generated is infinite, regardless of .metn step . If .meta from is omitted, then the list begins at zero, otherwise .meta from must be an arithmetic object which specifies the initial value. The list stops if it reaches the endpoint value (which is included in the case of .codn range , and excluded in the case of .codn range *). However, depending on the arguments, it is possible that the generated list doesn't contain the endpoint value, yet steps over it. This occurs when the previous value of the list is less than the endpoint value, but the next value is greater, or vice versa. In this situation, the list also stops, and the excess value which surpasses the endpoint is excluded from the list. The rest of the description applies to the case when the .code from argument is a non-arithmetic type. In the non-arithmetic case, the .meta step argument unconditionally defaults to 1. If it is given, it must either be a function, or else a positive integer. 
If .meta step is a function, that function is used to determine each successive value from the previous similarly to the arithmetic case. If the .meta to value is omitted, an infinite list is generated this way. If the .meta to argument is present, the list stops if it attains the endpoint value. No provision is made for the endpoint value being skipped, like in the arithmetic case. When the endpoint value is reached, the .code range* function omits that value from the list. If .meta step is a positive integer, then range iteration is used. A range value is constructed from the .meta from and .meta to arguments as if by the .mono .meti (rcons* < from << to ) .onom expression. Here, the .code to argument defaults to .code nil if it is missing. An iterator is created for the resulting range object as if by .code iter-begin and this iterator is then used to obtain values for the lazy list returned by .code range or .codn range* . The list ends when the iterator indicates that no more items are available. In the case of the .code range* function, the last value produced by the iterator is omitted from the list. The .meta step size is used to skip items from the iterator. For instance, if the value is 3, then the sequence begins with the .meta from value. The next two values from the sequence are omitted. The fourth item from the sequence is included in the list (unless there either is no such item, or the function is .codn range* , and that item is the last one). 
.TP* Examples: .verb (range 1 1) -> (1) (range 0 4) -> (0 1 2 3 4) (range 4 0) -> (4 3 2 1 0) (range 0.0 2.0 0.5) -> (0.0 0.5 1.0 1.5 2.0) (range #R(0 1) #R(3 4)) -> (#R(0 1) #R(1 2) #R(2 3) #R(3 4)) (range 0 4 2) -> (0 2 4) (range #\ea #\ee 2) -> (#\ea #\ec #\ee) (range 1 32 (op * 2)) -> (1 2 4 8 16 32) (range* 1 1) -> nil (range* 0 4) -> (0 1 2 3) (range* 4 0 -2) -> (4 2) (range 0 1.25 0.5) -> (0 0.5 1.0) (range* 0 1.25 0.5) -> (0 0.5 1.0) (range "A" "A") -> nil (range "AA" "BC") -> ("AA" "AB" "AC" "BA" "BB" "BC") (range "AA" "BC" 2) -> ("AA" "AC" "BB") [range* "ABCD" nil rest] -> ("ABCD" "BCD" "CD" "D") .brev .coNP Functions @ rlist and @ rlist* .synb .mets (rlist << item *) .mets (rlist* << item *) .syne .desc The .code rlist ("range list") function is useful for producing a list consisting of a mixture of discontinuous numeric or character ranges and individual items. The function returns a lazy list of elements. The items are produced by converting the function's successive .meta item arguments into lists, which are lazily catenated together to form the output list. Each .meta item is transformed into a list as follows. Any item which is .B not a range object is trivially turned into a one-element list as if by the .mono .meti (list << item *) .onom expression. Any item which is a range object, whose .code to field .B isn't a range is turned into a lazy list as if by evaluating the .mono .meti (range (from << item) (to << item)) .onom expression. Thus for instance the argument .code 1..10 turns into the (lazy) list .codn "(1 2 3 4 5 6 7 8 9 10)" . Any item which is a range object such that its .code to field is also a range is turned into a lazy list as if by evaluating the .mono .meti (range (from << item) (from (to << item)) (to (to << item))) .onom expression. Thus for instance the argument expression .code 1..10..2 produces an .meta item which .code rlist turns into the lazy list .code "(1 3 5 7 9)" as if by the call .codn "(range 1 10 2)" . 
Note that the expression .code 1..10..2 stands for the expression .code "(rcons 1 (rcons 10 2))" which evaluates to .codn "#R(1 #R(10 2))" . The .code "#R(1 #R(10 2))" range literal syntax can be passed as an argument to .code rlist with the same result as .codn 1..10..2 . The .code rlist* function differs from .code rlist in one regard: under .codn rlist* , the ranges denoted by the range notation exclude the endpoint. That is, the ranges are generated as if by the .code range* function rather than .codn range . Note: it is permissible for .meta item objects to specify infinite ranges. It is also permissible to apply .code rlist to an infinite argument list. .TP* Examples: .verb (rlist 1 "two" :three) -> (1 "two" :three) (rlist 10 15..16 #\ea..#\ed 2) -> (10 15 16 #\ea #\eb #\ec #\ed 2) (take 7 (rlist 1 2 5..:)) -> (1 2 5 6 7 8 9) .brev .SS* Ranges Ranges are objects that aggregate two values, not unlike .code cons cells. However, they are atoms, and are primarily intended to hold numeric or character values in their two fields. These fields are called .code from and .code to which are the names of the functions which access them. These fields are not mutable; a new value cannot be stored into either field of a range. The printed notation for a range object consists of the prefix .code #R (hash R) followed by the two values expressed as a two-element list. Ranges can be constructed using the .code rcons function. The notation .code x..y corresponds to .codn "(rcons x y)" . Ranges behave as a numeric type and support a subset of the numeric operations. 
Two ranges can be added or subtracted, which obeys these equivalences: .verb (+ a..b c..d) <--> (+ a c)..(+ b d) (- a..b c..d) <--> (- a c)..(- b d) .brev A range .code a..b can be combined with a character or number .code n using addition or subtraction, which obeys these equivalences: .verb (+ a..b n) <--> (+ n a..b) <--> (+ a n)..(+ b n) (- a..b n) <--> (- a n)..(- b n) (- n a..b) <--> (- n a)..(- n b) .brev A range can be multiplied by a number: .verb (* a..b n) <--> (* n a..b) <--> (* a n)..(* b n) .brev A range can be divided by a number using the .code / or .code trunc functions, but a number cannot be divided by a range: .verb (trunc a..b n) <--> (trunc a n)..(trunc b n) (/ a..b n) <--> (/ a n)..(/ b n) .brev Ranges can be compared using the equality and inequality functions .codn = , .codn < , .codn > , .code <= and .codn >= . Equality obeys this equivalence: .verb (= a..b c..d) <--> (and (= a c) (= b d)) .brev Inequality comparisons treat the .code from component with precedence over .codn to : if the .code from components of the two ranges are not equal under the .code = function, then the inequality is based solely on them. If they are equal, then the inequality is based on the .code to components. This gives rise to the following equivalences: .verb (< a..b c..d) <--> (if (= a c) (< b d) (< a c)) (> a..b c..d) <--> (if (= a c) (> b d) (> a c)) (>= a..b c..d) <--> (if (= a c) (>= b d) (> a c)) (<= a..b c..d) <--> (if (= a c) (<= b d) (< a c)) .brev Ranges can be negated with the one-argument form of the .code - function, which is equivalent to subtraction from zero: the negation distributes over the two range components. The .code abs function also applies to ranges and distributes into their components. The .code succ and .code pred family of functions also operate on ranges. 
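
For instance, the equivalences above give rise to the following results.
This is a sketch worked out from the stated rules rather than copied from a
recorded session; the results are written in the
.code #R
printed notation, and the operand values are arbitrary.

.verb
  (+ 1..5 10..20)  --> #R(11 25)
  (- 10..20 1..5)  --> #R(9 15)
  (+ 1..5 10)      --> #R(11 15)
  (- 10 1..5)      --> #R(9 5)
  (* 1..5 3)       --> #R(3 15)
  (- 1..5)         --> #R(-1 -5)
  (= 1..5 1..5)    --> t
  (< 1..5 1..6)    --> t   ;; from fields equal: to fields decide
  (< 2..0 1..9)    --> nil ;; from fields differ: they alone decide
.brev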
The length of the range .code a..b is defined as .codn "(- b a)" , and may be obtained using the .code length function. The .code empty function accepts ranges and tests them for zero length. .coNP Function @ rcons .synb .mets (rcons < from << to ) .syne .desc The .code rcons function constructs a range object which holds the values .meta from and .metn to . Though range objects are effectively binary cells like conses, they are atoms. They also aren't considered sequences, nor are they structures. Range objects are used for indicating numeric ranges, such as substrings of lists, arrays and strings. The dotdot notation serves as a syntactic sugar for .codn rcons . The syntax .code a..b denotes the expression .codn "(rcons a b)" . Note that ranges are immutable, meaning that it is not possible to replace the values in a range. .coNP Function @ rangep .synb .mets (rangep << value ) .syne .desc The .code rangep function returns .code t if .meta value is a range. Otherwise it returns .codn nil . .coNP Functions @ from and @ to .synb .mets (from << range ) .mets (to << range ) .syne .desc The .code from and .code to functions retrieve, respectively, the from and to fields of a range. Note that these functions are not accessors, which is because ranges are immutable. .coNP Functions @ in-range and @ in-range* .synb .mets (in-range < range << value ) .mets (in-range* < range << value ) .syne .desc The .code in-range and .code in-range* functions test whether the .meta value argument lies in the range represented by the .meta range argument, indicating the Boolean result using one of the values .code t or .codn nil . The .meta range argument must be a range object. It is expected that the range object's .code from value does not exceed the .code to value; a reversed range is considered empty. The .code in-range* function differs from .code in-range in that it excludes the upper endpoint. 
The implicit comparison against the range endpoints is performed using the .code less and .code lequal functions, as appropriate. The following equivalences hold: .verb (in-range r x) <--> (and (lequal (from r) x) (lequal x (to r))) (in-range* r x) <--> (and (lequal (from r) x) (less x (to r))) .brev .coNP Function @ rangeref .synb .mets (rangeref < range >> [ idx | << seq ]) .syne .desc The .code rangeref function requires its .meta range argument to be a range object. It supports two semantics, based on the type of the second argument. If the second argument is an integer, then it is interpreted as .metn idx . The function then treats the .meta range as if it were a sequence. The .meta range must be a numeric or character range. The .code from field of .meta range is added to .meta idx to form the tentative return value. If the .code to field is a value other than .code t or the .code : (colon) symbol, then the tentative value must be less than the value of this field, or an exception is thrown. In other words, .meta idx must indicate a point within the range. After the above range check is performed, if applicable, the tentative value is returned. If the second argument isn't an integer, it is interpreted as a sequence .metn seq . The .meta range object's values are used to extract a subrange of .metn seq , according to the following equivalence: .verb (rangeref r s) <--> (sub s (from r) (to r)) .brev except that .code r and .code s are evaluated only once, in that order. .SS* Characters and Strings .coNP Functions @ mkstring and @ str .synb .mets (mkstring < length <> [ char ]) .mets (str < length >> [ char | << string ]) .syne .desc The .code mkstring function constructs a string object of a length specified by the .meta length parameter. The .meta length parameter must be non-negative. Every position in the string is initialized with .metn char , which must be a character value. 
If the optional argument .meta char is not specified, it defaults to the space character. The .code str function resembles .codn mkstring , and behaves the same way when the second argument is omitted, and when it is a character value. The second argument of .code str may be a .metn string , in which case the newly created string is filled by taking successive characters from .metn string . If .meta string is longer than .metn length , its excess characters are ignored. If .meta string is shorter, then characters are taken from the beginning again; .meta string is effectively taken as a fill pattern to be repeated as many times as necessary to provide the required number of characters. If .meta string is empty, .code str fills the new string with spaces. .coNP Function @ copy-str .synb .mets (copy-str << string ) .syne .desc The .code copy-str function constructs a new string whose contents are identical to .metn string . If .meta string is a lazy string, then a lazy string is constructed with the same attributes as .metn string . The new lazy string has its own copy of the prefix portion of .meta string which has been forced so far. The unforced list and separator string are shared between .meta string and the newly constructed lazy string. .coNP Function @ upcase-str .synb .mets (upcase-str << string ) .syne .desc The .code upcase-str function produces a copy of .meta string such that all lowercase characters of the English alphabet are mapped to their uppercase counterparts. .coNP Function @ downcase-str .synb .mets (downcase-str << string ) .syne .desc The .code downcase-str function produces a copy of .meta string such that all uppercase characters of the English alphabet are mapped to their lowercase counterparts. .coNP Function @ string-extend .synb .mets (string-extend < string < tail <> [ final ]) .syne .desc The .code string-extend function destructively increases the length of .metn string , which must be an ordinary dynamic string. 
It is an error to invoke this function on a literal string or a lazy string. The .meta tail argument can be a character, string or integer. If it is a string or character, it specifies material which is to be added to the end of the string: either a single character or a sequence of characters. If it is an integer, it specifies the number of characters to be added to the string. If .meta tail is an integer, the newly added characters have indeterminate contents. The string appears to be the original one because an internal terminating null character remains in place, but the characters beyond the terminating null are indeterminate. The optional Boolean argument .metn final , defaulting to .codn nil , is a hint which indicates whether this .code string-extend call is expected to be the last time that the function is invoked on the given .metn string . If .meta final is true, then the .meta string object's underlying memory allocation is trimmed to fit the actual string data. If the argument is false, the object may be given a larger allocation intended to improve the performance of subsequent .code string-extend calls. .coNP Function @ string-finish .synb .mets (string-finish << string ) .syne .desc The .code string-finish function removes excess allocation from .meta string that may have been produced by previous calls to .codn string-extend . Note: if the most recent call to .code string-extend specified a true value for the .meta final parameter, then calling .code string-finish is unnecessary and does nothing. .coNP Function @ stringp .synb .mets (stringp << obj ) .syne .desc The .code stringp function returns .code t if .meta obj is one of the several kinds of strings. Otherwise it returns .codn nil . .coNP Function @ length-str .synb .mets (length-str << string ) .syne .desc The .code length-str function returns the length of .meta string in characters. The argument must be a string. 
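
For instance, several of the string functions described above can be
combined as in the following sketch. It is not taken from a recorded
session: the result shown is worked out from the descriptions above, and
the variable name
.code s
is arbitrary. Because
.code string-extend
may not be applied to a literal string,
.code copy-str
is used to obtain an ordinary dynamic string first.

.verb
  (let ((s (copy-str "abc")))
    (string-extend s "def")  ;; append a string
    (string-extend s "g" t)  ;; final call: allocation is trimmed
    (list s (length-str s) (stringp s)))
  --> ("abcdefg" 7 t)
.brev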
.coNP Function @ coded-length .synb .mets (coded-length << string ) .syne .desc The .code coded-length function returns the number of bytes required to encode .meta string in UTF-8. The argument must be a character string. If the string contains only characters in the ASCII range U+0001 to U+007F, then the value returned shall be the same as that returned by the .code length-str function. .coNP Function @ search-str .synb .mets (search-str < haystack < needle >> [ start <> [ from-end ]]) .syne .desc The .code search-str function finds an occurrence of the string .meta needle inside the .meta haystack string and returns its position. If no such occurrence exists, it returns .codn nil . If a .meta start argument is not specified, it defaults to zero. If it is a nonnegative integer, it specifies the starting character position for the search. Negative values of .meta start indicate positions from the end of the string, such that .code -1 is the last character of the string. If the .meta from-end argument is specified and is not .codn nil , it means that the search is conducted right-to-left. If multiple matches are possible, it will find the rightmost one rather than the leftmost one. .coNP Function @ search-str-tree .synb .mets (search-str-tree < haystack < tree >> [ start <> [ from-end ]]) .syne .desc The .code search-str-tree function is similar to .codn search-str , except that instead of searching .meta haystack for the occurrence of a single needle string, it searches for the occurrence of numerous strings at the same time. These search strings are specified, via the .meta tree argument, as an arbitrarily structured tree whose leaves are strings. The function finds the earliest possible match, in the given search direction, from among all of the needle strings. If .meta tree is a single string, the semantics is equivalent to .codn search-str . 
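
For instance, the single-needle behavior of
.codn search-str ,
including the treatment of the
.meta start
argument, can be sketched as follows. These results are worked out from
the description of
.code search-str
above rather than copied from a recorded session. Character positions in
the sample string
.str hello world
are counted from zero.

.verb
  (search-str "hello world" "o")     --> 4
  (search-str "hello world" "o" 5)   --> 7   ;; search starts at position 5
  (search-str "hello world" "o" -4)  --> 7   ;; -4 denotes position 7
  (search-str "hello world" "xyz")   --> nil
.brev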
.coNP Function @ match-str .synb .mets (match-str < bigstring < littlestring <> [ start ]) .syne .desc Without the .meta start argument, the .code match-str function determines whether .meta littlestring is a prefix of .metn bigstring . If the .meta start argument is specified, and is a nonnegative integer, then the function tests whether .meta littlestring matches a prefix of that portion of .meta bigstring which starts at the given position. If the .meta start argument is a negative integer, then .code match-str determines whether .meta littlestring is a suffix of .metn bigstring , ending on that position of bigstring, where .code -1 denotes the last character of .metn bigstring , .code -2 the second last one and so on. If .meta start is .codn -1 , then this corresponds to testing whether .meta littlestring is a suffix of .metn bigstring . The .code match-str function returns .code nil if there is no match. If a prefix match is successful, then an integer value is returned indicating the position, inside .metn bigstring , one character past the matching prefix. If the entire string is matched, then this value corresponds to the length of .metn bigstring . If a suffix match is successful, the return value is the position within .meta bigstring where the leftmost character of .meta littlestring matched. .coNP Function @ match-str-tree .synb .mets (match-str-tree < bigstring < tree <> [ start ]) .syne .desc The .code match-str-tree function is a generalization of .code match-str which matches multiple test strings against .meta bigstring at the same time. The value reported is the longest match from among any of the strings. The strings are specified as an arbitrarily shaped tree structure which has strings at the leaves. If .meta tree is a single string atom, then the function behaves exactly like .codn match-str . 
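
For instance, the prefix and suffix matching rules of
.code match-str
can be sketched as follows. These results are worked out from the
description of
.code match-str
above rather than copied from a recorded session; the sample strings are
arbitrary.

.verb
  ;; prefix match: position one past the matching prefix
  (match-str "foobar" "foo")     --> 3
  ;; prefix match starting at position 3; the match
  ;; reaches the end, so the length of bigstring is returned
  (match-str "foobar" "bar" 3)   --> 6
  ;; suffix match: position of the leftmost matching character
  (match-str "foobar" "bar" -1)  --> 3
  ;; no match
  (match-str "foobar" "baz")     --> nil
.brev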
.coNP Accessor @ sub-str .synb .mets (sub-str < str >> [ from <> [ to ]]) .mets (set (sub-str < str >> [ from <> [ to ]]) << new-value ) .syne .desc The .code sub-str function has the same parameters and semantics as the .code sub function, except that the first argument is operated upon using string operations. If a .code sub-str form is used as a place, it denotes a subrange of .meta str as if it were a storage location. The previous value of this location, if needed, is fetched by a call to .codn sub-str . Storing .meta new-value to the place is performed by a call to .codn replace-str . In an update operation which accesses the prior value and stores a new value, the arguments .metn str , .metn from , .meta to and .meta new-value are evaluated once. The .meta str argument is not itself required to be a place; it is not updated when a value is written to the .code sub-str storage location. .coNP Function @ replace-str .synb .mets (replace-str < string < item-sequence >> [ from <> [ to ]]) .syne .desc The .code replace-str function has the same parameters and semantics as the .code replace function, except that the first argument is operated upon using string operations. .coNP Functions @, cat-str @ join-with and @ join .synb .mets (cat-str < item-seq <> [ sep ]) .mets (join-with < sep << item *) .mets (join << item *) .syne .desc The .codn cat-str , .code join-with and .code join functions combine items into a single string, which is returned. Every .meta item argument must be a character, string or else a possibly empty sequence of items. This rule applies recursively. If a .meta sep argument is present, it must be a character or string. The .meta item-seq argument must be a sequence of any mixture of items which are characters, strings or sequences of items. Note that this means that if .meta item-seq is a character string, it is a valid argument, since it is a sequence of characters. 
If .meta item-seq is empty, or no .meta item arguments are present, then all three functions return an empty string. All three functions operate on an abstract sequence of character and string items, produced by a left-to-right recursive traversal of their .meta item-seq or .meta item arguments. Under the .code join-with function, as well as the .code cat-str function when a .meta sep argument is given to it, the items are catenated together such that .meta sep is interposed between them. If there are .I n character or string items, then .I "n - 1" copies of .meta sep occur in the resulting string, which is returned. Under the .code join function, or the .code cat-str function invoked without a .meta sep argument, the items are catenated together directly, without any separator. The resulting string is returned. .coNP Function @ split-str .synb .mets (split-str < string < sep >> [ keep-between <> [ count ]]) .syne .desc The .code split-str function breaks the .meta string into pieces, returning a list thereof. The .meta sep argument must be one of three types: a string, a character or a regular expression. It determines the separator character sequences within .metn string . The following describes the behavior of .code split-str in the case when the integer parameter .meta count is omitted. The semantics of .meta count are then given. All non-overlapping matches for .meta sep within .meta string are identified in left-to-right order, and are removed from .metn string . The string is broken into pieces according to the gaps left behind by the removed separators, and a list of the remaining pieces is returned. If .meta sep is the empty string, then the separator pieces removed from the string are considered to be the empty strings between its characters. In this case, if .meta string is of length one or zero, then it is considered to have no such pieces, and a list of one element is returned containing the original string. 
These remarks also apply to the situation when .meta sep is a regular expression which matches only an empty substring of .metn string . If a match for .meta sep is not found in the string at all (not even an empty match), then the string is not split at all: a list of one element is returned containing the original string. If .meta sep matches the entire string, then a list of two empty strings is returned, except in the case that the original string is empty, in which case a list of one element is returned, containing the empty string. Whenever two adjacent matches for .meta sep occur, they are considered separate cuts with an empty piece between them. This operation is nondestructive: .meta string is not modified in any way. If the optional .meta keep-between argument is specified and is not .codn nil , then .code split-str incorporates the matching separating pieces of .meta string into the resulting list, such that if the resulting list is catenated, a string equivalent to the original string will be produced. Note: to split a string into pieces of length one such that an empty string produces .code nil rather than .codn ("") , use the .mono .meti (tok-str < string #/./) .onom pattern. Note: the function call .code "(split-str s r t)" produces a resulting list identical to .codn "(tok-str s r t)" , for all values of .code r and .codn s , provided that .code r does not match empty strings. If .code r matches empty strings, then the .code tok-str call returns extra elements compared to .codn split-str , because .code tok-str allows empty matches to take place and extract empty tokens before the first character of the string, and after the last character, whereas .code split-str does not recognize empty separators at these outer limits of the string. If the .meta count parameter is present, it must be a non-negative integer.
This value specifies the maximum number of pieces of the input .meta string which are extracted by the splitting process. The returned list consists of these pieces, followed by the remainder of the string, if the remainder is nonempty. If .meta keep-between is true, then separators appear between the pieces, and if the remainder piece is present, the separator between the last piece and the remainder is included. If .meta count is zero, then .code split-str returns a list of one element, which is .metn string . .coNP Functions @ spl and @ spln .synb .mets (spl < sep <> [ keep-between ] << string ) .mets (spln < count < sep <> [ keep-between ] << string ) .syne .desc The .code spl function performs the same computation as .codn split-str . The same-named parameters of .code spl and .code split-str have the same semantics. The difference is the argument order. The .code spl function takes the .meta sep argument first. The last argument is always .meta string whether or not there are two arguments or three. If there are three arguments, then .meta keep-between is the middle one. Note: the argument conventions of .code spl facilitate less verbose partial application, such as with macros in the .code op family, in the common situation when .meta string is the unbound argument. The .code spln function is similar to .codn spl , taking a required argument .metn count , which behaves exactly like the same-named argument of .codn split-str . .coNP Functions @ split-str-set and @ sspl .synb .mets (split-str-set < string << set ) .mets (sspl < set << string ) .syne .desc The .code split-str-set function breaks the .meta string into pieces, returning a list thereof. The .meta set argument must be a string. It specifies a set of characters. All occurrences of any of these characters within .meta string are identified, and are removed from .metn string . The string is broken into pieces according to the gaps left behind by the removed separators.
Adjacent occurrences of characters from .meta set within .meta string are considered to be separate gaps which come between empty strings. This operation is nondestructive: .meta string is not modified in any way. The .code sspl function performs the same operation; the only difference between .code sspl and .code split-str-set is argument order. .coNP Functions @ tok-str and @ tok-where .synb .mets (tok-str < string < regex >> [ keep-between <> [ count ]]) .mets (tok-where < string << regex ) .syne .desc The .code tok-str function searches .meta string for tokens, which are defined as substrings of .meta string which match the regular expression .meta regex in the longest possible way, and do not overlap. These tokens are extracted from the string and returned as a list. Whenever .meta regex matches an empty string, then an empty token is returned, and the search for another token within .meta string resumes after advancing by one character position. However, if an empty match occurs immediately after a nonempty token, that empty match is not turned into a token. So for instance, .mono (tok-str "abc" #/a?/) .onom returns .mono ("a" "" ""). .onom After the token .str "a" is extracted from a nonempty match for the regex, an empty match for the regex occurs just before the character .codn b . This match is discarded because it is an empty match which immediately follows the nonempty match. The character .code b is skipped. The next match is an empty match between the .code b and .code c characters. This match causes an empty token to be extracted. The character .code c is skipped, and one more empty match occurs after that character and is extracted. If the .meta keep-between argument is true, then the behavior of .code tok-str changes in the following way. The pieces of .meta string which are skipped by the search for tokens are included in the output. If no token is found in .metn string , then a list of one element is returned, containing .metn string . 
Generally, if N tokens are found, then the returned list consists of 2N + 1 elements. The first element of the list is the (possibly empty) substring which had to be skipped to find the first token. Then the token follows. The next element is the next skipped substring and so on. The last element is the substring of .meta string between the last token and the end. If .meta count is specified, it must be a nonnegative integer. The value limits the number of tokens which are extracted. The returned list then includes one more item: the remainder of the string after the last extracted token. This item is omitted if the rest of the string is empty, unless .meta keep-between is true. The .code tok-where function works similarly to .codn tok-str , but instead of returning the extracted tokens themselves, it returns a list of the character position ranges within .meta string where matches for .meta regex occur. The ranges are pairs of numbers, represented as cons cells, where the first number of the pair gives the starting character position, and the second number is one position past the end of the match. If a match is empty, then the two numbers are equal. The .code tok-where function does not support the .meta keep-between parameter. .coNP Functions @ tok and @ tokn .synb .mets (tok < regex <> [ keep-between ] << string ) .mets (tokn < count < regex <> [ keep-between ] << string ) .syne .desc The .code tok function performs the same computation as .codn tok-str . The same-named parameters of .code tok and .code tok-str have the same semantics. The difference is the argument order. The .code tok function takes the .meta regex argument first. The last argument is always .meta string whether or not there are two arguments or three. If there are three arguments, then .meta keep-between is the middle one.
Note: the argument conventions of .code tok facilitate less verbose partial application, such as with macros in the .code op family, in the common situation when .meta string is the unbound argument. The .code tokn function is similar to .codn tok , taking a required argument .metn count , which behaves exactly like the same-named argument of .codn tok-str . .coNP Function @ list-str .synb .mets (list-str << string ) .syne .desc The .code list-str function converts a string into a list of characters. .coNP Function @ trim-str .synb .mets (trim-str << string ) .syne .desc The .code trim-str function produces a copy of .meta string from which leading and trailing tabs, spaces and newlines are removed. .coNP Function @ str-esc .synb .mets (str-esc < esc-set < esc-tok << str ) .syne .desc The .code str-esc function performs a .I "character escaping" transformation on the input string .metn str . The argument .meta esc-set is a string containing zero or more characters. The .meta esc-tok argument is a character or string. The function returns a transformed version of .meta str in which every character of .meta str which occurs in .meta esc-set is preceded by .metn esc-tok . .TP* Examples: .verb (str-esc "$@#" "$" "$foo @abc #1") -> "$$foo $@abc $#1" (str-esc "'" "'\e\e'" "foo 'bar' baz") -> "foo '\e\e''bar'\e\e'' baz" .brev .coNP Functions @ string-set-code and @ string-get-code .synb .mets (string-set-code < string << value ) .mets (string-get-code << string ) .syne .desc The .code string-set-code and .code string-get-code functions provide a mechanism for associating an integer code with a string. Note: this mechanism is the basis for associating system error messages passed in exceptions with the .code errno values of the failed system library calls which precipitated these error exceptions. Not all string types can have an integer code: lazy strings and literal strings do not have this capability. The .meta string argument must be of type .codn str .
The .meta value argument must be an integer or character. It is recommended that its value be confined to the non-negative range of the platform's .code int C type. Otherwise it is unspecified whether the same value shall be observed by .code string-get-code as what was stored with .codn string-set-code . The .code string-set-code function associates the integer .meta value with the given .metn string , and returns .metn string . Any previously associated value is overwritten. The .code string-get-code function retrieves the value most recently associated with .metn string . If .meta string has no associated value, then .code nil is returned. If the .code string-extend function is invoked on a .metn string , then it is unspecified whether or not .meta string has an associated value and, if so, what value that is, except in the following case: if .code string-extend is invoked with a .meta final argument which is true, then .meta string is caused not to have an associated value. If the .code string-finish function is invoked on a .metn string , that string is caused not to have an associated value. .coNP Function @ chrp .synb .mets (chrp << obj ) .syne .desc Returns .code t if .meta obj is a character, otherwise .codn nil . .coNP Function @ chr-isalnum .synb .mets (chr-isalnum << char ) .syne .desc Returns .code t if .meta char is an alphanumeric character, otherwise .codn nil . Alphanumeric means one of the uppercase or lowercase letters of the English alphabet found in ASCII, or an ASCII digit. This function is not affected by locale. .coNP Function @ chr-isalpha .synb .mets (chr-isalpha << char ) .syne .desc Returns .code t if .meta char is an alphabetic character, otherwise .codn nil . Alphabetic means one of the uppercase or lowercase letters of the English alphabet found in ASCII. This function is not affected by locale.
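.TP* Examples:
The following forms are a brief sketch of
.codn chrp ,
.code chr-isalnum
and
.code chr-isalpha
applied to ASCII arguments. The results shown are inferred from the descriptions above, not taken from a recorded session.
.verb
  ;; chrp tests the type of the object
  (chrp #\ea) -> t
  (chrp "a") -> nil
  ;; a digit is alphanumeric, but not alphabetic
  (chr-isalnum #\e7) -> t
  (chr-isalpha #\e7) -> nil
  ;; the underscore is neither
  (chr-isalnum #\e_) -> nil
  (chr-isalpha #\eQ) -> t
.brev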
.coNP Function @ chr-isascii .synb .mets (chr-isascii << char ) .syne .desc The .code chr-isascii function returns .code t if the code of character .meta char is in the range 0 to 127 inclusive. For characters outside of this range, it returns .codn nil . .coNP Function @ chr-iscntrl .synb .mets (chr-iscntrl << char ) .syne .desc The .code chr-iscntrl function returns .code t if the character .meta char is a control character. For all other characters, it returns .codn nil . A control character is one which belongs to the Unicode C0 or C1 block. C0 consists of the characters U+0000 through U+001F, plus the character U+007F. These are the original ASCII control characters. Block C1 consists of U+0080 through U+009F. .coNP Functions @ chr-isdigit and @ chr-digit .synb .mets (chr-isdigit << char ) .mets (chr-digit << char ) .syne .desc If .meta char is an ASCII decimal digit character, .code chr-isdigit returns the value .code t and .code chr-digit returns the integer value corresponding to that digit character, a value in the range 0 to 9. Otherwise, both functions return .codn nil . .coNP Function @ chr-isgraph .synb .mets (chr-isgraph << char ) .syne .desc The .code chr-isgraph function returns .code t if .meta char is a non-space printable ASCII character. It returns .code nil if it is a space or control character. It also returns .code nil for non-ASCII characters: Unicode characters with a code above 127. .coNP Function @ chr-islower .synb .mets (chr-islower << char ) .syne .desc The .code chr-islower function returns .code t if .meta char is an ASCII lowercase letter. Otherwise it returns .codn nil . .coNP Function @ chr-isprint .synb .mets (chr-isprint << char ) .syne .desc The .code chr-isprint function returns .code t if .meta char is an ASCII character which is not a control character. It returns .code nil for control characters, and also for all non-ASCII characters: Unicode characters with a code above 127.
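.TP* Examples:
The following forms sketch several of the preceding classification functions on ASCII arguments. The results are inferred from the descriptions above, not taken from a recorded session.
.verb
  ;; digit test and digit value
  (chr-isdigit #\e7) -> t
  (chr-digit #\e7) -> 7
  (chr-digit #\ez) -> nil
  ;; a space is printable, but not graphical
  (chr-isprint #\espace) -> t
  (chr-isgraph #\espace) -> nil
  ;; newline belongs to the C0 control block
  (chr-iscntrl #\enewline) -> t
  (chr-islower #\eq) -> t
  (chr-islower #\eQ) -> nil
.brev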
.coNP Function @ chr-ispunct .synb .mets (chr-ispunct << char ) .syne .desc The .code chr-ispunct function returns .code t if .meta char is an ASCII punctuation character: a printable character which is neither a space, a letter nor a digit. It returns .code nil for all other characters, including all non-ASCII characters: Unicode characters with a code above 127. .coNP Function @ chr-isspace .synb .mets (chr-isspace << char ) .syne .desc The .code chr-isspace function returns .code t if .meta char is an ASCII whitespace character: any of the characters in the set .codn #\espace , .codn #\etab , .codn #\elinefeed , .codn #\enewline , .codn #\ereturn , .code #\evtab and .codn #\epage . For all other characters, it returns .codn nil . .coNP Function @ chr-isblank .synb .mets (chr-isblank << char ) .syne .desc The .code chr-isblank function returns .code t if .meta char is a space or tab: the character .code #\espace or .codn #\etab . For all other characters, it returns .codn nil . .coNP Function @ chr-isunisp .synb .mets (chr-isunisp << char ) .syne .desc The .code chr-isunisp function returns .code t if .meta char is a Unicode whitespace character. This is the case for all the characters for which .code chr-isspace returns .codn t . It also returns .code t for these additional characters: .codn #\exa0 , .codn #\ex1680 , .codn #\ex180e , .codn #\ex2000 , .codn #\ex2001 , .codn #\ex2002 , .codn #\ex2003 , .codn #\ex2004 , .codn #\ex2005 , .codn #\ex2006 , .codn #\ex2007 , .codn #\ex2008 , .codn #\ex2009 , .codn #\ex200a , .codn #\ex2028 , .codn #\ex2029 , .codn #\ex205f , and .codn #\ex3000 . For all other characters, it returns .codn nil . .coNP Function @ chr-isupper .synb .mets (chr-isupper << char ) .syne .desc The .code chr-isupper function returns .code t if .meta char is an ASCII uppercase letter. Otherwise it returns .codn nil .
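.TP* Examples:
The following forms sketch how the whitespace predicates relate to one another. The results are inferred from the descriptions above, not taken from a recorded session.
.verb
  ;; a tab is both whitespace and blank
  (chr-isspace #\etab) -> t
  (chr-isblank #\etab) -> t
  ;; a newline is whitespace, but not blank
  (chr-isspace #\enewline) -> t
  (chr-isblank #\enewline) -> nil
  ;; the no-break space appears in the chr-isunisp list
  (chr-isunisp #\exa0) -> t
  (chr-isupper #\eQ) -> t
  (chr-isupper #\eq) -> nil
.brev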
.coNP Functions @ chr-isxdigit and @ chr-xdigit .synb .mets (chr-isxdigit << char ) .mets (chr-xdigit << char ) .syne .desc If .meta char is a hexadecimal digit character, .code chr-isxdigit returns the value .code t and .code chr-xdigit returns the integer value corresponding to that digit character, a value in the range 0 to 15. Otherwise, both functions return .codn nil . A hexadecimal digit is one of the ASCII digit characters .code 0 through .codn 9 , or else one of the letters .code A through .code F or their lowercase equivalents .code a through .code f denoting the values 10 to 15. .coNP Function @ chr-toupper .synb .mets (chr-toupper << char ) .syne .desc If character .meta char is a lowercase ASCII letter character, this function returns the uppercase equivalent character. If it is some other character, then it just returns .metn char . .coNP Function @ chr-tolower .synb .mets (chr-tolower << char ) .syne .desc If character .meta char is an uppercase ASCII letter character, this function returns the lowercase equivalent character. If it is some other character, then it just returns .metn char . .coNP Functions @ int-chr and @ chr-int .synb .mets (int-chr << char ) .mets (chr-int << num ) .syne .desc The .meta char argument must be a character. The .code int-chr function returns that character's Unicode code point value as an integer. The .meta num argument must be a fixnum integer in the range .code 0 to .codn #\ex10FFFF . The .code chr-int function interprets .meta num as a Unicode code point value and returns the corresponding character object. Note: these functions are also known by the obsolescent names .code num-chr and .codn chr-num . .coNP Accessor @ chr-str .synb .mets (chr-str < str << idx ) .mets (set (chr-str < str << idx ) << new-value ) .syne .desc The .code chr-str function performs random access on string .meta str to retrieve the character whose position is given by integer .metn idx , which must be within range of the string.
The index value 0 corresponds to the first (leftmost) character of the string and so nonnegative values up to one less than the length are possible. Negative index values are also allowed, such that -1 corresponds to the last (rightmost) character of the string, and so negative values down to the additive inverse of the string length are possible. An empty string cannot be indexed. A string of length one supports index 0 and index -1. A string of length two is indexed left to right by the values 0 and 1, and from right to left by -1 and -2. If the element .meta idx of string .meta str exists, and the string is modifiable, then the .code chr-str form denotes a place. A .code chr-str place supports deletion. When a deletion takes place, then the character at .meta idx is removed from the string. Any characters after that position move by one position to close the gap, and the length of the string decreases by one. .TP* Notes: Direct use of .code chr-str is equivalent to the DWIM bracket notation except that .meta str must be a string. The following relation holds: .verb (chr-str s i) --> [s i] .brev Since .codn "[s i] <--> (ref s i)" , this also holds: .verb (chr-str s i) --> (ref s i) .brev However, note the following difference. When the expression .code "[s i]" is used as a place, then the subexpression .code s must be a place. When .code "(chr-str s i)" is used as a place, .code s need not be a place. .coNP Function @ chr-str-set .synb .mets (chr-str-set < str < idx << char ) .syne .desc The .code chr-str-set function performs random access on string .meta str to overwrite the character whose position is given by integer .metn idx , which must be within range of the string. The character at .meta idx is overwritten with character .metn char . The .meta idx argument works exactly as in .codn chr-str . The .meta str argument must be a modifiable string.
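.TP* Examples:
The following forms sketch indexing with
.code chr-str
and overwriting with
.codn chr-str-set .
The use of
.code copy-str
to obtain a modifiable string from a literal is an assumption of this sketch, and the results are inferred from the descriptions above rather than taken from a recorded session.
.verb
  ;; index 0 is the leftmost character, -1 the rightmost
  (chr-str "abc" 0) -> #\ea
  (chr-str "abc" -1) -> #\ec
  ;; overwrite the middle character of a modifiable copy;
  ;; the string is returned explicitly, since the return
  ;; value of chr-str-set is not specified above
  (let ((s (copy-str "abc")))
    (chr-str-set s 1 #\ez)
    s) -> "azc"
.brev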
.TP* Notes: Direct use of .code chr-str-set is equivalent to the DWIM bracket notation provided that .meta str is a string and .meta idx an integer. The following relation holds: .verb (chr-str-set s i c) --> (set [s i] c) .brev Since .code "(set [s i] c) <--> (refset s i c)" for an integer index .codn i , this also holds: .verb (chr-str-set s i c) --> (refset s i c) .brev .coNP Function @ span-str .synb .mets (span-str < str << set ) .syne .desc The .code span-str function determines the longest prefix of string .meta str which consists only of the characters in string .metn set , in any combination. If both arguments are strings, the function returns an integer between 0 and the length of .metn str . .TP* Examples: .verb (span-str "abcde" "ab") -> 2 (span-str "abcde" "z") -> 0 (span-str "abcde" "") -> 0 (span-str "abcde" "edcba") -> 5 .brev .coNP Function @ compl-span-str .synb .mets (compl-span-str < str << set ) .syne .desc The .code compl-span-str function determines the longest prefix of string .meta str which consists only of the characters which do not appear in .metn set , in any combination. If both arguments are strings, the function returns an integer between 0 and the length of .metn str . .TP* Examples: .verb (compl-span-str "abc,def" ",") -> 3 (compl-span-str "abc," ",") -> 3 (compl-span-str "abc" ",") -> 3 (compl-span-str "abc3" "0123456789") -> 3 (compl-span-str "3" "0123456789") -> 0 .brev .coNP Function @ break-str .synb .mets (break-str < str << set ) .syne .desc The .code break-str function returns an integer which represents the position of the first character in string .meta str which appears in string .metn set . If there is no such character, then .code nil is returned. .TP* Examples: .verb (break-str "abc,def.ghi" ",.:") -> 3 (break-str "abc,def.ghi" ".:") -> 6 (break-str "abc,def.ghi" ":") -> nil .brev .SS* Lazy Strings Lazy strings are objects that were developed for the \*(TX pattern-matching language, and are exposed via \*(TL.
Lazy strings behave much like strings, and can be substituted for strings. However, unlike regular strings, which exist in their entirety, first to last character, from the moment they are created, lazy strings do not exist all at once, but are created on demand. If the character at index N of a lazy string is accessed, then characters 0 through N of that string are forced into existence. However, characters at indices beyond N need not necessarily exist. A lazy string dynamically grows by acquiring new text from a list of strings which is attached to that lazy string object. When the lazy string is accessed beyond the end of its hitherto materialized prefix, it takes enough strings from the list in order to materialize the index. If the list doesn't have enough material, then the access fails, just like an access beyond the end of a regular string. A lazy string always takes whole strings from the attached list. Lazy string growth is achieved via the .code lazy-str-force-upto function which forces a string to exist up to a given character position. This function is used internally to handle various situations. The .code lazy-str-force function forces the entire string to materialize. If the string is connected to an infinite lazy list, this will exhaust all memory. Lazy strings are specially recognized in many of the regular string functions, which do the right thing with lazy strings. For instance when .code sub-str is invoked on a lazy string, a special version of the .code sub-str logic is used which handles various lazy string cases, and can potentially return another lazy string. Taking a .code sub-str of a lazy string from a given character position to the end does not force the entire lazy string to exist, and in fact the operation will work on a lazy string that is infinite. Furthermore, special lazy string functions are provided which allow programs to be written carefully to take better advantage of lazy strings.
What carefully means is code that avoids unnecessarily forcing the lazy string. For instance, in many situations it is necessary to obtain the length of a string, only to test it for equality or inequality with some number. But it is not necessary to compute the length of a string in order to know that it is greater than some value. .coNP Function @ lazy-str .synb .mets (lazy-str < string-list >> [ terminator <> [ limit-count ]]) .syne .desc The .code lazy-str function constructs a lazy string which draws material from .meta string-list which is a list of strings. If the optional .meta terminator argument is given, then it specifies a string which is appended to every string from .metn string-list , before that string is incorporated into the lazy string. If .meta terminator is not given, then it defaults to the string .strn "\en" , and so the strings from .meta string-list are effectively treated as lines which get terminated by newlines as they accumulate into the growing prefix of the lazy string. To avoid the use of a terminator string, a null string .meta terminator argument must be explicitly passed. In that case, the lazy string grows simply by catenating elements from .metn string-list . If the .meta limit-count argument is specified, it must be a positive integer. It expresses a maximum limit on how many elements will be consumed from .meta string-list in order to feed the lazy string. Once that many elements are drawn, the string ends, even if the list has not been exhausted. However, that remaining list, though not contributing to the string, is still incorporated into the value returned by .codn lazy-str-get-trailing-list . .coNP Function @ lazy-stringp .synb .mets (lazy-stringp << obj ) .syne .desc The .code lazy-stringp function returns .code t if .meta obj is a lazy string. Otherwise it returns .codn nil . 
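.TP* Examples:
The following forms sketch the effect of the
.meta terminator
and
.meta limit-count
arguments of
.codn lazy-str .
The
.code lazy-str-force
function, described later in this section, is used to obtain the complete text. The results are inferred from the descriptions in this section, not taken from a recorded session.
.verb
  (lazy-stringp (lazy-str '("ab" "cd"))) -> t
  (lazy-stringp "abcd") -> nil
  ;; the default terminator is a newline
  (lazy-str-force (lazy-str '("ab" "cd"))) -> "ab\encd\en"
  ;; an explicit empty terminator gives plain catenation
  (lazy-str-force (lazy-str '("ab" "cd") "")) -> "abcd"
  ;; limit-count 2: the third element is not consumed
  (lazy-str-force (lazy-str '("ab" "cd" "ef") "," 2)) -> "ab,cd,"
.brev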
.coNP Function @ lazy-str-force-upto .synb .mets (lazy-str-force-upto < lazy-str << index ) .syne .desc The .code lazy-str-force-upto function tries to instantiate the lazy string such that the position given by .meta index materializes. The .meta index is a character position, exactly as used in the .code chr-str function. It is an error if the .meta lazy-str argument isn't a lazy string. Some positions beyond .meta index may also materialize, as a side effect, because the operation takes only whole strings from the internal list, according to the algorithm described below. If the string is already materialized through to at least .metn index , or if there is sufficient material to materialize the string that far, then the value .code t is returned to indicate success. Otherwise, .code nil is returned. The .meta lazy-str object's .meta limit-count is observed: a total of no more than .meta limit-count elements are taken from the object's list. The algorithm is as follows: .RS .IP 1. While the length of the materialized prefix of the string is less than or equal to .meta index and while elements are available in the list, subject to observance of the .metn limit-count , perform the following steps 2 and 3: .IP 2. Remove the next available string from the list, and add it as a suffix to the materialized prefix. .IP 3. Add the .meta terminator string to the materialized prefix. .IP 4. Return .code t if the length of the materialized prefix exceeds .metn index , otherwise .codn nil . .RE .IP The algorithm does not take portions of strings from the list, and always adds the terminator after incorporating each piece into the materialized prefix. .coNP Function @ lazy-str-force .synb .mets (lazy-str-force << lazy-str ) .syne .desc The .meta lazy-str argument must be a lazy string. The lazy string is forced to fully materialize.
The return value is an ordinary, non-lazy string equivalent to the fully materialized lazy string. The .meta lazy-str object's .meta limit-count is observed: a total of no more than .meta limit-count elements are taken from the object's list. The algorithm that is followed by .code lazy-str-force is similar to the one followed by .codn lazy-str-force-upto , with only the following modification. The test in step 1 isn't concerned with the length of the materialized prefix, since the goal is to materialize all available characters. Steps 2 and 3 are performed while elements are available in the list, subject to observance of the .metn limit-count . .coNP Function @ lazy-str-get-trailing-list .synb .mets (lazy-str-get-trailing-list < string << index ) .syne .desc The .code lazy-str-get-trailing-list function can be considered, in some way, an inverse operation to the production of the lazy string from its associated list. Note: the behavior of this function changed in \*(TX 274. This is subject to a note in the COMPATIBILITY section. First, the lazy string .meta string is forced up through the position .metn index , as if by a call to .codn lazy-str-force-upto . If .meta string consists of .meta index or more characters, then after the forcing operation, it is guaranteed that at least .meta index characters of the string have been materialized into a single string, called the .IR "materialized prefix" of the lazy string. If fewer than .meta index characters are available, taking into account the contribution of the terminator string, then the number of characters in the materialized prefix falls short of .metn index . The materialized prefix never takes fractional strings from the lazy string's list, and is always terminated by the terminator string. Next, the materialized prefix is split into pieces on occurrences of .metn string 's terminator string, as if by using the .code spl function.
If the terminator string is empty, it is split into individual characters, in accordance with the semantics of that function. Then, if the last piece of the split prefix is an empty string, it is removed. This situation occurs in two cases: the materialized prefix is empty, or else it ends in the terminating string. For example, suppose that the terminating string is a single newline, and the prefix is .strn "foo\en" . In this case, .code "(spl \(dq\en\(dq \(dqfoo\en\(dq)" produces .code "(\(dqfoo\(dq \(dq\(dq)" from which the trailing empty string is removed, leaving .codn "(\(dqfoo\(dq)" . Finally, a list is formed by appending the split pieces of the materialized prefix, calculated as described above, with .metn string 's remaining list of strings which have not been pulled into the materialized prefix. This list is returned. .coNP Functions @, length-str-> @, length-str->= @ length-str-< and @ length-str-<= .synb .mets (length-str-> < string << len ) .mets (length-str->= < string << len ) .mets (length-str-< < string << len ) .mets (length-str-<= < string << len ) .syne .desc These functions compare the length of .meta string to the integer .metn len . The following equivalences hold, as far as the resulting value is concerned: .verb (length-str-> s l) <--> (> (length-str s) l) (length-str->= s l) <--> (>= (length-str s) l) (length-str-< s l) <--> (< (length-str s) l) (length-str-<= s l) <--> (<= (length-str s) l) .brev The difference between the functions and the equivalent forms is that if the string is lazy, the .code length-str function will fully force it in order to calculate and return its length. These functions only force a string up to position .metn len , so they are not only more efficient, but on infinitely long lazy strings they are usable. .code length-str cannot compute the length of a lazy string with an unbounded length; it will exhaust all memory trying to force the string.
These functions can be used to test whether such a string is longer or shorter than a given length, without forcing the string beyond that length. .coNP Function @ cmp-str .synb .mets (cmp-str < left-string << right-string ) .syne .desc The .code cmp-str function returns -1 if .meta left-string is lexicographically prior to .metn right-string . If the reverse relationship holds, it returns 1. Otherwise the strings are equal and zero is returned. If either or both of the strings are lazy, then they are only forced to the minimum extent necessary for the function to reach a conclusion and return the appropriate value, since there is no need to look beyond the first character position in which they differ. The lexicographic ordering is naive, based on the character code point values in Unicode taken as integers, without regard for locale-specific collation orders. Note: in \*(TX 232 and earlier versions, .code cmp-str conforms to weaker requirements: any negative integer value may be returned rather than -1, and any positive integer value can be returned instead of 1. .coNP Functions @, str= @, str< @, str> @ str>= and @ str<= .synb .mets (str= < left-string << right-string ) .mets (str< < left-string << right-string ) .mets (str> < left-string << right-string ) .mets (str<= < left-string << right-string ) .mets (str>= < left-string << right-string ) .syne .desc These functions compare .meta left-string and .meta right-string lexicographically, as if by the .code cmp-str function. The .code str= function returns .code t if the two strings are exactly the same, character for character, otherwise it returns .codn nil . The .code str< function returns .code t if .meta left-string is lexicographically before .metn right-string , otherwise .codn nil . The .code str> function returns .code t if .meta left-string is lexicographically after .metn right-string , otherwise .codn nil .
The .code str<= function returns .code t if .meta left-string is lexicographically before .metn right-string , or if they are exactly the same, otherwise .codn nil . The .code str>= function returns .code t if .meta left-string is lexicographically after .metn right-string , or if they are exactly the same, otherwise .codn nil . .coNP Function @ string-lt .synb .mets (string-lt < left-str << right-str ) .syne .desc The .code string-lt function is a deprecated alias for .codn str< . .SS* Vectors .coNP Function @ vector .synb .mets (vector < length <> [ initval ]) .syne .desc The .code vector function creates and returns a vector object of the specified length. The elements of the vector are initialized to .metn initval , or to nil if .meta initval is omitted. .coNP Function @ vec .synb .mets (vec << arg *) .syne .desc The .code vec function creates a vector out of its arguments. .coNP Function @ vectorp .synb .mets (vectorp << obj ) .syne .desc The .code vectorp function returns .code t if .meta obj is a vector, otherwise it returns .codn nil . .coNP Function @ vec-set-length .synb .mets (vec-set-length < vec << len ) .syne .desc The .code vec-set-length function modifies the length of .metn vec , making it longer or shorter. If the vector is made longer, then the newly added elements are initialized to nil. The .meta len argument must be nonnegative. The return value is .metn vec . .coNP Accessor @ vecref .synb .mets (vecref < vec << idx ) .mets (set (vecref < vec << idx ) << new-value ) .syne .desc The .code vecref function performs indexing into a vector. It retrieves an element of .meta vec at position .metn idx , counted from zero. The .meta idx value must range from 0 to one less than the length of the vector. The specified element is returned. If the element .meta idx of vector .meta vec exists, then the .code vecref form denotes a place. A .code vecref place supports deletion.
When a deletion takes place, then if .meta idx denotes the last element in the vector, the vector's length is decreased by one, so that the vector no longer has that element. Otherwise, if .meta idx isn't the last element, then each element at a higher index than .meta idx shifts by one element position to the adjacent lower index. Then, the length of the vector is decreased by one, so that the last element position disappears. .coNP Function @ vec-push .synb .mets (vec-push < vec << elem ) .syne .desc The .code vec-push function extends the length of a vector .meta vec by one element, and sets the new element to the value .metn elem . The previous length of the vector (which is also the position of .metn elem ) is returned. .coNP Function @ length-vec .synb .mets (length-vec << vec ) .syne .desc The .code length-vec function returns the length of vector .metn vec . It performs similarly to the generic .code length function, except that the argument must be a vector. .coNP Function @ size-vec .synb .mets (size-vec << vec ) .syne .desc The .code size-vec function returns the number of elements for which storage is reserved in the vector .metn vec . .TP* Notes: The .code length of the vector can be extended up to this size without any memory allocation operations having to be performed. .coNP Function @ vec-list .synb .mets (vec-list << list ) .syne .desc The .code vec-list function returns a vector which contains all of the same elements and in the same order as list .metn list . Note: this function is also known by the obsolescent name .codn vector-list . .coNP Function @ list-vec .synb .mets (list-vec << vec ) .syne .desc The .code list-vec function returns a list of the elements of vector .metn vec . Note: this function is also known by the obsolescent name .codn list-vector .
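.TP* Example:

The following is a minimal sketch of how several of the vector functions
described above fit together. The variable name
.code v
is chosen only for illustration, and the results shown are those implied
by the descriptions of the individual functions.

.verb
  (defvarl v (vec 1 2 3))
  (vec-push v 4)       --> 3
  v                    --> #(1 2 3 4)
  (length-vec v)       --> 4
  (list-vec v)         --> (1 2 3 4)
  (vec-list '(a b c))  --> #(a b c)
.brev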
.coNP Function @ copy-vec .synb .mets (copy-vec << vec ) .syne .desc The .code copy-vec function returns a new vector object of the same length as .meta vec and containing the same elements in the same order. .coNP Accessor @ sub-vec .synb .mets (sub-vec < vec >> [ from <> [ to ]]) .mets (set (sub-vec < vec >> [ from <> [ to ]]) << new-value ) .syne .desc The .code sub-vec function has the same parameters and semantics as the function .codn sub , except that the .meta vec argument must be a vector. If a .code sub-vec form is used as a place, it denotes a subrange of .meta vec as if it were a storage location. The previous value of this location, if needed, is fetched by a call to .codn sub-vec . Storing .meta new-value to the place is performed by a call to .codn replace-vec . In an update operation which accesses the prior value and stores a new value, the arguments .metn vec , .metn from , .meta to and .meta new-value are evaluated once. The .meta vec argument is not itself required to be a place; it is not updated when a value is written to the .code sub-vec storage location. .coNP Function @ replace-vec .synb .mets (replace-vec < vec < item-sequence >> [ from <> [ to ]]) .syne .desc The .code replace-vec function is like the .code replace function except that the .meta vec argument must be a vector. .coNP Function @ fill-vec .synb .mets (fill-vec < vec < elem >> [ from <> [ to ]]) .syne .desc The .code fill-vec function overwrites a range of the vector with copies of the .meta elem value. The .meta from and .meta to index arguments follow the same range indexing conventions as the .code replace and .code sub functions. If .meta from is omitted, it defaults to zero. If .meta to is omitted, it defaults to the length of .metn vec . Negative values of .meta from and .meta to are adjusted by adding the length of the vector to them, once. If the adjusted value of either .meta from or .meta to is negative, or exceeds the length of .metn vec , an error exception is thrown.
The adjusted values of .meta to and .meta from specify a range of .meta vec starting at the .meta from index, and ending at the .meta to index, which is excluded from the range. If the adjusted .meta to is less than or equal to the adjusted .metn from , then .meta vec is unaltered. Otherwise, copies of .meta elem are stored into .meta vec starting at the .meta from index, ending just before the .meta to index is reached. The .code fill-vec function returns .metn vec . .TP* Examples: .verb (defvarl v (vec 1 2 3)) v --> #(1 2 3) (fill-vec v 0) --> #(0 0 0) (fill-vec v 3 1) --> #(0 3 3) (fill-vec v 4 -1) --> #(0 3 4) (fill-vec v 5 -3 -1) --> #(5 5 4) .brev .coNP Function @ cat-vec .synb .mets (cat-vec << vec-list ) .syne .desc The .meta vec-list argument is a list of vectors. The .code cat-vec function produces a catenation of the vectors listed in .metn vec-list . It returns a single large vector formed by catenating those vectors together in order. .coNP Functions @ nested-vec and @ nested-vec-of .synb .mets (nested-vec << dimension *) .mets (nested-vec-of < object << dimension *) .syne .desc The .code nested-vec-of function constructs a nested vector according to the .meta dimension arguments, described in detail below. The .code nested-vec function is equivalent to .code nested-vec-of with an .meta object argument of .codn nil . When there are no .meta dimension arguments, .code nested-vec-of returns .codn nil . If there is exactly one .meta dimension argument, it must be a nonnegative integer. A newly created vector having that many elements is returned, with each element of the vector being .metn object . If there are two or more .meta dimension arguments, a nested vector is returned. The first .meta dimension argument specifies the outermost dimension: a vector of that many elements is returned. Each element of that vector is a vector whose length is given by the second dimension. This nesting pattern continues through the remaining dimensions.
The last dimension specifies the length of vectors which are filled with .metn object . From the above it follows that if a zero-valued .meta dimension is encountered, every vector corresponding to that level of nesting shall be empty, and that shall be the last dimension regardless of the presence of additional .meta dimension arguments. .TP* Examples: .verb (nested-vec) -> nil (nested-vec-of 0 4) -> #(0 0 0 0) (nested-vec-of 0 4 3) -> #(#(0 0 0) #(0 0 0) #(0 0 0) #(0 0 0)) (nested-vec-of 'a 4 3 2) -> #(#(#(a a) #(a a) #(a a)) #(#(a a) #(a a) #(a a)) #(#(a a) #(a a) #(a a)) #(#(a a) #(a a) #(a a))) (nested-vec-of 'a 1 1 1) -> #(#(#(a))) (nested-vec-of 'a 1 1 0) -> #(#(#())) (nested-vec-of 'a 1 0 1) -> #(#()) (nested-vec-of 'a 1 0) -> #(#()) (nested-vec-of 'a 0 1) -> #() (nested-vec-of 'a 0) -> #() (nested-vec-of 'a 4 0 1) -> #(#() #() #() #()) (nested-vec-of 'a 4 0) -> #(#() #() #() #()) .brev .SS* Buffers .coNP The @ buf type Objects of the type .code buf are .IR buffers : vector-like objects specialized for holding binary data represented as a sequence of 8-bit bytes. Buffers support operations specialized toward the encoding of Lisp values into machine-oriented data types, and decoding such data types into Lisp values. Buffers are particularly useful in conjunction with the Foreign Function Interface (FFI), since they can be used to prepare arbitrary data which can be passed into and out of a function by pointer. They are also useful for binary I/O. .coNP Conventions Used by the @ buf-put- Functions Buffers support a number of similar functions for converting Lisp numeric values into common data types, which are placed into the buffer. These functions are named starting with the .code buf-put- prefix, followed by an abbreviated type name. Each of these functions takes three arguments: .meta buf specifies the buffer, .meta pos specifies the byte offset position into the buffer which receives the low-order byte of the data transfer, and .meta val indicates the value.
If .meta pos has a value such that any portion of the data transfer would lie outside of the buffer, the buffer is automatically extended in length to contain the data transfer. If this extension causes any padding bytes to appear between the previous length of the buffer and .metn pos , those bytes are initialized to zero. The argument .meta val giving the value to be stored must be an integer or character, except in the case of the types .meta float and .metn double (the functions .code buf-put-float and .codn buf-put-double ) for which it is required to be of type .codn float , and in case of the function .code buf-put-cptr which expects the .meta val argument to be a .code cptr object. The .meta val argument must be in range for the data type, or an exception results. Unless otherwise indicated, the stored datum is in the local format used by the machine with regard to byte order and other representational details. .coNP Conventions Used by the @ buf-get- Functions Buffers support a number of similar functions for extracting common data types, and converting them into Lisp values. These functions are named starting with the .code buf-get- prefix, followed by an abbreviated type name. Each of these functions takes two arguments: .meta buf specifies the buffer and .meta pos specifies the byte offset position into the buffer which holds the low-order byte of the datum to be extracted. If any portion of the requested datum lies outside of the boundaries of the buffer, an error exception is thrown. The extracted value is converted to a Lisp datum. For the majority of these functions, the returned value is of type integer. The .code buf-get-float and .code buf-get-double functions return a floating-point value. The .code buf-get-cptr function returns a value of type .codn cptr . .coNP Function @ make-buf .synb .mets (make-buf < len >> [ init-val <> [ alloc-size ]]) .syne .desc The .code make-buf function creates a new buffer object which holds .meta len bytes.
This argument may be zero. If .meta init-val is present, it specifies the value with which the first .meta len bytes of the buffer are initialized. If omitted, it defaults to zero. The value of .meta init-val must lie in the range 0 to 255. The .meta alloc-size parameter indicates how much memory to actually allocate for the buffer. If an argument is not given, the parameter takes on the same value as .metn len . If an argument is given, its value must not be less than .metn len . .coNP Function @ bufp .synb .mets (bufp << object ) .syne .desc The .code bufp function returns .code t if .meta object is a .codn buf , otherwise it returns .codn nil . .coNP Function @ length-buf .synb .mets (length-buf << buf ) .syne .desc The .code length-buf function retrieves the buffer length: how many bytes are stored in the buffer. Note: the generic .code length function is also applicable to buffers. .coNP Function @ buf-alloc-size .synb .mets (buf-alloc-size << buf ) .syne .desc The .code buf-alloc-size function retrieves the allocation size of the buffer. .coNP Function @ buf-trim .synb .mets (buf-trim << buf ) .syne .desc The .code buf-trim function reduces the amount of memory allocated to the buffer to the minimum required to hold its contents, effectively setting the allocation size to the current length. The previous allocation size is returned. .coNP Function @ buf-set-length .synb .mets (buf-set-length < buf < len <> [ init-val ]) .syne .desc The .code buf-set-length function changes the length of the buffer. If the buffer is made longer, the newly added bytes appear at the end, and are initialized to the value given by .metn init-val . If .meta init-val is specified, its value must be in the range 0 to 255. It defaults to zero. .coNP Function @ copy-buf .synb .mets (copy-buf << buf ) .syne .desc The .code copy-buf function returns a duplicate of .metn buf : an object distinct from .meta buf which has the same length and contents, and compares .code equal to .metn buf .
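.TP* Example:

The following is a minimal sketch of creating, lengthening and copying
a buffer. The variable name
.code b
and the sizes are chosen only for illustration; the results shown are those
implied by the descriptions above.

.verb
  (defvarl b (make-buf 4 #xff 16))
  b                       --> #b'ffffffff'
  (length-buf b)          --> 4
  (buf-set-length b 6)
  b                       --> #b'ffffffff0000'
  (length-buf b)          --> 6
  (bufp b)                --> t
  (equal (copy-buf b) b)  --> t
.brev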
.coNP Accessor @ sub-buf .synb .mets (sub-buf < buf >> [ from <> [ to ]]) .mets (set (sub-buf < buf >> [ from <> [ to ]]) << new-val ) .syne .desc The .code sub-buf function has the same semantics as the .code sub function, except that the first argument must be a buffer. The extracted sub-range of a buffer is itself a buffer object. If .code sub-buf is used as a syntactic place, the argument expressions .metn buf , .metn from , .meta to and .meta new-val are evaluated just once. The prior value, if required, is accessed by calling .code sub-buf and .meta new-val is then stored via .codn replace-buf . .coNP Function @ replace-buf .synb .mets (replace-buf < buf < item-sequence >> [ from <> [ to ]]) .syne .desc The .code replace-buf function has the same semantics as the .code replace function, except that the first argument must be a buffer. The elements of .meta item-sequence are stored into .meta buf as if using the .code buf-put-u8 function and therefore must be suitable .meta val arguments for that function. The description of the arguments, semantics and return value given for .code replace applies to .codn replace-buf . .coNP Function @ buf-list .synb .mets (buf-list << list ) .syne .desc The .code buf-list function creates and returns a new buffer, whose contents are derived from the elements of .metn list , which may be any kind of sequence. The elements of .meta list must be integers whose values lie in the range 0 to 255, or else characters whose code point values lie in that range. These values are placed into the newly created buffer, which therefore has the same length as .metn list . .coNP Function @ buf-put-buf .synb .mets (buf-put-buf < dst-buf < pos << src-buf ) .syne .desc The .code buf-put-buf function stores a copy of buffer .meta src-buf into .meta dst-buf at the offset indicated by .metn pos . The source and destination memory regions may overlap. The return value is .metn src-buf .
Note: the effect of a .code buf-put-buf operation may also be performed by a suitable call to .codn replace-buf ; however, .code buf-put-buf is less general: it doesn't insert or delete by replacing destination ranges with data of differing length, and requires a source operand of buffer type. .coNP Function @ buf-put-i8 .synb .mets (buf-put-i8 < buf < pos << val ) .syne .desc The .code buf-put-i8 converts .meta val into an 8-bit signed integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . .coNP Function @ buf-put-u8 .synb .mets (buf-put-u8 < buf < pos << val ) .syne .desc The .code buf-put-u8 converts .meta val into an 8-bit unsigned integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . .coNP Function @ buf-put-i16 .synb .mets (buf-put-i16 < buf < pos << val ) .syne .desc The .code buf-put-i16 converts .meta val into a sixteen bit signed integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . .coNP Function @ buf-put-u16 .synb .mets (buf-put-u16 < buf < pos << val ) .syne .desc The .code buf-put-u16 converts .meta val into a sixteen bit unsigned integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . .coNP Function @ buf-put-i32 .synb .mets (buf-put-i32 < buf < pos << val ) .syne .desc The .code buf-put-i32 converts .meta val into a 32-bit signed integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . .coNP Function @ buf-put-u32 .synb .mets (buf-put-u32 < buf < pos << val ) .syne .desc The .code buf-put-u32 converts .meta val into a 32-bit unsigned integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . 
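.TP* Example:

The following is a minimal sketch of the
.code buf-put-
conventions, using only the single-byte functions so that the
result does not depend on the byte order of the machine. The variable name
.code b
is chosen only for illustration. Storing at offset 3 extends the
buffer, and the padding bytes at offsets 1 and 2 are zero.

.verb
  (defvarl b (make-buf 0))
  (buf-put-u8 b 0 #xab)  --> 171
  (buf-put-i8 b 3 -1)    --> -1
  b                      --> #b'ab0000ff'
.brev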
.coNP Function @ buf-put-i64 .synb .mets (buf-put-i64 < buf < pos << val ) .syne .desc The .code buf-put-i64 converts .meta val into a 64-bit signed integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . .coNP Function @ buf-put-u64 .synb .mets (buf-put-u64 < buf < pos << val ) .syne .desc The .code buf-put-u64 converts the value .meta val into a 64-bit unsigned integer, and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . .coNP Function @ buf-put-char .synb .mets (buf-put-char < buf < pos << val ) .syne .desc The .code buf-put-char converts .meta val into a value of the C type .code char and stores it into the buffer at the offset indicated by .metn pos . The return value is .metn val . Note that the .code char type may be signed or unsigned. .coNP Function @ buf-put-uchar .synb .mets (buf-put-uchar < buf < pos << val ) .syne .desc The .code buf-put-uchar converts .meta val into a value of the C type .code "unsigned char" and stores it into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-put-short .synb .mets (buf-put-short < buf < pos << val ) .syne .desc The .code buf-put-short converts .meta val into a value of the C type .code short and stores it into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-put-ushort .synb .mets (buf-put-ushort < buf < pos << val ) .syne .desc The .code buf-put-ushort converts .meta val into a value of the C type .code "unsigned short" and stores it into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-put-int .synb .mets (buf-put-int < buf < pos << val ) .syne .desc The .code buf-put-int converts .meta val into a value of the C type .code int and stores it into the buffer at the offset indicated by .metn pos . 
.coNP Function @ buf-put-uint .synb .mets (buf-put-uint < buf < pos << val ) .syne .desc The .code buf-put-uint converts .meta val into a value of the C type .code "unsigned int" and stores it into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-put-long .synb .mets (buf-put-long < buf < pos << val ) .syne .desc The .code buf-put-long converts .meta val into a value of the C type .code long and stores it into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-put-ulong .synb .mets (buf-put-ulong < buf < pos << val ) .syne .desc The .code buf-put-ulong converts .meta val into a value of the C type .code "unsigned long" and stores it into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-put-float .synb .mets (buf-put-float < buf < pos << val ) .syne .desc The .code buf-put-float converts .meta val into a value of the C type .code float and stores it into the buffer at the offset indicated by .metn pos . Note: the conversion of a \*(TL floating-point value to the C type float may be inexact, reducing the numeric precision. .coNP Function @ buf-put-double .synb .mets (buf-put-double < buf < pos << val ) .syne .desc The .code buf-put-double converts .meta val into a value of the C type .code double and stores it into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-put-cptr .synb .mets (buf-put-cptr < buf < pos << val ) .syne .desc The .code buf-put-cptr expects .meta val to be of type .codn cptr . It stores the object's pointer value into the buffer at the offset indicated by .metn pos . .coNP Function @ buf-get-i8 .synb .mets (buf-get-i8 < buf << pos ) .syne .desc The .code buf-get-i8 function extracts and returns a signed 8-bit integer from .meta buf at the offset given by .metn pos .
.coNP Function @ buf-get-u8 .synb .mets (buf-get-u8 < buf << pos ) .syne .desc The .code buf-get-u8 function extracts and returns an unsigned 8-bit integer from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-i16 .synb .mets (buf-get-i16 < buf << pos ) .syne .desc The .code buf-get-i16 function extracts and returns a signed 16-bit integer from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-u16 .synb .mets (buf-get-u16 < buf << pos ) .syne .desc The .code buf-get-u16 function extracts and returns an unsigned 16-bit integer from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-i32 .synb .mets (buf-get-i32 < buf << pos ) .syne .desc The .code buf-get-i32 function extracts and returns a signed 32-bit integer from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-u32 .synb .mets (buf-get-u32 < buf << pos ) .syne .desc The .code buf-get-u32 function extracts and returns an unsigned 32-bit integer from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-i64 .synb .mets (buf-get-i64 < buf << pos ) .syne .desc The .code buf-get-i64 function extracts and returns a signed 64-bit integer from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-u64 .synb .mets (buf-get-u64 < buf << pos ) .syne .desc The .code buf-get-u64 function extracts and returns an unsigned 64-bit integer from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-char .synb .mets (buf-get-char < buf << pos ) .syne .desc The .code buf-get-char function extracts and returns a value of the C type .code char from .meta buf at the offset given by .metn pos . Note that .code char may be signed or unsigned. .coNP Function @ buf-get-uchar .synb .mets (buf-get-uchar < buf << pos ) .syne .desc The .code buf-get-uchar function extracts and returns a value of the C type .code "unsigned char" from .meta buf at the offset given by .metn pos . 
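.TP* Example:

The following is a minimal sketch of a round trip through the fixed-width
integer functions. It assumes a two's complement representation of signed
integers. Because each value is stored and fetched with the same width at
the same offset, the results do not depend on byte order. The variable name
.code b
is chosen only for illustration.

.verb
  (defvarl b (make-buf 8))
  (buf-put-i32 b 0 -2)
  (buf-get-i32 b 0)        --> -2
  (buf-get-u32 b 0)        --> 4294967294
  (buf-put-u16 b 4 65535)
  (buf-get-u16 b 4)        --> 65535
  (buf-get-i16 b 4)        --> -1
.brev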
.coNP Function @ buf-get-short .synb .mets (buf-get-short < buf << pos ) .syne .desc The .code buf-get-short function extracts and returns a value of the C type .code short from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-ushort .synb .mets (buf-get-ushort < buf << pos ) .syne .desc The .code buf-get-ushort function extracts and returns a value of the C type .code "unsigned short" from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-int .synb .mets (buf-get-int < buf << pos ) .syne .desc The .code buf-get-int function extracts and returns a value of the C type .code int from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-uint .synb .mets (buf-get-uint < buf << pos ) .syne .desc The .code buf-get-uint function extracts and returns a value of the C type .code "unsigned int" from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-long .synb .mets (buf-get-long < buf << pos ) .syne .desc The .code buf-get-long function extracts and returns a value of the C type .code long from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-ulong .synb .mets (buf-get-ulong < buf << pos ) .syne .desc The .code buf-get-ulong function extracts and returns a value of the C type .code "unsigned long" from .meta buf at the offset given by .metn pos . .coNP Function @ buf-get-float .synb .mets (buf-get-float < buf << pos ) .syne .desc The .code buf-get-float function extracts and returns a value of the C type .code float from .meta buf at the offset given by .metn pos , returning that value as a Lisp floating-point number. .coNP Function @ buf-get-double .synb .mets (buf-get-double < buf << pos ) .syne .desc The .code buf-get-double function extracts and returns a value of the C type .code double from .meta buf at the offset given by .metn pos , returning that value as a Lisp floating-point number. 
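.TP* Example:

The following is a minimal sketch of storing and fetching floating-point
values. It assumes that the C type
.code double
occupies eight bytes, so that offset 8 does not overlap the first datum.
The value 0.5 is used with
.code buf-put-float
because it is exactly representable in the C type
.codn float .
The variable name
.code b
is chosen only for illustration.

.verb
  (defvarl b (make-buf 12))
  (buf-put-double b 0 0.1)
  (buf-get-double b 0)      --> 0.1
  (buf-put-float b 8 0.5)
  (buf-get-float b 8)       --> 0.5
.brev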
.coNP Function @ buf-get-cptr .synb .mets (buf-get-cptr < buf << pos ) .syne .desc The .code buf-get-cptr function extracts a C pointer from .meta buf at the offset given by .metn pos , returning that value as a Lisp object of type .codn cptr . .coNP Function @ put-buf .synb .mets (put-buf < buf >> [ pos <> [ stream ]]) .syne .desc The .code put-buf function writes the contents of buffer .metn buf , starting at position .meta pos to a stream, through to the last byte, if possible. Successive bytes from the buffer are written to the stream as if by a .code put-byte operation. If .meta stream is omitted, it defaults to .codn *stdout* . If .meta pos is omitted, it defaults to zero. It indicates the starting position within the buffer. The stream must support the .code put-byte operation. Streams which support .code put-byte can be expected to support .code put-buf and, conversely, streams which do not support .code put-byte do not support .codn put-buf . The .code put-buf function returns the position of the last byte that was successfully written. If the buffer was written through to the end, then this value corresponds to the length of the buffer. If an error occurs before any bytes are written, the function throws an error. .coNP Functions @ fill-buf and @ fill-buf-adjust .synb .mets (fill-buf < buf >> [ pos <> [ stream ]]) .mets (fill-buf-adjust < buf >> [ pos <> [ stream ]]) .syne .desc The .code fill-buf function reads bytes from .meta stream and writes them into consecutive locations in buffer .meta buf starting at position .metn pos . The bytes are read as if using the .code get-byte function. If the .meta stream argument is omitted, it defaults to .codn *stdin* . If .meta pos is omitted, it defaults to zero. It indicates the starting position within the buffer. The stream must support the .code get-byte operation.
Streams which support .code get-byte can be expected to support .code fill-buf and, conversely, streams which do not support .code get-byte do not support .codn fill-buf . The .code fill-buf function returns the position that is one byte past the last byte that was successfully read. If an end-of-file or other error condition occurs before the buffer is filled through to the end, then the value returned is smaller than the buffer length. In this case, the area of the buffer beyond the read size retains its previous content. If an error situation occurs other than a premature end-of-file before any bytes are read, then an exception is thrown. If an end-of-file condition occurs before any bytes are read, then zero is returned. The .code fill-buf-adjust function differs usefully from .code fill-buf as follows. Whereas .code fill-buf doesn't manipulate the length of the buffer at any stage of the operation, the .code fill-buf-adjust function begins by adjusting the length of the buffer to the underlying allocated size. Then it performs the fill operation in exactly the same manner as .codn fill-buf . Finally, if the operation succeeds, then .code fill-buf-adjust adjusts the length of the buffer to match the position that is returned. .coNP Function @ get-line-as-buf .synb .mets (get-line-as-buf <> [ stream ]) .syne .desc The .code get-line-as-buf function reads bytes from .meta stream as if using the .code get-byte function, until either a newline character is encountered, or else the end of input is encountered. The bytes which are read, exclusive of the newline character, are returned in a new buffer object. The newline character, if it occurs, is consumed. If .meta stream is omitted, it defaults to .codn *stdin* . The stream is required to support byte input.
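.TP* Example:

The following is a minimal sketch of byte-oriented input into buffers.
It assumes that a stream produced by
.code make-string-byte-input-stream
supports both operations; the variable names
.code s
and
.code b
are chosen only for illustration. After the first line is consumed, only
three bytes remain, so
.code fill-buf
returns 3 and the rest of the buffer keeps its previous content.

.verb
  (defvarl s (make-string-byte-input-stream "abc\endef"))
  (get-line-as-buf s)  --> #b'616263'
  (defvarl b (make-buf 8))
  (fill-buf b 0 s)     --> 3
  b                    --> #b'6465660000000000'
.brev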
.coNP Functions @ file-get-buf and @ command-get-buf .synb .mets (file-get-buf < name >> [ max-bytes .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ >> [ skip-bytes <> [ mode-opts ]]]) .mets (command-get-buf < cmd >> [ max-bytes <> [ skip-bytes ]]) .syne .desc The .code file-get-buf function opens a binary stream over the file indicated by the string argument .meta name for reading. By default, the entire file is read and its contents are returned as a buffer object. The buffer's length corresponds to the number of bytes read from the file. The .code command-get-buf function opens a binary stream over an input command pipe created for the command string .metn cmd , as if by the .code open-command function. It reads bytes from the pipe until the indication that no more input is available. The bytes are returned aggregated into a buffer object. If the .meta max-bytes parameter is given an argument, it must be a nonnegative integer. That value specifies a limit on the number of bytes to read. A buffer no longer than .meta max-bytes shall be returned. If the .meta skip-bytes parameter is given an argument, it must be a nonnegative integer. That value specifies how many initial bytes of the input should be discarded before accumulation of the buffer begins. If possible, the semantics of this parameter is achieved by performing a .code seek-stream operation, falling back on reading and discarding bytes if the stream doesn't support seeking. If .meta max-bytes is specified, then the stream is opened in unbuffered mode, so that bytes beyond the specified range shall not be requested from the underlying file, device or process. The .code file-get-buf function opens the file as if using the .code open-file function, using a .meta mode-string of .strn r . If the .meta mode-opts is present, it specifies .meta options to be added to the string. These must be compatible with the implicit .str r mode.
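.TP* Example:

The following is a minimal sketch of the
.meta max-bytes
and
.meta skip-bytes
parameters of
.codn file-get-buf .
The file name
.str demo.bin
is hypothetical, and the file is first created with
.codn file-put-buf ,
passing an explicit
.meta skip-bytes
argument of zero. The results shown are those implied by the
description above.

.verb
  (file-put-buf "demo.bin" #b'00010203040506' 0)
  (file-get-buf "demo.bin")      --> #b'00010203040506'
  (file-get-buf "demo.bin" 3)    --> #b'000102'
  (file-get-buf "demo.bin" 3 2)  --> #b'020304'
.brev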
.coNP Functions @, file-put-buf @ file-append-buf and @ command-put-buf .synb .mets (file-put-buf < name < buf < skip-bytes <> [ mode-opts ]) .mets (file-place-buf < name < buf < skip-bytes <> [ mode-opts ]) .mets (file-append-buf < name < buf <> [ mode-opts ]) .mets (command-put-buf < cmd << buf ) .syne .desc The .code file-put-buf function opens a text stream over the file indicated by the string argument .metn name , writes the contents of the buffer object .meta buf into the file, and then closes the file. If the file doesn't exist, it is created. If it exists, it is truncated to zero length and overwritten. The default value of the optional .meta skip-bytes parameter is zero. If an argument is given, it must be a nonnegative integer. If it is nonzero, then after opening the file, before writing the buffer, the function will seek to an offset of that many bytes from the start of the file. The contents of .meta buf will be written at that offset. The .code file-place-buf function does not truncate an existing file to zero length. In all other regards, it is equivalent to .codn file-put-buf . The .code file-append-buf function is similar to .code file-put-buf except that if the file exists, it isn't overwritten. Rather, the buffer is appended to the file. The .code command-put-buf function opens an output text stream over an output command pipe created for the command specified in the string argument .metn cmd , as if by the .code open-command function. It then writes the contents of buffer .meta buf into the stream and closes the stream. The .codn file-put-buf , .code file-place-buf and .code file-append-buf functions open a file as if using the .code open-file function using, respectively, .meta mode-string values of .strn wb , .strn mb , and .strn ab . The .meta mode-opts argument, if present, specifies additional .meta options to be added to these modes. 
The return value of all of these functions is that of the .code put-buf operation which is implicitly performed. .coNP Functions @ buf-str and @ str-buf .synb .mets (buf-str < str <> [ null-term-p ]) .mets (str-buf < buf <> [ null-term-p ]) .syne .desc The .code buf-str and .code str-buf functions perform UTF-8 conversion between the character string and buffer data types. The .code buf-str function UTF-8-encodes .meta str and returns a buffer containing the converted representation. If a true argument is given to the .meta null-term-p parameter, then a null terminating byte is added to the buffer. This byte is added even if the previous byte is already a null byte from the conversion of a pseudo-null character occurring in .metn str . The .code str-buf function takes the contents of buffer .meta buf to be UTF-8 data, which is converted to a character string and returned. Null bytes in the buffer are mapped to the pseudo-null character .codn #\exDC00 . If a true argument is given to the .meta null-term-p parameter, then if the contents of .meta buf end in a null byte, that byte is not included in the conversion. .coNP Functions @ buf-int and @ buf-uint .synb .mets (buf-int << integer ) .mets (buf-uint << integer ) .syne .desc The .code buf-int and .code buf-uint functions convert a signed and unsigned integer, respectively, or else a character, into a binary representation, which is returned as a buffer object. Under both functions, the representation uses big endian byte order: most significant byte first. The .code buf-uint function requires a nonnegative .meta integer argument, which may be a character. The representation stored in the buffer is a pure binary representation of the value using the smallest number of bytes required for the given .meta integer value. The .code buf-int function requires an integer or character argument.
The representation stored in the buffer is a two's complement representation of .meta integer using the smallest number of bytes which can represent that value. If .meta integer is nonnegative, then the first byte of the buffer lies in the range 0 to 127. If .meta integer is negative, then the first byte of the buffer lies in the range 128 to 255. The integer 255 therefore doesn't convert to the buffer .code #b'ff' but rather .codn #b'00ff' . The buffer .code #b'ff' represents -1. If the .meta integer argument is a character object, it is taken to be its Unicode code point value, as returned by the .code int-chr function. .coNP Functions @ int-buf and @ uint-buf .synb .mets (int-buf << buf ) .mets (uint-buf << buf ) .syne .desc The .code int-buf and .code uint-buf functions recover an integer value from its binary form which appears inside .metn buf , which must be a buffer object. These functions expect .meta buf to contain the representation produced by, respectively, the functions .code buf-int and .codn buf-uint . If .meta buf holds the representation of an integer value .metn n , as produced by .mono .meti (buf-int << n ) .onom then .mono .meti (int-buf << buf ) .onom returns .metn n . The same relationship holds between .code buf-uint and .codn uint-buf . Thus, these equalities hold: .verb .mets (= (int-buf (buf-int << n )) << n ) .mets (= (uint-buf (buf-uint << n )) << n ) .brev provided that .meta n is of integer type and, in the case of .codn buf-uint , nonnegative. .coNP Functions @ buf-compress and @ buf-decompress .synb .mets (buf-compress < buf <> [ level ]) .mets (buf-decompress << buf ) .syne .desc The .code buf-compress and .code buf-decompress functions perform compression using the Deflate algorithm, via Zlib. These functions are only available if \*(TX is built with Zlib support. 
More specifically, .code buf-compress uses Zlib's .code compress2 function; therefore it can be expected to interoperate with other software which uses the same function. The .code buf-compress function compresses the entire contents of .meta buf and returns a new buffer with the compressed contents. The optional .meta level argument specifies the compression level as an integer. Valid values range from 0 (no compression) to 9 (maximum compression). The value -1 selects a default compression determined internally by Zlib. The .code buf-decompress function reverses the .code buf-compress operation: it takes a compressed .meta buf and returns a buffer containing the original uncompressed data. The .code buf-compress function throws an error exception if the .meta level value is unacceptable to Zlib. The .code buf-decompress function throws an error exception if .meta buf doesn't contain a compressed image. .SS* Structures \*(TX supports user-defined types in the form of structures. Structures are objects which hold multiple storage locations called slots, which are named by symbols. Structures can be related to each other by inheritance. Multiple inheritance is permitted. The type of a structure is itself an object, of type .codn struct-type . When the program defines a new structure type, it does so by creating a new .code struct-type instance, with properties which describe the new structure type: its name, its list of slots, its initialization and "boa constructor" functions, and the structure types it inherits from (the .IR supertypes ). The .code struct-type object is then used to generate instances. Structure instances are not only containers which hold named slots, but they also indicate their struct type. Two structures which have the same number of slots having the same names are not necessarily of the same type. Structure types and structures may be created and manipulated using a programming interface based on functions.
For more convenient and clutter-free expression of structure-based program code, macros are also provided. Furthermore, concise and expressive slot access syntax is provided courtesy of the referencing dot and unbound referencing dot syntax, a syntactic sugar for the .code qref and .code uref macros. Structure types have a name, which is a symbol. The .code typeof function, when applied to any struct type, returns the symbol .codn struct-type . When .code typeof is applied to a struct instance, it returns the name of the struct type. Effectively, struct names are types. The consequences are unspecified if an existing struct name is reused for a different struct type, or an existing type name is used for a struct type. .NP* Static Slots Structure slots can be of two kinds: they can be the ordinary instance slots or they can be static slots. The instances of a given structure type have their own instance of a given instance slot. However, they all share a single instance of a static slot. Static slots are allocated in a global area associated with a structure type and are initialized when the structure type is created. They are useful for efficiently representing properties which have the same value for all instances of a struct. These properties don't have to occupy space in each instance, and time doesn't have to be wasted initializing them each time a new instance is created. Static slots are also useful for struct-specific global variables. Lastly, static slots are also useful for holding methods and functions. Although structures can have methods and functions in their instances, usually, all structures of the same type share the same functions. The .code defstruct macro supports a special syntax for defining methods and struct-specific functions at the same time when a new structure type is defined. The .code defmeth macro can be used for adding new methods and functions to an existing structure and its descendants. 
Static slots may be assigned just like instance slots. Changing a static slot changes that slot in every structure of the same type. Static slots are not listed in the .code #S(...) notation when a structure is printed. When the structure notation is read from a stream, if static slots are present, they will be processed and their values stored in the static locations they represent, thus changing their values for all instances. Static slots are inherited just like instance slots. The following simplified discussion is restricted to single inheritance. A detailed description of multiple inheritance is given in the Multiple Inheritance section below. If a given structure .meta B has some static slot .metn s , and a new structure .meta D is derived from .metn B , using .codn defstruct , and does not define a slot .metn s , then .meta D inherits .metn s . This means that .meta D shares the static slot with .metn B : both types share a single instance of that slot. On the other hand if .code D defines a static slot .meta s then that slot will have its own instance in the .meta D structure type; .meta D will not inherit the .meta B instance of slot .metn s . Moreover, if the definition of .code D omits the .meta init-form for slot .metn s , then that slot will be initialized with a copy of the current value of slot .meta s of the .meta B base type, which allows derived types to obtain the value of base type's static slot, yet have that in their own instance. The slot type can be overridden. A structure type deriving from another type can introduce slots which have the same names as the supertype, but are of a different kind: an instance slot in the supertype can be replaced by a static slot in the derived type or vice versa. Note that, in light of the above type overriding possibility, the static slot value propagation happens only from the immediate supertype. 
If .code D is derived from .code G which has a static slot .codn s , whereas .code D specifies .code s as an instance slot, but then .code B again specifies a static slot .codn s , then .codn B 's slot .code s will not inherit the value from .codn G 's .code s slot. Simply, .codn B 's supertype is .code D and that supertype is not considered to have a static slot .codn s . A structure type is associated with a static initialization function which may be used to store initial values into static slots. This function is invoked once in a type's life time, when the type is created. The function is also inherited by derived struct types and invoked when they are created. .NP* Multiple Inheritance When a structure type is defined, two or more supertypes may be specified. The new structure type then potentially inherits instance and static slots from all of the specified supertypes, and is considered to be a subtype of all of them. This situation with two or more supertypes is called .IR "multiple inheritance" . The contrasting term is .IR "single inheritance" , denoting the situation when a structure has exactly one supertype. \*(TL's struct types initially permitted only single inheritance. Multiple inheritance support was introduced in version 229, as a straightforward extension of single inheritance semantics. In the .code make-struct-type function and .code defstruct macro, a list of supertypes can be given instead of just one. The type then inherits slots from all of the specified types. If any conflicts arise among the supertypes due to slots having the same name, the leftmost supertype dominates: that type's slot will be inherited. If the leftmost slot is static, then that static slot will be inherited. Otherwise, the instance slot will be inherited. Of course, any slot which is specified in the newly defined type itself dominates over any same-named slots among the supertypes. 
The new structure type inherits all of the slot initializing expressions, as well as .code :init and .code :postinit methods of all of its supertypes. Each time the structure is instantiated, the .code :init initializing expressions inherited from the supertypes, together with the slot initializing expressions, are all evaluated, in right-to-left order: the initializations contributed by each supertype are performed before considering the next supertype to the left. The .code :postinit methods are similarly invoked in right-to-left order, before the .code :postinit methods of the new type itself. Thus the order is: supertype inits, own inits, supertype post-inits, own post-inits. .NP* Duplicate Supertypes Multiple inheritance makes it possible for a type to inherit the same supertype more than once, either directly (by naming it more than once as a direct supertype) or indirectly (by inheriting two or more different types, which have a common ancestor). The latter situation is sometimes referred to as the .IR "diamond problem" . Until \*(TX 242, the situation of duplicate supertypes was ignored for the purposes of object initialization. It was documented that if a supertype is referenced by inheritance, directly or indirectly, two or more times, then its initializing expressions are evaluated that many times. Starting in \*(TX 243, duplicate supertypes no longer give rise to duplicate initialization. When an object is instantiated, only one initialization of a duplicated supertype occurs. The subsequent initializations that would take place in the absence of duplicate detection are suppressed. Note also that the .code :fini mechanism is tied to initialization. Initialization of an object registers the finalizers, and so in \*(TX 242, .code :fini finalizers are also executed multiple times, if .code :init initializers are. 
.TP* Examples:

Consider the following program:

.verb
  (defstruct base ()
    (:init (me) (put-line "base init"))
    (:fini (me) (put-line "base fini")))

  (defstruct d1 (base)
    (:init (me) (put-line "d1 init"))
    (:fini (me) (put-line "d1 fini")))

  (defstruct d2 (base)
    (:init (me) (put-line "d2 init"))
    (:fini (me) (put-line "d2 fini")))

  (defstruct s (d1 d2))

  (call-finalizers (new s))
.brev

Under \*(TX 242, and earlier versions that support multiple inheritance, it produces the output:

.verb
  base init
  d2 init
  base init
  d1 init
  d1 fini
  base fini
  d2 fini
  base fini
.brev

The supertypes are initialized in a right-to-left traversal of the type lattice, without regard for .code base being duplicated. Starting with \*(TX 243, the output is:

.verb
  base init
  d2 init
  d1 init
  d1 fini
  d2 fini
  base fini
.brev

The rightmost duplicate of the base is initialized, so that the initialization is complete prior to the initializations of any dependent types. Likewise, the same rightmost duplicate of the base is finalized, so that finalization takes place after that of any dependent struct types. Note, however, that the .code derived function mechanism is not required to detect duplicated direct supertypes. If a supertype implements the .code derived function to detect situations when it is the target of inheritance, and some subtype inherits that type more than once, that function may be called more than once. The behavior is unspecified. .NP* Dirty Flags All structure instances contain a Boolean flag called the .IR "dirty flag" . This flag is not a slot, but rather a meta-data property that is exposed to program access. When the flag is set, an object is said to be dirty; otherwise it is clean. Newly constructed objects come into existence dirty. The dirty flag state can be tested with the function .codn test-dirty . An object can be marked as clean by clearing its dirty flag with .codn clear-dirty .
A combined operation .code test-clear-dirty is provided which clears the dirty flag, and returns its previous value. The dirty flag is set whenever a new value is stored into the instance slot of an object. Note: the dirty flag can be used to support the caching of values derived from an object's slots. The derived values don't have to be recomputed while an object remains clean. .NP* Equality Substitution In object-based or object-oriented programming, sometimes it is necessary for a new data type to provide its own notion of equality: its own requirements for when two distinct instances of the type are considered equal. Furthermore, types sometimes have to implement their own notion, also, of inequality: the requirements for the manner in which one instance is considered lesser or greater than another. \*(TL structures implement a concept called .IR "equality substitution" which provides a simple, unified way for the implementor of an object to encode the requirements for both equality and inequality. Equality substitution allows for objects to be used as keys in a hash table according to the custom equality, without the programmer being burdened with the responsibility of developing a custom hashing function. An object participates in equality substitution by implementing the .code equal method. The .code equal method takes no arguments other than the object itself. It returns a representative value which is used in place of that object for the purposes of .code equal comparison. Whenever an object which supports equality substitution is used as an argument of any of the functions .codn equal , .codn nequal , .codn greater , .codn less , .codn gequal , .code lequal or .codn hash-equal , the .code equal method of that object is invoked, and the return value of that method is taken in place of that object. The same is true if an object which supports equality substitution is used as a key in an .code :equal-based hash table.
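
Note: the following is a minimal sketch of equality substitution, not a
definitive idiom. The struct type
.code pt
and its slots
.code x
and
.code y
are hypothetical names introduced for illustration. The
.code equal
method returns a list of the two coordinates, so that list stands in for
the object in comparisons and in hashing.

.verb
  ;; hypothetical type: a point represented by its coordinate list
  (defstruct pt nil
    x y
    (:method equal (me) (list me.x me.y)))

  ;; both objects substitute the list (1 2)
  (equal (new pt x 1 y 2) (new pt x 1 y 2)) -> t

  ;; ordering uses the substitutes: (1 2) versus (1 3)
  (less (new pt x 1 y 2) (new pt x 1 y 3)) -> t

  ;; a distinct but equivalent object finds the same hash entry
  (let ((h (hash)))
    (set [h (new pt x 1 y 2)] "found")
    [h (new pt x 1 y 2)])
  -> "found"
.brev
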
The substitution is applied repeatedly: if the return value of the object's .code equal method is an object which itself supports equality substitution, then that returned object's method is invoked on that object to fetch its equality substitute. This repeats as many times as necessary until an object is determined which isn't a structure that supports equality substitution. Once the equality substitute is determined, then the given function proceeds with the replacement object. Thus for example .code equal compares the replacement object in place of the original, and an .code :equal-based hash table uses the replacement object as the key for the purposes of hashing and comparison. .NP* Custom Slot Expansion The .code defstruct macro has a provision for application-defined clauses, which may be defined using the .code define-struct-clause macro. This macro associates new clause keywords with custom expansion. The .code :delegate clause of .code defstruct is in fact implemented externally to .code defstruct using .codn define-struct-clause . .NP* Custom Preludes The .code defstruct macro has a provision for implicit inclusion of application-defined clauses called preludes, which are previously defined via the .code define-struct-prelude macro. During macro-expansion, .code defstruct checks whether the structure being defined is the target of one or more preludes. If so, it includes the clauses from those preludes as if they were written directly in the .code defstruct syntax. .coNP Macro @ defstruct .synb .mets (defstruct >> { name | >> ( name << arg *)} < super .mets \ \ << slot-specifier *) .syne .desc The .code defstruct macro defines a new structure type and registers it under .metn name , which must be a bindable symbol, according to the .code bindable function. Likewise, the name of every .meta slot must also be a bindable symbol.
The .meta super argument must either be .codn nil , or a symbol which names an existing struct type, or else a list of such symbols. The newly defined struct type will inherit all slots, as well as initialization behaviors from the specified struct types. The .code defstruct macro is implemented using the .code make-struct-type function, which is more general. The macro analyzes the .code defstruct argument syntax, and synthesizes arguments which are then used to call the function. Some remarks in the description of .code defstruct only apply to structure types defined using that macro. Slots are specified using zero or more .meta slot-specifier clauses. Application-defined clauses are possible via .codn define-struct-clause . The .code defstruct macro may bring in prelude clauses which are not specified in its syntax, but that have been specified using .codn define-struct-prelude . The following built-in clauses are supported: .RS .meIP < name The simplest slot specifier is just a name, which must be a bindable symbol, as defined by the .code bindable function. This form is a short form for the .mono .meti (:instance << name ) .onom syntax. .meIP >> ( name << init-form ) This syntax is a short form for the .mono .meti (:instance < name << init-form ) .onom syntax. .meIP (:instance < name <> [ init-form ]) This syntax specifies an instance slot called .meta name whose initial value is obtained by evaluating .meta init-form whenever a new instance of the structure is created. This evaluation takes place in the original lexical environment in which the .code defstruct form occurs. If .meta init-form is omitted, the slot is initialized to .codn nil . .meIP (:static < name <> [ init-form ]) This syntax specifies a static slot called .meta name whose initial value is obtained by evaluating .meta init-form once, during the evaluation of the .code defstruct form in which it occurs, if the .meta init-form is present. 
If .meta init-form is absent, and a static slot with the same name exists in the .meta super base type, then this slot is initialized with the value of that slot. Otherwise it is initialized to .codn nil . The definition of a static slot in a .code defstruct causes the new type to have its own instance of that slot, even if a same-named static slot occurs in the .meta super base type, or its bases. .meIP (:method < name <> ( param +) << body-form *) This syntax creates a static slot called .meta name which is initialized with an anonymous function. The anonymous function is created during the evaluation of the .code defstruct form. The function takes the arguments specified by the .meta param symbols, and its body consists of the .metn body-form s. There must be at least one .metn param . When the function is invoked as a method, as intended, the leftmost .meta param receives the structure instance. The .metn body-form s are evaluated in a context in which a block named .meta name is visible. Consequently, .code return-from may be used to terminate the execution of a method and return a value. Methods are invoked using the .code "instance.(name arg ...)" syntax, which implicitly inserts the instance into the argument list. Due to the semantics of static slots, methods are naturally inherited from a base structure to a derived one, and defining a method in a derived class which also exists in a base class performs OOP-style overriding. .meIP (:function < name <> ( param *) << body-form *) This syntax creates a static slot called .meta name which is initialized with an anonymous function. The anonymous function is created during the evaluation of the .code defstruct form. The function takes the arguments specified by the .meta param symbols, and its body consists of the .metn body-form s. This specifier differs from .code :method only in one respect: there may be zero parameters.
A structure function defined this way is intended to be used as a utility function which doesn't receive the structure instance as an argument. The .metn body-form s are evaluated in a context in which a block named .meta name is visible. Consequently, .code return-from may be used to terminate the execution of the function and return a value. Such functions are called using the .code "(call instance.name arg ...)" or else the DWIM brackets syntax .codn "[instance.name arg ...]" . The remarks about inheritance and overriding in the description of .code :method also apply to .codn :function . .meIP (:init <> ( param ) << body-form *) The .code :init specifier doesn't describe a slot. Rather, it specifies code which is executed when a structure is instantiated, before the slot initializations specific to the structure type are performed. The code consists of .metn body-form s which are evaluated in order in a lexical scope in which the variable .meta param is bound to the structure object. Multiple .code :init specifiers may appear in the same .code defstruct form. They are executed in their order of appearance, left to right. When an object with one or more levels of inheritance is instantiated, the .code :init code of a base structure type, if any, is executed before any initializations specific to a derived structure type. Under multiple inheritance, the .code :init code of the rightmost base type is executed first, then that of the remaining bases in right-to-left order. The .code :init initializations are executed before any other slot initializations. The argument values passed to the .code new or .code lnew operator or the .code make-struct function are not yet stored in the object's slots, and are not accessible. Initialization code which needs these values to be stable can be defined with .codn :postinit . 
Initializers in base structures must be careful about assumptions about slot kinds, because derived structures can alter static slots to instance slots or vice versa. To avoid an unwanted initialization being applied to the wrong kind of slot, initialization code can be made conditional on the outcome of .code static-slot-p applied to the slot. (Code generated by .code defstruct for initializing instance slots performs this kind of check). The .metn body-form s of an .code :init specifier are not surrounded by an implicit .codn block . .meIP (:postinit <> ( param ) << body-form *) The .code :postinit specifier is similar to .codn :init . Both specify forms which are evaluated during object instantiation. The difference is that the .metn body-form s of a .code :postinit are evaluated after other initializations have taken place, including the .code :init initializations, as a second pass. By the time .code :postinit initialization runs, the argument material from the .codn make-struct , .code new or .code lnew invocation has already been processed and stored into slots. Like .code :init actions, .code :postinit actions registered at different levels of the type's inheritance hierarchy are invoked in the base-to-derived order, in right-to-left order among multiple bases at the same level. Multiple .code :postinit forms in the same .code defstruct are invoked in left-to-right order. .meIP (:fini <> ( param ) << body-form *) The .code :fini specifier doesn't describe a slot. Rather, it specifies a finalization function which is associated with the structure instance, as if by use of the .code finalize function. This finalization registration takes place as the first step when an instance of the structure is created, before the slots are initialized and the .code :init code, if any, has been executed. The registration takes place as if by the evaluation of the form .mono .meti (finalize < obj (lambda <> ( param ) << body-form ...)
t) .onom where .meta obj denotes the structure instance. Note the .code t argument which requests reverse order of registration, ensuring that if an object has multiple finalizers registered at different levels of inheritance hierarchy, the finalizers specified for a derived structure type are called before inherited finalizers. The .metn body-form s of a .code :fini specifier are not surrounded by an implicit .codn block . Multiple .code :fini clauses may be specified in the same .codn defstruct , in which case they are invoked in reverse, right-to-left order. Note that an object's finalizers can be called explicitly with .codn call-finalizers . Note: the .code with-objects macro arranges for finalizers to be called on objects when the execution of a scope terminates by any means. .meIP (:postfini <> ( param ) << body-form *) Like .codn :fini , the .code :postfini specifier doesn't describe a slot. The syntax is identical to .codn :fini . Independently of whether .code :fini is specified, at most one .code :postfini may be specified. The only difference between .code :fini and .code :postfini is that .code :postfini arranges for a finalizer to be registered as if by the evaluation of the form .mono .meti (finalize < obj (lambda <> ( param ) << body-form ...)) .onom where .meta obj denotes the structure instance. Note that unlike .codn :fini , this omits the .code t parameter, which means that .code :postfini finalizers of derived structures execute after the execution of inherited finalizers. It also means that multiple .code :postfini finalizers appearing in the same .code defstruct execute in left-to-right order unlike the reverse right-to-left order of .code :fini finalizers. When both .code :fini and .code :postfini clauses are specified in the same .code defstruct form, all the .code :postfini finalizers execute after all the .code :fini finalizers regardless of the order in which they appear.
.meIP (:inherit << super *) The .code :inherit clause specifies zero or more types to be inherited. Each .meta super argument must be a symbol which is the name of an existing struct type. These symbols are appended to the list of supertypes coming from the .meta super argument of .codn defstruct . Note: the motivation behind .code :inherit is to make it possible for struct clauses defined by .code define-struct-clause to inject supertypes. Developers are encouraged to use the regular .meta super argument of .code defstruct to declare inheritance of supertypes, rather than writing visible .code :inherit clauses that can be moved into the .meta super argument. .RE .IP The slot names given in a .code defstruct must all be unique among themselves, but they may match the names of existing slots in the .meta super base type. A given structure type can have only one slot under a given symbolic name. If a newly specified slot matches the name of an existing slot in the .meta super type or that type's chain of ancestors, it is called a .IR "repeated slot" . The kind of the repeated slot (static or instance) is not inherited; it is established by the .code defstruct and may be different from the type of the same-named slot in the supertype or its ancestors. If a repeated slot is introduced as a static slot, and has no .meta init-form then it receives the current value of the static slot of the same name from the nearest supertype which has such a slot. If a repeated slot is an instance slot, no such inheritance of value takes place; only the local .meta init-form applies to it; if it is absent, the slot is initialized to .code nil in each newly created instance of the new type. However, .code :init and .code :postinit initializations are inherited from a base type and they apply to the repeated slots, regardless of their kind. These initializations take place on the instantiated object, and the slot references resolve accordingly.
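
Note: the value propagation into a repeated static slot can be sketched as
follows. The struct types
.code widget
and
.code button
and the slot
.code tag
are hypothetical names used only for illustration. The
.code button
type repeats
.code tag
as a static slot without an
.metn init-form ,
so it is given its own instance of the slot, initialized from the value
that the
.code widget
slot has at the time
.code button
is defined.

.verb
  (defstruct widget nil
    (:static tag "widget"))

  ;; repeated static slot with no init-form
  (defstruct button widget
    (:static tag))

  ;; initial value copied from the supertype
  (slot (new button) 'tag) -> "widget"

  ;; storing through button affects only button's instance of the slot
  (set (slot (new button) 'tag) "button")

  (slot (new widget) 'tag) -> "widget"
  (slot (new button) 'tag) -> "button"
.brev
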
The initialization for slots which are specified using the .code :method or .code :function specifiers is reordered with regard to .code :static slots. Regardless of their placement in the .code defstruct form, .code :method and .code :function slots are initialized before .code :static slots. This ordering is useful, because it means that when the initialization expression for a given static slot constructs an instance of the struct type, any instance initialization code executing for that instance can use all functions and methods of the struct type. However, note that the static slots which follow that slot in the .code defstruct syntax are not yet initialized. If it is necessary for a structure's initialization code to have access to all static slots, even when the structure is instantiated during the initialization of a static slot, a possible solution may be to use lazy instantiation using the .code lnew operator, rather than ordinary eager instantiation via .codn new . It is also necessary to ensure that the instance isn't accessed until all static initializations are complete, since access to the instance slots of a lazily instantiated structure triggers its initialization. The structure name is specified using two forms, plain .meta name or the syntax .mono .meti >> ( name << arg *) .onom If the second form is used, then the structure type will support "boa construction", where "boa" stands for "by order of arguments". The .metn arg s specify the list of slot names which are to be initialized in the by-order-of-arguments style. For instance, if three slot names are given, then those slots can be optionally initialized by giving three arguments in the .code new macro or the .code make-struct function. Slots are first initialized according to their .metn init-form s, regardless of whether they are involved in boa construction.
A slot initialized in this style still has a .meta init-form which is processed independently of the existence of, and prior to, boa construction. The boa constructor syntax can specify optional parameters, delimited by a colon, similarly to the .code lambda syntax. However, the optional parameters may not be arbitrary symbols; they must be symbols which name slots. Moreover, the .mono .meti >> ( name < init-form <> [ present-p ]) .onom optional parameter syntax isn't supported. When boa construction is invoked with optional arguments missing, the default values for those arguments come from the .metn init-form s in the remaining .code defstruct syntax. .TP* Examples: .verb (defvar *counter* 0) ;; New struct type foo with no super type: ;; Slots a and b initialize to nil. ;; Slot c is initialized by value of (inc *counter*). (defstruct foo nil (a b (c (inc *counter*)))) (new foo) -> #S(foo a nil b nil c 1) (new foo) -> #S(foo a nil b nil c 2) ;; New struct bar inheriting from foo. (defstruct bar foo (c 0) (d 100)) (new bar) -> #S(bar a nil b nil c 0 d 100) (new bar) -> #S(bar a nil b nil c 0 d 100) ;; counter was still incremented during ;; construction of d: *counter* -> 4 ;; override slots with new arguments (new foo a "str" c 17) -> #S(foo a "str" b nil c 17) *counter* -> 5 ;; boa initialization (defstruct (point x : y) nil (x 0) (y 0)) (new point) -> #S(point x 0 y 0) (new (point 1 1)) -> #S(point x 1 y 1) ;; property list style initialization ;; can always be used: (new point x 4 y 5) -> #S(point x 4 y 5) ;; boa applies last: (new (point 1 1) x 4 y 5) -> #S(point x 1 y 1) ;; boa with optional argument omitted: (new (point 1)) -> #S(point x 1 y 0) ;; boa with optional argument omitted and ;; with property list style initialization: (new (point 1) x 5 y 5) -> #S(point x 1 y 5) .brev .coNP Macro @ defmeth .synb .mets (defmeth < type-name < name < param-list << body-form *) .syne .desc Unless .meta name is one of the two symbols .code :init or .codn :postinit , 
the .code defmeth macro installs a function into the static slot named by the symbol .meta name in the struct type indicated by .metn type-name . If the structure type doesn't already have such a static slot, it is first added, as if by the .code static-slot-ensure function, subject to the same checks. If the function has at least one argument, it can be used as a method. In that situation, the leftmost argument passes the structure instance on which the method is being invoked. The function takes the arguments specified by the .meta param-list symbols, and its body consists of the .metn body-form s. The .metn body-form s are placed into a .code block named .codn name . A method named .code lambda allows a structure to be used as if it were a function. When a structure is applied to arguments, as if it were a function, the .code lambda method is invoked with those arguments, with the object itself inserted into the leftmost argument position. If .code defmeth is used to redefine an existing method, the semantics can be inferred from that of .codn static-slot-ensure . In particular, the method will be imposed into all subtypes which inherit (do not override) the method. If .meta name is the keyword symbol .codn :init , then instead of operating on a static slot, the macro redefines the .meta initfun of the given structure type, as if by a call to the function .codn struct-set-initfun . Similarly, if .meta name is the keyword symbol .codn :postinit , then the macro redefines the .meta postinitfun of the given structure type, as if by a call to the function .codn struct-set-postinitfun . When redefining .code :initfun the admonishments given in the description of .code struct-set-initfun apply: if the type has an .meta initfun generated by the .code defstruct macro, then that .meta initfun is what implements all of the slot initializations given in the slot specifier syntax. These initializations are lost if the .meta initfun is overwritten. 
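
The static-slot case described above can be illustrated with a small sketch.
The types
.code greeter
and
.codn adder ,
and their slots, are hypothetical names introduced only for this illustration;
the results shown assume the behavior described in this section:
.verb
  ;; Hypothetical types, used only for illustration.
  (defstruct greeter nil name)
  (defstruct adder nil n)

  ;; Installs a function into the static slot hello of greeter.
  ;; The leftmost parameter receives the instance.
  (defmeth greeter hello (self)
    (list 'hello self.name))

  (let ((g (new greeter name "Bob")))
    g.(hello))
  -> (hello "Bob")

  ;; A method named lambda lets an adder instance be
  ;; applied to arguments as if it were a function.
  (defmeth adder lambda (self x)
    (+ self.n x))

  [(new adder n 10) 5] -> 15
.brev
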
The .code defmeth macro returns a method name: a unit of syntax of the form .mono .meti (meth < type-name << name ) .onom which can be used as an argument to the accessor .code symbol-function and other situations. .coNP Macros @ new and @ lnew .synb .mets (new >> { name | >> ( name << arg *)} >> { slot << init-form }*) .mets (lnew >> { name | >> ( name << arg *)} >> { slot << init-form }*) .syne .desc The .code new macro creates a new instance of the structure type named by .metn name . If the structure supports "boa construction", then, optionally, the arguments may be given using the syntax .mono .meti >> ( name << arg *) .onom instead of .metn name . Slot values may also be specified by the .meta slot and .meta init-form arguments. Note: the evaluation order in .code new is surprising: namely, .metn init-form s are evaluated before .metn arg s if both are present. When the object is constructed, all default initializations take place first. If the object's structure type has a supertype, then the supertype initializations take place. Then the type's initializations take place, followed by the .meta slot .meta init-form overrides from the .code new macro, and lastly the "boa constructor" overrides. If any of the initializations abandon the evaluation of .code new by a nonlocal exit such as an exception throw, the object's finalizers, if any, are invoked. The macro .code lnew differs from new in that it specifies the construction of a lazy struct, as if by the .code make-lazy-struct function. When .code lnew is used to construct an instance, a lazy struct is returned immediately, without evaluating any of the .meta arg and .meta init-form expressions. The expressions are evaluated when any of the object's instance slots is accessed for the first time. At that time, these expressions are evaluated (in the same order as under .codn new ) and initialization proceeds in the same way. 
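
The evaluation order noted above, and the deferred evaluation arranged by
.codn lnew ,
can be observed with a small sketch. The
.code duo
type, the
.code log-eval
helper and the variables
.code *order*
and
.code lz
are hypothetical names introduced only for this illustration:
.verb
  (defstruct (duo a : b) nil (a 0) (b 0))

  (defvar *order* nil)

  ;; Record tag at the moment of evaluation, then yield val.
  (defun log-eval (tag val)
    (push tag *order*)
    val)

  (new (duo (log-eval 'boa 1)) b (log-eval 'plist 2))
  -> #S(duo a 1 b 2)

  ;; The init-form was evaluated before the boa argument:
  (reverse *order*) -> (plist boa)

  (set *order* nil)
  (defvar lz (lnew duo a (log-eval 'lazy 3)))

  *order* -> nil     ;; nothing has been evaluated yet
  lz.a    -> 3       ;; first slot access triggers initialization
  *order* -> (lazy)
.brev
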
If any of the initializations abandon the delayed initialization steps
arranged by
.code lnew
by a nonlocal exit such as an exception throw, the object's
finalizers, if any, are invoked.

Lazy initialization does not detect cycles. Immediately prior to the lazy
initialization of a struct, the struct is marked as no longer requiring
initialization. Thus, during initialization, its instance slots may be
freely accessed. Slots not yet initialized evaluate as
.codn nil .
.coNP Macros @ new* and @ lnew*
.synb
.mets (new* >> { expr | >> ( expr << arg *)} >> { slot << init-form }*)
.mets (lnew* >> { expr | >> ( expr << arg *)} >> { slot << init-form }*)
.syne
.desc
The
.code new*
and
.code lnew*
macros are variants, respectively, of
.code new
and
.codn lnew .

The difference in behavior in these macros relative to
.code new
and
.code lnew
is that the
.meta name
argument is replaced with an expression
.meta expr
which is evaluated. The value of
.meta expr
must be a struct type, or a symbol which is the name of a struct type.

With one exception, if
.meta expr0
is a compound expression, then
.mono
.meti (new* < expr0 ...)
.onom
is interpreted as
.mono
.meti (new* >> ( expr1 << args... ) ...)
.onom
where the head of
.metn expr0 ,
.metn expr1 ,
is actually the expression which is evaluated to produce the type, and
the remaining constituents of
.metn expr0 ,
.metn args ,
become the boa arguments. The same requirement applies to
.codn lnew* .

The exception is that if
.meta expr1
is the symbol
.codn dwim ,
this interpretation does not apply. Thus
.mono
.meti (new* >> [ fun << args... ] ...)
.onom
evaluates the
.mono
.meti >> [ fun << args... ]
.onom
expression, rather than treating it as
.mono
.meti (dwim < fun << args... )
.onom
where
.code dwim
would be evaluated as a variable reference expected to produce a type.
.TP* Examples:
.verb
  ;; struct with boa constructor
  (defstruct (ab a : b) () a b)

  ;; error: find-struct-type is interpreted as a variable
  (new* (find-struct-type 'ab) a 1) -> ;; error

  ;; OK: extra nesting.
  (new* ((find-struct-type 'ab)) a 1) -> #S(ab a 1 b nil)

  ;; OK: dwim brackets without nesting.
  (new* [find-struct-type 'ab] a 1) -> #S(ab a 1 b nil)

  ;; boa construction
  (new* ([find-struct-type 'ab] 1 2)) -> #S(ab a 1 b 2)
  (new* ((find-struct-type 'ab) 1 2)) -> #S(ab a 1 b 2)

  ;; mixed construction
  (new* ([find-struct-type 'ab] 1) b 2) -> #S(ab a 1 b 2)

  (let ((type (find-struct-type 'ab)))
    (new* type a 3 b 4))
  -> #S(ab a 3 b 4)

  (let ((type (find-struct-type 'ab)))
    (new* (type 3 4)))
  -> #S(ab a 3 b 4)
.brev
.coNP Macro @ with-slots
.synb
.mets (with-slots >> ({ slot | >> ( sym << slot )}*) < struct-expr
.mets \ \ << body-form *)
.syne
.desc
The
.code with-slots
macro binds lexical macros to serve as aliases for the slots of a structure.

The
.meta struct-expr
argument is expected to be an expression which evaluates to a struct
object. It is evaluated once, and its value is retained. The aliases are then
established to the slots of the resulting struct value.

The aliases are specified as zero or more expressions which consist of either
a single symbol
.meta slot
or a
.mono
.meti >> ( sym << slot )
.onom
pair. The simple form binds a macro named
.meta slot
to a slot also named
.metn slot .
The pair form binds a macro named
.meta sym
to a slot named
.metn slot .

The lexical aliases are syntactic places: assigning to an alias causes
the value to be stored into the slot which it denotes.

After evaluating
.meta struct-expr
the
.code with-slots
macro arranges for the evaluation of
.metn body-form s
in the lexical scope in which the aliases are visible.
.TP* "Dialect Notes:"
The intent of the
.code with-slots
macro is to help reduce the verbosity of code which makes multiple
references to the same slot.
Use of
.code with-slots
is less necessary in \*(TL than other Lisp dialects
thanks to the dot operator for accessing struct slots.

Lexical aliases to struct places can also be arranged with considerable
convenience using the
.code placelet
operator. However,
.code placelet
will not bind multiple aliases to multiple slots of the same object
such that the expression which produces the object is evaluated only
once.
.TP* Example:
.verb
  (defstruct point nil x y)

  ;; Here, with-slots introduces verbosity because
  ;; each slot is accessed only once. The function
  ;; is equivalent to:
  ;;
  ;; (defun point-delta (p0 p1)
  ;;   (new point x (- p1.x p0.x) y (- p1.y p0.y)))
  ;;
  ;; Also contrast with the use of placelet:
  ;;
  ;; (defun point-delta (p0 p1)
  ;;   (placelet ((x0 p0.x) (y0 p0.y)
  ;;              (x1 p1.x) (y1 p1.y))
  ;;     (new point x (- x1 x0) y (- y1 y0))))

  (defun point-delta (p0 p1)
    (with-slots ((x0 x) (y0 y)) p0
      (with-slots ((x1 x) (y1 y)) p1
        (new point x (- x1 x0) y (- y1 y0)))))
.brev
.coNP Macro @ qref
.synb
.mets (qref < object-form
.mets \ \ >> { slot | >> ( slot << arg *) | >> [ slot << arg *]}+)
.syne
.desc
The
.code qref
macro ("quoted reference") performs structure slot access. Structure slot
access is more conveniently expressed using the referencing dot notation,
which works by translating to
.code qref
syntax, according to the following equivalence:
.verb
  a.b.c.d <--> (qref a b c d)  ;; a b c d must not be numbers
.brev

(See the Referencing Dot section under Additional Syntax.)

The leftmost argument of
.code qref
is an expression which is evaluated. This argument is followed by one or more
reference designators.
If there are two or more designators, the following equivalence applies:
.verb
  (qref obj d1 d2 ...)  <---> (qref (qref obj d1) d2 ...)
.brev

That is to say,
.code qref
is applied to the object and a single designator. This must yield
an object, to which the next designator is applied as if by another
.code qref
operation, and so forth.

If the null-safe syntax
.code "(t ...)"
is present, the equivalence becomes more complicated:
.verb
  (qref (t obj) d1 d2 ...)  <---> (qref (qref (t obj) d1) d2 ...)

  (qref obj (t d1) d2 ...)  <---> (qref (t (qref obj d1)) d2 ...)
.brev

Thus,
.code qref
can be understood in terms of the semantics of the
binary form
.mono
.meti (qref < object-form << designator )
.onom

Designators come in three basic forms: a lone symbol, an ordinary compound
expression consisting of a symbol followed by arguments, or a DWIM expression
consisting of a symbol followed by arguments.

A lone symbol designator indicates the slot of that name. That is to say, the
following equivalence applies:
.verb
  (qref o n)  <-->  (slot o 'n)
.brev

where
.code slot
is the structure slot accessor function. Because
.code slot
is an accessor, this form denotes the slot as a syntactic place;
slots can be modified via assignment to the
.code qref
form and the referencing dot syntax.

The slot name being implicitly quoted is the basis of the term
"quoted reference", giving rise to the
.code qref
name.

A compound designator indicates that the named slot is a function,
which is to be applied to arguments. The following equivalence applies in this
case, except that
.code o
is evaluated only once:
.verb
  (qref o (n arg ...)) <--> (call (slot o 'n) o arg ...)
.brev

A DWIM designator similarly indicates that the named slot is a function,
which is to be applied to arguments. The following equivalence applies:
.verb
  (qref obj [name arg ...])  <-->  [(slot obj 'name) obj arg ...]
.brev

Therefore, under this equivalence, this syntax provides the usual Lisp-1-style
evaluation rule via the
.code dwim
operator.

If the
.meta object-form
has the syntax
.mono
.meti (t << expression )
.onom
this indicates null-safe access: if
.meta expression
evaluates to
.code nil
then the entire
.mono
.meti (qref (t << expression ) << designator )
.onom
form yields
.codn nil .
This syntax is produced by the
.code .?
notation.
The null-safe access notation prevents not only slot access, but also method or function calls on .codn nil . When a method or function call is suppressed due to the object being .codn nil , no aspect of the method or function call is evaluated; not only is the slot not accessed, but the argument expressions are not evaluated. .TP* Example: .verb (defstruct foo nil (array (vec 1 2 3)) (increment (lambda (self index delta) (inc [self.array index] delta)))) (defvarl s (new foo)) ;; access third element of s.array: [s.array 2] --> 3 ;; increment first element of array by 42 s.(increment 0 42) --> 43 ;; access array member s.array --> #(43 2 3) .brev Note how .code increment behaves much like a single-argument-dispatch object-oriented method. Firstly, the syntax .mono s.(increment 0 42) .onom effectively selects the .code increment function which is particular to the .code s object. Secondly, the object is passed to the selected function as the leftmost argument, so that the function has access to the object. .coNP Macro @ uref .synb .mets (uref >> { slot | >> ( slot << arg *) | >> [ slot << arg *]}+) .syne .desc The .code uref macro ("unbound reference") expands to an expression which evaluates to a function. The function takes exactly one argument: an object. When the function is invoked on an object, it references slots or methods relative to that object. Note: the .code uref syntax may be used directly, but it is also produced by the unbound referencing dot syntactic sugar: .verb .a --> (uref a) .?a --> (uref t a) .(f x) --> (uref (f x)) .(f x).b --> (uref (f x) b) .a.(f x).b --> (uref a (f x) b) .brev The macro may be understood in terms of the following translation scheme: .verb (uref a b ...) --> (lambda (o) (qref o a b ...)) (uref t a b ...) --> (lambda (o) (if o (qref o a b ...))) .brev where .code o is understood to be a unique symbol (for instance, as produced by the .code gensym function). 
When only one .code uref argument is present, these equivalences also hold: .verb (uref (f a b c ...)) <--> (umeth f a b c ...) (uref s) <--> (usl s) .brev The terminology "unbound reference" refers to the property that .code uref expressions produce a function which isn't bound to a structure object. The function binds a slot or method; the call to that function then binds an object to that function, as an argument. .TP* Examples: Suppose that the objects in .code list have slots .code a and .codn b . Then, a list of the .code a slot values may be obtained using: .verb (mapcar .a list) .brev because this is equivalent to .verb (mapcar (lambda (o) o.a) list) .brev Because .code uref produces a function, its result can be operated upon by functional combinators. For instance, we can use the .code juxt combinator to produce a list of two-element lists, which hold the .code a and .code b slots from each object in .codn list : .verb (mapcar (juxt .a .b) list) .brev .coNP Macro @ meth .synb .mets (meth < struct < slot << curried-expr *) .syne .desc The .code meth macro allows indirection upon a method-like function stored in a function slot. The .code meth macro binds .meta struct as the leftmost argument of the function stored in .metn slot , returning a function which takes the remaining arguments. That is to say, it returns a function .meta f such that .mono .meti >> [ f < arg ...] .onom calls .mono .meti >> [ struct.slot < struct < arg ...] .onom except that .meta struct is evaluated only once. If one or more .meta curried-expr expressions are present, their values are bound inside .meta f also, and when .meta f is invoked, these are passed to the function stored in the slot. Thus if .meta f is produced by .code "(meth struct slot c1 c2 c3 ...)" then .mono .meti >> [ f < arg ...] .onom calls .mono .meti >> [ struct.slot < struct < c1v < c2v < c3v ... < arg ...] 
.onom
except that
.meta struct
is evaluated only once, and
.metn c1v ,
.meta c2v
and
.meta c3v
are the values of expressions
.codn c1 ,
.code c2
and
.codn c3 .

The argument
.meta struct
must be an expression which evaluates to a struct.
The
.meta slot
argument is not evaluated, and must be a symbol denoting a slot.
The syntax can be understood as a translation to a call of the
.code method
function:
.verb
  (meth a b)  <-->  (method a 'b)
.brev

If
.meta curried-expr
expressions are present, the translation may be understood
as:
.verb
  (meth a b c1 c2 ...)  <-->  [(fun method) a 'b c1 c2 ...]
.brev

In other words the
.meta curried-expr
expressions are evaluated under the
.code dwim
operator evaluation rules.
.TP* Example:
.verb
  ;; struct for counting atoms eq to key
  (defstruct (counter key) nil
    key
    (count 0)
    (:method increment (self key)
      (if (eq self.key key)
        (inc self.count))))

  ;; pass all atoms in tree to func
  (defun map-tree (tree func)
    (if (atom tree)
      [func tree]
      (progn (map-tree (car tree) func)
             (map-tree (cdr tree) func))))

  ;; count occurrences of symbol a
  ;; using increment method of counter,
  ;; passed as func argument to map-tree.
  (let ((c (new (counter 'a)))
        (tr '(a (b (a a)) c a d)))
    (map-tree tr (meth c increment))
    c)
  --> #S(counter key a count 4
         increment #)
.brev
.coNP Macro @ umeth
.synb
.mets (umeth < slot << curried-expr *)
.syne
.desc
The
.code umeth
macro binds the symbol
.meta slot
to a function and returns that function.

The
.meta curried-expr
arguments, if present, are evaluated as if they were
arguments to the
.code dwim
operator.

When that function is called, it expects at least one argument.
The leftmost argument must be an object of struct type.
The slot named
.meta slot
is retrieved from that object, and is expected to be a function.
That function is called with the object, followed by the values
of the
.metn curried-expr s,
if any, followed by that function's arguments.

The syntax can be understood as a translation to a call of the
.code umethod
function:
.verb
  (umeth s ...)  <-->  [umethod 's ...]
.brev

The macro merely provides the syntactic sugar of not having to quote the
symbol, and automatically treating the curried argument expressions
using Lisp-1 semantics of the
.code dwim
operator.
.TP* Example:
.verb
  ;; seal and dog are variables which hold structures of
  ;; different types. Both have a method called bark.

  (let ((bark-fun (umeth bark)))
    [bark-fun dog]     ;; same effect as dog.(bark)
    [bark-fun seal])   ;; same effect as seal.(bark)
.brev

The
.code u
in
.code umeth
stands for "unbound". The function produced by
.code umeth
is not bound to any specific object; it binds to an object whenever it is
invoked by retrieving the actual method from the object's slot at call time.
.coNP Macro @ usl
.synb
.mets (usl << slot )
.syne
.desc
The
.code usl
macro binds the symbol
.meta slot
to a function and returns that function.

When that function is called, it expects exactly one argument.
That argument must be an object of struct type.
The slot named
.meta slot
is retrieved from that object and returned.

The name
.code usl
stands for "unbound slot". The term "unbound" refers to the returned
function not being bound to a particular object. The binding of the
slot to an object takes place whenever the function is called.
.coNP Function @ make-struct-type
.synb
.mets (make-struct-type < name < super < static-slots < slots
.mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ < static-initfun < initfun < boactor
.mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ << postinitfun )
.syne
.desc
The
.code make-struct-type
function creates a new struct type.

The
.meta name
argument must be a bindable symbol, according to the
.code bindable
function. It specifies the name property of the struct
type as well as the name under which the struct type
is globally registered.

The
.meta super
argument indicates the supertype for the struct type.
It must be either a value of type
.codn struct-type ,
a symbol which names a struct type, or else
.codn nil ,
indicating that the newly created struct type has no supertype.

The
.meta static-slots
argument is a list of symbols which specifies the static slots.
The symbols must be bindable and the list must not contain duplicates.

The
.meta slots
argument is a list of symbols which specifies the instance slots.
The symbols must be bindable and there must not be any duplicates
within the list, or against entries in the
.meta static-slots
list.

The new struct type's effective list of slots is formed by appending
together
.meta static-slots
and
.metn slots ,
and then appending that to the list of the supertype's slots, and
de-duplicating the resulting list as if by the
.code uniq
function. Thus, any slots which are already present in the supertype are
removed. If the structure has no supertype, then the list of supertype
slots is taken to be empty. When a structure is instantiated, it shall have all
the slots specified in the effective list of slots. Each instance slot
shall be initialized to the value
.codn nil ,
prior to the invocation of
.meta initfun
and
.metn boactor .

The
.meta static-initfun
argument either specifies an initialization function, or is
.codn nil ,
which is equivalent to specifying a function which does nothing.

Prior to the invocation of
.metn static-initfun ,
each new static slot shall be initialized to the value
.codn nil .
Inherited static slots retain their values from the supertype.

If specified, the
.meta static-initfun
function must accept one argument. When the structure
type is created (before the
.code make-struct-type
function returns) the
.meta static-initfun
function is invoked, passed the newly created
structure type as its argument.

The
.meta initfun
argument either specifies an initialization function, or is
.codn nil ,
which is equivalent to specifying a function which does nothing.
If specified, this function must accept one argument.
When a structure is instantiated, every
.meta initfun
in its chain of supertype ancestry is invoked, in order of inheritance,
so that the root supertype's
.meta initfun
is called first and the structure's own specific
.meta initfun
is called last. These calls occur before the slots are initialized
from the
.meta arg
arguments
or the
.meta slot-init-plist
of
.codn make-struct .
Each function is passed the newly created structure
object, and may alter its slots.
If multiple inheritance occurs, the
.meta initfun
functions of multiple supertypes are called in right-to-left order.

The
.meta boactor
argument either specifies a by-order-of-arguments initialization
function ("boa constructor") or is
.codn nil ,
which is equivalent to specifying a constructor which does nothing.
If specified, it must be a function which takes at least one argument.
When a structure is instantiated, and boa arguments are given, the
.meta boactor
is invoked, with the structure as the leftmost argument, and
the boa arguments as additional arguments. This takes place
after the processing of
.meta initfun
functions, and after the processing of the
.meta slot-init-plist
specified in the
.code make-struct
call. Note that the
.meta boactor
functions of the supertypes are not called, only the
.meta boactor
specific to the type being constructed.

The
.meta postinitfun
argument either specifies an initialization function, or is
.codn nil ,
which is equivalent to specifying a function which does nothing.
If specified, this function must accept one argument.
The
.meta postinitfun
function is similar to
.metn initfun .
The difference is that
.meta postinitfun
functions are called after all other initialization processing,
rather than before. They are also called in order of inheritance:
the
.meta postinitfun
of a structure's supertype is called before its own,
and in right-to-left order among multiple supertypes under multiple
inheritance.
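
The ordering of
.metn initfun ,
.meta slot-init-plist
and
.meta boactor
processing described above can be sketched as follows. The type name
.code my-point
and its slots are hypothetical, chosen only for this illustration, and the
slots are accessed with
.code slot
and
.code slotset
so that the sketch does not depend on any other notation:
.verb
  ;; Hypothetical type: no supertype, no static slots,
  ;; instance slots x and y.
  (make-struct-type 'my-point nil nil '(x y)
                    nil                    ;; static-initfun
                    (lambda (obj)          ;; initfun: defaults
                      (slotset obj 'x 0)
                      (slotset obj 'y 0))
                    (lambda (obj x y)      ;; boactor
                      (slotset obj 'x x)
                      (slotset obj 'y y))
                    nil)                   ;; postinitfun

  ;; initfun only:
  (make-struct 'my-point nil) -> #S(my-point x 0 y 0)

  ;; initfun, then the property list:
  (make-struct 'my-point '(y 5)) -> #S(my-point x 0 y 5)

  ;; initfun, then the property list, then the boactor:
  (make-struct 'my-point '(y 5) 1 2) -> #S(my-point x 1 y 2)
.brev
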
.coNP Function @ find-struct-type
.synb
.mets (find-struct-type << name )
.syne
.desc
The
.code find-struct-type
function returns a
.code struct-type
object corresponding to the symbol
.metn name .
If no struct type is registered under
.metn name ,
then it returns
.codn nil .

A
.code struct-type
object exists for each structure type and holds information about it.
These objects are not themselves structures and are all of the
same type,
.codn struct-type .
.coNP Function @ struct-type-p
.synb
.mets (struct-type-p << obj )
.syne
.desc
The
.code struct-type-p
function returns
.code t
if
.meta obj
is a structure type, otherwise it returns
.codn nil .
A structure type is an object of type
.codn struct-type ,
returned by
.codn find-struct-type .
.coNP Function @ struct-type-name
.synb
.mets (struct-type-name << type-or-struct )
.syne
.desc
The
.code struct-type-name
function determines a structure type from the
.meta type-or-struct
argument and returns that structure type's symbolic name.

The
.meta type-or-struct
argument must be either a struct type object (such as the return value of a
successful lookup via
.codn find-struct-type ),
a symbol which names a struct type, or else a struct instance.
.coNP Function @ super
.synb
.mets (super <> [ type-or-struct ])
.syne
.desc
The
.code super
function determines a structure type from the
.meta type-or-struct
argument and returns the struct type object which is
the supertype of that type, or else
.code nil
if that type has no supertype.

The
.meta type-or-struct
argument must be either a struct type object,
a symbol which names a struct type, or else a struct instance.
.coNP Function @ make-struct
.synb
.mets (make-struct < type < slot-init-plist << arg *)
.syne
.desc
The
.code make-struct
function returns a new object which is an instance of the
structure type
.metn type .

The
.meta type
argument must either be a
.code struct-type
object, or else a symbol which is the name of a structure.
The .meta slot-init-plist argument gives a list of slot initializations in the style of a property list, as defined by the .code prop function. It may be empty, in which case it has no effect. Otherwise, it specifies slot names and their values. Each slot name which is given must be a slot of the structure type. The corresponding value will be stored into the slot of the newly created object. If a slot is repeated, it is unspecified which value takes effect. The optional .metn arg s specify arguments to the structure type's boa constructor. If the arguments are omitted, the boa constructor is not invoked. Otherwise the boa constructor is invoked on the structure object and those arguments. The argument list must match the trailing parameters of the boa constructor (the remaining parameters which follow the leftmost argument which passes the structure to the boa constructor). When a new structure is instantiated by .codn make-struct , its slot values are first initialized by the structure type's registered functions as described under .codn make-struct-type . Then, the .meta slot-init-plist is processed, if not empty, and finally, the .metn arg s are processed, if present, and passed to the boa constructor. If any of the initializations abandon the evaluation of .code make-struct by a nonlocal exit such as an exception throw, the object's finalizers, if any, are invoked. .coNP Function @ make-lazy-struct .synb .mets (make-lazy-struct < type << argfun ) .syne .desc The .code make-lazy-struct function returns a new object which is an instance of the structure type .metn type . The .meta type argument must either be a .code struct-type object, or else a symbol which is the name of a structure. The .meta argfun argument should be a function which can be called with no parameters and returns a cons cell. More requirements are specified below. The object returned by .code make-lazy-struct is a lazily-initialized struct instance, or .IR "lazy struct" . 
A lazy struct remains uninitialized until just before the first access
to any of its instance slots. Just before an instance slot is
accessed, initialization
takes place as follows. The
.meta argfun
function is invoked with no arguments. Its return value must be a cons
cell. The
.code car
of the cons cell is taken to be a property list, as defined by the
.code prop
function. The
.code cdr
field is taken to be a list of arguments. These values are treated
as if they were, respectively, the
.meta slot-init-plist
and the boa constructor arguments given in a
.code make-struct
invocation. Initialization of the structure proceeds as described
in the description of
.codn make-struct .
.coNP Functions @ struct-from-plist and @ struct-from-args
.synb
.mets (struct-from-plist < type >> { slot << value }*)
.mets (struct-from-args < type << arg *)
.syne
.desc
The
.code struct-from-plist
and
.code struct-from-args
functions are interfaces to the
.code make-struct
function.

The
.code struct-from-plist
function passes its
.meta slot
and
.meta value
arguments as the
.meta slot-init-plist
argument of
.codn make-struct .
It passes no boa constructor arguments.

The
.code struct-from-args
function calls
.code make-struct
with an empty
.metn slot-init-plist ,
passing down the list of
.metn arg s.

The following equivalences hold:
.verb
  (struct-from-plist a s0 v0 s1 v1 ...)
  <--> (make-struct a (list s0 v0 s1 v1 ...))

  (struct-from-args a v0 v1 v2 ...)
  <--> (make-struct a nil v0 v1 v2 ...)
.brev
.coNP Function @ allocate-struct
.synb
.mets (allocate-struct << type )
.syne
.desc
The
.code allocate-struct
function provides a low-level allocator for structure objects.

The
.meta type
argument must either be a
.code struct-type
object, or else a symbol which is the name of a structure.

The
.code allocate-struct
function creates and returns a new instance of
.meta type
all of whose instance slots take on the value
.codn nil .
No initializations are performed.
The struct type's registered initialization functions are not invoked. .coNP Function @ copy-struct .synb .mets (copy-struct << struct-obj ) .syne .desc The .code copy-struct function creates and returns a new object which is a duplicate of .metn struct-obj , which must be a structure. The duplicate object is a structure of the same type as .meta struct-obj and has the same slot values. The creation of a duplicate does not involve calling any of the struct type's initialization functions. Only instance slots participate in the duplication. Since the original structure and copy are of the same structure type, they already share static slots. This is a low-level, "shallow" copying mechanism. If an object design calls for a higher level cloning mechanism with deep copying or other additional semantics, one can be built on top of .codn copy-struct . For instance, a structure can have a .code copy method similar to the following: .verb (:method copy (me) (let ((my-copy (copy-struct me))) ;; inform the copy that it has been created ;; by invoking its copied method. my-copy.(copied) my-copy)) .brev which can then be invoked on whatever object needs copying. Note that a method named .code copy is a special structure function. When an object provides this method, the .code copy function uses the method to copy the object, rather than using .codn copy-struct . Since this logic is generic, it can be placed in a base method. The .code copied method which it calls is the means by which the new object is notified that it is a copy. This method takes on whatever special responsibilities are required when a copy is produced, such as registering the object in various necessary associations, or performing a deeper copy of some of the objects held in the slots. The .code copied handler can be implemented at multiple levels of an inheritance hierarchy. The initial call to .code copied from .code copy will call the most derived override of that method. 
To call the corresponding method in the base class, a given derived method can use the .code call-super-fun function, or else the .code "(meth ...)" syntax in the first position of a compound form, in place of a function name. Examples of both are given in the documentation for .codn call-super-fun . Thus derived structs can inherit the copy handling logic from base structs, and extend it with their own. .coNP Accessor @ slot .synb .mets (slot < struct-obj << slot-name ) .mets (set (slot < struct-obj << slot-name ) << new-value ) .syne .desc The .code slot function retrieves a structure's slot. The .meta struct-obj argument must be a structure, and .meta slot-name must be a symbol which names a slot in that structure. Because .code slot is an accessor, a .code slot form is a syntactic place which denotes the slot's storage location. A syntactic place expressed by .code slot does not support deletion. .coNP Function @ slotset .synb .mets (slotset < struct-obj < slot-name << new-value ) .syne .desc The .code slotset function stores a value in a structure's slot. The .meta struct-obj argument must be a structure, and .meta slot-name must be a symbol which names a slot in that structure. The .meta new-value argument specifies the value to be stored in the slot. If a successful store takes place to an instance slot of .metn struct-obj , then the dirty flag of that object is set, causing the .code test-dirty function to report true for that object. The .code slotset function returns .metn new-value . .coNP Functions @, test-dirty @ clear-dirty and @ test-clear-dirty .synb .mets (test-dirty << struct-obj ) .mets (clear-dirty << struct-obj ) .mets (test-clear-dirty << struct-obj ) .syne .desc The .codn test-dirty , .code clear-dirty and .code test-clear-dirty functions comprise the interface for interacting with structure dirty flags. Each structure instance has a dirty flag. 
When this flag is set, the structure instance is said to be dirty, otherwise it is said to be clean. A newly created structure is dirty. A structure remains dirty until its dirty flag is explicitly reset. If a structure is clean, and one of its instance slots is overwritten with a new value, it becomes dirty. The .code test-dirty function returns the dirty flag of .metn struct-obj : .code t if .meta struct-obj is dirty, otherwise .codn nil . The .code clear-dirty function clears the dirty flag of .meta struct-obj and returns .meta struct-obj itself. The .code test-clear-dirty function combines these operations: it makes a note of the dirty flag of .meta struct-obj and clears it. Then it returns the noted value, .code t or .codn nil . .coNP Function @ structp .synb .mets (structp << obj ) .syne .desc The .code structp function returns .code t if .meta obj is a structure, otherwise it returns .codn nil . .coNP Function @ struct-type .synb .mets (struct-type << struct-obj ) .syne .desc The .code struct-type function returns the structure type object which represents the type of the structure object instance .metn struct-obj . .coNP Function @ clear-struct .synb .mets (clear-struct < struct-obj <> [ value ]) .syne .desc The .code clear-struct function replaces all instance slots of .meta struct-obj with .metn value , which defaults to .code nil if omitted. Note that finalizers are not executed prior to replacing the slot values. .coNP Function @ reset-struct .synb .mets (reset-struct << struct-obj ) .syne .desc The .code reset-struct function reinitializes the structure object .meta struct-obj as if it were being newly created. First, all the slots are set to .code nil as if by the .code clear-struct function. Then the slots are initialized by invoking the initialization functions, in order of the supertype ancestry, just as would be done for a new structure object created by .code make-struct with an empty .meta slot-init-plist and no boa arguments.
Note that finalizers registered against .meta struct-obj are not invoked prior to the reset operation, and remain registered. If the structure has state which is cleaned up by finalizers, it is advisable to invoke them using .code call-finalizers prior to using .codn reset-struct , or to take other measures to deal with the situation. If the structure specifies .code :fini handlers, then the reinitialization will cause these to be registered, just like when a new object is constructed. Thus if .code call-finalizers is not used prior to .codn reset-struct , this will result in the existence of duplicate registrations of the finalization functions. Finalizers registered against .meta struct-obj .B are invoked if an exception is thrown during the reinitialization, just like when a new structure is being constructed. .coNP Function @ replace-struct .synb .mets (replace-struct < target-obj << source-obj ) .syne .desc The .code replace-struct function causes .meta target-obj to take on the attributes of .meta source-obj without changing its identity. The type of .code target-obj is changed to that of .codn source-obj . All instance slots of .code target-obj are discarded, and it is given new slots, which are copies of the instance slots of .codn source-obj . Because of the type change, .code target-obj implicitly loses all of its original static slots, and acquires those of .codn source-obj . Note that finalizers registered against .meta target-obj are not invoked, and remain registered. If .meta target-obj has state which is cleaned up by finalizers, it is advisable to invoke them using .code call-finalizers prior to using .codn replace-struct , or to take other measures to handle the situation. If the .meta target-obj and .meta source-obj arguments are the same object, .code replace-struct has no effect. The return value is .metn target-obj .
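.TP* Example:

The following is a minimal sketch of the identity-preserving behavior described above. The struct names
.code circle
and
.code square
are hypothetical, chosen only for illustration.

.verb
  (defstruct circle () (radius 1))
  (defstruct square () (side 2))

  (let* ((obj (new circle))
         (alias obj))
    ;; obj takes on the type and slots of the square,
    ;; yet remains the same object that alias refers to
    (replace-struct obj (new square))
    (list (eq obj alias) (typeof obj) obj.side))
  --> (t square 2)
.brev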
.coNP Function @ method .synb .mets (method < struct-obj < slot-name << curried-arg *) .syne .desc The .code method function retrieves a function .meta m from a structure's slot and returns a new function which binds that function's left argument. If .meta curried-arg arguments are present, then they are also stored in the returned function. These are the .IR "curried arguments" . The .meta struct-obj argument must be a structure, and .meta slot-name must be a symbol denoting a slot in that structure. The slot must hold a function of at least one argument. The function .meta f which the .code method function returns, when invoked, calls the function .meta m previously retrieved from the object's slot, passing to that function .meta struct-obj as the leftmost argument, followed by the curried arguments, followed by all of .metn f 's own arguments. Note: the .code meth macro is an alternative interface which is suitable if the slot name isn't a computed value. .coNP Function @ super-method .synb .mets (super-method < struct-obj << slot-name ) .syne .desc The .code super-method function retrieves a function from a static slot belonging to one of the direct supertypes of the structure type of .metn struct-obj . It then returns a function which binds that function's left argument to the structure. The .meta struct-obj argument must be a structure which has at least one supertype, and .meta slot-name must be a symbol denoting a static slot in one of those supertypes. The slot must hold a function of at least one argument. The supertypes are searched from left to right for a static slot named .metn slot-name ; when the first such slot is found, its value is used. The .code super-method function returns a function which, when invoked, calls the function previously retrieved from the supertype's static slot, passing to that function .meta struct-obj as the leftmost argument, followed by the function's own arguments.
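.TP* Example:

The following sketch illustrates both
.code method
and
.codn super-method .
The
.code animal
and
.code dog
struct types, and their
.code speak
method, are hypothetical names invented for this illustration.

.verb
  (defstruct animal ()
    (name "Rex")
    (:method speak (me greeting)
      `@{greeting}, I am @{me.name}`))

  (defstruct dog animal
    (:method speak (me greeting)
      `@{greeting}, woof`))

  (let* ((d (new dog))
         ;; bind d as the left argument, and curry "Hello"
         (f (method d 'speak "Hello"))
         ;; speak taken from the supertype animal, bound to d
         (g (super-method d 'speak)))
    (list (call f) (call g "Hi")))
  --> ("Hello, woof" "Hi, I am Rex")
.brev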
.coNP Function @ umethod .synb .mets (umethod < slot-name << curried-arg *) .syne .desc The .code umethod function returns a function which represents the set of all methods named by the slot .meta slot-name in all structure types, including ones not yet defined. The .meta slot-name argument must be a symbol. If one or more .meta curried-arg arguments are present, these values represent the .I "curried arguments" which are stored in the function object which is returned. This returned function must be called with at least one argument. Its leftmost argument must be an object of structure type, which has a slot named .metn slot-name . The function will retrieve the value of the slot from that object, expecting it to be a function, and calls it, passing to it the following arguments: the object itself; all of the curried arguments, if any; and all of its remaining arguments. Note: the .code umethod name stands for "unbound method". Unlike the .code method function, .code umethod doesn't return a method whose leftmost argument is already bound to an object; the binding occurs at call time. .coNP Function @ uslot .synb .mets (uslot << slot-name ) .syne .desc The .code uslot function returns a function which represents all slots named .meta slot-name in all structure types, including ones not yet defined. The .meta slot-name argument must be a symbol. The returned function must be called with exactly one argument. The argument must be a structure which has a slot named .metn slot-name . The function will retrieve the value of the slot from that object and return it. Note: the .code uslot name stands for "unbound slot". The returned function isn't bound to a particular object. The binding of .code slot-name to a slot in the structure object occurs when the function is called. .coNP Function @ slots .synb .mets (slots << type ) .syne .desc The .code slots function returns a list of all of the slots of struct type .metn type .
The .meta type argument must be a structure type, or else a symbol which names a structure type. .coNP Function @ slotp .synb .mets (slotp < type << name ) .syne .desc The .code slotp function returns .code t if name .meta name is a symbol which names a slot in the structure type .metn type . Otherwise it returns .codn nil . The .meta type argument must be a structure type, or else a symbol which names a structure type. .coNP Function @ static-slot-p .synb .mets (static-slot-p < type << name ) .syne .desc The .code static-slot-p function returns .code t if name .meta name is a symbol which names a slot in the structure type .metn type , and if that slot is a static slot. Otherwise it returns .codn nil . The .meta type argument must be a structure type, or else a symbol which names a structure type. .coNP Function @ static-slot .synb .mets (static-slot < type << name ) .syne .desc The .code static-slot function retrieves the value of the static slot named by symbol .meta name of the structure type .metn type . The .meta type argument must be a structure type or a symbol which names a structure type, and .meta name must be a static slot of this type. .coNP Function @ static-slot-set .synb .mets (static-slot-set < type < name << new-value ) .syne .desc The .code static-slot-set function stores .meta new-value into the static slot named by symbol .meta name of the structure type .metn type . It returns .metn new-value . The .meta type argument must be a structure type or the name of a structure type, and .meta name must be a static slot of this type. .coNP Function @ static-slot-ensure .synb .mets (static-slot-ensure < type < name < new-value <> [ no-error-p ]) .syne .desc The .code static-slot-ensure ensures, if possible, that the struct type .metn type , as well as possibly one or more struct types derived from it, have a static slot called .metn name , that this slot is not shared with a supertype, and that the value stored in it is .metn new-value . 
Note: this function supports the redefinition of methods, as the implementation underlying the .code defmeth macro; its semantics is designed to harmonize with expected behaviors in that usage. The function operates as follows. If .meta type itself already has an instance slot called .meta name then an error is thrown, and the function has no effect, unless a true argument is specified for the .meta no-error-p Boolean parameter. In that case, in the same situation, the function has no effect and simply returns .metn new-value . If .meta type already has a non-inherited static slot called .meta name then this slot is overwritten with .meta new-value and the function returns .metn new-value . Types derived from .meta type may also have this slot, via inheritance; consequently, its value changes in those types also. If .meta type already has an inherited static slot called .meta name then its inheritance is severed; the slot is converted to a non-inherited static slot of .meta type and initialized with .metn new-value . Then all struct types derived from .meta type are scanned. In each such type, if the original inherited static slot is found, it is replaced with the same newly converted static slot that was just introduced into .metn type , so that all these types now inherit this new slot from .meta type rather than the original slot from some supertype of .metn type . These types all share a single instance of the slot with .metn type , but not with supertypes of .metn type . In the remaining case, .meta type has no slot called .metn name . The slot is added as a static slot to .metn type . Then it is added to every struct type derived from .meta type which does not already have a slot by that name, as if by inheritance. That is to say, types to which this slot is introduced share a single instance of that slot. The value of the new slot is .metn new-value , which is also returned from the function. 
Any subtypes of .meta type which already have a slot called .meta name are ignored, as are their subtypes. .coNP Function @ static-slot-home .synb .mets (static-slot-home < type << name ) .syne .desc The .code static-slot-home function determines which structure type actually defines the static slot .meta name present in struct type .metn type . If .meta type isn't a struct type, or the name of a struct type, the function throws an error. Likewise, an error is thrown if .meta name isn't a static slot of .metn type . If .meta name is a static slot of .meta type then the function returns a struct type name symbol which is either the name of .meta type itself, if the slot is defined specifically for .metn type , or else the most distant ancestor of .meta type from which the slot is inherited. .coNP Function @ call-super-method .synb .mets (call-super-method < struct-obj < name << argument *) .syne .desc The .code call-super-method function is deprecated. Solutions involving .code call-super-method should be reworked in terms of .codn call-super-fun . The .code call-super-method retrieves the function stored in the static slot .meta name of one of the direct supertypes of .meta struct-obj and invokes it, passing to that function .meta struct-obj as the leftmost argument, followed by the given .metn argument s, if any. The .meta struct-obj argument must be of structure type. Moreover, that structure type must be derived from one or more supertypes, and .meta name must name a static slot available from at least one of those supertypes. The supertypes are searched left to right in search of this slot. The object retrieved from that static slot must be callable as a function, and accept the arguments. Note that it is not correct for a method that is defined against a particular type to use .code call-super-method to call the same method (or any other method) in the supertype of that particular type.
This is because .code call-super-method refers to the type of the object instance .metn struct-obj , not to the type against which the calling method is defined. .coNP Function @ call-super-fun .synb .mets (call-super-fun < type < name << argument *) .syne .desc The .code call-super-fun retrieves the function stored in the slot .meta name of one of the supertypes of .meta type and invokes it, passing to that function the given .metn argument s, if any. The .meta type argument must be a structure type. Moreover, that structure type must be derived from one or more supertypes, and .meta name must name a static slot available from at least one of those supertypes. The supertypes are searched left to right in search of this slot. The object retrieved from that static slot must be callable as a function, and accept the arguments. .TP* Example: Print a message and call supertype method: .verb (defstruct base nil) (defstruct derived base) (defmeth base fun (obj arg) (format t "base fun method called with arg ~s\en" arg)) (defmeth derived fun (obj arg) (format t "derived fun method called with arg ~s\en" arg) (call-super-fun 'derived 'fun obj arg)) ;; Interactive Listener: 1> (new derived).(fun 42) derived fun method called with arg 42 base fun method called with arg 42 .brev Note that a static method or function in any structure type can be invoked by using the .code "(meth ...)" name syntax in the first position of a compound form, as a function name. Thus, the above .code "derived fun" can also be written: .verb (defmeth derived fun (obj arg) (format t "derived fun method called with arg ~s\en" arg) ((meth base fun) obj arg)) .brev .coNP Functions @ struct-get-initfun and @ struct-get-postinitfun .synb .mets (struct-get-initfun << type ) .mets (struct-get-postinitfun << type ) .syne .desc The .code struct-get-initfun and .code struct-get-postinitfun functions retrieve, respectively, a structure type's .meta initfun and .meta postinitfun functions. 
These are the functions which are initially configured in the call to .code make-struct-type via the .meta initfun and .meta postinitfun arguments. Either one may be .codn nil , indicating that the type has no .meta initfun or .metn postinitfun . .coNP Functions @ struct-set-initfun and @ struct-set-postinitfun .synb .mets (struct-set-initfun < type << function ) .mets (struct-set-postinitfun < type << function ) .syne .desc The .code struct-set-initfun and .code struct-set-postinitfun functions overwrite, respectively, a structure type's .meta initfun and .meta postinitfun functions. These are the functions which are initially configured in the call to .code make-struct-type via the .meta initfun and .meta postinitfun arguments. The .meta function argument must either be .code nil or else a function which accepts one argument. Note that .meta initfun has the responsibility for all instance slot initializations. The .code defstruct syntax compiles the initializing expressions in the slot specifier syntax into statements which are placed into a function, which becomes the .meta initfun of the struct type. .coNP Macro @ with-objects .synb .mets (with-objects >> ({( sym << init-form )}*) << body-form *) .syne .desc The .code with-objects macro provides a binding construct similar to .codn let* . Each .meta sym must be a symbol suitable for use as a variable name. Each .meta init-form is evaluated in sequence, and a binding is established for its corresponding .meta sym which is initialized with the value of that form. The binding is visible to subsequent .metn init-form s. Additionally, the values of the .metn init-form s are noted as they are produced. When the .code with-objects form terminates, by any means, the .code call-finalizers function is invoked on each value which was returned by an .meta init-form and had been noted. These calls are performed in the reverse order relative to the original evaluation of the forms. 
After the variables are established and initialized, the .metn body-form s are evaluated in the scope of the variables. The value of the last form is returned, or else .code nil if there are no forms. The invocations of .code call-finalizers take place just before the value of the last form is returned. .coNP Macro @ define-struct-clause .synb .mets (define-struct-clause < keyword < params <> [ body-form ]*) .syne .desc The .code define-struct-clause macro makes available a new, application-defined .code defstruct clause. The clause is named by .metn keyword , which must be a keyword symbol, and is implemented as a macro transformation by the .meta params and .metn body-form s of the definition. The definition established by .code define-struct-clause is called a .IR "struct clause macro" . A struct clause macro is invoked when .code defstruct syntax is processed which contains one or more clauses which are headed by the matching .meta keyword symbol. The .meta params comprise a macro-style parameter list which must match the invoking clause, otherwise an error exception is thrown. When .meta params successfully matches the clause parameters, the parameters are destructured into the parameters and the .metn body-form s are evaluated in the scope of those parameters. The .metn body-form s must return a possibly empty list of .code defstruct clauses, not a single clause. Each of the returned clauses is examined for the possibility that it may be a struct clause macro; if so, it is expanded. The built-in clause keywords .codn :static , .codn :instance , .codn :function , .codn :method , .codn :init , .codn :postinit , .code :fini and .code :postfini may not be used as the names of a struct clause macro; if any of these symbols is used as the .meta keyword parameter of .codn define-struct-clause , an error exception is thrown. The return value of a .code define-struct-clause macro invocation is the .meta keyword argument.
.TP* Examples: .verb ;; Trivial struct clause macro which consumes any number of ;; arguments and produces no slots: (define-struct-clause :nothing (. ignored-args)) ;; Consequently, the following defines a struct with one slot, x: ;; The (:nothing ...) clause disappears by producing no clauses. (defstruct foo () (:nothing 1 2 3 beeblebrox) x) ;; struct clause macro called :multi which takes an initial value ;; and zero or more slot names. It produces instance slot definitions ;; which all use that same initial value. (define-struct-clause :multi (init-val . names) (mapcar (lop list init-val) names)) ;; define a struct with three slots initialized to zero: (defstruct bar () (:multi 0 a b c)) ;; expands to (a 0) (b 0) (c 0) ;; struct clause macro to define a slot along with a ;; get and set method. (define-struct-clause :getset (slot getter setter : init-val) ^((,slot ,init-val) (:method ,getter (obj) obj.,slot) (:method ,setter (obj new) (set obj.,slot new)))) ;; Example use: (defstruct point () (:getset x get-x set-x 0) (:getset y get-y set-y 0)) ;; This has exactly the same effect as the following defstruct: (defstruct point () (x 0) (y 0) (:method get-x (obj) obj.x) (:method set-x (obj new) (set obj.x new)) (:method get-y (obj) obj.y) (:method set-y (obj new) (set obj.y new))) .brev .coNP Struct Clause Macro @ :delegate .synb .mets (:delegate < name <> ( param +) < delegate-expr <> [ target-name ]) .syne .desc The .code :delegate struct clause macro provides a way to define a method which is implemented entirely by delegation to a different object. The name of the method is .meta name and its parameter list is specified in the same way as in the .code :method clause. Instead of a method body, the .code :delegate clause has an expression .meta delegate-expr and an optional .meta target-name which defaults to .metn name . The .meta delegate-expr must be an expression which the delegate method can evaluate to produce a delegate object.
The delegate method then passes its arguments to the target method, given by the .meta target-name argument, invoked on the delegate object. If the delegate method specifies an optional parameter without a default initializing expression, and that optional parameter is not given an argument value, it receives the colon symbol .code : as its argument. That value is passed on to the corresponding parameter of the delegate target method. Thus, if the target method has an optional parameter in that same parameter position, that colon symbol argument then has the effect of requesting the default value. If the target method has an ordinary parameter in that position, then the colon symbol is received as an ordinary argument value. If the delegate method specifies an optional parameter with a default initializing expression, and that optional parameter is not given an argument value, then the expression is evaluated to produce a value for that parameter, in the usual manner, and that value is passed as an argument to the corresponding parameter of the delegate target. Thus, delegates are able to specify different optional argument defaulting from their targets. A delegate may have an optional parameter in a position where the target has a required parameter and vice versa. The three-element optional parameter expression, specifying a Boolean variable which indicates whether the optional parameter has been given an argument, is not supported by the .code :delegate clause, and is diagnosed. If the delegate method has variadic parameters, they are passed on to the target after the fixed parameters. .TP* Example: Structure definitions: .verb (defstruct worker () name (:method work (me) `worker @{me.name} works`) (:method relax (me : (min 15)) `worker @{me.name} relaxes for @min min`)) ;; "contractor" class has a sub ("subcontractor") slot ;; which is another contractor of the same type. ;; The subcontractor's own sub slot, however is going ;; to be a worker. 
(defstruct contractor () sub (:delegate work (me) me.sub.sub) (:delegate break (me : min) me.sub.sub relax)) .brev The .code contractor structure's .code work and .code break methods delegate to the sub-subcontractor, which is going to be instantiated as a .code worker object. Note that the .code break method delegates to a differently named method .codn relax . .verb ;; The objects are set up as described above. ;; general contractor co has a co.sub subcontractor, ;; and co.sub.sub is a worker: (defvar co (new contractor sub (new contractor sub (new worker name "foo")))) ;; Call work method on general contractor: ;; this invokes co.sub.sub.(work) on the worker. co.(work) -> "worker foo works" ;; Call break method on general contractor with ;; no argument. This causes co.sub.sub.(relax :) ;; to be invoked, triggering argument defaulting: co.(break) -> "worker foo relaxes for 15 min" ;; Call break method with argument. This ;; invokes co.sub.sub.(relax 5), specifying a ;; value for the default argument: co.(break 5) -> "worker foo relaxes for 5 min" .brev .coNP Struct Clause Macro @ :mass-delegate .synb .mets (:mass-delegate < self-var < delegate-expr .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ < from-type <> [ * ] <> [ method ]*) .syne .desc The :mass-delegate struct macro provides a way to define multiple methods which are implemented as delegates to corresponding methods on another object. The implementation of .code :mass-delegate depends on the .code :delegate macro. The .meta self-var argument must be a bindable symbol. In each generated delegate method, this symbol will be the first argument. The purpose of this symbol is to enable the .meta delegate-expr to refer to the delegating object. The .meta delegate-expr is an expression which is inserted into every method. Its evaluation is expected to produce the delegate object. This expression may reference .meta self-var in order to retrieve or otherwise obtain the delegate from the delegating object. 
The .meta from-type argument is a symbol naming an existing structure type. If no such structure type has been defined, an error exception is thrown. After the .meta from-type argument, either zero or more slot names appear, optionally preceded by the .code * (asterisk) symbol. If the .code * symbol is present, and isn't followed by any other symbols, it indicates that all methods from .meta from-type are to be delegated. If symbols appear after the .code * then those specify exceptions: methods not to be delegated. No validation is performed on the exception list; it may specify nonexistent method names which have no effect. If the .code * symbol is absent, then every .meta method symbol specifies a method to be delegated. It is consequently expected to name a method of the .metn from-type : a static slot which contains a function. If any .meta method isn't a static slot of .metn from-type , or not a static slot which contains a function, an error exception is thrown. The .code :mass-delegate struct macro iterates over all of the methods of .meta from-type that are selected for delegation, and for each one it generates a .code :delegate macro clause based on the existing method's parameter list. For instance, the delegate for a method which has two required arguments and one optional will itself have two required arguments and one optional. Delegates are not simply wrapper functions which take any number of arguments and try to pass them to the target. The generated .code :delegate clauses are then processed by that struct clause macro. Note: composition with delegation is a useful alternative when multiple inheritance is not applicable or desired for various reasons. One such reason is that structures that would be used as multiple inheritance bases use the same symbols for certain slots, and the semantics of those slots conflict.
Under inheritance, same-named slots coming from different bases become one slot. Note: a particular .meta from-type being nominated in the .code :mass-delegate clause doesn't mean that the specific methods of that type shall be called by the generated delegates. The methods that shall be called are those of the calculated delegate object selected by the .metn delegate-expr . The .meta from-type is used as a source of the argument info, and method existence validation. It is up to the application to ensure that the delegation using .meta from-type makes sense with respect to the delegate object that is selected by the .metn delegate-expr : for instance, by ensuring that this object is an instance of .meta from-type or a subtype thereof. .TP* Example: .verb (defstruct foo-api () name (:method begin (me) ^(foo ,me.name begin)) (:method increment (me delta) ^(foo ,me.name increment ,delta)) (:method end (me) ^(foo ,me.name end))) (defstruct bar-api () name (:method open (me) ^(bar ,me.name open)) (:method read (me buf) ^(bar ,me.name read ,buf)) (:method write (me buf) ^(bar ,me.name write ,buf)) (:method close (me) ^(bar ,me.name close))) ;; facade holds the two API objects by composition: (defstruct facade () (foo (new foo-api name "foo")) (bar (new bar-api name "bar")) ;; delegate foo-api style calls via me.foo (:mass-delegate me me.foo foo-api *) ;; delegate bar-api style calls via me.bar ;; exclude the write method. (:mass-delegate me me.bar bar-api * write)) ;; instantiate facade as variable fa (defvar fa (new facade)) -> fa ;; begin call on facade delegates through foo-api object.
fa.(begin) -> (foo "foo" begin) fa.(increment) -> ;; error: too few arguments fa.(increment 3) -> (foo "foo" increment 3) fa.(open) -> (bar "bar" open) fa.(write 4) -> ;; error: fa has no such method .brev .coNP Function @ macroexpand-struct-clause .synb .mets (macroexpand-struct-clause < clause <> [ form ]) .syne .desc If .meta clause is a compound expression whose operator symbol was defined by .code define-struct-clause then .code macroexpand-struct-clause expands the clause and returns the expansion, which is a list of zero or more clauses. Otherwise, the function returns a one-element list containing the .meta clause argument, as if by the .mono .meti (list << clause ) .onom expression. The .meta form parameter, if present, is used for reporting errors. Note: clauses are usually expanded during the processing of a .code defstruct macro; in that situation, the entire unexpanded .code defstruct form serves the role of the .meta form argument. .TP* Examples: .verb ;; try to expand :delegate, using incorrect syntax. (macroexpand-struct-clause '(:delegate x (a b))) --> error: "** source location n/a: nil: too few elements ..." ;; same, but with error reporting form. (macroexpand-struct-clause '(:delegate x (a b)) '(abc xyz)) --> error: "** expr-1:1: abc: too few elements ..." ;; correct :delegate syntax (macroexpand-struct-clause '(:delegate x (a b) a.foo)) --> ((:method x (a b) (qref a.foo (x b)))) ;; not a defstruct macro clause (macroexpand-struct-clause '(1 2 3)) -> ((1 2 3)) .brev .coNP Special Variable @ *struct-clause-expander* .desc The .code *struct-clause-expander* special variable holds the hash table of associations between keyword symbols and struct clause expander functions, defined by .codn define-struct-clause . If the expression .code "[*struct-clause-expander* :sym]" yields a function, then symbol .code :sym has a binding as a struct clause macro. If that expression yields .codn nil , then there is no such binding.
The macro expanders in .code *struct-clause-expander* are two-parameter functions. The first parameter accepts the clause to be expanded. The second parameter accepts the .code defstruct form in which that clause is found; this is useful for error reporting. An expander function returns a list of clauses, which may be any, possibly empty mixture of primary clauses accepted by .code defstruct and clause macros. .coNP Macro @ define-struct-prelude .synb .mets (define-struct-prelude < name < struct-name-or-list << clause *) .syne .desc The .code define-struct-prelude macro defines a .IR prelude . A prelude is a named entity which implicitly provides clauses to .code defstruct macro invocations. Preludes are processed during the macroexpansion of .codn defstruct ; prelude definitions have no effect on previously compiled .code defstruct forms loaded from a file. A prelude has a .meta name which must be a bindable symbol. The purpose of this name is that if multiple .code define-struct-prelude forms are evaluated which specify the same .metn name , they replace each other's definitions. Only the most recent prelude of a given .meta name is retained; the previous definitions are overwritten. The .meta struct-name-or-list argument is either a symbol or a list of symbols, which are valid for use as structure names. The prelude being defined shall be applicable to each of the structures whose names are given by this argument. The zero or more .meta clause arguments give the clauses which comprise the prelude. In the future, when a .code defstruct form is macroexpanded which targets any of the structures given by the .meta struct-name-or-list argument, the specified clauses will be inserted into that definition, as if they appeared in the .code defstruct form literally. Multiple preludes may be defined with different names, which each target the same structure.
When the structure is defined, or redefined, it will receive all those preludes, in the order in which they were defined. .TP* Example: .verb ;; define init-fini-log prelude which targets fox and bear structs (define-struct-prelude init-fini-log (fox bear) (:init (me) (put-line `@me created`)) (:fini (me) (put-line `@me finalized`))) ;; The behavior is as if the following defstruct forms included ;; the above :init and :fini clauses (defstruct fox ()) (defstruct bear ()) (with-objects ((f (new fox)) (b (new bear))) (put-line "inside with-object")) .brev Output: .verb #S(fox) created #S(bear) created inside with-object #S(bear) finalized #S(fox) finalized .brev .SS* Special Structure Functions Special structure functions are user-defined methods or structure functions which are specially recognized by certain functions in \*(TL. They endow structure objects with the ability to participate in certain usage scenarios, or to participate in a customized way. Special functions are required to be bound to static slots, which is the case if the .code defmeth macro is used, or when methods or functions are defined using syntax inside a .code defstruct form. If a special function or method is defined as an instance slot, then the behavior of library functions which depend on this method is unspecified. Special functions introduced below by the word "Method" receive an object instance as an argument. Their syntax is indicated using the same notation which may be used to invoke them, such as: .verb .mets << object .(function-name < arg ...) .brev However, those introduced as "Function" do not operate on an instance. Their syntax is likewise indicated using the notation that may be used to invoke them: .verb .mets <> '[' object .function-name < arg ...']' .brev If such an invocation is actually used, the .meta object instance only serves for identifying the struct type whose static slot .code function-name provides the function; .meta object doesn't participate in the call.
An object is not strictly required since the function can be called using .verb .mets [(static-slot < type 'function-name) < arg ...] .brev which looks up the function in the struct .meta type directly. .coNP Method @ copy .synb .mets << object .(copy) .syne .desc The special method .code copy is expected to produce a copy of the object. The .code copy function will use this method if it is available, otherwise fall back on .codn copy-struct . The method is responsible for all semantics of the copy operation; whatever object the method returns is taken to be a copy of .metn object . It is a recommended practice that the returned object be of the same type as .metn object . It is also a recommended practice that the returned object be newly created, distinct from any object which existed prior to the method being called. The objects held in that object's slots need not be new. .coNP Method @ equal .synb .mets << object .(equal) .syne .desc Normally, two struct values are not considered the same under the .code equal function unless they are the same object. However, if the .code equal method is defined for a structure type, then instances of that structure type support .IR "equality substitution" . The .code equal method must not require any arguments other than .metn object . Moreover, the method must never return .codn nil . When a struct which supports equality substitution is compared using .codn equal , .code less or .codn greater , its .code equal method is invoked, and the return value is used in place of that structure for the purposes of the comparison. The same applies when a struct is hashed using the .code hash-equal function, or implicitly by an .code :equal-hash hash table. 
Note: if an .code equal method is defined or redefined with different semantics for a struct type whose instances have already been inserted as keys in an .code :equal-based hash table, the behavior of subsequent insertion and lookup operations on that hash table becomes unspecified. .coNP Method @ print .synb .mets << object .(print < stream << pretty-p ) .syne .desc If a method named by the symbol .code print is defined for a structure type, then it is used for printing instances of that type. The .meta stream argument specifies the output stream to which the printed representation is to be written. The .meta pretty-p argument is a Boolean flag indicating whether pretty-printing is requested. Its value may simply be passed to recursive calls to .codn print , or used to select between .code ~s or .code ~a formatting if .code format is used. The value returned by the .code print method is significant. If the special keyword symbol .code : (colon) is returned, then the system will print the object in the default way, as if no .code print method existed: it is understood that the method declined the responsibility for printing the object. If any other value is returned, then it is understood that the .code print method accepted the responsibility for printing the object, and the system consequently will not generate into .meta stream any output pertaining to .metn object 's representation. .coNP Methods @ slot and @ slotset .synb .mets << object .(slot << slot-name ) .mets << object .(slotset < slot-name << new-value ) .syne .desc Defining these methods allows a struct type to handle the situation when a nonexistent slot is accessed. The .code slot method, if it exists, is invoked if a slot named .meta slot-name is accessed by the .code slot function, or equivalent syntax, and that slot does not exist. The value returned by the method is taken to be the nonexistent slot's value.
When a value is stored in a slot named .meta slot-name by the .code slotset function, or equivalent syntax, and the slot does not exist, then the .code slotset method is invoked, if it exists. It is recommended that the .code slotset method return .metn new-value , since the value returned propagates out of the .code slotset function, which in all other cases returns .metn new-value , which is important to the implementation of syntactic places that designate slots. .coNP Functions @ static-slot and @ static-slot-set .synb .mets <> '[' object .static-slot < type << slot-name ']' .mets <> '[' object .static-slot-set < type < slot-name << new-value ']' .syne .desc The .code static-slot and .code static-slot-set functions are analogous to the .code slot and .code slotset methods. These functions, if they exist, are only invoked when a static slot lookup fails. Static slot lookups occur through the .code static-slot and .code static-slot-set functions, which can be used directly and are used in certain situations. For instance when .code "(meth ...)" syntax is looked up with .codn symbol-function , static slot lookup is used. It is recommended that for simulating the existence of structure functions and methods, these methods be used. The .meta type argument is an object of type .code struct-type giving the structure type on which the static slot lookup is taking place. It is recommended that the .code static-slot-set function return .metn new-value . .coNP Method @ lambda .synb .mets << object .(lambda << arg *) .syne .desc If a structure type provides a method called .code lambda then it can be used as a function. This method can be called by name, using the syntax given in the above syntactic description. However, the intended use is that it allows the structure instance itself to be used as a function. When a structure is applied to arguments as if it were a function, this is erroneous, unless that object has a .code lambda method.
In that case, the arguments are passed to the lambda method. The leftmost argument of the method is the structure instance itself. That is to say, the following equivalences apply, except that .code s is evaluated only once: .verb (call s args ...) <--> s.(lambda args ...) [s args ...] <--> [s.lambda s args ...] (mapcar s list) <--> (mapcar (meth s lambda) list) .brev Note: a form such as .code "[s args ...]" where .code s is a structure can be treated as a place if the method .code lambda-set is also implemented. .coNP Method @ lambda-set .synb .mets << object .(lambda-set << arg * << new-value ) .syne .desc The .code lambda-set method, in conjunction with a .code lambda method, allows structures to be used as place accessors. If structure .code s supports a .code lambda-set with four arguments, then the following use of the .code dwim operator is possible: .verb (set [s a b c d] v) (set (dwim s a b c d) v) ;; precisely equivalently .brev This has an effect which can be described by the following code: .verb (progn s s.(lambda-set a b c d v) v) .brev except that .code s and .code v are evaluated only once, and .code a through .code d are evaluated using the Lisp-1 semantics due to the .code dwim operator. If a place-mutating operator is used on this form which requires the prior value, such as the .code inc macro, then the structure must support the .code lambda function also. If .code lambda takes .I n arguments, then .code lambda-set should take .I n+1 arguments. The first .I n arguments of these two methods are congruent; the extra rightmost argument of .code lambda-set is the new value to be stored into the place denoted by the prior arguments. The return value of .code lambda-set is ignored. Note: the .code lambda-set method is also used by the .code rplaca function, if no .code rplaca method exists.
.TP* Example: The following defines a structure with a single instance slot .code hash which holds a hash table, as well as .code lambda and .code lambda-set methods: .verb (defstruct hash-wrapper nil (hash (hash)) (:method lambda (self key) [self.hash key]) (:method lambda-set (self key new-val) (set [self.hash key] new-val) self)) .brev An instance of this structure can now be used as follows: .verb (let ((s (new hash-wrapper))) (set [s "apple"] 3 [s "orange"] 4) [s "apple"]) -> 3 .brev .coNP Method @ length .synb .mets << object .(length) .syne .desc If a structure has a .code length method, then it can be used as an argument to the .code length function. Structures which implement the methods .codn lambda , .code lambda-set and .code length can be treated as abstract vector-like sequences, because such structures support the .codn ref , .code refset and .code length functions. For instance, the .code nreverse function will operate on such objects. Note: a structure which supports the .code car method also supports the .code length function, in a different way. Such a structure is treated by .code length as a list-like sequence, and its length is measured by walking the sequence with .code cdr operations. If a structure supports both .code length and .codn car , preference is given to .codn length , which is likely to be much more efficient. .coNP Method @ length-< .synb .mets << object .(length-< << len ) .syne .desc If a structure has a .code length-< method, then it can be used as the left argument to the .code length-< function. The .meta len argument receives the right argument. If an object doesn't implement the .code length-< method, but does implement the .code length method, it can also be used as an argument to the .code length-< function. In that situation, the .code length-< function will call the .code length method instead, and then compare the returned value against the .meta len parameter.
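.TP* Example:
The following is a minimal sketch of a structure which provides only a
.code length
method. The
.code bag
type and its slot
.code n
are hypothetical names used purely for illustration. Since the type has no
.code length-<
method, the
.code length-<
function is expected to fall back on the
.code length
method, as described above:
.verb
  ;; hypothetical structure whose length is stored in slot n
  (defstruct bag nil
    (n 0)
    (:method length (me) me.n))

  (length (new bag n 3)) -> 3
  (length-< (new bag n 3) 5) -> t
  (length-< (new bag n 3) 2) -> nil
.brev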
.coNP Methods @, car @ cdr and @ nullify .synb .mets << object .(car) .mets << object .(cdr) .mets << object .(nullify) .syne .desc Structures may be treated as sequences if they define methods named by the symbols .codn car , .codn cdr , and .codn nullify . If a structure supports these methods, then these methods are used by the functions .codn car , .codn cdr , .codn nullify , .code empty and various other sequence manipulating functions derived from them, when those functions are applied to that object. An object which implements these three methods can be considered to represent a .I list-like abstract sequence. The object's .code car method should return the first value in that abstract sequence, or else .code nil if that sequence is empty. The object's .code cdr method should return an object denoting the remainder of the sequence, or else .code nil if the sequence is empty or contains only one value. This returned object can be of any type: it may be of the same structure type as that object, a different structure type, a list, or whatever else. If a non-sequence object is returned. The .code nullify method should return .code nil if the object is considered to denote an empty sequence. Otherwise it should either return that object itself, or else return the sequence which that object represents. .coNP Methods @ rplaca and @ rplacd .synb .mets << object .(rplaca << new-car-value ) .mets << object .(rplacd << new-cdr-value ) .syne .desc If a structure type defines the methods .code rplaca and .code rplacd then, respectively, the .code rplaca and .code rplacd functions will use these methods if they are applied to instances of that type. That is to say, when the function call .mono .meti (rplaca < o << v ) .onom is evaluated, and .meta o is a structure type, the function inquires whether .meta o supports a .code rplaca method. If so, then, effectively, .mono .meti << o . (rplaca << v ) .onom is invoked. 
The return value of this method call is ignored; .code rplaca returns .metn o . The analogous requirements apply to .code rplacd in relation to the .code rplacd method. Note: if the .code rplaca method doesn't exist, the .code rplaca function falls back on trying to store .meta new-car-value by means of the structure type's .code lambda-set method, using an index of zero. That is to say, if the type has no .code rplaca method, but does have a .code lambda-set method, then .mono .meti << o . (lambda-set 0 << v ) .onom is invoked. .coNP Function @ from-list .synb .mets <> '[' object .from-list << list ']' .syne .desc If a .code from-list structure function is defined for a structure type, it is called in certain situations with an argument which is a list object. The function's purpose is to construct a new instance of the structure type, derived from that list. The purpose of this function is to allow sequence processing operations such as .code mapcar and .code remove to operate on a structure object as if it were a sequence, and return a transformed sequence of the same type. This is analogous to the way such functions can operate on a vector or string, and return a vector or string. If a structure object behaves as a sequence thanks to providing .codn car , .code cdr and .code nullify methods, but does not have a .code from-list function, then those sequence-processing operations which return a sequence will always return a plain list of items. .coNP Function @ derived .synb .mets <> '[' object .derived < supertype << subtype ']' .syne .desc If a structure type supports a function called .codn derived , this function is called whenever a new type is defined which names that type as its supertype. The function is called with two arguments which are both struct types. The .meta supertype argument gives the type that is being inherited from. The .meta subtype gives the new type that is inheriting from .metn supertype .
When a new structure type is defined, its list of immediate supertypes is considered. For each of those supertypes which defines the .code derived function, the function is invoked. The function is not retroactively invoked. If it is defined for a structure type from which subtypes have already been derived, it is not invoked for those existing subtypes. If .meta subtype directly inherits .meta supertype more than once, it is not specified whether this function is called once, or multiple times. Note: the .meta supertype parameter exists because the .code derived function is itself inherited. If the same version of this function is shared by multiple structure types due to inheritance, this argument informs the function which of those types it is being invoked for. .coNP Methods @ iter-begin and @ iter-reset .synb .mets << object .(iter-begin) .mets << object .(iter-reset << iter ) .syne .desc If an object supports the .code iter-begin method, it is considered iterable; the .code iterable function will return .code t if invoked on this object. The responsibility of the .code iter-begin method is to return an iterator object: an object which supports certain special methods related to iteration, according to one of two protocols, described below. The .code iter-reset method is optional. It is similar to .code iter-begin but takes an additional .meta iter argument, an iterator object that was previously returned by the .code iter-begin method of the same .metn object . If .code iter-reset determines that .meta iter can be reused for a new iteration, then it can suitably mutate the state of .meta iter and return it. Otherwise, it behaves like .code iter-begin and returns a new iterator. There are two protocols for iteration: the fast protocol, and the canonical protocol. Both protocols require the iterator object returned by the .code iter-begin method to provide the methods .code iter-item and .codn iter-step .
If the iterator also provides the .code iter-more method, then the protocol which applies is the canonical protocol. If that method is absent, then the fast protocol is followed. Under the fast protocol, the .code iter-more method does not exist and is not involved. The iterable object's .code iter-begin method must return .code nil if the abstract sequence is empty. If an iterator is returned, it is assumed that an object can be retrieved from the iterator by invoking its .code iter-item method. The iterator's .code iter-step method should return .code nil if there are no more objects in the abstract sequence, or else it should return an iterator that obeys the fast protocol (possibly itself). Under the canonical protocol, the iterator implements the .code iter-more method. The iterable object's .code iter-begin always returns an iterator object. The iterator object's .code iter-more method is always invoked to determine whether another item is available from the sequence. The iterator object's .code iter-step method is expected to return an iterator object which conforms to the canonical protocol. .coNP Method @ iter-item .synb .mets << object .(iter-item) .syne .desc The .code iter-item method is invoked on an iterator .meta object to retrieve the next item in the sequence. Under the fast protocol, it is assumed that if .meta object was returned by an iterable object's .code iter-begin method, or by an iterator's .code iter-step method, that an item is available. This method will be unconditionally invoked. Under the canonical protocol for iteration, the .code iter-more method will be invoked on .meta object first. If that method yields true, then .code iter-item is expected to yield the next available item in the sequence. Note: calls to the .code iter-item function, with .meta object as its argument, invoke the .code iter-item method.
It is possible for an application to call .code iter-item through this function or directly as a method call without first calling .codn iter-more . No iteration mechanism in the \*(TL standard library behaves this way. If the iterator .meta object has no more items available and .code iter-item is invoked anyway, no requirements apply to its behavior or return value. .coNP Method @ iter-step .synb .mets << object .(iter-step) .syne .desc The .code iter-step method is invoked on an iterator object to produce an iterator object for the remainder of the sequence, excluding the current item. Under the fast iteration protocol, this method returns .code nil if there are no more items in the sequence. Under the canonical iteration protocol, this method always returns an iterator object. If no items remain in the sequence, then that iterator object's .code iter-more method returns .codn nil . Furthermore, under this protocol, .code iter-step is not called if .code iter-more returns .codn nil . Note: calls to the .code iter-step function, with .meta object as its argument, invoke the .code iter-step method. It is possible for an application to call .code iter-step through this function or directly as a method call without first calling .codn iter-more . No iteration mechanism in the \*(TL standard library behaves this way. If the iterator .meta object has no more items available and .code iter-step is invoked anyway, no requirements apply to its behavior or return value. .coNP Method @ iter-more .synb .mets << object .(iter-more) .syne .desc If an iterator .meta object returned by .code iter-begin supports the .code iter-more method, then the canonical iteration protocol applies to that iteration session. All subsequent iterators that are involved in the iteration are assumed to conform to the protocol and should implement the .code iter-more method also. The behavior is unspecified otherwise.
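The canonical protocol can be illustrated by the following sketch. The
.code countdown
and
.code countdown-iter
types, and their slots, are hypothetical names introduced only for this
illustration; the iterator here mutates and returns itself from
.codn iter-step ,
which is one possible design, not a requirement:
.verb
  ;; iterator: provides iter-more, so the canonical protocol applies
  (defstruct countdown-iter nil
    n
    (:method iter-more (me) (> me.n 0))
    (:method iter-item (me) me.n)
    (:method iter-step (me) (dec me.n) me))

  ;; iterable object: iter-begin always returns an iterator
  (defstruct countdown nil
    (start 3)
    (:method iter-begin (me) (new countdown-iter n me.start)))

  (list-seq (new countdown)) -> (3 2 1)
.brev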
The .code iter-more method is used to interrogate an iterator whether more unvisited items remain in the sequence. This method does not advance the iteration, and does not change the state of the iterator. It is idempotent: if it is called multiple times without any intervening call to any other method, it yields the same value. If an iterator does not implement the .code iter-more method, then if the .code iter-more function is applied to that iterator, it unconditionally returns .codn t . .SS* Sequence Manipulation Functions in this category uniformly manipulate abstract sequences. Lists, strings and vectors are sequences. Structure objects can behave like sequences, either list-like or vector-like sequences, if they have certain methods: see the previous section Special Structure Functions. Moreover, hash tables behave like sequences of key-value entries represented by .code cons pairs. Not all sequence-processing functions accept hash-table sequences. Additionally, some sequence-processing functions work not only with sequences but with all iterable objects: objects that can be used as arguments to the .code iter-begin function. Such arguments are called .meta iterable rather than .metn sequence , possibly abbreviated to .meta iter with or without a numeric suffix. Hash tables are always supported if they appear as .meta iterable arguments. .coNP Function @ seqp .synb .mets (seqp << object ) .syne .desc The function .code seqp returns .code t if .meta object is a sequence, otherwise .codn nil . Lists, vectors and strings are sequences. The object .code nil denotes the empty list and so is a sequence. Objects of type .code buf and .code carray are sequences, as are hash tables. Structures which implement the .code length or .code car methods are considered sequences. No other objects are sequences. However, future revisions of the language may specify additional objects that are sequences. 
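.TP* Examples:
The following calls are a small sketch of the rules stated above, applied to
a few representative objects:
.verb
  (seqp nil) -> t
  (seqp '(1 2 3)) -> t
  (seqp "abc") -> t
  (seqp #(1 2 3)) -> t
  (seqp (hash)) -> t
  (seqp 42) -> nil
.brev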
.coNP Function @ iterable .synb .mets (iterable << object ) .syne .desc The .code iterable function returns .code t if .meta object is iterable, otherwise .codn nil . If .meta object is a sequence according to the .code seqp function, then it is iterable. If .meta object is a structure which supports the .code iter-begin method, then it is iterable. Additional objects that are not sequences are also iterable: numeric or character ranges, and numbers. Future revisions of the language may specify additional iterable objects. .coNP Functions @ make-like and @ seq-like .synb .mets (make-like < seq << object ) .mets (seq-like < object << arg *) .syne .desc The .code make-like function's .meta seq argument must be a sequence. If .meta object is a sequence type, then .meta seq is converted to the same type of sequence, if possible, and returned. Otherwise the original .meta seq is returned. Conversion is supported to string and vector type, plus additional types as follows. Conversion to a structure type is possible for structures. If .meta object is an object of a structure type which has a static function .codn from-list , then .code make-like calls that function, passing to it .meta seq and returns whatever value that function returns. If .meta object is a .codn carray , then .meta seq is passed to the .code carray-list function, and the resulting value is returned. The second argument in the .code carray-list call is the element type taken from .metn object . The third argument is .codn nil , indicating that the resulting .code carray is not to be null terminated. The .meta object may be an iterator returned by .codn iter-begin . In this situation, if that object makes the original sequence available, then .code make-like takes that sequence in place of .metn object . The .code seq-like function creates, if possible, a sequence of the same kind as .meta object populated by the remaining .meta arg values.
If some of the .meta arg values are not suitable elements for a sequence of that type, then a list of those values is returned. The result of .code seq-like is consistent with what the .code make-like function would return if given a sequence of the .meta arg values as the .meta seq argument. That is to say, the following equivalence holds: .verb (make-like (list a0 a1 ...) o) <-> (seq-like o a0 a1 ...) .brev Note: the .code make-like function is a helper which supports the development of unoptimized versions of a generic function that accepts any type of sequence as input, and produces a sequence of the same type as output. The implementation of such a function can internally accumulate a list, and then convert the resulting list to the same type as an input value by using .codn make-like . .coNP Functions @, list-seq @ vec-seq and @ str-seq .synb .mets (list-seq << iterable ) .mets (vec-seq << iterable ) .mets (str-seq << iterable ) .syne .desc The .codn list-seq , .code vec-seq and .code str-seq functions convert an iterable object of any type into a list, vector or string, respectively. The list returned by .code list-seq is lazy. The .code list-seq and .code vec-seq iterate the items of .meta iterable and accumulate these items into a new list or vector. The .code str-seq similarly iterates the items of .metn iterable , requiring them to be a mixture of characters and strings. .coNP Functions @ length and @ len .synb .mets (length << iterable ) .mets (len << iterable ) .syne .desc The .code length function returns the number of items contained in .metn iterable . The .code len function is a synonym of .codn length . An attempt to calculate the length of infinite lazy lists will not terminate. Iterable objects representing infinite ranges, such as integers and characters are invalid arguments. 
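.TP* Examples:
As a brief sketch of the above, the following calls measure the length of
several kinds of finite sequences:
.verb
  (length '(1 2 3)) -> 3
  (length #(a b)) -> 2
  (len "abc") -> 3
  (length nil) -> 0
.brev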
.coNP Function @ length-< .synb .mets (length-< < iterable << len ) .syne .desc The .code length-< function efficiently determines whether .mono .meti (length << iterable) .onom is less than the integer value .metn len . In cases when .meta iterable would have to be fully traversed in order to measure its length, the .code length-< function avoids this traversal, by making use of the functions .code length-str-< or .code length-list-< as appropriate. Note: this function is useful when a decision must be made between two algorithms, depending on whether the length is less than a certain small constant. It is also safe on lazy, infinite sequences and circular lists, for which .code length will fail to terminate. .coNP Function @ empty .synb .mets (empty << iterable ) .syne .desc If .meta iterable is a suitable argument for the .code length function, then the .code empty function returns .code t if .mono .meti (length << iterable ) .onom is zero, otherwise .codn nil . The .code empty function also supports certain objects not suitable as arguments for .codn length . An infinite lazy list is not empty, and so .code empty returns .code nil for such an object. The function also returns .code nil for iterable objects representing nonempty spaces, even if those spaces are infinite. For instance .code "(empty 0)" yields .code nil because the set of integers beginning with 0 isn't empty. .coNP Function @ nullify .synb .mets (nullify << iterable ) .syne .desc The .code nullify function returns .code nil if .meta iterable denotes an empty sequence. Otherwise, if .meta iterable is not an empty sequence, or isn't a sequence, then .meta iterable itself is returned. If .meta iterable is a structure object which supports the .code nullify method, then that method is called. If it returns .code nil then .code nil is returned. If the .code nullify method returns a substitute object other than the .meta iterable object itself, then .code nullify is invoked on that returned substitute object.
Note: the .code nullify function is a helper to support unoptimized generic traversal of sequences. Thanks to the generalized behavior of .codn cdr , non-list sequences can be traversed using .codn cdr , similarly to proper lists, by checking for .code cdr returning the terminating value .codn nil . However, empty non-list sequences are handled incorrectly: since they are not the .code nil object, they look nonempty under this paradigm of traversal. The .code nullify function provides a correction: if the input sequence is filtered through .code nullify then the subsequent list-like iteration works correctly. Examples: .verb ;; Incorrect for empty strings: (defun print-chars (string) (while string (prinl (pop string)))) ;; Corrected with nullify: (defun print-chars (string) (let ((s (nullify string))) (while s (prinl (pop s))))) .brev Note: optimized generic iteration is available in the form of iteration based on .code iter-begin rather than .cod3 car / cdr and .codn nullify . Examples: .verb ;; Efficient with iterators, ;; at the cost of verbosity: (defun print-chars (string) (let ((i (iter-begin string))) (while (iter-more i) (prinl (iter-item i)) (set i (iter-step i))))) ;; Using mapping function built on iterators: (defun print-chars (string) [mapdo prinl string]) .brev .coNP Accessor @ sub .synb .mets (sub < sequence >> [ from <> [ to ]]) .mets (set (sub < sequence >> [ from <> [ to ]]) << new-val ) .syne .desc The .code sub function extracts a slice from input sequence .metn sequence . The slice is a sequence of the same type as .metn sequence . If the .meta from argument is omitted, it defaults to .codn 0 . If the .meta to parameter is omitted, it defaults to .codn t . Thus .code "(sub a)" means .codn "(sub a 0 t)" .
The following semantic equivalence exists between a call to the .code sub function and the DWIM-bracket syntax, except that .code sub is an ordinary function call form, which doesn't apply the Lisp-1 evaluation semantics to its arguments: .verb ;; from is not a list (sub seq from to) <--> [seq from..to] .brev The description of the .code dwim operator\(emin particular, the section on Range Indexing\(emexplains the semantics of the range specification. The output sequence may share structure with the input sequence. If .meta sequence is a .code carray object, then the function behaves like .codn carray-sub . If .meta sequence is a .code buf object, then the function behaves like .codn sub-buf . If .meta sequence is a .code tree object, then the function behaves like .codn sub-tree . Note: because .code sub-tree is not an accessor, assigning to the .code sub syntax in this case will produce an error. The .meta sequence argument may also be any other object type that is suitable as input to the .code iter-begin function. In this situation, assigning to .code sub syntax produces an error. Furthermore, in cases where the .meta from and .meta to arguments imply that a suffix of .meta sequence is required, a lazy list of the suffix of the iterated sequence will be returned. In other cases, a regular list of the elements selected by .code sub is returned. If .meta sequence is a structure, it must support the .code lambda method. The .code sub operation is transformed into a call to the .code lambda method according to the following equivalence: .verb (sub o from to) <--> o.(lambda (rcons from to)) (sub o : to) <--> o.(lambda (rcons : to)) (sub o from) <--> o.(lambda (rcons from :)) (sub o) <--> o.(lambda (rcons : :)) .brev That is to say, the .meta from and .meta to arguments are converted to a range object. If either argument is missing, the .code : (colon) keyword symbol is used for the corresponding element of the range.
When a .code sub form is used as a syntactic place, that place denotes a slice of .metn seq . The .meta seq argument must itself be a syntactic place, because it receives a new value, which may be different from its original value in cases when .meta seq is a list. Overwriting that slice is equivalent to using the .code replace function. The following equivalences give the semantics, except that .codn x , .codn a , .code b and .code v are evaluated only once, in left-to-right order: .verb (set (sub x a b) v) <--> (progn (set x (replace x v a b)) v) (del (sub x a b)) <--> (prog1 (sub x a b) (set x (replace x nil a b))) .brev Note that the value of .code x is overwritten with the value returned by .codn replace . If .code x is a vector or string, then the return value of .code replace is just .codn x : the identity of the object doesn't change under mutation. However, if .code x is a list, its identity changes when items are added to or removed from the front of the list, and in those cases .code replace will return a value different from its first argument. Similarly, if .code x is an object with a .code lambda-set method, that method's return value becomes the return value of .code replace and must be taken into account. .coNP Function @ replace .synb .mets (replace < sequence < replacement-sequence >> [ from <> [ to ]]) .mets (replace < sequence < replacement-sequence << index-seq ) .syne .desc The .code replace function modifies .meta sequence in the ways described below. The operation is destructive: it may work "in place" by modifying the original sequence. The caller should retain the return value and stop relying on the original input sequence. The return value of .code replace is the modified version of .metn sequence . This may be the same object as .meta sequence or it may be a newly allocated object.
Note that the form: .verb (set seq (replace seq new fr to)) .brev has the same effect on the variable .code seq as the form: .verb (set [seq fr..to] new) .brev except that the former .code set form returns the entire modified sequence, whereas the latter returns the value of the .code new argument. The .code replace function has two invocation styles, distinguished by the type of the third argument. If the third argument is a sequence, then it is deemed to be the .meta index-seq parameter of the second form. Otherwise, if the third argument is missing, or is not a sequence, then it is deemed to be the .meta from argument of the first form. The first form of the replace function replaces a contiguous subsequence of the .meta sequence with .metn replacement-sequence . The replaced subsequence may be empty, in which case an insertion is performed. If .meta replacement-sequence is empty (for example, the empty list .codn nil ), then a deletion is performed. If the .meta from and .meta to arguments are omitted, their values default to .code 0 and .code t respectively. The description of the dwim operator\(emin particular, the section on Range Indexing\(emexplains the semantics of the range specification. The second form of the replace function replaces a subsequence of elements from .meta sequence given by .metn index-seq , with their counterparts from .metn replacement-sequence . If .meta replacement-sequence has at least as many elements as are indicated in .metn index-seq , then the indicated elements of .meta sequence are overwritten with successive elements from .metn replacement-sequence . If .meta replacement-sequence contains fewer elements than .metn index-seq , then the excess elements indicated in .meta index-seq which have no counterparts in the .meta replacement-sequence are deleted. Whenever a negative value occurs in .meta index-seq the original length of .meta sequence (before any deletions) is added to that value. 
Furthermore, similar restrictions apply on .meta index-seq as under the select function. Namely, the replacement stops when an index value in .meta index-seq is encountered which is out of range for .metn sequence . Furthermore, if .meta sequence is a list, or if any deletions take place, then .meta index-seq must be monotonically increasing, after consideration of the displacement of negative values, or else the behavior is unspecified. If .meta replacement-sequence shares storage with the target range of .metn sequence , or, in the case when that range is resized by the .code replace operation, shares storage with any portion of .meta sequence above that range, then the effect of .code replace on either object is unspecified. If .meta sequence is a .code carray object, then .code replace behaves like .codn carray-replace . If .meta sequence is a .code buf object, then .code replace behaves like .codn buf-replace . If .meta sequence is a structure, then the structure must support the .code lambda-set method. The .code replace operation is translated into a call of the .code lambda-set method according to the following equivalences: .verb (replace o items from to) <--> o.(lambda-set (rcons from to) items) (replace o items index-seq) <--> o.(lambda-set index-seq items) .brev Thus, the .meta from and .meta to arguments are converted to a single range object, whereas an .meta index-seq is passed as-is. It is an error if the .code from argument is a sequence, indicating an .metn index-seq , and a .code to argument is also given; the situation is diagnosed.
If either .code from or .code to are omitted, the range object contains the .code : (colon) keyword symbol in the corresponding place: .verb (replace o items from) <--> o.(lambda-set (rcons from :) items) (replace o items : to) <--> o.(lambda-set (rcons : to) items) (replace o items) <--> o.(lambda-set (rcons : :) items) .brev It is the responsibility of the object's .code lambda-set method to implement semantics consistent with the description of .codn replace . .coNP Function @ take .synb .mets (take < count << sequence ) .syne .desc The .code take function returns .meta sequence with all except the first .meta count items removed. If .meta sequence is a list, then .code take returns a lazy list which produces the first .meta count items of sequence. For other kinds of sequences, including lazy strings, .code take works eagerly. If .meta count exceeds the length of .meta sequence then a sequence is returned which has all the items. This object may be .meta sequence itself, or a copy. If .meta count is negative, it is treated as zero. .coNP Functions @ take-while and @ take-until .synb .mets (take-while < predfun < sequence <> [ keyfun ]) .mets (take-until < predfun < sequence <> [ keyfun ]) .syne .desc The .code take-while and .code take-until functions return a prefix of .meta sequence whose items satisfy certain conditions. The .code take-while function returns the longest prefix of .meta sequence whose elements, accessed through .meta keyfun satisfy the function .metn predfun . The .meta keyfun argument defaults to the identity function: the elements of .meta sequence are examined themselves. The .code take-until function returns the longest prefix of .meta sequence which consists of elements, accessed through .metn keyfun , that do .B not satisfy .meta predfun followed by an element which does satisfy .metn predfun . If .meta sequence has no such prefix, then an empty sequence is returned of the same kind as .metn sequence . 
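Note: the prefix-taking functions may be illustrated by the following sketch. The input values and the use of the
.code evenp
predicate are assumptions chosen for illustration only. Observe that
.code take-until
includes the first element which satisfies the predicate:
.verb
  ;; first two items; count exceeding length yields all items
  (take 2 '(a b c)) -> (a b)
  (take 10 "abc") -> "abc"

  ;; longest prefix of even numbers
  [take-while evenp '(2 4 5 6)] -> (2 4)

  ;; odd prefix, followed by the first even element
  [take-until evenp '(1 3 4 5)] -> (1 3 4)
.brev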
If .meta sequence is a list, then these functions return a lazy list. .coNP Function @ drop .synb .mets (drop < count << sequence ) .syne .desc The .code drop function returns .meta sequence with the first .meta count items removed. If .meta count is negative, it is treated as zero. If .meta count is zero, then .meta sequence is returned. If .meta count exceeds the length of .meta sequence then an empty sequence is returned of the same kind as .metn sequence . .coNP Functions @ drop-while and @ drop-until .synb .mets (drop-while < predfun < sequence <> [ keyfun ]) .mets (drop-until < predfun < sequence <> [ keyfun ]) .syne .desc The .code drop-while and .code drop-until functions return .meta sequence with a prefix of that sequence removed, according to conditions involving .meta predfun and .metn keyfun . The .code drop-while function removes the longest prefix of .meta sequence whose elements, accessed through .metn keyfun , satisfy the function .metn predfun , and returns the remaining sequence. The .meta keyfun argument defaults to the identity function: the elements of .meta sequence are examined themselves. The .code drop-until function removes the longest prefix of .meta sequence which consists of elements, accessed through .metn keyfun , that do .B not satisfy .meta predfun followed by an element which does satisfy .metn predfun . A sequence of the remaining elements is returned. If .meta sequence has no such prefix, then a sequence same as .meta sequence is returned, which may be .meta sequence itself or a copy. .coNP Accessor @ last .synb .mets (last < sequence <> [ num ]) .mets (set (last < sequence <> [ num ]) << new-value ) .syne .desc The .code last function returns a subsequence of .meta sequence consisting of the last .meta num of its elements, where .meta num defaults to 1. If .meta num is zero or negative, then an empty sequence is returned.
If .meta num is positive, and greater than or equal to the length of .metn sequence , then .meta sequence is returned. If a .code last form is used as a place, then .meta sequence must be a place. The following equivalence gives the semantics of assignment to a .codn last : .verb (set (last x n) v) <--> (set (sub x (- (max n 0)) t) v) .brev A .code last place is deletable. The semantics of deletion may be understood in terms of the following equivalence: .verb (del (last x n)) <--> (del (sub x (- (max n 0)) t)) .brev .coNP Accessor @ butlast .synb .mets (butlast < sequence <> [ num ]) .mets (set (butlast < sequence <> [ num ]) << new-value ) .syne .desc The .code butlast function returns the prefix of .meta sequence consisting of a copy of it, with the last .meta num items removed. The parameter .meta num defaults to 1 if an argument is omitted. If .meta sequence is empty, an empty sequence is returned. If .meta num is zero or negative, then .meta sequence is returned. If .meta num is positive, and meets or exceeds the length of .metn sequence , then an empty sequence is returned. If a .code butlast form is used as a place, then .meta sequence must itself be a place. The following equivalence gives the semantics of assignment to a .codn butlast : .verb (set (butlast x n) v) <--> (set (sub x 0 (- (max n 0))) v) .brev A .code butlast place is deletable. The semantics of deletion may be understood in terms of the following equivalence: .verb (del (butlast x n)) <--> (del (sub x 0 (- (max n 0)))) .brev Note: the \*(TL .code take function also computes the prefix of a list; however, it counts items from the beginning, and provides lazy semantics which allow it to work with infinite lists. See also: the .code butlastn accessor, which operates on lists. That function has useful semantics for improper lists and treats an atom as the terminator of a zero-length improper list. Dialect Note: a destructive function similar to Common Lisp's .code nbutlast isn't provided.
Assignment to a .code butlast form is destructive; Common Lisp doesn't support .code butlast as a place. .coNP Function @ ldiff .synb .mets (ldiff < sequence << tail-sequence ) .syne .desc The .code ldiff function is a somewhat generalized version of the same-named classic Lisp function found in traditional Lisp dialects. The .code ldiff function supports the original .code ldiff semantics when both inputs are lists. It determines whether the .meta tail-sequence list is a structural suffix of .metn sequence , which is to say: is .meta tail-sequence one of the .code cons cells which comprise .metn sequence ? If so, then a list is returned consisting of all the items of .meta sequence before .metn tail-sequence : a copy of .meta sequence with the .meta tail-sequence part removed, and replaced by the .code nil terminator. If .meta tail-sequence is .code nil or the lists are unrelated, then .meta sequence is returned. The \*(TL .code ldiff function supports the following additional semantics. .RS .IP 1. The basic description of .code ldiff is extended to work with list-like sequences, not merely lists; that is to say, objects which support the .code car method. .IP 2. If .meta sequence is any kind of sequence, and .meta tail-sequence is any kind of empty sequence, then .meta sequence is returned. .IP 3. If either argument is an atom that is not a sequence, .code ldiff returns .metn sequence . .IP 4. If .meta sequence is a list-like sequence, and .meta tail-sequence isn't, then the terminating atom of .meta sequence is determined. This atom is compared using .code equal to the .meta tail-sequence object. If they are equal, then a proper list is returned containing the items of .meta sequence excluding the terminating atom. .IP 5. If both arguments are vector-like sequences, then .code ldiff determines whether .meta sequence has a suffix which is .code equal to .metn tail-sequence . 
If this is the case, then a sequence is returned, of the same kind as .metn sequence , consisting of the items of .meta sequence before that suffix. If .meta tail-sequence is not .code equal to a suffix of .metn sequence , then .meta sequence is returned. .IP 6. In all other cases, .meta sequence and .meta tail-sequence are compared with .codn equal . If the comparison is true, .code nil is returned, otherwise .meta sequence is returned. .RE .TP* Examples: .verb ;;; unspecified: the compiler could make ;;; '(2 3) a suffix of '(1 2 3), ;;; or they could be separate objects. (ldiff '(1 2 3) '(2 3)) -> either (1) or (1 2 3) ;; b is the (2 3) suffix of a, so the ldiff is (1) (let* ((a '(1 2 3)) (b (cdr a))) (ldiff a b)) -> (1) ;; Rule 5: strings and vector (ldiff "abc" "bc") -> "a" (ldiff "abc" nil) -> "abc" (ldiff #(1 2 3) #(3)) -> #(1 2) ;; Rule 5: mixed vector kinds (ldiff "abc" #(#\eb #\ec)) -> "abc" ;; Rule 6: (ldiff #(1 2 3) '(3)) -> #(1 2 3) ;; Rule 4: (ldiff '(1 2 3) #(3)) -> '(1 2 3) (ldiff '(1 2 3 . #(3)) #(3)) -> '(1 2 3) (ldiff '(1 2 3 . 4) #(3)) -> '(1 2 3 . 4) ;; Rule 6 (ldiff 1 2) -> 1 (ldiff 1 1) -> nil .brev .coNP Function @ search .synb .mets (search < haystack < needle >> [ testfun <> [ keyfun ]]) .syne .desc The .code search function determines whether the sequence .meta needle occurs as a substring within .metn haystack , under the given comparison function .meta testfun and key function .metn keyfun . If this is the case, then the zero-based position of the leftmost occurrence of .meta needle within .meta haystack is returned. Otherwise .code nil is returned to indicate that .meta needle does not occur within .metn haystack . If .meta needle is empty, then zero is always returned. The arguments .meta haystack and .meta needle are sequences. They may not be hash tables.
If .meta needle is not empty, then it occurs at some position N within .meta haystack if the first element of .meta needle matches the element at position N of .metn haystack , the second element of .meta needle matches the element at position N+1 of .meta haystack and so forth, for all elements of .metn needle . A match between elements is determined by passing each element through .metn keyfun , and then comparing the resulting values using .metn testfun . If .meta testfun is supplied, it must be a function which can be called with two arguments. If it is not supplied, it defaults to .codn eql . If .meta keyfun is supplied, it must be a function which can be called with one argument. If it is not supplied, it defaults to .codn identity . .TP* Examples: .verb ;; fails because 3.0 doesn't match 3 ;; under the default eql function [search #(1.0 3.0 4.0 7.0) '(3 4)] -> nil ;; occurrence found at position 1: ;; (3.0 4.0) matches (3 4) under = [search #(1.0 3.0 4.0 7.0) '(3 4) =] -> 1 ;; "even odd odd odd even" pattern ;; matches at position 2 [search #(1 1 2 3 5 7 8) '(2 1 1 1 2) : evenp] -> 2 ;; Case insensitive string search [search "abcd" "CD" : chr-toupper] -> 2 ;; Case insensitive string search ;; using vector of characters as key [search "abcd" #(#\eC #\eD) : chr-toupper] -> 2 .brev .coNP Function @ contains .synb .mets (contains < needle < haystack >> [ testfun <> [ keyfun ]]) .syne .desc The syntax of the .code contains function differs from that of .code search in that the .meta needle and .meta haystack arguments are reversed. The semantics is identical. .coNP Function @ rsearch .synb .mets (rsearch < haystack < needle >> [ testfun <> [ keyfun ]]) .syne .desc The .code rsearch function is like .code search except for two differences. Firstly, if .meta needle matches .meta haystack in multiple places, .code rsearch returns the rightmost matching position rather than the leftmost.
Secondly, if .meta needle is an empty sequence, then .code rsearch returns the length of .metn haystack , thereby effectively declaring that the rightmost match for an empty .meta needle occurs at the imaginary position past the last element of .metn haystack . .coNP Function @ search-all .synb .mets (search-all < haystack < needle >> [ testfun <> [ keyfun ]]) .syne .desc The .code search-all function is closely related to the .code search and .code rsearch functions. Whereas those two functions return the leftmost or rightmost position, respectively, of .meta needle within .metn haystack , the .code search-all function returns a list of all the positions where .meta needle occurs. The positions of overlapping matches are included in the list. If .meta needle is not found in .metn haystack , .code search-all returns the empty list .codn nil . If .meta needle is empty, then .code search-all returns a list of all positions in .meta haystack including the one position past the last element. In this situation, if .meta haystack is empty, the list .code "(0)" is returned. If .meta haystack contains one item, then the list .code "(0 1)" is returned and so forth. In all situations in which .code search-all returns a non-empty list, the first element of that list is what .code search would return for the same arguments, and the last element is what .code rsearch would return. .coNP Accessor @ ref .synb .mets (ref < sequence << index ) .mets (set (ref < sequence << index ) << new-value ) .syne .desc The .code ref accessor performs array-like indexing into sequences, as well as hash tables and objects of type .codn buf , .codn carray , .code tree as well as structure objects which define a .code lambda method. If the .meta sequence parameter is a hash, then this accessor performs hash retrieval and storage; in that case .meta index isn't restricted to an integer value. If .meta sequence is a structure, it supports .code ref directly if it has a .code lambda method.
The .meta index argument is passed to that method, and the resulting value is returned. If a structure lacks a .code lambda method, but has a .code car method, then .code ref treats it as a list, traversing the structure using .cod3 car / cdr operations. In the absence of support for these operations, the function fails with an error exception. If .meta sequence is a sequence then .meta index argument must be an integer. The first element of the sequence is indexed by zero. Negative values are permitted, denoting backward indexing from the end of the sequence, such that the last element is indexed by -1, the second last by -2 and so on. See also the Range Indexing section under the description of the .code dwim operator. If .meta sequence is a list, then out-of-range indices, whether positive or negative, are treated leniently by .codn ref : such accesses produce the value .codn nil , rather than an error. For other sequence types, such accesses are erroneous. For hashes, accesses to nonexistent elements are treated leniently, and produce .codn nil . If .meta sequence is a search tree, then .code ref behaves like .codn tree-lookup . If .meta sequence is a range object, then .code ref behaves like .codn rangeref . A .code ref expression may be used as a place. Storing a value into a .code ref place is performed using the .code refset function. When the .code del operator is used to delete an index value from a .code ref place, the .meta sequence itself must be a place. The deletion calculates a new sequence with the item at .meta index deleted; that new sequence is stored back into the .meta sequence place. Deletion does not use .code refset but rather the .code replace function. .coNP Function @ refset .synb .mets (refset < sequence < index << new-value ) .syne .desc The .code refset function performs indexing into .meta sequence in a manner identical to .code ref with the purpose of overwriting the indexed element with .metn new-value . 
It is a companion function to .code ref which is used in the implementation of the .code ref place. The return value of .code refset is .metn new-value . If .meta sequence is a structure, it supports .code refset directly if it has a .code lambda-set method. This gets called with .meta index and .meta new-value as arguments. Then .meta new-value is returned. If a structure lacks a .code lambda-set method, then .code refset treats it as a list, traversing the structure using .cod3 car / cdr operations, and storing .meta new-value using .codn rplaca . In the absence of support for these operations, the function fails with an error exception. The .code refset function is not supported by search trees. The .code refset function is strict for out-of-range indices over all sequences, including lists. In the case of hashes, a .code refset of a nonexistent key creates the key. .coNP Accessor @ mref .synb .mets (mref < sequence << index *) .mets (set (mref < sequence << index +) << new-value ) .syne .desc The .code mref accessor provides a mechanism for invoking a curried function. Its name reflects its usefulness for multi-dimensional indexing into nested sequences. The associated .code mref place which makes the operator an accessor provides in-place replacement of values in multi-dimensional sequences. There are some restrictions on the .meta index arguments when .code mref is used as a place. The .meta sequence argument is not necessarily a sequence, but may be any object that can be called as a function with one argument. Except that .code call isn't a place, the expression .code "(mref x i)" is equivalent to .codn "(call x i)" : invoke the function/object .code x with argument .codn i . When multiple .meta index arguments are present, the return value of each previous application is expected to be another callable object, to which the next .meta index argument is applied. Thus .code "(mref x i j k)" is equivalent to .codn "(call (call (call x i) j) k)" .
This is also equivalent to .codn "[[[x i] j] k]" , provided that under the Lisp-1-style name resolution semantics of the DWIM brackets, the symbols .codn x , .codn i , .code j and .code k all resolve to bindings in the variable namespace. The expression .code "(mref x)" is not equivalent to .codn "(call x)" ; rather, it is equivalent to .codn x : there are no .meta index arguments and so the .code x object is taken as-is, not being applied to any index. In more detail, the .code mref function begins by taking .meta sequence as its accumulator object. Then if there are .meta index arguments, it iterates over them. At each iteration step, it replaces the accumulator by treating the accumulator as a callable object, applying it to the next .meta index value and taking the resulting value as the new accumulator. After the iteration, the accumulator becomes the return value of the function. When .code mref is used as a place, only the rightmost .meta index argument may be a range. If any other argument is a range object, the behavior is unspecified. When .code mref is used as a place, and there is only one .meta index which is a range object, then the .meta sequence expression is also required to be a place, if it denotes a list or range object. If there are no .meta index arguments then .meta sequence is unconditionally required to be a place. Note: the functions .code nested-vec and .code nested-vec-of may be used to create nested vectors which simulate multi-dimensional arrays.
.TP* Examples: .verb ;; Indexing: (let ((ar '((1 2 3) (4 5 6) (7 8 9)))) (mref ar 1 1)) --> 5 ;; Updating value in nested sequence: (let ((ar (vec (vec (vec 0 1 2 3) (vec 4 5 6 7)) (vec (vec 8 9 10 11) (vec 12 13 14 15))))) (set (mref ar 0 0 1..3) "AB") ar) --> #(#(#( 0 #\eA #\eB 3) #( 4 5 6 7)) #(#( 8 9 10 11) #(12 13 14 15))) ;; Invoking curried function: (let ((cf (lambda (x) (lambda (y) (lambda (z) (+ x y z)))))) [mref cf 1 2 3]) --> 6 .brev .coNP Function @ update .synb .mets (update < sequence << function ) .syne .desc The .code update function replaces each element of .metn sequence , or each value in a hash table, with the result of .meta function being applied to that element or value. The .meta sequence is returned. The .meta sequence may be a hash table. In that case, .meta function is invoked with each hash value, which is replaced with the function's return value. .coNP Functions @, remq @ remql and @ remqual .synb .mets (remq < object < sequence <> [ key-function ]) .mets (remql < object < sequence <> [ key-function ]) .mets (remqual < object < sequence <> [ key-function ]) .syne .desc The .codn remq , .code remql and .code remqual functions produce a new sequence based on .metn sequence , removing the elements whose associated keys are .codn eq , .code eql or .code equal to .metn object . The input .meta sequence is unmodified, but the returned sequence may share substructure with it. If no items are removed, it is possible that the return value is .meta sequence itself. If .meta key-function is omitted, then the element keys compared to .meta object are the elements themselves. Otherwise, .meta key-function is applied to each element and the resulting value is that element's key which is compared to .metn object .
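Note: the following sketch shows the three removal functions applied to small literal lists, along with the effect of a
.meta key-function
argument. The data values and the choice of
.code car
as the key function are assumptions made purely for illustration:
.verb
  ;; symbols compared with eq
  (remq 'a '(a b a c)) -> (b c)

  ;; integers compared with eql
  (remql 3 '(1 2 3 4 3)) -> (1 2 4)

  ;; strings compared with equal
  (remqual "b" '("a" "b" "c")) -> ("a" "c")

  ;; car of each element is the key compared to 1
  [remql 1 '((1 x) (2 y) (1 z)) car] -> ((2 y))
.brev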
.coNP Functions @, remq* @ remql* and @ remqual* .synb .mets (remq* < object << sequence ) .mets (remql* < object << sequence ) .mets (remqual* < object << sequence ) .syne .desc The .codn remq* , .code remql* and .code remqual* functions are lazy analogs of .codn remq , .code remql and .codn remqual . Rather than computing the entire new sequence prior to returning, these functions return a lazy list. Caution: these functions can still get into infinite looping behavior. For instance, in .codn "(remql* 0 (repeat '(0)))" , .code remql* will keep consuming the .code 0 values coming out of the infinite list, looking for the first item that does not have to be deleted, in order to instantiate the first lazy value. .TP* Examples: .verb ;; Return a list of all the natural numbers, excluding 13, ;; then take the first 100 of these. ;; If remql is used, it will loop until memory is exhausted, ;; because (range 1) is an infinite list. [(remql* 13 (range 1)) 0..100] .brev .coNP Functions @, keepq @ keepql and @ keepqual .synb .mets (keepq < object < sequence <> [ key-function ]) .mets (keepql < object < sequence <> [ key-function ]) .mets (keepqual < object < sequence <> [ key-function ]) .syne .desc The .codn keepq , .code keepql and .code keepqual functions produce a new sequence based on .metn sequence , removing the items whose keys are not .codn eq , .code eql or .code equal to .metn object . The input .meta sequence is unmodified, but the returned sequence may share substructure with it. If no items are removed, it is possible that the return value is .meta sequence itself. The optional .meta key-function is applied to each element from the .meta sequence to convert it to a key which is compared to .metn object . If .meta key-function is omitted, then each element itself of .meta sequence is compared to .metn object .
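Note: as a sketch of how the keeping functions mirror the removal functions, the following examples retain only the matching items. The data values and the use of
.code car
as the key function are illustrative assumptions:
.verb
  ;; only the items eq to a are kept
  (keepq 'a '(a b a c)) -> (a a)

  ;; only the items eql to 3 are kept
  (keepql 3 '(1 2 3 4 3)) -> (3 3)

  ;; car of each element is the key compared to "b"
  [keepqual "b" '(("a" 1) ("b" 2)) car] -> (("b" 2))
.brev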
.coNP Functions @, remove-if @, keep-if @, separate @ remove-if* and @ keep-if* .synb .mets (remove-if < predfun < sequence >> [ keyfun <> [ mapfun ]]) .mets (keep-if < predfun < sequence >> [ keyfun <> [ mapfun ]]) .mets (separate < predfun < sequence >> [ keyfun <> [ mapfun ]]) .mets (remove-if* < predfun < sequence >> [ keyfun <> [ mapfun ]]) .mets (keep-if* < predfun < sequence >> [ keyfun <> [ mapfun ]]) .syne .desc The .code remove-if function produces a sequence whose contents are those of .meta sequence but with those elements removed which satisfy .metn predfun . Those elements which are not removed appear in the same order. The result sequence may share substructure with the input sequence, and may even be the same sequence object if no items are removed. The optional .meta keyfun specifies how each element from the .meta sequence is transformed to an argument to .metn predfun . If this argument is omitted then the predicate function is applied to the elements directly, a behavior which is identical to .meta keyfun being .codn "(fun identity)" . The optional .meta mapfun argument specifies a function which is applied to the elements of .meta sequence that are identified for retention, mapping them to the actual values that are accumulated into the output. In the absence of this argument, the behavior is to accumulate the elements themselves. If .meta keyfun and .meta mapfun are the same object, it is unspecified whether .meta mapfun is called, or whether the result of .meta keyfun is used. The .code keep-if function is exactly like .codn remove-if , except the sense of the predicate is inverted. The function .code keep-if retains those items which .code remove-if will delete, and removes those that .code remove-if will preserve. 
The .code separate function combines .code keep-if and .code remove-if into one, returning a list of two elements whose .code car and .code cadr are the result of calling .code keep-if and .codn remove-if , respectively, on .meta sequence (with the .meta predfun and .meta keyfun arguments passed through). One of the two elements may share substructure with the input sequence, and may even be the same sequence object if all items are either kept or removed (in which case the other element will be .codn nil ). Note: the .code separate function may be understood in terms of the following reference implementation: .verb (defun separate (pred seq : (keyfun :)) [(juxt (op keep-if pred @1 keyfun) (op remove-if pred @1 keyfun)) seq]) .brev The .code remove-if* and .code keep-if* functions are like .code remove-if and .codn keep-if , but produce lazy lists. .TP* Examples: .verb ;; remove any element numerically equal to 3. (remove-if (op = 3) '(1 2 3 4 3.0 5)) -> (1 2 4 5) ;; remove those pairs whose first element begins with "abc" [remove-if (op equal [@1 0..3] "abc") '(("abcd" 4) ("defg" 5)) car] -> (("defg" 5)) ;; equivalent, without key function (remove-if (op equal [(car @1) 0..3] "abc") '(("abcd" 4) ("defg" 5))) -> (("defg" 5)) .brev .coNP Functions @ keep-keys-if and @ separate-keys .synb .mets (keep-keys-if < predfun < sequence >> [ keyfun <> [ mapfun ]]) .mets (separate-keys < predfun < sequence <> [ keyfun ]) .syne .desc The functions .code keep-keys-if and .code separate-keys are derived, respectively, from the functions .code keep-if and .codn separate , and have the same syntax and argument semantics. They differ in that rather than accumulating the elements of the input .metn sequence , they accumulate the transformed values of those elements, as projected through the .metn keyfun . If all arguments of .code keep-keys-if are specified, then it behaves exactly like .code keep-if for those same arguments.
The same is true if both the .meta keyfun and .meta mapfun arguments are omitted, or if .meta keyfun is specified as .codn identity . The difference between .code keep-keys-if and .code keep-if is the defaulting of the .meta mapfun argument. If .meta mapfun is omitted, then it defaults to being the same function as the .meta keyfun argument. In the case of .codn separate-keys , when .meta keyfun is omitted, thus defaulting to .codn identity , or else explicitly specified as .code identity or an equivalent function, the behavior is the same as that of .codn separate . .TP* Example: .verb ;; square the values 1 to 20, keeping the even squares [keep-keys-if evenp (range 1 20) square] -> (4 16 36 64 100 144 196 256 324 400) ;; square the values 1 to 20 separating into even and odd: [separate-keys evenp (range 1 20) square] -> ((4 16 36 64 100 144 196 256 324 400) (1 9 25 49 81 121 169 225 289 361)) ;; contrast with keep-if: values are of input sequence [keep-if evenp (range 1 20) square] -> (2 4 6 8 10 12 14 16 18 20) .brev .coNP Functions @, countqual @ countql and @ countq .synb .mets (countq < object << iterable ) .mets (countql < object << iterable ) .mets (countqual < object << iterable ) .syne .desc The .codn countq , .code countql and .code countqual functions count the number of objects in .meta iterable which are, respectively, .codn eq , .code eql or .code equal to .metn object , and return the count. .coNP Functions @ count and @ count-if .synb .mets (count < key < sequence >> [ testfun <> [ keyfun ]]) .mets (count-if < predfun < iterable <> [ keyfun ]) .syne .desc The .code count and .code count-if functions search through .meta sequence for items which match .metn key , or satisfy the predicate function .metn predfun , respectively. They return the number of matching or predicate-satisfying items. The .meta keyfun argument specifies a function which is applied to the elements of .meta sequence to produce the comparison key.
If this argument is omitted, then the untransformed elements of .meta sequence are examined. The .code count function's .meta testfun argument specifies the test function which is used to compare the comparison keys from .meta sequence to .metn key . If this argument is omitted, then the .code equal function is used. The .code count function returns the number of elements of .meta sequence whose comparison key (as retrieved by .metn keyfun ) matches the .meta key object, as compared by .metn testfun . The .code count-if function's .meta predfun argument specifies a predicate function which is applied to the successive comparison keys taken from .metn sequence . The function returns the number of keys for which .meta predfun returns true. .coNP Function @ cons-count .synb .mets (cons-count < obj < tree <> [ test-function ]) .syne .desc The .code cons-count function returns the number of times the object .meta obj occurs in the .code cons cell structure .metn tree , under the equality imposed by the .metn test-function . If the optional .meta test-function argument is omitted, it defaults to .codn equal . First, .meta obj and .meta tree are compared using .metn test-function . If they are equal, that counts as one occurrence. Then, if .meta tree is a .code cons cell, the function recurses over the .code car and .code cdr fields. The sum of all these counts is returned. .coNP Functions @, posq @ posql and @ posqual .synb .mets (posq < object << sequence ) .mets (posql < object << sequence ) .mets (posqual < object << sequence ) .syne .desc The .codn posq , .code posql and .code posqual functions return the zero-based position of the first item in .meta sequence which is, respectively, .codn eq , .code eql or .code equal to .metn object .
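.TP* Examples:

The following is a brief sketch of
.codn posq ,
.code posql
and
.code posqual
in use. The sample data is hypothetical, chosen here only for illustration;
the results shown are those which follow from the description above.

.verb
  ;; symbols and characters can be located with eq
  (posq 'c '(a b c d)) -> 2
  (posq #\ec "abc") -> 2

  ;; integers are located with eql
  (posql 3 '(1 2 3)) -> 2

  ;; strings and lists require equal
  (posqual "b" '("a" "b" "c")) -> 1
  (posqual '(1 2) '((0 1) (1 2))) -> 1
.brev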
.coNP Functions @ pos and @ pos-if .synb .mets (pos < key < sequence >> [ testfun <> [ keyfun ]]) .mets (pos-if < predfun < sequence <> [ keyfun ]) .syne .desc The .code pos and .code pos-if functions search through .meta sequence for an item which matches .metn key , or satisfies the predicate function .metn predfun , respectively. They return the zero-based position of the matching item. The .meta keyfun argument specifies a function which is applied to the elements of .meta sequence to produce the comparison key. If this argument is omitted, then the untransformed elements of .meta sequence are examined. The .code pos function's .meta testfun argument specifies the test function which is used to compare the comparison keys from .meta sequence to .metn key . If this argument is omitted, then the .code equal function is used. The .code pos function returns the position of the first element of .meta sequence whose comparison key (as retrieved by .metn keyfun ) matches .metn key , as compared by the .meta testfun function. If no such element is found, .code nil is returned. The .code pos-if function's .meta predfun argument specifies a predicate function which is applied to the successive comparison keys taken from .meta sequence by applying .meta keyfun to successive elements. The position of the first element for which .meta predfun yields true is returned. If no such element is found, .code nil is returned. .coNP Functions @, rposq @, rposql @, rposqual @ rpos and @ rpos-if .synb .mets (rposq < object << sequence ) .mets (rposql < object << sequence ) .mets (rposqual < object << sequence ) .mets (rpos < key < sequence >> [ testfun <> [ keyfun ]]) .mets (rpos-if < predfun < sequence <> [ keyfun ]) .syne .desc These functions are counterparts of .codn posq , .codn posql , .codn posqual , .code pos and .code pos-if which report the position of the rightmost matching item, rather than the leftmost.
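.TP* Examples:

The following sketch contrasts the leftmost-match functions
.code pos
and
.code pos-if
with their rightmost-match counterparts. The input lists are hypothetical
sample data; the results shown are those which follow from the descriptions
above.

.verb
  ;; leftmost versus rightmost element equal to 3
  (pos 3 '(1 2 3 4 3)) -> 2
  (rpos 3 '(1 2 3 4 3)) -> 4

  ;; leftmost versus rightmost odd element
  [pos-if oddp '(2 4 5 6 7)] -> 2
  [rpos-if oddp '(2 4 5 6 7)] -> 4

  ;; search by the first item of each pair, via keyfun
  [pos "b" '(("a" 1) ("b" 2)) equal car] -> 1

  ;; no match
  (pos 10 '(1 2 3)) -> nil
.brev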
.coNP Functions @ pos-max and @ pos-min .synb .mets (pos-max < sequence >> [ testfun <> [ keyfun ]]) .mets (pos-min < sequence >> [ testfun <> [ keyfun ]]) .syne .desc The .code pos-min and .code pos-max functions implement exactly the same algorithm; they differ only in their defaulting behavior with regard to the .meta testfun argument. If .meta testfun is not given, then the .code pos-max function defaults .meta testfun to the .code greater function, whereas .code pos-min defaults it to the .code less function. If .meta sequence is empty, both functions return .codn nil . Without a .meta testfun argument, the .code pos-max function finds the zero-based position index of the numerically maximum value occurring in .metn sequence , whereas .code pos-min without a .meta testfun argument finds the index of the minimum value. If a .meta testfun argument is given, the two functions are equivalent. The .meta testfun function must be callable with two arguments. If .meta testfun behaves like a greater-than comparison, then .code pos-max and .code pos-min return the index of the maximum element. If .meta testfun behaves like a less-than comparison, then the functions return the index of the minimum element. The .meta keyfun argument defaults to the .code identity function. Each element from .meta sequence is passed through this one-argument function, and the resulting value is used in its place. If a sequence contains multiple equivalent maxima, whether the position of the leftmost or rightmost such maximum is reported depends on whether .meta testfun compares for strict inequality, or whether it reports true for equal arguments also. Under the default .metn testfun , which is .codn greater , the .code pos-max function will return the position of the leftmost of a duplicate set of maximum elements. To find the rightmost of the maxima, the .code gequal function can be substituted. Analogous reasoning applies to other test functions.
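.TP* Examples:

The following sketch shows the defaulting behavior of
.code pos-max
and
.codn pos-min ,
the effect of a
.meta keyfun
and the effect of an explicit
.metn testfun .
The input data is hypothetical, chosen only for illustration, and
deliberately contains no duplicate extrema; the results shown are those
which follow from the description above.

.verb
  ;; index of numerically largest and smallest value
  (pos-max '(3 9 4)) -> 1
  (pos-min '(3 9 4)) -> 0

  ;; compare strings by length, skipping testfun with :
  [pos-max '("pear" "fig" "banana") : length] -> 2
  [pos-min '("pear" "fig" "banana") : length] -> 1

  ;; with an explicit less-than test, pos-max locates the minimum
  [pos-max '(3 9 4) less] -> 0
.brev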
.coNP Function @ subst .synb .mets (subst < old < new < seq >> [ testfun <> [ keyfun ]]) .syne .desc The .code subst function returns a sequence of the same type as .meta seq in which elements of .meta seq which match the .meta old object have been replaced with the .meta new object. To form the comparison keys, the elements of .meta seq are projected through the .meta keyfun function, which defaults to .codn identity , so the items themselves are used as keys by default. Keys are compared to the .meta old value using .metn testfun , which defaults to .codn equal . .TP* Examples: .verb (subst "brown" "black" #("how" "now" "brown" "cow")) -> #("how" "now" "black" "cow") ;; elements are converted to lower case to form keys [subst "brown" "black" #("how" "now" "BROWN" "cow") : downcase-str] -> #("how" "now" "black" "cow") ;; using < instead of equality, replace elements ;; greater than 5 with 0 [subst 5 0 '(1 2 3 4 5 6 7 8 9 10) <] -> (1 2 3 4 5 0 0 0 0 0) .brev .coNP Functions @, subq @ subql and @ subqual .synb .mets (subq < old < new << sequence ) .mets (subql < old < new << sequence ) .mets (subqual < old < new << sequence ) .syne .desc The .codn subq , .code subql and .code subqual functions return a sequence of the same kind as .meta sequence in which elements matching the .meta old object are replaced by the .meta new object. The matching elements are identified by comparing with .meta old using, respectively, the functions .codn eq , .codn eql , and .codn equal . .TP* Examples: .verb (subq #\eb #\ez "abc") -> "azc" (subql 1 3 #(0 1 2)) -> #(0 3 2) (subqual "are" "do" '#"how are you") -> ("how" "do" "you") .brev .coNP Function @ mismatch .synb .mets (mismatch < left-seq < right-seq >> [ testfun <> [ keyfun ]]) .syne .desc The .code mismatch function compares corresponding elements from the sequences .meta left-seq and .metn right-seq , returning the position at which the first mismatch occurs.
If the sequences are of the same length, and their corresponding elements are the same, then .code nil is returned. If one sequence is shorter than the other, and matches a prefix of the other, then the mismatching position returned is one position after the last element of the shorter sequence, the same value as its length. An empty sequence is a prefix of every sequence. The .meta keyfun argument defaults to the .code identity function. Each element of .meta left-seq and .meta right-seq is passed to .meta keyfun and the resulting value is used in its place. After being converted through .metn keyfun , items are then compared using .metn testfun , which must accept two arguments, and defaults to .codn equal . .coNP Function @ where .synb .mets (where < function << iterable ) .syne .desc If .meta iterable is a sequence, the .code where function returns a lazy list of the numeric indices of those of its elements which satisfy .metn function . The numeric indices appear in increasing order. If .meta iterable is a hash, the following special behavior applies: .code where returns a lazy list of keys which have values which satisfy .metn function . These keys are not subject to an order. The .meta function argument must be a function that can be called with one argument. For each element of .metn iterable , .meta function is called with that element as an argument. If a .cod2 non- nil value is returned, then the zero-based index of that element is added to a list. Finally, the list is returned. .coNP Function @ wheref .synb .mets (wheref << function ) .syne .desc The .code wheref function is a combinator related to the .code where function. The .code wheref function returns a function that takes one argument. When a sequence is passed to that function, it returns the index positions where the sequence elements satisfy the given .metn function , which must be capable of taking one argument.
Certain uses of .code where can be expressed more briefly using .codn wheref , according to the following equivalence: .verb (where f s) <--> [(wheref f) s] .brev .TP* Example: .verb ;; partition list of integers by odd, using where: [partition 0..10 (op where oddp)] --> ((0) (1 2) (3 4) (5 6) (7 8) (9)) ;; using wheref [partition 0..10 [wheref oddp]] --> ((0) (1 2) (3 4) (5 6) (7 8) (9)) .brev .coNP Functions @, whereq @ whereql and @ wherequal .synb .mets (whereq << object ) .mets (whereql << object ) .mets (wherequal << object ) .syne .desc The functions .codn whereq , .code whereql and .code wherequal are combinators related to the .code where function. The .code whereq function returns a function that takes one argument. When a sequence is passed to that function, it returns the index positions where the elements of the sequence are .code eq to .metn object . The .code whereql function differs only in that the test is .code eql rather than .codn eq , and the .code wherequal function uses .code equal equality. .TP* Example: .verb ;; indices where the string has a 'c', using where: (where (op eq #\ec) "abcabc") -> (2 5) ;; same, using whereq: [(whereq #\ec) "abcabc"] -> (2 5) .brev .coNP Function @ rmismatch .synb .mets (rmismatch < left-seq < right-seq >> [ testfun <> [ keyfun ]]) .syne .desc Similarly to .codn mismatch , the .code rmismatch function compares corresponding elements from the sequences .meta left-seq and .metn right-seq , returning the position at which the first mismatch occurs. All of the arguments have the same semantics as those of .codn mismatch . Unlike .codn mismatch , .code rmismatch compares the sequences right-to-left, finding the suffix which they have in common, rather than prefix. If the sequences match, then .code nil is returned. Otherwise, a negative index is returned giving the mismatching position, regarded from the end. If the sequences differ in their rightmost elements, then -1 is returned.
If they match only in the rightmost element, then -2 is returned, and so forth. .coNP Functions @ starts-with and @ ends-with .synb .mets (starts-with < short-seq < long-seq >> [ testfun <> [ keyfun ]]) .mets (ends-with < short-seq < long-seq >> [ testfun <> [ keyfun ]]) .syne .desc The .code starts-with and .code ends-with functions compare corresponding elements from sequences .meta short-seq and .metn long-seq . The .code starts-with function returns .code t if .meta short-seq is a prefix of .metn long-seq ; otherwise, it returns .codn nil . The .code ends-with function returns .code t if .meta short-seq is a suffix of .metn long-seq ; otherwise, it returns .codn nil . Elements from both sequences are mapped to comparison keys using .metn keyfun , which defaults to .codn identity . Comparison keys are compared using .metn testfun , which defaults to .codn equal . .coNP Function @ select .synb .mets (select < sequence >> { index-seq | << function }) .syne .desc The .code select function returns a sequence, of the same kind as .metn sequence , which consists of those elements of .meta sequence which are identified by the indices in .metn index-seq , which is required to be a sequence. If a .meta function argument is given instead of .metn index-seq , then .meta function is invoked with .meta sequence as its argument. The return value is then taken as if it were the .meta index-seq argument. If .meta sequence is a sequence, then .meta index-seq consists of numeric indices. The length of the sequence, as reported by the .code length function, is added to every .meta index-seq value which is negative. The .code select function stops collecting values upon encountering an index value which is greater than or equal to the length of the sequence. (Rationale: without this strict behavior, .code select would not be able to terminate if .meta index-seq is infinite.)
If .meta sequence is, more specifically, a list-like sequence, then .meta index-seq must contain monotonically increasing numeric values, even if no value is out of range, since the .code select function makes a single pass through the list based on the assumption that indices are ordered. (Rationale: optimization.) This requirement for monotonicity applies to the values which result after negative indices are displaced by the sequence length. Also, in this list-like sequence case, values taken from .meta index-seq which are still negative after being displaced by the sequence length are ignored. If .meta sequence is a hash, then .meta index-seq is a list of keys. A new hash is returned which contains those elements of .meta sequence whose keys appear in .metn index-seq . All of .meta index-seq is processed, even if it contains keys which are not in .metn sequence . The nonexistent keys are ignored. The .code select function also supports objects of type .codn carray , in a manner similar to vectors. The indicated elements are extracted from the input sequence, and a new .code carray is returned whose storage is initialized by converting the extracted values back to the foreign representation. .coNP Function @ reject .synb .mets (reject < sequence >> { index-seq | << function }) .syne .desc The .code reject function returns a sequence, of the same kind as .metn sequence , which consists of all those elements of .meta sequence which are not identified by the indices in .metn index-seq , which may be a list or a vector. If .meta function is given instead of .metn index-seq , then .meta function is invoked with .meta sequence as its argument. The return value is then taken as if it were the .meta index-seq argument. If .meta sequence is a hash, then .meta index-seq represents a list of keys. The .code reject function returns a duplicate of the hash, in which the keys specified in .meta index-seq do not appear.
Otherwise if .meta sequence is a vector-like sequence, then the behavior of .code reject may be understood by the following equivalence: .verb (reject seq idx) --> (make-like [apply append (split* seq idx)] seq) .brev where it is to be understood that .meta seq is evaluated only once. If .meta sequence is a list, then, similarly, the following equivalence applies: .verb (reject seq idx) --> (make-like [apply append* (split* seq idx)] seq) .brev The input sequence is split into pieces at the indicated indices, such that the elements at the indices are removed and do not appear in the pieces. The pieces are then appended together in order, and the resulting list is coerced into the same type of sequence as the input sequence. .coNP Function @ relate .synb .mets (relate < domain-seq < range-seq <> [ default-val ]) .syne .desc The .code relate function returns a one-argument function which implements the relation formed by mapping the elements of .meta domain-seq to the positionally corresponding elements of .metn range-seq . That is to say, the function searches through the sequence .meta domain-seq to determine the position where its argument occurs, using .code equal as the comparison function. Then it returns the element from that position in the .meta range-seq sequence. This returned function is called the .IR "relation function" . If the relation function's argument is not found in .metn domain-seq , then the behavior depends on the optional parameter .metn default-val . If an argument is given for .metn default-val , then the relation function returns that value. Otherwise, the relation function returns its argument. 
Note: the .code relate function may be understood in terms of the following equivalences: .verb (relate d r) <--> (lambda (arg) (iflet ((p (posqual arg d))) [r p] arg)) (relate d r v) <--> (lambda (arg) (iflet ((p (posqual arg d))) [r p] v)) .brev Note: .code relate may return a hash table instead of a function, if such an object can satisfy the semantics required by the arguments. .TP* Examples: .verb (mapcar (relate "_" "-") "foo_bar") -> "foo-bar" (mapcar (relate "0123456789" "ABCDEFGHIJ" "X") "139D-345") -> "BDJXXDEF" (mapcar (relate '(nil) '(0)) '(nil 1 2 nil 4)) -> (0 1 2 0 4) .brev .coNP Function @ in .synb .mets (in < sequence < key >> [ testfun <> [ keyfun ]]) .mets (in < hash << key ) .syne .desc The .code in function tests whether .meta key is found inside .meta sequence or .metn hash . If the .meta testfun argument is specified, it specifies the function which is used to compare the comparison keys from the sequence to .metn key . Otherwise the .code equal function is used. If the .meta keyfun argument is specified, it specifies a function which is applied to the elements of .meta sequence to produce the comparison keys. Without this argument, the elements themselves are taken as the comparison keys. If the object being searched is a hash, then if neither of the arguments .meta keyfun nor .meta testfun is specified, .code in performs a hash lookup for .metn key , returning .code t if the key is found, .code nil otherwise. If either of .meta keyfun or .meta testfun is specified, then .code in performs an exhaustive search of the hash table, as if it were a sequence of .code cons cells whose .code car fields are keys, and whose .code cdr fields are values. Thus to search by key, the .code car function must be specified as .metn keyfun . The .code in function returns .code t if it finds .meta key in .meta sequence or .metn hash , otherwise .codn nil .
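.TP* Examples:

The following sketch illustrates the
.code in
function applied to a list, a string, a list of pairs searched through a
.metn keyfun ,
and a hash. The sample data, including the hash literal, is hypothetical
and chosen only for illustration; the results shown are those which follow
from the description above.

.verb
  ;; membership in a list and in a string
  (in '(1 2 3) 2) -> t
  (in "abc" #\ez) -> nil

  ;; compare the car of each element to the key, using eq
  [in '((a 1) (b 2)) 'b eq car] -> t

  ;; hash lookup by key: no testfun or keyfun given
  (in #H(() (a 1) (b 2)) 'a) -> t
  (in #H(() (a 1) (b 2)) 'z) -> nil
.brev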
.coNP Function @ partition .synb .mets (partition < sequence >> { index-seq | < index | << function }) .syne .desc If .meta sequence is empty, then .code partition returns an empty list, and the second argument is ignored; if it is .metn function , it is not called. Otherwise, .code partition returns a lazy list of partitions of .metn sequence . Partitions are consecutive, non-overlapping, nonempty substrings of .metn sequence , of the same kind as .metn sequence , such that if these substrings are catenated together in their order of appearance, a sequence .code equal to the original is produced. If the second argument is of the form .metn index-seq , or if an .meta index-seq was produced from the .meta index or .meta function arguments, each value in that sequence must be an integer. Each integer value which is nonnegative specifies the index position given by its value. Each integer value which is negative specifies an index position given by adding the length of .meta sequence to its value. The sequence index positions thus denoted by .meta index-seq shall be strictly nondecreasing. Each successive element is expected to designate an index position at least as high as all previous elements, otherwise the behavior is unspecified. Index values which are still negative after the addition of the sequence length are ignored, as are index values greater than the sequence length. Nondecreasing means that repeated values are permitted; they have the same effect as a single value. If .meta index-seq is empty then a one-element list containing the entire .meta sequence is returned. If .meta index-seq is an infinite lazy list, the function shall terminate if that list eventually produces an index position which is greater than or equal to the length of .metn sequence . 
If the second argument is a function, then this function is applied to .metn sequence , and the return value of this call is then used in place of the second argument, which must either be a single index value, which is then taken as if it were the .meta index argument, or else a sequence of indices, which are taken as the .meta index-seq argument. If the second argument is neither a sequence, nor a function, then it is assumed to be an integer index, and is turned into an .meta index-seq sequence containing one element. After the .meta index-seq is obtained as an argument, or determined from the .meta index or .meta function arguments, the .code partition function then divides .meta sequence according to the indices. The first partition begins with the first element of .metn sequence . The second partition begins at the first position in .metn index-seq , and so on. Indices beyond the length of the sequence are ignored, as are indices less than or equal to zero. .TP* Examples: .verb (partition '(1 2 3) 1) -> ((1) (2 3)) ;; split the string where there is a "b" (partition "abcbcbd" (op where (op eql #\eb))) -> ("a" "bc" "bc" "bd") .brev .coNP Functions @ split and @ split* .synb .mets (split < sequence >> { index-seq | < index | << function }) .mets (split* < sequence >> { index-seq | < index | << function }) .syne .desc If .meta sequence is empty, then both .code split and .code split* return an empty list, and the second argument is ignored; if it is .metn function , it is not called. Otherwise, .code split returns a lazy list of pieces of .metn sequence : consecutive, non-overlapping, possibly empty substrings of .metn sequence , of the same kind as .metn sequence . A catenation of these pieces in the order they appear would produce a sequence that is .code equal to the original sequence. The .code split* function differs from .code split in that the elements indicated by the split indices are removed. 
The .metn index , .metn index-seq , and .meta function arguments are subject to the same restrictions and treatment as the corresponding arguments of the .code partition function, with the following difference: the index positions indicated by .code index-seq are required to be strictly increasing, rather than nondecreasing. As with .codn partition , this consideration applies to the transformed indices, after the displacement of negative values by the length of the sequence. If any element of .meta index-seq is not higher than the previous element, the behavior is unspecified. If the second argument is of the form .metn index-seq , or if an .meta index-seq was produced from the .meta index or .meta function arguments, then the .code split function divides .meta sequence according to the indices indicated in .metn index-seq . The first piece always begins with the first element of .metn sequence . Each subsequent piece begins with the position indicated by an element of .metn index-seq . Negative indices are ignored. If .meta index-seq includes index zero, then an empty first piece is generated. If .meta index-seq includes an index greater than or equal to the length of .meta sequence (equivalently, an index beyond the last element of the sequence) then an additional empty last piece is generated. The length of .meta sequence is added to any negative indices. An index which is still negative after being thus displaced is discarded. Note: the principal difference between .code split and .code partition is that .code partition does not produce empty pieces. 
.TP* Examples: .verb (split '(1 2 3) 1) -> ((1) (2 3)) (split "abc" 0) -> ("" "abc") (split "abc" 3) -> ("abc" "") (split "abc" 1) -> ("a" "bc") (split "abc" '(0 1 2 3)) -> ("" "a" "b" "c" "") (split "abc" '(1 2)) -> ("a" "b" "c") (split "abc" '(-1 1 2 15)) -> ("a" "b" "c") ;; triple split at 1 makes two additional empty pieces (split "abc" '(1 1 1)) -> ("a" "" "" "bc") (split* "abc" 0) -> ("" "bc") ;; "a" is removed ;; all characters removed (split* "abc" '(0 1 2)) -> ("" "" "" "") .brev .coNP Function @ partition* .synb .mets (partition* < sequence >> { index-seq | < index | << function }) .syne .desc If .meta sequence is empty, then .code partition* returns an empty list, and the second argument is ignored; if it is .metn function , it is not called. The .metn index , .metn index-seq , and .meta function arguments are subject to the same restrictions and treatment as the corresponding arguments of the .code partition function, with the following difference: the index positions indicated by .meta index-seq are required to be strictly increasing, rather than nondecreasing. If the second argument is of the form .metn index-seq , then .code partition* produces a lazy list of pieces taken from .metn sequence . The pieces are formed by deleting from .meta sequence the elements at the positions given in .metn index-seq , such that the pieces are the remaining nonempty substrings from between the deleted elements, maintaining their order. If .meta index-seq is empty then a one-element list containing the entire .meta sequence is returned.
.TP* Examples: .verb (partition* '(1 2 3 4 5) '(0 2 4)) -> ((2) (4)) (partition* "abcd" '(0 3)) -> ("bc") (partition* "abcd" '(0 1 2 3)) -> nil .brev .coNP Functions @, find @ find-if and @ find-true .synb .mets (find < key < sequence >> [ testfun <> [ keyfun ]]) .mets (find-if < predfun >> { sequence | << hash } <> [ keyfun ]) .mets (find-true < predfun >> { sequence | << hash } <> [ keyfun ]) .syne .desc The .code find and .code find-if functions search through a sequence for an item which matches a key, or satisfies a predicate function, respectively. The .code find-true function is a variant of .code find-if which returns the value of the predicate function instead of the item. The .meta keyfun argument specifies a function which is applied to the elements of .meta sequence to produce the comparison key. If this argument is omitted, then the untransformed elements of the .meta sequence are searched. The .code find function's .meta testfun argument specifies the test function which is used to compare the comparison keys from .meta sequence to the search key. If this argument is omitted, then the .code equal function is used. The first element from the sequence whose comparison key (as retrieved by .metn keyfun ) matches the search key (under .metn testfun ) is returned. If no such element is found, .code nil is returned. The .code find-if function's .meta predfun argument specifies a predicate function which is applied to the successive comparison keys pulled from the sequence by applying .meta keyfun to successive elements. The first element for which .meta predfun yields true is returned. If no such element is found, .code nil is returned. In the case of .codn find-if , a hash table may be specified instead of a sequence. The .meta hash is treated as if it were a sequence of hash key and hash value pairs represented as cons cells, the .code car slots of which are the hash keys, and the .code cdr slots of which are the hash values.
If the caller doesn't specify a .meta keyfun then these cells are taken as their keys. The .code find-true function's argument conventions and search semantics are identical to those of .codn find-if , but the return value is different. Instead of returning the found item, .code find-true returns the value which .meta predfun returned for the found item's key. .coNP Functions @ rfind and @ rfind-if .synb .mets (rfind < key < sequence >> [ testfun <> [ keyfun ]]) .mets (rfind-if < predfun >> { sequence | << hash } <> [ keyfun ]) .syne .desc The .code rfind and .code rfind-if functions are almost exactly like .code find and .code find-if except that if there are multiple matches for .meta key in .metn sequence , they return the rightmost element rather than the leftmost. In the case of .code rfind-if when a .meta hash is specified instead of a .metn sequence , the function searches through the hash entries in the same order as .codn find-if , but finds the last match rather than the first. Note: hashes are inherently not ordered; the relative order of items in a hash table can change when other items are inserted or deleted. .coNP Functions @ find-max and @ find-min .synb .mets (find-max < iterable >> [ testfun <> [ keyfun ]]) .mets (find-min < iterable >> [ testfun <> [ keyfun ]]) .syne .desc The .code find-min and .code find-max functions implement exactly the same algorithm; they differ only in their defaulting behavior with regard to the .meta testfun argument. If .meta testfun is not given, then the .code find-max function defaults it to the .code greater function, whereas .code find-min defaults it to the .code less function. Without a .meta testfun argument, the .code find-max function finds the numerically maximum value occurring in .metn iterable , whereas .code find-min without a .meta testfun argument finds the minimum value. If a .meta testfun argument is given, the two functions are equivalent.
The .meta testfun function must be callable with two arguments. If .meta testfun behaves like a greater-than comparison, then .code find-max and .code find-min both return the maximum element. If .meta testfun behaves like a less-than comparison, then the functions return the minimum element. The .meta keyfun argument defaults to the .code identity function. Each element from .meta iterable is passed through this one-argument function, and the resulting value is used in its place for the purposes of the comparison. However, the original element is returned. If there are multiple equivalent maxima, then under the default .metn testfun , that being .codn greater , the first one encountered while traversing .meta iterable is the one that is reported. See the notes under .code pos-max regarding duplicate maxima. .coNP Functions @ find-maxes and @ find-mins .synb .mets (find-maxes < iterable >> [ testfun <> [ keyfun ]]) .mets (find-mins < iterable >> [ testfun <> [ keyfun ]]) .syne .desc The .code find-maxes and .code find-mins functions have the same argument conventions as, respectively, .code find-max and .codn find-min . These functions differ in that they return a sequence of all the elements of .meta iterable which maximize (or, in the case of .codn find-mins , minimize) the value of .metn keyfun . The returned sequence is of the same kind as .metn iterable . .coNP Functions @ find-max-key and @ find-min-key .synb .mets (find-max-key < iterable >> [ testfun <> [ keyfun ]]) .mets (find-min-key < iterable >> [ testfun <> [ keyfun ]]) .syne .desc The .code find-max-key and .code find-min-key functions have the same argument conventions as, respectively, .code find-max and .code find-min and agree with those functions in regard to which element of the input sequence is identified: all these functions identify the element which maximizes or minimizes the value of .metn keyfun .
Whereas .code find-max and .code find-min return the maximizing or minimizing element itself, the .code find-max-key and .code find-min-key functions return the value of .meta keyfun applied to the element. Under the default .meta keyfun value, that being the .code identity function, these functions behave the same as .code find-max and .codn find-min . .coNP Functions @, uni @, isec @, isecp @ diff and @ symdiff .synb .mets (uni < iter1 < iter2 >> [ testfun <> [ keyfun ]]) .mets (isec < iter1 < iter2 >> [ testfun <> [ keyfun ]]) .mets (isecp < iter1 < iter2 >> [ testfun <> [ keyfun ]]) .mets (diff < iter1 < iter2 >> [ testfun <> [ keyfun ]]) .mets (symdiff < iter1 < iter2 >> [ testfun <> [ keyfun ]]) .syne .desc The functions .codn uni , .codn isec , .code diff and .code symdiff treat the sequences .meta iter1 and .meta iter2 as if they were sets. They, respectively, compute the set union, set intersection, set difference and symmetric difference of .meta iter1 and .metn iter2 , returning a new sequence. The .code isecp function is Boolean: it returns .code t for those arguments for which .code isec returns a non-empty list, otherwise .codn nil . The arguments .meta iter1 and .meta iter2 need not be of the same kind. They may be hash tables. The returned sequence is of the same kind as .metn iter1 . If .meta iter1 is a hash table, the returned sequence is a list. For the purposes of these functions, an input which is a hash table is considered as if it were a sequence of hash key and hash value pairs represented as cons cells, the .code car slots of which are the hash keys, and the .code cdr slots of which are the hash values. This means that if no .meta keyfun is specified, these pairs are taken as keys. Since the input sequences are defined as representing sets, they are assumed not to contain duplicate elements. These functions are not required to, but may, de-duplicate the sequences.
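As a small illustration of the set interpretation, consider inputs which
share some elements, with
.meta testfun
and
.meta keyfun
omitted. This is a sketch only: the order of elements shown for
.code uni
is an assumption about the implementation and not a documented guarantee.
.verb
  (uni '(1 2 3) '(3 4 5)) -> (1 2 3 4 5)
  (isec '(1 2 3) '(3 4 5)) -> (3)
  (isecp '(1 2 3) '(3 4 5)) -> t
  (isecp '(1 2 3) '(4 5)) -> nil
  (diff '(1 2 3) '(3 4 5)) -> (1 2)

  ;; the result is of the same kind as the first argument,
  ;; in this case a string
  (diff "abcde" "bd") -> "ace"
.brev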
The union sequence produced by .code uni contains all of the elements which occur in either .meta iter1 or .metn iter2 . If a given element occurs exactly once only in .meta iter1 or exactly once only in .metn iter2 , or exactly once in both sequences, then it occurs exactly once in the union sequence. If a given element occurs at least once in either .metn iter1 , .meta iter2 or both, then it occurs at least once in the union sequence. The intersection sequence produced by .code isec contains all of the elements which occur in both .meta iter1 and .metn iter2 . If a given element occurs exactly once in .meta iter1 and exactly once in .metn iter2 , then it occurs exactly once in the intersection sequence. If a given element occurs at least once in .meta iter1 and at least once in .metn iter2 , then it occurs at least once in the intersection sequence. The difference sequence produced by .code diff contains all of the elements which occur in .meta iter1 but do not occur in .metn iter2 . If an element occurs exactly once in .meta iter1 and does not occur in .metn iter2 , then it occurs exactly once in the difference sequence. If an element occurs at least once in .meta iter1 and does not occur in .metn iter2 , then it occurs at least once in the difference sequence. If an element occurs at least once in .metn iter2 , then it does not occur in the difference sequence. The symmetric difference sequence produced by .code symdiff contains all of the elements of .meta iter1 which do not occur in .meta iter2 and vice versa: it also contains all of the elements of .meta iter2 which do not occur in .metn iter1 . Element equivalence is determined by a combination of .meta testfun and .metn keyfun . Elements are compared pairwise, and each element of a pair is passed through the .meta keyfun function to produce a comparison value. The comparison values are compared using .metn testfun .
If .meta keyfun is omitted, then the untransformed elements themselves are compared, and if .meta testfun is omitted, then the .code equal function is used. Note: a function similar to .code diff named .code set-diff exists. This became deprecated starting in \*(TX 184. For the .code set-diff function, the requirement was specified to preserve the original order of items from .meta iter1 that survive into the output sequence. This requirement is not documented for the .code diff function, but is de facto honored by the implementation for as long as the .code set-diff synonym continues to be available. The .code set-diff function doesn't support hash tables and is inefficient for vectors and strings. Note: these functions are not efficient for the processing of hash tables, even when both inputs are hashes, the .meta keyfun argument is .codn car , and .meta testfun matches the equality used by both hash-table inputs. If applicable, the operations .codn hash-uni , .code hash-isec and .code hash-diff should be used instead. .coNP Functions @, mapcar @, map @, mappend @ mapcar* and @ mappend* .synb .mets (mapcar < function << iterable *) .mets (map < function << iterable *) .mets (mappend < function << iterable *) .mets (mapcar* < function << iterable *) .mets (mappend* < function << iterable *) .syne .desc When given only one argument, the .code mapcar function returns .codn nil . .meta function is never called. When given two arguments, the .code mapcar function applies .meta function to each element of .meta iterable and returns a sequence of the resulting values in the same order as the original values. The returned sequence is the same kind as .metn iterable , if possible. If the accumulated values cannot be elements of that type of sequence, then a list is returned.
When additional sequences are given as arguments, this filtering behavior is generalized in the following way: .code mapcar traverses the sequences in parallel, taking a value from each sequence as an argument to the function. If there are two lists, .meta function is called with two arguments and so forth. The traversal is limited by the length of the shortest sequence. The return values of the function are collected into a new sequence which is returned. The returned sequence is of the same kind as the leftmost input sequence, unless the accumulated values cannot be elements of that type of sequence, in which case a list is returned. The functions .code mapcar and .code map are synonyms. The .code mappend function works like .codn mapcar , with the following difference. Rather than accumulating the values returned by the function into a sequence, mappend expects the items returned by the function to be sequences which are catenated with .codn append , and the resulting sequence is returned. The returned sequence is of the same kind as the leftmost input sequence, unless the values cannot be elements of that type of sequence, in which case a list is returned. The .code mapcar* and .code mappend* functions work like .code mapcar and .codn mappend , respectively. However, they return lazy lists rather than generating the entire output list prior to returning. .TP* Caveats: Like .codn mappend , .code mappend* must "consume" empty lists. For instance, if the function being mapped puts out a sequence of .codn nil s, then the result must be the empty list .codn nil , because .code "(append nil nil nil nil ...)" is .codn nil . But suppose that .code mappend* is used on inputs which are infinite lazy lists, such that the function returns .code nil values indefinitely. For instance: .verb ;; Danger: infinite loop!!! 
(mappend* (fun identity) (repeat '(nil))) .brev The .code mappend* function is caught in a loop trying to consume and squash an infinite stream of .codn nil s, and so doesn't return. .TP* Examples: .verb ;; multiply every element by two (mapcar (lambda (item) (* 2 item)) '(1 2 3)) -> (2 4 6) ;; "zipper" two lists together (mapcar (lambda (le ri) (list le ri)) '(1 2 3) '(a b c)) -> ((1 a) (2 b) (3 c)) ;; like append, mappend allows a lone atom or a trailing atom: (mappend (fun identity) 3) -> (3) (mappend (fun identity) '((1) 2)) -> (1 . 2) ;; take just the even numbers (mappend (lambda (item) (if (evenp item) (list item))) '(1 2 3 4 5)) -> (2 4) .brev .coNP Functions @, maprod @ maprend and @ maprodo .synb .mets (maprod < function << iterable *) .mets (maprend < function << iterable *) .mets (maprodo < function << iterable *) .syne .desc The .codn maprod , .code maprend and .code maprodo functions resemble .codn mapcar , .code mappend and .codn mapdo , respectively. When given no .meta iterable arguments or exactly one .meta iterable argument, they behave exactly like those three functions. When two or more .meta iterable arguments are present, .code maprod differs from .code mapcar in the following way, as do the remaining functions from their aforementioned counterparts. Whereas .code mapcar iterates over the .meta iterable values in parallel, taking successive tuples of element values and passing them to .metn function , the .code maprod function iterates over all .I combinations of elements from the sequences: the Cartesian product. The .code prod suffix stands for "product". If one or more .meta iterable arguments specify an empty sequence, then the Cartesian product is empty. In this situation, .meta function is not called. The result of the function is then .code nil converted to the same kind of sequence as the leftmost .metn iterable . The .code maprod function collects the values into a list just as .code mapcar does.
Just like .codn mapcar , it converts the resulting list into the same kind of sequence as the leftmost .meta iterable argument, if possible. For instance, if the resulting list is a list or vector of characters, and the leftmost .meta iterable is a character string, then the list or vector of characters is converted to a character string and returned. The .code maprend function ("map product through function and append") iterates the .meta iterable element combinations exactly like .codn maprod , passing them as arguments to .metn function . The values returned by .meta function are then treated exactly as by the .code mappend function. The return values are expected to be sequences which are appended together as if by .codn append , and the final result is converted to the same kind of sequence as the leftmost .meta iterable if possible. The .code maprodo function, like .codn mapdo , ignores the result of .meta function and returns .codn nil . The combination iteration gives priority to the rightmost .metn iterable , which means that the rightmost element of each generated tuple varies fastest: the tuples are traversed in "rightmost major" order. This is made clear in the examples. .TP* Examples .verb [maprod list '(0 1 2) '(a b) '(i ii iii)] -> ((0 a i) (0 a ii) (0 a iii) (0 b i) (0 b ii) (0 b iii) (1 a i) (1 a ii) (1 a iii) (1 b i) (1 b ii) (1 b iii) (2 a i) (2 a ii) (2 a iii) (2 b i) (2 b ii) (2 b iii)) ;; Vectors #(#\ea #\ex) #(#\ea #\ey) ... are appended ;; together resulting in #(#\ea #\ex #\ea #\ey ...) ;; which is converted to a string: [maprend vec "ab" "xy"] -> "axaybxby" ;; One of the sequences is empty, so the product is an ;; empty sequence of the same kind as the leftmost ;; sequence argument, thus an empty string: [maprend vec "ab" ""] -> "" .brev .coNP Function @ mapdo .synb .mets (mapdo < function << iterable *) .syne .desc The .code mapdo function is similar to .codn mapcar , but always returns .codn nil . 
It is useful when .meta function performs some kind of side effect, hence the "do" in the name, which is a mnemonic for the execution of imperative actions. When only the .meta function argument is given, .meta function is never called, and .code nil is returned. If a single .meta iterable argument is given, then .code mapdo iterates over .metn iterable , invoking .meta function on each element. If two or more .meta iterable arguments are given, then .code mapdo iterates over the sequences in parallel, extracting parallel tuples of items. These tuples are passed as arguments to .metn function , which must accept as many arguments as there are sequences. .coNP Functions @ transpose and @ zip .synb .mets (transpose << iterable ) .mets (zip << iterable *) .syne .desc The .code transpose function performs a transposition on .metn iterable . This means that the elements of .meta iterable must be iterable. These iterables are understood to be columns; transpose exchanges rows and columns, returning a sequence of the rows which make up the columns. The returned sequence is of the same kind as .metn iterable . The rows are also the same kind of sequence as the first element of the original sequence, if possible, otherwise they are lists. The number of rows returned is limited by the shortest column among the sequences. The .code zip function takes variable arguments, and is equivalent to calling .code transpose on a list of the arguments. The following equivalences hold: .verb (zip . x) <--> (transpose x) [apply zip x] <--> (transpose x) .brev A special requirement applies when the first argument of .code zip or the first element of the .meta iterable argument of .code transpose is a string. In this situation, the tuples which emerge are strings, if possible. The special requirement is that column elements which are strings are treated as individual items and appended to the row strings. 
For example, .code "(zip \(dqab\(dq #(\(dqrst\(dq \(dqxyz\(dq))" produces .codn "(\(dqarst\(dq \(dqbxyz\(dq)" , rather than .codn "((\(dqa\(dq \(dqrst\(dq) (\(dqb\(dq \(dqxyz\(dq))" . .TP* Examples: .verb ;; transpose list of lists (transpose '((a b c) (c d e))) -> ((a c) (b d) (c e)) ;; transpose vector of strings: ;; - string columns become string rows ;; - vector input becomes vector output (transpose #("abc" "def" "ghij")) -> #("adg" "beh" "cfi") ;; error: transpose wants to make a list of strings ;; but 1 is not a character (transpose #("abc" "def" '(1 2 3))) ;; error! ;; String elements are catenated: (transpose #("abc" "def" ("UV" "XY" "WZ"))) -> #("adUV" "beXY" "cfWZ") ;; Transpose list of ranges (transpose (list 1..4 4..8 8..12)) -> ((1 4 8) (2 5 9) (3 6 10)) (zip '(a b c) '(c d e)) -> ((a c) (b d) (c e)) .brev .coNP Functions @, window-map @ window-mappend and @ window-mapdo .synb .mets (window-map < range < boundary < function << sequence ) .mets (window-mappend < range < boundary < function << sequence ) .mets (window-mapdo < range < boundary < function << sequence ) .syne .desc The .code window-map and .code window-mappend functions process the elements of .meta sequence by passing arguments derived from each successive element to .metn function . Both functions return, if possible, a sequence of the same kind as .metn sequence , otherwise a list. Under .codn window-map , values returned by .meta function are accumulated into a sequence of the same type as .meta sequence and that sequence is returned. Under .codn window-mappend , the values returned by the calls to .meta function are expected to be sequences which are appended together to form the output sequence. These functions are analogous to .code mapcar and .codn mappend . Unlike these, they operate only on a single sequence, and over this sequence they perform a .IR "sliding window mapping" , whose description follows.
The function .code window-mapdo avoids accumulating a sequence, and instead returns .codn nil ; it is analogous to .codn mapdo . The argument to the .meta range parameter must be a positive integer, not exceeding 512. This parameter specifies the amount of ahead/behind context on either side of each element which is processed. It indirectly determines the window size for the mapping. The window size is twice .metn range , plus one. For instance if range is 2, then the window size is 5: the element being processed lies at the center of the window, flanked by two elements on either side, making five. The .meta function argument must specify a function which accepts a number of arguments corresponding to the window size. For instance if .meta range is 2, making the window size 5, then .meta function must accept 5 arguments. These arguments constitute the sliding window being processed. Each time .meta function is called, the middle argument is the element being processed, and the arguments surrounding it are its window. When an element is processed from somewhere in the interior of a sequence, where it is flanked on either side by at least .meta range elements, then the window is populated by those flanking elements taken from .metn sequence . The .meta boundary parameter specifies the window contents which are used for the processing of elements which are closer than .meta range to either end of the sequence. Except if it is of list type, .meta boundary must be a sequence containing at least twice .meta range number of elements (one less than the window size): if it has additional elements, they are not used. If it is a list, it may be shorter than twice .metn range ; in this case, the value .code nil is substituted for the missing elements. The argument may also be one of the two keyword symbols .code :wrap or .codn :reflect , described below. If .meta boundary is a sequence, it may be regarded as divided into two pieces of .meta range length. 
These two pieces then flank .meta sequence on either end. The left half of .meta boundary is effectively prepended to the sequence, and the right half effectively appended. When the sliding window extends beyond the boundary of .meta sequence near its start or end, the window is populated from these flanking elements obtained from .metn boundary . If the .meta boundary argument is specified as the keyword .codn :wrap , then the sequence is imagined to be flanked at either end by an infinite repetition of copies of itself. These flanks are trimmed to the window size to generate the boundary. For instance if the sequence is .code "(1 2 3)" and the window size is 9 due to the value of .meta range being 4, then the behavior of .code :wrap is as if a .meta boundary value of .code "(3 1 2 3 1 2 3 1)" were specified. The left flank is .codn "(3 1 2 3)" , being the last four elements of an infinite repetition of .codn "1 2 3" ; and the right flank is similarly .codn "(1 2 3 1)" , being the first four elements of an infinite repetition of .codn "1 2 3" . If .meta boundary is given as the keyword .codn :reflect , then the sequence is imagined to be flanked at either end by an infinite repetition of reversed copies of itself. These flanks are trimmed to the window size to generate the boundary. For instance if the sequence is .code "(1 2 3)" and the window size is 9 due to the value of .meta range being 4, then the behavior of .code :reflect is as if a .meta boundary value of .code "(1 3 2 1 3 2 1 3)" were specified. The left flank is .codn "(1 3 2 1)" , being the last four elements of an infinite repetition of .codn "3 2 1" ; and the right flank is similarly .codn "(3 2 1 3)" , being the first four elements of an infinite repetition of .codn "3 2 1" . .TP* Examples: .verb ;; change characters between angle brackets to upper case.
[window-map 1 nil (lambda (x y z) (if (and (eq x #\e<) (eq z #\e>)) (chr-toupper y) y)) "ab<c>de<f>g"] --> "ab<C>de<F>g" ;; collect all numbers which are the centre element of ;; a monotonically increasing triplet [window-mappend 1 :reflect (lambda (x y z) (if (< x y z) (list y))) '(1 2 1 3 4 2 1 9 7 5 7 8 5)] --> (3 7) ;; calculate a moving average with a five-element ;; window, flanked by zeros at the boundaries: [window-map 2 #(0 0 0 0) (lambda (. args) (/ (sum args) 5)) #(4 7 9 13 5 1 6 11 10 3 8)] --> #(4.0 6.6 7.6 7.0 6.8 7.2 6.6 6.2 7.6 6.4 4.2) .brev .coNP Function @ interpose .synb .mets (interpose < sep << sequence ) .syne .desc The .code interpose function returns a sequence of the same type as .metn sequence , in which the elements from .meta sequence appear with the .meta sep value inserted between them. If .meta sequence is an empty sequence or a sequence of length 1, then a sequence identical to .meta sequence is returned. It may be a copy of .meta sequence or it may be .meta sequence itself. If .meta sequence is a character string, then the value .meta sep must be a character. It is permissible for .metn sequence , or for a suffix of .meta sequence to be a lazy list, in which case .code interpose returns a lazy list, or a list with a lazy suffix.
.TP* Examples: .verb (interpose #\e- "xyz") -> "x-y-z" (interpose t nil) -> nil (interpose t #()) -> #() (interpose #\ea "") -> "" (interpose t (range 0 0)) -> (0) (interpose t (range 0 1)) -> (0 t 1) (interpose t (range 0 2)) -> (0 t 1 t 2) .brev .coNP Functions @ reduce-left and @ reduce-right .synb .mets (reduce-left < binary-function < list .mets \ \ \ \ \ \ \ \ \ \ \ \ >> [ init-value <> [ key-function ]]) .mets (reduce-right < binary-function < list .mets \ \ \ \ \ \ \ \ \ \ \ \ \ >> [ init-value <> [ key-function ]]) .syne .desc The .code reduce-left and .code reduce-right functions reduce lists of operands specified by .meta list and .meta init-value to a single value by the repeated application of .metn binary-function . In the case of .codn reduce-left , the .meta list argument is required to be an object which is iterable according to the .code iter-begin function. The .code reduce-right function treats the .meta list argument using list operations. An effective list of operands is formed by combining .meta list and .metn init-value . If .meta key-function is specified, then the items of .meta list are mapped to new values through .metn key-function , as if by .codn mapcar . If .meta init-value is supplied, then in the case of .codn reduce-left , the effective list of operands is formed by prepending .meta init-value to .metn list . In the case of .codn reduce-right , the effective operand list is produced by appending .meta init-value to .metn list . The .meta init-value isn't mapped through .metn key-function . The production of the effective list can be expressed like this, though this is not to be understood as the actual implementation: .verb (append (if init-value-present (list init-value)) [mapcar (or key-function identity) list]) .brev In the .code reduce-right case, the arguments to .code append are reversed. If the effective list of operands is empty, then .meta binary-function is called with no arguments at all, and its value is returned.
This is the only case in which .meta binary-function is called with no arguments; in all remaining cases, it is called with two arguments. If the effective list contains one item, then that item is returned. Otherwise, the effective list contains two or more items, and is decimated as follows. Note that an .meta init-value specified as .code nil is not the same as a missing .metn init-value ; this means that the initial value is the object .codn nil . Omitting .meta init-value is the same as specifying a value of .code : (the colon keyword symbol). It is possible to specify .meta key-function while omitting an .meta init-value argument. This is achieved by explicitly specifying .code : as the .meta init-value argument. Under .codn reduce-left , the leftmost pair of operands is removed from the list and passed as arguments to .metn binary-function , in the same order that they appear in the list, and the resulting value initializes an accumulator. Then, for each remaining item in the list, .meta binary-function is invoked on two arguments: the current accumulator value, and the next element from the list. After each call, the accumulator is updated with the return value of .metn binary-function . The final value of the accumulator is returned. Under .codn reduce-right , the list is processed right to left. The rightmost pair of elements in the effective list is removed, and passed as arguments to .metn binary-function , in the same order that they appear in the list. The resulting value initializes an accumulator. Then, for each remaining item in the list, .meta binary-function is invoked on two arguments: the next element from the list, in right to left order, and the current accumulator value. After each call, the accumulator is updated with the return value of .metn binary-function . The final value of the accumulator is returned. 
.TP* Examples: .verb ;;; effective list is (1) so 1 is returned (reduce-left (fun +) () 1 nil) -> 1 ;;; computes (- (- (- 0 1) 2) 3) (reduce-left (fun -) '(1 2 3) 0 nil) -> -6 ;;; computes (- 1 (- 2 (- 3 0))) (reduce-right (fun -) '(1 2 3) 0 nil) -> 2 ;;; computes (* 1 2 3); the : argument omits init-value (reduce-left (fun *) '((1) (2) (3)) : (fun first)) -> 6 ;;; computes 1 because the effective list is empty ;;; and so * is called with no arguments, which yields 1. (reduce-left (fun *) nil) -> 1 .brev .coNP Functions @, some @ all and @ none .synb .mets (some < sequence >> [ predicate-fun <> [ key-fun ]]) .mets (all < sequence >> [ predicate-fun <> [ key-fun ]]) .mets (none < sequence >> [ predicate-fun <> [ key-fun ]]) .syne .desc The .codn some , .code all and .code none functions apply a predicate-test function .meta predicate-fun over the elements of .metn sequence . If the argument .meta key-fun is specified, then elements of .meta sequence are passed into .metn key-fun , and .meta predicate-fun is applied to the resulting values. If .meta key-fun is omitted, the behavior is as if .meta key-fun were the .code identity function. If .meta predicate-fun is omitted, the behavior is as if .meta predicate-fun were the .code identity function. These functions have short-circuiting semantics and return conventions similar to the .code and and .code or operators. The .code some function applies .meta predicate-fun to successive values produced by retrieving elements of .meta sequence and processing them through .metn key-fun . If the sequence is empty, it returns .codn nil . Otherwise it returns the first .cod2 non- nil return value returned by a call to .meta predicate-fun and stops evaluating the elements. If .meta predicate-fun returns .code nil for all elements, .code some returns .codn nil . The .code all function applies .meta predicate-fun to successive values produced by retrieving elements of .meta sequence and processing them through .metn key-fun . If the sequence is empty, it returns .codn t .
Otherwise, if .meta predicate-fun yields .code nil for any value, the .code all function immediately returns without invoking .meta predicate-fun on any more elements. If all the elements are processed, then the .code all function returns the value which .meta predicate-fun yielded for the last element. The .code none function applies .meta predicate-fun to successive values produced by retrieving elements of .meta sequence and processing them through .metn key-fun . If the sequence is empty, it returns .codn t . Otherwise, if .meta predicate-fun yields .cod2 non- nil for any value, the .code none function immediately returns .codn nil . If .meta predicate-fun yields .code nil for all values, the .code none function returns .codn t . .TP* Examples: .verb ;; some of the integers are odd [some '(2 4 6 9) oddp] -> t ;; none of the integers are even [none '(1 3 5 7) evenp] -> t .brev .coNP Function @ multi .synb .mets (multi < function << sequence *) .syne .desc The .code multi function distributes an arbitrary list processing function .meta function over multiple sequences given by the .meta sequence arguments. The .meta sequence arguments are first transposed into a single list of tuples. Each successive element of this transposed list consists of a tuple of the successive items from the sequences. The length of the transposed list is that of the shortest .meta sequence argument. The transposed list is then passed to .meta function as an argument. The .meta function is expected to produce a list of tuples, which are transposed again to produce a list of lists which is then returned. Conceptually, the input sequences are columns and .meta function is invoked on a list of the rows formed from these columns. The output of .meta function is a transformed list of rows which is reconstituted into a list of columns. .TP* Example: .verb ;; Take three lists in parallel, and remove from all of them the ;; element at all positions where the third list has an element of 20.
(multi (op remove-if (op eql 20) @1 third) '(1 2 3) '(a b c) '(10 20 30)) -> ((1 3) (a c) (10 30)) ;; The (2 b 20) "row" is gone from the three "columns". ;; Note that the (op remove-if (op eql 20) @1 third) ;; expression can be simplified using the ap operator: ;; ;; (op remove-if (ap eql @3 20)) .brev .coNP Functions @, sort @, nsort @ ssort and @ snsort .synb .mets (sort < sequence >> [ lessfun <> [ keyfun ]]) .mets (nsort < sequence >> [ lessfun <> [ keyfun ]]) .mets (ssort < sequence >> [ lessfun <> [ keyfun ]]) .mets (snsort < sequence >> [ lessfun <> [ keyfun ]]) .syne .desc The .code nsort function destructively sorts .metn sequence , producing a sequence which is sorted according to the .meta lessfun and .meta keyfun arguments. The .meta keyfun argument specifies a function which is applied to elements of the sequence to obtain the key values which are then compared using .metn lessfun . If .meta keyfun is omitted, the identity function is used by default: the sequence elements themselves are their own sort keys. The .meta lessfun argument specifies the comparison function which determines the sorting order. It must be a binary function which can be invoked on pairs of keys as produced by the key function. It must return a .cod2 non- nil value if the left argument is considered to be less than the right argument. For instance, if the numeric function .code < is used on numeric keys, it produces an ascending sorted order. If the function .code > is used, then a descending sort is produced. If .meta lessfun is omitted, then it defaults to the generic .code less function. The .code sort function has the same argument requirements as .code nsort but is non-destructive: it returns a new object, leaving the input .meta sequence unmodified, as if a copy of the input object were made using the function .code copy and then that copy were sorted in-place using .codn nsort . The .code sort and .code nsort functions are stable for sequences which are lists.
This means that the original order of items which are considered identical is preserved. For strings and vectors, .code sort and .code nsort are not stable. The .code ssort and .code snsort functions have the same argument syntax and semantics as, respectively, .code sort and .codn nsort . These functions provide a stable sort for all sequences, not only lists, at the cost of temporarily allocating memory. All of these functions can be applied to hashes. They produce meaningful behavior for a hash table which contains .I N keys which are the integers from 0 to .IR "N - 1" . Such a hash is treated as if it were a vector. The values are sorted and reassigned in sorted order to the integer keys. The behavior is not specified for hashes whose contents do not conform to this convention. Note: .code nsort was introduced in \*(TX 238. Prior to that version, .code sort behaved like .codn nsort . .coNP Functions @, csort @, cnsort @ cssort and @ csnsort .synb .mets (csort < sequence >> [ lessfun <> [ keyfun ]]) .mets (cnsort < sequence >> [ lessfun <> [ keyfun ]]) .mets (cssort < sequence >> [ lessfun <> [ keyfun ]]) .mets (csnsort < sequence >> [ lessfun <> [ keyfun ]]) .syne .desc The functions .codn csort , .codn cnsort , .code cssort and .code csnsort are caching counterparts of, respectively, .codn sort , .codn nsort , .code ssort and .codn snsort . They have exactly the same argument syntax and semantics. Caching refers to eliminating repeated calls to .meta keyfun for the same element of .metn sequence , in order to reduce the execution time, at the cost of using more storage. .coNP Function @ grade .synb .mets (grade < sequence >> [ lessfun <> [ keyfun ]]) .syne .desc The .code grade function returns a list of integer indices which indicate the position of the elements of .meta sequence in sorted order. The .meta lessfun and .meta keyfun arguments behave like those of the .code sort function. The .meta sequence object is not modified.
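The relationship between
.code grade
and
.code sort
can be sketched as follows. This is an illustration only; it assumes that the
.code select
function extracts the elements of a sequence found at a given list of indices.
.verb
  (sort '(30 10 20)) -> (10 20 30)
  [sort '(30 10 20) >] -> (30 20 10)

  ;; indices of the elements, in ascending order of value
  (grade '(30 10 20)) -> (1 2 0)

  ;; picking the elements at those indices reproduces the sort
  (select '(30 10 20) (grade '(30 10 20))) -> (10 20 30)
.brev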
The internal sort performed by .code grade is not stable. The indices of any elements considered equivalent under .meta lessfun may appear in any order in the returned index sequence. Note: the .code grade function is inspired by the "grade up" and "grade down" operators in the APL language. .TP* Examples: .verb ;; Order of the 2 3 positions of the "l" ;; characters is not specified: [grade "Hello"] -> (0 1 2 3 4) [grade "Hello" >] -> (4 2 3 1 0) .brev .coNP Functions @, shuffle @, nshuffle @ cshuffle and @ cnshuffle .synb .mets (shuffle < sequence <> [ random-state ]) .mets (nshuffle < sequence <> [ random-state ]) .mets (cshuffle < sequence <> [ random-state ]) .mets (cnshuffle < sequence <> [ random-state ]) .syne .desc The .code nshuffle function pseudorandomly rearranges the elements of .metn sequence . This is performed in place: the .meta sequence object is modified. The return value is .meta sequence itself. The rearrangement depends on pseudorandom numbers obtained from the .code rand function. The .meta random-state argument, if present, is passed to that function. The .code cnshuffle function also pseudorandomly rearranges the elements of .metn sequence . It differs from .code nshuffle in that it produces a cyclic permutation: a permutation consisting of a single cycle. Whereas .code nshuffle may map some, or even all elements to their original locations, .code cnshuffle maps every element to a new location (if .meta sequence has at least two elements). An example of a cyclic permutation is the mapping of .code "(1 2 3 4)" to .codn "(3 1 4 2)" . The cycle consists of 1 mapping to 3, 3 mapping to 4, 4 mapping to 2, and 2 mapping back to 1. An example of a permutation which is not cyclic is .code "(1 2 3 4)" to .code "(1 3 4 2)" which contains two cycles: .code "(1)" maps to .code "(1)" and .code "(2 3 4)" maps to .codn "(3 4 2)" . The .code cnshuffle function will not produce this permutation; the .code nshuffle function may.
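Note: the absence of fixed points under
.code cnshuffle
can be observed directly. The following is a minimal sketch, assuming
a four-element list of distinct integers; the variable name
.code s
is illustrative only. The permutation produced varies from call to call,
but since every element moves to a new location, a position-by-position
comparison against the original order is expected to find no matches.
.verb
  ;; s is rearranged in place into a single-cycle permutation
  (let ((s (cnshuffle (list 1 2 3 4))))
    [mapcar eql s '(1 2 3 4)])
  -> (nil nil nil nil)
.brev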
The .code nshuffle and .code cnshuffle functions support hash tables in a manner analogous to the way .code nsort supports hash tables; the same remarks apply as in the description of that function. The .code shuffle and .code cshuffle functions have the same argument requirements and semantics as .code nshuffle and .code cnshuffle respectively, but differ in that they avoid in-place modification of .metn sequence : a new, shuffled sequence is returned, as if a copy of .meta sequence were made using .code copy and then that copy were shuffled in-place and returned. Note: .code nshuffle was introduced in \*(TX 238. Prior to that version, .code shuffle behaved like .codn nshuffle . Note: The pseudo-random number generator in \*(TX has only 512 bits of state, which is sufficient for generating all the permutations of sequences of at most 98 elements, and the cyclic permutations of sequences of at most 99 elements. These figures should not be interpreted as guarantees, but as theoretical maxima. .coNP Functions @ rot and @ nrot .synb .mets (rot < sequence <> [ displacement ]) .mets (nrot < sequence <> [ displacement ]) .syne .desc The .code nrot and .code rot functions rotate the elements of .metn sequence , returning a rotated sequence. The .code nrot function does this destructively; it modifies .meta sequence in-place, whereas .code rot returns a new sequence without modifying the original. The .code rot function always returns a new sequence. In cases when no rotation is performed, it copies .meta sequence as if using the .code copy function. In cases when no rotation is performed, the .code nrot function returns the original sequence, which is unmodified. The .meta displacement parameter, an integer, has a default value of 1. To rotate elements means to displace their position within the .meta sequence by some amount, that being given by the .meta displacement parameter, while preserving their circular order.
Circular order means that for the purposes of rotation, the sequence is regarded as cyclic: the first element of the sequence is considered to be the successor of the last element and vice versa. Thus, when an element is displaced past the first or last position, it wraps to the end or beginning of the sequence. If the sequence is empty, or contains only one element, then .code rot and .code nrot terminate, performing no rotation. The following remarks apply to situations when .meta sequence has two or more elements. The .meta displacement parameter, which may be negative, is first reduced to the smallest nonnegative residue modulo the length of the sequence, resulting in a value ranging from zero to one less than the sequence length. If the resulting value is zero, then no rotation is performed. The .meta displacement has a negative orientation: each element's position is decreased by this amount. Those elements whose position would become negative move to the end of the sequence. The default displacement of 1 causes the first element to become last, the second element to become first, and so forth. The opposite rotation can be obtained using -1 as the displacement. Note: even though .code nrot operates destructively, the returned object may not be the same object as .metn sequence . Only the returned object is required to be the rotated sequence. If this is different from the original .meta sequence input, the contents of that original object are unspecified. Note: the symbol .code rotate is the name of a place-mutating macro, which is much older than these functions. If .code S is a three-element sequence, then: .verb (set S (nrot S)) ;; alternatively: (upd S nrot) .brev has the same effect as: .verb (rotate [S 0] [S 1] [S 2]) .brev .TP* Examples: .verb (rot "abc") -> "bca" (rot #(1 2 3) -1) -> #(3 1 2) ;; lower-case rot-13 (mapcar (relate (range #\ea #\ez) (rot (range #\ea #\ez) 13)) "hello, world!") -> "uryyb, jbeyq!"
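
  ;; Additional sketch, derived from the rules stated above:
  ;; the displacement is reduced modulo the sequence length,
  ;; so for the three-element "abc", a displacement of 4
  ;; acts like 1, and a displacement of -1 acts like 2.
  (rot "abc" 4) -> "bca"
  (rot "abc" -1) -> "cab"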
.brev .coNP Functions @ sort-group and @ csort-group .synb .mets (sort-group < sequence >> [ keyfun <> [ lessfun ]]) .mets (csort-group < sequence >> [ keyfun <> [ lessfun ]]) .syne .desc The .code sort-group function sorts .meta sequence according to the .meta keyfun and .meta lessfun arguments, and then breaks the resulting sequence into groups, based on the equivalence of the elements under .metn keyfun . The .code csort-group function differs from .code sort-group in that it is based on the caching .code csort rather than .codn sort . The following equivalence holds: .verb (sort-group sq kf lf) <--> (partition-by kf (sort sq lf kf)) .brev Note the reversed order of .meta keyfun and .meta lessfun arguments between .code sort and .codn sort-group . .coNP Function @ uniq .synb .mets (uniq << sequence ) .syne .desc The .code uniq function returns a sequence of the same kind as .metn sequence , but with duplicates removed. Elements of .meta sequence are considered equal under the .code equal function. The first occurrence of each element is retained, and the subsequent duplicates of that element, if any, are suppressed, such that the order of the elements is otherwise preserved. The .code uniq function is an alias for the one-argument case of .codn unique . That is to say, this equivalence holds: .verb (uniq s) <--> (unique s) .brev .coNP Function @ unique .synb .mets (unique < sequence >> [ keyfun <> { hash-arg }*]) .syne .desc The .code unique function is a generalization of .codn uniq . It returns a sequence of the same kind as .metn sequence , but with duplicates removed. If neither .meta keyfun nor .metn hash-arg s are specified, then elements of sequence are considered equal under the .code equal function. The first occurrence of each element is retained, and the subsequent duplicates of that element, if any, are suppressed, such that the order of the elements is otherwise preserved. If .meta keyfun is specified, then that function is applied to each element, and the resulting values are compared for equality.
If .meta keyfun is omitted, the behavior is as if .meta keyfun were the .code identity function. If one or more .metn hash-arg s are present, these specify the arguments for the construction of the internal hash table used by .codn unique . The arguments are like those of the .code hash function. .coNP Function @ tuples .synb .mets (tuples < length < sequence <> [ fill-value ]) .syne .desc The .code tuples function produces a lazy list which represents a reorganization of the elements of .meta sequence into tuples of .metn length , where .meta length must be a positive integer. The length of the sequence might not be evenly divisible by the tuple length. In this case, if a .meta fill-value argument is specified, then the last tuple is padded with enough repetitions of .meta fill-value to make it have .meta length elements. If .meta fill-value is not specified, then the last tuple is left shorter than .metn length . The output of the function is a list, but the tuples themselves are sequences of the same kind as .metn sequence . If .meta sequence is any kind of list, they are lists, and not lazy lists. .TP* Examples: .verb (tuples 3 #(1 2 3 4 5 6 7 8) 0) -> (#(1 2 3) #(4 5 6) #(7 8 0)) (tuples 3 "abc") -> ("abc") (tuples 3 "abcd") -> ("abc" "d") (tuples 3 "abcd" #\ez) -> ("abc" "dzz") (tuples 3 (list 1 2) #\ez) -> ((1 2 #\ez)) .brev .coNP Function @ tuples* .synb .mets (tuples* < length < sequence <> [ fill-value ]) .syne .desc The .code tuples* function produces a lazy list of overlapping tuples taken from .metn sequence . The length of the tuples is given by the .meta length argument. The .meta length argument must be a positive integer. Tuples are subsequences of consecutive items from the input .metn sequence , beginning with consecutive elements. The first tuple in the returned list begins with the first item of .metn sequence ; the second tuple begins with the second item, and so forth.
The output of the function is a list, but the tuples themselves are sequences of the same kind as .metn sequence . If .meta sequence is any kind of list, they are lists, and not lazy lists. If .meta sequence is shorter than .meta length then it contains no tuples of that length. In this case, if no .meta fill-value argument is specified, then the empty list is returned. In this same situation, if .meta fill-value is specified, then a one-element list is returned, containing a tuple of the required length, which consists of the elements from .meta sequence followed by repetitions of .metn fill-value , which must be of a type suitable as an element of the sequence. The .meta fill-value is otherwise ignored. .TP* Examples: .verb (tuples* 1 "abc") -> ("a" "b" "c") (tuples* 2 "abc") -> ("ab" "bc") (tuples* 3 "abc") -> ("abc") (tuples* 4 "abc") -> nil (tuples* 4 "abc" #\ez) -> ("abcz") (tuples* 6 "abc" #\ez) -> ("abczzz") (tuples* 6 "abc" 4) -> error (tuples* 2 '(a b c)) -> ((a b) (b c)) (take 3 (tuples* 3 0)) -> ((0 1 2) (1 2 3) (2 3 4)) .brev .coNP Function @ partition-by .synb .mets (partition-by < function << sequence ) .syne .desc If .meta sequence is empty, then .code partition-by returns an empty list, and .meta function is never called. Otherwise, .code partition-by returns a lazy list of partitions of the sequence .metn sequence . Partitions are consecutive, nonempty substrings of .metn sequence , of the same kind as .metn sequence . The partitioning begins with the first element of .meta sequence being placed into a partition. The subsequent partitioning is done according to .metn function , which is applied to each element of .metn sequence . Whenever, for the next element, the function returns the same value as it returned for the previous element, the element is placed into the same partition. Otherwise, the next element is placed into, and begins, a new partition.
The return values of the calls to .meta function are compared using the .code equal function. .TP* Examples: .verb [partition-by identity '(1 2 3 3 4 4 4 5)] -> ((1) (2) (3 3) (4 4 4) (5)) (partition-by (op = 3) #(1 2 3 4 5 6 7)) -> (#(1 2) #(3) #(4 5 6 7)) .brev .coNP Function @ partition-if .synb .mets (partition-if < function < iterable <> [ count ]) .syne .desc The .code partition-if function separates the .meta iterable sequence into partitions which are identified by the two-argument .metn function . The principal idea is that successive overlapping pairs from .meta iterable are passed as arguments to .metn function , and whenever .meta function yields true, those elements are identified as belonging to separate partitions: a partitioning division shall take place between them. The detailed semantics is given below, as a procedure. Firstly, if .meta iterable is empty, then .code partition-if returns an empty list, and .meta function is never called. Otherwise, .code partition-if returns a lazy list of partitions of .metn iterable . Partitions are consecutive, nonempty substrings of .metn iterable , of the same kind as .metn iterable . The partitioning begins with the first element of .meta iterable being placed into the first partition. The subsequent partitioning is done according to a Boolean .metn function , which must accept two arguments. Whenever the function yields true, it indicates that a partition is to be terminated and a new partition to begin. The .meta count argument, if present, must be a nonnegative integer. It indicates a limit on how many partitions will be delimited; after this limit is reached, the remainder of the .meta iterable sequence is placed into a single partition. After the first element is placed into a partition, the following partition-building process is repeated until the partition is terminated. .RS .IP 1. If .meta iterable contains no more elements, then the partition terminates. .IP 2.
Otherwise, if the .meta count is present, and has a value of zero, then the next available element is unconditionally deposited into the current partition, and the process repeats from step 1. .IP 3. Otherwise, .meta function is invoked on two values: the previous element which has most recently been deposited into the partition, and its successor from .metn iterable . .IP 4. If .meta function returns .codn nil , then the partition continues: the next element is added to the partition, and the process repeats from step 1. .IP 5. Otherwise, .meta function has returned true and the partition is terminated. In this case, if .meta count is present, it is decremented. .RE .IP When the current partition is terminated, it is converted to a sequence of the same kind as .meta iterable as if by using the .code make-like function, and incorporated as the next element of the lazy list of partitions. If, after a partition is thus produced, a next element is available, it is placed into a new partition, and the above partition-building process takes place from step 1. Otherwise, the lazy list terminates. .TP* Examples: .verb ;; Start new partition for unequal characters. [partition-if neql "aaaabbcdee"] -> ("aaaa" "bb" "c" "d" "ee") ;; As above, but partition only twice [partition-if neql "aaaabbcdee" 2] -> ("aaaa" "bb" "cdee") ;; Start new partition when non-digit follows digit: [partition-if (do and (chr-isdigit @1) (not (chr-isdigit @2))) "a13cd9foo42z"] -> ("a13" "cd9" "foo42" "z") ;; Place ascending runs of consecutive integers ;; into partitions. I.e. start a partition whenever the ;; difference from the previous element isn't 1: (partition-if (op /= (- @2 @1) 1) '(1 3 4 5 7 8 9 10 9 8 6 5 3 2)) -> ((1) (3 4 5) (7 8 9 10) (9) (8) (6) (5) (3) (2)) ;; Place runs of adjacent integers into partitions. ;; I.e. 
start a new partition if the absolute value of ;; the difference from the previous exceeds 1: (partition-if (op > (abs (- @2 @1)) 1) '(1 3 4 5 7 8 9 10 9 8 6 5 3 2)) -> ((1) (3 4 5) (7 8 9 10 9 8) (6 5) (3 2)) .brev .SS* Open Sequence Traversal Functions in this category perform efficient traversal of sequences. There are two flavors of these functions: functions in the .code iter-begin group, and functions in the .code seq-begin group. The latter are obsolescent. User-defined iteration is possible via defining special methods on structures. An object supports iteration by defining the special method .code iter-begin which is different from the .code iter-begin function. This special method returns an iterator object which supports special methods .codn iter-item , .code iter-more and .codn iter-step . Two protocols are supported, one of which is more efficient by eliminating the .code iter-more method. Details are specified in the section .BR "Special Structure Functions" . .coNP Function @ iter-begin .synb .mets (iter-begin << seq ) .syne .desc The .code iter-begin function returns an iterator object suitable for traversing the elements of the sequence denoted by the .meta seq object. If .meta seq is a list-like sequence, then .code iter-begin may return .meta seq itself as the iterator. Likewise if .meta seq is a number. If .meta seq is a structure which supports the .code iter-begin method, then that method is called and its return value is returned. A structure which does not support this method is possibly considered to be a sequence according to the usual criteria, based on whether it supports the .codn nullify , .code length or .code car methods. A struct object supporting none of these methods is deemed not iterable.
Otherwise, if .meta seq is an iterator object of .code seq-iter type, such as one produced by .codn iter-begin , then an iterator similar to that iterator is returned, as if produced by applying the .code copy-iter function to .metn seq . In all other cases, if .meta seq is iterable, an object of type .code seq-iter is returned. Range objects are iterable if they are numeric. A range consisting of two strings may also be iterable, as described below. A range is considered to be a numeric or character range if the .code from element is a number or character. The .code to is then required to be either a value which is comparable with that number or character using the .code < function, or else it must be one of the two objects .code t or .codn : , either of which indicate that the range is unbounded. In this unbounded range case, the expressions .code "(iter-begin X..:)" and .code "(iter-begin X..t)" are equivalent to .codn "(iter-begin X)" . Numeric ranges are half-open: the .code to value of ascending ranges is excluded, as is the .code from value of descending ranges, so that .code 0..10 steps through the values .code 0 through .codn 9 , and .code 10..0 steps through the same values in reverse order. A string range consists of two strings of equal length. If the strings are of unequal length, an error exception is thrown. The sequence denoted by a string range is a sequence of strings formed from the Cartesian product of the character ranges formed by positionally-corresponding characters from the two strings. The order of the sequence is such that the rightmost character varies most frequently. In more detail, the string range iterates over successive strings by incrementing or decrementing the characters of the .code from string until they are equal to those of the .code to string. The rightmost character has priority.
For instance, the range .code "\(dqAA\(dq..\(dqCC\(dq" iterates over the strings .codn "AA" , .codn "AB" , .codn "AC" , .codn "BA" , .codn "BB" , .codn "BC" , .codn "CA" , .code "CB" and .codn "CC" . The descending range .code "\(dqCC\(dq..\(dqAA\(dq" iterates over the same strings, in reverse order. Whenever the incrementing character attains the value of the corresponding character in the .code to string, that character is reset to its starting value, and its left neighbor, if it exists, is incremented instead. If no left neighbor exists, the iteration terminates. For every character position in the string pair, it is independently determined whether the iteration for that position is ascending or descending, such that the range .code "\(dqAC\(dq..\(dqCA\(dq" iterates over the strings .codn "AC" , .codn "AB" , .codn "AA" , .codn "BC" , .codn "BB" , .codn "BA" , .codn "CC" , .code "CB" and .codn "CA" . Search trees are iterable. Iteration entails an in-order visit of the elements of a tree. A tree iterator created by .code tree-begin is also iterable. It is unspecified whether iteration over a .code tree-iter object modifies that object to perform the traversal, or whether it uses a copy of the iterator. If .meta seq is not an iterable object, an error exception is thrown. .coNP Function @ iter-more .synb .mets (iter-more << iter ) .syne .desc The .code iter-more function returns .code t if there remain more elements to be traversed. Otherwise it returns .codn nil . The .meta iter argument must be a valid iterator returned by a call to .codn iter-begin , .code iter-step or .codn iter-reset . The .code iter-more function doesn't change the state of .metn iter . If .meta iter is the object .code nil then .code nil is returned. Note: the .code iter-begin function may return .code nil if its argument is .code nil or any empty sequence, or an empty range (a range whose .code to and .code from fields are the same number or character).
If .meta iter is a .code cons cell, then .code iter-more returns .codn t . If .meta iter is a number, then .code iter-more returns .codn t . This is the case even if calculating the successor of that number isn't possible due to floating-point overflow or insufficient system resources. If .meta iter is a character, then .code iter-more returns .code t if .meta iter isn't the highest possible character code, otherwise .codn nil . If .meta iter was formed from a descending range, meaning that .code iter-begin was invoked on a range with a .code from field exceeding its .code to value, then .code iter-more returns true while the current iterator value is greater than the limiting value given by the .code to field. For an ascending range, it returns true if the current iterator value is lower than the limiting value. However, note the peculiar semantics of .code iter-item with regard to descending range iteration. If .meta iter is a structure, then if it supports an .code iter-more method, then that method is called with no arguments, and its return value is returned. If the structure does not have an .code iter-more method, then .code t is returned. .coNP Function @ iter-item .synb .mets (iter-item << iter ) .syne .desc If the .code iter-more function indicates that more items remain to be visited, then the next item can be retrieved using .codn iter-item . The .meta iter argument must be a valid iterator returned by a call to .codn iter-begin , .code iter-step or .codn iter-reset . The .code iter-item function doesn't change the state of .metn iter . If .code iter-item is invoked on an iterator which indicates that no more items remain to be visited, the return value is .codn nil . If .meta iter is a .code cons cell, then .code iter-item returns the .code car field of that cell. If .meta iter is a character or number, then .code iter-item returns that character or number itself.
If .meta iter is based on an ascending numeric or character range, then .code iter-item returns the current iteration value, which is initialized by .code iter-begin as a copy of the range's .code from field. Thus, the range .code 0..3 traverses the values .codn 0 , .code 1 and .codn 2 , excluding the .codn 3 . If .meta iter is based on a descending numeric or character range, then .code iter-item returns the predecessor of the current iteration value, which is initialized by .code iter-begin as a copy of the range's .code from field. Thus, the range .code 3..0 traverses the values .codn 2 , .code 1 and .codn 0 , excluding the .codn 3 : exactly the same values are visited as for the range .code 0..3 only in reverse order. If .meta iter is a structure which supports the .code iter-item method, then that method is called and its return value is returned. .coNP Function @ iter-step .synb .mets (iter-step << iter ) .syne .desc If the .code iter-more function indicates that more items remain to be visited, then the .code iter-step function may be used to consume the next item. The function returns an iterator denoting the traversal of the remaining items in the sequence. The .meta iter argument must be a valid iterator returned by a call to .codn iter-begin , .code iter-step or .codn iter-reset . The .code iter-step function may return a new object, in which case it avoids changing the state of .metn iter , or else it may change the state of .meta iter and return it. If the application discontinues the use of .metn iter , and continues the traversal using the returned iterator, it will work correctly in either situation. If .code iter-step is invoked on an iterator which indicates that no more items remain to be visited, the return value is unspecified. If .meta iter is a .code cons cell, then .code iter-step returns the .code cdr field of that cell. That value must itself be a .code cons or else .codn nil , otherwise an error is thrown.
This is to prevent iteration from wrongly iterating into the non-null terminators of improper lists. Without this rule, iteration of a list like .code "(1 2 . 3)" would reach the .code cons cell .code "(2 . 3)" at which point a subsequent .code iter-step would return the .code cdr field .codn 3 . But that value is a valid iterator which will then continue by stepping through .codn 4 , .code 5 and so on. If .meta iter is a list-like sequence, then .code cdr is invoked on it and that value is returned. The value must also be a list-like sequence, or else .codn nil . The reasoning for this is the same as for the similar restriction imposed in the case when .meta iter is a .codn cons . If .meta iter is a character or number, then .code iter-step returns its successor, as if using the .code succ function. If .meta iter is a structure which supports the .code iter-step method, then that method is called and its return value is returned. .coNP Function @ iter-reset .synb .mets (iter-reset < iter << seq ) .syne .desc The .code iter-reset function returns an iterator object specialized for the task of traversing the sequence .metn seq . If it is possible for .meta iter to be that object, then the function may adjust the state of .meta iter and return it. If .code iter-reset doesn't use .metn iter , then it behaves exactly like .code iter-begin being invoked on .metn seq . If .meta seq is a structure which supports the .code iter-reset method, then that method is called and its return value is returned. Note the reversed arguments. The .code iter-reset method is of the .meta seq object, not of .metn iter . That is to say, the call .mono .meti (iter-reset < iter << obj ) .onom results in the .mono .meti << obj .(iter-reset << iter ) .onom call. If .meta seq is a structure which doesn't support .code iter-reset then .meta iter is ignored, .code iter-begin is invoked on .meta seq and the result is returned. 
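Note: the functions
.codn iter-begin ,
.codn iter-more ,
.code iter-item
and
.code iter-step
are intended to be used together. The following is a minimal sketch of a
traversal loop, assuming the numeric range
.code 0..3
as the sequence; the variable names
.code it
and
.code acc
are illustrative only. The iterator returned by
.code iter-step
replaces the previous one, so that the loop is correct whether or not
.code iter-step
changes the state of its argument.
.verb
  ;; collect the items visited by the iterator into a list
  (let ((acc nil))
    (for ((it (iter-begin 0..3)))          ;; obtain iterator
         ((iter-more it) (nreverse acc))   ;; when done, yield items
         ((set it (iter-step it)))         ;; continue with returned iterator
      (push (iter-item it) acc)))
  -> (0 1 2)
.brev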
.coNP Function @ copy-iter .synb .mets (copy-iter << iter ) .syne .desc The .code copy-iter function produces a duplicate of .meta iter such that the duplicate iterator will traverse the same sequence of items as .meta iter starting at the current point in the sequence indicated by .metn iter . For some kinds of iterators, such as integers and conses, .code copy-iter just returns .metn iter . If .meta iter is a structure object, then if it supports the .code copy method, that method is invoked and its return value is taken as the iterator copy. Otherwise, .meta iter must implement a list-like sequence, in which case the object is just returned. If .meta iter is a structure which neither supports a .code copy method nor implements a list-like sequence by supporting the .code car method, an error exception is thrown. Note: iterators of type .code seq-iter can be copied with the .code copy function (which for those objects is defined in terms of .codn copy-iter ). However, the .code copy function has the wrong semantics for other kinds of iterator objects. It refuses to copy certain atoms such as numbers, and in the case of conses it behaves like .codn copy-list , which is unnecessary. .coNP Function @ iter-cat .synb .mets (iter-cat << seq *) .syne .desc The .code iter-cat function produces a catenated iterator: an object suitable for traversing the abstract sequence formed by the catenation of the .meta seq arguments. This is accomplished without actually catenating the argument sequences. If no arguments are given to .code iter-cat then it returns .codn nil . Otherwise, the abstract semantics of the catenated iterator is as follows. The iterator retains all of the .meta seq objects. It converts the first .meta seq object to an iterator as if by invoking .code iter-begin on it. This is referred to as the individual iterator. When that iterator is exhausted of items, .code iter-begin is called on the next .meta seq object to produce the next individual iterator.
Note: under this semantics, the catenated iterator's .code iter-more operation does not simply report the value returned by the .code iter-more call on the individual iterator. When the individual iterator's .code iter-more function returns .codn nil , the catenated iterator then switches to the individual iterator of the next .meta seq object in the argument sequence. This is repeated as many times as necessary until an iterator is found for which .code iter-more yields true, or the arguments are exhausted. The .code iter-item function, when applied to a catenated iterator, similarly potentially searches through the argument space. .TP* Examples: .verb ;; Create an iterator that produces 0, 1, ... 9, 20, 21, ... 29 (iter-cat 0..10 20..30) .brev .coNP Function @ seq-begin .synb .mets (seq-begin << object ) .syne .desc The obsolescent .code seq-begin function returns an iterator object specialized to the task of traversing the sequence represented by the input .metn object . If .meta object isn't a sequence, an exception is thrown. Note that if .meta object is a lazy list, the returned iterator maintains a reference to the head of that list during the traversal; therefore, generic iteration based on iterators from .code seq-begin is not suitable for indefinite iteration over infinite lists. .coNP Function @ seq-next .synb .mets (seq-next < iter << end-value ) .syne .desc The obsolescent .code seq-next function retrieves the next available item from the sequence iterated by .metn iter , which must be an object returned by .codn seq-begin . If the sequence has no more items to be traversed, then .meta end-value is returned instead. Note: to avoid ambiguities, the application should provide an .meta end-value which is guaranteed distinct from any item in the sequence, such as a freshly allocated object.
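Note: the following is a minimal sketch of a traversal based on the obsolescent
.code seq-begin
and
.code seq-next
functions, assuming that a freshly allocated symbol obtained from
.code gensym
serves as the
.metn end-value ,
so that it cannot be confused with any item of the sequence. The variable names
.codn eos ,
.codn it ,
.code x
and
.code acc
are illustrative only.
.verb
  (let* ((eos (gensym))               ;; sentinel distinct from all items
         (it (seq-begin '(1 2 3)))
         (acc nil))
    (for ((x (seq-next it eos)))
         ((neq x eos) (nreverse acc)) ;; stop when the sentinel is seen
         ((set x (seq-next it eos)))
      (push x acc)))
  -> (1 2 3)
.brev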
.coNP Function @ seq-reset .synb .mets (seq-reset < iter << object ) .syne .desc The obsolescent .code seq-reset function reinitializes the existing iterator object .meta iter to begin a new traversal over the given .metn object , which must be a value of a kind that would be a suitable argument for .codn seq-begin . The .code seq-reset function returns .metn iter . .SS* Procedural List Construction \*(TL provides a structure type called .code list-builder which encapsulates state and methods for constructing lists procedurally. Among the advantages of using .code list-builder is that lists can be constructed in the left-to-right direction without requiring multiple traversals or reversal. For example, .code list-builder naturally combines with iteration or recursion: items visited in an iterative or recursive process can be collected easily using .code list-builder in the order they are visited. The .code list-builder type provides methods for adding and removing items at either end of the list, making it suitable where a .I deque structure is required. The basic workflow begins with the instantiation of a .code list-builder object. This object may be initialized with a piece of list material which begins the to-be-constructed list, or it may be initialized to begin with an empty list. Methods such as .code add and .code pend are invoked on this object to extend the list with new elements. At any point, the list constructed so far is available using the .code get method, which is also how the final version of the list is eventually retrieved. The .code list-builder methods which add material to the list all return the list builder, making chaining possible. .verb (new list-builder).(add 1).(add 2).(pend '(3 4 5)).(get) -> (1 2 3 4 5) .brev The .code build macro is provided which syntactically streamlines the process. It implicitly creates a .code list-builder instance and binds it to a hidden lexical variable.
It then evaluates forms in a lexical scope in which shorthand macros are available for building the list. .coNP Structure @ list-builder .synb .mets (defstruct list-builder nil .mets \ \ head tail) .syne .desc The .code list-builder structure encapsulates the state for a list building process. Programs should use the .code build-list function for creating an instance of .codn list-builder . The .code head and .code tail slots should be regarded as internal variables. .coNP Function @ build-list .synb .mets (build-list <> [ initial-list ]) .syne .desc The .code build-list function instantiates and returns an object of struct type .codn list-builder . If no .meta initial-list argument is supplied, then the object is implicitly initialized with an empty list. If the argument is supplied, then it is equivalent to calling .code build-list without an argument to produce an object .metn obj , followed by invoking the method call .mono .meti << obj .(ncon << initial-list ) .onom on this object. The object produced by the expression .meta initial-list is installed (without being copied) into the object as the prefix of the list to be constructed. The .meta initial-list argument can be a sequence other than a list. .TP* Example: .verb ;; build the list (a b) trivially (let ((lb (build-list '(a b)))) lb.(get)) -> (a b) .brev .coNP Methods @ add and @ add* .synb .mets << list-builder .(add << element *) .mets << list-builder .(add* << element *) .syne .desc The .code add and .code add* methods extend the list being constructed by a .code list-builder object by adding individual elements to it. The .code add method adds elements at the tail of the list, whereas .code add* adds elements at the front. These methods return the .meta list-builder object. The precise semantics is as follows.
All of the .meta element arguments are combined into a list as if by the .code list function, and the resulting list is combined with the current contents of the .code list-builder object as if using the .code append function. The resulting list becomes the new contents. .TP* Examples: .verb ;; Build the list (1 2 3 4) (let ((lb (build-list))) lb.(add 3 4) lb.(add* 1 2) lb.(get)) -> (1 2 3 4) ;; Add #\ec to "ab" ;; same semantics as (append "ab" #\ec) (let ((lb (build-list "ab"))) lb.(add #\ec) lb.(get)) -> "abc" .brev .coNP Methods @ pend and @ pend* .synb .mets << list-builder .(pend << list *) .mets << list-builder .(pend* << list *) .syne .desc The .code pend and .code pend* methods extend the list being constructed by a .code list-builder object by adding lists to it. The .code pend method catenates the .meta list arguments together as if by the .code append function, then appends the resulting list to the end of the list being constructed. The .code pend* method is similar, except it prepends the catenated lists to the front of the list being constructed. The .code pend and .code pend* operations do not mutate the input lists, but may cause the resulting list to share structure with the input lists. These functions may mutate the list already contained in .metn list-builder ; however, they avoid mutating those parts of the current list that are shared with inputs that were given in earlier calls to these functions. These methods return the .meta list-builder object. .TP* Example: .verb ;; Build the list (1 2 3 4) (let ((lb (build-list))) lb.(pend '(3 4)) lb.(pend* '(1 2)) lb.(get)) -> (1 2 3 4) .brev .coNP Methods @ ncon and @ ncon* .synb .mets << list-builder .(ncon << list *) .mets << list-builder .(ncon* << list *) .syne .desc The .code ncon and .code ncon* methods extend the list being constructed by a .code list-builder object by adding lists to it. The .code ncon method destructively catenates the .meta list arguments as if by the .code nconc function.
The resulting list is appended to the list being constructed. The .code ncon* method is similar, except it prepends the catenated lists to the front of the list being constructed. These methods may destructively manipulate the list already contained in the .meta list-builder object, and likewise may destructively manipulate the input lists. They may cause the list being constructed to share substructure with the input lists. Additionally, these methods may destructively manipulate the list already contained in the .meta list-builder object without regard for shared structure between that list and inputs given earlier to any of the .codn pend , .codn pend* , .code ncon or .code ncon* methods. The .code ncon method can be called with a single argument which is an atom. This atom will simply be installed as the terminating atom of the list being constructed, if the current list is an ordinary list. These methods return the .meta list-builder object. .TP* Example: .verb ;; Build the list (1 2 3 4 . 5) (let ((lb (build-list))) lb.(ncon* (list 1 2)) lb.(ncon (list 3 4)) lb.(ncon 5) lb.(get)) -> (1 2 3 4 . 5) .brev .coNP Method @ oust .synb .mets << list-builder .(oust << list *) .syne .desc The .code oust method discards the list constructed so far, optionally replacing it with new material. The .code oust method catenates the .meta list arguments together as if by the .code append function. The resulting list, which is empty if there are no .meta list arguments, then replaces the object's list constructed so far. The .code oust method returns the .meta list-builder object.
.TP* Examples: .verb ;; Build the list (3 4) by first building (1 2), ;; then discarding that and adding 3 and 4: (let ((lb (build-list))) lb.(add 1 2) lb.(oust) lb.(add 3 4) lb.(get)) -> (3 4) ;; Build the list (3 4 5 6) by first building (1 2), ;; then replacing with catenation of (3 4) and (5 6): (let ((lb (build-list))) lb.(pend '(1 2)) lb.(oust '(3 4) '(5 6)) lb.(get)) -> (3 4 5 6) .brev .coNP Method @ get .synb .mets << list-builder .(get) .syne .desc The .code get method retrieves the list constructed so far by a .code list-builder object. It doesn't change the state of the object. The retrieved list may be passed as an argument into the construction methods on the same object. .TP* Examples: .verb ;; Build the circular list (1 1 1 1 ...) ;; by appending (1) to itself destructively: (let ((lb (build-list '(1)))) lb.(ncon* lb.(get)) lb.(get)) -> (1 1 1 1 ...) ;; build the list (1 2 1 2 1 2 1 2) ;; by doubling (1 2) twice: (let ((lb (build-list))) lb.(add 1 2) lb.(pend lb.(get)) lb.(pend lb.(get)) lb.(get)) -> (1 2 1 2 1 2 1 2) .brev .coNP Methods @ del and @ del* .synb .mets << list-builder .(del) .mets << list-builder .(del*) .syne .desc The .code del and .code del* methods each remove an element from the list and return it. If the list is empty, they return .codn nil . The .code del method removes an element from the front of the list, whereas .code del* removes an element from the end of the list. Note: this orientation is opposite to .code add and .codn add* . Thus .code del pairs with .code add to produce FIFO queuing behavior. .coNP Macros @ build and @ buildn .synb .mets (build << form *) .mets (buildn << form *) .syne .desc The .code build and .code buildn macros provide a shorthand notation for constructing lists using the .code list-builder structure. They eliminate the explicit call to the .code build-list function to construct the object, and eliminate the explicit references to the object. 
Both of these macros create a lexical environment in which a .code list-builder object is implicitly constructed and bound to a hidden variable. This lexical environment also provides local functions named .codn add , .codn add* , .codn pend , .codn pend* , .codn ncon , .codn ncon* , .codn oust , .codn get , .code del and .codn del* , which mimic the .code list-builder methods, but operate implicitly on this hidden variable, so that the object need not be mentioned as an argument. With the exception of .codn get , .code del and .codn del* , the local functions return .codn nil , unlike the same-named .code list-builder methods, which return the .code list-builder object. In this lexical environment, each .meta form is evaluated in order. When the last .meta form is evaluated, .code build returns the constructed list, whereas .code buildn returns the value of the last .metn form . If no forms are enclosed, both macros return .codn nil . Note: because the local function .code del has the same name as a global macro, it is implemented as a .codn macrolet . Inside a .code build or .codn buildn , if .code del is invoked with no arguments, then it denotes a call to the .code list-builder .code del method. If invoked with an argument, then it resolves to the global .code del macro for deleting a place. .TP* Examples: .verb ;; Build the circular list (1 1 1 1 ...) ;; by appending (1) to itself destructively: (build (add 1) (ncon* (get))) -> (1 1 1 1 ...) 
;; build the list (1 2 1 2 1 2 1 2) ;; by doubling (1 2) twice: (build (add 1 2) (pend (get)) (pend (get))) -> (1 2 1 2 1 2 1 2) ;; build a list by mapping over the local ;; add function: (build [mapdo add (range 1 3)]) -> (1 2 3) ;; breadth-first traversal of nested list; (defun bf-map (tree visit-fn) (buildn (add tree) (whilet ((item (del))) (if (atom item) [visit-fn item] (each ((el item)) (add el)))))) (let (flat) (bf-map '(1 (2 (3 4 (5))) ((6 7) 8)) (do push @1 flat)) (nreverse flat)) -> (1 2 8 3 4 6 7 5) .brev .SS* Permutations and Combinations .coNP Functions @ perm and @ permi .synb .mets (perm < seq <> [ len ]) .mets (permi < seq <> [ len ]) .syne .desc The .code perm function returns a lazy list which consists of all length .meta len permutations formed by items taken from .metn seq . The permutations do not use any element of .meta seq more than once. The .code permi function has identical argument semantics, but returns an iterator instead of a lazy list: an object meeting the same conventions as the return value of .codn iter-begin . Argument .metn len , if present, must be a nonnegative integer, and .meta seq must be a sequence. If .meta len is not present, then its value defaults to the length of .metn seq : the list of the full permutations of the entire sequence is returned. The permutations in the returned list are sequences of the same kind as .metn seq . If .meta len is zero, then a list containing one permutation is returned, and that permutation is of zero length. If .meta len exceeds the length of .metn seq , then an empty list is returned, since it is impossible to make a single nonrepeating permutation that requires more items than are available. The permutations are lexicographically ordered.
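.TP* Examples:

The following results are a sketch derived from the rules stated above
(nonrepeating selection, results of the same kind as
.metn seq ,
lexicographic order, and an empty result when
.meta len
exceeds the length of
.metn seq );
they are not output captured from a particular \*(TX version.

.verb
  ;; all permutations of length 2
  (perm '(1 2 3) 2)
  -> ((1 2) (1 3) (2 1) (2 3) (3 1) (3 2))

  ;; len defaults to the length of the sequence
  (perm "abc")
  -> ("abc" "acb" "bac" "bca" "cab" "cba")

  ;; more items requested than are available
  (perm "ab" 3) -> nil
.brev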
.coNP Functions @ rperm and @ rpermi .synb .mets (rperm < seq << len ) .mets (rpermi < seq << len ) .syne .desc The .code rperm function returns a lazy list which consists of all the repeating permutations of length .meta len formed by items taken from .metn seq . "Repeating" means that the items from .meta seq can appear more than once in the permutations. The .code rpermi function has identical argument semantics, but returns an iterator instead of a lazy list: an object meeting the same conventions as the return value of .codn iter-begin . The permutations which are returned are sequences of the same kind as .metn seq . Argument .meta len must be a nonnegative integer, and .meta seq must be a sequence. If .meta len is zero, then a single permutation is returned, of zero length. This is true regardless of whether .meta seq is itself empty. If .meta seq is empty and .meta len is greater than zero, then no permutations are returned, since permutations of a positive length require items, and the sequence has no items. Thus there exist no such permutations. The first permutation consists of .meta len repetitions of the first element of .metn seq . The next permutation, if there is one, differs from the first permutation in that its last element is the second element of .metn seq . That is to say, the permutations are lexicographically ordered. .TP* Examples: .verb (rperm "01" 3) -> ("000" "001" "010" "011" "100" "101" "110" "111") (rperm #(1) 3) -> (#(1 1 1)) (rperm '(0 1 2) 2) -> ((0 0) (0 1) (0 2) (1 0) (1 1) (1 2) (2 0) (2 1) (2 2)) .brev .coNP Functions @ comb and @ combi .synb .mets (comb < seq << len ) .mets (combi < seq << len ) .syne .desc The .code comb function returns a lazy list which consists of all length .meta len nonrepeating combinations formed by items taken from .metn seq . "Nonrepeating combinations" means that the combinations do not use any element of .meta seq more than once.
If .meta seq contains no duplicates, then the combinations contain no duplicates. The .code combi function has identical argument semantics, but returns an iterator instead of a lazy list: an object meeting the same conventions as the return value of .codn iter-begin . Argument .meta len must be a nonnegative integer, and .meta seq must be a sequence or a hash table. The combinations in the returned list are objects of the same kind as .metn seq . If .meta len is zero, then a list containing one combination is returned, and that combination is of zero length. If .meta len exceeds the number of elements in .metn seq , then an empty list is returned, since it is impossible to make a single nonrepeating combination that requires more items than are available. If .meta seq is a sequence, the returned combinations are lexicographically ordered. This requirement is not applicable when .meta seq is a hash table. .TP* Example: .verb ;; powerset function, in terms of comb. ;; Yields a lazy list of all subsets of s, ;; expressed as sequences of the same type as s. (defun powerset (s) (mappend* (op comb s) (range 0 (length s)))) .brev .coNP Functions @ rcomb and @ rcombi .synb .mets (rcomb < seq << len ) .mets (rcombi < seq << len ) .syne .desc The .code rcomb function returns a lazy list which consists of all length .meta len repeating combinations formed by items taken from .metn seq . "Repeating combinations" means that the combinations can use an element of .meta seq more than once. The .code rcombi function has identical argument semantics, but returns an iterator instead of a lazy list: an object meeting the same conventions as the return value of .codn iter-begin . Argument .meta len must be a nonnegative integer, and .meta seq must be a sequence. The combinations in the returned list are sequences of the same kind as .metn seq . If .meta len is zero, then a list containing one combination is returned, and that combination is of zero length.
This is true even if .meta seq is empty. If .meta seq is empty, and .meta len is nonzero, then an empty list is returned. The combinations are lexicographically ordered. .SS* Macros Because \*(TL supports structural macros, \*(TX processes \*(TL expressions in two separate phases: the expansion phase and the evaluation/compilation phase. During the expansion phase, a top-level expression is recursively traversed, and all macro invocations in it are expanded. The result is a transformed expression which contains only function calls and invocations of special operators. This expanded form is then evaluated or compiled, depending on the situation. Macro invocations are compound forms whose operator symbol has a macro definition in scope. A macro definition is a kind of function which operates on syntax during macro-expansion, called upon to calculate a transformation of the syntax. The return value of a macro replaces its invocation, and is traversed to look for more opportunities for macro expansion. Macros differ from ordinary functions in three ways: they are called at macro-expansion time, they receive pieces of unevaluated syntax as their arguments, and their parameter lists are macro parameter lists which support destructuring, as well as certain special parameters. \*(TL also supports symbol macros. A symbol macro definition associates a symbol with an expansion. When that symbol appears as a form, the macro-expander replaces it with the expansion. \*(TX source files are treated somewhat differently with regard to macro expansion compared to \*(TL. When \*(TL forms are read from a file by .code load or .code compile or read by the interactive listener, each form is expanded and evaluated or compiled before the subsequent form is processed. In contrast, when a \*(TX file is loaded, expansion of the Lisp forms embedded in it takes place during the parsing of the entire source file, and is complete for the entire file before any of the code is executed.
.NP* Macro parameter lists \*(TX macros support destructuring, similarly to Common Lisp macros. This means that macro parameter lists are like function argument lists, but support nesting. A macro parameter list can specify a nested parameter list in every place where an argument symbol may appear. For instance, consider this macro parameter list: .verb ((a (b c)) : (c frm) ((d e) frm2 de-p) . g) .brev The top-level of this nested form has the structure .mono .meti \ \ >> ( I : < J < K . << L ) .onom in which we can identify the major constituent positions as .metn I , .metn J , .meta K and .metn L . The constituent at position .meta I is the mandatory parameter .codn "(a (b c))" . Position .meta J holds the optional parameter .code c (with default init form .codn frm ). At .meta K is found the optional parameter .code "(d e)" (with default init form .code frm2 and presence-indicating variable .codn de-p ). Finally, the parameter in the dot position .meta L is .codn g , which captures trailing arguments. Obviously, some of the parameters are compound expressions rather than symbols: .code "(a (b c))" and .codn "(d e)" . These compounds express nested macro parameter lists. Starting in \*(TX 285, the symbol .code t can be used in a macro parameter list in place of a parameter name. This indicates that an object is expected at that position in the corresponding structure, but no variable will be bound. For completeness, the .code t symbol may also be used for a presence-indicating variable. When the name of an optional parameter is specified as .codn t , and the corresponding structure is missing, the .meta init-val expression, if present, is still evaluated under the same circumstances as it would if a variable were present. Nested macro parameter lists recursively match the corresponding structure in the argument object. 
For instance, if a simple argument would capture the structure .code "(1 (2 3))" then we can replace the argument with the nested argument list .code "(a (b c))" which destructures the .code "(1 (2 3))" such that the parameters .codn a , .code b and .code c will end up bound to .codn 1 , .code 2 and .codn 3 , respectively. Nested macro parameter lists have all the features of the top-level macro parameter lists: they can have optional arguments with default values, use the dotted position, and contain the .codn :env , .code :whole and .code :form special parameters, which are described below. In nested parameter lists, the binding strictness is relaxed for optional parameters. If .code "(a (b c))" is optional, and the argument is, say, .codn (1) , then .code a gets .codn 1 , and .code b and .code c receive .codn nil . Macro parameter lists also support three special keywords, namely .codn :env , .code :whole and .codn :form . The parameter list .code "(:whole x :env y :form z)" will bind parameter .code x to the entire argument list of the macro, bind parameter .code y to the macro environment and bind parameter .code z to the entire macro form (the original compound form used to invoke the macro). The .codn :env , .code :whole and .code :form notations can occur anywhere in a macro parameter list, other than to the right of the consing dot. They can be used in nested macro parameter lists also. Note that in a nested macro parameter list, .code :form and .code :env do not change meaning: they bind the same object as they would in the top-level of the macro parameter list. However, the .code :whole parameter has a restricted scope inside a nested parameter list: its parameter will capture just that part of the argument material which matches that parameter list, rather than the entire argument list.
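
The difference between
.code :whole
and
.code :form
can be sketched with a small macro. The name
.code show-args
is hypothetical, chosen only for this illustration, and the result shown
is what the rules above imply rather than captured output:
.code w
receives just the arguments, while
.code f
receives the entire invoking form.

.verb
  ;; expands to a list of the quoted :whole object,
  ;; the quoted :form object, and the two arguments
  (defmacro show-args (:whole w :form f a b)
    ^(list ',w ',f ',a ',b))

  (show-args 1 2)
  -> ((1 2) (show-args 1 2) 1 2)
.brev
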
The processing of macro parameter lists omits the feature that when the .code : (colon) keyword symbol is given as the argument to an optional parameter, that argument is treated as a missing argument. This special logic is implemented only in the function argument passing mechanism, not in the binding of macro parameters to object structure. If the colon symbol appears in the object structure and is matched against an optional parameter, it is an ordinary value. That parameter is considered present, and takes on the colon symbol as its value. .TP* "Dialect Note:" In ANSI Common Lisp, the lambda list keyword .code &whole binds its corresponding variable to the entire macro form, whereas \*(TL's .code :whole binds its variable only to the arguments of the macro form. Note, however, that ANSI CL distinguishes between destructuring and macro lambda lists, and the .code &whole parameter has a different behavior in each. Under .codn destructuring-bind , the .code &whole parameter receives just the arguments, just like the behavior of \*(TL's .code :whole parameter. \*(TL does not distinguish between destructuring and macro lambda lists; they are the same and behave the same way. Thus .code :whole is treated the same way in macros as in .code tree-bind and related binding operators: it binds just the arguments to the parameter. \*(TL has the special parameter .code :form by means of which macros can access their invoking form. This parameter is also supported in .code tree-bind and binds to the entire .code tree-bind form. ANSI CL doesn't support the convention that the .code t symbol may appear instead of a parameter symbol to suppress the binding of a variable. .NP* The Macro Expansion Process The following description omits the treatment of top-level forms by .code eval and the compiler. This is described, respectively, in the description of .code eval and the section Top-Level Forms inside the LISP COMPILATION chapter. 
Certain other details are also omitted, such as the dynamic evolution of the macro-time environment, and the expansion of macrolet forms. Macro expansion is, generally speaking, a recursive process. The expression to be expanded is classified into cases, and as necessary, the constituent expressions are recursively expanded, depending on these cases. Certain aspects of the process may be regarded as iterative. Macro expansion maintains a macro-time lexical environment which is extended and contracted as the expander descends into various nested binding constructs. The expander may encounter a bindable symbol. If such a symbol has a binding as a symbol macro, then it is replaced by its expansion, and the expander iterates on the resulting form. The form may be another object, including a symbol. If it is the same symbol, then macro expansion terminates; the symbol remains unsubstituted. Symbols are treated differently by the expander if they are in the Lisp-1-style context of the .code dwim operator, or the equivalent square bracket notation. The expander takes into consideration the semantics of the combined function and variable namespace. The expander may encounter a compound form headed by a symbol which has a macro binding. In this situation, the macro expander function is called, and the form is replaced by the resulting form. That form is considered again as a potential macro. In any case, the expander makes a note that it has expanded a macro. If a form isn't a macro, then it's either a function call, special form or an atomic form: a symbol (that has no binding as a symbol macro) or other atom. The interesting cases are special forms and function calls, since the atomic forms are simply returned as-is without expansion. Special forms and function call forms contain other forms, some or all of which require expansion.
The expander recognizes the shape of each special form or function call, pulls out the constituent expressions and expands them recursively, combining the results into a new version of the special form or function call form. Because \*(TL allows the same symbol to have a macro and function binding, the expander allows for interplay between the two, which produces useful behaviors. Recall from two paragraphs ago that whenever the expander expands a macro, it makes a note that it has done so. Subsequently, suppose that the rounds of macro expansion happen to terminate in such a way that the result is a function call form. The form's constituents are expanded. If the expansion of those constituents produces any change, then the resulting replacement function call form is again examined for the possibility that it may be a macro. This special requirement, not typically implemented by Lisp macro expanders, greatly simplifies the writing of macros which provide algebraic optimizations of function calls. An example follows to illustrate the benefit of the rule. Note that the example involves some simple macros which change the number of times that an argument expression is evaluated. A more careful handling of this issue is omitted in order to keep the examples simple. Suppose a macro is written for the .code sqrt function like this: .verb (defmacro sqrt (:match :form f) (((* @exp @exp)) exp) (@else f)) .brev The macro uses pattern matching to recognize cases like .code "(sqrt (* a a))" when the argument is a product expression with two identical terms. This pattern implements the arithmetic identity that the positive square root of a real term multiplied by itself is just that term.
Now suppose that a similar macro is written to optimize a certain case of the .code expt function: .verb (defmacro expt (:match :form f) ((@exp 2) ^(* ,exp ,exp)) (@else f)) .brev This macro recognizes when the argument is being squared, turning .code "(expt x 2)" into .codn "(* x x)" : a strength reduction from exponentiation to multiplication. What if the following expression is then written: .verb (sqrt (expt x 2)) .brev The special provision in the expander algorithm allows the above combination to reduce to just .codn x , as follows. Firstly, the .code "(sqrt (expt x 2))" expression is treated as a macro call. It doesn't match the main case in the macro, only the fallback case which returns the form unexpanded. The expander notes that it has invoked a macro, and then proceeds to treat the form as a function call. The function call's argument expression .code "(expt x 2)" is expanded as a macro. This produces a transformation: our .code expt macro reduces this quadratic term to .codn "(* x x)" . Here is where the special rule comes into play. The expander sees that the function's arguments have been transformed. It knows that the original function call was the result of expansion. To promote more opportunities for expansion, it tries the transformed function call again as a macro. The .code "(sqrt (* x x))" form is handed to the .code sqrt macro, which this time has a match for the .code "(* x x)" argument pattern, reducing the entire form to .codn x . Effectively, the .code sqrt macro has the opportunity to work with both the unexpanded argument syntax .code "(expt x 2)" as well as its expanded version. It is first offered the one, and when it declines to expand, then the other. .coNP Operator @ defmacro .synb .mets (defmacro < name .mets \ \ \ \ \ \ \ \ \ <> ( param * [: << opt-param * ] [. < rest-param ]) .mets \ \ << body-form *) .syne .desc The .code defmacro operator is evaluated at expansion time. 
It defines a macro-expander function under the name .metn name , effectively creating a new operator. Note that the above syntax synopsis describes only the canonical parameter syntax which remains after parameter list macros are expanded. See the section Parameter List Macros. Note that the parameter list is a macro parameter list, and not a function parameter list. This means that each .meta param and .meta opt-param can be not only a symbol, but it can itself be a parameter list. The corresponding argument is then treated as a structure which matches that parameter list. This nesting of parameter lists can be carried to an arbitrary depth. A macro is called like any other operator, and resembles a function. Unlike in a function call, the macro receives the argument expressions themselves, rather than their values. Therefore it operates on syntax rather than on values. Also, unlike a function call, a macro call occurs in the expansion phase, rather than the evaluation phase. The return value of the macro is the macro expansion. It is substituted in place of the entire macro call form. That form is then expanded again; it may itself be another macro call, or contain more macro calls. A global macro defined using .code defmacro may decline to expand a macro form. Declining to expand is achieved by returning the original unexpanded form, which may be captured using the .code :form parameter. When a global macro declines to expand a form, the form is taken as-is. At evaluation time, it will be treated as a function call. Note: when a local macro defined by .code macrolet declines, more complicated requirements apply; see the description of .codn macrolet . .TP* "Dialect Notes:" A macro in the global namespace introduced by .code defmacro may coexist with a function of the same name introduced by .codn defun . This is not permitted in ANSI Common Lisp. ANSI Common Lisp doesn't describe the concept of declining to expand, except in the area of compiler macros. 
Since TXR Lisp allows global macros and functions of the same name to coexist, ordinary macros can be used to optimize functions in a manner similar to Common Lisp compiler macros. A macro can be written of the same name as a function, and can optimize certain cases of the function call by expanding them to some alternative syntax. Cases which it doesn't optimize are handled by declining to expand, in which case the form remains as the original function call. .TP* Example: .verb ;; dolist macro similar to Common Lisp's: ;; ;; The following will print 1, 2 and 3 ;; on separate lines: ;; and return 42. ;; ;; (dolist (x '(1 2 3) 42) ;; (format t "~s\en" x)) (defmacro dolist ((var list : result) . body) (let ((i (gensym))) ^(for ((,i ,list)) (,i ,result) ((set ,i (cdr ,i))) (let ((,var (car ,i))) ,*body)))) .brev .coNP Operator @ macrolet .synb .mets (macrolet >> ({( name < macro-style-params .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ << macro-body-form *)}*) .mets \ \ << body-form *) .syne .desc The .code macrolet binding operator extends the macro-time lexical environment by making zero or more new local macros visible. The .code macrolet symbol is followed by a list of macro definitions. Each definition is a form which begins with a .metn name , followed by .meta macro-style-params which is a macro parameter list, and zero or more .metn macro-body-form s. These macro definitions are similar to those globally defined by the .code defmacro operator, except that they are in a local environment. The macro definitions are followed by optional .metn body-forms . The macros specified in the definitions are visible to these forms. Forms inside the macro definitions such as the .metn macro-body-form s, and initializer forms appearing in the .meta macro-style-params are subject to macro-expansion in a scope in which none of the new macros being defined are yet visible. 
Once the macro definitions are themselves macro-expanded, they are placed into a new macro environment, which is then used for macro expanding the .metn body-form s. A .code macrolet form is fully processed in the expansion phase of a form, and is effectively replaced by a .code progn form which contains expanded versions of .metn body-form s. This expanded structure shows no evidence that any macrolet forms ever existed in it. Therefore, it is impossible for the code evaluated in the bodies and parameter lists of .code macrolet macros to have any visibility into any surrounding lexical variable bindings, which are only instantiated in the evaluation phase, after expansion is done and macros no longer exist. A local macro defined using .code macrolet may decline to expand a macro form. Declining to expand is achieved by returning the original unexpanded form, which may be captured using the .code :form parameter. When a local macro declines to expand a form, the macro definition is temporarily hidden, as if it didn't exist in the lexical scope. If another macro of the same name is thereby revealed (a global macro or another local macro at a shallower nesting level), then an expansion is tried with that macro. If no such macro is revealed, or if a lexical function binding of that name is revealed, then no expansion takes place; the original form is taken as-is. When another macro is tried, the process repeats, resulting in a search which proceeds as far as possible through outer lexical scopes and finally the global scope. .coNP Function @ macro-form-p .synb .mets (macro-form-p < obj <> [ env ]) .syne .desc The .code macro-form-p function returns .code t if .meta obj represents the syntax of a form which is a macro form: either a compound macro or a symbol macro. Otherwise it returns .codn nil . A macro form is one that will transform under .code macroexpand-1 or .codn macroexpand ; an object which isn't a macro form will not undergo expansion.
The optional .meta env parameter is a macroexpansion environment. A macroexpansion environment is passed down to macros and can be received via their special .code :env parameter. .meta env is used by .code macro-form-p to determine whether .meta obj is a macro in a lexical macro environment. If .meta env is not specified or is .codn nil , then .code macro-form-p only recognizes global macros. .TP* Example: .verb ;; macro which translates to 'yes if its ;; argument is a macro form, or otherwise ;; transforms to the form 'no. (defmacro ismacro (:env menv form) (if (macro-form-p form menv) ''yes ''no)) (macrolet ((local ())) (ismacro (local))) ;; yields yes (ismacro (local)) ;; yields no (ismacro (ismacro foo)) ;; yields yes .brev During macro expansion, the global macro .code ismacro is handed the macro-expansion environment via .codn ":env menv" . When the macro is invoked within the macrolet, this environment includes the macro-time lexical scope in which the .code local macro is defined. So when .code ismacro checks whether the argument form .code (local) is a macro, the conclusion is yes: the (local) form is a macro call in that environment: .code macro-form-p yields .codn t . When .code "(ismacro (local))" is invoked outside of the macrolet, no local macro is visible there, and so .code macro-form-p yields .codn nil . .coNP Functions @ macroexpand-1 and @ macroexpand .synb .mets (macroexpand-1 < obj <> [ env ]) .mets (macroexpand < obj <> [ env ]) .syne .desc If .meta obj is a macro form (an object for which .code macro-form-p returns .codn t ), these functions expand the macro form and return the expanded form. Otherwise, they return .metn obj . .code macroexpand-1 performs a single expansion, expanding just the macro that is referenced by the symbol in the first position of .metn obj , and returns the expansion. That expansion may itself be a macro form. .code macroexpand performs an expansion similar to .codn macroexpand-1 .
If the result is a macro form, then it expands that form, and keeps repeating this process until the expansion yields a non-macro-form. That non-macro-form is then returned. The optional .meta env parameter is a macroexpansion environment. A macroexpansion environment is passed down to macros and can be received via their special .code :env parameter. The environment they receive is their lexically apparent macro-time environment in which local macros may be visible. A macro can use this environment to "manually" expand some form in the context of that environment. .TP* Example: .verb ;; (rem-num x) expands x, and if x begins with a number, ;; it removes the number and returns the resulting ;; form. Otherwise, it returns the entire form. (defmacro rem-num (:env menv some-form) (let ((expanded (macroexpand some-form menv))) (if (numberp (car expanded)) (cdr expanded) some-form))) (macrolet ((foo () '(1 list 42)) (bar () '(list 'a))) (list (rem-num (foo)) (rem-num (bar)))) --> ((42) (a)) .brev The .code rem-num macro is able to expand the .code (foo) and .code (bar) forms it receives as the .code some-form argument, even though these forms use local macros that are only visible in their local scope. This is thanks to the macro environment passed to .codn rem-num . It is correctly able to work with the expansions .code "(1 list 42)" and .code "(list 'a)" to produce .code "(list 42)" and .code "(list 'a)" which evaluate to .code (42) and .code (a) respectively. .coNP Functions @ macroexpand-1-lisp1 and @ macroexpand-lisp1 .synb .mets (macroexpand-1-lisp1 < obj <> [ env ]) .mets (macroexpand-lisp1 < obj <> [ env ]) .syne .desc The .code macroexpand-1-lisp1 and .code macroexpand-lisp1 functions closely resemble, respectively, .code macroexpand-1 and .codn macroexpand . The argument and return value syntax and semantics are almost identical, except for one difference.
These functions consider argument .meta obj to be syntax in a Lisp-1 evaluation context, such as any argument position of the .code dwim operator, or the equivalent DWIM Brackets notation. This makes a difference because in a Lisp-1 evaluation context, an inner function binding is able to shadow an outer symbol macro binding of the same name. The requirements about this language area are given in more detail in the description of the .code dwim operator. Note: the .code macroexpand-lisp1 function is useful to the implementor of a macro whose semantics requires one or more argument forms to be treated in a Lisp-1 context, in situations when such a macro needs to itself expand the material, rather than merely insert it as-is into the output code template. .coNP Functions @ expand and @ expand* .synb .mets (expand < form <> [ env ]) .mets (expand* < form <> [ env ]) .syne .desc The functions .code expand and .code expand* both perform a complete expansion of .meta form in the macro-environment .metn env , and return that expansion. If .meta env is omitted, the expansion takes place in the global environment in which only global macros are visible. The returned object is a structure that is devoid of any macro calls. Also, all .code macrolet and .code symacrolet blocks in form .meta form are removed in the returned structure, replaced by their fully expanded bodies. The difference between .code expand and .code expand* is that .code expand suppresses expansion-time deferred warnings (exceptions of type .codn defr-warning ), issued for unbound variables or functions. To suppress a warning means to intercept the warning exception with a handler which throws a .code continue exception to resume processing. What this requirement means is that if unbound functions or variables occur in the .meta form being expanded by expand, the warning is effectively squelched. 
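.TP* Example:

The following is a minimal sketch of a call to
.code expand
in the global environment. The macro name
.code twice
and the variable
.code y
are hypothetical, introduced only for illustration. The variable
.code y
is deliberately left unbound, which is the situation in which
.code expand
is described above as suppressing the deferred warning:

.verb
  ;; hypothetical global macro used only for this illustration
  (defmacro twice (x) ^(list ,x ,x))

  ;; both the outer and the inner macro call are expanded;
  ;; y is unbound, which expand does not complain about
  (expand '(twice (twice y)))
  ;; yields (list (list y y) (list y y))
.brev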
Rationale: .code expand may be used by macros for expanding fragments which contain references to variables or functions which are not defined in those fragments. .coNP Function @ expand-with-free-refs .synb .mets (expand-with-free-refs < form >> [ inner-env <> [ outer-env ]]) .syne .desc The .code expand-with-free-refs function performs a full expansion of .metn form , as if by the .code expand function and returns a list containing that expansion, plus four additional items which provide information about variable and function references which occur in .metn form . If both .meta inner-env and .meta outer-env are provided, then it is expected that .meta inner-env is lexically nested within .metn outer-env . Note: it is not required that .meta outer-env be the immediate parent of .metn inner-env . Note: a common usage situation is that .meta outer-env is the environment of the invocation of a "parent" macro which generates a form that contains local macros. The bodies of those local macros use .codn expand-with-free-refs , specifying their own environment as .meta inner-env and that of their generating "parent" as .metn outer-env . In detail, the five items of the returned list are .mono .meti >> ( expansion < fv-inner < ff-inner < fv-outer << ff-outer ) .onom whose descriptions are: .RS .meIP < expansion The full expansion of .metn form , containing no macro invocations, or .code symacrolet or .code macrolet forms. .meIP < fv-inner A list of the free variables which occur in .meta form relative to the .meta inner-env environment. That is to say, variables that are not bound inside .meta form and are not also bound in .metn inner-env . If .meta inner-env is omitted, then these are the absolutely free variables occurring in .metn form . .meIP < ff-inner Exactly like .meta fv-inner but informing about function bindings rather than variables.
.meIP < fv-outer A list of the variables which occur in .meta form which would be free if the environments between .meta inner-env and .meta outer-env (including the former, excluding the latter) were removed from consideration. A more detailed description of this semantics is given below. If .meta outer-env is omitted, then these are the absolutely free variables occurring in .metn form , ignoring the .metn inner-env . .meIP < ff-outer Exactly like .meta fv-outer but informing about function bindings rather than variables. .RE .IP The semantics of the treatment of .meta inner-env and .meta outer-env in the calculation of .meta fv-outer and .meta ff-outer is as follows. A new environment .meta diff-env is calculated from these two environments, and .meta form is expanded in this environment. Variables and functions occurring in .meta form which are not bound in .meta diff-env are listed as .meta fv-outer and .metn ff-outer . This .meta diff-env is calculated as follows. First .meta diff-env is initialized as a copy of .metn outer-env . Then, all environments below .meta outer-env down to .meta inner-env are examined for bindings which shadow bindings in .metn diff-env . Those shadows are removed from .metn diff-env . Therefore, what remains in .meta diff-env are those bindings from .meta outer-env that are .I not shadowed by the environments between .meta inner-env and .metn outer-env . Within each of the lists of variables returned by .codn expand-with-free-refs , the order of the variables is not specified.
.TP* Example: Suppose that .code mac is a macro which somehow has access to the two indicated lexical environments in the following code snippet: .verb (let (a c) ;; <- outer-env (let (b) (let (c) ;; <- inner-env (mac (list a b c d))))) .brev Suppose that .code mac invokes the .code expand-with-free-refs function, passing in the .code "(list a b c d)" argument form as .code form and two macro-time environment objects corresponding to the indicated environments. Then the following object shall be a correct return value of .codn expand-with-free-refs : .verb ((list a b c d) (d) nil (d c b) nil) .brev A complete code example of this is given below. Other correct return values are possible due to permitted variations in the order of the variables within the four lists. For instance, instead of .code "(d c b)" the list .code "(c b d)" may appear. The .meta fv-inner list is .code "(d)" because this is the only variable that occurs in .code "(list a b c d)" which is free with regard to .metn inner-env . The .codn a , .code b and .code c variables are not listed because they appear bound inside .metn inner-env . The reported .meta fv-outer list is .code "(b c d)" because the form is considered against .meta diff-env which is formed by removing the shadowing bindings from .metn outer-env . The difference between .code "(a c)" and .code "(b c)" is .code a and so the form is considered in an environment containing the binding .code a which leaves .code "(b c d)" free. 
The following is a complete code sample demonstrating the above descriptions: .verb ;; Given this macro: (defmacro bigmac (:env out-env big-form) ^(macrolet ((mac (:env in-env little-form) ^',(expand-with-free-refs little-form in-env ,out-env))) ,big-form)) (let (a c) ;; <- outer-env, surrounding bigmac (bigmac (let (b) (let (c) ;; <- inner-env, surrounding mac (mac (list a b c d)))))) --> ((list a b c d) (d) nil (d c b) nil) .brev Note: this information is useful because a set difference can be calculated between the two reported sets. The set difference between the .meta fv-outer variables .code "(b c d)" and the .meta fv-inner variables .code "(d)" is .codn "(b c)" . That set difference .code "(b c)" is significant because it precisely informs about the .I bound variables which occur in .code "(list a b c d)" which appear bound in .metn inner-env , but are not bound due to a binding coming from .metn outer-env . In the above example, these are the variables enclosed in the .code bigmac macro, but external to the inner .code mac macro. The variable .code d is not listed in .code "(b c)" because it is not a bound variable. The variable .code a is not in .code "(b c)" because though it is bound in .metn inner-env , that binding comes from .metn outer-env . The upshot of this logic is that it allows a macro to inspect a form in order to discover the identities of the variables and functions which are used inside that form, whose definitions come from a specific, bounded scope surrounding that form. .coNP Functions @, lexical-var-p @, lexical-fun-p @ lexical-symacro-p and @ lexical-macro-p .synb .mets (lexical-var-p < env << form ) .mets (lexical-fun-p < env << form ) .mets (lexical-symacro-p < env << form ) .mets (lexical-macro-p < env << form ) .syne .desc These four functions are useful to macro writers. They are intended to be called from the bodies of macro expanders, such as the bodies of .code defmacro or .code macrolet forms. 
The .meta env argument is a macro-time environment, which is available to macros via the special .code :env parameter. Using these functions, a macro can enquire whether a given .meta form is, respectively, a symbol which has a variable binding, a function binding, a symbol macro (defined by .codn symacrolet ) or a macro (defined by .codn macrolet ) in the environment of the macro's invocation. This information is known during macro expansion. The macro expander recognizes lexical function and variable bindings, because these bindings can shadow macros. Special variables are not lexical. The function .code lexical-var-p returns .code nil if .meta form satisfies the .code special-var-p function, indicating that it is the name of a special variable. The .code lexical-var-p function also returns .code nil for global lexical variables. If .meta form is a symbol for which only a global lexical variable binding is apparent, .code lexical-var-p returns .codn nil . Testing for the existence of a global variable can be done using .codn boundp ; if a symbol is .code boundp but not .codn special-var-p , then it is a global lexical variable. Similarly, .code lexical-fun-p returns .code nil for global functions, .code lexical-symacro-p returns .code nil for global symbol macros and .code lexical-macro-p returns .code nil for global macros. .TP* Example: .verb ;; ;; this macro replaces itself with :lexical-var if its ;; argument is a lexical variable, :lexical-fun if ;; its argument is a lexical function, or with ;; :not-lex-fun-var if neither is the case.
;; (defmacro classify (sym :env e) (cond ((lexical-var-p e sym) :lexical-var) ((lexical-fun-p e sym) :lexical-fun) (t :not-lex-fun-var))) ;; ;; Use classify macro above to report classification ;; of the x, y and f symbols in the given scope ;; (let ((x 1) (y 2)) (symacrolet ((y x)) (flet ((f () (+ 2 2))) (list (classify x) (classify y) (classify f))))) --> (:lexical-var :not-lex-fun-var :lexical-fun) ;; Locally bound specials are not lexical (let ((*stdout* *stdnull*)) (classify *stdout*)) --> :not-lex-fun-var .brev .TP* Note: .coNP Function @ lexical-binding-kind .synb .mets (lexical-binding-kind < env << symbol ) .syne .desc The .code lexical-binding-kind function inspects the macro-time environment .meta env to determine what kind of binding, if any, .meta symbol has in the variable namespace of that environment. If the innermost binding for .meta symbol is a variable binding, then .code :var is returned if the variable is lexical, or .code nil if the variable is special. If the innermost binding for .meta symbol is a symbol macro, then .code :symacro is returned. In all other cases, .code nil is returned. The function does not consider global symbol macros or global lexical variables. .coNP Function @ lexical-fun-binding-kind .synb .mets (lexical-fun-binding-kind < env << symbol ) .syne .desc The .code lexical-fun-binding-kind function inspects the macro-time environment .meta env to determine what kind of binding, if any, .meta symbol has in the function namespace of that environment. If the innermost binding for .meta symbol is a function binding, then .code :fun is returned. If the innermost binding for .meta symbol is a macro, then .code :macro is returned. In all other cases, .code nil is returned. The function does not consider global macros or functions.
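.TP* Example:

The following is a minimal sketch, not a definitive usage pattern, which applies both
.code lexical-binding-kind
and
.code lexical-fun-binding-kind
to the same symbol. The macro name
.code kinds
and the symbols
.codn x ,
.codn y ,
.code f
and
.code m
are hypothetical, chosen only for illustration. The indicated result follows from the descriptions given above.

.verb
  ;; hypothetical helper: expands to a quoted two-element list
  ;; holding the variable-namespace kind and the
  ;; function-namespace kind of its argument symbol
  (defmacro kinds (:env e sym)
    ^'(,(lexical-binding-kind e sym)
       ,(lexical-fun-binding-kind e sym)))

  (let ((x 1))
    (symacrolet ((y x))
      (flet ((f () 42))
        (macrolet ((m () 0))
          (list (kinds x) (kinds y) (kinds f) (kinds m))))))
  ;; per the descriptions above, this should yield
  ;; ((:var nil) (:symacro nil) (nil :fun) (nil :macro))
.brev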
.coNP Function @ lexical-lisp1-binding .synb .mets (lexical-lisp1-binding < env << symbol ) .syne .desc The .code lexical-lisp1-binding function inspects the macro-time environment .meta env to determine what kind of binding, if any, .meta symbol has in that environment, from a Lisp-1 perspective. That is to say, it considers function bindings, variable bindings and symbol macro bindings to be in a single name space and finds the innermost binding of one of these types for .metn symbol . If such a binding is found, then the function returns one of the three keyword symbols .codn :var , .codn :fun , or .codn :symacro . If no such lexical binding is found, then the function returns .codn nil . Note that .code :var is never returned for a special variable, but such a variable can be shadowed by a symbol macro, in which case .code :symacro is returned. Note that a .code nil return doesn't mean that the symbol doesn't have a lexical binding. It could have an operator macro lexical binding (a macro binding in the function namespace established by .codn macrolet ). Unlike the .code lexical-fun-binding-kind function, the .code lexical-lisp1-binding function never returns .code :macro because Lisp-1-style evaluation of symbols is blind to the existence of macros, other than symbol macros. .coNP Operator @ defsymacro .synb .mets (defsymacro < sym << form ) .syne .desc A .code defsymacro form introduces a symbol macro. A symbol macro consists of a binding between a symbol .meta sym and a .metn form . The binding denotes the form itself, rather than its value. The .meta form argument is not subject to macro expansion; it is associated with .meta sym in its unexpanded state, as it appears in the .code defsymacro form. The .code defsymacro form must be evaluated for the definition to take place; therefore, the definition is not available in the top-level form which contains the .code defsymacro invocation; it becomes available to a subsequent top-level form.
Subsequent to the evaluation of the .code defsymacro definition, whenever the macro expander encounters .meta sym as a form, it replaces it by .metn form . After this replacement takes place, .meta form itself is then processed for further replacement of macros and symbol macros. Symbol macros are also recognized in contexts where .meta sym denotes a place which is the target of an assignment operation like .code set and similar. Note: if a symbol macro expands to itself directly, expansion stops. However, if a symbol macro expands to itself through a chain of expansions, runaway expansion-time recursion will occur. If a global variable exists by the name .metn sym , then .code defsymacro first removes that variable from the global environment, and if that variable is special, the symbol's special marking is removed. .code defsymacro doesn't alter the dynamic binding of a special variable. Any such binding remains intact. If .code defsymacro is evaluated in a scope in which there is any lexical or dynamic binding of .meta sym in the variable namespace, whether as a variable or macro, the global symbol macro is shadowed by that binding. .coNP Operator @ symacrolet .synb .mets (symacrolet >> ({( sym << form )}*) << body-form *) .syne .desc The .code symacrolet operator binds local, lexically scoped macros that are similar to the global symbol macros introduced by .codn defsymacro . Each .meta sym in the bindings list is bound to its corresponding form, creating a new extension of the expansion-time lexical macro environment. Each .meta body-form is subsequently macro-expanded in this new environment in which the new symbol macros are visible. Note: ordinary lexical bindings such as those introduced by let or by function parameter lists shadow symbol macros.
If a symbol .code x is bound by nested instances of .code symacrolet and a .codn let , then the scope enclosed by both constructs will see whichever of the two bindings is more inner, even though the bindings are active in completely separate phases of processing. From the perspective of the arguments of a .code dwim form, lexical function bindings also shadow symbol macros. This is consistent with the Lisp-1-style name resolution which applies inside a .code dwim form. Lexical operator macros do not shadow symbol macros under any circumstances. .coNP Macros @ placelet and @ placelet* .synb .mets (placelet >> ({( sym << place )}*) << body-form *) .mets (placelet* >> ({( sym << place )}*) << body-form *) .syne .desc The .code placelet macro binds lexically scoped symbol macros in such a way that they behave as aliases for places denoted by place forms. Each .meta place must be an expression denoting a syntactic place. The corresponding .meta sym is established as an alias for the storage location which that place denotes, over the scope of the .metn body-form s. This binding takes place in such a way that each .meta place is evaluated exactly once, only in order to determine its storage location. The corresponding .meta sym then serves as an alias for that location, over the scope of the .metn body-form s. This means that whenever .meta sym is evaluated, it stands for the value of the storage location, and whenever a value is apparently stored into .metn sym , it is actually the storage location which receives it. The .code placelet* variant implements an alternative scoping rule, which allows a later .meta place form to refer to a .meta sym bound to an earlier .meta place form. In other words, a given .meta sym binding is visible not only to the .metn body-form s but also to .meta place forms which occur later.
Note: certain kinds of places, notably .mono .meti (force << promise ) .onom expressions, must be accessed before they can be stored, and this restriction continues to hold when those places are accessed through .code placelet aliases. Note: .code placelet differs from .code symacrolet in that the forms themselves are not aliased, but the storage locations which they denote. .code "(symacrolet ((x y)) z)" performs the syntactic substitution of symbol .code x by form .codn y , wherever .code x appears inside .code z as an evaluated form, and is not shadowed by any inner binding. Whereas .code "(placelet ((x y)) z)" generates code which arranges for .code y to be evaluated to a storage location, and syntactically replaces occurrences of .code x with a form which directly denotes that storage location, wherever .code x appears inside .code z as an evaluated form, and is not shadowed by any inner binding. Also, .code x is not necessarily substituted by a single, fixed form, as in the case of .codn symacrolet . Rather it may be substituted by one kind of form when it is treated as a pure value, and another kind of form when it is treated as a place. Note: multiple accesses to an alias created by .code placelet denote multiple accesses to the aliased storage location. That can mean multiple function calls or array indexing operations and such. If the target of the alias is .mono .meti (read-once << place ) .onom instead of .metn place , then a single access occurs to fetch the prior value of .metn place , which is stored in a hidden variable. All of the multiple occurrences of the alias then simply retrieve this cached prior value from the hidden variable, rather than accessing the place. The .code read-once macro is independent of .code placelet and separately documented.
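.TP* "Example:"

The following is a minimal sketch of aliasing a vector element through
.codn placelet .
The variable names
.code v
and
.code e
are hypothetical, chosen only for illustration. Both the store and the increment go through the alias to the same element of the vector:

.verb
  (let ((v (vec 1 2 3)))
    (placelet ((e [v 1]))  ;; e aliases the second element of v
      (set e 20)           ;; stores 20 into the vector element
      (inc e))             ;; reads and updates that same element
    v)
  ;; yields #(1 21 3)
.brev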
.TP* "Example:" Implementation of .code inc using .codn placelet : .verb (defmacro inc (place : (delta 1)) (with-gensyms (p) ^(placelet ((,p ,place)) (set ,p (+ ,p ,delta))))) .brev The gensym .code p is used to avoid accidental capture of references emanating from the .code delta form. .coNP Macro @ expander-let .synb .mets (expander-let >> ({( sym << init-form )}*) << body-form *) .syne .desc The .code expander-let operator strongly resembles .code let* but has different semantics, relevant to expansion. It also has a stricter syntax in that variables may not be symbols without a .metn init-form : only variable binding specifications of the form .mono .meti >> (sym << init-form ) .onom are allowed. Symbols bound using .code expander-let are expected to be special variables. For every .metn sym , the expression .mono .meti (special-var-p << sym ) .onom should be true. The behavior is unspecified for any .meta sym which doesn't name a special variable. The .code expander-let macro establishes a new dynamic environment in which each given .meta sym has the value of the specified .meta init-form which is evaluated in the top-level environment. Then, the .metn body-form s are turned into the arguments of a .code progn form, and that form is then expanded in the new environment in which the dynamic bindings are visible. Thus .code expander-let may be used to bind special variables which are visible to expansion-time computations occurring within .metn body-form s. A macro may generate an .code expander-let form in order to communicate values to macros contained in that form. .coNP Macro @ macro-time .synb .mets (macro-time << form *) .syne .desc The .code macro-time macro evaluates its arguments immediately during macro expansion. The .meta form arguments are processed from left to right. Each .meta form is fully expanded and evaluated in the top-level environment before the next form is considered.
The value of the last .metn form , or else .code nil if there aren't any arguments, is converted into a literal expression which denotes that value, and the resulting literal is produced as the expansion of .codn macro-time . Note: .code macro-time supports techniques that require a calculation to be performed in the environment where the program is being compiled, with the result of that calculation inserted as a literal into the program source. Possibly, the calculation can have some useful effect in that environment, or use as input some information that is available in that environment. The .code load-time operator also inserts a calculated value as a de facto literal into the program, but it performs that calculation in the environment where the compiled file is being loaded. The two operators may be considered complementary in this sense. Consider the source file: .verb (defun host-name-c () (macro-time (uname).nodename)) (defun host-name-l () (load-time (uname).nodename)) .brev If this is compiled via .codn compile-file , the .code uname call in .code host-name-c takes place when it is macro-expanded. Thereafter, the compiled version of the function returns the name of the machine where the compilation took place, no matter in what environment it is subsequently loaded and called. In contrast, the compilation of .code host-name-l arranges for that function's .code uname call to take place just one time, whenever the compiled file is loaded. Each time the function is subsequently called, it will return the name of the machine where it was loaded, without making any additional calls to .codn uname . Note: .code macro-time can be understood in terms of the following implementation. Note that this implementation always produces a .code quote expression, which .code macro-time is not required to do if .meta val is self-evaluating: .verb (defmacro macro-time (.
forms) (let (val) (each ((f forms)) (set val (eval f))) ^(quote ,val))) .brev Because .code eval treats a top-level .code progn specially, this implementation is also possible: .verb (defmacro macro-time (. forms) ^(quote ,(eval ^(progn ,*forms)))) .brev .TP* Examples: .verb ;; The (1 2 3) object is produced at macro-expansion time, becoming ;; a quoted literal which evaluates to (1 2 3). (macro-time (list 1 2 3)) -> (1 2 3) ;; The above fact is revealed by macroexpand: the list form was ;; evaluated, and then quote was inserted to produce (quote (1 2 3)) ;; which is notated '(1 2 3): (macroexpand '(macro-time (list 1 2 3))) -> '(1 2 3) ;; Quote isn't required on a self-evaluating object; it serves ;; as a literal expression denoting itself: (macroexpand '(macro-time (join-with "-" "a" "b"))) -> "a-b" .brev .coNP Macro @ equot .synb .mets (equot << form ) .syne .desc The .code equot macro ("expand and quote") performs a full expansion of .meta form in the surrounding macro environment. Then it constructs a .code quote form whose argument is the expansion. This quote form is then returned as the macro replacement for the original .code equot form. .TP* Example: .verb (symacrolet ((a (+ 2 2))) (list (quote a) (equot a) a)) --> (a (+ 2 2) 4) .brev Above, the expansion of .code a is .codn "(+ 2 2)" . Thus the macro call .code "(equot a)" expands to .codn "(quote (+ 2 2))" . When that is evaluated, it yields .codn "(+ 2 2)" . If .code a is quoted, then the result is .codn a : no expansion or evaluation takes place. Whereas if .code a is presented for evaluation, then not only is it expanded to .codn "(+ 2 2)" , but that expansion is reduced to 4. The .code equot operator is an intermediate point between these two semantics: it permits expansion to proceed, but then suppresses evaluation of the result.
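.TP* Example:

The following is a further minimal sketch, showing that
.code equot
also sees local operator macros established by
.codn macrolet .
The local macro
.code sq
and the variable
.code n
are hypothetical names used only for illustration:

.verb
  (let ((n 3))
    (macrolet ((sq (x) ^(* ,x ,x)))
      (list (equot (sq n))   ;; expanded but not evaluated
            (sq n))))        ;; expanded and evaluated
  ;; yields ((* n n) 9)
.brev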
.coNP Operators @, tree-bind @ mac-param-bind and @ mac-env-param-bind .synb .mets (tree-bind < macro-style-params < expr << form *) .mets (mac-param-bind < context-expr .mets \ \ < macro-style-params < expr << form *) .mets (mac-env-param-bind < context-expr < env-expr .mets \ \ < macro-style-params < expr << form *) .syne .desc The .code tree-bind operator evaluates .metn expr , and then uses the resulting value as a counterpart to a macro-style parameter list. If the value has a tree structure which matches the parameters, then those parameters are established as bindings, and the .metn form s, if any, are evaluated in the scope of those bindings. The value of the last .meta form is returned. If there are no forms, .code nil is returned. Under .codn tree-bind , the value of the .code :form available to .meta macro-style-params is the .code tree-bind form itself. The .code mac-param-bind operator is similar to .code tree-bind except that it takes an extra argument, .metn context-expr . This argument is an expression which is evaluated. It is expected to evaluate to a compound form. If an error occurs during binding, the error diagnostic message is based on information obtained from this form. By contrast, the .code tree-bind operator's error diagnostic refers to the .code tree-bind form, which is cryptic if the binding is used for the implementation of some other construct, hidden from the user of that construct. In addition, .meta context-expr specifies the value for the .code :form parameter that .meta macro-style-params may refer to. The .code mac-env-param-bind operator is an extension of .code mac-param-bind which takes one more argument, .metn env-expr , before the macro parameters. This expression is evaluated, and becomes the value of the .code :env parameter that .meta macro-style-params may refer to. Under .code tree-bind and .codn mac-param-bind , the .code :env parameter takes on the value .codn nil .
Under all three operators, the .code :whole parameter takes on the value of .metn expr . These operators throw an exception if there is a structural mismatch between the parameters and the value of .codn expr . One way to avoid this exception is to use .codn tree-case , which is based on the conventions of .codn tree-bind . There exists no .code tree-case analog for .code mac-param-bind or .codn mac-env-param-bind . .coNP Operator @ tree-case .synb .mets (tree-case < expr >> {( macro-style-params << form *)}*) .syne .desc The .code tree-case operator evaluates .meta expr and matches it against a succession of zero or more cases. Each case defines a pattern match, expressed as a macro style parameter list .metn macro-style-params . If the object produced by .meta expr matches .metn macro-style-params , then the parameters are bound, becoming local variables, and the .metn form s, if any, are evaluated in order in the environment in which those variables are visible. If there are forms, the value of the last .meta form becomes the result value of the case, otherwise the result value of the case is nil. If the result value of a case is the object .code : (the colon symbol), then processing continues with the next case. Otherwise the evaluation of .code tree-case terminates, returning the result value. If the value of .meta expr does not match the .meta macro-style-params parameter list of a case, processing continues with the next case. If no cases match, then .code tree-case terminates, returning .codn nil . .TP* Example: .verb ;; reverse function implemented using tree-case (defun tb-reverse (obj) (tree-case obj (() ()) ;; the empty list is just returned ((a) obj) ;; one-element list returned ((a . b) ^(,*(tb-reverse b) ,a)) ;; car/cdr recursion (a a))) ;; atom is just returned .brev Note that in this example, the atom case is placed last, because an argument list which consists of a symbol is a "catch all" match that matches any object. 
We know that it matches an atom, because the previous .code "(a . b)" case matches conses. In general, the order of the cases in .code tree-case is important: even more so than the order of cases in a .code cond or .codn caseql . The one-element list case is unnecessary; it can be removed. .coNP Macro @ tb .synb .mets (tb < macro-style-params << form *) .syne .desc The .code tb macro is similar to the .code lambda operator but its argument binding is based on a macro-style parameter list. The name is an abbreviation of .codn tree-bind . A .code tb form evaluates to a function which takes a variable number of arguments. When that function is called, those arguments are taken as a list object which is matched against .meta macro-style-params as if by .codn tree-bind . If the match is successful, then the parameters are bound to the corresponding elements from the argument structure and each successive .meta form is evaluated in an environment in which those bindings are visible. The value of the last .meta form is the return value of the function. If there are no forms, the function's return value is .codn nil . The following equivalence holds, where .code args should be understood to be a globally unique symbol: .verb (tb pattern body ...) <--> (lambda (. args) (tree-bind pattern args body ...)) .brev .coNP Macro @ tc .synb .mets (tc >> {( macro-style-params << form *)}*) .syne .desc The .code tc macro produces an anonymous function whose behavior is closely based on the .code tree-case operator. Its name is an abbreviation of .codn tree-case . The anonymous function takes a variable number of arguments. Its argument list is taken to be the value which is tested against the multiple pattern clauses of an implicit .codn tree-case . The return value of the function is that of the implied .codn tree-case . The following equivalence holds, where .code args should be understood to be a globally unique symbol: .verb (tc clause1 clause2 ...) <--> (lambda (. 
args) (tree-case args clause1 clause2 ...)) .brev .coNP Macro @ with-gensyms .synb .mets (with-gensyms <> ( sym *) << body-form *) .syne .desc The .code with-gensyms evaluates the .metn body-form s in an environment in which each variable name symbol .meta sym is bound to a new uninterned symbol ("gensym"). .TP* "Example:" The code: .verb (let ((x (gensym)) (y (gensym)) (z (gensym))) ^(,x ,y ,z)) .brev may be expressed more conveniently using the .code with-gensyms shorthand: .verb (with-gensyms (x y z) ^(,x ,y ,z)) .brev .SS* Parameter List Macros Parameter list macros, also more briefly called .I "parameter macros" are an original feature of \*(TL. If the first element of a function or macro parameter list is a keyword symbol other than .codn :env , .codn :whole , .code :form or .code : (the colon symbol), it denotes a parameter macro. This keyword symbol is expected to have a binding in the parameter macro namespace: a global namespace which associates keyword symbols with parameter list expander functions. Parameter list macros are recognized in both function parameter lists and macro parameter lists. A macro parameter list can, via nesting, contain multiple nested parameter lists. Each such nested list may contain parameter macro invocations; those are all traversed and processed. Expansion of a parameter list macro occurs at macro-expansion time, when a function's or macro's parameter list is traversed by the macro expander. It takes place as follows. First, the keyword is removed from the parameter list. The keyword's binding in the parameter macro namespace is retrieved. If it doesn't exist, an exception is thrown. Otherwise, the remaining parameter list is first recursively processed for more occurrences of parameter macros. This expansion produces a transformed parameter list, along with a transformed function body. These two artifacts are then passed to the transformer function retrieved from the keyword symbol's binding. 
The function returns a further transformed version of the parameter list and body. These are processed for more parameter macros. The process terminates when no more expansion is possible, because a parameter list has been produced which does not begin with a parameter macro. This final parameter list and its accompanying body are then taken in place of the original parameter list and body. \*(TL provides two built-in parameter list macros. The .code :key parameter macro endows a function with keyword parameters. The .code :match parameter macro allows a function to be expressed using pattern matching, which requires the body to consist of pattern-matching clauses. The implementation of both of these macros is written entirely using this parameter list macro mechanism, by means of the public .code define-param-expander macro. .coNP Special Variable @ *param-macro* .desc The variable .code *param-macro* holds a hash table which associates keyword symbols with parameter list expander functions. The functions are expected to conform to the following syntax: .mono .mets (lambda >> ( params < body < env << form ) << form *) .onom The .meta params parameter receives the parameter list of the function which is undergoing parameter expansion. All other parameter macros have already been expanded. The .meta body parameter receives the list of body forms. The function is expected to return a .code cons cell whose .code car contains the transformed parameter list, and whose .code cdr contains the transformed list of body forms. Parameter expansion takes place at macro expansion time. The .meta env parameter receives the macro-expansion-time environment which surrounds the function being expanded. Note that this environment doesn't take into account the parameters themselves; therefore, it is not the correct environment for expanding macros among the .meta body forms. For that purpose, it must be extended with shadowing entries, the manner of doing which is undocumented. 
However .meta env may be used directly for expanding init forms for optional parameters occurring in .metn params . The .meta form parameter receives the overall function-defining form that is being processed, such as a .code defun or .code lambda form. This is intended for error reporting. A parameter transformer returns the transformed parameter list and body as a single object: a list whose first element is the parameter list, and whose remaining elements are the forms of the body. Thus, the following is a correct null transformer: .verb (lambda (params body env form) (cons params body)) .brev .coNP Macro @ define-param-expander .synb .mets (define-param-expander < name >> ( pvar < bvar : < evar << fvar ) .mets \ \ << form *) .syne .desc The .code define-param-expander macro provides syntax for defining parameter macros. Invocations of this macro expand to code which constructs an anonymous function and installs it into the .code *param-macro* hash table, under the key given by .metn name . The .meta name parameter's argument should be a keyword symbol that is valid for use as a parameter macro name. The .metn pvar , .metn bvar , .meta evar and .meta fvar arguments must be symbols suitable for variable binding. These symbols define the parameters of the expander function which shall, respectively, receive the parameter list, body forms, macro environment and function form. If .meta evar is omitted, a symbol generated by the .code gensym function is used. Likewise if .meta fvar is omitted. The .meta form arguments constitute the body of the expander. The .code define-param-expander form returns .metn name . The parameter macro returns the transformed parameter list and body as a single object: a list whose first element is the parameter list, and whose remaining elements are the forms of the body. .TP* Example: The following example shows the implementation of a parameter macro .code :memo which provides rudimentary memoization. Using the macro is extremely easy. 
It is a matter of simply inserting the .code :memo keyword at the front of a function's parameter list. The function is then memoized. .verb (defvarl %memo% (hash :weak-keys)) (defun ensure-memo (sym) (or (gethash %memo% sym) (sethash %memo% sym (hash)))) (define-param-expander :memo (param body) (let* ((memo-parm [param 0..(posq : param)]) (hash (gensym)) (key (gensym))) ^(,param (let ((,hash (ensure-memo ',hash)) (,key (list ,*memo-parm))) (or (gethash ,hash ,key) (sethash ,hash ,key (progn ,*body))))))) .brev The above .code :memo macro may be used to define a memoized Fibonacci function as follows: .verb (defun fib (:memo n) (if (< n 2) (clamp 0 1 n) (+ (fib (pred n)) (fib (ppred n))))) .brev All that is required is the insertion of the .code :memo keyword. .coNP Parameter List Macro @ :key .synb .mets (:key << non-key-param * .mets \ \ [ -- >> { sym | >> ( sym >> [ init-form <> [ p-sym ]])}* ] .mets \ \ [ . rest-param ]) .syne .desc Parameter list macro .code :key injects keyword parameter support into functions and macros. When .code :key appears as the first item in a function parameter list, a special syntax is recognized in the parameter list. After any required and optional parameters, the symbol .code -- (two dashes) may appear. Parameters after this symbol are interpreted as keyword parameters. After the keyword parameters, a rest parameter may appear in the usual way as a symbol in the dotted position. Keyword parameters use the same syntax as optional parameters, except that if used in a macro parameter list, they do not support destructuring whereas optional parameters do. That is to say, regardless whether .code :key is used in a function or macro, keyword parameters are symbols. A keyword parameter takes three possible forms: .RS .meIP < sym A keyword parameter may be specified as a simple symbol .metn sym . If the argument for such a keyword parameter is missing, it takes on the value .codn nil . 
.meIP >> ( sym << init-form ) If the keyword parameter symbol .meta sym is enclosed in a list, then the second element of that list specifies a default value, similarly to the default value for an optional argument. If the function is called in such a way that the argument for the parameter is missing, the .meta init-form is evaluated and the resulting value is bound to the keyword parameter. The evaluation takes place in a lexical scope in which the required and optional parameters are already visible, and their values are bound. If there is a .meta rest-param it is also visible in this scope, even though in the parameter list it appears to the left. .meIP >> ( sym < init-form << p-sym ) The three-element form of the keyword parameter specifies an additional symbol .metn p-sym , which names a parameter that implicitly receives a Boolean value indicating the presence of the keyword argument. If an argument is not passed for the keyword parameter .metn sym , then parameter .meta p-sym takes on the value .codn nil . If an argument is given for .metn sym , then .meta p-sym takes on the value .codn t . This mechanism also closely resembles the analogous one supported in optional arguments. See the previous paragraph regarding the evaluation scope of .metn init-form . .RE .IP In a call to a .codn :key -enabled function, keyword arguments begin after those arguments which satisfy all of the required and optional parameters. Keyword arguments consist of interleaved indicators and values, which are separate arguments. Thus passing a keyword argument actually requires the passing of two function arguments: an indicator keyword symbol, followed by the associated value. The indicator keywords are expected to have the same symbol name as the defined keyword parameters. 
For instance, the indicator-value pair .code ":xyz 42" passes the value .code 42 to a keyword parameter that may be named .code xyz in any package: it may be .code usr:xyz or .code mypackage:xyz and so forth. Arguments specifying unrecognized keywords are ignored. If the function has a .metn rest-param , then that parameter receives the keyword arguments as a list. Since that list contains indicators and values, it is a de facto property list. In detail, the .code :key mechanism generates a regular variadic function which receives the keyword arguments as the trailing argument list. That function parses the recognized keyword arguments out of the trailing list, and binds them to the keyword parameter symbols as local variables. If a .meta rest-param parameter is defined, then the entire keyword argument list is available through that parameter, and the keyword argument parsing logic also refers to the value of that parameter to gain access to the keyword arguments. If there is no .meta rest-param specified, then the .code :key macro adds a .meta rest-param using a machine-generated symbol. The argument parsing logic then refers to the value of that symbol. .TP* Example: Define a function .code fun with two required arguments .codn "a b" , one optional argument .codn c , two keyword arguments .code foo and .codn bar , and a rest parameter .codn klist : .verb (defun fun (:key a b : c -- foo bar . 
klist) (list a b c foo bar klist)) (fun 1 2 3 :bar 4) -> (1 2 3 nil 4 (:bar 4)) .brev Define a function with only keyword arguments, with default expressions and Boolean indicator params: .verb (defun keyfun (:key -- (a 10 a-p) (b 20 b-p)) (list a a-p b b-p)) (keyfun :a 3) -> (3 t 20 nil) (keyfun :b 4) -> (10 nil 4 t) (keyfun :c 4) -> (10 nil 20 nil) (keyfun) -> (10 nil 20 nil) .brev .coNP Function @ macroexpand-params .synb .mets (macroexpand-params < proto-form <> [ env ]) .syne .desc The .code macroexpand-params function expands all of the parameter list macros expressed in the .I "prototype form" .metn proto-form , returning an expanded version of the form. The .meta proto-form is a compound form which has a shape very similar to a lambda expression, and may be a lambda expression. The first element of .meta proto-form is a name, which is an arbitrary object, though the use of a symbol is strongly recommended. This object plays no role in .code macroexpand-params other than for composing diagnostic messages if errors occur. The second element of .meta proto-form is the parameter list. The remaining elements of .meta proto-form are zero or more body forms. If .meta proto-form contains no parameter macro invocations, then it is returned. The optional .meta env parameter specifies the macro environment which is passed to the parameter macro expanders, which they can receive via the .code :env parameter. The default value .code nil specifies the top-level environment. .TP* Examples: .verb ;; No expansion: argument is returned (macroexpand-params '(foo (arg) body)) -> (foo (arg) body) ;; Expand :key macro (macroexpand-params '(bar (:key a b c -- d (e 1234 f-p)) body)) --> (bar (a b c . 
#:g0014) (let (d e f-p) (let ((#:g0015 (memp :d #:g0014))) (when #:g0015 (set d (cadr #:g0015)))) (let ((#:g0015 (memp :e #:g0014))) (cond (#:g0015 (set e (cadr #:g0015)) (set f-p t)) (t (set e 1234)))) body)) .brev .SS* Mutation of Syntactic Places .coNP Macro @ set .synb .mets (set >> { place << new-value }*) .syne .desc The .code set operator stores the values of expressions in places. It must be given an even number of arguments. If there are no arguments, then .code set does nothing and returns .codn nil . If there are two arguments, .meta place and .metn new-value , then .meta place is evaluated to determine its storage location, then .meta new-value is evaluated to determine the value to be stored there, and then the value is stored in that location. Finally, the value is also returned as the result value. If there are more than two arguments, then .code set performs multiple assignments in left-to-right order. Effectively, .code "(set v1 e1 v2 e2 ... vn en)" is precisely equivalent to .codn "(progn (set v1 e1) (set v2 e2) ... (set vn en))" . .coNP Macro @ pset .synb .mets (pset >> { place << new-value }*) .syne .desc The syntax of .code pset is similar to that of .codn set , and the semantics is similar also in that zero or more places are assigned zero or more values. In fact, if there are no arguments, or if there is exactly one pair of arguments, .code pset is equivalent to .codn set . If there are two or more argument pairs, then all of the arguments are evaluated first, in left-to-right order. No store takes place until after every .meta place is determined, and every .meta new-value is calculated. During the calculation, the values to be stored are retained in hidden, temporary locations. Finally, these values are moved into the determined places. The rightmost value is returned as the form's value. 
The assignments thus appear to take place in parallel, and .code pset is capable of exchanging the values of a pair of places, or rotating the values among three or more places. (However, there are more convenient operators for this, namely .code rotate and .codn swap ). .TP* Example: .verb ;; exchange x and y (pset x y y x) ;; exchange elements 0 and 1; and 2 and 3 of vector v: (let ((v (vec 0 10 20 30))) (pset [v 0] [v 1] [v 1] [v 0] [v 2] [v 3] [v 3] [v 2]) v) -> #(10 0 30 20) .brev .coNP Macro @ zap .synb .mets (zap < place <> [ new-value ]) .syne .desc The .code zap macro assigns .meta new-value to .meta place and returns the previous value of .metn place . If .meta new-value is missing, then .code nil is used. In more detail, first .meta place is evaluated to determine the storage location. Then, the location is accessed to retrieve the previous value. Then, the .meta new-value expression is evaluated, and that value is placed into the storage location. Finally, the previously retrieved value is returned. .coNP Macro @ flip .synb .mets (flip << place ) .syne .desc The .code flip macro toggles the Boolean value stored in .metn place . If .meta place previously held .codn nil , it is set to .codn t , and if it previously held a value other than .codn nil , it is set to .codn nil . .coNP Macros @ test-set and @ test-clear .synb .mets (test-set << place ) .mets (test-clear << place ) .syne .desc The .code test-set macro examines the value of .metn place . If it is .code nil then it stores .code t into the place, and returns .codn t . Otherwise it leaves .meta place unchanged and returns .codn nil . The .code test-clear macro examines the value of .metn place . If it is Boolean true (any value except .codn nil ) then it stores .code nil into the place, and returns .codn t . Otherwise it leaves .meta place unchanged and returns .codn nil . 
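.TP* Example:

The following is a minimal sketch which relies only on the behavior
described above. The variable name
.code flag
is illustrative and not part of any API.

.verb
  (let ((flag nil))
    (list (test-set flag)    ;; flag is nil: t is stored, t returned
          (test-set flag)    ;; flag is already true: nil returned
          (test-clear flag)  ;; flag is true: nil is stored, t returned
          (test-clear flag)  ;; flag is already nil: nil returned
          flag))             ;; final value of the place
  -> (t nil t nil nil)
.brev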
.coNP Macro @ compare-swap .synb .mets (compare-swap < place < cmp-fun < cmp-val << store-val ) .syne .desc The .code compare-swap macro examines the value of .meta place and compares it to .meta cmp-val using the comparison function given by the function name .metn cmp-fun . This comparison takes place as if by evaluating the expression .mono .meti >> ( cmp-fun < value << cmp-val ) .onom where .meta value denotes the current value of .metn place . If the comparison is false, .meta place is not modified, the .meta store-val expression is not evaluated, and the macro returns .codn nil . If the comparison is true, then .code compare-swap evaluates the .meta store-val expression, stores the resulting value into .meta place and returns .codn t . .coNP Macro @ ensure .synb .mets (ensure < place << init-expr ) .syne .desc The .code ensure macro examines the value of .metn place . If the current value is .codn nil , then .meta init-expr is evaluated. The value is stored in .meta place and becomes the result of the .code ensure form. If the value of .meta place is other than .codn nil , then the form yields that value. In this case, .meta init-expr isn't evaluated, and .meta place isn't modified. The .meta place expression is evaluated only once to determine the place. .coNP Macros @ inc and @ dec .synb .mets (inc < place <> [ delta ]) .mets (dec < place <> [ delta ]) .syne .desc The .code inc macro increments .meta place by adding .meta delta to its value. If .meta delta is missing, the value used in its place is the integer 1. First the .meta place argument is evaluated as a syntactic place to determine the location. Then, the value currently stored in that location is retrieved. Next, the .meta delta expression is evaluated. Its value is added to the previously retrieved value as if by the .code + function. The resulting value is stored in the place, and returned. The macro .code dec works exactly like .code inc except that addition is replaced by subtraction. 
The similarly defaulted .meta delta value is subtracted from the previous value of the place. .coNP Macros @ pinc and @ pdec .synb .mets (pinc < place <> [ delta ]) .mets (pdec < place <> [ delta ]) .syne .desc The macros .code pinc and .code pdec are similar to .code inc and .codn dec . The only difference is that they return the previous value of .meta place rather than the incremented value. .coNP Macros @ test-inc and @ test-dec .synb .mets (test-inc < place >> [ delta <> [ from-val ]]) .mets (test-dec < place >> [ delta <> [ to-val ]]) .syne .desc The .code test-inc and .code test-dec macros provide combined operations which change the value of a place and provide a test whether, respectively, a certain previous value was overwritten, or a certain new value was attained. By default, this tested value is zero. The .code test-inc macro notes the prior value of .meta place and then updates it with that value, plus .metn delta , which defaults to 1. If the prior value is .code eql to .meta from-val then it returns .codn t , otherwise .codn nil . The default value of .meta from-val is zero. The .code test-dec macro produces a new value by subtracting .meta delta from the value of .metn place . The argument .meta delta defaults to 1. The new value is stored into .metn place . If the new value is .code eql to .meta to-val then .code t is returned, otherwise .codn nil . .coNP Macro @ swap .synb .mets (swap < left-place << right-place ) .syne .desc The .code swap macro exchanges the values of .meta left-place and .meta right-place and returns the value which is thereby transferred to .metn right-place . First, .meta left-place and .meta right-place are evaluated, in that order, to determine their locations. Then the prior values are retrieved, exchanged and stored back. The value stored in .meta right-place is also returned. 
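
The following sketch illustrates the return value described above. The
variable names
.code x
and
.code y
are illustrative. The value returned is the one transferred into
.codn y ,
namely the prior value of
.codn x :

.verb
  (let ((x 1) (y 2))
    (list (swap x y) x y))
  -> (1 2 1)
.brev
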
If .meta left-place and .meta right-place are ranges of the same sequence, the behavior is not specified if the ranges overlap or are of unequal length. Note: the .code rotate macro's behavior is somewhat more specified in this regard. Thus, although any correct .code swap expression can be expressed using .codn rotate , the reverse isn't true. .coNP Macro @ push .synb .mets (push < item << place ) .syne .desc The .code push macro places .meta item at the head of the list stored in .meta place and returns the updated list which is stored back in .metn place . First, the expression .meta item is evaluated to produce the push value. Then, .meta place is evaluated to determine its storage location. Next, the storage location is accessed to retrieve the list value which is stored there. A new object is produced as if by invoking the .code cons function on the push value and list value. This object is stored into the location, and returned. .coNP Macro @ pop .synb .mets (pop << place ) .syne .desc The .code pop macro removes an element from the list stored in .meta place and returns it. First, .meta place is evaluated to determine the place. The place is accessed to retrieve the original value. Then a new value is calculated, as if by applying the .code cdr function to the old value. This new value is stored. Finally, a return value is calculated and returned, as if by applying the .code car function to the original value. .coNP Macro @ pushnew .synb .mets (pushnew < item < place >> [ testfun <> [ keyfun ]]) .syne .desc The .code pushnew macro inspects the list stored in .metn place . If the list already contains the item, then it returns the list. Otherwise it creates a new list with the item at the front and stores it back into .metn place , and returns it. First, the expression .meta item is evaluated to produce the push value. Then, .meta place is evaluated to determine its storage location. 
Next, the storage location is accessed to retrieve the list value which is stored there. The list is inspected to check whether it already contains the push value, as if using the .code member function. If that is the case, the list is returned and the operation finishes. Otherwise, a new object is produced as if by invoking the .code cons function on the push value and list value. This object is stored into the location and returned. .coNP Macro @ shift .synb .mets (shift << place + << shift-in-value ) .syne .desc The .code shift macro treats one or more places as a "multi-place shift register". The values of the places are shifted one place to the left. The first (leftmost) place receives the value of the second place, the second receives that of the third, and so on. The last (rightmost) place receives .meta shift-in-value (which is not treated as a place, even if it is a syntactic place form). The previous value of the first place is returned. More precisely, all of the argument forms are evaluated left to right, in the process of which the storage locations of the places are determined, and .meta shift-in-value is reduced to its value. The values stored in the places are sampled and saved. Note that it is not specified whether the places are sampled in a separate pass after the evaluation of the argument forms, or whether the sampling is interleaved into the argument evaluation. This affects the behavior in situations in which the evaluation of any of the .meta place forms, or of .metn shift-in-value , has the side effect of modifying later places. Next, the places are updated by storing the saved value of the second place into the first place, the third place into the second and so forth, and the value of .meta shift-in-value into the last place. Finally, the saved original value of the first place is returned. 
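
The steps above can be sketched with three simple variable places. The
names
.codn a ,
.code b
and
.code c
are illustrative. The value
.code 4
is shifted in from the right, and the prior value of
.code a
is returned:

.verb
  (let ((a 1) (b 2) (c 3))
    (list (shift a b c 4) a b c))
  -> (1 2 3 4)
.brev
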
If any of the places are ranges which index into the same sequence, and the behavior is not otherwise unspecified due to the issue noted in an earlier paragraph, the effect upon the multiply-stored sequence can be inferred from the above-described storage order. Note that even if stores take place which change the length of the sequence and move some elements, not-yet-processed stores whose ranges refer to these elements are not adjusted. With regard to the foregoing paragraph, a recommended practice is that if subranges of the same sequence object are shifted, they be given to the macro in ascending order of starting index. Furthermore, the semantics is simpler if the ranges do not overlap. .coNP Macro @ rotate .synb .mets (rotate << place *) .syne .desc Treats zero or more places as a "multi-place rotate register". If there are no arguments, there is no effect and .code nil is returned. Otherwise, the last (rightmost) place receives the value of the first (leftmost) place. The leftmost place receives the value of the second place, and so on. If there are two arguments, this is equivalent to .codn swap . The prior value of the first place, which is the value rotated into the last place, is returned. More precisely, the .meta place arguments are evaluated left to right, and the storage locations are thereby determined. The storage locations are sampled, and then the sampled values are stored back into the locations, but rotated by one place as described above. The saved original value of the leftmost .meta place is returned. It is not specified whether the sampling of the original values is a separate pass which takes place after the arguments are evaluated, or whether this sampling is interleaved into argument evaluation. This affects the behavior in situations in which the evaluation of any of the .meta place forms has the side effect of modifying the value stored in a later .meta place form. 
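
A minimal sketch of the rotation over three simple variable places follows.
The names
.codn a ,
.code b
and
.code c
are illustrative. The prior value of
.code a
is rotated into
.code c
and is also the returned value:

.verb
  (let ((a 1) (b 2) (c 3))
    (list (rotate a b c) a b c))
  -> (1 2 3 1)
.brev
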
If any of the places are ranges which index into the same sequence, and the behavior is not otherwise unspecified due to the issue noted in the preceding paragraph, the effect upon the multiply-stored sequence can be inferred from the above-described storage order. Note that even if stores take place which change the length of the sequence and move some elements, not-yet-processed stores whose ranges refer to these elements are not adjusted. With regard to the foregoing paragraph, a recommended practice is that if subranges of the same sequence object are rotated, they be given to the macro in ascending order of starting index. Furthermore, the semantics is simpler if the ranges do not overlap. .coNP Macro @ del .synb .mets (del << place ) .syne .desc The .code del macro requests the deletion of .metn place . If .meta place doesn't support deletion, an exception is thrown. First .meta place is evaluated, thereby determining its location. Then the place is accessed to retrieve its value. The place is then subject to deletion. Finally, the previously retrieved value is returned. Precisely what deletion means depends on the kind of place. The built-in places in \*(TL have deletion semantics which are intended to be unsurprising to the programmer familiar with the data structure which holds the place. Generally, if a place denotes the element of a sequence, then deletion of the place implies deletion of the element, and deletion of the element implies that the gap produced by the element is closed. The deleted element is effectively replaced by its successor, that successor by its successor and so on. If a place denotes a value stored in a dynamic data set such as a hash table, then deletion of that place implies deletion of the entry which holds that value. If the entry is identified by a key, that key is also removed. 
If .code place is a DWIM bracket expression indexing into a structure, the structure is expected to implement the .code lambda and .code lambda-set methods. Moreover, the place form must have only two arguments: the object and an index argument. In other words, the .code del form must have this syntax: .mono .mets (del >> [ obj << index ]) .onom The .code lambda method will be invoked with the unmodified .meta obj and .meta index arguments to determine the prior value to be returned. Then the .code lambda-set method will be invoked with three arguments: .metn obj , a possibly modified .meta index value and the argument .code nil representing an empty replacement sequence. If .meta index is a sequence or range, it is passed to the .code lambda-set method unmodified. Otherwise it is expected to be an integer, and converted into a one-element range spanning the indicated element. For instance, if the .meta index value is .codn 3 , it is converted to the range .codn "#R(3 4)" . In effect, the .code lambda-set method is thereby asked to replace the one-element subsequence starting at index .code 3 with the empty sequence .codn nil . .coNP Macro @ lset .synb .mets (lset <> { place }+ << sequence-expr ) .syne .desc The .code lset operator's parameter list consists of one or more places followed by an expression .metn sequence-expr . The macro evaluates .codn sequence-expr , which is expected to produce a sequence. Successive elements of the resulting list are then assigned to each successive .codn place . If there are fewer elements in the sequence than places, the unmatched places receive the value .codn nil . Excess elements in the sequence are ignored. An error exception occurs if the sequence is an improper list with fewer elements than places. A .code lset form produces the value of .meta sequence-expr as its result value. 
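.TP* Example:

The following sketch, based only on the description above, shows the
handling of a sequence which is shorter or longer than the number of
places. The variable names are illustrative.

.verb
  ;; fewer elements than places: c receives nil
  (let (a b c)
    (lset a b c '(1 2))
    (list a b c))
  -> (1 2 nil)

  ;; excess element 3 is ignored
  (let (a b)
    (lset a b '(1 2 3))
    (list a b))
  -> (1 2)
.brev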
.coNP Macro @ upd .synb .mets (upd < place << opip-arg *) .syne .desc The .code upd macro evaluates .meta place and passes the value as an argument to the operational pipeline function formed, as if by the .code opip macro, from the .meta opip-arg arguments. The result of this function is then stored back into .metn place . The following equivalence holds, except that place .code p is evaluated only once: .verb (upd p x y z ...) <--> (set p (call (opip x y z ...) p)) .brev .SS* User-Defined Places and Place Operators \*(TL provides a number of place-modifying operators such as .codn set , .codn push , and .codn inc . It also provides a variety of kinds of syntactic places which may be used with these operators. Both of these categories are open-ended: \*(TL programs may extend the set of place-modifying operators, as well as the vocabulary of forms which are recognized as syntactic places. Regarding place operators, it might seem obvious that new place operators can be developed, since they are macros, and macros can expand to uses of existing place operators. As an example, it may seem that the .code inc operator could be written as a macro which uses .codn set : .verb (defmacro new-inc (place : (delta 1)) ^(set ,place (+ ,place ,delta))) .brev However, the above .code new-inc macro has a problem: the .code place argument form is inserted into two places in the expansion, which leads to two evaluations. This is visibly incorrect if the place form contains any side effects. It is also potentially inefficient. \*(TL provides a framework for writing place update macros which evaluate their argument forms once, even if they have to access and update the same places. The framework also supports the development of new kinds of place forms as capsules of code which introduce the right kind of material into the lexical environment of the body of an update macro, to enable this special evaluation. 
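
The double evaluation in
.code new-inc
can be observed with a place form whose index expression has a side
effect. The following is a sketch which repeats the
.code new-inc
definition given above; the variables
.code v
and
.code i
are illustrative. Because
.code "(inc i)"
occurs twice in the expansion of
.codn new-inc ,
the variable
.code i
is incremented twice, whereas the built-in
.code inc
evaluates the place form only once:

.verb
  (defmacro new-inc (place : (delta 1))
    ^(set ,place (+ ,place ,delta)))

  ;; naive macro: (inc i) is evaluated twice
  (let ((v (vec 10 20 30))
        (i -1))
    (new-inc [v (inc i)])
    i)
  -> 1

  ;; built-in inc: (inc i) is evaluated once
  (let ((v (vec 10 20 30))
        (i -1))
    (inc [v (inc i)])
    (list v i))
  -> (#(11 20 30) 0)
.brev
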
.NP* Place-Expander Functions The central design concept in \*(TL syntactic places is that of .IR "place-expander functions" . Each compound place is defined by up to three place-expander functions, which are associated with the place via the leftmost operator symbol of the place form. One place-expander, the .IR "update expander" , is mandatory. Optionally, a place may also provide a .I "clobber expander" as well as a .IR "delete expander" . An update expander provides the expertise for evaluating a place form once in its proper run-time context to determine its actual run-time storage location, and to access and modify the storage location. A clobber expander provides an optimized mechanism for uses that perform a one-time store to a place without requiring its prior value. If a place definition does not supply a clobber expander, then the syntactic places framework uses the update expander to achieve the functionality. A delete expander provides the expertise for determining the actual run-time storage location corresponding to a place, and obliterating it, returning its prior value. If a place does not supply a delete expander, then the place does not support deletion. Operators which require deletion, such as .codn del , will raise an error when applied to that place. The expanders operate independently, and it is expected that place-modifying operators choose one of the three, and use only that expander. For example, accessing a place with an update expander and then overwriting its value with a clobber expander may result in incorrect code which contains multiple evaluations of the place form. The programmer who implements a new place does not write expanders directly, but rather defines them via the .codn defplace , .code define-accessor or .code defset macro. The programmer who implements a new place update macro likewise does not call the expanders directly.
Usually, they are invoked via the macros .codn with-update-expander , .code with-clobber-expander and .codn with-delete-expander . These are sufficient for most kinds of macros. In certain complicated cases, expanders may be invoked using the wrapper functions .codn call-update-expander , .code call-clobber-expander and .codn call-delete-expander . These convenience macros and functions perform certain common chores, like macro-expanding the place in the correct environment, and choosing the appropriate function. The expanders are described in the following sections. .NP* The Update Expander .synb .mets (lambda >> ( getter-sym < setter-sym < place-form .mets \ \ \ \ \ \ \ \ << body-form ) ...) .syne .desc The update expander is a code-writer. It takes a .meta body-form argument, representing code, and returns a larger form which surrounds this code with additional code. This larger form returned by the update expander can be regarded as having two abstract actions, when it is substituted and evaluated in the context where .meta place-form occurs. The first abstract action is to evaluate .meta place-form exactly one time, in order to determine the actual run-time location to which that form refers. The second abstract action is to evaluate the caller's .metn body-form s, in a lexical environment in which bindings exist for some lexical functions or (more usually) lexical macros. These lexical macros are explicitly referenced by the .metn body-form ; the update expander just provides their definition, under the names it is given via the .meta getter-sym and .meta setter-sym arguments. The update expander writes local functions or macros under these names: a getter function and a setter function. Usually, update expanders write macros rather than functions, possibly in combination with some lexical anonymous variables which hold temporary objects. Therefore the getter and setter are henceforth referred to as macros.
The code being generated is with regard to some concrete instance of .metn place-form . This argument is the actual form which occurs in a program. For instance, the update expander for the .code car place might be called with an arbitrary variant of the .meta place-form which might look like .codn "(car (inc (third some-list)))" . In the abstract semantics, upfront code wrapped around the .meta body-form by the update expander provides the logic to evaluate this place to a location, which is retained in some hidden local context. The getter local macro named by .meta getter-sym must provide the logic for retrieving the value of this place. The getter macro takes no arguments. The .meta body-form makes free use of the getter function; they may call it multiple times, which must not trigger multiple evaluations of the original place form. The setter local macro named by .meta setter-sym must generate the logic for storing a new value into the once-evaluated version of .metn place-form . The setter function takes exactly one argument, whose value specifies the value to be stored into the place. It is the caller's responsibility to ensure that the argument form which produces the value to be stored via the setter is evaluated only once, and in the correct order. The setter does not concern itself with this form. Multiple calls to the setter can be expected to result in multiple evaluations of its argument. Thus, if necessary, the caller must supply the code to evaluate the new value form to a temporary variable, and then pass the temporary variable to the setter. This code can be embedded in the .meta body-form or can be added to the code returned by a call to the update expander. The setter local macro or function must return the new value which is stored. That is to say, when .meta body-form invokes this local macro or function, it may rely on it yielding the new value which was stored, as part of achieving its own semantics. 
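
The division of responsibility described above can be sketched with a
hypothetical place update macro named
.codn maxf ,
which stores a new value into a place only if it exceeds the current value.
The macro name is an assumption made for illustration; the sketch relies on the
.code with-update-expander
and
.code with-gensyms
macros. Because the template mentions the new value in two positions, the
caller evaluates it once into a temporary variable and passes that variable
to the setter.
.verb
  (defmacro maxf (place new-value :env env)
    (with-gensyms (tmp)
      (with-update-expander (getter setter) place env
        ;; The new value form is evaluated exactly once,
        ;; into a temporary established by this macro.
        ^(let ((,tmp ,new-value))
           (if (> ,tmp (,getter))
             (,setter ,tmp)     ;; yields the stored value
             (,getter))))))     ;; repeated access is permitted
.brev
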
The update expander does not macro-expand .codn place-form . It is assumed that the expander is invoked in such a way that the place has been expanded in the correct environment. In other words, the form matches the type of place which the expander handles. If the expander had to macro-expand the place form, it would sometimes have to come to the conclusion that the place form must be handled by a different expander. No such consideration is the case: when an expander is called on a form, that is final; it is certain that it is the correct expander, which matches the symbol in the .code car position of the form, which is not a macro in the context where it occurs. An update expander is free to assume that any place which is stored (the setter local macro is invoked on it) is accessed at least once by an invocation of the getter. A place update macro which relies on an update expander, but uses only the store macro, might not work properly. An example of an update expander which relies on this assumption is the expander for the .mono .meti (force << promise ) .onom place type. If .meta promise has not yet been forced, and only the setter is used, then .meta promise might remain unforced as its internal value location is updated. A subsequent access to the place will incorrectly trigger a force, which will overwrite the value. The expected behavior is that storing a value in an unforced .code force place changes the place to forced state, preempting the evaluation of the delayed form. Afterward, the promise exhibits the value which was thus assigned. The update expander is not responsible for all issues of evaluation order. A place update macro may consist of numerous places, as well as numerous value-producing forms which are not places. Each of the places can provide its registered update expander which provides code for evaluating just that place, and a means of accessing and storing the values. 
The place update macro must call the place expanders in the correct order, and generate any additional code in the correct order, so that the macro achieves its required documented evaluation order. .TP* "Example Update Expander Call:" .verb ;; First, capture the update expander ;; function for (car ...) places ;; in a variable, for clarity. (defvar car-update-expander [*place-update-expander* 'car]) ;; Next, call it for the place (car [a 0]). ;; The body form specifies logic for ;; incrementing the place by one and ;; returning the new value. (call car-update-expander 'getit 'setit '(car [a 0]) '(setit (+ (getit) 1))) ;; --> Resulting code: (rlet ((#:g0032 [a 0])) (macrolet ((getit nil (append (list 'car) (list '#:g0032))) (setit (val) (append (list 'sys:rplaca) (list '#:g0032) (list val)))) (setit (+ (getit) 1)))) ;; Same expander call as above, with a call to expand added ;; to show the fully expanded version of the returned code, ;; in which the setit and getit calls have disappeared, ;; replaced by their macro-expansions. (expand (call car-update-expander 'getit 'setit '(car [a 0]) '(setit (+ (getit) 1)))) ;; --> Resulting code: (let ((#:g0032 [a 0])) (sys:rplaca #:g0032 (+ (car #:g0032) 1))) .brev The main noteworthy points about the generated code are: .RS .IP - the .code "(car [a 0])" place is evaluated by evaluating the embedded form .code "[a 0]" and storing the resulting object into a hidden local variable. That's as close a reference as we can make to the .code car field. .IP - the getter macro expands to code which simply calls the .code car function on the cell. .IP - the setter uses a system function called .codn sys:rplaca , which differs from .code rplaca in that it returns the stored value, rather than the cell. .RE .NP* The Clobber Expander .synb .mets (lambda >> ( simple-setter-sym < place-form .mets \ \ \ \ \ \ \ \ << body-form ) ...) .syne .desc The clobber expander is a code-writer similar to the update expander.
It takes a .meta body-form argument, and returns a larger form which surrounds this form with additional program code. The returned block of code has one main abstract action. It must arrange for the evaluation of .meta body-form in a lexical environment in which a lexical macro or lexical function exists which has the name requested by the .meta simple-setter-sym argument. The simple setter local macro written by the clobber expander is similar to the local setter written by the update expander. It has exactly the same interface, performs the same action of storing a value into the place, and returns the new value. The difference is that its logic may be considerably simplified by the assumption that the place is being subject to exactly one store, and no access. A place update macro which uses a clobber expander, and calls it more than once, breaks the assumption; doing so may result in multiple evaluations of the .metn place-form . .NP* The Delete Expander .synb .mets (lambda >> ( deleter-sym < place-form .mets \ \ \ \ \ \ \ \ << body-form ) ...) .syne .desc The delete expander is a code-writer similar to the clobber expander. It takes a .meta body-form argument, and returns a larger form which surrounds this form with additional program code. The returned block of code has one main abstract action. It must arrange for the evaluation of .meta body-form in a lexical environment in which a lexical macro or lexical function exists which has the name requested by the .meta deleter-sym argument. The deleter macro written by the delete expander takes no arguments. It may be called at most once. It returns the previous value of the place, and arranges for its obliteration, whatever that means for that particular kind of place.
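.TP* "Example Clobber and Delete Expander Calls:"
The following sketch mirrors the update expander call shown earlier. It rests
on two assumptions which are not established by the preceding text: that the
special variables
.code *place-clobber-expander*
and
.code *place-delete-expander*
are indexed by place symbol in the same way as
.codn *place-update-expander* ,
and that the
.code car
place registers both kinds of expander. The names
.code putit
and
.code zap
are hypothetical. The generated code is not shown, since its exact shape
depends on the place definition.
.verb
  ;; Assumption: these variables parallel *place-update-expander*.
  (defvar car-clobber-expander [*place-clobber-expander* 'car])
  (defvar car-delete-expander [*place-delete-expander* 'car])

  ;; Clobber expander arguments:
  ;; simple-setter-sym, place-form, body-form.
  (call car-clobber-expander 'putit '(car [a 0]) '(putit 42))

  ;; Delete expander arguments:
  ;; deleter-sym, place-form, body-form.
  (call car-delete-expander 'zap '(car [a 0]) '(zap))
.brev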
.coNP Macro @ with-update-expander .synb .mets (with-update-expander >> ( getter << setter ) < place < env .mets \ << body-form ) .syne .desc The .code with-update-expander macro evaluates the .meta body-form argument, whose result is expected to be a Lisp form. The macro adds additional code around this code, and the result is returned. This additional code is called the .IR "place-access code" . The .meta getter and .meta setter arguments must be symbols. Over the evaluation of the .metn body-form , these symbols are bound to the names of local functions which are provided in the place-access code. The .meta place argument is a form which evaluates to a syntactic place. The generated place-access code is based on this place. The .meta env argument is a form which evaluates to a macro-expansion-time environment. The .code with-update-expander macro uses this environment to perform macro-expansion on the value of the .meta place form, to obtain the correct update expander function for the fully macro-expanded place. The place-access code is generated by calling the update expander for the expanded version of .codn place . .TP* "Example:" The following is an implementation of the .code swap macro, which exchanges the contents of two places. Two places are involved, and, correspondingly, the .code with-update-expander macro is used twice, to add two instances of place-update code to the macro's body. .verb (defmacro swap (place-0 place-1 :env env) (with-gensyms (tmp) (with-update-expander (getter-0 setter-0) place-0 env (with-update-expander (getter-1 setter-1) place-1 env ^(let ((,tmp (,getter-0))) (,setter-0 (,getter-1)) (,setter-1 ,tmp)))))) .brev The basic logic for swapping two places is contained in the code template: .verb ^(let ((,tmp (,getter-0))) (,setter-0 (,getter-1)) (,setter-1 ,tmp)) .brev The temporary variable named by the .code gensym symbol .code tmp is initialized by calling the getter function for .metn place-0 . 
Then the setter function of .meta place-0 is called in order to store the value of .meta place-1 into .metn place-0 . Finally, the setter for .meta place-1 is invoked to store the previously saved temporary value into that place. The name for the temporary variable is provided by the .code with-gensyms macro, but establishing the variable is the caller's responsibility; this is seen as an explicit .code let binding in the code template. The names of the getter and setter functions are similarly provided by the .code with-update-expander macros. However, binding those functions is the responsibility of that macro. To achieve this, it adds the place-access code to the code generated by the .code "^(let ...)" backquote template. In the following example macro-expansion, the additional code added around the template is seen. It takes the form of two .code macrolet binding blocks, each added by an invocation of .codn with-update-expander : .verb (macroexpand '(swap a b)) --> (macrolet ((#:g0036 () 'a) ;; getter macro for a (#:g0037 (val-expr) ;; setter macro for a (append (list 'sys:setq) (list 'a) (list val-expr)))) (macrolet ((#:g0038 () 'b) ;; getter macro for b (#:g0039 (val-expr) ;; setter macro for b (append (list 'sys:setq) (list 'b) (list val-expr)))) (let ((#:g0035 (#:g0036))) ;; temp <- a (#:g0037 (#:g0038)) ;; a <- b (#:g0039 #:g0035)))) ;; b <- temp .brev In this expansion, for example .code #:g0036 is the generated symbol which forms the value of the .code getter-0 variable in the .code swap macro. The getter is a macro which simply expands to a .codn a : straightforward access to the variable a. The .code #:g0035 symbol is the value of the .code tmp variable. Thus the swap macro's .mono ^(let ((,tmp (,getter-0))) ...) .onom has turned into .mono ^(let ((#:g0035 (#:g0036))) ...) 
.onom A full expansion, with the .code macrolet local macros expanded out: .verb (expand '(swap a b)) --> (let ((#:g0035 a)) (sys:setq a b) (sys:setq b #:g0035)) .brev In other words, the original syntax .mono (,getter-0) .onom became .mono (#:g0036) .onom and finally just .codn a . Similarly, .mono (,setter-0 (,getter-1)) .onom became the .code macrolet invocations .mono (#:g0037 (#:g0038)) .onom which finally turned into: .codn "(sys:setq a b)" . .coNP Macro @ with-clobber-expander .synb .mets (with-clobber-expander <> ( simple-setter ) < place < env .mets \ << body-form ) .syne .desc The .code with-clobber-expander macro evaluates .metn body-form , whose result is expected to be a Lisp form. The macro adds additional code around this form, and the result is returned. This additional code is called the .IR "place-access code" . The .meta simple-setter argument must be a symbol. Over the evaluation of the .metn body-form , this symbol is bound to the name of a function which is provided in the place-access code. The .meta place argument is a form which evaluates to a syntactic place. The generated place-access code is based on this place. The .meta env argument is a form which evaluates to a macro-expansion-time environment. The .code with-clobber-expander macro uses this environment to perform macro-expansion on the value of the .meta place form, to obtain the correct clobber expander function for the fully macro-expanded place. The place-access code is generated by calling the clobber expander for the expanded version of .codn place .
.TP* "Example:" The following implements a simple assignment statement, similar to .code set except that it only handles exactly two arguments: .verb (defmacro assign (place new-value :env env) (with-clobber-expander (setter) place env ^(,setter ,new-value))) .brev Note that the correct evaluation order of .code place and .code new-value is taken care of, because .code with-clobber-expander generates the code which performs all the necessary evaluations of .codn place . This evaluation occurs before the code which is generated by the .mono ^(,setter ,new-value) .onom part is evaluated, and that code is what evaluates .codn new-value . Suppose that a macro were desired which allows assignment to be notated in a right-to-left style, as in: .verb (assign 42 a) ;; store 42 in variable a .brev Now, the new value must be evaluated prior to the place, if left-to-right evaluation order is to be maintained. The standard .code push macro has this property: the push value is on the left, and the place is on the right. Now, the code has to explicitly take care of the order, like this: .verb ;; WRONG! We can't just swap the parameters; ;; place is still evaluated first, then new-value: (defmacro assign (new-value place :env env) (with-clobber-expander (setter) place env ^(,setter ,new-value))) ;; Correct: arrange for evaluation of new-value first, ;; then place: (defmacro assign (new-value place :env env) (with-gensyms (tmp) ^(let ((,tmp ,new-value)) ,(with-clobber-expander (setter) place env ^(,setter ,tmp))))) .brev .coNP Macro @ with-delete-expander .synb .mets (with-delete-expander <> ( deleter ) < place < env .mets \ << body-form ) .syne .desc The .code with-delete-expander macro evaluates .metn body-form , whose result is expected to be a Lisp form. The macro adds additional code around this code, and the resulting code is returned. This additional code is called the .IR "place-access code" . The .meta deleter argument must be a symbol.
Over the evaluation of the .metn body-form , this symbol is bound to the name of a function which is provided in the place-access code. The .meta place argument is a form which evaluates to a syntactic place. The generated place-access code is based on this place. The .meta env argument is a form which evaluates to a macro-expansion-time environment. The .code with-delete-expander macro uses this environment to perform macro-expansion on the value of the .meta place form, to obtain the correct delete expander function for the fully macro-expanded place. The place-access code is generated by calling the delete expander for the expanded version of .codn place . .TP* "Example:" The following implements the .code del macro: .verb (defmacro del (place :env env) (with-delete-expander (deleter) place env ^(,deleter))) .brev .coNP Function @ call-update-expander .synb .mets (call-update-expander < getter < setter < place < env .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ << body-form ) .syne .desc The .code call-update-expander function provides an alternative interface for making use of an update expander, complementary to .codn with-update-expander . Arguments .meta getter and .meta setter are symbols, provided by the caller. These are passed to the update expander function, and are used for naming local functions in the generated code which the update expander adds to .metn body-form . The .meta place argument is a place which has not been subject to macro-expansion. The .code call-update-expander function takes on the responsibility for macro-expanding the place. The .meta env parameter is the macro-expansion environment object required to correctly expand .code place in its original environment. The .meta body-form argument represents the source code of a place update operation. This code makes references to the local functions whose names are given by .meta getter and .metn setter .
Those arguments allow the update expander to write these functions with the matching names expected by .metn body-form . The return value is an object representing source code which incorporates the .metn body-form , augmenting it with additional code which evaluates .code place to determine its location, and provides place accessor local functions expected by the .metn body-form . .TP* "Example:" The following shows how to implement a .code with-update-expander macro using .codn call-update-expander : .verb (defmacro with-update-expander ((getter setter) unex-place env body) ^(with-gensyms (,getter ,setter) (call-update-expander ,getter ,setter ,unex-place ,env ,body))) .brev Essentially, all that .code with-update-expander does is to choose the names for the local functions, and bind them to the local variable names it is given as arguments. Then it calls .codn call-update-expander . .TP* "Example:" Implement the swap macro using .codn call-update-expander : .verb (defmacro swap (place-0 place-1 :env env) (with-gensyms (tmp getter-0 setter-0 getter-1 setter-1) (call-update-expander getter-0 setter-0 place-0 env (call-update-expander getter-1 setter-1 place-1 env ^(let ((,tmp (,getter-0))) (,setter-0 (,getter-1)) (,setter-1 ,tmp)))))) .brev .coNP Function @ call-clobber-expander .synb .mets (call-clobber-expander < simple-setter < place < env .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ << body-form ) .syne .desc The .code call-clobber-expander function provides an alternative interface for making use of a clobber expander, complementary to .codn with-clobber-expander . Argument .meta simple-setter is a symbol, provided by the caller. It is passed to the clobber expander function, and is used for naming a local function in the generated code which the clobber expander adds to .metn body-form . The .meta place argument is a place which has not been subject to macro-expansion.
The .code call-clobber-expander function takes on the responsibility for macro-expanding the place. The .meta env parameter is the macro-expansion environment object required to correctly expand .code place in its original environment. The .meta body-form argument represents the source code of a place update operation. This code makes references to the local function whose name is given by .metn simple-setter . That argument allows the clobber expander to write this function with the matching name expected by .metn body-form . The return value is an object representing source code which incorporates the .metn body-form , augmenting it with additional code which evaluates .code place to determine its location, and provides the clobber local function to the .metn body-form . .coNP Function @ call-delete-expander .synb .mets (call-delete-expander < deleter < place < env << body-form ) .syne .desc The .code call-delete-expander function provides an alternative interface for making use of a delete expander, complementary to .codn with-delete-expander . Argument .meta deleter is a symbol, provided by the caller. It is passed to the delete expander function, and is used for naming a local function in the generated code which the delete expander adds to .metn body-form . The .meta place argument is a place which has not been subject to macro-expansion. The .code call-delete-expander function takes on the responsibility for macro-expanding the place. The .meta env parameter is the macro-expansion environment object required to correctly expand .code place in its original environment. The .meta body-form argument represents the source code of a place delete operation. This code makes references to the local function whose name is given by .metn deleter . That argument allows the delete expander to write this function with the matching name expected by .metn body-form .
The return value is an object representing source code which incorporates the .metn body-form , augmenting it with additional code which evaluates .code place to determine its location, and provides the delete local function to the .metn body-form . .coNP Macro @ define-modify-macro .synb .mets (define-modify-macro < name < parameter-list << function-name ) .syne .desc The .code define-modify-macro macro provides a simplified way to write certain kinds of place update macros. Specifically, it provides a way to write place update macros which modify a place by retrieving the previous value, passing it through a function (perhaps together with some additional arguments), and then storing the resulting value back into the place and returning it. The .meta name parameter specifies the name for the place update macro to be written. The .meta function-name parameter must specify a symbol: the name of the update function. The update macro and update function both take at least one parameter: the place to be updated, and its value, respectively. The .meta parameter-list specifies the additional parameters for the update function, which will also become additional parameters of the macro. Because it is a function parameter list, it cannot use the special destructuring features of macro parameter lists, or the .code :env or .code :whole special parameters. It can use optional parameters, and may be empty. The .code define-modify-macro macro writes a macro called .metn name . The leftmost parameter of this macro is a place, followed by the additional arguments specified by .metn parameter-list . The macro will arrange for the evaluation of the place argument to determine the place location. It will then retrieve and save the prior value of the place, and evaluate the remaining arguments.
The prior value of the place, and the values of the additional arguments, are all passed to the function named by .meta function-name and the resulting value is then stored back into the location previously determined for .metn place . .TP* "Example:" Some standard place update macros are implementable using .codn define-modify-macro , such as .codn inc . The .code inc macro reads the old value of the place, then passes it through the .code + (plus) function, along with an extra argument: the delta value, which defaults to one. The .code inc macro could be written using .code define-modify-macro as follows: .verb (define-modify-macro inc (: (delta 1)) +) .brev Note that the argument list .code "(: (delta 1))" doesn't specify the place, because the place is the implicit leftmost argument of the macro which isn't given a name. With the above definition in place, when .code "(inc (car a))" is invoked, then .code "(car a)" is first reduced to a location, and that location's value is retrieved and saved. Then the .code delta parameter is evaluated to its value, which has defaulted to 1, since the argument was omitted. Then these two values are passed to the .code + function, and so 1 is added to the value previously retrieved from .codn "(car a)" . The resulting sum is then stored back into .code "(car a)" without evaluating .code "(car a)" again. .coNP Macro @ defplace .synb .mets (defplace < place-destructuring-args < body-sym .mets \ \ >> ( getter-sym < setter-sym << update-body ) .mets \ \ >> [( ssetter-sym << clobber-body ) .mets \ \ \ >> [( deleter-sym << delete-body )]]) .syne .desc The .code defplace macro is used to introduce a new kind of syntactic place. It writes the update expander, and optionally clobber and delete expander functions, from a simpler, more compact specification, and automatically registers the resulting functions. The compact specification of a .code defplace call contains only code fragments for the expander functions.
The name and syntax of the place is determined by the .meta place-destructuring-args argument, which is a macro-style parameter list whose structure mimics that of the place. In particular, its leftmost symbol gives the name under which the place is registered. The .code defplace macro provides automatic destructuring of the syntactic place, so that the expander code fragments can refer to the components of a place by name. The .meta body-sym parameter must be a symbol. This symbol will capture the .meta body-form argument which is passed to the update expander, clobber expander or delete expander. The code fragments then have access to the body forms via this name. The .metn getter-sym , .metn setter-sym , and .meta update-body parenthesized triplet specify the update expander fragment. The .code defplace macro will bind .meta getter-sym and .meta setter-sym to symbols. The .meta update-body must then specify a template of code which evaluates the syntactic place to determine its storage location, and provides a pair of local functions, using these two symbols as their name. The template must also insert the .meta body-sym forms into the scope of these local functions, and the place determining code. The .meta ssetter-sym and .meta clobber-body arguments similarly specify an optional clobber expander fragment, as a single optional argument. If specified, the .meta clobber-body must generate a local function named using .meta ssetter-sym wrapped around .meta body-sym forms. The .meta deleter-sym and .meta delete-body likewise specify a delete expander fragment. If this is omitted, then the place shall not support deletion.
.TP* "Example:" Implementation of the place denoting the .code car field of .code cons cells: .verb (defplace (car cell) body ;; the update expander fragment (getter setter (with-gensyms (cell-sym) ;; temporary symbol for cell ^(let ((,cell-sym ,cell)) ;; evaluate place to cell ;; getter and setter access cell via temp var (macrolet ((,getter () ^(car ,',cell-sym)) (,setter (val) ^(sys:rplaca ,',cell-sym ,val))) ;; insert body form from place update macro ,body)))) ;; clobber expander fragment: simpler: no need ;; to evaluate cell to temporary variable. (ssetter ^(macrolet ((,ssetter (val) ^(sys:rplaca ,',cell ,val))) ,body)) ;; deleter: delegate to pop semantics: ;; (del (car a)) == (pop a). (deleter ^(macrolet ((,deleter () ^(pop ,',cell))) ,body))) .brev .coNP Macro @ defset .synb .mets (defset < name < params < new-val-sym << set-form ) .mets (defset < get-fun-sym << set-fun-sym ) .syne .desc The .code defset macro provides a mechanism for introducing a new kind of syntactic place. It is simpler to use than .code defplace and more concise, but not as general. The .code defset macro is designed for situations in which a function or macro which evaluates all of its arguments is required to serve as a syntactic place. It provides two flavors of syntax: the long form, indicated by giving .code defset four arguments, and a short form, which uses two arguments. In the long form of .codn defset , the syntactic place is described by .meta name and .metn params . The .code defset form expresses the request that a call to the function or operator named .meta name be treated as a syntactic place, which has arguments described by the parameter list .metn params . The .meta set-form argument specifies an expression which generates the code for storing a new value to the place.
The .code defset macro makes the necessary arrangements such that when an operator form named by .meta name is treated as a syntactic place, then at macro-expansion time, code is generated to evaluate all of its argument expressions into machine-generated variables. The names of those variables are automatically bound to the corresponding symbols given in the .meta params argument list of the .code defset syntax. Code is also generated to evaluate the expression which gives the new value to be stored, and that is bound to a generated variable whose name is bound to the .code new-val-sym symbol. Then arrangements are made to invoke the operator named by .meta name and to evaluate the .code set-form in an environment in which these symbol bindings are visible. The operator named .meta name is invoked using an altered argument list which uses temporary symbols in place of the original expressions. The task of .code set-form is to insert the values of the symbols from .meta params and .meta new-val-sym into a suitable code template that will perform the store actions. The code generated by .code set-form must also take on the responsibility of yielding the new value as its result. If the .meta params list contains optional parameters, the default value expressions of those parameters shall be evaluated in the scope of the .code defset definition. The .meta params list may specify a rest parameter. In the expansion, this parameter will capture a list of temporary symbols, corresponding to the list of variadic argument expressions. For instance if the .code defset parameter list for a place .code g is .codn "(a b . c)" , featuring the rest parameter .codn c , and its .meta set-form is .code "^(s ,a ,b ,*c)" and the place is invoked as .code "(g (i) (j) (k) (l))" then parameter .code c will be bound to a list of gensyms such as .code "(#:g0123 #:g0124)" so that the evaluation of .meta set-form will yield syntax resembling .codn "(s #:g0121 #:g0122 #:g0123 #:g0124)" .
Here, gensyms .code #:g0123 and .code #:g0124 are understood to be bound to the values of the expressions .code (k) and .codn (l) , the two trailing parameters corresponding to the rest parameter .codn c . Syntactic places defined by .code defset that have a rest parameter may be invoked with improper syntax such as .codn "(set (g x y . z) v)" . In this situation, that rest parameter will be bound to the name of a temporary variable which holds the value of .code z rather than to a list of temporary variable names holding the values of trailing expressions. The .code set-form must be prepared for this situation. In particular, if the rest parameter's value is an atom, then it cannot be spliced in the backquote syntax, except at the last position of a list. Although syntactic places defined by .code defset perform macro-parameter-like destructuring of the place form, binding unevaluated argument expressions to the parameter symbols, nested macro parameter lists are not supported: .meta params specifies a function parameter list. The parameter list may use parameter macros, keeping in mind that the parameter expansion is applied at the time the .code defset form is processed, specifying an expanded parameter list which receives unevaluated expressions. The .meta set-form may refer to all symbols produced by parameter list expansion, other than generated symbols. For instance, if a parameter list macro .code :addx exists which adds the parameter symbol .code x to the parameter list, and this .code :addx is invoked in the .meta params list of a .codn defset , then .code x will be visible to the .metn set-form . The short, two-argument form of .code defset simply specifies the names of two functions or operators: .code get-fun-sym names the operator which accesses the place, and .code set-fun-sym names the operator which stores a new value into the place.
It is expected that all arguments of these operators are evaluated expressions, and that the store operator takes one argument more than the access operator. The operators are otherwise assumed to be variadic: each instance of a place based on .code get-fun-sym individually determines how many arguments are passed to that operator and to the one named by .codn set-fun-sym . The definition .code "(defset g s)" means that .code "(inc (g x y))" will generate code which ensures that .code x and .code y are evaluated exactly once, and then those two values are passed as arguments to .code g which returns the current value of the place. That value is then incremented by one, and stored into the place by calling the .code s function/operator with three arguments: the two values that were passed to .code g and the new value. The exact number of arguments is determined by each individual use of .code g as a place; the .code defset form doesn't specify the arity of .code g and .codn s , only that .code s must accept one more argument relative to .codn g . The following equivalence holds between the short and long forms: .verb (defset g s) <--> (defset g (. r) n ^(s ,*r ,n)) .brev Note: the short form of .code defset is similar to the .code define-accessor macro. .TP* "Example:" Implementation of .code car as a syntactic place using a long form .codn defset : .verb (defset car (cell) new (let ((n (gensym))) ^(rlet ((,n ,new)) (progn (rplaca ,cell ,n) ,n)))) .brev Given such a definition, the expression .code "(inc (car (abc)))" expands to code closely resembling: .verb (let ((#:g0048 (abc))) (let ((#:g0050 (succ (car #:g0048)))) (rplaca #:g0048 #:g0050) #:g0050)) .brev The .code defset macro has arranged for the argument expression .code (abc) of .code car to be evaluated to a temporary variable .codn #:g0048 , a .codn gensym . This, then, holds the .code cons cell being operated on.
At macro-expansion time, the variable .code cell from the parameter list specified by the .code defset is bound to this symbol. The access expression .code "(car #:g0048)" to retrieve the prior value is automatically generated by combining the name of the place .code car with the gensym to which its argument .code (abc) has been evaluated. The .code new variable was bound to the expression giving the new value, namely .codn "(succ (car #:g0048))" . The .meta set-form is careful to evaluate this only one time, storing its value into the temporary variable .codn #:g0050 , referenced by the variable .codn n . The .metn set-form 's .code "(rplaca ,cell ,n)" fragment thus turned into .code "(rplaca #:g0048 #:g0050)" where .code #:g0048 references the cons cell being operated on, and .code #:g0050 the calculated new value to be stored into its .code car field. The .meta set-form is careful to arrange for the new value .code #:g0050 to be returned. Those place-mutating operators which yield the new value, such as .code set and .code inc rely on this behavior. .coNP Macro @ define-place-macro .synb .mets (define-place-macro < name < macro-style-params .mets \ \ << body-form *) .syne .desc In some situations, an equivalence exists between two forms, only one of which is recognized as a place. The .code define-place-macro macro can be used to establish a form as a place in terms of a translation to an equivalent form which is already a place. The .code define-place-macro has the same syntax as .codn defmacro . It specifies a macro transformation for a compound form which has the .meta name symbol in its leftmost position. Place macro expansion doesn't use an environment; place macros are in a single global namespace, special to place macros. There are no lexically scoped place macros. Such an effect can be achieved by having a place macro expand to a form which is the target of a global or local macro, as necessary.
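
A minimal sketch of that technique follows. The names
.code my-head
and
.code my-head-impl
are hypothetical, chosen only for illustration. The global place macro
.code my-head
rewrites the place form into a
.code my-head-impl
form, and a lexical scope then supplies its own
.code macrolet
definition of
.code my-head-impl
which translates that form into a genuine place:
.verb
  ;; hypothetical names: my-head, my-head-impl
  (define-place-macro my-head (obj) ^(my-head-impl ,obj))

  ;; within this scope, the place (my-head c) is intended
  ;; to denote (car c), by way of the local macro.
  (macrolet ((my-head-impl (obj) ^(car ,obj)))
    (let ((c (list 1 2)))
      (set (my-head c) 10)
      c))
.brev
A different scope could bind
.code my-head-impl
to a different translation, which gives the place an effectively
lexical meaning even though the place macro itself is global.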
To support place macros, forms which are used as syntactic places are subject to a modified macro-expansion algorithm: .RS .IP 1. If a place macro exists for a form that is being used as a place, then that place macro is invoked to expand the form, and the expansion is taken in place of the original form. This process repeats until the form can no longer be expanded as a place macro, or the place macro declines to expand the form by returning the unexpanded input. .IP 2. A form that has been fully expanded as a place macro is then subject to a single round of macro-expansion, as if by .codn macroexpand-1 , which takes place in the original form's lexical environment. If the form doesn't expand, or the result of expansion is .code nil or a non-symbolic atom, then the process terminates. Otherwise, the process is repeated from step 1. .RE .IP The .code define-place-macro macro does not cause .meta name to become .codn mboundp . There can exist both an ordinary macro and a place macro of the same name. In this situation, when the macro call appears as a place form, it is expanded as a place macro, according to the above steps. When the macro call appears as an evaluated form, not being used as a place, the form is expanded using the ordinary macro. .TP* "Example:" Implementation of .code first in terms of .codn car : .verb (define-place-macro first (obj) ^(car ,obj)) .brev .coNP Functions @ macroexpand-place and @ macroexpand-1-place .synb .mets (macroexpand-1-place < form <> [ env ]) .mets (macroexpand-place < form <> [ env ]) .syne .desc If .meta form is a place macro form (a form whose operator symbol has been defined as a place macro using .codn define-place-macro ) these functions expand the place macro form and return the expanded form. Otherwise, they return .metn form . .code macroexpand-1-place performs a single expansion, expanding only the place macro that is referenced by the symbol in the first position of .metn form , and returns the expansion.
Note that if .meta form is an ordinary macro form, this function will not expand it, even if such an expansion would reveal a place macro form. .code macroexpand-place performs a full place expansion of .meta form by the following process. If .meta form is a place macro call, it is expanded, and the result is checked again to see whether it is a place macro, and expanded. This is repeated as many times as necessary until the result is no longer a place macro call. Then, if the resulting form is an ordinary macro invocation, it is expanded once as if by .codn macroexpand-1 . This process is iterated until a fixed point is reached. The optional .meta env parameter is a macro environment. Note: the .code macroexpand-1-place function ignores the .meta env parameter, which could change in the future. .TP* Examples Given this ordinary macro definition .verb (defmacro leftmost (x) ^(first ,x)) .brev the following results are obtained: .verb ;; ordinary macro leftmost expands to first, ;; then first place macro expands to car: (macroexpand-place '(leftmost x)) -> (car x) ;; macroexpand-1-place won't expand ordinary macro: (macroexpand-1-place '(leftmost x)) -> (leftmost x) ;; macroexpand-1-place expands place macro (macroexpand-1-place '(first x)) -> (car x) .brev .coNP Macro @ rlet .synb .mets (rlet >> ({( sym << init-form )}*) << body-form *) .syne .desc The macro .code rlet is similar to the .code let operator. It establishes bindings for one or more .metn sym s, which are initialized using the values of .metn init-form s. Note that the simplified syntax for a variable which initializes to .code nil by default is not supported by .codn rlet ; that is to say, the syntax .meta sym cannot be used in place of the .mono .meti >> ( sym << init-form ) .onom syntax when .meta sym is to be initialized to .codn nil . 
The .code rlet macro differs from .code let in that .code rlet assumes that those .metn sym s whose .metn init-form s, after macro expansion, are constant expressions (according to the .code constantp function) may be safely implemented as a symbol macro rather than a lexical variable. Therefore .code rlet is suitable in situations in which simpler code is desired from the output of certain kinds of machine-generated code, which binds local symbols: code with fewer temporary variables. On the other hand, .code rlet is not suitable in some situations when true variables are required, which are assignable, and provide temporary storage. .TP* "Example:" .verb ;; WRONG! Real storage location needed. (rlet ((flag nil)) (flip flag)) ;; error: flag expands to nil ;; Demonstration of constant-propagation (let ((a 42)) (rlet ((x 1) (y a)) (+ x y))) --> 43 (expand '(let ((a 42)) (rlet ((x 1) (y a)) (+ x y)))) --> (let ((a 42)) (let ((y a)) (+ 1 y))) .brev The last example shows that the .code x variable has disappeared in the expansion. The .code rlet macro turned it into a .code symacrolet denoting the constant 1, which then propagated to the use site, turning the expression .code "(+ x y)" into .codn "(+ 1 y)" . .coNP Macro @ slet .synb .mets (slet >> ({( sym << init-form )}*) << body-form *) .syne .desc The macro .code slet is a stronger form of the .code rlet macro. Just like .codn rlet , .code slet reduces bindings initialized by constant expressions to symbol macros. In addition, unlike .codn rlet , .code slet also reduces to symbol macros those bindings whose initializing expressions are simple references to lexical variables. .TP* Examples: .verb ;; reduces to let (slet ((a (list x y))) a) ;; b is a free variable, so this is also let (slet ((a b)) a) ;; b is lexical, so a becomes a symbol macro ;; the (slet ...) form becomes b. (let (b) (slet ((a b)) a)) ;; a becomes symbol macro; form transforms to 1.
(slet ((a 1)) a) .brev .coNP Macro @ alet .synb .mets (alet >> ({( sym << init-form )}*) << body-form *) .syne .desc The macro .code alet ("atomic" or "all") is a stronger form of the .code slet macro. All bindings initialized by constant expressions are turned to symbol macros. Then, if the remaining bindings are all initialized by lexical variables, they are also turned to symbol macros. Otherwise, none of the remaining bindings are turned to symbol macros. The .code alet macro can be used even in situations when it is possible that the initializing forms of the variables may have side effects through which they affect each other's evaluations. In this situation .code alet still propagates constants via symbol macros, and can eliminate the remaining temporaries if they can all be made symbol macros for existing lexicals: i.e. there doesn't exist any initialization form with interfering side effects. .coNP Macro @ define-accessor .synb .mets (define-accessor < get-function << set-function ) .syne .desc The .code define-accessor macro is used for turning a function into an accessor, such that forms which call the function can be treated as places. Arguments to .code define-accessor are two symbols, which must name functions. When the .code define-accessor call is evaluated, the .meta get-function symbol is registered as a syntactic place. Stores to the place are handled via calls to .metn set-function . If .meta get-function names a function which takes N arguments, .meta set-function must name a function which takes N+1 arguments. Moreover, in order for the accessor semantics to be correct .meta set-function must treat its rightmost argument as the value being stored, and must also return that value.
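
These requirements can be illustrated with a small sketch. The variable
.code *config*
and the functions
.code config-ref
and
.code config-set
are hypothetical names invented for this illustration; they are not part
of the library. The getter takes one argument, and the setter takes two,
treating the rightmost one as the new value and returning it:
.verb
  ;; hypothetical storage: a global hash table
  (defvar *config* (hash))

  ;; getter: one argument
  (defun config-ref (key)
    [*config* key])

  ;; setter: one more argument than the getter; the rightmost
  ;; argument is the value being stored, which is also returned
  ;; as the value of the set form.
  (defun config-set (key val)
    (set [*config* key] val))

  (define-accessor config-ref config-set)

  ;; config-ref forms may now be used as places
  (set (config-ref 'depth) 3)
  (inc (config-ref 'depth))
.brev
Under these assumptions, the
.code inc
form reads the prior value through
.code config-ref
and stores the incremented value through
.codn config-set .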
When a function call form targeting .meta get-function is treated as a place which is subject to an update operation (for instance an increment via the .code inc macro), the accessor definition created by .code define-accessor ensures that the arguments of .meta get-function are evaluated only once, even though the update involves a call to .meta get-function and .meta set-function with the same arguments. The argument forms are evaluated to temporary variables, and these temporaries are used as the arguments in the calls. No other assurances are provided by .codn define-accessor . In particular, if .meta get-function and .meta set-function internally each perform some redundant calculation over their arguments, this cannot be optimized. Moreover, if that calculation has a visible effect, that effect is observed multiple times in an update operation. If further optimization or suppression of multiple effects is required, the more general .code defplace macro must be used to define the accessor. It may also be possible to treat the situation in a satisfactory way using a .code define-place-macro definition, which effectively then supplies inline code whenever a certain form is used as a place, and that code itself is treated as a place. Note: .code define-accessor is similar to the short form of .codn defset . .coNP Accessor @ read-once .synb .mets (read-once << expression ) .mets (set (read-once << place ) << new-value ) .syne .desc When the .code read-once accessor is invoked as a function, it behaves like .codn identity , simply returning the value of .metn expression , which is not required to be a syntactic place. If a .code read-once form is used as a syntactic place then its argument must also be a .metn place . 
The .code read-once syntactic place denotes the same place as the enclosed .code place form, but with somewhat altered semantics, which is most useful in conjunction with .codn placelet , and in writing place-mutating macros which make multiple accesses to a place. Firstly, if the .code read-once place is evaluated, it accesses the existing value of .meta place exactly once, even if it occurs in a place-mutating form which normally doesn't use the prior value, such as the .code set macro. When .code read-once accesses .metn place , it stores the value in a hidden variable. Then, within the same place-mutating form, multiple references to the same .code read-once form all access the value of this hidden variable. Whenever the .code read-once form is assigned, both the hidden variable and the underlying .meta place receive the new value. Multiple references to the same .code read-once form can be produced using the .code placelet or .code placelet* macros, or by making multiple calls to the getter function obtained using .code with-update-expander in the implementation of a user-defined place-mutating operator, or user-defined place. .TP* Example: In both of the following two examples, there is no question that the .code array and .code i expressions are themselves evaluated only once; the issue is the access to the array itself; under the plain placelet, the array referencing takes place more than once. .verb ;; without read-once, array element [array i] is ;; accessed twice to fetch its current value: once ;; in the plusp expression, and then once again in ;; the dec expression. (placelet ((cell [array i])) (if (plusp cell) (dec cell))) ;; with read-once, it is accessed once. plusp refers ;; to a hidden lexical variable to obtain the prior ;; value, and so does dec. dec stores the new value ;; through to [array i] and the hidden variable.
(placelet ((cell (read-once [array i]))) (if (plusp cell) (dec cell))) .brev The following is .B not an example of multiple references to the same .code read-once form: .verb (defmacro inc-positive (place) ^(if (plusp (read-once ,place)) (inc (read-once ,place)))) .brev Here, even though the .code read-once forms may be structurally identical, they are separate instances. The first instance isn't even a syntactic place, but a call to the .code read-once function. Multiple references to the same place can only be generated using .code placelet or else by multiple explicit calls to the same getter function or macro generated for a place by an update expander. The following is a corrected version of .codn inc-positive : .verb (defmacro inc-positive (place :env env) (with-update-expander (getter setter) ^(read-once ,place) env ^(if (plusp (,getter)) (,setter (succ (,getter)))))) .brev To write the macro without .code read-once requires that it handles the job of providing a temporary variable for the value: .verb (defmacro inc-positive (place :env env) (with-update-expander (getter setter) place env (with-gensyms (value) ^(slet ((,value (,getter))) (if (plusp ,value) (,setter (succ ,value))))))) .brev The .code read-once accessor wrapped around .meta place allows .code inc-positive to simply make multiple references to .code "(,getter)" which will cache the value; the macro doesn't have to introduce its own hidden caching variable. .coNP Special Variables @, *place-update-expander* @ *place-clobber-expander* and @ *place-delete-expander* .desc These variables hold hash tables, by means of which update expanders, clobber expanders and delete expanders are registered, as associations between symbols and functions. If .code "[*place-update-expander* 'sym]" yields a function, then symbol .code sym is the basis for a syntactic place. If the expression yields .codn nil , then forms beginning with .code sym are not syntactic places.
(The situation of a clobber expander or delete expander being defined without an update expander is improper.) .coNP Special Variable @ *place-macro* .desc The .code *place-macro* special variable holds the hash table of associations between symbols and place macro expanders. If the expression .code "[*place-macro* 'sym]" yields a function, then symbol .code sym has a binding as a place macro. If that expression yields .codn nil , then there is no such binding: compound forms beginning with .code sym do not undergo place macro expansion. .SS* Structural Pattern Matching .NP* Introduction \*(TL provides a structural pattern-matching system. Structural pattern matching is a syntax which allows for the succinct expression of code which classifies objects according to their shape and content, and which accesses the elements within objects, or both. The central concept in structural pattern matching is the resolution of a pattern against an object. The pattern is specified as syntax which is part of the program code. The object is a run-time value of unknown type, shape and other properties. The primary pattern-matching decision is Boolean: does the object match the pattern? If the object matches the pattern, then it is possible to execute an associated body of code in a scope in which variables occurring in the pattern take on values from the corresponding parts of the object. .NP* Pattern-Matching Operators Structural pattern matching is available via several different macro operators, which are: .codn when-match , .codn if-match , .codn match , .codn match-case , .codn match-cond , .codn match-ecase , .code lambda-match and .codn defun-match . Function and macro argument lists may also be augmented with pattern matching using the .code :match parameter macro. The .code when-match macro is the simplest.
It tests an object against a pattern, and if there is a match, evaluates zero or more forms in an environment in which the pattern variables have bindings to the corresponding elements of the object. The .code if-match macro evaluates a single form if there is a match, in the scope of the bindings established by the pattern, otherwise an alternative form is evaluated in a scope in which those bindings are absent. The .code match macro tests an object against a pattern, expecting a match. If the match fails, an exception is thrown. Otherwise, it evaluates zero or more forms in the scope of the bindings established by the pattern. The .code match-case macro evaluates the same object against multiple clauses, each consisting of a pattern and zero or more forms. At most one matching clause is identified and evaluated. The .code match-ecase macro is similar to .code match-case except that if no matching case is identified, an exception is thrown. The .code match-cond macro evaluates multiple clauses, each of which specifies a pattern and an object expression. If the object produced by the expression matches the pattern, the forms in the clause are evaluated in the scope of the variables bound by the clause's pattern. The .code lambda-match macro provides a way to express an anonymous function whose argument list is matched against multiple clauses similarly to .code match-case and .code defun-match provides a way to define a top-level function using the same concept. Additionally, there exist .code each-match and .code while-match macro families. .NP* Syntax and Key Concepts \*(TL's structural pattern-matching notation is template-based. With the exception of structures and hash tables, objects are matched using patterns which are based on their printed notation. For instance, the pattern .code "(1 2 @a)" is a pattern matching the list .code "(1 2 3)" binding .code a to .codn 3 . The notation supports lists, vectors, ranges and atoms.
Atoms are compared using the .code equal function. Thus, in the above pattern, the 1 and 2 in the pattern match the corresponding 1 and 2 atoms in the object using .codn equal . All parts of a pattern are static material which matches literally, except those parts introduced by the meta prefix .codn @ . This prefix denotes variables like .code @a as well as useful pattern-matching operators like .mono .meti @(all << pattern ) .onom which matches a list or sublist whose elements all match .metn pattern . The quasiquote syntax is specially supported for expressing matching, in an alternative style. For instance the quasiquote .code "^(1 2 ,a)" is a pattern equivalent to .codn "(1 2 @a)" . Structure objects are matched using a dedicated .code "@(struct name ...)" operator, or else in the quasiquote style using .code "^#S(name ...)" syntax. The non-quasiquoted literal syntax .code "#S(name ...)" cannot be used for matching. Similarly, hash objects are matched using a .code "@(hash ...)" operator, or else .code "^#H(...)" syntax in the quasiquote style. .code "#H(...)" cannot be used. Note: the non-quasiquoted .code #S and .code #H literals are not and cannot be used for matching because they produce structure and hash objects which lose important information about how they were specified in the syntax, and carry restrictions which are unacceptable for pattern matching. The order of sub-patterns is important in pattern syntax, but struct and hash objects do not preserve the order in which their elements were specified. A struct literal is required to specify the name of an existing struct type, and slot names which are valid for that type, otherwise it is erroneous. This is not acceptable for pattern matching, because patterns may appear in place of those elements. The pattern match for a hash may specify the same key pattern more than once, which means that the key pattern cannot be an actual key in an actual hash, which requires every key to be unique.
Structure and hash quasiquotes do not have these issues; they are not actually literal structure and hash objects, but list-based syntax. .NP* Variables in Patterns Patterns use meta-symbols for denoting variables. Variables must be either bindable symbols, or else .codn nil , which has a special meaning: the pattern variable .code @nil matches any object, and binds no variable. Pattern variables are ordinary Lisp variables. Whereas in ordinary non-pattern matching Lisp code, it is always unambiguous whether a variable is being bound or referenced, this is deliberately not the case in patterns. A variable occurring in a pattern may be a fresh variable, or a reference to an existing one. The difference between these situations is not apparent from the syntax of the pattern; it depends on the context established by the scope. With one exception, if a pattern contains a variable which is already bound in the surrounding scope, then it refers to that binding. Otherwise, it freshly binds the variable. The exception is that pattern operator .code @(as) always binds a fresh variable. A variable is also considered already bound if it exists as a lexical or global symbol macro .cod2 ( symacrolet or .codn defsymacro ). When a pattern variable refers to an existing variable, then each occurrence of that variable must match an object which is .code equal to the value of that variable. For instance, the following function returns the third element of a list, if the first two elements are repetitions of the .code x argument, otherwise .codn nil : .verb (defun x-x-y (list x) (when-match (@x @x @y) list y)) (x-x-y '(1 1 2) 1) -> 2 (x-x-y '(1 2 3) 1) -> nil ;; no @x @x match (x-x-y '(1 1 2 r2) 1) -> nil ;; list too long .brev If the variable does not exist in the scope surrounding the pattern, then the leftmost occurrence of the variable establishes a binding, taking the value from its corresponding object being matched by that occurrence of the variable.
The remaining occurrences of the variable, if any, must correspond to objects which are .code equal to that value, or else there is no match. For instance, the pattern .code "(@a @a)" matches a list like .code "(1 1)" as follows. First .code @a binds to the leftmost .code 1 and then the second .code 1 matches the existing value of that .codn a . An input such as .code "(1 2)" fails to match because the second occurrence of .code @a retrieves an object that is not .code equal to that variable's existing value. A pattern can contain multiple occurrences of the same symbol as a variable. These may or may not refer to the same variable. Two occurrences of the same symbol refer to distinct variables if: .RS .IP 1. they are freshly bound in separate branches of the .code @(or) operator; or .IP 2. one of the two variables is freshly bound by the .code @(as) operator and the other variable occurs outside of that .codn @(as) ; or .IP 3. both of the variables are freshly bound using .codn @(as) . .RE Any other two or more occurrences of the same symbol occurring in the same pattern refer to the same variable. .NP* Comparison to Macro Parameter Lists \*(TL's macro-style parameter lists, appearing in .code tree-bind and related macros, also provide a form of structural pattern matching. Macro-style parameter list pattern matching is limited to objects of one kind: tree structures made of .code cons cells. It is only useful for matching on shape, not content. For example, .code tree-bind cannot express the idea of matching a list whose first element is the symbol .code a and whose third element is .codn 42 . Moreover, every position in the tree pattern must specify a variable which captures the corresponding element of the structure or else the symbol .code t to indicate that no variable is to be captured. There are no other pattern matching operators. .NP* User-Defined Patterns User-defined pattern operators are possible.
When the .meta operator symbol in the .mono .meti >> @( operator << argument *) .onom syntax doesn't match any built-in operator, a search takes place to determine whether .meta operator is a pattern macro. If so, the pattern macro is expanded, and the result of the expansion is treated as a pattern to process recursively, unless it is the original macro form, in which case it is treated as a predicate pattern. User-defined pattern macros are defined using the .code defmatch macro. .SS* Pattern-Matching Notation The pattern-matching notation is documented in the following sections; a section describing the pattern-matching macros follows. .NP* Atom match A pattern consisting of an atom other than a vector matches a similar object. The similarity is determined using the .code equal function. The atom is not subject to evaluation, which means that a symbolic atom stands for itself, and not the value of a variable. .TP* Examples: .verb ;; the pattern 1 matches the object 1 (if-match 1 1 'yes 'no) --> yes ;; the object 0 does not match (if-match 1 0 'yes 'no) --> no ;; a matches a, does not match b (let ((sym 'a)) (list (if-match a sym 'yes 'no) (if-match b sym 'yes 'no))) --> (yes no) .brev .NP* Variable match .synb .mets >> @ symbol .syne .desc A meta-symbol can be used as a pattern expression. This pattern unconditionally matches an object of any kind. The .meta symbol is required to be either a bindable symbol according to the .code bindable function, or else the symbol .codn nil . If .meta symbol is a bindable symbol, which has no binding in scope, then a variable by that name is freshly bound, and takes on the corresponding object as its value. If .meta symbol is a bindable symbol with an existing binding, then the corresponding object must be .code equal to that variable's existing value, or else the match fails. If .meta symbol is .codn nil , then the match succeeds unconditionally, without binding a variable.
.TP* Examples: .verb (when-match @a 42 (list a)) -> (42) (when-match (@a @b @c) '(1 2 3) (list c b a)) -> (3 2 1) ;; No match: list is longer than pattern (when-match (@a @b) '(1 2 3) (list a b)) -> nil ;; Use of nil in dot position to match longer list (when-match (@a @b . @nil) '(1 2 3) (list a b)) -> (1 2) .brev .NP* List match .synb .mets <> ( pattern +) .mets <> ( pattern + . << pattern ) .syne .desc Pattern syntax consisting of a nonempty, possibly improper list matches list structure. A pattern expression may be specified in the dotted position. If it is omitted, then there is an implicit terminating .code nil which constitutes an atom expression matching .codn nil . A list pattern matches a list of the same shape. For each .meta pattern expression, there must exist an item in the list. A match occurs when every .meta pattern matches the corresponding element of the list, including the .meta pattern in the dotted position. Because the dotted position .meta pattern matches a list, it is possible for a short pattern to match a longer list. The syntax is indicated as requiring at least one .meta pattern because otherwise the list is empty, which corresponds to the atom pattern .codn nil . The syntax .mono .meti (. << pattern ) .onom is valid, but indistinguishable from .metn pattern . It is only a list pattern if .meta pattern is a list pattern. .TP* Examples: .verb (if-match (@a @b @c . @d) '(1 2 3 . 4) (list d c b a)) --> (4 3 2 1) ;; 2 doesn't satisfy oddp (if-match (@(oddp @a) @b @c . @d) '(2 x y z) (list a b c d) :no-match) --> :no-match ;; 1 and 2 match, a takes (3 4) (if-match (1 2 . @a) '(1 2 3 4) a) --> (3 4) ;; nesting (if-match ((1 2 @a) @b) '((1 2 3) 4) (list a b)) -> (3 4) .brev .NP* Vector match .synb .mets <> #( pattern *) .syne .desc A pattern match for a vector is expressed using vector notation enclosing pattern expressions. This pattern matches a vector object which contains exactly as many elements as there are patterns.
Each pattern is applied against the corresponding vector element. .TP* Examples: .verb ;; empty vector pattern matches empty vector (if-match #() #() :yes :no) -> :yes ;; empty vector pattern fails to match nonempty vector (if-match #() #(1) :yes :no) -> :no ;; match with nested list and vector (if-match #((1 @a) #(3 @b)) #((1 2) #(3 4)) (list a b)) --> (2 4) .brev .NP* Range match .synb .mets >> #R( from-pattern << to-pattern ) .syne .desc A pattern match for a range can be expressed by embedding pattern expressions in the .code #R notation. The resulting pattern requires the corresponding object to be a range, otherwise the match fails. If the corresponding object is a range, then the .meta from-pattern is matched against its .code from and the .meta to-pattern is matched against its .code to part. Note that if the range expression notation .code a..b is used as a pattern, that is actually a list pattern, due to that being a syntactic sugar for .codn "(rcons a b)" . .TP* Examples: .verb (if-match #R(10 20) 10..20 :yes :no) -> :yes (if-match #R(10 20) #R(10 20) :yes :no) -> :yes (if-match #R(10 20) #R(1 2) :yes :no) -> :no (when-match #R(@a @b) 1..2 (list a b)) -> (1 2) ;; not a range match! rcons syntax match (when-match @a..@b '1..2 (list a b)) -> (1 2) ;; above, de-sugared: (when-match (rcons @a @b) '(rcons 1 2) (list a b)) -> (1 2) .brev .NP* Quasiliteral match .synb .mets <> "`...@" var "...`" .mets <> "@`...@" var "...`" .syne .desc The quasiliteral syntax is supported as a pattern-matching operator. The corresponding object is required to be a character string, which is analyzed according to the structure of the quasiliteral pattern, and portions of the string are captured by variables. If the corresponding object isn't a string according to .code stringp then the match fails. The quasiliteral pattern must match the entire input string. 
In order that the quasiliteral's syntactic structure is not misinterpreted as a predicate pattern, and in order to make certain situations work in quasiquoted pattern matching, a quasiliteral pattern may be specified as either .code "`...`" or .codn "@`...`" . The latter form, which is structurally .code "(sys:expr (sys:quasi ...))" is specially recognized and treated as equivalent to the unadorned quasiliteral pattern. A quasiliteral pattern matches in a linear fashion, from left to right. Variables bound earlier in the pattern can be referenced later in the pattern as bound variables. With one exception, bound variables denote character strings in accordance with the usual quasiliteral conversion and formatting rules. All of the modifier notations may be used. For instance, if .code x is a bound variable, then .code "@{x -40}" denotes the value of .code x converted to a string, and right-aligned in a forty-character-wide field. Consequently, the notation matches exactly such a forty-character text. The exception is that if a bound variable has a regular expression modifier, as in .code "@{x #/re/}" then it has a special meaning as a pattern. Moreover, this syntax has no meaning in a quasiliteral. In the following description of the quasiliteral pattern-matching rules, the symbols .metn uv , .meta uv0 and .meta uv1 represent unbound variables: variables which have no apparent lexical binding and are not defined as global variables. Unless indicated otherwise, .mono .meti >> @ uv .onom refers to a plain variable syntax such as .code @abc or else to braced syntax without modifiers, such as .codn @{abc} . The same remarks apply to .meta uv0 and .metn uv1 . The symbol .meta bv represents a bound variable: a variable which has an existing binding, which can occur in the form of the ordinary notation, or the braced notation with or without modifiers. The notation .codn {P} , .codn {P0} , .codn {P1} ... denotes a substring of the pattern, possibly empty.
.RS .coIP `` The empty quasiliteral pattern matches an empty string. .coIP `text{P}` A quasiliteral pattern which begins with a portion of text matches a string which begins with the same text. The remaining portion .code {P} of the pattern is then matched against a suffix of the input string which excludes the matched text. .meIP <> `@ uv ` A simple unbound variable occurring as the last element of the pattern matches and binds the entire rest of the input string. .meIP <> `@ uv text{P}` A simple unbound variable followed by a text element matches the input string if .str text occurs in that string as a substring. In that case, .meta uv is bound to the possibly empty prefix of the input string consisting of the characters before the leftmost match for .strn text . The rest of the pattern .code {P} is then matched against that suffix of the input string which begins after the last character of the leftmost match for .strn text . .meIP <2> `@ uv @ bv {P}` The bound variable .meta bv is converted to text in the manner of an ordinary quasiliteral substitution. The situation then reduces to the .mono .meti <> `@ uv text{P}` .onom pattern, where .code text denotes the character string produced by substitution of .metn bv . .meIP >> `@{ uv << integer }{P}` An unbound variable .meta uv which uses the brace notation to specify a literal .meta integer modifier denotes a match for that many characters. It is an error if the value is zero or negative. The match succeeds if the input string has at least that many characters, in which case the variable .meta uv takes on those characters, and the rest of the pattern is matched against a suffix of the string with those characters removed. .meIP >> `@{ uv <> #/ regex /}{P}` An unbound variable .meta uv which carries a regular-expression modifier specifies a regular-expression match. If a prefix of the input string matches .metn regex , then the match is successful and .meta uv captures that prefix. 
The rest of the pattern .code {P} is then matched against the rest of the string after the prefix. .meIP >> `@{ bv <> #/ regex /}{P}` A bound variable .meta bv which carries a regular expression modifier specifies a regular expression match exactly like an unbound variable. This syntax produces a successful match if two conditions are met: a prefix of the input string matches .metn regex , and the matched prefix is .code equal to the value of .metn bv . The rest of the pattern .code {P} is then matched against the rest of the string after the prefix. .meIP <> `@ bv {P}` The bound variable .meta bv is converted to text in the manner of an ordinary quasiliteral substitution. The situation then reduces to the .code `text{P}` pattern, where .code text denotes the character string produced by substitution of .metn bv . .meIP <2> `@ uv0 @ uv1 {P0}` Two consecutive unbound variables, where .meta uv0 is a plain variable with no modifiers, constitute an invalid pattern. This situation is diagnosed as an error. If .meta uv0 is braced, carrying an integer or regular-expression modifier .metn mod , then the situation is treated as the pattern .mono .meti >> `@{ uv << mod }{P}` .onom where .code {P} refers to the .mono .meti <> @ uv1 {P0} .onom portion. .RE .IP No other quasiliteral syntax, or combination of variable modifiers, is supported in quasiliteral patterns. .TP* Examples: .verb (when-match `@a-@b` "foo-bar" (list a b)) -> ("foo" "bar") (when-match `@{a #/\ed+/}@b` "123xy" (list a b)) -> ("123" "xy") (let ((a 42)) (when-match `[@{a -8}] @b` "[      42] packets" b)) -> "packets" .brev .NP* Quasiquote matching notation .synb .mets >> ^ qq-syntax .syne .desc Quasiquoting provides an alternative pattern-matching syntax. It uses a subset of the quasiquoting notation. Only specific kinds of quasiquoted objects listed in this description are supported. Within a quasiquote used for pattern-matching, unquotes indicate operators and variables instead of the .code @ prefix.
Splicing unquote syntax plays no role; its presence produces unspecified behavior. The quasiquote matching notation is described, understood and implemented in terms of a translation to the standard pattern-matching syntax, according to the following rules. The .code [X] notation used here indicates that the element enclosed in brackets is subject to a recursive translation according to the rules: .RS .meIP >> , expr An unquoted expression occurring in the quasiquote is translated to the .mono .meti >> @ expr .onom pattern-matching syntax. If .meta expr is a symbol, then this is a meta-variable: .mono .meti (sys:var << expr ) .onom otherwise it is translated to the .mono .meti (sys:expr << expr ) .onom syntax. .coIP ",`...quasilit...`" An unquoted quasiliteral is treated uniformly as .mono .meti >> , expr .onom and is therefore translated into .codn "@`...quasilit...`" . Since that is equivalent to .codn "`...quasilit...`" , quasiliteral matching is supported within quasiquote notation in a straightforward way. .meIP >> ~ expr In JSON syntax, unquotes are given the same above treatment as .code , (comma) unquotes in ordinary syntax. .coIP ~`...quasilit...` Similarly, quasiliterals are supported in JSON syntax. .meIP #H(() >> ( k0 << v0 ) >> ( k1 << v1 ) ...) Hash quasiquote syntax is translated according to the .mono .meti @(hash <> ([ k0 ] <> [ v0 ]) <> ([ k1 ] <> [ v1 ]) ...) .onom pattern, with each key and value recursively translated. The syntax must specify .code () for the hash construction arguments part, otherwise an error is diagnosed. That is to say, it must be of the form .codn "#H(() ...)" , where the first element is .codn () . .meIP >> #S( type < e0 < e1 ...) Structure quasiquote syntax is translated according to the .mono .meti @(struct <> [ type ] <> [ e0 ] <> [ e1 ] ...) .onom pattern. .meIP >> #( e0 < e1 ...) Vector quasiquote syntax is translated according to the .mono .meti <> #([ e0 ] <> [ e1 ] ...)
.onom pattern: it becomes a vector object containing embedded patterns. .meIP <> #J[ e0 , << e1 , ...] A JSON array quasiquote is translated into .mono .meti <> #([ e0 ] <> [ e1 ] ...) .onom exactly like a vector. Here, the .code [X] transformation recognizes JSON .code ~ (tilde) unquotes, and recursively recognizes and transforms JSON syntax not prefixed by .codn #J . .meIP >> #J{ k0 : << v0 , < k1 : << v1 , ...} A JSON hash quasiquote is translated into .mono .meti @(hash <> ([ k0 ] <> [ v0 ]) <> ([ k1 ] <> [ v1 ]) ...) .onom exactly like a hash. .meIP >> ( car . << cdr ) Tree structure is translated according to the .mono .meti <> ([ car ] . <> [ cdr ]) .onom pattern: it is recursively examined for translations. .meIP >> ^ nested-qq-syntax A nested quasiquote pattern is diagnosed as an error. .meIP >> ,* expr Splicing syntax is diagnosed as an error. .meIP >> ~* expr Splicing JSON syntax is diagnosed as an error inside a JSON quasiquote. .meIP < obj Any other quasiquoted object is left untranslated. .RE .IP .TP* Examples: .verb ;; basic unquote: variables embedded via unquote, ;; not requiring @ prefix. (when-match ^(,a ,b) '(1 2) (list a b)) --> (1 2) ;; operators embedded via unquote; interior of operators ;; is regular non-quasiquoting pattern syntax. 
(when-match ^(,(oddp @a) ,(evenp @b)) '(1 2) (list a b)) --> (1 2) (when-match ^#(,a ,b) #(1 2) (list a b)) --> (1 2) (when-match ^#S(,type year ,y) #S(time year 2021) (list (struct-type-name type) y)) --> (time 2021) (when-match ^#H(() (x ,y) (,(symbolp @y) ,datum)) #H(() (x k) (k 42)) datum) --> (42) ;; JSON syntax (when-match ^#J~a 42.0 a) --> 42.0 (when-match ^#J[~a, ~b] #J[true, false] (list a b)) --> (t nil) (when-match ^#J{"x" : ~y, ~(symbolp @y) : ~datum} #J{"x" : true, true : 42} datum) --> (42.0) (when-match ^#J{"foo" : {"x" : ~val}} #J{"foo" : {"x" : "y"}} val) --> "y" .brev .coNP Pattern Operator @ struct .synb .mets @(struct < name >> { slot-name << pattern }*) .mets @(struct < pattern >> { slot-name << pattern }*) .syne .desc The .code struct pattern operator matches a structure object. The operator supports two modes of matching, the choice of which depends on whether the first argument is a .meta name or a .metn pattern . The first argument is considered a .meta name if it is a bindable symbol according to the .code bindable function. In this situation, the operator operates in strict mode. Otherwise, the operator is in loose mode. The .meta name or .meta pattern argument is followed by zero or more .meta "slot-name pattern" pairs, which are not enclosed in lists, similarly to the way slots are presented in the .code #S struct syntax and in the argument conventions of the .code new macro. In strict mode, .meta name is assumed to be the name of an existing struct type. The object being matched is tested whether it is a subtype of this type, as if using the .code subtypep function. If it isn't, the match fails. In loose mode, the object being matched is tested whether it is a structure object of any structure type. If it isn't, the match fails. In strict mode, each .meta "slot-name pattern" pair requires that the object's slot of that name contain a value which matches .metn pattern . 
The operator assumes that all the .metn slot-name s are slots of the struct type indicated by .metn name . In loose mode, no assumption is made that the object actually has the slots specified by the .meta slot-name arguments. The object's structure type is inquired to determine whether it has each of those slots. If it doesn't, the match fails. If the object has the required slots, then the values of those slots are matched against the patterns. In loose mode, the .meta pattern given in the first argument position of the syntax is matched against the object's structure type: the type itself, rather than its symbolic name. .TP* Examples: .verb ;; extract the month from a time structure ;; that is required to have a year of 2021. (when-match @(struct time year 2021 month @m) #S(time year 2021 month 1) m) -> 1 ;; match any structure with name and value slots, ;; whose name is foo, and extract the value. (defstruct widget () name value) (defstruct grommet () name value) (append-each ((obj (list (new grommet name "foo" value :grom) (new widget name "foo" value :widg)))) (when-match @(struct @type name "foo" value @v) obj (list (list type v)))) --> ((#<struct-type grommet> :grom) (#<struct-type widget> :widg)) .brev .coNP Pattern Operator @ hash .synb .mets @(hash >> {( key-pattern <> [ value-pattern ])}*) .syne .desc The .code hash pattern operator matches a hash-table object by means of patterns which target keys, values or both. An important concept in the requirements governing the operation of the .code hash operator is that of a .IR "trivial pattern" . A pattern is .I nontrivial if it is a variable or operator. A quasiliteral pattern that contains variables or operators is nontrivial. A pattern is also nontrivial if it is a list, vector, range or quasiquote pattern containing at least one nontrivial pattern. Otherwise, it is trivial. The .code hash operator requires the corresponding object to be a hash table, otherwise the match fails.
If the corresponding object is a hash table, then the operator matches each .meta key-pattern and .meta value-pattern pair against that object as described below. Each of the pairs must successfully match, otherwise the overall match fails. The following requirements apply to key-value pattern pairs in which the value pattern is specified. If .meta key-pattern is a trivial pattern, then the semantics of the match is that .meta key-pattern is taken as a literal object representing a hash key. The hash table is searched for that key. If the key is not found, the match fails. Otherwise, the value corresponding to that key is matched against the .meta value-pattern which may be trivial or nontrivial. If .meta key-pattern is a simple variable pattern .mono .meti >> @ sym .onom and if .meta sym has an existing binding, then the value of .meta sym is looked up in the hash table. If it is not found, then the match fails, otherwise the corresponding value is matched against .metn value-pattern , which may be trivial or nontrivial. If .meta key-pattern is a nontrivial pattern other than a variable pattern for a variable which has an existing binding, and if .meta value-pattern is trivial, then .meta value-pattern is taken as a literal object, which is used for searching the hash table for one or more keys, as if it were the .meta value argument in a call to the .code hash-keys-of function, to find all keys which have a value .code equal to that value. If no keys are found, then the match fails. Otherwise, the .meta key-pattern is then matched against the retrieved list of hash keys. Finally, if both .meta key-pattern and .meta value-pattern are nontrivial, then an exhaustive search is performed of the hash table. Every key in the hash table is matched against .meta key-pattern and if it matches, the value is matched against .metn value-pattern . If both match, then the values from the matches are collected into lists.
At least one matching key-value pair must be found, otherwise the overall match fails. Note: this situation can be understood as if the hash table were an association list of .code cons cells of the form .verb .mets >> ( key . << value ) .brev and as if the two patterns were combined into a .code coll operator against this list in the following way: .verb .mets @(coll >> ( key-pattern . << value-pattern )) .brev such that the semantics can then be understood in terms of the .code coll operator matching against an association list. The following requirements apply when the .meta value-pattern is omitted. If .meta key-pattern is a nontrivial pattern other than a variable pattern for a variable which has an existing binding, then the pattern is applied against the list of keys from the hash table, which are retrieved as if using the .code hash-keys function. If .meta key-pattern is a variable pattern referring to an existing binding, then that pattern is taken as a literal object. The match is successful if that object occurs as a key in the hash table. .TP* Example: .verb ;; First, (x @y) has a trivial key pattern so the x ;; entry from the hash table is retrieved, the ;; value being the symbol k. This k is bound to @y. ;; Because y is now a bound variable the pattern (@y @datum) ;; is interpreted as search of the hash table for ;; a single entry matching the value of @y. This ;; is the k entry, whose value is 42. The @datum ;; value match takes this 42. (when-match @(hash (x @y) (@y @datum)) #H(() (x k) (k 42)) datum) --> 42 ;; Again, (x @y) has a trivial key pattern so the x ;; entry from the hash table is retrieved, the ;; value being the symbol k. This k is bound to @y. ;; This time the second pattern has a @(symbolp) ;; predicate operator. This is not a variable, and ;; so the pattern searches the entire ;; hash table. The @y variable has a binding to k, ;; so only the (k 42) entry is matched. The 42 ;; value matches @datum, and is collected into a list. 
(when-match @(hash (x @y) (@(symbolp @y) @datum)) #H(() (x k) (k 42)) datum) --> (42) .brev .coNP Pattern Operator @ as .synb .mets @(as < name << pattern ) .syne .desc The .code as pattern operator binds the corresponding object to a fresh variable given by .metn name , similarly to the Lisp .code let operator. If another variable called .meta name exists, it is shadowed; thus, no back-referencing is performed. The .meta name argument must be a bindable symbol, or else .codn nil . If .meta name is .codn nil , then no name is bound. Thus .mono .meti @(as nil << pattern ) .onom is equivalent to .metn pattern . Otherwise, .meta pattern is processed in a scope in which the new .meta name binding is already visible. The .code as operator succeeds if .meta pattern matches. Note: in a situation when it is necessary to bind a variable to an object in parallel with one or more patterns, such that the variable can back-reference to an existing occurrence, the .code and pattern operator can be used. .TP* Example: .verb ;; w captures the entire (1 2 3) list: (when-match @(as w (@a @b @c)) '(1 2 3) (list w a b c)) --> ((1 2 3) 1 2 3) ;; match a list which has itself as the third element (when-match @(as a (1 2 @a 4)) '#1=(1 2 #1# 4) :yes) --> :yes .brev .coNP Pattern Operator @ with .synb .mets @(with <> [ main-pattern ] >> { side-pattern | << name } << expr ) .syne .desc The .code with pattern operator matches the optional .meta main-pattern against a corresponding object, while matching a .meta side-pattern or .meta name against the value of the expression .meta expr which is embedded in the syntax. First, if .meta main-pattern is present in the syntax, it is matched against its corresponding object. This match must succeed, or else the .code with operator fails to match, in which case .meta expr is not evaluated.
Next, if .meta main-pattern successfully matched, or is absent, .meta expr is evaluated in the scope of earlier pattern variables, including any which emanate from .metn main-pattern . It is unspecified whether later pattern variables are visible. Finally, .meta side-pattern is matched against the value of .metn expr . If that succeeds, then the operator has successfully matched. If a .meta name is specified instead of a .metn side-pattern , it must be a bindable symbol or else .codn nil . .TP* Examples: .verb (when-match (@(with @a x 42) @b @c) '(1 2 3) (list a b c x)) --> (1 2 3 42) (let ((o 3)) (when-match (@(evenp @x) @(with @z @(oddp @y) o)) '(4 6) (list x y z))) --> (4 3 6) .brev .coNP Pattern Operator @ require .synb .mets @(require < pattern << condition *) .syne .desc The pattern operator .code require applies the specified .meta pattern to the corresponding object. If the .meta pattern matches, the operator then imposes the additional constraints specified by zero or more .meta condition forms. Each .meta condition is evaluated in a scope in which the variables from .meta pattern have already been established. For the .code require operator to be a successful match, every .meta condition must evaluate true, otherwise the match fails. The .meta condition forms behave as if they were the arguments of an implicit .code and operator, which implies left-to-right evaluation behavior, stopping evaluation on the first .meta condition which produces .codn nil , and defaulting to a result of .code t when no .meta condition forms are specified.
.TP* Examples: .verb ;; Match a (+ a b) expression where a and b are similar: (when-match @(require (+ @a @b) (equal a b)) '(+ z z) (list a b)) --> (z z) ;; Mismatched case (if-match @(require (+ @a @b) (equal a b)) '(+ y z) (list a b) :no-match) --> :no-match .brev .coNP Pattern Operators @ all and @ all* .synb .mets @(all << pattern ) .mets @(all* << pattern ) .syne .desc The .code all and .code all* pattern operators require the corresponding object to be a sequence. The specified .meta pattern is applied against every element of the sequence. The match is successful if .meta pattern matches every element. Furthermore, in the case of a successful match, each variable that is freshly bound by .meta pattern is converted into a list of all of the objects which that variable encounters from all elements of the sequence. Those variables which already have a binding from another .meta pattern are not converted to lists. Their existing values are merely required to match each corresponding object they encounter. The difference between .code all and .code all* is as follows. The .code all operator respects the vacuous truth of the match when the sequence is empty. In that case, the match is successful, and the variables are all bound to the empty list .codn nil . In contrast, the alternative .code all* operator behaves like a failed match when the sequence is empty. .TP* Examples: .verb ;; all elements of list match the pattern (x @a @b) ;; a is bound to (1 2 3); b to (a b c) (when-match @(all (x @a @b)) '((x 1 a) (x 2 b) (x 3 c)) (list a b)) --> ((1 2 3) (a b c)) ;; Match a two element list whose second element ;; consists of nothing but zero or more repetitions ;; of the first element. x is not turned into a list ;; because it has a binding due to @x. 
(when-match (@x @(all @x)) '(1 (1 1 1 1)) x) -> 1 ;; no match because of the 2 (when-match (@x @(all @x)) '(1 (1 1 1 2)) x) -> nil .brev .coNP Pattern Operator @ some .synb .mets @(some << pattern ) .syne .desc The .code some pattern operator requires the corresponding object to be a sequence. The specified .meta pattern is applied against every element of the sequence. The match is successful if .meta pattern matches at least one element. Variables are extracted from the first matching element which is found. .TP* Example: .verb ;; the second (x 2 b) element is the leftmost one ;; which matches the (x @a @b) pattern (when-match @(some (x @a @b)) '((y 1 a) (x 2 b) (z 3 c)) (list a b)) -> (2 b) .brev .coNP Pattern Operator @ coll .synb .mets @(coll << pattern ) .syne .desc The .code coll pattern operator requires the corresponding object to be a sequence. The specified .meta pattern is applied against every element of the sequence. The match is successful if .meta pattern matches at least one element. Each variable that is freshly bound by the .meta pattern is converted into a list of all of the objects which that variable encounters from the matching elements of the sequence. Those variables which already have a binding from another .meta pattern are not converted to lists. Their existing values are merely required to match each corresponding object they encounter. Variables are extracted from all matching elements, and collected into parallel lists, just like with the .code @(all) operator. .TP* Example: .verb (when-match @(coll (x @a @b)) '((y 1 a) (x 2 b) (z 3 c) (x 4 d)) (list a b)) -> ((2 4) (b d)) .brev .coNP Pattern Operators @ scan and @ scan-all .synb .mets @(scan << pattern ) .mets @(scan-all << pattern ) .syne .desc The .code scan operator matches .meta pattern against the corresponding object. If the match fails, and the object is a .code cons cell, the match is tried on the .code cdr of the cons cell.
The .code cdr traversal repeats until a successful match is found, or a match failure occurs against an atom. Thus, a list object, possibly improper, matches .meta pattern under .code scan if any suffix of that object matches. The .code scan-all pattern matches the object in the same way. However, instead of finding the leftmost match, it finds all matches. Every variable that occurs inside .meta pattern is bound to a list of the matches which correspond to that variable. .TP* Examples: .verb ;; mismatch: 1 doesn't match 2 (when-match @(scan 2) 1 t) -> nil ;; simple atom match: 42 matches 42 (when-match @(scan 42) 42 t) -> t ;; (2 3) is a sublist of (1 2 3 4) (when-match @(scan (2 3 . @nil)) '(1 2 3 4) t) -> t ;; (2 @x 4 . @nil) matches (2 3 4), binding x to 3: (when-match @(scan (2 @x 4 . @nil)) '(1 2 3 4 5) x) -> 3 ;; The entire matching suffix can be captured. (when-match @(scan @(as sfx (2 @x 4 . @nil))) '(1 2 3 4 5) sfx) -> (2 3 4 5) ;; Missing . @nil in pattern anchors search to end: (when-match @(scan (@x 2)) '(1 2 3 2 4 2) x) -> 4 ;; Terminating atom anchors to improper end: (when-match @(scan (@x . 4)) '(1 2 3 . 4) x) -> 3 ;; Atom pattern matches only terminating atom (when-match @(scan #(@x @y)) '(1 2 3 . #(4 5)) (list x y)) -> (4 5) ;; Pattern doesn't match list: (match @(scan-all (b @x)) '(1 2 3 4 b 5 b 6 7 8) x) -> error ;; x bound to list of items that follow b symbol: (match @(scan-all (b @x . @nil)) '(1 2 3 4 b 5 b 6 7 8) x) -> (5 6) .brev .coNP Pattern Operators @ and and @ or .synb .mets @(and << pattern *) .mets @(or << pattern *) .syne .desc The .code and and .code or operators match multiple patterns in parallel, against the same object. The .code and operator matches if every .meta pattern matches the object, otherwise there is no match. The .code or operator requires one .meta pattern to match. It tries the patterns in left-to-right order, and stops at the first matching one, declaring failure if none match.
The .code and and .code or operators have different scoping rules. Under .codn and , later patterns are processed in the scopes of earlier patterns, just like with other pattern operators. Duplicate variables back-reference. Under .codn or , the patterns are processed in separate, parallel scopes. No back-referencing takes place among same-named variables introduced in separate patterns of the same .codn or . When the .code and matches, the variables from all of the patterns are bound. When the .code or operator matches, the variables from all of the patterns are also bound. However, only the variables from the matching .meta pattern take on the values implied by that pattern. The variables from the nonmatching patterns that do not have the same names as variables in the matching .metn pattern , and that have been newly introduced in the .code or operator, take on .code nil values. .TP* Examples: .verb (if-match @(and (@x 2 3) (1 @y 3) (1 2 @z)) '(1 2 3) (list x y z)) -> (1 2 3) (if-match @(or (@x 3 3) (1 @x 3) (1 2 @x)) '(1 2 3) x) -> 2 .brev .coNP Pattern Operator @ not .synb .mets @(not << pattern ) .syne .desc The pattern operator .code not provides logical inverse semantics. It matches if and only if the .meta pattern does not match. Whether or not the .code not operator matches, no variables are bound. If the embedded .meta pattern matches, the variables which it binds are suppressed by the .code not operator. .TP* Examples: .verb ;; @a matches unconditionally, so @(not @a) always fails: (if-match @(not @a) 1 :yes :no) -> :no ;; error: a is not bound (if-match @(not @a) 1 :yes a) -> error (match-case '(1 2 3) ((@(not 1) @b @c) (list :case1 b c)) ((@(not 0) @b @c) (list :case2 c b))) --> (:case2 3 2) .brev .NP* Pattern predicate operator .synb .mets >> @( function << arg *) .mets >> @( function << arg * >> @ avar << arg *) .mets >> @( function << arg * . 
<> @ avar ) .mets >> @(@ rvar >> ( function << arg *)) .mets >> @(@ rvar >> ( function << arg * >> @ avar << arg *)) .mets >> @(@ rvar >> ( function << arg * . <> @ avar )) .syne .desc Whenever the operator position of a pattern consists of a symbol which is neither the name of a pattern operator, nor the name of a macro, the expression denotes a predicate pattern. An expression is also a predicate pattern if it is handled by a pattern macro which declines to expand it by yielding the original expression. An operator pattern is expected to conform to one of the first three syntactic variations above. Together, these three variations constitute the .I "first form" of the pattern predicate operator. Whenever the operator position of a pattern consists of a meta-symbol, it is also a predicate pattern, expected to conform to one of the second three syntax variations above. These three variations constitute the .I "second form" of the operator. The first form of the predicate pattern consists of a compound form consisting of an operator and arguments. Exactly one of the arguments may be a pattern variable .meta avar ("argument variable") which must be a bindable symbol or else .codn nil . The pattern variable may also appear in the dot position, rather than as an argument. The role of .meta avar and the consequences of omitting it are described below. The second form of the predicate pattern consists of a meta-symbol .meta rvar ("result variable") which must be a bindable symbol or else .codn nil . This is followed by a compound form which consists of an operator symbol, followed by arguments, one of which may be a pattern variable .meta avar as in the first form. If .meta rvar is .codn nil , then the predicate pattern is equivalent to the first form. That is to say, the following are equivalent: .verb @(@nil (f ...)) <--> @(f ...) .brev The matching of the predicate pattern is processed as follows.
If the .meta avar variable is present, then the predicate pattern first binds the corresponding object to the .meta avar variable, performing an ordinary variable match with the potential back-referencing which that implies. If that succeeds, then the object is inserted into the compound form, substituted in the position indicated by the .mono .meti >> @ avar .onom variable, either an ordinary argument position or the dot position. This form is then evaluated. If it yields true, then the match is successful, otherwise the match fails. If the .meta avar variable is absent, then no initial variable matching takes place. The corresponding object is added as an extra rightmost argument into the compound form, which is evaluated. Its truth value then determines the success of the match, just like in the case with .metn avar . If the second form is being processed, and specifies a .meta rvar that is not .codn nil , and if the predicate has succeeded, then an extra processing step takes place. A variable match is performed to bind the .meta rvar variable to the result of the predicate, with potential back-referencing. If that match succeeds, then the predicate pattern succeeds. The compound form may be headed by the .code dwim operator, and therefore the DWIM bracket notation may be used. For instance .code "@[f @x]" is equivalent to .code "@(dwim f @x)" and is processed accordingly. Similarly, .code "@(@y [f @x])" is equivalent to .codn "@(@y (dwim f @x))" . The dot position of .meta avar in the predicate syntax denotes function application. That is to say, the pattern predicate form .code "(f . @a)" where .code @a is in the dotted position invokes the function .code f as if by evaluation of the form .code "(f . x)" where .code x is a hidden temporary variable holding the object corresponding to the pattern. The form .code "(f . x)" is a standard \*(TL notation with the same meaning as .codn "(apply (fun f) x)" .
If .meta avar is the .code nil symbol, then no variable is bound. The matched object is substituted into the predicate expression at the position indicated by .codn @nil . .TP* Examples: .verb (when-match (@(evenp) @(oddp @x)) '(2 3) x) -> 3 (when-match @(<= 1 @x 10) 4 x) -> 4 (when-match @(@d (chr-digit @c)) #\e5 (list d c)) -> (5 #\e5) (when-match @(<= 1 @x 10) 11 x) -> nil ;; use hash table as predicate: (let ((h #H(() (a 1) (b 2)))) (when-match @[h @x] 'a x)) -> a ;; as above, also capture hash value (let ((h #H(() (a 1) (b 2)))) (when-match @(@y [h @x]) 'a (list x y))) -> (a 1) ;; apply (1 2 3) to < using dot position (when-match @(@x (< . @sym)) '(1 2 3) (list x sym)) -> (t (1 2 3)) ;; Match three-element list whose middle element ;; is a number in the range 10 20, without ;; binding any variables: (when-match (@nil @(<= 10 @nil 20) @nil) obj (prinl "obj matches")) .brev .coNP Pattern Macro @ sme .synb .mets @(sme < spat < mpat < epat >> [ mvar <> [ evar ]]) .syne .desc The pattern macro .code sme (start, middle, end) is a notation defined using the .code defmatch macro. The .code sme macro generates a complex pattern which matches three non-overlapping parts of a list object using three patterns. The .meta spat pattern is required to match a prefix of the input list. If that match is successful, then the remainder of the list is searched for a match for .metn mpat , using the .code scan operator. If that match, in turn, is successful, then the suffix of the remainder of the list is required to match .codn epat . The optional .meta mvar and .meta evar arguments must be bindable symbols, if they are specified. These symbols specify lexical variables which are bound to, respectively, the object matched by .meta mpat and .metn epat , using the fresh binding semantics of the .code as pattern operator. The first two patterns, .meta spat and .metn mpat , must be possibly dotted list patterns. 
The last pattern, .metn epat , may be any pattern: it may be an atom match for the terminating atom, or a possibly dotted list pattern matching the list suffix. Important to the semantics of .code sme is the concept of the length of a list pattern. The length of a pattern with a pattern variable or operator in the dotted position is the number of items before that variable or operator. The length of .code "(1 2 . @(and a b))" is 2; likewise the length of .code "(1 2 . @nil)" is also 2. The length of a pattern which does not have a variable or operator in the dotted position is simply its list length. For instance, the pattern .code "(1 2 3)" has length 3, and so does the pattern .codn "(1 2 3 . 4)" . The length is determined by the list object structure of the pattern, and not the printed syntax used to express it. Thus, .code "(1 . (2 3))" is still a length 3 pattern, because it denotes the same .code "(1 2 3)" object, using the dot notation unnecessarily. The non-overlapping semantics of .code sme evolves as follows. In the following description, it is understood that a match is required at every step. If that match fails, then the entire .code sme operator fails: .RS .IP 1. First, .meta spat is required to match a prefix of the input list. If the match succeeds, then a .I "middle suffix" of the input is calculated by dropping from it leading elements. The number of elements dropped is equal to the length of .metn spat . .IP 2. The middle suffix is then searched for an occurrence of the middle pattern .metn mpat , as if using the .code scan pattern operator. All elements skipped by the search are dropped, until a match is found. .IP 3. At that point, if .meta mvar has been specified, it is bound to the remaining input, which still includes the part which just matched .metn mpat . .IP 4. Next, a number of elements equal to the length of .metn mpat , are dropped from the middle suffix, leaving a residue comprising the .IR "final suffix" . .IP 5.
The end pattern .meta epat must then match a suffix of the final suffix. .IP 6. If the .meta evar variable has been specified, it is bound to the entire suffix that was matched by .metn epat . .RE .TP* Examples: .verb (when-match @(sme (1 2) (3 4) (5 . 6) m e) '(1 2 3 4 5 . 6) (list m e)) -> ((3 4 5 . 6) (5 . 6)) (when-match @(sme (1 2) (3 4) (5 . 6) m e) '(1 2 abc 3 4 def 5 . 6) (list m e)) -> ((3 4 def 5 . 6) (5 . 6)) ;; backreferencing (when-match @(sme (1 @y) (@z @x @y @z) (@x @y)) '(1 2 3 1 2 3 1 2) (list x y z)) -> (1 2 3) ;; collect odd items starting at 3, before 7 (when-match @(and @(sme (1 @x) (3) (7) m e) @(with @(coll @(oddp @y)) (ldiff m e))) '(1 2 3 4 5 6 7) (list x y)) -> (2 (3 5)) ;; no overlap (when-match @(sme (1 2) (2 3) (3 4)) '(1 2 3 4) t) -> nil ;; The atom 5 is like a "zero-length improper list". (when-match @(sme () () 5) 5 t) -> t .brev .coNP Pattern Macro @ end .synb .mets @(end < pattern <> [ var ]) .syne .desc The pattern macro .code end is a notation defined using the .code defmatch macro, which matches .meta pattern against the suffix of a corresponding list object, which may be an improper list or atom. The optional argument .meta var specifies the name of a variable which captures the matched portion of the object. The .code end macro is related to the .code sme macro according to the following equivalence: .verb @(end pat var) <--> @(sme () () pat : var) .brev All of the requirements given for .code sme apply accordingly. .TP* Examples: .verb ;; atom match (when-match @(end 3 x) 3 x) -> 3 ;; y captures (2 3) (when-match @(end (2 @x) y) '(1 2 3) (list x y)) -> (3 (2 3)) ;; variable in dot position (when-match @(end (2 . @x) y) '(1 2 . 3) (list x y)) -> (3 (2 .
3)) ;; z captures entire object (when-match @(as z @(end (2 @x) y)) '(1 2 3) (list x y z)) -> (3 (2 3) (1 2 3)) .brev .SS* Pattern-Matching Macros .coNP Macros @, when-match @ match and @ if-match .synb .mets (when-match < pattern < expr << form *) .mets (match < pattern < expr << form *) .mets (if-match < pattern < expr < then-form <> [ else-form ]) .syne .desc The .codn when-match , .code match and .code if-match macros conditionally evaluate code based on whether the value of .meta expr matches .metn pattern . The .code when-match macro arranges for every .meta form to be evaluated in the scope of the variables established by .meta pattern when it matches the object produced by .metn expr . The value of the last .meta form is returned, or else .code nil if there are no forms. If the match fails, the forms are not evaluated, and .code nil is produced. The .code match macro behaves exactly like .code when-match when the match is successful. When the match fails, .code match throws an exception of type .codn match-error . The .code if-match macro evaluates .meta then-form in the scope of the variables established by .meta pattern if the match is successful, and yields the value of that form. Otherwise, it evaluates .metn else-form , which defaults to .code nil if it is not specified. .coNP Macros @ match-case and @ match-ecase .synb .mets (match-case < expr >> {( pattern << form *)}*) .mets (match-ecase < expr >> {( pattern << form *)}*) .syne .desc The .code match-case macro matches the value of .meta expr against zero or more patterns. Normally, the patterns are considered in left-to-right order. If the value .meta expr matches more than one .metn pattern , the leftmost .meta pattern is selected and that clause is evaluated. Under certain conditions, detailed below, it is possible for .code match-case and .code match-ecase to be transformed into a .code casequal form.
In that case, if there are multiple clauses with equivalent patterns, it is not specified which one is evaluated. The syntax of .code match-case consists of an expression .meta expr followed by zero or more clauses. Each clause is a compound expression whose first element is .metn pattern , which is followed by zero or more forms. First, .meta expr is evaluated. Then, the value is matched against each .meta pattern in succession, stopping at the first pattern which provides a successful match. If no pattern provides a successful match, then .code match-case terminates and returns .codn nil . If a .meta pattern matches successfully, then each .meta form associated with the pattern is evaluated in the scope of the variable bindings established by that .metn pattern . Then .code match-case terminates, returning the value of the last .meta form or else .code nil if there are no forms. The .code match-ecase macro differs from .code match-case as follows. When none of the clauses match under .codn match-case , then that form terminates with a value of .codn nil . In the same situation, the .code match-ecase form throws an exception of type .codn match-error . A .code match-ecase form may be transformed to a .code casequal form if all the .metn pattern s are trivial. A trivial pattern is either an atom, or else a vector or list expression containing no variables. A .code match-case form may be transformed to a .code casequal form under the same conditions as .codn match-ecase . Additionally, .code match-case may also be transformed if it contains exactly one clause which matches any object by means of the key .code @nil or else a variable match such as .codn @abc , if that clause appears last. That clause is transformed into an .meta else-clause of the .code casequal form.
.TP* Examples: .verb ;; classify sequence of objects by pattern matching, ;; returning a list of the results (collect-each ((obj (list '(1 2 3) '(4 5) '(3 5) #S(time year 2021 month 1 day 1) #(vec tor)))) (match-case obj (@(struct time year @y) y) (#(@x @y) (list x y)) ((@nil @nil @x) x) ((4 @x) x) ((@x 5) x))) --> (3 5 3 2021 (vec tor)) ;; default case can be represented by a guaranteed match (match-case 1 (2 :two) (@x :default)) --> :default .brev .coNP Macro @ match-cond .synb .mets (match-cond >> {( pattern < expr << form *)}*) .syne .desc The .code match-cond macro's arguments are zero or more clauses, each of which specifies a .metn pattern , an expression .metn expr , and zero or more .metn form s. The clauses are processed in order. Successive .metn expr s are evaluated, and matched against their corresponding pattern. If there is no match, processing continues with the next clause. If no match is found in any clause, the .code match-cond form terminates, returning .codn nil . If an .metn expr 's value matches the corresponding .metn pattern , then every .code form is evaluated in scope of the variables established by the pattern. The .code match-cond form then terminates, yielding the value of the last .codn form , or else the value of .meta expr if there are no .codn form s. Note: the pattern .code "(t t ...)" is recommended for specifying an unconditionally matching clause. .TP* Example: .verb (let ((x 42)) (match-cond (`@x-73` "73-73" :a) (`@x-@y` "42-24" y))) --> "24" .brev .coNP Macro @ lambda-match .synb .mets (lambda-match >> {( pattern << form *)}*) .syne .desc The .code lambda-match is conceptually similar to .codn match-case . The arguments of .code lambda-match are zero or more clauses similar to those of .codn match-case , each consisting of a compound expression headed by a .meta pattern followed by zero or more .metn form s. The macro generates a .code lambda expression which evaluates to an anonymous function in the usual way.
When the anonymous function is called, each clause's .meta pattern is matched against the function's actual arguments. When a match occurs, each .meta form associated with the .meta pattern is evaluated, and the value of the last .meta form becomes the return value of the function. If none of the clauses match, then .code nil is returned. Whenever .meta pattern is a list-like pattern, it is not matched against a list object, as is the usual case with a list-like pattern, but against the actual arguments. For instance, the pattern .code "(@a @b @c)" expects that the function was called with exactly three arguments. If that is the case, the patterns are then matched to the arguments. The pattern .code @a takes the first argument, binding it to variable .code a and so forth. If .meta pattern is a dotted list-like pattern, then the dot position is matched against the remaining arguments. For instance, the pattern .code "(@a @b . @c)" requires at least two arguments. The first two are bound to .code a and .codn b , respectively. The list of remaining arguments, if any, is bound to .codn c , which will be .code nil if there are no remaining arguments. Any non-list-like .meta pattern .code P is analyzed as an equivalent list-like dotted pattern due to .code P syntax being equivalent to .code "(. P)" syntax. Such a pattern matches the list of all arguments. Thus, the following are all equivalent: .verb (lambda-match (@a a)) (lambda-match ((. @a) a)) (lambda a a) (lambda (. a) a) .brev The characteristics of the resulting anonymous function are determined as follows. If at least one .meta pattern specified in a .code lambda-match is a dotted pattern, the function is variadic. The arity of the resulting anonymous function is determined as follows, from the lengths of the patterns. The length of a pattern is the number of elements, not including the dotted element. The length of the longest pattern determines the number of fixed arguments.
Unless the function is variadic, it may not be called with more arguments than can be matched by the longest pattern. The length of the shortest pattern determines the number of required arguments. The function may not be called with fewer arguments than can be matched by the shortest pattern. If these two lengths are unequal, then the function has a number of optional arguments, equal to the difference. Note: an anonymous function which takes one argument and matches that object against clauses using .code match-case can be obtained with the .code do operator, using the pattern: .codn "(do match-case @1 ...)" . Note: the parameter macro .code :match can also define a .code lambda with pattern matching. Any .code "(lambda-match clauses ...)" form can be written as .codn "(lambda (:match) clauses ...)" . The parameter macro offers the additional ability of defining named arguments which are inserted before the implicit arguments generated from the clauses, and combining with other parameter macros. .TP* Examples: .verb (let ((f (lambda-match (() (list 0 :args)) ((@a) (list 1 :arg a)) ((@a @b) (list 2 :args a b)) ((@a @b . @c) (list* '> 2 :args a b c))))) (list [f] [f 1] [f 1 2] [f 1 2 3])) --> ((0 :args) (1 :arg 1) (2 :args 1 2) (> 2 :args 1 2 3)) [(lambda-match ((0 1) :zero-one) ((1 0) :one-zero) ((@x @y) :no-match)) 1 0] --> :one-zero [(lambda-match ((0 1) :zero-one) ((1 0) :one-zero) ((@x @y) :no-match)) 1 1] --> :no-match [(lambda-match ((0 1) :zero-one) ((1 0) :one-zero) ((@x @y) :no-match)) 1 2 3] --> ;; error .brev .coNP Macro @ defun-match .synb .mets (defun-match < name >> {( pattern << form *)}*) .syne .desc The .code defun-match macro can be used to define a top-level function in the style of .codn lambda-match . It produces a form which has all of the properties of .codn defun , such as a block of the same .meta name being established around the implicit .code match-case so that .code return-from is possible.
The .mono .meti >> ( pattern << form *) .onom clauses of .code defun-match have exactly the same syntax and semantics as those of .codn lambda-match . Note: instead of .codn defun-match , the parameter macro .code :match may be used. The following equivalence holds: .verb (defun name (:match) ...) <--> (defun-match name ...) .brev The parameter macro offers the additional ability of defining named arguments which are inserted before the implicit arguments generated from the clauses, and combining with other parameter macros. .TP* Examples: .verb ;; Fibonacci (defun-match fib ((0) 1) ((1) 1) ((@x) (+ (fib (pred x)) (fib (ppred x))))) (fib 0) -> 1 (fib 1) -> 1 (fib 2) -> 2 (fib 3) -> 3 (fib 4) -> 5 (fib 5) -> 8 ;; Ackermann (defun-match ack ((0 @n) (+ n 1)) ((@m 0) (ack (- m 1) 1)) ((@m @n) (ack (- m 1) (ack m (- n 1))))) (ack 3 7) -> 1021 (ack 1 1) -> 3 (ack 2 2) -> 7 .brev .coNP Parameter List Macro @ :match .synb .mets (:match << left-param * [-- << extra-param *]) << clause * .syne .desc Parameter list macro .code :match allows any function to be expressed in the style of .codn lambda-match , with extra features. The .code :match macro expects the body of the function to consist of .code lambda-match clauses, which are semantically treated in exactly the same manner as under .codn lambda-match . The following restrictions apply. The parameter list may not include optional parameters delimited by .code : (the colon keyword symbol). The parameter list may not be dotted. The macro produces a function in which the .meta left-param parameters, if any, are inserted to the left of the implicit parameters generated by the .code lambda-match transformation. Furthermore, the .code :match parameter macro supports integration with the .code :key parameter macro, or any other macro which uses a compatible .code -- convention for delimiting special arguments.
If the parameter list includes the symbol .code -- then that portion of the parameter list is set aside and not included in the .code lambda-match transformation. Then, that list is integrated into the resulting lambda. A complete transformation can be described by the following diagram: .verb (lambda (:match a b c ... -- s t u ...) clauses ...) --> (lambda (a b c ... m n p ... -- s t u ... . z) body ...) .brev In this diagram, .code "a b c ..." denote the .meta left-param parameters. The .code "m n p ..." symbols denote the fixed parameters generated by the .code lambda-match transformation from the semantic analysis of .metn clauses . The .code "s t u ..." symbols denote the original .meta extra-param parameters. Finally, .code z denotes the dotted parameter generated by the .code lambda-match transform. If the transform produces no dotted parameter, then this is .codn nil . The dotted parameter is thus separated from the .code "m n p ..." group to which it belongs. When no .code -- and .meta extra-params are present, the transformation reduces to: .verb (lambda (:match a b c ...) clauses ...) --> (lambda (a b c ... m n p ... . z) body ...) .brev Note: these requirements harmonize with the .code :key parameter macro. If that is present to the left of .code :match it removes the .code -- and the .code "s t u ..." keyword parameters, reuniting the .code z parameter with the .code "m n p" group. Furthermore, the .code :key macro generates code which refers to the existing .code z dotted parameter as the source for the keyword parameters, unless .code z is .codn nil , in which case it inserts its own generated symbol. .TP* Examples: .verb ;; Match-style cond-like macro with unreachability diagnosis. ;; Demonstrates usefulness of :match, which allows the :form ;; parameter to be promoted through to the macro definition. (defmacro my-cond (:match :form f) (() nil) (((@(and @(constantp @test) @(eval))) . 
@rest) (when rest (compile-error f "unreachable code after ~s" test)) test) (((@(and @(constantp @test) @(eval)) . @forms) . @rest) (when (and rest) (compile-error f "unreachable code after ~s" test)) ^(progn ,*forms)) (((@test) . @rest) ^(or ,test (my-cond ,*rest))) (((@test . @forms) . @rest) ^(if ,test (progn ,*forms) (my-cond ,*rest))) ((@else . @rest) (compile-error f "bad syntax"))) (my-cond (3)) --> 3 (my-cond (3 4)) --> 4 (my-cond (3 4) (5)) --> ;; my-cond: unreachable code after 3 (my-cond 42) --> ;; my-cond: bad syntax .brev .verb ;; Keyword parameter example. (defstruct simple-widget () name) (defstruct widget (simple-widget) frobosity luminance) (defstruct simple-point-widget (simple-widget) (:static width 0) (:static height 0)) (defstruct point-widget (widget) (:static width 0) (:static height 0)) (defstruct general-widget (widget) width height) ;; Note that in clauses with no . @rest parameter, there ;; is a mismatch if keyword arguments are present. The (0 0) ;; clause exploits this to match only when keywords are absent. (defun make-widget (:key :match name -- frob lum) ((0 0) (new simple-point-widget name name)) ((0 0 . @rest) (new point-widget name name frobosity frob luminance lum)) ((@x @y . @rest) (new general-widget name name width x height y frobosity frob luminance lum))) (make-widget "abc" 0 0) --> #S(simple-point-widget name "abc") (make-widget "abc" 0 0 :frob 42) --> #S(point-widget name "abc" frobosity 42 luminance nil) (make-widget "abc" 0 0 :lum 9) --> #S(point-widget name "abc" frobosity nil luminance 9) (make-widget "abc" 0 1 :lum 9) --> #S(general-widget name "abc" frobosity nil luminance 9 width 0 height 1) .brev .coNP Macro @ defmatch .synb .mets (defmatch < name < macro-style-params .mets \ \ << body-form *) .syne .desc The .code defmatch macro allows for the definition of pattern macros: user-defined pattern operators which are implemented via expansion into existing operator syntax.
The .code defmatch macro has the same syntax as .codn defmacro . It specifies a macro transformation for a compound form which has the .meta name symbol in its leftmost position. This macro transformation is performed when .meta name is used as a pattern operator: an expression of the form .mono .meti >> @( name << argument *) .onom occurring in pattern-matching syntax. The behavior is unspecified if .meta name is the name of a built-in pattern operator, or a predefined pattern macro. The pattern macro bindings are stored in a hash table held by the variable .code *match-macro* whose keys are symbols, and whose values are expander functions. There are no lexically scoped pattern macros. Pattern macros defined with .code defmatch may specify the special macro parameters .code :form and .code :env in their parameter lists. The values of these parameters are determined in a manner particular to .codn defmatch . The .code :form parameter captures the pattern-matching form, or a constituent thereof, in which the macro is being invoked. For instance, if the operator is being used inside a pattern given to a .code when-match macro invocation, then the form will be that entire .code when-match form. The .code :env parameter captures a specially constructed macro-time environment object in which all of the variables to the left of the pattern appear as lexical variables. The parent of this environment is the surrounding macro environment. If the pattern macro needs to treat a variable which already has a binding differently from an unbound variable, it can look up the variable in this environment. .TP* Example: .verb ;; Create an alias called let for the @(as var pattern) operator: ;; Note that the macro produces @(as ...) and not just (as ...)
(defmatch let (var pattern) ^@(as ,var ,pattern)) ;; use the macro in matching: (when-match @(let x @(or foo bar)) 'foo x) ;; Error reporting example using :form (defmatch foo (sym :form f) (unless (bindable sym) (compile-error f "~s: bindable symbol expected, not ~s" 'foo sym)) ...) ;; Pattern macro which uses = equality to backreference ;; an existing lexical binding, or else binds the variable ;; if it has no existing lexical binding. (defmatch var= (sym :env e) (if (lexical-var-p e sym) (with-gensyms (obj) ^@(require (sys:var ,obj) (= ,sym ,obj))) ^(sys:var ,sym))) ;; example use: (when-match (@(var= a) @(var= a)) '(1 1.0) a) -> 1 ;; no match: (equal 1 1.0) is false (when-match (@a @a) '(1 1.0) a) -> nil .brev .coNP Function @ macroexpand-match .synb .mets (macroexpand-match < pattern <> [ env ]) .syne .desc If .meta pattern is a compound form whose operator symbol has been defined as a pattern macro using .codn defmatch , then .code macroexpand-match will expand that pattern and return the expansion. Otherwise it returns the .meta pattern argument. In order to be recognized by .code macroexpand-match the .meta pattern argument must not include the .code @ prefix that would normally be used to invoke it. The expansion, however, will include that syntax. The .meta env parameter specifies the macro-time environment for the expander. Note: pattern expanders, like built-in patterns, may use the macro environment for deciding whether a variable is an existing lexical variable, or a free variable, based on which a pattern may be expanded differently. .TP* Example: Given: .verb (defmatch point (x y) ^@(struct point x @,x y @,y)) .brev a result similar to the following may be obtained: .verb (macroexpand-match '(point a b)) -> @(struct point x @a y @b) .brev Note that the pattern is specified plainly as .code "(point a b)" rather than .codn "@(point a b)" , yet the expansion is .codn "@(struct ...)" .
.coNP Special Variable @ *match-macro* .desc The .code *match-macro* special variable holds the hash table of associations between symbols and pattern macro expanders. If the expression .code "[*match-macro* 'sym]" yields a function, then symbol .code sym has a binding as a pattern macro. If that expression yields .codn nil , then there is no such binding: pattern operator forms based on .code sym do not undergo pattern macro expansion. The macro expanders in .code *match-macro* are two-parameter functions. The first argument passes the operator syntax to be expanded. The second argument is used for passing the environment object which the expander can capture using .code :env in its macro parameter list. .coNP Macros @ each-match and @ each-match-product .synb .mets (each-match >> ({ pattern << seq-form }*) << body-form *) .mets (each-match-product >> ({ pattern << seq-form }*) << body-form *) .syne .desc The .code each-match macro arranges for elements from multiple sequences to be visited in parallel, and each to be matched against respective patterns. For each matching tuple of parallel elements, a body of forms is evaluated in the scope of the variables bound in the patterns. The first argument of .code each-match specifies a list of alternating .meta pattern and .meta seq-form expressions. Each .meta pattern is associated with the sequence which results from evaluating the immediately following .metn seq-form . Items coming from that sequence correspond with that pattern. The remaining arguments are .metn body-form s to be evaluated for successful matches. The .metn body-form s are surrounded by an implicit anonymous block. If any of the forms uses .code return to perform a return out of this block, then the iteration terminates, and the result value of the block becomes the result value of the loop. The processing takes place as follows: .RS .IP 1.
Every .meta seq-form is evaluated in left-to-right order and is expected to produce an iterable sequence or object that would be a suitable argument to .code mapcar or .codn iter-begin . This evaluation takes place in the scope surrounding the macro form, in which none of the variables that are bound in the .meta pattern expressions are yet visible. .IP 2. The next available item is taken from each of the sequences. If any of the sequences has no more items available, then .code each-match terminates and returns .codn nil . .IP 3. Each item taken in step 2 is matched against the .meta pattern which corresponds with its sequence. Each successive pattern can refer to the variables bound in the previous patterns in the same iteration. If any pattern match fails, then the process continues with step 2. .IP 4. If all the matches are successful, then .metn body-form s, if any, are executed in the scope of variables bound in the .metn pattern s. Processing then continues at step 2. .RE .IP The .code each-match-product differs from .code each-match in that instead of taking parallel tuples of items from the sequences, it iterates over the tuples of the Cartesian product of the sequences similarly to the .code maprod function. The product tuples are ordered in such a way that the rightmost element, which always comes from the sequence produced by the last .metn seq-form , varies the fastest. If there are two sequences .code "(1 2)" and .codn "(a b)" , then .code each-match iterates over the tuples .code "(1 a)" and .codn "(2 b)" , whereas .code each-match-product iterates over .codn "(1 a)" , .codn "(1 b)" , .code "(2 a)" and .codn "(2 b)" . .TP* Examples: .verb ;; Number all the .JPG files in the current directory. ;; For instance foo.jpg becomes foo-0001.jpg, if it is ;; the first file.
(each-match (@(as name `@base.jpg`) (glob "*.jpg") @(@num (fmt "~,04a")) 1) (rename-path name `@base-@num.jpg`)) ;; Iterate over combinations of matching phone ;; numbers and odd integers from the (1 2 3) list (build (each-match-product (`(@a) @b-@c` '("x" "" "(311) 555-5353" "(604) 923-2323" "133" "4-5-6-7") @(oddp @x) '(1 2 3)) (add (list x a b c)))) --> ((1 "311" "555" "5353") (3 "311" "555" "5353") (1 "604" "923" "2323") (3 "604" "923" "2323")) .brev .coNP Macros @ append-matches and @ append-match-products .synb .mets (append-matches >> ({ pattern << seq-form }*) << body-form *) .mets (append-match-products >> ({ pattern << seq-form }*) << body-form *) .syne .desc The macro .code append-matches is subject to all of the requirements specified for .code each-match in regard to the argument conventions and semantics, and the presence of the implicit anonymous block around the .metn body-form s. Whereas .code each-match returns .codn nil , the .code append-matches macro requires, in each iteration which produces a match for each .metn pattern , that the last .meta body-form evaluated must produce a list. These lists are catenated together as if by the .code append function and returned. It is unspecified whether the nonmatching iterations produce empty lists which are included in the append operation. If the last tuple of items which produces a match is absolutely the last tuple, the corresponding .meta body-form evaluation may yield an atom which then becomes the terminator for the returned list, in keeping with the semantics of .codn append . The .code append-match-products macro differs from .code append-matches in that it iterates over the Cartesian product tuples of the sequences, rather than parallel tuples. The difference is exactly like that between .code each-match and .codn each-match-product .
.TP* Examples: .verb (append-matches ((:foo @y) '((:foo a) (:bar b) (:foo c) (:foo d)) (@x :bar) '((1 :bar) (2 :bar) (3 :bar) (4 :foo))) (list x y)) --> (1 a 3 c) (append-matches (@x '((1) (2) (3) 4)) x) --> (1 2 3 . 4) (append-match-products (@(oddp @x) (range 1 5) @(evenp @y) (range 1 5)) (list x y)) --> (1 2 1 4 3 2 3 4 5 2 5 4) .brev .coNP Macros @ keep-matches and @ keep-match-products .synb .mets (keep-matches >> ({ pattern << seq-form }*) << body-form *) .mets (keep-match-products >> ({ pattern << seq-form }*) << body-form *) .syne .desc The macro .code keep-matches is subject to all of the requirements specified for .code each-match in regard to the argument conventions and semantics, and the presence of the implicit anonymous block around the .metn body-form s. Whereas .code each-match returns .codn nil , the .code keep-matches macro returns a list of the values produced by all matching iterations which led to the execution of the .metn body-form s. The .code keep-match-products macro differs from .code keep-matches in that it iterates over the Cartesian product tuples of the sequences, rather than parallel tuples. The difference is exactly like that between .code each-match and .codn each-match-product . .TP* Examples: .verb (keep-matches ((:foo @y) '((:foo a) (:bar b) (:foo c) (:foo d)) (@x :bar) '((1 :bar) (2 :bar) (3 :bar) (4 :foo))) (list x y)) --> ((1 a) (3 c)) (keep-match-products (@(oddp @x) (range 1 5) @(evenp @y) (range 1 5)) (list x y)) --> ((1 2) (1 4) (3 2) (3 4) (5 2) (5 4)) .brev .coNP Macro @ while-match .synb .mets (while-match < pattern < expr << form *) .syne .desc The .code while-match macro evaluates .meta expr and matches it against .meta pattern similarly to .codn when-match . If the match is successful, every .meta form is evaluated in an environment in which new bindings from .meta pattern are visible. In this case, the process repeats: .meta expr is evaluated again, and tested against .metn pattern . 
If the match fails, .code while-match terminates and produces .code nil as its result value. Each iteration produces fresh bindings for any variables that are implicated for binding in .metn pattern . The .meta expr and .meta form expressions are surrounded by an anonymous block. .coNP Macros @ while-match-case and @ while-true-match-case .synb .mets (while-match-case < expr >> {( pattern << form *)}*) .mets (while-true-match-case < expr >> {( pattern << form *)}*) .syne .desc The macros .code while-match-case and .code while-true-match-case combine iteration with the semantics of .codn match-case . The .code while-match-case macro evaluates .meta expr and matches it against zero or more clauses in the manner of .codn match-case . If there is a match, this process is repeated. If there is no match, .code while-match-case terminates, and returns .codn nil . In each iteration, the matching clause produces fresh bindings for any variables implicated for binding in its respective .metn pattern . The .meta expr and .meta form expressions are surrounded by an anonymous block. The .code while-true-match-case macro is identical in almost every respect to .codn while-match-case , except that it terminates the loop if .meta expr evaluates to .codn nil , without attempting to match that value against the clauses. Note: the semantics of .code while-true-match-case can be obtained in .code while-match-case by inserting a .code return clause. That is to say, a construct of the form .verb (while-true-match-case expr ...) .brev may be rewritten into .verb (while-match-case expr (nil (return)) ;; match nil and return ...) .brev except that .code while-true-match-case isn't required to rely on performing a block return. .SS* Quasiquote Operator Syntax .coNP Macro @ qquote .synb .mets (qquote << form ) .syne .desc The .code qquote (quasi-quote) macro operator implements a notation for convenient list construction.
If .meta form is an atom, or a list structure which does not contain any .code unquote or .code splice operators, then .mono .meti (qquote << form ) .onom is equivalent to .mono .meti (quote << form ). .onom If .metn form , however, is a list structure which contains .code unquote or .code splice operators, then the substitutions implied by those operators are performed on .metn form , and the .code qquote operator returns the resulting structure. Note: the .code qquote operator actually works by being compiled into code. It becomes a Lisp expression which, when evaluated, computes the resulting structure. A .code qquote can contain another .codn qquote . If an .code unquote or .code splice operator occurs within a nested .codn qquote , it belongs to that .codn qquote , and not to the outer one. However, an unquote operator which occurs inside another one belongs one level higher. For instance in .verb (qquote (qquote (unquote (unquote x)))) .brev the leftmost .code qquote belongs with the rightmost unquote, and the inner .code qquote and .code unquote belong together. When the outer .code qquote is evaluated, it will insert the value of .codn x , resulting in the object .codn "(qquote (unquote [value-of-x]))" . If this resulting qquote value is evaluated again as Lisp syntax, then it will yield .codn [value-of-value-of-x] , the value of .code [value-of-x] when treated as a Lisp expression and evaluated. .TP* Examples: .verb (qquote a) -> a (qquote (a b c)) -> (a b c) (qquote (1 2 3 (unquote (+ 2 2)) (+ 2 3))) -> (1 2 3 4 (+ 2 3)) (qquote (unquote (+ 2 2))) -> 4 .brev In the second-to-last example, the .code "1 2 3" and the .code "(+ 2 3)" are quoted verbatim, whereas the .code "(unquote (+ 2 2))" operator causes the evaluation of .code "(+ 2 2)" and the substitution of the resulting value. The last example shows that .meta form itself (the entire argument of .codn qquote ) can be an unquote operator.
However, note: .code "(qquote (splice form))" is not valid. Note: a way to understand the nesting behavior is via a possible model of quasi-quote expansion which recursively compiles any nested quasi-quotes first, and then treats the result of their expansion. For instance, in the processing of .verb (qquote (qquote (unquote (unquote x)))) .brev the .code qquote operator first encounters the embedded .code "(qquote ...)" and compiles it to code. During that recursive compilation, the syntax .code "(unquote (unquote x))" is encountered. The inner .code qquote processes the outer unquote which belongs to it, and the inner .code "(unquote x)" becomes material that is embedded verbatim in the compilation, which will then be found when the recursion pops back to the outer quasiquote, which will then traverse the result of the inner compilation and find the .codn "(unquote x)" . .TP* "Dialect Note:" In Lisp dialects which have a published quasiquoting operator syntax, there is the expectation that the quasiquote read syntax corresponds to it. That is to say, the read syntax .code "^(a b ,c)" is expected to translate to .codn "(qquote (a b (unquote c)))" . In \*(TL, this is not true! Although .code "^(a b ,c)" is translated to a quasiquoting macro, it is an internal one, not based on the public .codn qquote , .code unquote and .code splice symbols being documented here. This design exists for hygiene. The quasiquote read syntax is not confused by the presence of the symbols .codn qquote , .code unquote or .code splice in the template, since it doesn't treat them specially. This also allows programmers to use the quasiquote read syntax to construct quasiquote macros. For instance .verb ^(qquote (unquote ,x)) ;; does not mean ^^,,x ! .brev To the quasiquote reader, the .code qquote and .code unquote symbols mean nothing special, and so this syntax simply means that if the value of .code x is .codn foo , the result of evaluating this expression will be .codn "(qquote (unquote foo))" .
The form's expansion is actually this: .verb (sys:qquote (qquote (unquote (sys:unquote x)))) .brev The .code sys:qquote macro recognizes .code sys:unquote embedded in the form, and the other symbols not in the .code sys: package are just static template material. The .code sys:qquote macro and its associated .code sys:unquote and .code sys:splice operators work exactly like their ordinary counterparts. So in effect, \*(TX has two nearly identical, independent quasi-quote implementations, one of which is tied to the read syntax, and one of which isn't. This is useful for writing quasiquotes which write quasiquotes. .coNP Operator @ unquote .synb .mets (qquote (... (unquote << form ) ...)) .mets (qquote (unquote << form )) .syne .desc The .code unquote operator is not an operator .I per .IR se . The .code unquote symbol has no binding in the global environment. It is a special syntax that is recognized within a .code qquote form, to indicate forms within the quasiquote which are to be evaluated and inserted into the resulting structure. The syntax .mono .meti (qquote (unquote << form )) .onom is equivalent to .metn form : the .code qquote and .code unquote "cancel out". .coNP Operator @ splice .synb .mets (qquote (... (splice << form ) ...)) .syne .desc The .code splice operator is not an operator .I per .IR se . The .code splice symbol has no binding in the global environment. It is a special syntax that is recognized within a .code qquote form, to indicate forms within the quasiquote which are to be evaluated and inserted into the resulting structure. The syntax .mono .meti (qquote (splice << form )) .onom is not permitted and raises an exception if evaluated. The .code splice syntax must occur within a list, and not in the dotted position. The .code splice form differs from unquote in that .mono .meti (splice << form ) .onom requires that .meta form evaluate to a list. That list is integrated into the surrounding list.
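.TP* Example:

The following sketch contrasts
.code splice
with
.codn unquote ,
assuming only the semantics described above: the list computed by
.code "(list 2 3)"
is either merged into the surrounding list, or inserted into it as a single
element.

.verb
  ;; splice: the elements 2 and 3 are merged into the enclosing list
  (qquote (1 (splice (list 2 3)) 4)) -> (1 2 3 4)

  ;; unquote: the list (2 3) is inserted as one element
  (qquote (1 (unquote (list 2 3)) 4)) -> (1 (2 3) 4)
.brev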
.SS* Math Library The following documentation describes the behavior of the Math Library functions as they apply to the native numeric and character types. The functions also support application-defined structure types. That feature is not described here but in the section User-Defined Arithmetic Types. When one or more operands of a Math Library function are user-defined arithmetic structures, no conversions are performed on the operands, and the stated restrictions do not apply. The operands are passed to the methods as described in the User-Defined Arithmetic Types section. The operands need not be numeric. User-defined arithmetic structures can work with operands which are not numbers. If .code a is such a type, it is possible for an expression such as .code "(+ a \(dqabc\(dq)" to be meaningful and correct. Similarly, it is possible for an apparent division by zero such as .code "(/ a 0)" to be meaningful and correct, since the .code / method of the .code a object decides how to handle zero. .coNP Functions @, + @ - and @ * .synb .mets (+ << number *) .mets (- < number << number *) .mets (* << number *) .syne .desc The .codn + , .code - and .code * functions perform addition, subtraction and multiplication, respectively. Additionally, the .code - function performs additive inverse. The .code + function requires zero or more arguments. When called with no arguments, it produces 0 (the identity element for addition), otherwise it produces the sum over all of the arguments. Similarly, the .code * function requires zero or more arguments. When called with no arguments, it produces 1 (the identity element for multiplication). Otherwise it produces the product of all the arguments. The semantics of .code - changes from subtraction to additive inverse when there is only one argument. The argument is treated as a subtrahend, against an implicit minuend of zero. When there are two or more arguments, the first one is the minuend, and the remaining ones are subtrahends.
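
For illustration, the following evaluations are a small sketch of the
zero-argument, one-argument and multi-argument cases just described:

.verb
  (+) -> 0          ;; identity element for addition
  (*) -> 1          ;; identity element for multiplication
  (+ 1 2 3) -> 6    ;; sum over all of the arguments
  (* 2 3 4) -> 24   ;; product of all the arguments
  (- 5) -> -5       ;; one argument: additive inverse
  (- 10 3) -> 7     ;; minuend 10, subtrahend 3
.brev
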
When there are three or more operands, these operations are performed as if by binary operations, in a left-associative way. That is to say, .code "(+ a b c)" means .codn "(+ (+ a b) c)" . The sum of .code a and .code b is computed first, and then this is added to .codn c . Similarly .code "(- a b c)" means .codn "(- (- a b) c)" . First, .code b is subtracted from .codn a , and then .code c is subtracted from that result. The additive inverse is performed as if it were subtraction from integer 0. That is, .code "(- x)" means the same thing as .codn "(- 0 x)" . The operands of .codn + , .code - and .code * can be characters, integers (fixnum and bignum), and floats, in nearly any combination. If two operands have different types, then one of them is converted to the type of the one with the higher rank, according to this ranking: character < integer < float. For instance if one operand is integer, and the other float, the integer is converted to a float. .TP* Restrictions: Characters are not considered numbers, and participate in these operations in limited ways. Subtraction can be used to compute the displacement between the Unicode values of characters, and an integer displacement can be added to a character, or subtracted from a character. For instance .code "(- #\e9 #\e0)" is .codn 9 . The Unicode value of a character .code C can be found using .codn "(- C #\ex0)" : the displacement from the NUL character. The rules can be stated as a set of restrictions: .RS .IP 1. Two characters may not be added together. .IP 2. A character may not be subtracted from an integer (which also rules out the possibility of computing the additive inverse of a character). .IP 3. A character operand may not be opposite to a floating point operand in any operation. .IP 4. A character may not be an operand of multiplication. .RE .coNP Function @ / .synb .mets (/ << divisor ) .mets (/ < dividend << divisor *) .syne .desc The .code / function performs floating-point division.
Each operand is first converted to floating-point type, if necessary. In the one-argument form, the .meta dividend argument is omitted. An implicit dividend is present, whose value is .codn 1.0 , such that the one-argument form .code "(/ x)" is equivalent to the two-argument form .codn "(/ 1.0 x)" . If there are two or more arguments, explicitly or by the above equivalence, then a cumulative division is performed. The .meta dividend value is taken into consideration, and divided by the first .metn divisor . If another .meta divisor follows, then the resulting quotient is divided by that subsequent divisor. This process repeats until all divisors are exhausted, and the value of the last division is returned. A division by zero throws an exception of type .codn numeric-error . .coNP Functions @ sum and @ prod .synb .mets (sum < sequence <> [ keyfun ]) .mets (prod < sequence <> [ keyfun ]) .syne .desc The .code sum and .code prod functions operate on an effective sequence of numbers derived from .metn sequence , which is an object suitable for iteration according to .codn seq-begin . If the .meta keyfun argument is omitted, then the effective sequence is the .meta sequence argument itself. Otherwise, the effective sequence is understood to be a projection mapping of the elements of .meta sequence through .meta keyfun as would be calculated by the .mono .meti (mapcar < keyfun << sequence ) .onom expression. The .code sum function returns the left-associative sum of the elements of the effective sequence calculated as if using the .code + function. Similarly, the .code prod function calculates the left-associative product of the elements of the sequence as if using the .code * function. If .meta sequence is empty then .code sum returns .code 0 and .code prod returns .codn 1 . If the effective sequence contains one number, then both functions return that number.
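.TP* Examples:

The following sketch illustrates the cases described above. The use of the
.code square
function as
.meta keyfun
is merely an illustrative choice.

.verb
  (sum '(1 2 3 4)) -> 10
  (prod '(1 2 3 4)) -> 24

  ;; empty sequence: identity elements
  (sum nil) -> 0
  (prod nil) -> 1

  ;; sum of squares: elements are projected through square
  [sum '(1 2 3) square] -> 14
.brev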
.coNP Functions @ wrap and @ wrap* .synb .mets (wrap < start < end << number ) .mets (wrap* < start < end << number ) .syne .desc The .code wrap and .code wrap* functions reduce .meta number into the range specified by .meta start and .metn end . Under .code wrap the range is inclusive of the .meta end value, whereas under .code wrap* it is exclusive. The following equivalence holds: .verb (wrap a b c) <--> (wrap* a (succ b) c) .brev The expression .code "(wrap* x0 x1 x)" performs the following calculation: .mono .mets (+ (mod (- x x0) (- x1 x0)) x0) .onom In other words, first .meta start is subtracted from .metn number . Then the result is reduced modulo the displacement between .meta start and .metn end . Finally, .meta start is added back to that result, which is returned. .TP* Example: .verb ;; perform ROT13 on the string "nop" [mapcar (opip (+ 13) (wrap #\ea #\ez)) "nop"] -> "abc" .brev .coNP Functions @ gcd and @ lcm .synb .mets (gcd << number *) .mets (lcm << number *) .syne .desc The .code gcd function computes the greatest common divisor: the largest positive integer which divides each .metn number . The .code lcm function computes the lowest common multiple: the smallest positive integer which is a multiple of each .metn number . Each .meta number must be an integer. Negative integers are replaced by their absolute values, so .code "(lcm -3 -4)" is .code 12 and .code "(gcd -12 -9)" yields .codn 3 . The value of .code (gcd) is .code 0 and that of .code (lcm) is .codn 1 . The value of .code "(gcd x)" and .code "(lcm x)" is .codn "(abs x)" . Any arguments of .code gcd which are zero are effectively ignored so that .code "(gcd 0)" and .code "(gcd 0 0 0)" are both the same as .code (gcd) and .code "(gcd 1 0 2 0 3)" is the same as .codn "(gcd 1 2 3)" . If .code lcm has any argument which is zero, it yields zero. .coNP Function @ divides .synb .mets (divides < d << n ) .syne .desc The .code divides function tests whether integer .meta d divides integer .metn n .
If this is true, .code t is returned, otherwise .codn nil . The integers 1 and -1 divide every other integer and themselves. By established convention, every integer, except zero, divides zero. For other values, .meta d divides .meta n if division of .meta n by .meta d leaves no remainder. .coNP Function @ abs .synb .mets (abs << number ) .syne .desc The .code abs function computes the absolute value of .metn number . If .meta number is positive, it is returned. If .meta number is negative, its additive inverse is returned: a positive number of the same type with exactly the same magnitude. .coNP Function @ signum .synb .mets (signum << number ) .syne .desc The .code signum function calculates a representation of the sign of .meta number as a numeric value. If .meta number is an integer, then .code signum returns -1 if the integer is negative, 1 if the integer is positive, or else 0. If .meta number is a floating-point value then .code signum returns -1.0 if the value is negative, 1.0 if the value is positive or else 0.0. .coNP Functions @, trunc @, floor @ ceil and @ round .synb .mets (trunc < dividend <> [ divisor ]) .mets (floor < dividend <> [ divisor ]) .mets (ceil < dividend <> [ divisor ]) .mets (round < dividend <> [ divisor ]) .syne .desc The .codn trunc , .codn floor , .code ceil and .code round functions perform division of the .meta dividend by the .metn divisor , returning an integer quotient. If the .meta divisor is omitted, it defaults to 1. A zero .meta divisor results in an exception of type .codn numeric-error . If both inputs are integers, the result is of type integer. If all inputs are numbers and at least one of them is floating-point, the others are converted to floating-point and the result is floating-point. The .meta dividend input may be a range. 
In this situation, the operation is recursively distributed over the .code from and .code to fields of the range, individually matched against the .metn divisor , and the result is a range composed of these two individual quotients. When the quotient is a scalar value, .code trunc returns the closest integer, in the zero direction, from the value of the quotient. The .code floor function returns the highest integer which does not exceed the value of the quotient. That is to say, the division is truncated to an integer value toward negative infinity. The .code ceil function returns the lowest integer which is not below the value of the quotient. That is to say, the division is truncated to an integer value toward positive infinity. The .code round function returns the nearest integer to the quotient. Exact halfway cases are rounded to the integer away from zero so that .code "(round -1 2)" yields .code -1 and .code "(round 1 2)" yields .codn 1 . Note that for large floating point values, due to the limited precision, the integer value corresponding to the mathematical floor or ceiling may not be available. .TP* "Dialect Note:" In ANSI Common Lisp, the .code round function chooses the nearest even integer in halfway cases, rather than rounding them away from zero. \*(TX's choice harmonizes with the semantics of the .code round function in the C language. .coNP Function @ mod .synb .mets (mod < dividend << divisor ) .syne .desc The .code mod function performs a modulus operation. Firstly, the absolute value of .meta divisor is taken to be a modulus. Then a residue of .meta dividend with respect to that modulus is calculated. The residue's sign follows the sign of .metn divisor . That is, it is the smallest magnitude (closest to zero) residue of .meta dividend with respect to the absolute value of .metn divisor , having the same sign as .metn divisor . If the operands are integer, the result is an integer.
If either operand is of type float, then the result is a float. The modulus operation is then generalized into the floating point domain. For instance the expression .code "(mod 0.75 0.5)" yields a residue of 0.25 because 0.5 "goes into" 0.75 only once, with a "remainder" of 0.25. If .meta divisor is zero, .code mod throws an exception of type .codn numeric-error . .coNP Functions @, trunc-rem @, floor-rem @ ceil-rem and @ round-rem .synb .mets (trunc-rem < dividend <> [ divisor ]) .mets (floor-rem < dividend <> [ divisor ]) .mets (ceil-rem < dividend <> [ divisor ]) .mets (round-rem < dividend <> [ divisor ]) .syne .desc These functions, respectively, perform the same division operation as .codn trunc , .codn floor , .codn ceil , and .codn round , referred to here as the respective target functions. If the .meta divisor is missing, it defaults to 1. Each function returns a list of two values: a .meta quotient and a .metn remainder . The .meta quotient is exactly the same value as what would be returned by the respective target function for the same inputs. The .meta remainder value obeys the following identity: .mono .mets (eql < remainder (- < dividend >> (* divisor << quotient ))) .onom If .meta divisor is zero, these functions throw an exception of type .codn numeric-error . .coNP Functions @, sin @, cos @, tan @, asin @, acos @ atan and @ atan2 .synb .mets (sin << radians ) .mets (cos << radians ) .mets (tan << radians ) .mets (atan << slope ) .mets (atan2 < y << x ) .mets (asin << num ) .mets (acos << num ) .syne .desc These trigonometric functions convert their argument to floating point and return a float result. The .codn sin , .code cos and .code tan functions compute the sine, cosine and tangent of the .meta radians argument which represents an angle expressed in radians. The .codn asin , .code acos and .code atan functions are their respective inverse functions. The .meta num argument to .code asin and .code acos must be in the range -1.0 to 1.0.
The .code atan2 function converts the rectilinear coordinates .meta x and .meta y to an angle in polar coordinates in the range [0, 2\(*p). .coNP Functions @, sinh @, cosh @, tanh @, asinh @ acosh and @ atanh .synb .mets (sinh << argument ) .mets (cosh << argument ) .mets (tanh << argument ) .mets (atanh << argument ) .mets (asinh << argument ) .mets (acosh << argument ) .syne .desc These functions are the hyperbolic analogs of the trigonometric functions .codn sin , .code cos and so forth. They convert their argument to floating point and return a float result. .coNP Functions @, exp @, log @ log10 and @ log2 .synb .mets (exp << arg ) .mets (log << arg ) .mets (log10 << arg ) .mets (log2 << arg ) .syne .desc The .code exp function calculates the value of the transcendental number e raised to the exponent .metn arg . The .code log function calculates the base e logarithm of .metn arg , which must be a positive value. The .code log10 function calculates the base 10 logarithm of .metn arg , which must be a positive value. The .code log2 function calculates the base 2 logarithm of .metn arg , which must be a positive value. .coNP Functions @, expt @ sqrt and @ isqrt .synb .mets (expt < base << exponent *) .mets (sqrt << arg ) .mets (isqrt << arg ) .syne .desc The .code expt function raises .meta base to zero or more exponents given by the .meta exponent arguments. .code "(expt x)" is equivalent to .codn "(expt x 1)" , and yields .code x for all .codn x . For three or more arguments, the operation is right-associative. That is to say, .code "(expt x y z)" is equivalent to .codn "(expt x (expt y z))" , similarly to the way nested exponents work in standard algebraic notation. Exponentiation is done pairwise using a binary operation. If both operands to this binary operation are nonnegative integers, then the result is an integer. 
If the exponent is negative, and the base is zero, the situation is treated as a division by zero: an exception of type .code numeric-error is thrown. Otherwise, a negative exponent is converted to floating-point, if it isn't already, and a floating-point exponentiation is performed. If either operand is a float, then the other operand is converted to a float, and a floating point exponentiation is performed. Exponentiation that would produce a complex number is not supported. If the exponent is zero, then the return value is 1.0 if at least one operand is floating-point, otherwise 1. The .code sqrt function produces a floating-point square root of .metn arg , which is converted from integer to floating-point if necessary. Negative operands are not supported. The .code isqrt function computes the integer square root of .metn arg , which must be an integer. The integer square root is the greatest integer that is no greater than the real square root of .metn arg . .coNP Function @ exptmod .synb .mets (exptmod < base < exponent << modulus ) .syne .desc The .code exptmod function performs modular exponentiation and accepts only integer arguments. Furthermore, .meta exponent must be nonnegative and .meta modulus must be positive. The return value is .meta base raised to .metn exponent , and reduced to the least positive residue modulo .metn modulus . .coNP Function @ square .synb .mets (square << argument ) .syne .desc The .code square function returns the product of .meta argument with itself.
The following equivalence holds, except that .code x is evaluated only once in the .code square expression: .verb (square x) <--> (* x x) .brev .coNP Function @ cum-norm-dist .synb .mets (cum-norm-dist << argument ) .syne .desc The .code cum-norm-dist function calculates an approximation to the cumulative normal distribution function: the integral of the normal distribution function from negative infinity to the .metn argument . .coNP Function @ inv-cum-norm .synb .mets (inv-cum-norm << argument ) .syne .desc The .code inv-cum-norm function calculates an approximation to the inverse of the cumulative normal distribution function. The argument, a value expected to lie in the range [0, 1], represents the integral of the normal distribution function from negative infinity to some domain point .IR p . The function calculates the approximate value of .IR p . The minimum value returned is -10, and the maximum value returned is 10, regardless of how closely the argument approaches, respectively, the 0 or 1 integral endpoints. For values less than zero, or exceeding 1, the values returned, respectively, are -10 and 10. .coNP Functions @ n-choose-k and @ n-perm-k .synb .mets (n-choose-k < n << k ) .mets (n-perm-k < n << k ) .syne .desc The .code n-choose-k function computes the binomial coefficient nCk which expresses the number of combinations of .meta k items that can be chosen from a set of .metn n , where combinations are subsets. The .code n-perm-k function computes nPk: the number of permutations of size .meta k that can be drawn from a set of .metn n , where permutations are sequences, whose order is significant. The calculations only make sense when .meta n and .meta k are nonnegative integers, and .meta k does not exceed .metn n . The behavior is not specified if these conditions are not met.
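.TP* Examples:

The following sketch, whose argument values are chosen purely for
illustration, shows the difference between counting subsets and counting
ordered sequences:

.verb
  ;; 10 two-element subsets of a five-element set
  (n-choose-k 5 2) -> 10

  ;; 20 ordered pairs: each subset can be arranged in two ways
  (n-perm-k 5 2) -> 20

  ;; number of five-card hands from a 52-card deck
  (n-choose-k 52 5) -> 2598960
.brev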
.coNP Functions @, fixnump @, bignump @, integerp @ floatp and @ numberp .synb .mets (fixnump << object ) .mets (bignump << object ) .mets (integerp << object ) .mets (floatp << object ) .mets (numberp << object ) .syne .desc These functions test the type of .metn object , returning .code t if it is an object of the implied type, .code nil otherwise. The .codn fixnump , .code bignump and .code floatp functions return .code t if the object is of the basic type .codn fixnum , .code bignum or .codn float . The function .code integerp returns true if .meta object is either a .code fixnum or a .codn bignum . The function .code numberp returns .code t if .meta object is either a .codn fixnum , .code bignum or .codn float . .coNP Function @ arithp .synb .mets (arithp << object ) .syne .desc The .code arithp function returns true if .meta object is a character, integer, floating-point number, range or a user-defined arithmetic object. For a range, .code t is returned without examining the values of the .code from and .code to fields. A user-defined arithmetic object is identified as a struct type which implements the .code + method as a static slot. .coNP Functions @ zerop and @ nzerop .synb .mets (zerop << number ) .mets (nzerop << number ) .syne .desc The .code zerop function tests .meta number for equivalence to zero. The argument must be a number or character. It returns .code t for the integer value .code 0 and for the floating-point value .codn 0.0 . For other numbers, it returns .codn nil . It returns .code t for the null character .code #\enul and .code nil for all other characters. If .meta number is a range, then .code zerop returns .code t if both of the range endpoints individually satisfy .codn zerop . The .code nzerop function is the logical inverse of .codn zerop : it returns .code t for those arguments for which .code zerop returns .code nil and vice versa.
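.TP* Examples:

The following evaluations are a sketch of the cases described above,
covering numbers, characters and ranges:

.verb
  (zerop 0) -> t
  (zerop 0.0) -> t
  (zerop 1) -> nil

  ;; characters: only the null character is zero
  (zerop #\enul) -> t
  (zerop #\ea) -> nil

  ;; ranges: both endpoints must satisfy zerop
  (zerop #R(0 0.0)) -> t
  (zerop #R(0 1)) -> nil

  (nzerop 5) -> t
  (nzerop 0) -> nil
.brev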
.coNP Functions @ plusp and @ minusp .synb .mets (plusp << number ) .mets (minusp << number ) .syne .desc These functions test whether a number is positive or negative, returning .code t or .codn nil , as the case may be. The argument may also be a character. All characters other than the null character .code #\enul are positive. No character is negative. .coNP Functions @ evenp and @ oddp .synb .mets (evenp << integer ) .mets (oddp << integer ) .syne .desc The .code evenp and .code oddp functions require integer arguments. .code evenp returns .code t if .meta integer is even (divisible by two), otherwise it returns .codn nil . .code oddp returns .code t if .meta integer is not divisible by two (odd), otherwise it returns .codn nil . .coNP Functions @, succ @, ssucc @, sssucc @, pred @ ppred and @ pppred .synb .mets (succ << number ) .mets (ssucc << number ) .mets (sssucc << number ) .mets (pred << number ) .mets (ppred << number ) .mets (pppred << number ) .syne .desc The .code succ function adds 1 to its argument and returns the resulting value. If the argument is an integer, then the return value is the successor of that integer, and if it is a character, then the return value is the successor of that character according to Unicode. The .code pred function subtracts 1 from its argument, and under similar considerations as above, the result represents the predecessor. The .code ssucc and .code sssucc functions add 2 and 3, respectively. Similarly, .code ppred and .code pppred subtract 2 and 3 from their argument. .coNP Functions @, > @, < @, >= @ <= and @ = .synb .mets (> < object << object *) .mets (< < object << object *) .mets (>= < object << object *) .mets (<= < object << object *) .mets (= < object << object *) .syne .desc These relational functions compare characters, numbers, ranges and sequences of characters or numbers for numeric equality or inequality. 
The arguments must be one or more numbers, characters, ranges, or sequences of these objects, or, recursively, of sequences. If just one argument is given, then these functions all return .codn t . If two arguments are given, then they are compared as follows. First, if the numbers do not have the same type, then the one which has the lower-ranking type is converted to the type of the other, according to this ranking: character < integer < float. For instance if a character and integer are compared, the character is converted to its integer character code. Then a numeric comparison is applied. Three or more arguments may be given, in which case the comparison proceeds pairwise from left to right. For instance in .codn "(< a b c)" , the comparison .code "(< a b)" is performed in isolation. If the comparison is false, then .code nil is returned, otherwise the comparison .code "(< b c)" is performed in isolation, and if that is false, .code nil is returned, otherwise .code t is returned. Note that it is possible for .code b to undergo two different conversions. For instance in the .mono .meti (< < float < character << integer ) .onom comparison, .meta character will first convert to a floating-point representation of its Unicode value so that it can be compared to .metn float , and if that comparison succeeds, then in the second comparison, .meta character will be converted to integer so that it can be compared to .metn integer . Ranges may only be compared with ranges. Corresponding fields of ranges are compared for equality by .code = such that .code "#R(0 1)" and .code "#R(0 1.0)" are reported as equal. The inequality comparisons are lexicographic, such that the .code from field of the range is considered more significant than the .code to field. For example the inequalities .code "(< #R(1 2) #R(2 0))" and .code "(< #R(1 2) #R(1 3))" hold.
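
For illustration, the following evaluations are a sketch of the rules
described so far for numbers, characters and ranges:

.verb
  (< 1 2 3) -> t       ;; (< 1 2) and (< 2 3) both hold
  (< 1 3 2) -> nil     ;; (< 3 2) fails
  (= 1 1.0) -> t       ;; integer is converted to float
  (< #\ea #\eb) -> t   ;; characters compare by Unicode value

  (= #R(0 1) #R(0 1.0)) -> t
  (< #R(1 2) #R(2 0)) -> t   ;; decided by the from fields
  (< #R(1 2) #R(1 3)) -> t   ;; from fields equal; decided by to
.brev
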
Sequences may only be compared with sequences, but mixtures of any kinds of sequences may be compared: lists with vectors, vectors with strings, and so on. The .code = function considers a pair of sequences of unequal length to be unequal, reporting .codn nil . Sequences are equal if they have the same length and their corresponding elements are recursively equal under the .code = function. The inequality functions treat sequences lexicographically. A pair of sequences is compared by comparing corresponding elements. The .code < function tests each successive pair of corresponding elements recursively using the .code < function. If this recursive comparison reports .codn t , then the function immediately returns .code t without considering any more pairs of elements. Otherwise the same pair of elements is compared again using the .code = function. If that reports false, then the function reports false without considering any more pairs of elements. Otherwise processing continues with the next pair, if any. If all corresponding elements are equal, but the right sequence is longer, .code < returns .codn t , otherwise the function reports .codn nil . The .code <= function tests each successive pair of corresponding elements recursively using the .code <= function. If this returns .code nil then the function returns .code nil without considering any more pairs. Otherwise processing continues with the next pair, if any. If all corresponding elements satisfy the test, but the left sequence is longer, then .code nil is returned. Otherwise .code t is returned. The inequality relations exhibit symmetry, which means that the .code > and .code >= functions are equivalent, respectively, to .code < and .code <= with the order of the argument values reversed. For instance, the expression .code "(< a b c)" is equivalent to .code "(> c b a)" except for the difference in evaluation order of the .codn a , .code b and .code c operands themselves.
Any semantic description of .code < or .code <= applies, respectively, also to .code > or .code >= with the appropriate adjustment for argument order reversal. .coNP Function @ /= .synb .mets (/= << number *) .syne .desc The arguments to .code /= may be numbers or characters. The .code /= function returns .code t if no two of its arguments are numerically equal. That is to say, if there exist some .code a and .code b which are distinct arguments such that .code "(= a b)" is true, then the function returns .codn nil . Otherwise it returns .codn t . .coNP Functions @ max and @ min .synb .mets (max < first-arg << arg *) .mets (min < first-arg << arg *) .syne .desc The .code max and .code min functions determine and return the highest or lowest value from among their arguments. If only .meta first-arg is given, that value is returned. These functions are type generic, since they compare arguments using the same semantics as the .code less function. If two or more arguments are given, then .code "(max a b)" is equivalent to .codn "(if (less a b) b a)" , and .code "(min a b)" is equivalent to .codn "(if (less a b) a b)" . If the operands do not have the same type, then one of them is converted to the type of the other; however, the original unconverted values are returned. For instance .code "(max 4 3.0)" yields the integer .codn 4 , not .codn 4.0 . If three or more arguments are given, .code max and .code min reduce the arguments in a left-associative manner. Thus .code "(max a b c)" means .codn "(max (max a b) c)" . .coNP Function @ clamp .synb .mets (clamp < low < high << val ) .syne .desc The .code clamp function clamps value .meta val into the range .meta low to .metn high . The .code clamp function returns .meta low if .meta val is less than .metn low . If .meta val is greater than or equal to .metn low , but less than .metn high , then it returns .metn val . Otherwise it returns .metn high . 
More precisely, .code "(clamp a b c)" is equivalent to .codn "(max a (min b c))" . .coNP Function @ bracket .synb .mets (bracket < value << level *) .syne .desc The .code bracket function's arguments consist of one required .meta value followed by .I n .meta level arguments. The .meta level arguments are optional; in other words, .I n may be zero. The .code bracket function calculates the .I bracket of the .meta value argument: a zero-based positional index of the value, in relation to the .meta level arguments. Each of the .meta level arguments, of which there may be none, is associated with an integer index, starting at zero, in left-to-right order. The .meta level arguments are examined in that order. When a .meta level argument is encountered which exceeds .metn value , that .meta level argument's index is returned. If .meta value exceeds all of the .meta level arguments, then .I n is returned. Determining whether .meta value exceeds a .meta level is performed using the .code less function. .TP* Examples: .verb (bracket 42) -> 0 (bracket 5 10) -> 0 (bracket 15 10) -> 1 (bracket 15 10 20) -> 1 (bracket 15 10 20 30) -> 1 (bracket 20 10 20 30) -> 2 (bracket 35 10 20 30) -> 3 (bracket "a" "aardvark" "zebra") -> 0 (bracket "ant" "aardvark" "zebra") -> 1 (bracket "zebu" "aardvark" "zebra") -> 2 .brev .coNP Functions @, int-str @ flo-str and @ num-str .synb .mets (int-str < string <> [ radix ]) .mets (flo-str << string ) .mets (num-str << string ) .syne .desc These functions extract numeric values from character string .metn string . Leading whitespace in .metn string , if any, is skipped. If no digits can be successfully extracted, then .code nil is returned. Trailing material which does not contribute to the number is ignored. The .code int-str function converts a string of digits in the specified .meta radix to an integer value. If .meta radix isn't specified, it defaults to 10. 
Otherwise it must be an integer in the range 2 to 36, or else the character .codn #\ec . For radices above 10, letters of the alphabet are used for digits: .code A represents a digit whose value is 10, .code B represents 11 and so forth until .codn Z . Uppercase and lowercase letters are recognized. Any character which is not a digit of the specified radix is regarded as the start of trailing junk at which the extraction of the digits stops. When .meta radix is specified as the character object .codn #\ec , this indicates that a C-language-style integer constant should be recognized. If, after any optional sign, the remainder of .meta string begins with the character pair .code 0x then that pair is considered removed from the string, and it is treated as base 16 (hexadecimal). If, after any optional sign, the remainder of .meta string begins with a leading zero not followed by .codn x , then the radix is taken to be 8 (octal). In scanning these formats, the .code int-str function is not otherwise constrained by C language representational limitations. Specifically, the input values are taken to be the printed representation of arbitrary-precision integers and treated accordingly. The .code flo-str function converts a floating-point decimal notation to a nearby floating point value. The material which contributes to the value is the longest match for optional leading space, followed by a mantissa which consists of an optional sign followed by a mixture of at least one digit, and at most one decimal point, optionally followed by an exponent part denoted by the letter .code E or .codn e , an optional sign and one or more optional exponent digits. If the value specified by .meta string is out of range of the floating-point representation, then .code nil is returned. The .code num-str function converts a decimal notation to either an integer as if by a radix 10 application of .codn int-str , or to a floating point value as if by .codn flo-str .
The floating point interpretation is chosen if the possibly empty initial sequence of digits (following any whitespace and optional sign) is followed by a period, or by .code e or .codn E . .coNP Functions @ int-flo and @ flo-int .synb .mets (int-flo << float ) .mets (flo-int << integer ) .syne .desc These functions perform numeric conversion between integer and floating point type. The .code int-flo function returns an integer by truncating toward zero. The .code flo-int function returns an exact floating point value corresponding to .metn integer , if possible, otherwise an approximation using a nearby floating point value. .coNP Functions @ tofloat and @ toint .synb .mets (tofloat << value ) .mets (toint < value <> [ radix ]) .syne .desc These functions convert .meta value to floating-point or integer, respectively. The .meta value can be of several types, including string. If a floating-point value is passed into .codn tofloat , or an integer value into .codn toint , then that value is simply returned. If .meta value is a character, then it is treated as a string of length one containing that character. If .meta value is a string, then it is converted by .code tofloat as if by the function .codn flo-str , and by .code toint as if by the function .codn int-str . If .meta value is an integer, then it is converted by .code tofloat as if by the function .codn flo-int . If .meta value is a floating-point number, then it is converted by .code toint as if by the function .codn int-flo . If .meta value is a structure, then it is expected to implement the .code tofloat or .code toint method. This method is invoked by the same-named function, and the value is returned. .coNP Variables @ fixnum-min and @ fixnum-max .desc These variables hold, respectively, the most negative value of the .code fixnum integer type, and its most positive value. Integer values from .code fixnum-min to .code fixnum-max are all of type .codn fixnum . Integers outside of this range are .code bignum integers.
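The integer conversion rules given earlier for int-str, int-flo and toint can be summarized in a short model. The following Python sketch is an illustration only, written under stated assumptions: the function names int_str and to_int are hypothetical stand-ins, the string "c" stands in for the character radix argument, only the lowercase 0x prefix is recognized, and structure arguments are not modeled. It is not the TXR implementation.

```python
import string

# Digit characters for radices up to 36; letters are matched case-insensitively.
DIGITS = string.digits + string.ascii_lowercase

def int_str(text, radix=10):
    """Model of int-str: skip leading whitespace, read an optional sign and
    digits, ignore trailing junk, and return None if no digit was read."""
    s = text.lstrip()
    sign = 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    if radix == "c":
        # C-style constant: 0x selects base 16, another leading 0 selects base 8.
        if s[:2] == "0x":
            radix, s = 16, s[2:]
        elif s[:1] == "0":
            radix = 8
        else:
            radix = 10
    value, seen = 0, False
    for ch in s:
        d = DIGITS.find(ch.lower())
        if d < 0 or d >= radix:
            break  # start of trailing junk
        value = value * radix + d
        seen = True
    return sign * value if seen else None

def to_int(value, radix=10):
    """Model of toint for integers, floats, and strings or characters."""
    if isinstance(value, int):
        return value                  # integers are returned as-is
    if isinstance(value, float):
        return int(value)             # like int-flo: truncate toward zero
    if isinstance(value, str):
        return int_str(value, radix)  # a character is a one-character string
    raise TypeError("unsupported argument in this sketch")

print(int_str("  42abc"))      # trailing junk is ignored
print(int_str("-0x1F", "c"))   # hexadecimal via the C-style radix
print(to_int(-2.7))            # truncation toward zero
```

Because Python integers have arbitrary precision, the model also reflects the statement that int-str is not constrained by C representational limits.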
.coNP Functions @ tofloatz and @ tointz .synb .mets (tofloatz << value ) .mets (tointz < value <> [ radix ]) .syne .desc These functions are closely related to, respectively, .code tofloat and .codn toint . They differ in that these functions return a floating-point or integer zero, respectively, in some situations in which those functions would return .code nil or throw an error. Whereas those functions reject a .meta value argument of .codn nil , for that same argument the .code tofloatz function returns 0.0 and .code tointz returns 0. Likewise, in cases when .meta value is a string or character which cannot be converted to a number, and .code tofloat and .code toint would return .codn nil , these functions return 0.0 and 0, respectively. In other situations, these functions behave exactly like .code tofloat and .codn toint . .coNP Variables @, flo-min @ flo-max and @ flo-epsilon .desc These variables hold, respectively: the smallest positive floating-point value; the largest positive floating-point value; and the difference between 1.0 and the smallest representable value greater than 1.0. .code flo-min and .code flo-max define the floating-point range, which consists of three regions: values from .code "(- flo-max)" to .codn "(- flo-min)" ; the value 0.0; and values from .code flo-min to .codn flo-max . .coNP Variable @ flo-dig .desc This variable holds an integer representing the number of decimal digits in a decimal floating-point number such that this number can be converted to a \*(TX floating-point number, and back to decimal, without a change in any of the digits. This holds regardless of the value of the number, provided that it does not exceed the floating-point range. .coNP Variable @ flo-max-dig .desc This variable holds an integer representing the maximum number of decimal digits required to capture the value of a floating-point number such that the resulting decimal form will convert back to the same floating-point number.
See also the .code *print-flo-precision* variable. .coNP Variables @ %pi% and @ %e% .desc These variables hold an approximation of the mathematical constants \(*p and e. To four digits of precision, \(*p is 3.142 and e is 2.718. The .code %pi% and .code %e% approximations are accurate to .code flo-dig decimal digits. .coNP Function @ digits .synb .mets (digits < number <> [ radix ]) .syne .desc The .code digits function returns a list of the digits of .meta number represented in the base given by .metn radix . The .meta number argument must be a nonnegative integer, and .meta radix must be an integer greater than one. If .meta radix is omitted, it defaults to 10. The return value is a list of the digits in descending order of significance: most significant to least significant. The digits are integers. For instance, if .meta radix is 42, then the digits are integer values in the range 0 to 41. The returned list always contains at least one element, and includes no leading zeros, except when .meta number is zero. In that case, a one-element list containing zero is returned. .TP* Examples: .verb (digits 1234) -> (1 2 3 4) (digits 1234567 1000) -> (1 234 567) (digits 30 2) -> (1 1 1 1 0) (digits 0) -> (0) .brev .coNP Function @ digpow .synb .mets (digpow < number <> [ radix ]) .syne .desc The .code digpow function decomposes the .meta number argument into a power series whose terms add up to .metn number . The .meta number argument must be a nonnegative integer, and .meta radix must be an integer greater than one. The returned power series consists of a list of nonnegative integers. It is formed from the digits of .meta number in the given .metn radix , which serve as coefficients which multiply successive powers of the .metn radix , starting at the zeroth power (one). The terms are given in decreasing order of significance: the term corresponding to the most significant digit of .metn number , multiplying the highest power of .metn radix , is listed first. 
The returned list always contains at least one element, and includes no leading zeros, except when .meta number is zero. In that case, a one-element list containing zero is returned. .verb (digpow 1234) -> (1000 200 30 4) (digpow 1234567 1000) -> (1000000 234000 567) (digpow 30 2) -> (16 8 4 2 0) (digpow 0) -> (0) .brev .coNP Functions @ poly and @ rpoly .synb .mets (poly < arg << coeffs ) .mets (rpoly < arg << coeffs ) .syne .desc The .code poly and .code rpoly functions evaluate a polynomial, for the given numeric argument value .meta arg and the coefficients given by .metn coeffs , a sequence of numbers. If .meta coeffs is an empty sequence, it denotes the zero polynomial, whose value is zero everywhere; the functions return zero in this case. Otherwise, the .code poly function considers .meta coeffs to hold the coefficients in the conventional order, namely in order of decreasing degree of polynomial term. The first element of .meta coeffs is the leading coefficient, and the constant term appears as the last element. The .code rpoly function takes the coefficients in opposite order: the first element of .meta coeffs gives the constant term coefficient, and the last element gives the leading coefficient. Note: except in the case of .code rpoly operating on a list or list-like sequence of coefficients, Horner's method of evaluation is used: a single result accumulator is initialized with zero, and then for each successive coefficient, in order of decreasing term degree, the accumulator is multiplied by the argument, and the coefficient is added. When .code rpoly operates on a list or list-like sequence, it makes a single pass through the coefficients in order, thus taking them in increasing term degree. It maintains two accumulators: one for successive powers of .meta arg and one for the resulting value. 
For each coefficient, the power accumulator is updated by a multiplication by .meta arg and then this value is multiplied by the coefficient, and that value is then added to the result accumulator. .TP* Examples: .verb ;; evaluate x^2 + 2x + 3 for x = 10. (poly 10 '(1 2 3)) -> 123 ;; evaluate 3x^2 + 2x + 1 for x = 10. (rpoly 10 '(1 2 3)) -> 321 .brev .coNP Function @ bignum-len .synb .mets (bignum-len << arg ) .syne .desc The .code bignum-len function reports the machine-specific .I "bignum order" of the integer or character argument .metn arg . If .meta arg is a character or .code fixnum integer, the function returns zero. Otherwise .meta arg is expected to be a .code bignum integer, and the function returns the number of "limbs" used for its representation, a positive integer. Note: the .code bignum-len function is intended to be of use in algorithms whose performance benefits from ordering the operations on multiple integer operands according to the magnitudes of those operands. The function provides an estimate of magnitude which trades accuracy for efficiency. .coNP Function @ quantile .synb .mets (quantile < p >> [ group-size <> [ rate ]]) .syne .desc The .code quantile function returns a function which estimates a specific quantile of a set of real-valued samples. The desired quantile is indicated by the .meta p parameter, which is a number in the range 0 to 1.0. If .meta p is specified as 0.5, then the median is estimated. The .meta p value of 0.9 leads to the estimation of the 90th percentile: a value such that approximately 90% of the samples are below that value. If the .meta group-size parameter is specified, it must be a positive integer. The returned function then operates in grouped mode. The .meta rate parameter is relevant only to grouped mode. Grouped mode is described below. The function returned by .code quantile maintains internal state in relation to calculating the quantile.
The function may be called with any number of arguments, including none. It expects every argument to be either a number, or a sequence of numbers. These numbers are accumulated into the quantile calculation, and a revised estimate of the quantile is then returned. Note: the algorithm used is the P-Squared algorithm invented in 1985 by Raj Jain and Imrich Chlamtac, which avoids accumulating and sorting the entire data set, while still obtaining good quality estimates of the quantile. The algorithm requires an initial seed of five samples. Then additional samples input into the algorithm produce quantile estimates. To eliminate this special case from the abstract interface, the \*(TX implementation is capable of producing an estimate when five or fewer samples have been presented, including none. In this low situation, the .meta p value is ignored in reporting the estimate. When no samples have been given, the estimate is zero. When one sample has been given, the estimate is that sample itself. When between two and five samples have been given, the estimate is their median. Using the median as the estimate ensures a smooth transition from these early estimates into the estimates produced by the P-Squared algorithm. This is because the P-Squared algorithm always reports the value of the middle height accumulator as the estimate, and that accumulator's initial value is the median of the first five samples. The function returned by .codn quantile , though not accumulating all of the samples passed to it, nevertheless has a limited sample capacity, because the registers it uses for tracking the sample group positions are fixed-width integers. The sample capacity is approximately 4 times the value of .codn fixnum-max . .TP* Example: .verb (defparm q (quantile 0.9)) ;; create 90-th percentile accumulator [q] -> 0.0 ;; no samples given: estimate is 0. [q 3.14] -> 3.14 ;; one sample: estimate is that sample [q 13.3 7.9 5.2 6.3] -> 7.9 ;; five samples: estimate is median. 
[q 6.8 7.3 9.1 4.0] ;; more than five samples; estimate now -> 8.44651234567901 ;; from P-Square algorithm [q #(13.1 5 2.5)] ;; vector argument -> 9.68660493827161 [q] -> 9.68660493827161 ;; no arguments: repeat current estimate .brev If the .meta group-size argument is specified, then the quantile accumulator operates in grouped mode. Grouped mode allows infinite sample operation without overflow: an unlimited number of samples can be accepted. However, old samples lose their influence over the estimated value: newer samples are considered more significant than old samples. In grouped mode, the quantile accumulator is reset to its initial state whenever .meta group-size samples have been accumulated, and begins freshly calculating the quantile. Prior to the reset, an estimate is obtained and retained in an internal register. Going forward, this remembered previous estimate is blended in with the newly calculated estimate values, as described below. The cycle repeats itself whenever .meta group-size samples accumulate: the state is reset, and the current estimate is loaded into the previous estimate register, which is then blended with newly computed values. The .meta rate parameter, whose default value is 0.999, controls the estimate blending. It should be a value between 0 and 1. Upon each reset, a blend value register is initialized to 1.0. Each time a new sample is accumulated, the blend register is multiplied by the rate parameter, and the product is stored back into the blend register. Thus if the rate is between 0 and 1, exclusive, then the blend register exponentially decreases as the number of samples grows. The blend register indicates the fraction of the estimate which comes from the remembered previous estimate. 
For instance, if the current blend value is 0.8, then the returned estimate value is 0.8 times the remembered previous estimate, plus 0.2 times the newly computed estimate for the current sample in the new group: the previous and current estimate are blended 80:20. The default .meta rate value of 0.999 is chosen for a slow transition to the new estimates, which helps to conceal inaccuracies in the algorithm associated with having accumulated a small number of samples. At this rate, it requires about 290 samples before the blend value drops to 75% of the old estimate. If .meta rate is specified as 0, then no blending of the previous estimate value takes place, since the blend factor will drop to zero upon the first sample being received after the group reset, causing the newly calculated estimates to be returned without blending. The previous sample groups therefore have no influence over newer estimates. If .meta rate is specified as 1, then the blend factor will stay at 1, and so the estimate will forever remain at the previous value, ignoring the calculations driven by the new samples. Note: it is recommended that if .meta group-size is specified, the value should be at least several hundred. Too small a group size will prevent the estimation algorithm from settling on good results. The .meta rate parameter should not be much smaller than 1. A rate that is too low will cause the previous estimate's contribution to the quantile value to diminish too quickly, before the new estimation settles. .coNP Variables @, flo-near @, flo-down @ flo-up and @ flo-zero .desc These variables hold integer values suitable as arguments to the .code flo-set-round-mode function, which controls the rounding mode for the results of floating-point operations. These variables are only defined on platforms which support rounding control. Their values have the following meanings: .RS .coIP flo-near Round to nearest: the result of an operation is rounded to the nearest representable value.
.coIP flo-down Round down: the result of an operation is rounded to the nearest representable value that lies in the direction of negative infinity. .coIP flo-up Round up: the result of an operation is rounded to the nearest representable value that lies in the direction of positive infinity. .coIP flo-zero Round to zero: the result of an operation is rounded to the nearest representable value that lies in the direction of zero. .RE .IP .coNP Functions @ flo-get-round-mode and @ flo-set-round-mode .synb .mets (flo-get-round-mode) .mets (flo-set-round-mode << mode ) .syne .desc Sometimes floating-point operations produce a result which requires more bits of precision than the floating point representation can provide. A representable floating-point value must be substituted for the true result and yielded by the operation. On platforms which support rounding control, these functions are provided for selecting the decision procedure by which the floating-point representation is taken. The .code flo-get-round-mode returns the current rounding mode. The rounding mode is represented by an integer value which is either equal to one of the four variables .codn flo-near , .codn flo-down , .code flo-up and .codn flo-zero , or else some other value specific to the host environment. Initially, the value is that of .codn flo-near . Otherwise, the value returned is that which was stored by the most recent successful call to .codn flo-set-round-mode . The .code flo-set-round-mode function changes the rounding mode. The argument to its .meta mode parameter may be the value of one of the above four variables, or else some other value supported by the host environment's .code fesetround C library function. The .code flo-set-round-mode function returns .code t if it is successful, otherwise the return value is .code nil and the rounding mode is not changed. 
If a value is passed to .code flo-set-round-mode which is not the value of one of the above four rounding mode variables, and the function succeeds anyway, then the rounding behavior of floating-point operations depends on the host environment's interpretation of that value. .SS* Supplementary Math Library The following functions are defined, if they are available from the host platform. They correspond to same-named functions in the ISO C language standard, which appeared in the 1999 revision ("C99"). Even if some of these functions happen not to be defined, it is nevertheless possible to define them as methods in a user-defined arithmetic structure. See the section User-Defined Arithmetic Types below. .coNP Functions @, cbrt @, erf @, erfc @, exp10 @, exp2 @, expm1 @, gamma @, j0 @, j1 @, lgamma @, log1p @, logb @, nearbyint @, rint @, significand @, tgamma @ y0 and @ y1 .synb .mets (cbrt << arg ) .mets (erf << arg ) .mets (erfc << arg ) .mets (exp10 << arg ) .mets (exp2 << arg ) .mets (expm1 << arg ) .mets (gamma << arg ) .mets (j0 << arg ) .mets (j1 << arg ) .mets (lgamma << arg ) .mets (log1p << arg ) .mets (logb << arg ) .mets (nearbyint << arg ) .mets (rint << arg ) .mets (significand << arg ) .mets (tgamma << arg ) .mets (y0 << arg ) .mets (y1 << arg ) .syne .desc These are one-argument functions, which take a numeric argument, and return a floating-point result.
.coNP Functions @, copysign @, drem @, fdim @, fmax @, fmin @, hypot @, jn @, ldexp @, nextafter @, remainder @, scalb @ scalbln and @ yn .synb .mets (copysign < arg1 << arg2 ) .mets (drem < arg1 << arg2 ) .mets (fdim < arg1 << arg2 ) .mets (fmax < arg1 << arg2 ) .mets (fmin < arg1 << arg2 ) .mets (hypot < arg1 << arg2 ) .mets (jn < arg1 << arg2 ) .mets (ldexp < arg1 << arg2 ) .mets (nextafter < arg1 << arg2 ) .mets (remainder < arg1 << arg2 ) .mets (scalb < arg1 << arg2 ) .mets (scalbln < arg1 << arg2 ) .mets (yn < arg1 << arg2 ) .syne .desc These are two-argument functions, which take numeric arguments, and return a floating-point result. .SS* Bit Operations In \*(TL, similarly to Common Lisp, bit operations on integers are based on a concept that might be called "infinite two's complement". Under infinite two's complement, a positive number is regarded as having a binary representation prefixed by an infinite stream of zero digits (for example .code 1 is .codn ...00001 ). A negative number in infinite two's complement is the bitwise negation of its positive counterpart, plus one: it carries an infinite prefix of 1 digits. So for instance the number .code -1 is represented by .codn ...11111111 : an infinite sequence of 1 bits. There is no specific sign bit; any operation which produces such an infinite sequence of 1 digits on the left gives rise to a negative number. For instance, consider the operation of computing the bitwise complement of the number .codn 1 . Since the number .code 1 is represented as .codn ...0000001 , its complement is .codn ...11111110 . Each one of the .code 0 digits in the infinite sequence is replaced by .codn 1 , and this leading sequence means that the number is negative, in fact corresponding to the two's complement representation of the value .codn -2 . Hence, the infinite digit concept corresponds to an arithmetic interpretation. In fact \*(TL's bignum integers do not use a two's complement representation internally.
Numbers are represented as an array which holds a pure binary number. A separate field indicates the sign: negative, or nonnegative. That negative numbers appear as two's complement under the bit operations is merely a carefully maintained illusion (which makes bit operations on negative numbers more expensive). The .code logtrunc function, as well as a feature of the .code lognot function, allow bit manipulation code to be written which works with positive numbers only, even if complements are required. The trade off is that the application has to manage a limit on the number of bits. .coNP Functions @, logand @ logior and @ logxor .synb .mets (logand << integer *) .mets (logior << integer *) .mets (logxor < int1 << int2 ) .syne .desc These operations perform the familiar bitwise and, inclusive or, and exclusive or operations, respectively. Positive inputs are treated as pure binary numbers. Negative inputs are treated as infinite-bit two's complement. For example .code "(logand -2 7)" produces .codn 6 . This is because .code -2 is .code ...111110 in infinite-bit two's complement. And-ing this value with .code 7 (or .codn ...000111 ) produces .codn 110 . The .code logand and .code logior functions are variadic, and may be called with zero, one, two, or more input values. If .code logand is called with no arguments, it produces the value -1 (all bits 1). If .code logior is called with no arguments it produces zero. In the one-argument case, the functions just return their argument value. In the two-argument case, one of the operands may be a character, if the other operand is a fixnum integer. The character operand is taken to be an integer corresponding to the character value's Unicode code point value. The resulting value is regarded as a Unicode code point and converted to a character value accordingly.
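The zero-argument results of logand and logior described above are the identity elements of the respective operations, which can be seen in a small model. The following Python sketch is an illustration, not TXR Lisp: Python integers happen to follow the same infinite two's complement convention for bit operations, the function names merely mirror the ones documented here, the helper low_bits is hypothetical, and character operands are not modeled.

```python
from functools import reduce
import operator

def logand(*ints):
    # -1 has every bit set, so it is the identity element for bitwise and.
    return reduce(operator.and_, ints, -1)

def logior(*ints):
    # 0 has no bit set, so it is the identity element for inclusive or.
    return reduce(operator.or_, ints, 0)

def low_bits(n, width=8):
    """Hypothetical helper: show the low `width` bits of n as a string."""
    return format(n & ((1 << width) - 1), "0{}b".format(width))

print(low_bits(-2))      # 11111110
print(low_bits(7))       # 00000111
print(logand(-2, 7))     # 6
print(logand(), logior())
```

Starting the reduction from the identity element means the one-argument call returns its argument unchanged, as the text states.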
When three or more arguments are specified, the operation's semantics is that of a left-associative reduction through two-argument invocations, so that the three-argument case .code "(logand a b c)" is equivalent to the expression .codn "(logand (logand a b) c)" , which features two two-argument cases. .coNP Function @ logtest .synb .mets (logtest < int1 << int2 ) .syne .desc The .code logtest function returns true if .meta int1 and .meta int2 have bits in common. The following equivalence holds: .verb (logtest a b) <--> (not (zerop (logand a b))) .brev .coNP Functions @ lognot and @ logtrunc .synb .mets (lognot < value <> [ bits ]) .mets (logtrunc < value << bits ) .syne .desc The .code lognot function performs a bitwise complement of .metn value . When the one-argument form of lognot is used, then if .meta value is nonnegative, then the result is negative, and vice versa, according to the infinite-bit two's complement representation. For instance .code "(lognot -2)" is .codn 1 , and .code "(lognot 1)" is .codn -2 . The two-argument form of .code lognot produces a truncated complement. Conceptually, a bitwise complement is first calculated, and then the resulting number is truncated to the number of bits given by .metn bits , which must be a nonnegative integer. The following equivalence holds: .verb (lognot a b) <--> (logtrunc (lognot a) b) .brev The .code logtrunc function truncates the integer .meta value to the specified number of bits. If .meta value is negative, then the two's complement representation is truncated. The return value of .code logtrunc is always a nonnegative integer. .coNP Function @ sign-extend .synb .mets (sign-extend < value << bits ) .syne .desc The .code sign-extend function first truncates the infinite-bit two's complement representation of the integer .meta value to the specified number of bits, similarly to the .code logtrunc function. Then, this truncated value is regarded as a .metn bits -wide two's complement integer. 
The value of this integer is calculated and returned. .TP* Examples:
.verb
  (sign-extend 127 8) -> 127
  (sign-extend 128 8) -> -128
  (sign-extend 129 8) -> -127
  (sign-extend 255 8) -> -1
  (sign-extend 256 8) -> 0
  (sign-extend -1 8) -> -1
  (sign-extend -255 8) -> 1
.brev
.coNP Function @ ash .synb .mets (ash < value << bits ) .syne .desc The .code ash function shifts .meta value by the specified number of .meta bits producing a new value. If .meta bits is positive, then a left shift takes place. If .meta bits is negative, then a right shift takes place. If .meta bits is zero, then .meta value is returned unaltered. For positive numbers, a left shift by n bits is equivalent to a multiplication by two to the power of n, or .codn "(expt 2 n)" . A right shift by n bits of a positive integer is equivalent to integer division by .codn "(expt 2 n)" , with truncation toward zero. For negative numbers, the bit shift is performed as if on the two's complement representation. Under the infinite-bit two's complement representation, a right shift does not exhaust the infinite sequence of .code 1 digits which extends to the left. Thus if .code -4 is shifted right by one bit it becomes .code -2 because the bitwise representations of these values are .code ...111100 and .codn ...11110 . .coNP Function @ bit .synb .mets (bit < value << bit ) .syne .desc The .code bit function tests whether the integer or character .meta value has a 1 in bit position .metn bit . The .meta bit argument must be a nonnegative integer. A value of .meta bit of zero indicates the least-significant-bit position of .metn value . The .code bit function has a Boolean result, returning the symbol .code t if bit .meta bit of .meta value is set, otherwise .codn nil . If .meta value is negative, it is treated as if it had an infinite-bit two's complement representation.
For instance, if .meta value is .codn -2 , then the .code bit function returns .code nil for a .meta bit value of zero, and .code t for all other values, since the infinite-bit two's complement representation of .code -2 is .codn ...11110 . .coNP Function @ mask .synb .mets (mask << integer *) .syne .desc The .code mask function takes zero or more integer arguments, and produces an integer value which corresponds to a bitmask made up of the bit positions specified by the integer arguments. If .code mask is called with no arguments, then the return value is zero. If .code mask is called with a single argument .meta integer then the return value is the same as that of the expression .codn "(ash 1 integer)" : the value 1 shifted left by .meta integer bit positions. If .meta integer is zero, then the result is .codn 1 ; if .meta integer is .codn 1 , the result is .code 2 and so forth. If .meta integer is negative, then the result is zero. If .code mask is called with two or more arguments, then the result is a bitwise or of the masks individually computed for each of the values. In other words, the following equivalences hold:
.verb
  (mask) <--> 0
  (mask a) <--> (ash 1 a)
  (mask a b c ...) <--> (logior (mask a) (mask b) (mask c) ...)
.brev
.coNP Function @ bitset .synb .mets (bitset << integer ) .syne .desc The .code bitset function returns a list of the positions of bits which have a value of 1 in a positive .meta integer argument, or the positions of bits which have a value of zero in a negative .meta integer argument. The positions are ordered from least to greatest. The least significant bit has position zero. If .meta integer is zero, the empty list .code nil is returned. A negative integer is treated as an infinite-bit two's complement representation. The argument may be a character.
If .meta integer .code x is nonnegative, the following equivalence holds: .verb x <--> [apply mask (bitset x)] .brev That is to say, the value of .code x may be reconstituted by applying the bit positions returned by .code bitset as arguments to the .code mask function. The value of a negative .code x may be reconstituted from its .code bitset as follows: .verb x <--> (pred (- [apply mask (bitset x)])) .brev also, more trivially, thus: .verb x <--> (- [apply mask (bitset (- x))]) .brev .coNP Function @ width .synb .mets (width << integer *) .syne .desc A two's complement representation of an integer consists of a sign bit and a mantissa field. The .code width function computes the minimum number of bits required for the mantissa portion of the two's complement representation of the .meta integer argument. For a nonnegative argument, the width also corresponds to the number of bits required for a natural binary representation of that value. Two integer values have a width of zero, namely 0 and -1. This means that these two values can be represented in a one-bit two's complement, consisting of only a sign bit: the one-bit two's complement bitfield 1 denotes -1, and 0 denotes 0. Similarly, two integer values have a width of 1: 1 and -2. The two-bit two's complement bitfield 01 denotes 1, and 10 denotes -2. The argument may be a character. .coNP Function @ logcount .synb .mets (logcount << integer ) .syne .desc The .code logcount function considers .meta integer to have a two's complement representation. If the integer is positive, it returns the count of bits in that representation whose value is 1. If .meta integer is negative, it returns the count of zero bits instead. If .meta integer is zero, the value returned is zero. The argument may be a character. 
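As an illustrative sketch of the definitions of
.codn bitset ,
.code width
and
.codn logcount ,
the following evaluations show how some small values are treated. The results
are derived by hand from the infinite-bit two's complement rule, not taken from
a test run:
.verb
  (bitset 5)     -> (0 2)  ;; 101 has 1 bits at positions 0 and 2
  (bitset -2)    -> (0)    ;; ...11110 has a 0 bit at position 0
  [apply mask (bitset 5)] -> 5
  (width 255)    -> 8
  (width -256)   -> 8      ;; sign bit plus 00000000
  (logcount 7)   -> 3      ;; three 1 bits
  (logcount -1)  -> 0      ;; ...1111 has no 0 bits
  (logcount -2)  -> 1      ;; ...11110 has one 0 bit
.brev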
.coNP Macros @ set-mask and @ clear-mask .synb .mets (set-mask < place << integer *) .mets (clear-mask < place << integer *) .syne .desc The .code set-mask and .code clear-mask macros set to 1 and 0, respectively, the bits in .meta place corresponding to bits that are equal to 1 in the mask resulting from applying the inclusive or operation to the .meta integer arguments. The following equivalences hold:
.verb
  (set-mask place integer ...)
  <--> (set place (logior place integer ...))

  (clear-mask place integer ...)
  <--> (set place (logand place (lognot (logior integer ...))))
.brev
.SS* User-Defined Arithmetic Types \*(TL makes it possible for the user application program to define structure types which can participate in arithmetic operations as if they were numbers. Under most arithmetic functions, a structure object may be used instead of a number, if that structure object implements a specific method which is required by that arithmetic function. The following paragraphs give general remarks about the method conventions. Not all arithmetic and bit manipulation functions have a corresponding method, and a small number of functions do not follow these conventions. In the simplest case of arithmetic functions which are unary, the method takes no argument other than the object itself. Most unary arithmetic functions expect a structure argument to have a method which has the same name as that function. For instance, if .code x is a structure, then .code "(cos x)" will invoke .codn "x.(cos)" . If .code x has no .code cos method, then an .code error exception is thrown. A few unary methods are not named after the corresponding function. The unary case of the .code - function expects an object to have a method named .codn neg ; thus, .code "(- x)" invokes .codn "x.(neg)" . Unary division requires a method called .codn recip ; thus, .code "(/ x)" invokes .codn "x.(recip)" .
When a structure object is used as an argument in a two-argument (binary) arithmetic function, there are several cases to consider. If the left argument to a binary function is an object, then that object is expected to support a binary method. That method is called with two arguments: the object itself, of course, and the right argument of the arithmetic operation. In this case, the method is named after the function. For instance, if .code x is an object, then .code "(+ x 3)" invokes .codn "x.(+ 3)" . If the right argument, and only the right argument, of a binary operation is an object, then the situation falls into two cases depending on whether the operation is commutative. If the operation is commutative, then the same method is used as in the case when the object is the left argument. The arguments are merely reversed. Thus .code "(+ 3 x)" also invokes .codn "x.(+ 3)" . If the operation is not commutative, then the object must supply an alternative method. For most functions, that method is named by a symbol whose name begins with a .code r- prefix. For instance .code "(mod x 5)" invokes .code "x.(mod 5)" whereas .code "(mod 5 x)" invokes .codn "x.(r-mod 5)" . Note: the "r" may be remembered as indicating that the object is the .B right argument of the binary operation or that the arguments are .BR reversed . Two functions do not follow the .code r- convention. These are .code - and .codn / . For these, the methods used for the object as a right argument, respectively, are .code -- and .codn // . Thus .code "(/ 5 x)" invokes .code "x.(// 5)" and .code "(- 5 x)" invokes .codn "x.(-- 5)" . Several binary functions do not support an object as the right argument. These are .codn sign-extend , .code ash and .codn bit . Variadic arithmetic functions, when given three or more arguments, are regarded as performing a left-associative reduction of the arguments through a binary function.
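As a minimal sketch of these unary conventions, the hypothetical structure type
.code angle
below implements
.code neg
and
.code abs
methods. The type name and its
.code deg
slot are invented for this illustration, and the result shown is worked out
from the rules above rather than taken from a test run:
.verb
  ;; hypothetical type: an angle held in degrees
  (defstruct angle nil
    deg
    (:method neg (me) (new angle deg (- me.deg)))
    (:method abs (me) (new angle deg (abs me.deg))))

  (let* ((a (new angle deg 30))
         (n (- a))      ;; unary minus invokes a.(neg)
         (m (abs n)))   ;; invokes n.(abs)
    (list n.deg m.deg))
  -> (-30 30)
.brev
Because
.code angle
provides no
.code cos
method, the call
.code "(cos a)"
would throw an exception of type
.codn error .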
Thus for instance .code "(- 1 x 4)" is understood as .code "(- (- 1 x) 4)" where .code "x.(-- 1)" is evaluated first. If that method yields an object .code o then .code "o.(- 4)" is invoked. Certain variadic arithmetic functions, if invoked with one argument, just return that argument: for instance, .code + and .code * are in this category. A special concession exists in these functions: if their one and only argument is a structure, then that structure is returned without any error checking, even if it implements no methods related to arithmetic. The following sections describe each of the methods that must be implemented by an object for the associated arithmetic function to work with that object, either at all, or in a specific argument position, as the case may be. These methods are not provided by \*(TL; the application is required to provide them. .de bmc . coNP Method @ \\$1 . synb . mets << obj .(\\$1 << arg ) . syne . desc The . code \\$1 method is invoked when a structure is used as an argument to the . code \\$1 function. If an object . meta obj is combined with an argument . metn arg , either as . mono . meti (\\$1 < obj << arg ) . onom or as . mono . meti (\\$1 < arg << obj ) . onom then, effectively, the method call . mono . meti << obj .(\\$1 << arg ) . onom takes place, and its return value is taken as the result of the operation. .. .de bmcv . coNP Method @ \\$1 . synb . mets << obj .(\\$1 << arg ) . syne . desc The . code \\$1 method is invoked when a structure is used as an argument to the . code \\$1 function together with at least one other operand. If an object . meta obj is combined with an argument . metn arg , either as . mono . meti (\\$1 < obj << arg ) . onom or as . mono . meti (\\$1 < arg << obj ) . onom then, effectively, the method call . mono . meti << obj .(\\$1 << arg ) . onom takes place, and its return value is taken as the result of the operation. .. .de bmnl . coNP Method @ \\$1 . synb . mets << obj .(\\$1 << arg ) . syne . 
desc The . code \\$1 method is invoked when the structure . meta obj is used as the left argument of the . code \\$1 function. If an object . meta obj is combined with an argument . metn arg , as . mono . meti (\\$1 < obj << arg ) . onom then, effectively, the method call . mono . meti << obj .(\\$1 << arg ) . onom takes place, and its return value is taken as the result of the operation. .. .de bmnr . coNP Method @ \\$1 . synb . mets << obj .(\\$1 << arg ) . syne . desc The . code \\$1 method is invoked when the structure . meta obj is used as the right argument of the . code \\$2 function. If an object . meta obj is combined with an argument . metn arg , as . mono . meti (\\$2 < arg << obj ) . onom then, effectively, the method call . mono . meti << obj .(\\$1 << arg ) . onom takes place, and its return value is taken as the result of the operation. .. .de umv . coNP Method @ \\$1 . synb . mets << obj .(\\$1) . syne . desc The . code \\$1 method is invoked when the structure . meta obj is used as the sole argument to the . code \\$2 function. If an object . meta obj is passed to the function as . mono . meti (\\$2 << obj ) . onom then, effectively, the method call . mono . meti << obj .(\\$1) . onom takes place, and its return value is taken as the result of the operation. .. .de bma . coNP Method @ \\$1 . synb . mets << obj .(\\$1 << arg ) . syne . desc The . code \\$1 method is invoked when the . code \\$1 function is invoked with two operands, and the structure . meta obj is the left operand. The method is also invoked when the . code \\$2 function is invoked with two operands, and .meta obj is the right operand. If an object . meta obj is combined with an argument . metn arg , either as . mono . meti (\\$1 < obj << arg ) . onom or as . mono . meti (\\$2 < arg << obj ) . onom then, effectively, the method call . mono . meti << obj .(\\$1 << arg ) . onom takes place, and its return value is taken as the result of the operation. .. .de um . coNP Method @ \\$1 . 
synb . mets << obj .(\\$1) . syne . desc The . code \\$1 method is invoked when a structure is used as the argument to the . code \\$1 function. If an object . meta obj is passed to the function as . mono . meti (\\$1 << obj ) . onom then, effectively, the method call . mono . meti << obj .(\\$1) . onom takes place, and its return value is taken as the result of the operation. .. .de tmnl . coNP Method @ \\$1 . synb . mets << obj .(\\$1 < arg1 << arg2 ) . syne . desc The . code \\$1 method is invoked when the structure . meta obj is used as the left argument of the . code \\$1 function. If an object . meta obj is combined with arguments . meta arg1 and . metn arg2 , as . mono . meti (\\$1 < obj < arg1 << arg2 ) . onom then, effectively, the method call . mono . meti << obj .(\\$1 < arg1 << arg2 ) . onom takes place, and its return value is taken as the result of the operation. .. .um tofloat The method should return a floating-point value. It is also permissible for the method to return .codn nil , in which case if it is invoked via .codn tofloatz , that function will replace the .code nil return with a value of 0.0. .um toint The method should return an integer value. It is permissible for the method to return .codn nil , in which case if it is invoked via .codn tointz , that function will replace the .code nil return with a value of 0. .bmcv + .bmnl - .bmnr -- - .umv neg - .bmcv * .bmnl / .bmnr // / .umv recip / .um abs .um signum .bmnl trunc .bmnr r-trunc trunc .umv trunc1 trunc .bmnl mod .bmnr r-mod mod .bmnl expt .bmnr r-expt expt .tmnl exptmod Note: the .code exptmod function doesn't support structure objects in the second and third argument positions. The .meta exponent and .meta modulus arguments must be integers.
.um isqrt .um square .bma > < .bma < > .bma >= <= .bma <= >= .bmc = .um zerop .um plusp .um minusp .um evenp .um oddp .bmnl floor .bmnr r-floor floor .umv floor1 floor .bmnl ceil .bmnr r-ceil ceil .umv ceil1 ceil .bmnl round .bmnr r-round round .umv round1 round .um sin .um cos .um tan .um asin .um acos .um atan .bmnl atan2 .bmnr r-atan2 atan2 .um sinh .um cosh .um tanh .um asinh .um acosh .um atanh .um log .um log2 .um log10 .um exp .um sqrt .bmcv logand .bmcv logior .bmnl lognot .bmnr r-lognot lognot .umv lognot1 lognot .bmnl logtrunc .bmnr r-logtrunc logtrunc .bmnl sign-extend .um cbrt .um erf .um erfc .um exp10 .um exp2 .um expm1 .um gamma .um j0 .um j1 .um lgamma .um log1p .um logb .um nearbyint .um rint .um significand .um tgamma .um y0 .um y1 .bmnr r-copysign copysign .bmnr r-drem drem .bmnr r-fdim fdim .bmnr r-fmax fmax .bmnr r-fmin fmin .bmnr r-hypot hypot .bmnr r-jn jn .bmnr r-ldexp ldexp .bmnr r-nextafter nextafter .bmnr r-remainder remainder .bmnr r-scalb scalb .bmnr r-scalbln scalbln .bmnr r-yn yn Note: the .code sign-extend function doesn't support a structure as the right argument, .metn bits , which must be an integer. .bmnl ash Note: the .code ash function doesn't support a structure as the right argument, .metn bits , which must be an integer. .bmnl bit Note: the .code bit function doesn't support a structure as the right argument, .metn bit , which must be an integer. .um width .um logcount .um bitset .SS* Exception Handling An .I exception in \*(TX is a special event in the execution of the program which potentially results in a transfer of control. An exception is identified by a symbol, known as the .IR "exception type" , and it carries zero or more arguments, called the .IR "exception arguments" . When an exception is initiated, it is said to be .IR thrown . This action is initiated by the following functions: .codn throw , .code throwf and .codn error , and possibly other functions which invoke these. 
When an exception is thrown, \*(TX enters into exception processing mode. Exception processing mode terminates in one of several ways: .IP - A .I catch is found which matches the exception, and control is transferred to the catch by a nonlocal transfer which performs unwinding. Catches are defined by the .code catch macro. .IP - A .I handler is found which matches the exception, and control is transferred to the handler by invoking its function. The handler function accepts the exception by performing a nonlocal transfer to a destination of its choice, or else declines to accept the exception by returning. Handlers are defined by the .code handler-bind operator or .code handle macro. .IP - If no catch or accepting handler is found for an exception derived from .code error and .code *unhandled-hook* is .codn nil , then a built-in strategy for handling the exception is invoked, consisting of unwinding, and then printing some informational messages and terminating. If the .code *unhandled-hook* variable contains a value that isn't .codn nil , then control is transferred to the function stored in that variable first; only if that function returns is the above built-in strategy invoked. .IP - If no catch or accepting handler is found for an exception derived from .codn warning , then a warning diagnostic is issued on the .code *stderr* stream and a .code continue exception is thrown with no arguments. If no catch or handler is found for that exception, then control returns normally to the site which threw the warning exception. .IP - If no catch or accepting handler is found for an exception that is neither derived from .code error nor from .codn warning , then no control transfer takes place; control returns to the .code throw or .code throwf function which returns normally, with a return value of .codn nil . .PP .NP* Catches and Handlers There are two ways by which exceptions are handled: catches and handlers. Catches and handlers are similar, but different.
A catch is an exit point associated with an active scope. When an exception is handled by a catch, the form which threw the exception is abandoned, and unwinding takes place to the catch site, which receives the exception type and arguments. A handler is also associated with an active scope. However, it is a function, and not a dynamic exit point. When an exception is passed to a handler, unwinding does not take place; rather, the function is called. The function then either completes the exception handling by performing a nonlocal transfer, or else declines the exception by performing an ordinary return. Catches and handlers are identified by exception type symbols. A catch or handler is eligible to process an exception if it handles a type which is a supertype of the exception which is being processed. Handlers and catches are found by means of a combined search which proceeds from the innermost nesting of dynamic scope to the outermost, without performing any unwinding. When an eligible handler is encountered, its registered function is called, thereby suspending the search. If the handler function returns, the search continues from that scope to yet unvisited outer scopes. When an eligible catch is encountered rather than a handler, the search terminates and a control transfer takes place to the catch site. That control transfer then performs unwinding, which requires it to make a second pass through the same nestings of dynamic scope that had just been traversed in order to find that catch. .NP* Handlers and Sandboxing Because handlers execute in the dynamic context of the exception origin, without any unwinding having taken place, they expose a potential route of sandbox escape via the package system, unless special steps are taken. The threat is that code at the handler site could take advantage of the current value of the .code *package* and .code *package-alist* variables established at the exception throw site to gain inappropriate access to symbols.
For this reason, when a handler is established, the current values of .code *package* and .code *package-alist* are recorded into the handler frame. When that handler is later invoked, it executes in a dynamic environment in which those variables are bound to the previously noted values. The catch mechanism doesn't do any such thing because the unwinding which is performed prior to the invocation of a catch implicitly restores the values of .B all special variables to the values they had at the time the frame was established. .NP* Exception Type Hierarchy Exception type symbols are arranged in an inheritance hierarchy, at whose top the symbol .code t is the supertype of every exception type, and the .code nil symbol is at the bottom, the subtype of every exception type. Keyword symbols may be used as exception types. Every symbol is its own supertype and subtype. Thus whenever X is known to be a subtype of Y, it is possible that X is exactly Y. The .code defex macro registers exception supertype/subtype relationships among symbols. The following tree diagram shows the relationships among \*(TL's built-in exception symbols. Not shown is the exception symbol .codn nil , subtype of every exception type:
.verb
  t ----+--- warning
        |
        +--- restart ---+--- continue
        |               |
        |               +--- retry
        |               |
        |               +--- skip
        |
        +--- error ---+--- type-error
                      |
                      +--- internal-error
                      |
                      +--- panic
                      |
                      +--- numeric-error
                      |
                      +--- range-error
                      |
                      +--- query-error
                      |
                      +--- file-error -------+--- path-not-found
                      |                      |
                      |                      +--- path-exists
                      |                      |
                      |                      +--- path-permission
                      |
                      +--- process-error
                      |
                      +--- socket-error
                      |
                      +--- system-error
                      |
                      +--- alloc-error
                      |
                      +--- timeout-error
                      |
                      +--- assert
                      |
                      +--- syntax-error
                      |
                      +--- eval-error
                      |
                      +--- match-error
                      |
                      +--- case-error
                      |
                      +--- opt-error
.brev
Program designers are encouraged to derive new error exceptions from the .code error type.
The .code restart type is intended to be the root of a hierarchy of exception types used for denoting restart points: designers are encouraged to derive restarts from this type. A catch for the .code continue exception should be established around constructs which can throw an error from which it is possible to recover. That exception provides the entry point into the recovery which resumes execution. A catch for .code retry should be provided in situations when it is possible and makes sense for a failed operation to be tried again. A catch for .code skip should be provided in situations when it is possible and sensible to continue with subsequent operations even though an operation has failed. .NP* Dialect Notes Exception handling in \*(TL provides capabilities similar to the condition system in ANSI Common Lisp. The implementation and terminology differ. Most obviously, ANSI CL uses the "condition" term, whereas \*(TL uses "exception". In ANSI CL, a condition is "raised", whereas a \*(TL exception is "thrown". In ANSI CL, when a condition is raised, a condition object is created. Condition objects are similar to class objects, but are not required to be in the Common Lisp Object System. They are related by inheritance and can have properties. \*(TL exceptions are unencapsulated: they consist of a symbol, plus zero or more arguments. The symbols are related by inheritance. When a condition is raised in ANSI CL, the dynamic scope is searched for a handler, which is an ordinary function which receives the condition. No unwinding or nonlocal transfer takes place. The handler can return, in which case the search continues. Matching the condition to the handler is by inheritance. Handler functions are bound to condition type names. If a handler chooses to actually handle a condition (thereby terminating the search) it must itself perform some kind of dynamic control transfer, rather than return normally.
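As a brief sketch of this recommendation, the hypothetical exception types
.code db-error
and
.code db-timeout
below (names invented for this illustration) are derived from
.code error
with
.codn defex ,
so that a catch clause for
.code error
also intercepts them. The result shown follows from the subtype rules above and
is not taken from a test run:
.verb
  (defex db-error error)       ;; db-error is a subtype of error
  (defex db-timeout db-error)  ;; db-timeout is a subtype of db-error

  (catch
    (throw 'db-timeout "query took too long")
    (error (msg) (list 'caught msg)))
  -> (caught "query took too long")
.brev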
ANSI CL provides a dynamic control mechanism known as restarts which is usually used for this purpose. A condition handler may invoke a particular restart handler. Restart handlers are similar to exception handlers: they are functions associated with symbols in the dynamic environment. In \*(TL, the special behavior which occurs for exceptions derived from .code error and those from .code warning is built into the exception handling system, and tied to those types. When an error or warning exception is unhandled, the exception handling system itself reacts, so the special behaviors occur no matter how these exceptions are raised. In ANSI CL, the special behavior for unhandled .code error conditions (of invoking the debugger) is implemented only in the .code error function; .code error conditions signalled other than via that function are not subject to any special behavior. There is a parallel situation with regard to warnings: the ANSI CL .code warn function implements a special behavior for unhandled warnings (of emitting a diagnostic) but warnings not signalled via that function are not treated that way. Thus in \*(TL, there is no way to raise an error or warning that is simply ignored due to being unhandled. In \*(TL exceptions are a unification of conditions and restarts. From an ANSI CL perspective, \*(TL exceptions are a lot like CL restarts, except that the symbols are arranged in an inheritance hierarchy. \*(TL exceptions are used both as the equivalent of ANSI CL conditions and as restarts. In \*(TL the terminology "catch" and "handle" is used in a specific way. To handle an exception means to receive it without unwinding, with the possibility of declining to handle it, so that the search continues for another handler. To catch an exception means to match an exception to a catch handler, terminate the search, unwind and pass control to the handler. \*(TL provides an operator called .code handler-bind for specifying handlers. 
It has a different syntax from ANSI CL's .codn handler-bind . \*(TL provides a macro called .code handle which simplifies the use of .codn handler-bind . This macro superficially resembles ANSI CL's .codn handler-case , but is semantically different. The most notable difference is that the bodies of handlers established by .code handle execute without any unwinding taking place and may return normally, thereby declining to take the exception. In other words, .code handle has the same semantics as .codn handler-bind , providing only convenient syntax. \*(TL provides a macro called .code catch which has the same syntax as .code handle but specifies a catch point for exceptions. If, during an exception search, a .code catch clause matches an exception, a dynamic control transfer takes place from the throw site to the catch site. Then the clause body is executed. The .code catch macro resembles ANSI CL's .code restart-case or possibly .codn handler-case , depending on point of view. \*(TL provides unified introspection over handler and catch frames. A program can programmatically discover what handlers and catches are available in a given dynamic scope. ANSI CL provides introspection over restarts only; the standard doesn't specify any mechanism for inquiring what condition handlers are bound at a given point in the execution. .TP* Example: The following two examples express a similar approach implemented using ANSI Common Lisp conditions and restarts, and then using \*(TL exceptions.
.verb
  ;; Common Lisp
  (define-condition foo-error (error)
    ((arg :initarg :arg :reader foo-error-arg)))

  (defun raise-foo-error (arg)
    (restart-case
      (let ((c (make-condition 'foo-error :arg arg)))
        (error c))
      (recover (recover-arg)
        (format t "recover, arg: ~s~%" recover-arg))))

  (handler-bind ((foo-error
                   (lambda (cond)
                     (format t "handling foo-error, arg: ~s~%"
                             (foo-error-arg cond))
                     (invoke-restart 'recover 100))))
    (raise-foo-error 200))
.brev
The output of the above is:
.verb
  handling foo-error, arg: 200
  recover, arg: 100
.brev
The following is a possible \*(TL equivalent for the above Common Lisp example. It produces identical output.
.verb
  (defex foo-error error)

  (defex recover restart) ;; recommended practice

  (defun raise-foo-error (arg)
    (catch
      (throw 'foo-error arg)
      (recover (recover-arg)
        (format t "recover, arg: ~s\en" recover-arg))))

  (handle
    (raise-foo-error 200)
    (foo-error (arg)
      (format t "handling foo-error, arg: ~s\en" arg)
      (throw 'recover 100)))
.brev
To summarize the differences: exceptions serve as both conditions and restarts in \*(TX. The same .code throw function is used to initiate exception handling for .code foo-error and then to transfer control out of the handler to the recovery code. The handler accepts one exception by raising another. When an exception symbol is used for restarting, it is a recommended practice to insert it into the inheritance hierarchy rooted at the .code restart symbol, either by inheriting directly from .code restart or from an exception subtype of that symbol. .coNP Treatment of @ errno In Built-in Exceptions Some \*(TL library functions generate exceptions in response to conditions arising in the operating system, and those conditions are associated with a numeric code in the POSIX/ISO C variable .codn errno . This code isn't represented as an exception argument. Rather, in many of these situations, the .code errno value is attached to the error message string which is passed as the first and only exception argument.
The value can be retrieved by using the function .code string-get-code on the error message string. If this function returns .codn nil , then no such code is available in connection with the given error. .TP* Example:
.verb
  (catch
    (open-file "AsDf")
    (error (msg)
      ;; the value 2 is retrieved from msg
      ;; 2 is the common value of ENOENT
      (list (string-get-code msg) msg)))
  -> (2 "error opening \e"AsDf\e": 2/\e"No such file or directory\e"")
.brev
.coNP Functions @, throw @ throwf and @ error .synb .mets (throw < symbol << arg *) .mets (throwf < symbol < format-string << format-arg *) .mets (error < format-string << format-arg *) .syne .desc These functions generate an exception. The .code throw and .code throwf functions generate an exception identified by .metn symbol , whereas .code error throws an exception of type .codn error . The call .code "(error ...)" can be regarded as a shorthand for .codn "(throwf 'error ...)" . The .code throw function takes zero or more additional arguments. These arguments become the arguments of a .code catch handler which takes the exception. The handler will have to be capable of accepting that number of arguments. The .code throwf and .code error functions generate an exception which has a single argument: a character string created by a formatted print to a string stream using the .code format string and additional arguments. Because .code error throws an error exception, it does not return. If an error exception is not handled, \*(TX will issue diagnostic messages and terminate. Likewise, if .code throw or .code throwf are used to generate an error exception, they do not return. If the .code throw and .code throwf functions are used to generate an exception not derived from .codn error , and no handler is found which accepts the exception, they return normally, with a value of .codn nil .
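The following sketch contrasts the three functions. The exception symbol
.code my-event
is a hypothetical name chosen for this illustration, and the results are
derived from the description above rather than taken from a test run:
.verb
  ;; throw: the extra arguments become the clause arguments
  (catch
    (throw 'my-event 1 2)
    (my-event (a b) (list a b)))
  -> (1 2)

  ;; throwf: a single argument, the formatted string;
  ;; file-error is a subtype of error, so the clause matches
  (catch
    (throwf 'file-error "cannot open ~a" "data.txt")
    (error (msg) msg))
  -> "cannot open data.txt"

  ;; error: shorthand for (throwf 'error ...)
  (catch
    (error "bad value: ~s" 42)
    (error (msg) msg))
  -> "bad value: 42"

  ;; not derived from error, and no catch or handler:
  ;; throw returns normally
  (throw 'my-event 1 2) -> nil
.brev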
.coNP Macros @, catch @ catch* and @ catch** .synb .mets (catch < try-expression .mets \ \ >> {( symbol <> ( arg *) << body-form *)}*) .mets (catch* < try-expression .mets \ \ >> {( symbol >> ( type-arg << arg *) << body-form *)}*) .mets (catch** < try-expression .mets \ \ >> {( symbol < desc >> ( type-arg << arg *) << body-form *)}*) .syne .desc The .code catch macro establishes an exception catching block around the .metn try-expression . The .meta try-expression is followed by zero or more catch clauses. Each catch clause consists of a symbol which denotes an exception type, an argument list, and zero or more body forms. If .meta try-expression terminates normally, then the catch clauses are ignored. The catch itself terminates, and its return value is that of the .metn try-expression . If .meta try-expression throws an exception which is a subtype of one or more of the type symbols given in the exception clauses, then the first (leftmost) such clause becomes the exit point where the exception is handled. The exception is converted into arguments for the clause, and the clause body is executed. When the clause body terminates, the catch terminates, and the return value of the catch is that of the clause body. If .meta try-expression throws an exception which is not a subtype of any of the symbols given in the clauses, then the search for an exit point for the exception continues through the enclosing forms. The catch clauses are not involved in the handling of that exception. When a clause catches an exception, the number of arguments in the catch must match the number of elements in the exception. A catch argument list resembles a function or lambda argument list, and may be dotted. For instance the clause .code "(foo (a . b))" catches an exception subtyped from .codn foo , with one or more elements. The first element binds to parameter .codn a , and the rest, if any, bind to parameter .codn b . 
If there is only one element, .code b takes on the value .codn nil . The .code catch* macro is a variant of .code catch with the following difference: when .code catch* invokes a clause, it passes the exception symbol as the leftmost argument .metn type-arg . Then the exception arguments follow. In contrast, only the exception arguments are passed to the clauses of .codn catch . The .code catch** macro is a further variant, which differs from .code catch* by requiring each catch clause to provide a description .metn desc , an expression which evaluates to a character string. The .meta desc expressions are evaluated in left-to-right order prior to the evaluation of .metn try-expression . Also see: the .code unwind-protect operator, and the functions .codn throw , .code throwf and .codn error , as well as the .code handler-bind operator and .code handle macro. .coNP Operator @ unwind-protect .synb .mets (unwind-protect < protected-form << cleanup-form *) .syne .desc The .code unwind-protect operator evaluates .meta protected-form in such a way that no matter how the execution of .meta protected-form terminates, the .metn cleanup-form s will be executed. The .metn cleanup-form s, however, are not protected. If a .meta cleanup-form terminates via some nonlocal jump, the subsequent .metn cleanup-form s are not evaluated. .metn cleanup-form s themselves can "hijack" a nonlocal control transfer such as an exception. If a .meta cleanup-form is evaluated during the processing of a dynamic control transfer such as an exception, and that .meta cleanup-form initiates its own dynamic control transfer, the original control transfer is aborted and replaced with the new one. The exit points for dynamic control transfers are removed as unwinding takes place. That is to say, at the start of a dynamic control transfer, a search takes place for the target exit point. That search might skip other exit points which aren't targets of the control transfer. 
Those skipped exit points are left undisturbed and are still visible during unwinding until their individual binding forms are abandoned. Thus at the time of execution of an .code unwind-protect .metn cleanup-form , all of the exit points of dynamically surrounding forms are still visible, even ones which are nearer than the targeted exit point. .TP* Example: .verb (block foo (unwind-protect (progn (return-from foo 42) (format t "not reached!\en")) (format t "cleanup!\en"))) .brev In this example, the protected .code progn form terminates by returning from block .codn foo . Therefore the form does not complete and so the output .str not reached! is not produced. However, the cleanup form executes, producing the output .strn cleanup! . .coNP Macro @ ignerr .synb .mets (ignerr << form *) .syne .desc The .code ignerr macro operator evaluates each .meta form similarly to the .code progn operator. If no forms are present, it returns .codn nil . Otherwise it evaluates each .meta form in turn, yielding the value of the last one. If the evaluation of any .meta form is abandoned due to an exception of type .codn error , the code generated by the .code ignerr macro catches this exception. In this situation, the execution of the .code ignerr form terminates without evaluating the remaining forms, and yields .codn nil . .coNP Macro @ ignwarn .synb .mets (ignwarn << form *) .syne .desc The .code ignwarn macro resembles .codn ignerr . It arranges for the evaluation of each .meta form in left-to-right order. If all the forms are evaluated, then the value of the last one is returned. If no forms are present, then .code nil is returned. If any .meta form throws an exception of type .code warning then this exception is intercepted by a handler established by .codn ignwarn . This handler reacts by throwing an exception of type .codn continue . The effect is that the warning is ignored, since the handler doesn't issue any diagnostic, and passes control to the warning's continue point. 
Note: all sites within \*(TX which throw a .code warning also provide a nearby catch for a .code continue exception, for resuming evaluation at the point where the warning was issued. .coNP Operator @ handler-bind .synb .mets (handler-bind < function-form < symbol-list << body-form *) .syne .desc The .code handler-bind operator establishes a handler for one or more exception types, and evaluates zero or more .metn body-form s in a dynamic scope in which that handler is visible. When the .code handler-bind form terminates normally, the handler is removed. The value of the last .meta body-form is returned, or else .code nil if there are no forms. The .meta function-form argument is an expression which must evaluate to a function. The function must be capable of accepting the exception arguments. All exception handler functions require at least one argument, since the leftmost argument in an exception handler call is the exception type symbol. The .meta symbol-list argument is a list of symbols, not evaluated. If it is empty, then the handler isn't eligible for any exceptions. Otherwise it is eligible for any exception whose exception type is a subtype of any of the symbols. If the evaluation of any .meta body-form throws an exception which is not handled within that form, and the handler is eligible for that exception, then the function is invoked. It receives the exception's type symbol as the leftmost argument. If the exception has arguments, they appear as additional arguments in the function call. If the function returns normally, then the exception search continues. The handler remains established until the exception is handled in such a way that a dynamic control transfer abandons the .code handler-bind form. Note: while a handler's function is executing, the handler is disabled. If the function throws an exception for which the handler is eligible, the handler will not receive that exception; it will be skipped by the exception search as if it didn't exist.
When the handler function terminates, either via a normal return or a nonlocal control transfer, then the handler is reenabled. .coNP Macros @ handle and @ handle* .synb .mets (handle < try-expression .mets \ \ >> {( symbol <> ( arg *) << body-form *)}*) .mets (handle* < try-expression .mets \ \ >> {( symbol >> ( type-arg << arg *) << body-form *)}*) .syne .desc The .code handle macro is syntactic sugar for the .code handler-bind operator. Its syntax is exactly like that of .codn catch . The difference between .code handle and .code catch is that the clauses in .code handle are invoked without unwinding. That is to say, .code handle does not establish an exit point for an exception. When control passes to a clause, it is by means of an ordinary function call and not a dynamic control transfer. No evaluation frames are yet unwound when this takes place. The .code handle macro establishes a handler, by .code handler-bind whose .meta symbol-list consists of every .meta symbol gathered from every clause. The handler function established in the generated .code handler-bind is synthesized from all of the clauses, together with dispatch logic which passes the exception and its arguments to the first eligible clause. The .meta try-expression is evaluated in the context of this handler. The clause of the .code handle syntax can return normally, like a function, in which case the handler is understood to have declined the exception, and exception processing continues. To handle an exception, the clause of the .code handle macro must perform a dynamic control transfer, such as returning from a block via .code return or throwing an exception. The .code handle* macro is a variant of .code handle with the following difference: when .code handle* invokes a clause, it passes the exception symbol as the leftmost argument .metn type-arg . Then the exception arguments follow. In contrast, only the exception arguments are passed to the clauses of .codn handle .
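.TP* Example:
The following sketch contrasts a
.code handle
clause which declines an exception with one which handles it, as described above. The exception symbol
.code my-error
and the block name
.code done
are hypothetical names chosen for illustration.
.verb
  (defex my-error error)

  ;; The handle clause returns normally, and so declines the
  ;; exception. The search continues outward and finds the catch.
  (catch
    (handle (throw 'my-error "oops")
      (my-error (msg) (put-line `handle saw: @msg`)))
    (my-error (msg) (put-line `catch got: @msg`)))

  Output:

  handle saw: oops
  catch got: oops

  ;; The handle clause handles the exception by performing a
  ;; dynamic control transfer out of the enclosing block.
  (block done
    (handle (progn (throw 'my-error "oops")
                   (put-line "not reached"))
      (my-error (msg) (return-from done msg))))
  -> "oops"
.brev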
.coNP Macro @ with-resources .synb .mets (with-resources >> ({( sym >> [ init-form <> [ cleanup-form *]])}*) .mets \ \ << body-form *) .syne .desc The .code with-resources macro provides a sequential binding construct similar to .codn let* . Every .meta sym is established as a variable which is visible to the .metn init-form s of subsequent variables, to all subsequent .metn cleanup-form s including that of the same variable, and to the .metn body-form s. If no .meta init-form is supplied, then .meta sym is bound to the value .codn nil . If an .meta init-form is supplied, but no .metn cleanup-form s, then .meta sym is bound to the value of the .metn init-form . If one or more .metn cleanup-form s are supplied in addition to .metn init-form , they specify forms to be executed upon the termination of the .code with-resources construct. When an instance of .code with-resources terminates, either normally or by a nonlocal control transfer, then for each .meta sym whose .meta init-form had executed, thus causing that .meta sym to be bound to a value, the .metn cleanup-form s corresponding to .meta sym are evaluated in the usual left-to-right order. The .metn sym s are cleaned up in reverse (right-to-left) order. The .metn cleanup-form s of the most recently bound .meta sym are processed first; those of the least recently bound .meta sym are processed last. When the .code with-resources form terminates normally, the value of the last .meta body-form is returned, or else .code nil if no .metn body-form s are present. .TP* Note: From its inception, until \*(TX 265, .code with-resources featured an undocumented behavior. Details are given in the COMPATIBILITY section's Compatibility Version Values subsection, in the notes for compatibility value 265. 
.TP* "Example:" The following expression opens a text file and prints each of its lines, while ensuring that the stream is closed when the form terminates:
.verb
  (with-resources ((f (open-file "/etc/motd") (close-stream f)))
    (whilet ((l (get-line f)))
      (put-line l)))
.brev
Note that a better way to initialize exactly one stream resource is with the .code with-stream macro, which implicitly closes the stream when it terminates. .coNP Special Variable @ *unhandled-hook* .desc The .code *unhandled-hook* variable is initialized with .code nil by default. It may instead be assigned a function which is capable of taking three arguments. When an exception occurs which has no handler, this function is called, with the following arguments: the exception type symbol, the exception object, and a third value which is either .code nil or else the form which was being evaluated when the exception was thrown. The call occurs before any unwinding takes place. If the variable is .codn nil , or isn't a function, or the function returns after being called, then unwinding takes place, after which some informational messages are printed about the exception, and the process exits with a failed termination status. In the case when the variable contains an object other than .code nil which isn't a function, a diagnostic message is printed on the .code *stderr* stream prior to unwinding. Prior to the function being called, the .code *unhandled-hook* variable is reset to .codn nil . Note: the functions .code source-loc or .code source-loc-str may be applied to the third argument of the .code *unhandled-hook* function to obtain more information about the form. .coNP Macro @ defex .synb .mets (defex <> { symbol }*) .syne .desc The macro .code defex records hierarchical relationships among symbols, for the purposes of the use of those symbols as exceptions. It is closely related to the .code @(defex) directive in the \*(TX pattern language, performing the same function.
All symbols are considered to be exception subtypes, and every symbol is implicitly its own exception subtype. This macro does not introduce symbols as exception types; it only introduces subtype-supertype relationships. If .code defex is invoked with no arguments, it has no effect. If arguments are present, they must be symbols. If .code defex is invoked with only one symbol as its argument, it has no effect. At least two symbols must be specified for a useful effect to take place. If exactly two symbols are specified, then, subject to error checks, .code defex makes the left symbol an .I "exception subtype" of the right symbol. This behavior generalizes to three or more arguments: if three or more symbols are specified, then each symbol other than the last is registered as a subtype of the symbol which follows. If a .code defex has three or more arguments, they are processed from left to right. If errors are encountered during the processing, the correct registrations already made for prior arguments remain in place. Every symbol is implicitly considered to be its own exception subtype, therefore it is erroneous to explicitly register a symbol as its own subtype. The symbol .code nil is implicitly a subtype of every exception type. Therefore, it is erroneous to attempt to specify it as a supertype in a registration. Using .code nil as a subtype in a registration is silently permitted, but has no effect. No explicit registration is recorded between .code nil and its successor in the argument list. The symbol .code t is implicitly the supertype of every exception type. Therefore, it is erroneous to attempt to register it as an exception subtype. Using .code t as a supertype in a registration is also erroneous. A symbol .code a may not be registered as a subtype of a symbol .code b if the reverse relationship already exists between those two symbols. 
The foregoing rules allow redefinitions to take place, while forbidding cycles from being created in the exception subtype inheritance graph. Keyword symbols may be used as exception types. .coNP Function @ register-exception-subtypes .synb .mets (register-exception-subtypes <> { symbol }*) .syne .desc The .code register-exception-subtypes function constitutes the underlying implementation for the .code defex macro. The following equivalence applies: .verb (defex a b ...) <--> (register-exception-subtypes 'a 'b ...) .brev That is, the .code defex macro works as if by generating a call to the function, with the arguments quoted. The semantics of the function is precisely that of the macro. .coNP Function @ exception-subtype-p .synb .mets (exception-subtype-p < left-symbol << right-symbol ) .syne .desc The .code exception-subtype-p function tests whether two symbols are in a relationship as exception types, such that .meta left-symbol is a direct or indirect exception subtype of .metn right-symbol . If that is the case, then .code t is returned, otherwise .codn nil . .coNP Function @ exception-subtype-map .synb .mets (exception-subtype-map) .syne .desc The .code exception-subtype-map function returns a tree structure which captures information about all registered exception types. The map appears as an association list which contains an entry for every exception symbol, paired with that type's supertype path. The first element in the supertype path is the exception's immediate supertype. The next element is that type's supertype and so on. The last element in every path is the grand supertype .codn t . 
For instance, if only the types .codn a , .code b and .code c existed in the system, and were linked according to this inheritance graph:
.verb
  t ----+--- b --- a
        |
        +--- c
.brev
such that the supertype of .code b and .code c is .codn t , and .code a has .code b as supertype, then the function might return:
.verb
  ((a b t) (b t) (c t) (t))
.brev
or any other equivalent permutation. The returned list may share substructure, so that the .code "(t)" sublist is shared among all four entries, and .code "(b t)" between the first two. If the program alters the tree structure returned by .codn exception-subtype-map , the consequences are unspecified; this structure may be the actual object which represents the type hierarchy. .coNP Structures @, frame @ catch-frame and @ handle-frame .synb .mets (defstruct frame nil) .mets (defstruct catch-frame frame types desc jump) .mets (defstruct handle-frame frame types fun) .syne .desc The structure types .codn frame , .code catch-frame and .code handle-frame are used by the .code get-frames and .code find-frame functions to represent information about the currently established exception catches (see the .code catch macro) and handlers (see .code handler-bind and .codn handle ). The .code frame type serves as the common base for .code catch-frame and .codn handle-frame . Modifying any of the slots of these structures has no effect on the actual frame from which they are derived; the frame structures are only a representation which provides information about frames. They are not the actual frames themselves. Both .code catch-frame and .code handle-frame have a .code types slot. This holds the list of exception type symbols which are matched by the catch or handler. The .code desc slot of a .code catch-frame holds a list of the descriptions produced by the .code catch** macro. If there are no descriptions, then this member is .codn nil , otherwise it is a list whose elements are in correspondence with the list in the .code types slot.
The .code jump slot of a .code catch-frame is an opaque .code cptr ("C pointer") object which is related to the stack address of the catch frame. If it is altered, the catch frame object becomes invalid for the purposes of .codn invoke-catch . The .code fun slot of a .code handle-frame is the registered handler function. Note that all the clauses of a .code handle macro are compiled to a single function, which is established via .codn handler-bind , so an instance of the .code handle macro corresponds to a single .codn handle-frame . .coNP Function @ get-frames .synb .mets (get-frames) .syne .desc The .code get-frames function inquires the current dynamic environment in order to retrieve information about established exception catch and handler frames. The function returns a list, ordered from the innermost nesting level to the outermost nesting, of structure objects derived from the .code frame structure type. The list contains two kinds of objects: structures of type .code catch-frame and of type .codn handle-frame . These objects are not the frames themselves, but only provide information about frames. Modifying the slots in these structures has no effect on the original frames. Also, these structures have their own lifetime and can endure after the original frames have disappeared. This has implications for the use of the .code invoke-catch function. The .code handle-frame structures have a .code fun slot, which holds a function. It may be invoked directly. A .code catch-frame structure may be passed as an argument to the .code invoke-catch function. .coNP Functions @ find-frame and @ find-frames .synb .mets (find-frame >> [ exception-symbol <> [ frame-type ]]) .mets (find-frames >> [ exception-symbol <> [ frame-type ]]) .syne .desc The .code find-frame function locates the first (innermost) instance of a specific kind of exception frame (a catch frame or a handler frame) which is eligible for processing an exception of a specific type. 
If such a frame is found, it is returned. The returned frame object is of the same kind as the objects which comprise the list returned by the function .codn get-frames . If such a frame is not found, .code nil is returned. The .meta exception-symbol argument specifies a match by exception type: the candidate frame must specify in its list of matches at least one type which is an exception supertype of .metn exception-symbol . If this argument is omitted, it defaults to .code nil which finds any handler that matches at least one type. There is no way to search for handlers which match an empty set of types; the .code find-frame function skips such frames. The .meta frame-type argument specifies which frame type to find. Useful values for this argument are the structure type names .code catch-frame and .code handle-frame or the actual structure type objects which these type names denote. If any other value is specified, the function returns .codn nil . If the argument is omitted, it defaults to the type of the .code catch-frame structure. That is to say, by default, the function looks for catch frames. Thus, if .code find-frame is called with no arguments at all it finds the innermost catch frame, if any exists, or else returns .codn nil . The .code find-frames function is similar to .code find-frame except that it returns all matching frames, ordered from the innermost nesting level to the outermost nesting. If called with no arguments, it returns a list of the catch frames. .coNP Function @ invoke-catch .synb .mets (invoke-catch < catch-frame < symbol << argument *) .syne .desc The .code invoke-catch function abandons the current evaluation context to perform a nonlocal control transfer directly to the catch described by the .meta catch-frame argument, which must be a structure of type .code catch-frame obtained using any of the functions .codn get-frames , .code find-frames or .codn find-frame . 
The control transfer is possible only if the catch frame represented by the .meta catch-frame structure is still established, and if the structure hasn't been tampered with. If a given .code catch-frame structure is usable with .codn invoke-catch , then a copy of that structure made with .code copy-struct is also usable, denoting the same catch frame. The .meta symbol argument should be an exception symbol. It is passed to the exception frame, as if it had appeared as the first argument of the .code throw function. Similarly, the .metn argument s are passed to the catch frame as if they were the trailing arguments of a .codn throw . The difference between .code invoke-catch and .code throw is that .code invoke-catch targets a specific catch frame as its exit point, rather than searching for a matching catch or handler frame. That specific frame receives the control. The frame receives control even if it is not otherwise eligible for catching the exception type denoted by .metn symbol . .coNP Macro @ assert .synb .mets (assert < expr >> [ format-string << format-arg *]) .syne .desc The .code assert macro evaluates .metn expr . If .meta expr yields any true value, then .code assert terminates normally, and that value is returned. If instead .meta expr yields .codn nil , then .code assert throws an exception of type .codn assert . The exception carries an informative character string that contains a diagnostic detailing the expression which yielded .codn nil , and the source location of that expression, if available. If the .meta format-string and possibly additional format arguments are given to .code assert then those arguments are used to format additional text which is appended to the diagnostic message after a separating character such as a colon. .SS* Static Error Diagnosis This section describes a number of features related to the diagnosis of errors during the static processing of program code prior to evaluation.
The material is of interest to developers of macros intended for broad reuse. .NP* Error Exceptions \*(TL uses exceptions of type .code eval-error to identify erroneous situations during both transformation of code and its evaluation. These exceptions have one argument, which is a character string. If not handled by program code, .code eval-error exceptions are specially recognized and treated by the built-in handling logic. The message is incorporated into diagnostic output which includes more information which is deduced. .NP* Warning Exceptions \*(TL uses exceptions of type .code warning to identify certain situations of interest. Ordinary non-deferrable warnings have a structure identical to errors, except for the exception symbol. \*(TX provides built-in "auto continue" handling for warnings. If a warning exception is not intercepted by a catch or an accepting handler, then a diagnostic is issued on the .code *stderr* stream, after which a .code continue exception is thrown with no arguments. If that .code continue exception is not handled, then control returns normally to the point which threw the warning, to resume the computation which generated the warning. Callers which invoke code that may generate warning exceptions are therefore not required to handle them. However, callers which do handle warning exceptions expect to be able to throw a .code continue exception in order to resume the computation that triggered the warning, without allowing other handlers to see the exception. The generation of a warning should thus conform to the following pattern:
.verb
  (catch
    (throw 'warning "message")
    (continue ()))
.brev
.NP* Deferrable Warnings \*(TX supports a form of diagnostic known as a .IR "deferrable warning" . A deferrable warning is distinguished in two ways. Firstly, it is either of the type .code defr-warning or subtyped from that type. The .code defr-warning type itself is a direct subtype of .codn warning .
Secondly, a deferrable warning carries an additional tag argument after the exception message. A deferrable exception is thrown according to this pattern: .verb (catch (throw 'defr-warning "message" . tag) (continue ())) .brev \*(TX's built-in exception handling logic reacts specially to the presence of the tag material in the exception. First, the global .I "tentative definition list" is searched for the presence of the tag, using .code equal equality. If the tag is found, then the warning is discarded. If the tag is not found, then the exception argument list is added to the global .IR "deferred warning list" . In either case, the .code continue exception is thrown to resume the computation which threw the warning, as in the case of an ordinary non-deferrable warning. The purpose of this mechanism is to suppress warnings which become superfluous when more of the program code is examined. For instance, a warning about a call to an undefined function is superfluous if a definition of that function is supplied later, yet before that function call is executed. Deferred warnings accumulate in the deferred warning list from which they can be removed. The list is purged at various times such as when a top-level load completes, and the deferred warnings are released, as if by a call to the .code release-deferred-warnings function. .coNP Functions @ compile-error and @ compile-warning .synb .mets (compile-error < context-obj < fmt-string << fmt-arg *) .mets (compile-warning < context-obj < fmt-string << fmt-arg *) .syne .desc The functions .code compile-error and .code compile-warning provide a convenient and uniform way for code transforming functions such as macro-expanders to generate diagnostics. The .code compile-error function throws an exception of type .codn eval-error . 
The .code compile-warning function throws an exception of type .code warning and internally provides a .code catch for the .code continue exception which allows a warning handler to resume execution after the warning. If a handler throws a .code continue exception which is caught by .codn compile-warning , then .code compile-warning returns .codn nil . Because .code compile-warning throws a non-error exception, it returns .code nil in the event that no catch is found for the exception, and no handler which accepts it. The argument conventions are the same for both functions. The .meta context-obj is typically a compound form to which the diagnostic applies. The functions produce a diagnostic message which incorporates the location information and symbol obtained from .meta context-obj and the .codn format -style arguments .meta fmt-string and its .metn fmt-arg s. .coNP Function @ compile-defr-warning .synb .mets (compile-defr-warning < context-obj < tag .mets \ \ < fmt-string << fmt-arg *) .syne .desc The .code compile-defr-warning function throws an exception of type .code defr-warning and internally provides a .code catch for the .code continue exception needed to resume after the warning. The function produces a diagnostic message which incorporates the location information and symbol obtained from .meta context-obj and the .codn format -style arguments .meta fmt-string and its .metn fmt-arg s. This diagnostic message constitutes the first argument of the exception. The .meta tag argument is taken as the second argument. If the exception isn't intercepted by a catch or by an accepting handler, .code compile-defr-warning returns .codn nil . It also returns .code nil if it catches a .code continue exception. .coNP Function @ purge-deferred-warning .synb .mets (purge-deferred-warning << tag ) .syne .desc The .code purge-deferred-warning function removes all warnings marked with .meta tag from the deferred list.
It also removes all tags matching .meta tag from the tentative definition list. Tags are compared using the .code equal function. .coNP Function @ register-tentative-def .synb .mets (register-tentative-def << tag ) .syne .desc The .code register-tentative-def function adds .meta tag to the list of tentative definitions which are used to suppress deferrable warnings. The idea is that a definition of some construct has been seen, but not yet executed. Thus the construct is not defined, but it can reasonably be expected that it will be defined; hence, warnings about its nonexistence can be suppressed. For example, in the following code, when the expression .code "(foo)" is being expanded and transformed, the .code foo function does not exist:
.verb
  (progn (defun foo ()) (foo))
.brev
The function won't be defined until the .code progn is evaluated. Thus a warning is generated that .code "(foo)" refers to an undefined function. However, this warning is discarded, because the expander for .code defun registers a tentative definition tag for .codn foo . When the definition of .code foo takes place, the .code defun operator will call .code purge-deferred-warning which will remove not only all accumulated warnings related to the undefinedness of .code foo but also remove the tentative definition. Note: this mechanism isn't perfect because it will still suppress the warning in situations like
.verb
  (progn (if nil (defun foo ())) (foo))
.brev
.coNP Function @ tentative-def-exists .synb .mets (tentative-def-exists << tag ) .syne .desc The .code tentative-def-exists function checks whether .meta tag has been registered via .code register-tentative-def and not yet purged by .codn purge-deferred-warning . .coNP Function @ defer-warning .synb .mets (defer-warning << args ) .syne .desc The .code defer-warning function attempts to register a deferred warning.
The .meta args argument corresponds to the arguments which are passed to the .code throw function in order to generate a warning exception, not including the exception symbol. Args is expected to have at least two elements, the second of which is a deferred warning tag. The .code defer-warning function returns .codn nil . Note: this function is intended for use in exception handlers. The following example shows a handler which intercepts warnings. It defers deferrable warnings, and prints ordinary warnings: .verb (handle (some-form ..) ;; some code which might generate warnings (defr-warning (msg tag) ;; catch deferrable and defer (defer-warning (cons msg tag)) (throw 'continue)) ;; warning processed: resume execution (warning (msg) (put-line `warning: @msg`) ;; print non-deferrable (throw 'continue))) ;; warning processed: resume execution .brev .coNP Function @ release-deferred-warnings .synb .mets (release-deferred-warnings) .syne .desc The .code release-deferred-warnings function removes all warnings from the deferred list. Then, it issues each deferred warning as an ordinary warning. Note: there is normally no need for user programs to use this function since deferred warnings are issued automatically. .coNP Function @ dump-deferred-warnings .synb .mets (dump-deferred-warnings << stream ) .syne .desc The .code dump-deferred-warnings function empties the list of deferred warnings, and converts each one into a diagnostic message sent to .metn stream . After the diagnostics are printed, the list of pending warnings is cleared. Note: there is normally no need for user programs to use this function since deferred warnings are issued automatically. .SS* Delimited Continuations \*(TL supports delimited continuations, which are integrated with the .code block feature. Any named or anonymous block, including the implicit blocks created around function bodies, can be used as the delimiting .I prompt for the capture of a continuation.
A delimited continuation is a section of a possible future of the computation, up to a delimiting prompt, .I reified as a first class function. .TP* Example: .verb (defun receive (cont) (format t "cont returned ~a\en" (call cont 3))) (defun function () (sys:capture-cont 'abcd (fun receive))) (block abcd (format t "function returned ~a\en" (function)) 4) Output: function returned 3 cont returned 4 function returned t .brev .PP Evaluation begins with the .code block form. This form calls .code function which uses .code sys:capture-cont to capture a continuation up to the .code abcd prompt. The continuation is passed to the .code receive function as an argument. This captured object represents the continuation of computation up to that prompt. It appears as a one-argument function which, when called, resumes the captured computation. Its argument emerges out of the .code sys:capture-cont call as a return value. When the computation eventually returns all the way to the delimiting prompt, the return value of that prompt will then appear as the return value of the continuation function. In this example, the function .code receive immediately invokes the continuation function which it receives, passing it the argument value .codn 3 . And so, evaluation now continues in the resumed future represented by the continuation. Inside the continuation, .code sys:capture-cont appears to return, yielding the value .codn 3 . This bubbles up through .code function up to the .code "block abcd" where a message is printed: .strn "function returned 3" . The .code block terminates, yielding the value 4. Thereby, the continuation ends, since it is delimited up to that block. Control now returns to the .code receive function which invoked the continuation, where the function call form .code "(call cont 3)" terminates, yielding the value .code 4 that was returned by the continuation's delimiting .code block form. The message .str "cont returned 4" is printed.
The .code receive function returns normally, returning the value .code t which emerged from the .code format call. Control is now back in .code function where the .code sys:capture-cont form terminates and returns the .codn t . This bubbles up to .code block which prints .strn "function returned t" . In summary, a continuation represents, as a function, the subsequent computation that is to take place starting at some point, up to some recently established, dynamically enclosing delimiting prompt. When the continuation is captured, that future doesn't have to take place; an alternative future can carry out in which that continuation is available as a function. That alternative future can invoke the continuation at will. Invocations (resumptions) of the continuation appear as additional returns from the capture operator. A resumption of a continuation terminates when the delimiting prompt terminates, and the continuation yields the value which emerges from the prompt. Delimited continuations are implemented by capturing a segment of the evaluation stack between the prompt and the capture point. When a continuation is resumed, this saved copy of a stack segment is inserted on top of the current stack and the procedure context is resumed such that evaluation appears to emerge from the capture operator. As the continuation runs to completion, it simply pops these inserted stack frames naturally. Eventually it pops out of the delimiting prompt, at which point control ends up at the point which invoked the continuation function. The low-level operator for capturing a continuation is .codn sys:capture-cont . More expressive and convenient programming with continuations is provided by the macros .codn obtain , .codn obtain-block , .code yield-from and .codn yield , which create an abstraction which models the continuation as a suspended procedure supporting two-way communication of data. A .code suspend operator is provided, which is more general. 
It is identical to the .code shift operator described in various computer science literature about delimited continuations, except that it refers to a specific delimiting prompt by name. Continuations raise the issue of what to do about unwinding. The language Scheme provides the much criticized .code dynamic-wind operator which can execute initialization and clean-up code as a continuation is entered and abandoned. \*(TX takes a simpler, albeit risky approach. It provides a non-unwinding escape operator .code sys:abscond-from for use with continuations. Code which has captured a continuation can use this operator to escape from the delimiting block without triggering any unwinding among the frames between the capture point and the delimiter. When the continuation is restarted, it will then do so with all of the resources associated with its frames intact. When the continuation executes normal returns within its context, the unwinding takes place then. Thus tidy, "thread-like" use of continuations is possible with a small measure of coding discipline. Unfortunately, the absconding operator is dangerous: its use breaks the language guarantee that clean-up associated with a form is done no matter how a form terminates. .NP* Comparison with Lexical Closures Delimited continuations resemble lexical closures in some ways. Both constructs provide a way to return to some context whose evaluation has already been abandoned, and to access some aspects of that context. However, lexical closures are statically scoped. Closures capture the lexically apparent scope at a given point, and produce a function whose body has access to that scope, as well as to some arbitrary arguments. Thus, a lexical scope is reified as a first-class function. By contrast, a delimited continuation is dynamic. It captures an entire segment of a program activation chain, up to the delimiting prompt.
This segment includes scopes which are not lexically visible at the capture point: the scopes of parent functions. Moreover, the segment includes not only scopes, but also other aspects of the evaluation context, such as the possibility of returning to callers, and the (captured portion of) the original dynamic environment, such as exception handlers. That is to say, a lexical closure's body cannot return to the surrounding code or see any of its original dynamic environment; it can only inspect the environment, and then return to its own caller. Whereas a restarted delimited continuation can continue evaluation of the surrounding code, return to surrounding forms and parent functions, and access the dynamic environment. The continuation function returns to its caller when that entire restarted context terminates, whereas a closure returns to its caller as soon as the closure body terminates. .NP* Differences in Compiled vs. Interpreted Behavior Delimited continuations in \*(TX expose a behavioral difference between compiled and interpreted code which mutates the values of lexical variables. When a continuation is captured in compiled code, it captures not only the bindings of lexical variables, but also potentially their current values at the time of capture. What this means is that whenever the continuation is resumed, those variables will appear to have the captured values, regardless of any mutations that have taken place since. In other words, the captured future includes those specific values. This is because in compiled code, variables are allocated on the stack, which is copied as part of creating a continuation. Those variables are effectively newly instantiated in each resumption of the continuation, when the captured stack segment is reinstated into the stack, and take on those original values. 
In contrast, interpretation of code only maintains an environment pointer on the stack; the lexical environment is a dynamically allocated object whose contents aren't included in the continuation's stack segment capture. If the captured variables are modified after the capture, the continuation will see the updated values: all resumptions of the continuation share the same instance of the captured environment among themselves, and with the original context where the capture took place. An additional complication is that when compiled code captures lexical closures, captured variables are moved into dynamic storage and then they become shared: the semantics of the mutation of those variables is then similar to the situation in interpreted code. Therefore, the above described non-sharing capture behavior of compiled code is not required to hold. In continuation-based code which relies on mutation of lexical variables created with .code let or .codn let* , the macros .code hlet and .code hlet* can be used instead. These macros create variable bindings whose storage is always outside of the stack, and therefore the variables will exhibit consistent interpreted and compiled semantics under continuations. All contexts which capture the same lexical binding of a given .cod3 hlet / hlet* variable share a single instance. The most recent assignment to the variable taking place in any context establishes its value, as seen by any other context. The resumption of a continuation will not restore such a variable to a previous value. If the affected variables are other kinds of bindings such as function parameters or variables created with specialized binding constructs such as .codn with-stream , additional coding changes may be required to get interpreted code working under compilation. 
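.TP* Example:

The following is a minimal sketch, intended only as an illustration and not
taken from the \*(TX sources. It captures one continuation and resumes it
repeatedly. The names
.codn visit ,
.code top
and
.code saved-k
are hypothetical. Under the sharing semantics described above for
.codn hlet ,
each resumption is expected to observe the increment made by the previous
pass, whether the code is interpreted or compiled. If
.code let
were used instead, compiled code might restart from the captured value of
.code count
on every resumption.
.verb
  (defvarl saved-k nil)

  (defun visit ()
    (block top
      (hlet ((count 0))
        ;; Capture the rest of the block as a continuation and
        ;; stash it in a global variable. The receive function
        ;; returns normally, so control falls through to inc.
        (sys:capture-cont 'top (lambda (k) (set saved-k k)))
        (inc count))))

  (visit)            ;; first pass through the block
  (call saved-k nil) ;; re-enters just after the capture point
  (call saved-k nil) ;; sees the same count binding again
.brev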
.coNP Function @ sys:capture-cont .synb .mets (sys:capture-cont < name < receive-fun <> [ context-form ]) .syne .desc The .code sys:capture-cont function captures a continuation, and also serves as the resume point for the resulting continuation. Which of these two situations is the case (capture or resumption) is distinguished by the use of the .meta receive-fun argument, which must be a function capable of being called with one argument. A block named .meta name must be visible; the continuation is delimited by the closest enclosing block of this name. The optional .meta context-form argument should be a compound form. If .code sys:capture-cont reports an error, it reports it against this form, and uses the form's operator symbol as the name of the function which encountered the error. If the argument is omitted, .code sys:capture-cont uses its own name. The .code sys:capture-cont function captures a continuation, represented as a function. It immediately calls .metn receive-fun , passing it the continuation function as an argument. If .meta receive-fun returns normally, then .code sys:capture-cont returns whatever value .meta receive-fun returns. Resuming a continuation is done by invoking the continuation function. When this happens, the entire continuation context is restored by recreating its captured evaluation frames on top of the current stack. Inside the continuation, the .code sys:capture-cont function call which captured the continuation now appears to return, and yields a value. That value is precisely the value which was just passed to the continuation function moments ago. The resumed continuation can terminate in one of three ways. Firstly, it can simply keep executing until it discards all of its evaluation frames below the delimiting block, and then allows that block to terminate naturally by evaluating the last form contained in the block.
Secondly, it can use .code return-from against its delimiting block to explicitly abandon all evaluations in between and terminate that block. Or it may perform a nonlocal control transfer past the delimited block somewhere into the evaluation frames of the caller. In the first two cases, the termination of the block turns into an ordinary return from the continuation function, and the result value of the terminated block becomes the return value of that function call. In the last case, the call of the continuation function is abandoned and unwinding continues through the caller. If the symbol .code sys:cont-poison is passed to the continuation function, the continuation will be resumed in a different manner: its context will be restored as in the ordinary resume case, whereupon it will be immediately abandoned by a nonlocal exit, causing unwinding to take place across all of the continuation's evaluation frames. The function then returns .codn nil . If the symbol .code sys:cont-free is passed to the continuation function, the continuation isn't resumed at all; rather, the buffer which holds the saved context of the continuation is released. Thereafter, an attempt to resume the continuation results in an error exception being thrown. After releasing the buffer, the function returns .codn nil . .TP* Notes: The continuation function may be used any time after it is produced, and may be called more than once, regardless of whether the originally captured dynamic context is still executing. The continuation object may be communicated into the resumed continuation, which can then use it to call itself, resulting in multiple nested resumptions of the same continuation. A delimited continuation is effectively a first class function. The underlying continuation object produced by .code sys:capture-cont stores a copy of the captured dynamic context. Whenever the continuation function is invoked, a copy of the captured context is reinstated as if it were a new context.
Thus each apparent return from the .code sys:capture-cont inside a resumed continuation is not actually made in the original context, but in a copy of that context. That context can be resumed multiple times sequentially or recursively. Just like lexical closures, continuations do not copy lexical environments; they capture lexical environments by reference. If a continuation modifies the values of captured lexical variables, those modifications are visible to other resumptions of the same continuation, to other continuations which capture the same environment, to lexical closures which capture the same environment and to the original context which created that environment, if it is still active. Unlike lexical closures, continuations do capture the local bindings of special variables. That is to say, if .code *var* is a special variable, then a lexical closure created inside a .code "(let ((*var* 42)) ...)" form will not capture the local rebinding of .code *var* which holds 42. When the closure is invoked and accesses .codn *var* , it accesses whatever value of .code *var* is dynamically current, as dictated by the environment which calls the closure, rather than the capturing environment. With continuations, the behavior is different. If a continuation is captured inside a .code "(let ((*var* 42)) ...)" form then it does capture the local binding. This is regardless of whether the delimiting prompt of the capture is enclosed in this form, or outside of the form. The special variable has a binding in a dynamic environment. There is always a reference to a current dynamic environment associated with every evaluation context, and a continuation captures that reference. Because it is a reference, it means that the binding is shared.
That is to say, all invocations of all continuations which capture the same dynamic environment in which that .code "(let ((*var* 42)) ...)" binding was made share the same binding; if .code *var* is modified by assignment, the modification is visible to all those views. Inside a resumed continuation, a form which binds a special variable such as .code "(let ((*var* 42)) ...)" may terminate. As expected, this causes the binding to be removed, revealing either another local binding of .code *var* or the global binding. However, this unbinding affects only that executing continuation; it has no effect inside other instances of the same continuation or other continuations which capture the same variable. Unbinding isn't a mutation of the dynamic environment, but may be understood as merely the restoration of an earlier dynamic environment reference. .TP* "Example:" The following example shows an implementation of the .code suspend operator. .verb (defmacro suspend (:form form name var . body) ^(sys:capture-cont ',name (lambda (,var) (sys:abscond-from ,name ,*body)) ',form)) .brev .coNP Operator @ sys:abscond-from .synb .mets (sys:abscond-from < name <> [ value ]) .syne .desc The .code sys:abscond-from operator closely resembles .code return-from and performs the same function: it causes an enclosing block .meta name to terminate with .meta value which defaults to .codn nil . However, unlike .codn return-from , .code sys:abscond-from does not perform any unwinding. This operator should never be used for any purpose other than implementing primitives for the use of delimited continuations. It is used by the .code yield-from and .code yield operators to escape out of a block in which a continuation has been captured. Neglecting to unwind is valid due to the expectation that control will return into a restarted copy of that context.
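.TP* Example:

The following sketch is intended purely to illustrate the difference in
unwinding behavior described above; it is not recommended usage, and the
block name
.code foo
is hypothetical. Both forms terminate the block with the value
.codn 1 ,
but only
.code return-from
causes the clean-up form of the
.code unwind-protect
to be evaluated.
.verb
  ;; return-from unwinds: the clean-up form runs.
  (block foo
    (unwind-protect
      (return-from foo 1)
      (put-line "clean-up")))
  --> 1 ;; after printing "clean-up"

  ;; sys:abscond-from does not unwind: clean-up is skipped.
  (block foo
    (unwind-protect
      (sys:abscond-from foo 1)
      (put-line "clean-up")))
  --> 1 ;; nothing is printed
.brev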
.coNP Function @ sys:abscond* .synb .mets (sys:abscond* < name <> [ value ]) .syne .desc The .code sys:abscond* function is similar to the .code sys:abscond-from operator, except that .code name is an ordinary function parameter, and so when .code sys:abscond* is used, an argument expression must be specified which evaluates to a symbol. Thus .code sys:abscond* allows the target block of a return to be dynamically computed. The following equivalence holds between the operator and function: .verb (sys:abscond-from a b) <--> (sys:abscond* 'a b) .brev Expressions used as .meta name arguments to .code sys:abscond* which do not simply quote a symbol have no equivalent in .codn sys:abscond-from . .coNP Macros @ obtain and @ yield-from .synb .mets (obtain << forms *) .mets (yield-from < name <> [ form ]) .syne .desc The .code obtain and .code yield-from macros closely interoperate. The .code obtain macro treats zero or more .metn form s as a suspendable execution context called the .IR "obtain block" . It is expected that .metn form s establish a block named .meta name and return its result value to .codn obtain . Without evaluating any of the forms in the obtain block, .code obtain returns a function, which takes one optional argument. This argument, called the .IR "resume value" , defaults to .code nil if it is omitted. The function represents the suspended execution context. The context is resumed whenever the function is called, and executes until the next .code yield-from statement which references the block named .metn name . The function's reply argument is noted. If the .code yield-from specifies a .meta form argument, then the execution context suspends, and the resume function terminates and returns the value of that form. When the function is called again to resume the context, the .code yield-from returns the previously noted resume value (and the new resume value just passed is noted in its place).
If the .code yield-from specifies no .meta form argument, then it briefly suspends the execution context only to retrieve the resume value, without producing an item. Since no item is produced, the resume function does not return. The execution context implicitly resumes. When execution reaches the last form in the obtain block, the resume value is discarded. The execution context terminates, and the most recent call to the resume function returns the value of that last form. .TP* Notes: The .code obtain macro registers a finalizer against the returned resume function. The finalizer invokes the function, passing it the symbol .codn sys:cont-poison , thereby triggering unwinding in the most recently captured continuation. Thus, abandoned .code obtain blocks are subject to unwinding when they become garbage. The .code yield-from macro works by capturing a continuation and performing a nonlocal exit to the nearest block called .metn name . It passes a special yield object to that block. The .code obtain macro generates code which knows what to do with this special yield object. .TP* Examples: The following example shows a function which recursively traverses a .code cons cell structure, yielding all the .cod2 non- nil atoms it encounters. Finally, it returns the object .codn nil . The function is invoked on a list, and the invocation is wrapped in an .code obtain block to convert it to a generating function. The generating function is then called six times to retrieve the five atoms from the list, and the final .code nil value. These are collected into a list. This example demonstrates the power of delimited continuations to suspend and resume a recursive procedure. .verb (defun yflatten (obj) (labels ((flatten-rec (obj) (cond ((null obj)) ((atom obj) (yield-from yflatten obj)) (t (flatten-rec (car obj)) (flatten-rec (cdr obj)))))) (flatten-rec obj) nil)) (let ((f (obtain (yflatten '(a (b (c . 
d)) e))))) (list [f] [f] [f] [f] [f] [f])) --> (a b c d e nil) .brev The following interactive session log exemplifies two-way communication between the main code and a suspending function. Here, .code mappend is invoked on a list of symbols representing fruit and vegetable names. The objective is to return a list containing only fruits. The .code lambda function suspends execution and yields a question out of the .code map block. It then classifies the item as a fruit or not according to the reply it receives. The reply emerges as the result value of the .code yield-from call. The .code obtain macro converts the block to a generating function. The first call to the function is made with no argument, because the argument would be ignored anyway. The function returns a question, asking whether the first item in the list, the potato, is a fruit. To answer positively or negatively, the user calls the function again, passing in .code t or .codn nil , respectively. The function returns the next question, which is answered in the same manner. When the question for the last item is answered, the function call yields the final item: the ordinary result of the block, which is the list of fruit names. .verb 1> (obtain (block map (mappend (lambda (item) (if (yield-from map `is @item a fruit?`) (list item))) '(potato apple banana lettuce orange carrot)))) # 2> (call *1) "is potato a fruit?" 3> (call *1 nil) "is apple a fruit?" 4> (call *1 t) "is banana a fruit?" 5> (call *1 t) "is lettuce a fruit?" 6> (call *1 nil) "is orange a fruit?" 7> (call *1 t) "is carrot a fruit?" 8> (call *1 nil) (apple banana orange) .brev The following example demonstrates an accumulator. Values passed to the resume function are added to a counter which is initially zero. Each call to the function returns the updated value of the accumulator. Note the use of .code "(yield-from acc)" with no arguments to receive the value passed to the first call to the resume function, without yielding an item. 
The first return value .code 1 is produced by the .code "(yield-from acc sum)" form, not by .codn "(yield-from acc)" . The latter only obtains the initial value .code 1 and uses it to establish the seed value of the accumulator. Without causing the resume function to terminate and return, control passes into the loop, which yields the first item, causing the resume function call .code "(call *1 1)" to return .codn 1 : .verb 1> (obtain (block acc (let ((sum (yield-from acc))) (while t (inc sum (yield-from acc sum)))))) # 2> (call *1 1) 1 3> (call *1 2) 3 4> (call *1 3) 6 5> (call *1 4) 10 .brev .coNP Macro @ obtain-block .synb .mets (obtain-block < name << forms *) .syne .desc The .code obtain-block macro combines .code block and .code obtain into a single expression. The .metn form s are evaluated in a block named .codn name . That is to say, the following equivalence holds: .verb (obtain-block n f ...) <--> (obtain (block n f ...)) .brev .coNP Macro @ yield .synb .mets (yield <> [ form ]) .syne .desc The .code yield macro is to .code yield-from as .code return is to .codn return-from : it yields from an anonymous block. It is equivalent to calling .code yield-from using .code nil as the block name. In other words, the following equivalence holds: .verb (yield x) <--> (yield-from nil x) .brev .TP* Example: .verb ;; Yield the integers 0 to 4 from a for loop, taking ;; advantage of its implicit anonymous block: (defvarl f (obtain (for ((i 0)) ((< i 5)) ((inc i)) (yield i)))) [f] -> 0 [f] -> 1 [f] -> 2 [f] -> 3 [f] -> 4 [f] -> nil [f] -> nil .brev .coNP Macros @ obtain* and @ obtain*-block .synb .mets (obtain* << forms *) .mets (obtain*-block < name << forms *) .syne .desc The .code obtain* and .code obtain*-block macros implement a useful variation of .code obtain and .codn obtain-block . The .code obtain* macro differs from .code obtain in exactly one regard: prior to returning the function, it invokes it one time, with the argument value .codn nil . 
Thus, the following equivalence holds .verb (obtain* forms ...) <--> (let ((f (obtain forms ...))) (call f) f) .brev In other words, the suspended block is immediately resumed, so that it executes either to completion (in which case its value is discarded), or to its first .code yield or .code yield-from call (in which case the yielded value is discarded). Note: the .code obtain* macro is useful in creating suspensions which accept data rather than produce data. The .code obtain*-block macro combines .code obtain* and .code block in the same manner that .code obtain-block combines .code obtain and .codn block . .TP* Example: .verb ;; Pass three values into suspended block, ;; which get accumulated into list. (let ((f (obtain*-block nil (list (yield nil) (yield nil) (yield nil))))) (call f 1) (call f 2) (call f 3)) -> (1 2 3) ;; Under obtain, extra call is required: (let ((f (obtain-block nil (list (yield nil) (yield nil) (yield nil))))) (call f nil) ;; execute block to first yield (call f 1) ;; resume first yield with 1 (call f 2) (call f 3)) -> (1 2 3) .brev .coNP Macro @ suspend .synb .mets (suspend < block-name < var-name << body-form *) .syne .desc The .code suspend operator captures a continuation up to the prompt given by the symbol .meta block-name and binds it to the variable name given by .metn var-name , which must be a symbol suitable for binding variables with .codn let . Each .meta body-form is then evaluated in the scope of the variable .metn var-name . When the last .meta body-form is evaluated, a nonlocal exit takes place to the block named by .meta block-name (using the .code sys:abscond-from operator, so that unwinding isn't performed). When the continuation bound to .meta var-name is invoked, a copy of the entire block .meta block-name is restarted, and in that copy, the .code suspend call appears to return normally, yielding the value which had been passed to the continuation. 
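
As a small illustration of these rules, the following sketch (with the
hypothetical names
.code b
and
.codn k )
uses a continuation which behaves like a function that adds one to its
argument and returns the result from block
.codn b .
Each call to
.code k
runs a fresh copy of the block to completion, and the value of the last
body form then becomes the value of the original block:
.verb
  ;; k resumes (+ 1 <value>) inside a copy of block b
  (block b
    (+ 1 (suspend b k (call k 10))))
  --> 11

  ;; the continuation may be invoked more than once
  (block b
    (+ 1 (suspend b k (+ (call k 10) (call k 20)))))
  --> 32

  ;; if k is never called, the pending (+ 1 ...) is abandoned
  (block b
    (+ 1 (suspend b k 42)))
  --> 42
.brev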
.TP* Example Define John McCarthy's .code amb function using .code block and .codn suspend : .verb (defmacro amb-scope (. forms) ^(block amb-scope ,*forms)) (defun amb (. args) (suspend amb-scope cont (each ((a args)) (if a (iflet ((r (call cont a))) (return-from amb-scope r)))))) .brev Use .code amb to bind the .code x and .code y which satisfy the predicate .mono .meti (eql (* x y) 8) .onom nondeterministically: .verb (amb-scope (let ((x (amb 1 2 3)) (y (amb 4 5 6))) (amb (eql (* x y) 8)) (list x y))) -> (2 4) .brev .coNP Macros @ hlet and @ hlet* .synb .mets (hlet >> ({ sym | >> ( sym << init-form )}*) << body-form *) .mets (hlet* >> ({ sym | >> ( sym << init-form )}*) << body-form *) .syne .desc The .code hlet and .code hlet* macros behave exactly like .code let and .codn let* , respectively except that they guarantee that the variable bindings are allocated in storage which isn't captured by delimited continuations. The .code h in the names stands for "heap", serving as a mnemonic based on the implementation concept of these bindings being "heap-allocated". .SS* Regular-Expression Library \*(TX provides a "pure" regular-expression implementation based on automata theory, which equates regular expressions, finite automata and sets of strings. A regular expression determines whether or not a string of input characters belongs to a set. \*(TX regular expressions do not support features such as "anchoring" a match to the start or end of a string, or capturing parenthesized subexpression matches into registers. Parenthesis syntax denotes only grouping, with no additional meaning. The semantics of whether a regular expression is used for a substring search, prefix match, suffix match, string splitting and so forth comes from the functions which use regular expressions to perform these operations. .NP* Regular Expressions as Functions .synb .mets >> [ regex >> [ start <> [ from-end ]] << string ] .syne .desc A regular expression is callable as a function in \*(TL. 
When used this way, it requires a string argument. It searches the string for the leftmost match for itself, and returns the matching substring, which could be empty. If no match is found, it returns .codn nil . A regex takes one, two, or three arguments. The required .meta string is always the rightmost argument. This allows for convenient partial application over optional arguments using macros in the .code op family, and macros in which the .code op syntax is implicit. The optional arguments .meta start and .meta from-end are treated exactly as their like-named counterparts in the .code search-regst function. .TP* Example: Keep those elements from a list of strings which match the regular expression .codn #/a.*b/ : .verb (keep-if #/a.*b/ '#"abracadabra zebra hat adlib adobe deer") --> ("abracadabra" "adlib" "adobe") .brev .coNP Functions @, search-regex @ range-regex and @ search-regst .synb .mets (search-regex < string < regex >> [ start <> [ from-end ]]) .mets (range-regex < string < regex >> [ start <> [ from-end ]]) .mets (search-regst < string < regex >> [ start <> [ from-end ]]) .syne .desc The .code search-regex function searches through .meta string starting at position .meta start for a match for .metn regex . If .meta start is omitted, the search starts at position 0. If .meta from-end is specified and has a .cod2 non- nil value, the search proceeds in reverse, from the position just beyond the last character of .metn string , toward .metn start . If .meta start exceeds the length of the string, then .code search-regex returns .codn nil . If .meta start is negative then it indicates positions from the end of the string, such that -1 is the last character, -2 the second last and so forth. If the value is so negative that it refers beyond the start of the string, then the starting position is deemed to be zero. 
If .meta start is equal to the length of .metn string , and thus refers to the position just past its last character, then a match occurs at that position if .meta regex admits such a match. The .code search-regex function returns .code nil if no match is found, otherwise it returns a cons, whose .code car indicates the position of the match, and whose .code cdr indicates the length of the match. If .meta regex is capable of matching empty strings, and no other kind of match is found within .metn string , then .code search-regex reports a zero-length match. If .meta from-end is false, then this match is reported at .metn start , otherwise it is reported at the position one character beyond the end of the string. The .code range-regex function is similar to .codn search-regex , except that when a match is found, it returns a position range, rather than a position and length. A range object is returned whose .code from field indicates the position of the match, and whose .code to field indicates the position one element past the last character of the match. If the match is empty, the two integers are equal. Also see the .code rr function, which provides an alternative argument syntax for the semantics of .codn range-regex . The .code search-regst function differs from .code search-regex in the representation of the return value in the matching case. Rather than returning the position and length of the match, it returns the matching substring of .metn string . .coNP Functions @ match-regex and @ match-regst .synb .mets (match-regex < string < regex <> [ position ]) .mets (match-regst < string < regex <> [ position ]) .syne .desc The .code match-regex function tests whether .meta regex matches at .meta position in .metn string . If .meta position is not specified, it is taken to be zero. Negative values of .meta position index from the right end of the string such that -1 refers to the last character.
Excessively negative values which index before the first character cause .code nil to be returned. If the regex matches, then the length of the match is returned. If it does not match, then .code nil is returned. The .code match-regst function differs from .code match-regex in the representation of the return value in the matching case. Rather than returning the length of the match, it returns the matching substring of .metn string . .coNP Functions @ match-regex-right and @ match-regst-right .synb .mets (match-regex-right < string < regex <> [ end-position ]) .mets (match-regst-right < string < regex <> [ end-position ]) .syne .desc The .code match-regex-right function tests whether some substring of .meta string which terminates at the character position just before .meta end-position matches .metn regex . If .meta end-position is not specified, it defaults to the length of the string, and the function performs a right-anchored regex match. The .meta end-position argument can be a negative integer, in which case it denotes positions from the end of the string, such that -1 refers to the last character. If the value is excessively negative such that the position immediately before it is before the start of the string, then .code nil is returned. If .meta end-position is a positive value beyond the length of .metn string , then, likewise, .code nil is returned. If a match is found, then the length of the match is returned. A more precise way of articulating the role of .meta end-position is that for the purposes of matching, .meta string is considered to terminate just before .metn end-position : in other words, that .meta end-position is the length of the string. The match is then anchored to the end of this effective string. The .code match-regst-right function differs from .code match-regex-right in the representation of the return value in the matching case. Rather than returning the length of the match, it returns the matching substring of .metn string .
.TP* Examples: .verb ;; Return matching portion rather than length thereof. (defun match-regex-right-substring (str reg : end-pos) (set end-pos (or end-pos (length str))) (let ((len (match-regex-right str reg end-pos))) (if len [str (- end-pos len)..end-pos] nil))) (match-regex-right-substring "abc" #/c/) -> "c" (match-regex-right-substring "acc" #/c*/) -> "cc" ;; Regex matches starting at multiple positions, but all ;; the matches extend past the limit. (match-regex-right-substring "acc" #/c*/ 2) -> nil ;; If the above behavior is not wanted, then ;; we can extract the string up to the limiting ;; position and do the match on that. (match-regex-right-substring ["acc" 0..2] #/c*/) -> "c" ;; Equivalent of above call (match-regex-right-substring "ac" #/c*/) -> "c" .brev .coNP Function @ regex-prefix-match .synb .mets (regex-prefix-match < regex < string <> [ position ]) .syne .desc The .code regex-prefix-match function determines whether the input string might be the prefix of a string which matches regular expression .metn regex . The result is true if the input string matches .meta regex exactly. However, it is also true in situations in which the input string doesn't match .metn regex , yet can be extended with one or more additional characters beyond the end such that the extended string .B does match. The .meta string argument must be a character string. The function takes the input string to be the suffix of .meta string which starts at the character position indicated by the .meta position argument. If that argument is omitted, then .meta string is taken as the input in its entirety. Negative values index backwards from the end of .meta string according to the usual conventions elsewhere in the library. Note: this function is not to be confused with the semantics of a regex matching a prefix of a string: that capability is provided by the functions .codn match-regex , .codn m^ , .codn r^ , .code f^ and .codn fr^ .
.TP* Examples: .verb ;; The empty string is not a viable prefix match for ;; a regex that matches no strings at all: (regex-prefix-match #/~.*/ "") -> nil (regex-prefix-match #/[]/ "") -> nil ;; The empty string is a viable prefix of any regex ;; which matches at least one string: (regex-prefix-match #// "") -> t (regex-prefix-match #/abc/ "") -> t ;; This string doesn't match the regex because ;; it doesn't end in b, but is a viable prefix: (regex-prefix-match #/a*b/ "aa") -> t (regex-prefix-match #/a*b/ "ab") -> t (regex-prefix-match #/a*b/ "ac") -> nil (regex-prefix-match #/a*b/ "abc") -> nil .brev .coNP Function @ regsub .synb .mets (regsub < regex < replacement << string ) .mets (regsub < substring < replacement << string ) .mets (regsub < function < replacement << string ) .syne .desc The .code regsub function operates in two modes, depending on whether the first argument is a regular expression or string, or else a function. If the first argument is a regular expression or string, then .code regsub searches .meta string for multiple occurrences of non-overlapping matches for that .meta regex or .metn substring . A new string is constructed similar to .meta string but in which each matching region is replaced using .meta replacement as follows. The .meta replacement object may be a character or a string, in which case it is simply taken to be the replacement for each match of the regular expression. The .meta replacement object may be a function of one argument, in which case for every match which is found, this function is invoked, with the matching piece of text as an argument. The function's return value is then taken to be the replacement text. If the first argument is a function, then it is called, with .meta string as its argument. The return value must be either a range object (see the .code rcons function) which indicates the extent of .meta string to be replaced, or else .code nil which indicates that no replacement is to take place.
.TP* Examples: .verb ;; match every lowercase e or o, and replace by filtering ;; through the upcase-str function: [regsub #/[eo]/ upcase-str "Hello world!"] -> "HEllO wOrld!" ;; Replace Hello with Goodbye: (regsub #/Hello/ "Goodbye" "Hello world!") -> "Goodbye world!" ;; Same, as a simple substring match, rather than regex: (regsub "Hello" "Goodbye" "Hello world!") -> "Goodbye world!" ;; Left-anchored replacement with fr^ function: (regsub (fr^ #/H/) "J" "Hello, hello!") -> "Jello, hello!" .brev .coNP Function @ regexp .synb .mets (regexp << obj ) .syne .desc The .code regexp function returns .code t if .meta obj is a compiled regular-expression object. For any other object type, it returns .codn nil . .coNP Functions @ trim-left and @ trim-right .synb .mets (trim-left >> { regex | << prefix } << string ) .mets (trim-right >> { regex | << suffix } << string ) .syne .desc The .code trim-left and .code trim-right functions return a new string, equivalent to .meta string with a leading or trailing portion removed. If the first argument is a regular expression .metn regex , then, respectively, .code trim-left and .code trim-right find a prefix or suffix of .meta string which matches the regular expression. If there is no match, or if the match is empty, then .meta string is returned. Otherwise, a copy of .meta string is returned in which the matching characters are removed. If .meta regex matches all of .meta string then the empty string is returned. If the first argument is a character string, then it is treated as an exact match for that sequence of characters. Thus, .code trim-left interprets that string as a .meta prefix to be removed, and .code trim-right as a .metn suffix . If .meta string starts with .metn prefix , then .code trim-left returns a copy of .meta string with .meta prefix removed. Otherwise, .meta string is returned. Likewise, if .meta string ends with .metn suffix , then .code trim-right returns a copy of .meta string with .meta suffix removed.
Otherwise, .meta string is returned. .coNP Function @ regex-compile .synb .mets (regex-compile < form-or-string <> [ error-stream ]) .syne .desc The .code regex-compile function takes the source code of a regular expression, expressed as a Lisp data structure representing an abstract syntax tree, or else a regular expression specified as a character string, and compiles it to a regular-expression object. If .meta form-or-string is a character string, it is parsed to an abstract syntax tree first, by the .code regex-parse function. If the parse is successful (the result is not .codn nil ) then the resulting tree structure is compiled by a recursive call to .codn regex-compile . The optional .meta error-stream argument is passed down to .code regex-parse as well as in the recursive call to .codn regex-compile , if that call takes place. If .meta error-stream is specified, it must be a stream. Any error diagnostics are sent to that stream. .TP* Examples: .verb ;; the equivalent of #/[a-zA-Z0-9_]/ (regex-compile '(set (#\ea . #\ez) (#\eA . #\eZ) (#\e0 . #\e9) #\e_)) ;; the equivalent of #/.*/ and #/.+/ (regex-compile '(0+ wild)) (regex-compile '(1+ wild)) ;; #/a|b|c/ (regex-compile '(or (or #\ea #\eb) #\ec)) ;; string (regex-compile "a|b|c") .brev .coNP Function @ regex-source .synb .mets (regex-source << regex ) .syne .desc The .code regex-source function returns the source code of compiled regular expression .metn regex . The source code isn't the textual notation, but the Lisp data structure representing the abstract syntax tree: the same representation as what is returned by .codn regex-parse . .coNP Function @ regex-parse .synb .mets (regex-parse < string <> [ error-stream ]) .syne .desc The .code regex-parse function parses a character string which contains a regular expression and turns it into a Lisp data structure (the abstract syntax tree representation of the regular expression).
The regular-expression syntax .code #/RE/ produces the same structure, but as a literal which is processed at the time \*(TX source code is read; the .code regex-parse function performs this parsing at run time. If there are parse errors, the function returns .codn nil . The optional .meta error-stream argument specifies a stream to which error messages are sent from the parser. By default, diagnostic output goes to the .code *stdnull* stream, which discards it. If .meta error-stream is specified as .codn t , then the diagnostic output goes to the .code *stdout* stream. If .code regex-parse returns a .cod2 non- nil value, that structure is then something which is suitable as input to .codn regex-compile . There is a small difference in the syntax accepted by .code regex-parse and the syntax of regular-expression literals. Any .code / (slash) characters occurring in any position within .meta string are treated as ordinary characters, not as regular-expression delimiters. The call .mono (regex-parse "/a/") .onom matches three characters: a slash, followed by the letter "a", followed by another slash. Note that the slashes are not escaped. Note: if a .code regex-parse call is written using a string literal as the .meta string argument, then note that any backslashes which are to be processed by the regular expression must be doubled up, otherwise they belong to the string literal: .verb (regex-parse "\e*") ;; error, invalid string literal escape (regex-parse "\e\e*") ;; correct: the \e* literal match for * .brev The double backslash in the string literal produces a single backslash in the resulting string object that is processed by .codn regex-parse . 
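.TP* Example:

The following is a minimal sketch of the run-time path described above,
assuming that the input string is a syntactically valid regular expression.
The string is parsed with
.codn regex-parse ,
the resulting tree is compiled with
.codn regex-compile ,
and the compiled object is then tested with
.code regexp
and called as a function on a sample string:

.verb
  ;; regex-parse returns nil on a syntax error,
  ;; so the tree is checked before compiling.
  (let ((tree (regex-parse "a|b|c")))
    (if tree
      (let ((re (regex-compile tree)))
        (list (regexp re) [re "xbz"]))))
  --> (t "b")
.brev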
.coNP Function @ regex-optimize .synb .mets (regex-optimize << regex-tree-syntax ) .syne .desc The .code regex-optimize function accepts the source code of a regular expression, expressed as a Lisp data structure representing an abstract syntax tree, and calculates an equivalent structure in which certain simplifications have been performed, or in some cases substitutions which eliminate the dependence on derivative-based processing. The .meta regex-tree-syntax argument is assumed to be correct, as if it were produced by the .code regex-parse or .code regex-from-trie functions. Incorrect syntax produces unspecified results: an exception may be thrown, or some object may appear to be successfully returned. Note: it is unnecessary to call this function to prepare the input for .code regex-compile because that function optimizes internally. However, the source code attached to a compiled regular-expression object is the original unoptimized syntax tree, and that is used for rendering the .code #/.../ notation when the object is printed. If the syntax is passed through .code regex-optimize before .codn regex-compile , the resulting object will have the optimized code attached to it, and subsequently render that way in printed form. .TP* Examples: .verb ;; a|b|c -> [abc] (regex-optimize '(or #\ea (or #\eb #\ec))) -> (set #\ea #\eb #\ec) ;; (a|) -> a? (regex-optimize '(or #\ea nil)) -> (? #\ea) .brev .coNP Function @ read-until-match .synb .mets (read-until-match < regex >> [ stream <> [ include-match ]]) .syne .desc The .code read-until-match function reads characters from .metn stream , accumulating them into a string, which is returned. If an argument is not specified for .metn stream , then the .code *stdin* stream is used. The .meta include-match argument is Boolean, indicating whether the delimiting text matched by .meta regex is included in the returned string. It defaults to .codn nil .
The accumulation of characters is terminated by a non-empty match on .metn regex , the end of the stream, or an error. This means that characters are read from the stream and accumulated while the stream has more characters available, and while its prefix does not match .metn regex . If .meta regex matches the stream before any characters are accumulated, then an empty string is returned. If the stream ends or a non-exception-throwing error occurs before any characters are accumulated, the function returns .codn nil . When the accumulation of characters terminates by a match on .metn regex , the longest possible matching sequence of characters is removed from the stream. If .meta include-match is true, that matching text is included in the returned string. Otherwise, it is discarded. The next available character in the stream is the first nonmatching character following the matched text. However, the next available character, as well as some number of subsequent characters, may originate from the stream's push-back buffer, rather than from the underlying operating system object, due to this function's internal use of the .code unget-char function. Therefore, the stream position, as would be reported by .codn seek-stream , is unspecified. .coNP Functions @ scan-until-match and @ count-until-match .synb .mets (scan-until-match < regex <> [ stream ]) .mets (count-until-match < regex <> [ stream ]) .syne .desc The functions .code scan-until-match and .code count-until-match read characters from .meta stream until a match occurs in the stream for regular expression .metn regex , the stream runs out of characters, or an error occurs. If the stream runs out of characters, or a non-exception-throwing error occurs, before a match for .meta regex is identified, these functions return .codn nil . 
If a match for .meta regex occurs in .metn stream , then .code count-until-match returns the number of characters that were read and discarded prior to encountering the first matching character. In the same situation, the .code scan-until-match function returns a .code cons cell whose .code car holds the count of discarded characters, that being the same value as what would be returned by .codn count-until-match , and whose .code cdr holds a character string that comprises the text matched by .metn regex . The text matched by .meta regex is as long as possible, and is removed from the stream. The next available character in the stream is the first nonmatching character following the matched text. However, the next available character, as well as some number of subsequent characters, may originate from the stream's push-back buffer, rather than from the underlying operating system object, due to these functions' internal use of the .code unget-char function. Therefore, the stream position, as would be reported by .codn seek-stream , is unspecified. .coNP Functions @, m^$ @ m^ and @ m$ .synb .mets (m^$ < regex <> [ position ] << string ) .mets (m^ < regex <> [ position ] << string ) .mets (m$ < regex <> [ end-position ] << string ) .syne .desc These functions provide functionality similar to the .code match-regst and .code match-regst-right functions, but under alternative interfaces which are more convenient. The .code ^ and .code $ notation used in their names is an allusion to the regular-expression search-anchoring operators found in familiar POSIX utilities such as .codn grep . The .meta position argument, if omitted, defaults to zero, so that the entire .meta string is operated upon. The .meta end-position argument defaults to the length of .metn string , so that the end position coincides with the end of the string.
If the .meta position or .meta end-position arguments are negative, they index backwards from the length of .meta string so that -1 denotes the last character. A value in either parameter which is excessively negative or positive, such that it indexes before the start of the string or exceeds its length results in a failed match and consequently .code nil being returned. The .code m^$ function tests whether the entire portion of .meta string starting at .meta position through to the end of the string is in the set of strings matched by .metn regex . If this is true, then that portion of the string is returned. Otherwise .code nil is returned. The .code m^ function tests whether the portion of the .meta string starting at .meta position has a prefix which matches .metn regex . If so, then this matching prefix is returned. Otherwise .code nil is returned. The .code m$ function tests whether the portion of .meta string ending just before .meta end-position has a suffix which matches .metn regex . If so, then this matching suffix is returned. Otherwise .code nil is returned. .coNP Functions @, r^$ @, r^ @ r$ and @ rr .synb .mets (r^$ < regex <> [ position ] << string ) .mets (r^ < regex <> [ position ] << string ) .mets (r$ < regex <> [ end-position ] << string ) .mets (rr < regex >> [ position <> [ from-end ]] << string ) .syne .desc The first three of these functions perform the same operations as, respectively, .codn m^$ , .code m^ and .codn m$ , with the same argument conventions. They differ in return value. When a match is found, they return a range value indicating the extent of the matching substring within .meta string rather than the matching substring itself. The .code rr function performs the same operation as .code range-regex with different conventions with regard to argument order, harmonizing with those of the other three functions above. The .meta position argument, if omitted, defaults to zero, so that the entire .meta string is operated upon. 
The .meta end-position argument defaults to the length of .metn string , so that the end position coincides with the end of the string. With one exception, a value in either parameter which is excessively negative or positive, such that it indexes before the start of the string or exceeds its length, results in a failed match and consequently .code nil being returned. The exception is that the .code rr function permits a negative .meta position value which refers before the start of the string; this is effectively treated as zero. The .meta from-end argument defaults to .codn nil . The .code r^$ function tests whether the entire portion of .meta string starting at .meta position through to the end of the string is in the set of strings matched by .metn regex . If this is true, then the matching range is returned, as a range object. Otherwise .code nil is returned. The .code r^ function tests whether the portion of the .meta string starting at .meta position has a prefix which matches .metn regex . If so, then the matching range is returned, as a range object. Otherwise .code nil is returned. The .code r$ function tests whether the portion of .meta string ending just before .meta end-position has a suffix which matches .metn regex . If so, then the matching range is returned. Otherwise .code nil is returned. The .code rr function searches .meta string starting at .meta position for a match for .metn regex . If .meta from-end is specified and true, the rightmost match is reported. If a match is found, it is reported as a range. A regular expression which matches empty strings matches at the start position, and every other position, including the position just after the last character, coinciding with the length of .metn string . Except for the different argument order such that .meta string is always the rightmost argument, the .code rr function is equivalent to the .code range-regex function, such that correspondingly named arguments have the same semantics.
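.TP* Examples:

The following calls are an illustrative sketch of the return values
described above, assuming that range objects print in the
.code #R
notation. The sample strings are arbitrary:

.verb
  ;; whole string, prefix and suffix matches as ranges
  (r^$ #/a+b/ "aaab") --> #R(0 4)
  (r^ #/a+/ "aaab") --> #R(0 3)
  (r$ #/b+/ "aaab") --> #R(3 4)

  ;; leftmost match anywhere in the string
  (rr #/b/ "aaab") --> #R(3 4)

  ;; the optional position precedes the string; the
  ;; range is relative to the start of the whole string
  (r^ #/a+/ 1 "xaab") --> #R(1 3)

  ;; failed match
  (r^ #/b/ "aaab") --> nil
.brev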
.coNP Function @ rra .synb .mets (rra < regex >> [ start <> [ end ]] << string ) .syne .desc The .code rra function searches .meta string between the .meta start and .meta end position for matches for the regular expression .metn regex . The matches are returned as a list of range objects. The .meta start argument defaults to zero, and .meta end defaults to the length of the string (the position one past the last character). Negative values of .meta start and .meta end indicate positions from the end of the string, such that -1 denotes the last character, -2 the second-to-last and so forth. If .meta start is so negative that it refers before the start of .metn string , it is treated as zero. If this situation is true of the .meta end argument, then the function returns .codn nil . If .meta start refers to a character position beyond the length of .meta string (two characters or more beyond the end of the string), then the function returns .codn nil . If this situation is true of .metn end , then .meta end is curtailed to the string length. The .code rra function returns all non-overlapping matches, including zero length matches. Zero length matches may occur before the first character of the string, or after the last character. If so, these are included. .coNP Functions @, f^$ @ f^ and @ f$ .synb .mets (f^$ < regex <> [ position ]) .mets (f^ < regex <> [ position ]) .mets (f$ < regex <> [ end-position ]) .syne .desc These regular-expression functions do not directly perform regex operations. Rather, they each return a function of one argument which performs a regex operation. The returned functions perform the same operations as, respectively, .codn m^$ , .code m^ and .codn m$ . 
The following equivalences nearly hold, except that the functions on the right side produced by .code op can accept two arguments when only .code r is curried, whereas the functions on the left take only one argument: .verb [f^$ r] <--> (op m^$ r) [f^$ r p] <--> (op m^$ r p) [f^ r] <--> (op m^ r) [f^ r p] <--> (op m^ r p) [f$ r] <--> (op m$ r) [f$ r p] <--> (op m$ r p) .brev That is to say, .code f^$ returns a function which binds .meta regex and possibly the optional .metn position . When this function is invoked, it must be given an argument which is a string. It performs the same operation as .code m^$ being called on .meta regex and possibly .metn position . The same holds between .code f^ and .codn m^ , and between .code f$ and .codn m$ . .TP* Examples: .verb ;; produce list which contains only strings ;; beginning with "cat": (keep-if (f^ #/cat/) '#"dog catalog cat fox catapult") --> ("catalog" "cat" "catapult") ;; map all strings in a list to just their trailing ;; digits. (mapcar (f$ #/\ed*/) '#"a123 4 z bc465") --> ("123" "4" "" "465") ;; check that all strings consist of digits after ;; the third position. (all '#"ABC123 DFE45 12379" (f^$ #/\ed*/ 3)) --> "79" ; i.e. true (all '#"ABC123 DFE45 12379A" (f^$ #/\ed*/ 3)) --> nil .brev .coNP Functions @, fr^$ @, fr^ @ fr$ and @ frr .synb .mets (fr^$ < regex <> [ position ]) .mets (fr^ < regex <> [ position ]) .mets (fr$ < regex <> [ end-position ]) .mets (frr < regex <> [[ start-position ] << from-end ]) .syne .desc These regular-expression functions do not directly perform regex operations. Rather, they each return a function of one argument which performs a regex operation. The returned functions perform the same operations as, respectively, .codn r^$ , .codn r^ , .code r$ and .codn rr . 
The following equivalences nearly hold, except that some of the functions on the right side produced by .code op can accept additional arguments after the input string, whereas the functions on the left produced by .code fr^$ et al. accept only one parameter: the input string. .verb [fr^$ r] <--> (op r^$ r) [fr^$ r p] <--> (op r^$ r p) [fr^ r] <--> (op r^ r) [fr^ r p] <--> (op r^ r p) [fr$ r] <--> (op r$ r) [fr$ r p] <--> (op r$ r p) [frr r] <--> (op rr r) [frr r s] <--> (op rr r s) [frr r s fe] <--> (op rr r s fe) .brev That is to say, .code fr^$ returns a function which binds .meta regex and possibly the optional .metn position . When this function is invoked, it must be given an argument which is a string. It performs the same operation as .code r^$ being called on .meta regex and possibly .metn position , and the string. The same holds between .code fr^ and .codn r^ , between .code fr$ and .codn r$ , and between .code frr and .codn rr . .TP* Examples: .verb ;; Remove leading digits from "123A456", ;; other than first digit: (regsub (fr^ #/\ed+/ 1) "" "123A456") --> "1A456" .brev .SS* Hashing Library A hash table is an object which retains an association between pairs of objects. Each pair consists of a key and a value. Given an object which is similar to a key in the hash table, it is possible to retrieve the corresponding value. Entries in a hash table are not ordered in any way, and lookup is facilitated by hashing: quickly mapping a key object to a numeric value which is then used to index into one of many buckets where the matching key will be found (if such a key is present in the hash table). In addition to keys and values, a hash table contains a storage location which allows it to be associated with user data. Important to the operation of a hash table is the criterion by which keys are considered the same. By default, this similarity follows the .code eql function. A hash table will search for a stored key which is .code eql to the given search key.
A hash table constructed with the .codn equal -based property compares keys using the .code equal function instead. In addition to storing key-value pairs, a hash table can have a piece of information associated with it, called the user data. \*(TX hash tables contain a seed value which permutes the hashing operation, at least for keys of certain types. This feature, if the seed is randomized, helps to prevent software from being susceptible to hash collision denial-of-service attacks. However, by default, the seed is not randomized. Newly created hash tables for which a seed value is not specified take their seed value from the .code *hash-seed* special variable, which is initialized to zero. That includes hash tables created by parsing hash literal syntax. Security-sensitive programs requiring protection against collision attacks may use .code gen-hash-seed to create a randomized hash seed, and, depending on their specific need, either store that value in .codn *hash-seed* , or pass the value to hash-table constructors like .codn make-hash , or both. Note: randomization of hash seeding isn't a default behavior because it affects program reproducibility. The seed value affects the order in which keys are traversed, which can change the output of programs whose inputs have not changed, and whose logic is otherwise deterministic. A hash table can be traversed to visit all of the keys and data. The order of traversal bears no relation to the order of insertion, or to any properties of the key type. During an open traversal, new keys can be inserted into a hash table or deleted from it while the traversal is in progress. Insertion of a new key during traversal will not cause any existing key to be visited twice or to be skipped; however, it is not specified whether the new key will be traversed.
Similarly, if a key is deleted during traversal, and that key has not yet been visited, it is not specified whether it will be visited during the remainder of the traversal. These remarks apply not only to deletion via .code remhash or the .code del operator, but also to wholesale deletion of all keys via .codn clearhash . The garbage collection of hash tables supports weak keys and weak values. If a hash table has weak keys, this means that from the point of view of garbage collection, that table holds only weak references to the keys stored in it. Similarly, if a hash table has weak values, it means that it holds a weak reference to each value stored. A weak reference is one which does not prevent the reclamation of an object by the garbage collector. That is to say, when the garbage collector discovers that the only references to some object are weak references, then that object is considered garbage, just as if it had no references to it. The object is reclaimed, and the weak references "lapse" in some way, which depends on what kind they are. Hash-table weak references lapse by entry removal. When an object used as a key in one or more weak-key hash tables becomes unreachable, those hash entries disappear. This happens even if the values are themselves reachable. Vice versa, when an object appearing as a value in one or more weak-value hash tables becomes unreachable, those entries disappear, even if the keys are reachable. When a hash table has both weak keys and weak values, then the behavior is one of two possible semantics. Under the .codn or -semantics, the hash table entry is removed if either the key or the value is unreachable. Under the .codn and -semantics, the entry is removed only if both the key and value are unreachable.
If the keys of a weak-key hash table are reachable from the values, or if the values of a weak-value hash table are reachable from the keys, then the weak semantics is defeated for the affected entries: the hash table retains those entries as if it were an ordinary table. A hash table with both weak keys and values does not have this issue, regardless of its semantics. An open traversal of a hash table is performed by the .code maphash function and the .code dohash operator. The traversal is open because code supplied by the program is evaluated for each entry. The functions .codn hash-keys , .codn hash-values , .codn hash-pairs , and .code hash-alist also perform an open traversal, because they return lazy lists. The traversal isn't complete until the returned lazy list is fully instantiated. In the meanwhile, the \*(TX program can mutate the hash table from which the lazy list is being generated. Certain hash operations expose access to the internal key-value association entries of a hash table, which are represented as ordinary .code cons cells. Modifying the .code car field of such a cell potentially violates the integrity of the hash table; the behavior of subsequent lookup and insertion operations becomes unspecified. Similarly, if an object is used as a key in an .codn equal -based hash table, and that object is mutated in such a way that its equality to other objects under the .code equal function is affected or its hash value under .code hash-equal is altered, the behavior of subsequent lookup and insertion operations on the hash table becomes unspecified. .coNP Functions @ make-hash and @ hash .synb .mets (make-hash < weak-keys < weak-vals .mets \ \ \ \ \ \ \ \ \ \ < equal-based <> [ hash-seed ]) .mets (hash {:weak-keys | :weak-vals | :weak-or | :weak-and .mets \ \ \ \ \ \ :eql-based | :equal-based | .mets \ \ \ \ \ \ :eq-based | :userdata << obj }*) .syne .desc These functions construct a new hash table. .code make-hash takes three mandatory Boolean arguments.
The Boolean .meta weak-keys argument specifies whether the hash table shall have weak keys. The .meta weak-vals argument specifies whether it shall have weak values, and .meta equal-based specifies whether it is .codn equal -based. If the .meta weak-keys argument is one of the keywords .code :weak-and or .code :weak-or then the hash table shall have both weak keys and weak values, with the semantics implied by the keyword: .code :weak-and specifies .codn and -semantics and .code :weak-or specifies .codn or -semantics. The .meta weak-vals argument is then ignored. If both .meta weak-keys and .meta weak-vals are true, and .meta weak-keys is not one of the keywords .code :weak-and or .codn :weak-or , then the hash table has .codn or -semantics. The .code hash function defaults the two weakness properties to false, and allows them to be overridden to true by the presence of keyword arguments; its default key equality is described below. The optional .meta hash-seed parameter must be an integer, if specified. Its value perturbs the hashing function of the hash table, which affects .code :equal-based hash tables, when character strings and buffers are used as keys. If .meta hash-seed is omitted, then the value of the .code *hash-seed* variable is used as the seed. It is an error to attempt to construct an .codn equal -based hash table which has weak keys. The .code hash function provides an alternative interface. It accepts optional keyword arguments. The supported keyword symbols are: .codn :weak-keys , .codn :weak-vals , .codn :weak-and , .codn :weak-or , .codn :equal-based , .codn :eql-based , .code :eq-based and .code :userdata which can be specified in any order to turn on the corresponding properties in the newly constructed hash table. Only one of .codn :equal-based , .code :eql-based and .code :eq-based may be specified. If specified, then the hash table uses .codn equal , .code eql or .code eq equality, respectively, for considering two keys to be the same key.
If none of these is specified, the .code hash function produces an .code :equal-based hash table by default. If .codn :weak-keys , .code :weak-and or .code :weak-or is specified, then .code :equal-based may not be specified. At most one of .code :weak-and or .code :weak-or may be specified. If either of these is specified, then the .code :weak-keys and .code :weak-vals keywords are redundant and unnecessary. If .code :weak-keys and .code :weak-vals are both specified, and .code :weak-and isn't specified, the situation is equivalent to .codn :weak-or . If .code :userdata is present, it must be followed by an argument value; that value specifies the user data for the hash table, which can be retrieved using the .code hash-userdata function. Note: there doesn't exist a keyword for specifying the seed. This omission is deliberate. These hash construction keywords may appear in the hash literal .code #H syntax. A seed keyword would allow literals to specify their own seed, which would allow malicious hash literals to be crafted that perpetrate a hash collision attack against the parser. .coNP Functions @, hash-construct @ hash-from-pairs and @ hash-from-alist .synb .mets (hash-construct < hash-args << key-val-pairs ) .mets (hash-from-pairs < key-val-pairs << hash-arg *) .mets (hash-from-alist < alist << hash-arg *) .syne .desc The .code hash-construct function constructs a populated hash in one step. The .meta hash-args argument specifies a list suitable as an argument list in a call to the .code hash function. The .meta key-val-pairs is a sequence of pairs, which are two-element lists representing key-value pairs. A hash is constructed as if by a call to .mono .meti (apply hash << hash-args ), .onom then populated with the specified pairs, and returned. The .code hash-from-pairs function is an alternative interface to the same semantics. 
The .meta key-val-pairs argument is first, and the .meta hash-args are passed as trailing variadic arguments, rather than a single list argument. The .code hash-from-alist function is similar to .codn hash-from-pairs , except that the .meta alist argument specifies the keys and values as an association list. The elements of the list are .code cons cells, each of whose .code car is a key, and whose .code cdr is the value. .coNP Function @ hash-list .synb .mets (hash-list < key-list << hash-arg *) .syne .desc The .code hash-list function constructs a hash as if by a call to .mono .meti [apply hash << hash-args ], .onom where .meta hash-args is a list of the individual .meta hash-arg variadic arguments. The hash is then populated with keys taken from .meta key-list and returned. The value associated with each key is that key itself. .coNP Function @ hash-zip .synb .mets (hash-zip < key-seq < value-seq << hash-arg *) .syne .desc The .code hash-zip function constructs a hash as if by a call to .mono .meti (apply hash << hash-args ), .onom where .meta hash-args is a list of the individual .meta hash-arg variadic arguments. The hash is then populated with keys taken from .meta key-seq which are paired with values taken from .metn value-seq , and returned. If .meta key-seq is longer than .metn value-seq , then the excess keys are ignored, and vice versa. .coNP Function @ hash-props .synb .mets (hash-props >> { key << value }*) .syne .desc The .code hash-props function constructs a populated hash table without requiring the caller to construct a list of entries. The hash table contents are specified as direct arguments. The .code hash-props function requires an even number of arguments, which are interleaved key-value pairs. The returned hash table is .codn equal -based, and no parameters are available for customizing any of its properties, such as weakness.
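.TP* Example:

The following is an illustrative sketch rather than normative text. It
builds the same two-entry table through several of the constructors described
above and looks up one key in each. The variable names are arbitrary, and the
results are read back with
.code gethash
rather than printed, since the printed order of hash entries is not specified.

.verb
  ;; h1 through h5 are hypothetical names chosen for this sketch
  (let ((h1 (hash-props 'a 1 'b 2))
        (h2 (hash-zip '(a b) '(1 2)))
        (h3 (hash-from-pairs '((a 1) (b 2))))
        (h4 (hash-from-alist '((a . 1) (b . 2))))
        (h5 (hash-list '(a b))))   ;; each key maps to itself
    (list (gethash h1 'b) (gethash h2 'b) (gethash h3 'b)
          (gethash h4 'b) (gethash h5 'b)))
  --> (2 2 2 2 b)
.brev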
.coNP Function @ hash-map .synb .mets (hash-map < function < sequence << hash-arg *) .syne .desc The .code hash-map function constructs a hash table from a .meta sequence of keys and a .meta function which maps them to values. The .meta function argument must be a function that can be called with one argument. The elements of .meta sequence become the keys of the returned hash table. The value associated with each key is determined by passing that key to .meta function and taking the returned value. The remaining .meta hash-arg arguments determine what kind of hash table is created, as if the .code hash function were applied to them. If the sequence contains duplicate elements (according to the hash table equality in effect for the hash table being constructed), duplicate elements later in the sequence replace earlier elements. .coNP Function @ hash-update .synb .mets (hash-update < hash << function ) .syne .desc The .code hash-update function replaces each value in .meta hash with the value of .meta function applied to that value. The return value is .metn hash . .coNP Function @ hash-update-1 .synb .mets (hash-update-1 < hash < key < function <> [ init ]) .syne .desc The .code hash-update-1 function operates on a single entry in the hash table. If .meta key exists in the hash table, then its corresponding value is passed into .metn function , and the return value of .meta function is then installed in place of the key's value. The value is then returned. If .meta key does not exist in the hash table, and no .meta init argument is given, then .code hash-update-1 does nothing and returns .codn nil . If .meta key does not exist in the hash table, and an .meta init argument is given, then .meta function is applied to .metn init , and then .meta key is inserted into .meta hash with the value returned by .meta function as the datum. This value is also returned.
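.TP* Example:

The three cases described above can be sketched as follows. This is an
illustration under the stated semantics, not an excerpt from a test suite;
the keys
.code a
and
.code b
are arbitrary, and only the final state of the table is inspected.

.verb
  (let ((h (hash)))
    (hash-update-1 h 'a (op + @1 1) 0)  ;; a absent, init 0: stores 1
    (hash-update-1 h 'a (op + @1 1) 0)  ;; a present: 1 becomes 2
    (hash-update-1 h 'b (op + @1 1))    ;; b absent, no init: no effect
    (list (gethash h 'a) (gethash h 'b) (hash-count h)))
  --> (2 nil 1)
.brev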
.coNP Functions @ group-by and @ group-map .synb .mets (group-by < by-fun < sequence << option *) .mets (group-map < by-fun < filter-fun < sequence << option *) .syne .desc The .code group-by function produces a hash table from .metn sequence . Entries of the hash table are not elements of .metn sequence , but lists of elements of .metn sequence . The function .meta by-fun is applied to each element of .meta sequence to compute a key. That key is used to determine which list the item is added to in the hash table. The trailing arguments .mono .meti << option * .onom if any, consist of the same keywords that are understood by the .code hash function, and determine the properties of the hash. The .code group-map function extends the semantics of .code group-by with a filtering step. It groups the elements of .meta sequence in exactly the same manner, using .metn by-fun . These lists of elements are then passed to .meta filter-fun whose return values become the values associated with the hash table keys. The effect of .code group-map may be obtained by a combination of .code group-by and .code hash-update according to the following equivalence: .verb (group-map bf ff seq) <--> (let ((h (group-by bf seq))) (hash-update h ff)) .brev .TP* Examples: Group the integers from 0 to 10 into three buckets keyed on 0, 1 and 2 according to the modulo 3 congruence: .verb (group-by (op mod @1 3) 0..11) -> #H(() (0 (0 3 6 9)) (1 (1 4 7 10)) (2 (2 5 8))) .brev Same as above, but associate the keys with the sums of the buckets: .verb [group-map (op mod @1 3) sum 0..11] -> #H(() (0 18) (1 22) (2 15)) .brev .coNP Function @ group-reduce .synb .mets (group-reduce < hash < classify-fun < binary-fun < seq .mets \ \ >> [ init-value <> [ filter-fun ]]) .syne .desc The .code group-reduce function updates hash table .meta hash by grouping and reducing sequence .metn seq . The function regards the hash table as being populated with keys denoting accumulator values.
Missing accumulators which need to be created in the hash table are initialized with .meta init-value which defaults to .codn nil . The function iterates over .meta seq and treats each element according to the following steps: .RS .IP 1. Each element is mapped to a hash key through .metn classify-fun . .IP 2. The value associated with the hash key (the accumulator for that key) is retrieved. If it doesn't exist, .meta init-value is used. .IP 3. The function .meta binary-fun is invoked with two arguments: the accumulator from step 2, and the original element from .metn seq . .IP 4. The resulting value from step 3 is stored back into the hash table under the key from step 2. .RE .IP After the above processing, one more step is performed if the .meta filter-fun argument is present. In this case, the hash table is destructively mapped through .meta filter-fun before being returned. That is to say, every value in the hash table is projected through .meta filter-fun and stored back in the table under the same key, as if by an invocation of the .mono .meti (hash-update < hash << filter-fun ) .onom expression. .IP If .code group-reduce is invoked on an empty hash table, its net result closely resembles a .code group-by operation followed by separately performing a .code reduce-left on each value in the hash.
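.TP* Illustrative sketch:

The numbered steps above can be restated as ordinary code. The function
.code group-reduce-sketch
below is a hypothetical name introduced only for this illustration; it is not
how the library implements
.codn group-reduce ,
and it assumes that the sequence argument is a list.

.verb
  ;; group-reduce-sketch is a hypothetical helper, not a library function
  (defun group-reduce-sketch (hash classify-fun binary-fun seq
                              : init-value filter-fun)
    (each ((item seq))
      (let ((key [classify-fun item]))            ;; step 1
        (sethash hash key                         ;; step 4
                 [binary-fun
                   (gethash hash key init-value)  ;; step 2
                   item])))                       ;; step 3
    (if filter-fun
      (hash-update hash filter-fun))              ;; optional final step
    hash)

  (let ((h [group-reduce-sketch (hash) evenp + '(1 2 3 4) 0]))
    (list (gethash h t) (gethash h nil)))
  --> (6 4)
.brev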
.TP* Examples: Frequency histogram: .verb [group-reduce (hash) identity (do inc @1) "fourscoreandsevenyearsago" 0] --> #H(() (#\ea 3) (#\ec 1) (#\ed 1) (#\ee 4) (#\ef 1) (#\eg 1) (#\en 2) (#\eo 3) (#\er 3) (#\es 3) (#\eu 1) (#\ev 1) (#\ey 1)) .brev Separate the integers 1\(en10 into even and odd, and sum these groups: .verb [group-reduce (hash) evenp + 1..11 0] -> #H(() (t 30) (nil 25)) .brev .coNP Functions @ hist-sort and @ hist-sort-by .synb .mets (hist-sort < sequence << option *) .mets (hist-sort-by < by-fun < sequence << option *) .syne .desc The .code hist-sort function produces a histogram in the form of an association list, which is sorted in descending order of frequency. The keys in the association list are elements of .meta sequence and the values are the frequency values: positive integers indicating how many times the keys occur in .metn sequence . Note: for a description of association lists, see the .code assoc function, and the section Association Lists in which its description is contained. The .code hist-sort function works by internally constructing a hash table, which is not returned. Elements of .meta sequence serve as keys in that hash. The trailing arguments .mono .meti << option * .onom if any, consist of the same keywords that are understood by the .code hash function, and determine the properties of that hash. The .code hist-sort-by function differs from .code hist-sort in that it requires an additional argument .meta by-fun with the following semantics: every element of .meta sequence is passed to .meta by-fun such that the resulting value is used as the hash key in the resulting histogram. Thus, an invocation of .code hist-sort is equivalent to an invocation of .code hist-sort-by where the .meta by-fun argument is specified as the .code identity function. .TP* Examples .verb (hist-sort nil) -> nil (hist-sort '(3 4 5)) -> ((3 . 1) (4 . 1) (5 . 1)) (hist-sort '("a" "b" "c" "a" "b" "a" "b" "a")) -> (("a" . 4) ("b" . 3) ("c" . 
1)) .brev .coNP Functions @ make-similar-hash and @ copy-hash .synb .mets (make-similar-hash << hash ) .mets (copy-hash << hash ) .syne .desc The .code make-similar-hash and .code copy-hash functions create a new hash object based on the existing .meta hash object. .code make-similar-hash produces an empty hash table which inherits all of the attributes of .metn hash . It uses the same kind of key equality, the same configuration of weak keys and values, and has the same user data (see the .code set-hash-userdata function). The .code copy-hash function also produces a hash table similar to .metn hash , in the same way as .codn make-similar-hash . However, rather than producing an empty hash table, it returns a duplicate table which has all the same elements as .metn hash : it contains the same key and value objects. .coNP Function @ inhash .synb .mets (inhash < hash < key <> [ init ]) .syne .desc The .code inhash function searches hash table .meta hash for .metn key . If .meta key is found, then it returns the hash table's cons cell which represents the association between .meta key and its value. Otherwise, it returns .codn nil . If argument .meta init is specified, then the function will create an entry for .meta key in .meta hash whose value is that of .metn init . The cons cell representing that association is returned. Note: for as long as the .meta key continues to exist inside .metn hash , modifying the .code car field of the returned cons has ramifications for the logical integrity of the hash; doing so results in unspecified behavior for subsequent insertion and lookup operations. Modifying the .code cdr field has the effect of updating the association with a new value. .coNP Accessor @ gethash .synb .mets (gethash < hash < key <> [ alt ]) .mets (set (gethash < hash < key <> [ alt ]) << new-value ) .syne .desc The .code gethash function searches hash table .meta hash for key .metn key . If the key is found then the associated value is returned.
Otherwise, if the .meta alt argument was specified, it is returned. If the .meta alt argument was not specified, .code nil is returned. A valid .code gethash form serves as a place. It denotes either an existing value in a hash table or a value that would be created by the evaluation of the form. The .meta alt argument is meaningful when .code gethash is used as a place, and, if present, is always evaluated whenever the place is evaluated. In place update operations, it provides the initial value, which defaults to .code nil if the argument is not specified. For example .code "(inc (gethash h k d))" will increment the value stored under key .code k in hash table .code h by one. If the key does not exist in the hash table, then the value .code "(+ 1 d)" is inserted into the table under that key. The expression .code d is always evaluated, whether or not its value is needed. If a .code gethash place is subject to a deletion, but doesn't exist, it is not an error. The operation does nothing, and .code nil is considered the prior value of the place yielded by the deletion. .coNP Function @ sethash .synb .mets (sethash < hash < key << value ) .syne .desc The .code sethash function places a value into .meta hash table under the given .metn key . If a similar key already exists in the hash table, then that key's value is replaced by .metn value . Otherwise, the .meta key and .meta value pair is newly inserted into .metn hash . The .code sethash function returns the .meta value argument. .coNP Function @ pushhash .synb .mets (pushhash < hash < key << element ) .syne .desc The .code pushhash function is useful when the values stored in a hash table are lists. If the given .meta key does not already exist in .metn hash , then a list of length one is made which contains .metn element , and stored in .meta hash table under .metn key . If the .meta key already exists in the hash table, then the corresponding value must be a list. 
The .meta element value is added to the front of that list, and the extended list then becomes the new value under .metn key . The return value is Boolean. If true, it indicates that the hash-table entry was newly created. If false, it indicates that the push took place on an existing entry. .coNP Function @ remhash .synb .mets (remhash < hash << key ) .syne .desc The .code remhash function searches .meta hash for a key similar to the .metn key . If that key is found, then that key and its corresponding value are removed from the hash table. If the key is found and removal takes place, then the associated value is returned. Otherwise .code nil is returned. .coNP Function @ clearhash .synb .mets (clearhash << hash ) .syne .desc The .code clearhash function removes all key-value pairs from .metn hash , causing it to be empty. If .meta hash is already empty prior to the operation, then .code nil is returned. Otherwise an integer is returned indicating the number of entries that were purged from .metn hash . .coNP Function @ hash-count .synb .mets (hash-count << hash ) .syne .desc The .code hash-count function returns an integer representing the number of key-value pairs stored in .metn hash . .coNP Accessor @ hash-userdata .synb .mets (hash-userdata << hash ) .mets (set (hash-userdata << hash ) << new-value ) .syne .desc The .code hash-userdata function retrieves the user data object associated with .metn hash . A hash table can be created with user data using the .code :userdata keyword in a hash-table literal or in a call to the .code hash function, directly, or via other hash-constructing functions which take the hash construction keywords, such as .codn group-by . If a hash table is created without user data, its user data is initialized to .codn nil . Because .code hash-userdata is an accessor, a .code hash-userdata form can be used as a place. Assigning a value to this place causes the user data of .meta hash to be replaced with that value.
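.TP* Example:

A minimal sketch of the accessor used both as a function and as a place.
The symbols
.code initial
and
.code updated
are arbitrary values chosen for the illustration.

.verb
  (let* ((h (hash :userdata 'initial))
         (before (hash-userdata h)))    ;; read the user data
    (set (hash-userdata h) 'updated)    ;; assign through the place
    (list before (hash-userdata h)))
  --> (initial updated)
.brev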
.coNP Function @ get-hash-userdata .synb .mets (get-hash-userdata << hash ) .syne .desc The .code get-hash-userdata function is a deprecated synonym for .codn hash-userdata . .coNP Function @ set-hash-userdata .synb .mets (set-hash-userdata < hash << object ) .syne .desc The .code set-hash-userdata function replaces, with the .metn object , the user data object associated with .metn hash . .coNP Function @ hashp .synb .mets (hashp << object ) .syne .desc The .code hashp function returns .code t if the .meta object is a hash table, otherwise it returns .codn nil . .coNP Function @ maphash .synb .mets (maphash < binary-function << hash ) .syne .desc The .code maphash function successively invokes .meta binary-function for each entry stored in .metn hash . Each entry's key and value are passed as arguments to .metn binary-function . The function returns .codn nil . .coNP Functions @ hash-revget and @ hash-keys-of .synb .mets (hash-revget < hash < value >> [ testfun <> [ keyfun ]]) .mets (hash-keys-of < hash < value >> [ testfun <> [ keyfun ]]) .syne .desc The .code hash-revget function performs a reverse lookup on .metn hash . It searches through the entries stored in .meta hash for an entry whose value matches .metn value . If such an entry is found, that entry's key is returned. Otherwise .code nil is returned. If multiple matching entries exist, it is not specified which entry's key is returned. The .code hash-keys-of function has exactly the same argument conventions, and likewise searches the .metn hash . However, it returns a list of all keys whose values match .metn value . The .meta keyfun function is applied to each value in .meta hash and the resulting value is compared with .metn value . The default .meta keyfun is the .code identity function. The comparison is performed using .metn testfun . The default .meta testfun is the .code equal function.
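.TP* Example:

The following sketch illustrates the reverse lookups on a small table whose
values are distinct, so that each result is unambiguous. The table contents are
arbitrary. The last lookup supplies a
.meta keyfun
which multiplies each stored value by ten before it is compared with the
search value.

.verb
  (let ((h (hash-props 'a 1 'b 2 'c 3)))
    (list (hash-revget h 2)                     ;; key whose value is 2
          (hash-revget h 4)                     ;; no such value
          (hash-keys-of h 3)                    ;; all keys with value 3
          [hash-revget h 20 equal (op * 10 @1)]))
  --> (b nil (c) b)
.brev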
.coNP Function @ hash-invert .synb .mets (hash-invert < hash >> [ joinfun >> [ unitfun << hash-arg *]]) .syne .desc The .code hash-invert function calculates and returns an inversion of hash table .metn hash . The values in .meta hash become keys in the returned hash table. Conversely, the values in the returned hash table are derived from the keys. The optional .meta joinfun and .meta unitfun arguments must be functions, if they are given. These functions determine the behavior of .code hash-invert with regard to duplicate values in .meta hash which turn into duplicate keys. The .meta joinfun function must be callable with two arguments, and .meta unitfun must accept one argument. If .meta joinfun is omitted, it defaults to the .code identity* function; .meta unitfun defaults to .codn identity . The .code hash-invert function constructs a hash table as if by a call to the .code hash function, passing the .meta hash-arg arguments which determine the properties of the newly created hash. The new hash table is then populated by iterating over the key-value pairs of .meta hash and inserting them as follows: The key from .meta hash is turned into a value .meta v1 by invoking the .meta unitfun function on it, and taking the return value. The value from .meta hash is used as a key to perform a lookup in the new hash table. If no entry exists, then a new entry is created, whose value is .metn v1 . Otherwise if the entry already exists, then the value .meta v0 of that entry is combined with .meta v1 by calling the .meta joinfun on the arguments .meta v0 and .metn v1 . The entry is updated with the resulting value. The new hash table is then returned.
.TP* Examples: .verb ;; Invert simple 1 to 1 table: (hash-invert #H(() (a 1) (b 2) (c 3))) --> #H(() (1 a) (2 b) (3 c)) ;; Invert table such that the keys of duplicate values ;; are accumulated into lists: [hash-invert #H(() (1 a) (2 a) (3 c) (5 c) (7 d)) append list] --> #H(() (d (7)) (c (3 5)) (a (1 2))) ;; Invert table such that keys of duplicate values are summed: [hash-invert #H(() (1 a) (2 a) (3 c) (5 c) (7 d)) +] --> #H(() (d 7) (c 8) (a 3)) .brev .coNP Functions @ hash-eql and @ hash-equal .synb .mets (hash-eql << object ) .mets (hash-equal < object <> [ hash-seed ]) .syne .desc These functions each compute an integer hash value from the internal representation of .metn object , which satisfies the following properties. If two objects .code A and .code B are the same under the .code eql function, then .code "(hash-eql A)" and .code "(hash-eql B)" produce the same integer hash value. Similarly, if two objects .code A and .code B are the same under the .code equal function, then .code "(hash-equal A)" and .code "(hash-equal B)" each produce the same integer hash value. In all other circumstances, the hash values of two distinct objects are unrelated, and may or may not be the same. Objects of struct type may support custom hashing by way of defining an equality substitution via an .code equal method. See the Equality Substitution section under Structures. The optional .meta hash-seed value perturbs the hashing function used by .code hash-equal for strings and buffer objects. This seed value must be a nonnegative integer no wider than 64 bits: that is, in the range 0 to 18446744073709551615. If the value isn't specified, it defaults to zero. On systems with 32-bit addresses, only the low 32 bits of this value may be significant. Effectively, each possible value of the significant part of the seed specifies a different hashing function.
If two objects .code A and .code B are the same under the .code equal function, then .code "(hash-equal A S)" and .code "(hash-equal B S)" each produce the same integer hash value for any valid seed value .codn S . The value returned is a .code fixnum value, and may be negative. It may be any value in the range .code fixnum-min to .codn fixnum-max . .coNP Functions @, hash-keys @, hash-values @ hash-pairs and @ hash-alist .synb .mets (hash-keys << hash ) .mets (hash-values << hash ) .mets (hash-pairs << hash ) .mets (hash-alist << hash ) .syne .desc These functions retrieve the bulk key-value data of hash table .meta hash in various ways. .code hash-keys retrieves a list of the keys. .code hash-values retrieves a list of the values. .code hash-pairs retrieves a list of pairs, which are two-element lists consisting of the key, followed by the value. Finally, .code hash-alist retrieves the key-value pairs as a Lisp association list: a list of cons cells whose .code car fields are keys, and whose .code cdr fields are the values. Note that .code hash-alist returns the actual entries from the hash table, which are conses. Modifying the .code cdr fields of these conses constitutes modifying the hash values in the original hash table. Modifying the .code car fields interferes with the integrity of the hash table, resulting in unspecified behavior for subsequent hash insertion and lookup operations. These functions all retrieve the keys and values in the same order. For example, if the keys are retrieved with .codn hash-keys , and the values with .codn hash-values , then the corresponding entries from each list pairwise correspond to the pairs in .metn hash . The list returned by each of these functions is lazy, and hence constitutes an open traversal of the hash table. .coNP Operator @ dohash .synb .mets (dohash >> ( key-var < value-var < hash-form <> [ result-form ]) .mets \ \ << body-form *) .syne .desc The .code dohash operator iterates over a hash table. 
The .meta hash-form expression must evaluate to an object of hash-table type. The .meta key-var and .meta value-var arguments must be symbols suitable for use as variable names. Bindings are established for these variables over the scope of the .metn body-form s and the optional .metn result-form . For each element in the hash table, the .meta key-var and .meta value-var variables are set to the key and value of that entry, respectively, and each .metn body-form , if there are any, is evaluated. When all of the entries of the table are thus processed, the .meta result-form is evaluated, and its return value becomes the return value of the .code dohash form. If there is no .metn result-form , the return value is .codn nil . The .meta result-form and .metn body-form s are in the scope of an implicit anonymous block, which means that it is possible to terminate the execution of .code dohash early using .mono .meti (return << value ) .onom or .codn (return) . .coNP Functions @, hash-uni @, hash-join @, hash-diff @ hash-symdiff and @ hash-isec .synb .mets (hash-uni < hash1 < hash2 >> [ joinfun >> [ map1fun <> [ map2fun ]]]) .mets (hash-join < hash1 < hash2 < joinfun >> [ hash1dfl <> [ hash2dfl ]]) .mets (hash-diff < hash1 << hash2 ) .mets (hash-symdiff < hash1 << hash2 ) .mets (hash-isec < hash1 < hash2 <> [ joinfun ]) .syne .desc These functions perform basic set operations on hash tables in a nondestructive way, returning a new hash table without altering the inputs. The arguments .meta hash1 and .meta hash2 must be compatible hash tables. This means that their keys must use the same kind of equality. The resulting hash table inherits attributes from .metn hash1 , as if created by the .code make-similar-hash function. If .meta hash1 has userdata, the resulting hash table has the same userdata. If .meta hash1 has weak keys, the resulting table has weak keys, and so forth. The .code hash-uni function performs a set union.
The resulting hash contains all of the keys from .meta hash1 and all of the keys from .metn hash2 , and their corresponding values. If a key occurs both in .meta hash1 and .metn hash2 , then it occurs only once in the resulting hash. In this case, if the .meta joinfun argument is not given, the value associated with this key is the one from .metn hash1 . If .meta joinfun is specified then it is called with two arguments: the respective data items from .meta hash1 and .metn hash2 . The return value of this function is used as the value in the union hash. If .meta map1fun is specified it must be a function that can be called with one argument. All values from .meta hash1 are projected through this function: the function is applied to each value, and the function's return value is used in place of the original value. Similarly, if .meta map2fun is present, it specifies a function through which values from .meta hash2 are projected. These two functions are independent of .metn joinfun ; they are applied to values without regard for whether their keys exist in both hashes or just one. The .code hash-join function performs a union operation similar to, but usefully different from .codn hash-uni . The .meta joinfun argument is mandatory in .codn hash-join , and is applied to all items, regardless of whether they are present in just one hash or both hashes. The arguments .meta hash1dfl and .meta hash2dfl specify default values used in invocations of .meta joinfun for keys that are present only in one hash. These values default to .codn nil . For every key that is present only in .metn hash1 , .meta joinfun is invoked with that key's value as its left argument, and the .meta hash2dfl value as the right argument. Conversely, for every key that is present only in .metn hash2 , .meta joinfun is invoked with the .meta hash1dfl value as the left argument, and that key's value as its right argument.
For every key that is present in both hashes, .meta joinfun is invoked with the values, respectively, from .meta hash1 and .metn hash2 . The returned hash contains all the keys from both hashes, associated with the values returned by .metn joinfun . The .code hash-diff function performs a set difference. First, a copy of .meta hash1 is made as if by the .code copy-hash function. Then from this copy, all keys which occur in .meta hash2 are deleted. The .code hash-symdiff function performs a symmetric difference. A new hash is returned which contains all of the keys from .meta hash1 that are not in .meta hash2 and vice versa: all of the keys from .meta hash2 that are not in .metn hash1 . The keys carry their corresponding values from .meta hash1 and .metn hash2 , respectively. The .code hash-isec function performs a set intersection. The resulting hash contains only those keys which occur both in .meta hash1 and .metn hash2 . If .meta joinfun is not specified, the values selected for these common keys are those from .metn hash1 . If .meta joinfun is specified, then for each key which occurs in both .meta hash1 and .metn hash2 , it is called with two arguments: the respective data items. The return value is then used as the data item in the intersection hash. .coNP Functions @ hash-subset and @ hash-proper-subset .synb .mets (hash-subset < hash1 << hash2 ) .mets (hash-proper-subset < hash1 << hash2 ) .syne .desc The .code hash-subset function returns .code t if the keys in .meta hash1 are a subset of the keys in .metn hash2 . The .code hash-proper-subset function returns .code t if the keys in .meta hash1 are a proper subset of the keys in .metn hash2 . This means that .meta hash2 has all the keys which are in .meta hash1 and at least one which isn't. Note: the return value may not be mathematically meaningful if .meta hash1 and .meta hash2 use different equality. In any case, the actual behavior may be understood as follows.
The implementation of .code hash-subset tests whether each of the keys in .meta hash1 occurs in .meta hash2 using their respective equalities. The implementation of .code hash-proper-subset applies .code hash-subset first, as above. If that is true, and the two hashes have the same number of elements, the result is falsified. .coNP Functions @, hash-begin @, hash-reset @ hash-next and @ hash-peek .synb .mets (hash-begin << hash ) .mets (hash-reset < hash-iter << hash ) .mets (hash-next << hash-iter ) .mets (hash-peek << hash-iter ) .syne .desc The .code hash-begin function returns an iterator object capable of retrieving the entries stored in .meta hash one by one. The .code hash-reset function changes the state of an existing iterator, such that it becomes prepared to retrieve the entries stored in the newly given .metn hash , which may be the same one as the previously associated hash. In addition, .code hash-reset may be given a .meta hash argument of .codn nil , which dissociates it from its hash table. The .code hash-next function's .meta hash-iter argument is a hash iterator returned by .codn hash-begin . If unvisited entries remain in .metn hash , then .code hash-next returns the next one as a cons cell whose .code car holds the key and whose .code cdr holds the value. That entry is then considered visited by the iterator. If no more entries remain to be visited, .code hash-next returns .codn nil . The .code hash-next function also returns .code nil if the iterator has been dissociated from a hash table by .codn hash-reset . The .code hash-peek function returns the same value that a subsequent call to .code hash-next will return for the same .metn hash-iter , without changing the state of .metn hash-iter . That is to say, if a cell representing a hash entry is returned, that entry remains unvisited by the iterator.
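.TP* Example:

The following is a minimal sketch of stepping through a table by hand
with the functions described above. It assumes a table with a single entry,
written using the
.code #H
literal notation, so that traversal order plays no role in the result.

.verb
  ;; hash-peek reports the pending entry without consuming it;
  ;; the first hash-next consumes it; the second finds nothing left.
  (let ((iter (hash-begin #H(() (a 1)))))
    (list (hash-peek iter) (hash-next iter) (hash-next iter)))
  --> ((a . 1) (a . 1) nil)
.brev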
.coNP Macro @ with-hash-iter .synb .mets (with-hash-iter >> ( isym < hash-form >> [ ksym <> [ vsym ]]) .mets \ \ << body-form *) .syne .desc The .code with-hash-iter macro evaluates .metn body-form s in an environment in which a lexically scoped function is visible. The function is named by .meta isym which must be a symbol suitable for naming functions with .codn flet . The .meta hash-form argument must be a form which evaluates to a hash-table object. Invocations of the function retrieve successive entries of the hash table as cons-cell pairs of keys and values. The function returns .code nil to indicate no more entries remain. If either of the .meta ksym or .meta vsym arguments is present, it must be a symbol suitable as a variable name. They are bound as variables visible to .metn body-form s, initialized to the value .codn nil . If .meta ksym is specified, then whenever the function .meta isym is invoked and retrieves a hash-table entry, the .meta ksym variable is set to the key. If the function returns .code nil then the value of .meta ksym is set to .codn nil . Similarly, if .meta vsym is specified, then the function stores the retrieved hash value in that variable, or else sets the variable to .code nil if there is no next value. .coNP Function @ copy-hash-iter .synb .mets (copy-hash-iter << iter ) .syne .desc The .code copy-hash-iter function creates and returns a duplicate of the .meta iter object, which must be a hash iterator returned by .codn hash-begin . The returned object has the same state as the original; it references the same traversal position in the same hash. However, it is independent of the original. Calls to .code hash-next on the original have no effect on the duplicate and vice versa. .coNP Special Variable @ *hash-seed* .desc The .code *hash-seed* special variable is initialized with a value of zero.
Whenever a new hash table is explicitly or implicitly created, it takes its seed from the value of the .code *hash-seed* variable in the current dynamic environment. The only situation in which .code *hash-seed* is not used when creating a new hash table is when .code make-hash is called with an argument given for the optional .meta hash-seed argument. Only .codn equal -based hash tables make use of their seed, and only for keys which are strings and buffers. The purpose of the seed is to scramble the hashing function, to make a hash table resistant to a type of denial-of-service attack, whereby a malicious input causes a hash table to be populated with a large number of keys which all map to the same hash-table chain, causing the performance to severely degrade. The value of .code *hash-seed* must be a nonnegative integer, no wider than 64 bits. On systems with 32-bit addresses, only the least significant 32 bits of this value may be significant. .coNP Function @ gen-hash-seed .synb .mets (gen-hash-seed) .syne .desc The .code gen-hash-seed function returns an integer value suitable for the .code *hash-seed* variable, or as the .code hash-seed argument of the .code make-hash and .code hash-equal functions. The value is derived from the host environment, from information such as the process ID and time of day. .SS* Search Tree Library \*(TL provides binary search trees, which are objects of type .codn tree . Trees have a printed notation denoted by the .code #T prefix. A tree may be constructed by invoking the .code tree function. Binary search trees differ from hashes in that they maintain items in order. They also differ from hashes in that they store only elements, not key-value pairs. Every tree is associated with three .IR "key abstraction functions" : It has a .I "key function" which is applied to the elements to map each one to a key. It also has a .I "less function" and .I "equal function" for comparing keys. 
If these three functions are not specified, they respectively default to .codn identity , .code less and .codn equal , which means that the tree uses its elements as keys directly, and that they are compared using .code less and .codn equal . Note: these default functions work for simple elements such as character strings or numbers, and also structures implementing .IR "equality substitution" . The elements are stored inside a tree using tree nodes, which are objects of type .codn tnode , whose printed notation is introduced by the .code #N prefix. Several tree-related functions take .code tnode objects as arguments or return .code tnode objects. Trees may store duplicate elements. The .code #T literal syntax may freely specify duplicate elements. The .code tree constructor function specifies an initial sequence of elements to be populated into the newly constructed tree. If this initial sequence contains duplicate elements, they are preserved if the optional .meta allow-dupes argument is true, otherwise only the rightmost member of any duplicate group appears in the tree. The insertion functions .code tree-insert and .code tree-insert-node also overwrite duplicates by default, but optionally allow them. Duplicates are ordered by insertion: most recently inserted duplicate is rightmost. However, tree lookup chooses an unspecified duplicate. .coNP Function @ tnode .synb .mets (tnode < key < left << right ) .syne .desc The .code tnode function allocates, initializes and returns a single tree node. A tree node has three fields .metn key , .meta left and .metn right , which are accessed using the functions .codn key , .code left and .codn right . .coNP Function @ tnodep .synb .mets (tnodep << value ) .syne .desc The .code tnodep function returns .code t if .meta value is a tree node. Otherwise, it returns .codn nil . 
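.TP* Example:

The following is a small sketch of building two linked nodes by hand and
inspecting them with
.code tnodep
and the field accessors. The variable names
.code leaf
and
.code root
are arbitrary and chosen only for illustration.

.verb
  (let* ((leaf (tnode 1 nil nil))
         (root (tnode 2 leaf nil)))
    (list (tnodep root)      ;; a node is a tnode
          (tnodep 2)         ;; an integer is not
          (key root)
          (key (left root))
          (right root)))
  --> (t nil 2 1 nil)
.brev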
.coNP Accessors @, key @ left and @ right .synb .mets (key << node ) .mets (left << node ) .mets (right << node ) .mets (set (key << node ) << new-key ) .mets (set (left << node ) << new-left ) .mets (set (right << node ) << new-right ) .syne .desc The .codn key , .code left and .code right functions retrieve the corresponding fields of the .meta node object, which must be of type .codn tnode . Forms based on the .codn key , .code left and .code right symbols are defined as syntactic places. Assigning a value .code v to .code "(key n)" using the .code set operator, as in .codn "(set (key n) v)" , is equivalent to .code "(set-key n v)" except that the value of the expression is .code v rather than .codn n . Similar statements hold true for .code left and .code right in relation to .code set-left and .codn set-right . .coNP Functions @, set-key @ set-left and @ set-right .synb .mets (set-key < node << new-key ) .mets (set-left < node << new-left ) .mets (set-right < node << new-right ) .syne .desc The .codn set-key , .code set-left and .code set-right functions replace the corresponding fields of .meta node with new values. The .meta node argument must be of type .codn tnode . These functions all return .metn node . .coNP Function @ copy-tnode .synb .mets (copy-tnode << node ) .syne .desc The .code copy-tnode function creates a new .code tnode object, whose .codn key , .code left and .code right fields are copied from .metn node . .coNP Function @ tree .synb .mets (tree >> [ elems .mets \ \ \ \ \ \ >> [ keyfun >> [ lessfun >> [ equalfun <> [ allow-dupes ]]]]) .syne .desc The .code tree function constructs and returns a new tree object. All arguments are optional. The .meta elems argument specifies a sequence of the elements to be stored in the tree. If the argument is absent or the sequence is empty, then an empty tree is created. The .meta keyfun argument specifies the function which is applied to every element to produce a key.
If omitted, the tree object shall behave as if the .code identity function were used, taking the elements themselves to be keys. The .meta lessfun argument specifies the function by which two keys are compared for inequality. If omitted, the .code less function is used. A function used as .meta lessfun should take two arguments, produce a Boolean result, and have ordering properties similar to the .code less function. The .meta equalfun argument specifies the function by which two keys are compared for equality. The default value is the .code equal function. A function used as .meta equalfun should take two arguments, produce a Boolean result, and have the properties of an equivalence relation. These three functions are collectively referred to as the tree's .IR "key abstraction functions" . The .meta allow-dupes argument, which defaults to .codn nil , is relevant if an .meta elems sequence is specified containing some elements which appear to be duplicates, according to the tree object's .meta equalfun function. If .meta allow-dupes is true then duplicates are preserved: the tree will have as many nodes as there are elements in the .meta elems sequence. Moreover, the duplicates appear in the same relative order in the tree as they appear in the original .meta elems sequence. If .meta allow-dupes is false, then duplicates are suppressed: if any element appears more than once in .metn elems , then only the last occurrence of that element appears in the tree. Note: the .code tree-insert and .code tree-insert-node functions also have an optional argument indicating whether a duplicate insertion replaces an existing element. Note: although the order of duplicate elements is preserved, when the .code tree-lookup function is used to look up a key which is duplicated, the element which is retrieved is unspecified, and can change when the tree is reorganized due to insertions and deletions.
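.TP* Example:

The following sketch illustrates the effect of
.meta allow-dupes
on a sequence containing a repeated element. It relies on the
.code sub-tree
function, documented later in this section, to list a tree's elements in
order. In the second form, the default key abstraction functions are passed
explicitly only so that the fifth argument can be supplied.

.verb
  ;; duplicates suppressed: a single 1 survives
  (sub-tree (tree '(3 1 2 1)))
  --> (1 2 3)

  ;; duplicates preserved
  (sub-tree (tree '(3 1 2 1)
                  (fun identity) (fun less) (fun equal) t))
  --> (1 1 2 3)
.brev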
.coNP Function @ treep .synb .mets (treep << value ) .syne .desc The .code treep function returns .code t if .meta value is a tree. Otherwise, it returns .codn nil . .coNP Function @ tree-count .synb .mets (tree-count << tree ) .syne .desc The .code tree-count function returns an integer indicating the number of nodes currently inserted into .metn tree , which must be a search tree object. .coNP Function @ tree-insert-node .synb .mets (tree-insert-node < tree < node <> [ allow-dupe ]) .syne .desc The .code tree-insert-node function inserts an existing .meta node object into a search tree. The .meta tree object must be of type .codn tree , and .meta node must be of type .codn tnode . The .code key field of the .meta node object holds the element that is being inserted. The actual search key which is associated with this element is determined by applying .metn tree 's .meta keyfun to the .metn node 's .code key value. The .meta node object must not currently be inserted into any existing tree. The values stored in the .code left and .code right fields of .meta node are overwritten as required by the semantics of the insertion operation. Their original values are ignored. The .meta allow-dupe argument, defaulting to .codn nil , is concerned with what happens if the tree already contains one or more nodes having a key equal to the .metn node 's key. If .meta allow-dupe is false, then .meta node replaces an unspecified one of those existing nodes: that replaced node is deleted from the tree. Key equivalence is determined using tree's equality function (see the .meta equalfun argument of the .code tree function). If .meta allow-dupe is true, then the new node is inserted without replacing any node, and appears together with the existing duplicate or duplicates. Among the duplicates, the newly inserted node is the rightmost node in the tree order. The .code tree-insert-node function returns the .meta node argument. 
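.TP* Example:

The following sketch shows how
.meta allow-dupe
affects the node count when nodes with equal keys are inserted into
an initially empty tree. The variable name
.code tr
is arbitrary.

.verb
  (let ((tr (tree)))
    (tree-insert-node tr (tnode 1 nil nil))    ;; count is 1
    (tree-insert-node tr (tnode 1 nil nil))    ;; replaces: still 1
    (tree-insert-node tr (tnode 1 nil nil) t)  ;; duplicate kept: 2
    (tree-count tr))
  --> 2
.brev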
.coNP Function @ tree-insert .synb .mets (tree-insert < tree < elem <> [ allow-dupe ]) .syne .desc The .code tree-insert function inserts .meta elem into .metn tree . The .meta tree argument must be an object of type .codn tree . The .meta elem value may be of any type which is semantically compatible with .metn tree 's key abstraction functions. The .code tree-insert function allocates a new .code tnode as if by evaluating .mono .meti (tnode < elem nil nil) .onom and inserts that .code tnode as if by using the .code tree-insert-node function. If one or more elements equal to .meta elem already exist in the tree, then the behavior is determined by the .meta allow-dupe argument, which defaults to .codn nil . The semantics of .meta allow-dupe is as given in the description of .codn tree-insert-node . The .code tree-insert function returns the newly inserted .code tnode object. .coNP Function @ tree-lookup-node .synb .mets (tree-lookup-node < tree << key ) .syne .desc The .code tree-lookup-node function searches .meta tree for an element which matches .metn key . The .meta tree argument must be an object of type .codn tree . The .meta key argument may be a value of any type. An element inside .meta tree matches .meta key if the tree's .meta keyfun applied to that element produces a key value which is equal to .meta key under the tree's .meta equalfun function. If such an element is found, then .code tree-lookup-node returns the tree node which contains that element as its .meta key field. If no such element is found, then .code tree-lookup-node returns .codn nil . If multiple nodes exist in the tree which have a matching key, it is unspecified which one of those nodes is retrieved. .coNP Function @ tree-lookup .synb .mets (tree-lookup < tree << key ) .syne .desc The .code tree-lookup function finds an element inside .meta tree which matches the given .metn key . If the element is found, it is returned. Otherwise, .code nil is returned.
Note: the semantics of the .code tree-lookup function can be understood in terms of .codn tree-lookup-node . A possible implementation is this: .verb (defun tree-lookup (tree key) (iflet ((node (tree-lookup-node tree key))) (key node))) .brev If the tree contains multiple elements which match .metn key , it is unspecified which element is retrieved. .coNP Function @ tree-delete-node .synb .mets (tree-delete-node < tree << key ) .syne .desc The .code tree-delete-node function searches .meta tree for an element which matches .metn key . The .meta tree argument must be an object of type .codn tree . The .meta key argument may be a value of any type which is semantically compatible with .metn tree 's key abstraction functions. If the matching element is found, then its node is removed from the tree, and returned. Otherwise, if a matching element is not found, then .code nil is returned. If more than one element exists inside .meta tree which matches .metn key , it is unspecified which node is deleted and returned. .coNP Function @ tree-delete .synb .mets (tree-delete < tree << key ) .syne .desc The .code tree-delete function tries to remove from .meta tree the element which matches .metn key . If successful, it returns that element, otherwise it returns .codn nil . If more than one element exists inside .meta tree which matches .metn key , it is unspecified which one is deleted. Note: the semantics of the .code tree-delete function can be understood in terms of .codn tree-delete-node . A possible implementation is this: .verb (defun tree-delete (tree key) (iflet ((node (tree-delete-node tree key))) (key node))) .brev .coNP Function @ tree-delete-specific-node .synb .mets (tree-delete-specific-node < tree << node ) .syne .desc The .code tree-delete-specific-node function searches .meta tree to find the specific node given by the .meta node argument. If .meta node is inserted into the tree, then it is deleted, and returned.
If .meta node is not found in the tree, then the tree is unchanged, and .code nil is returned. Note: the search for .meta node is informed by .metn node 's key, for efficiency. However, if the tree contains duplicates of that key, then a linear search takes place among the duplicates. .coNP Functions @ tree-min-node and @ tree-min .synb .mets (tree-min-node << tree ) .mets (tree-min << tree ) .syne .desc The .code tree-min-node function returns the node in .meta tree which holds the lowest element. If the tree is empty, it returns .codn nil . The .code tree-min function returns the lowest element, or else .code nil if the tree is empty. .coNP Functions @ tree-del-min-node and @ tree-del-min .synb .mets (tree-del-min-node << tree ) .mets (tree-del-min << tree ) .syne .desc The .code tree-del-min-node function returns the node in .meta tree which has the lowest key, and removes that node from the tree. If the tree is empty, it returns .codn nil . The .code tree-del-min function returns the lowest element and removes it from the tree, or else .code nil if the tree is empty. The following equivalence holds: .verb (tree-del-min tr) <--> (iflet ((node (tree-del-min-node tr))) (key node)) .brev Note: .code tree-insert together with .code tree-del-min provide the basis for using a tree as a priority queue. Elements are inserted into the queue using .code tree-insert and then removed in priority order using .codn tree-del-min . .coNP Function @ tree-root .synb .mets (tree-root << tree ) .syne .desc The .code tree-root function returns the root node of .metn tree , which must be a .code tree object. If .meta tree is empty, then .code nil is returned. .coNP Function @ tree-clear .synb .mets (tree-clear << tree ) .syne .desc The .code tree-clear function deletes all elements from .metn tree , which must be a .code tree object. 
If .meta tree is already empty, then the function returns .codn nil , otherwise it returns an integer which gives the count of the number of deleted nodes. .coNP Function @ copy-search-tree .synb .mets (copy-search-tree << tree ) .syne .desc The .code copy-search-tree function returns a new tree object which is a copy of .metn tree . The .meta tree argument must be an object of type .codn tree . The returned object has the same key abstraction functions as .meta tree and contains the same elements. The nodes held inside the new tree are freshly allocated, but their key objects are shared with the original tree. .coNP Function @ make-similar-tree .synb .mets (make-similar-tree << tree ) .syne .desc The .code make-similar-tree function returns a new, empty search tree object. The .meta tree argument must be an object of type .codn tree . The returned object has the same key abstraction functions as .metn tree . .coNP Function @ tree-begin .synb .mets (tree-begin < tree >> [ low-key <> [ high-key ]]) .syne .desc The .code tree-begin function returns a new object of type .code tree-iter which provides in-order traversal of nodes stored in .metn tree . The .meta tree argument must be an object of type .codn tree . If the .meta low-key argument is specified, then nodes with keys lesser than .meta low-key are omitted from the traversal. If the .meta high-key argument is specified, then nodes with keys equal to or greater than .meta high-key are omitted from the traversal. The nodes are traversed by applying the .code tree-next function to the returned .code tree-iter object. A .code tree-iter object is iterable. .TP* Example: .verb (collect-each ((el (tree-begin #T(() 1 2 3 4 5) 2 5))) (* 10 el)) --> (20 30 40) .brev .coNP Function @ tree-reset .synb .mets (tree-reset < iter < tree >> [ low-key <> [ high-key ]]) .syne .desc The .code tree-reset function is closely analogous to .codn tree-begin .
The .meta iter argument must be an existing .code tree-iter object, previously returned by a call to .codn tree-begin . Regardless of its current state, the .meta iter object is re-initialized to traverse the specified .meta tree with the specified parameters, and is then returned. The .code tree-reset function prepares .meta iter to traverse in the same manner as would a new iterator returned by .code tree-begin for the specified .metn tree , .meta low-key and .meta high-key arguments. .coNP Functions @ tree-next and @ tree-peek .synb .mets (tree-next << iter ) .mets (tree-peek << iter ) .syne .desc The .code tree-next and .code tree-peek functions return the next node in sequence from the tree iterator .metn iter . The iterator must be an object of type .codn tree-iter , returned by the .code tree-begin function. If there are no more nodes to be visited, these functions return .codn nil . If, during the traversal of a tree, nodes are inserted or deleted, the behavior of .code tree-next and .code tree-peek on .code tree-iter objects that were obtained prior to the insertion or deletion is not specified. An attempt to complete the iteration may not successfully visit all keys that should be visited. The .code tree-next function changes the state of the iterator. If .code tree-next is invoked repeatedly on the same iterator, it returns successive nodes of the tree. If .code tree-peek is invoked more than once on the same iterator without any intervening calls to .codn tree-next , it returns the same node; it does not appear to change the state of the iterator and therefore does not advance through successive nodes. .coNP Function @ sub-tree .synb .mets (sub-tree < tree >> [ from-key <> [ to-key ]]) .syne .desc The .code sub-tree function selects elements from .metn tree , which must be a search tree. If .meta from-key is specified, then elements lesser than .meta from-key are omitted from the selection.
If .meta to-key is specified, the elements greater than or equal to .meta to-key are omitted from the selection. A list of the selected elements is returned, in which the elements appear in the same order as they do in .metn tree . .coNP Function @ copy-tree-iter .synb .mets (copy-tree-iter << iter ) .syne .desc The .code copy-tree-iter function creates and returns a duplicate of the .meta iter object, which must be a tree iterator returned by .codn tree-begin . The returned object has the same state as the original; it references the same traversal position in the same tree. However, it is independent of the original. Calls to .code tree-next on the original have no effect on the duplicate and vice versa. .coNP Function @ replace-tree-iter .synb .mets (replace-tree-iter < dest-iter << src-iter ) .syne .desc The .code replace-tree-iter function causes the tree iterator .meta dest-iter to be in the same state as .metn src-iter . Both .meta dest-iter and .meta src-iter must be tree iterator objects returned by .codn tree-begin . The contents of .meta dest-iter are updated such that it now references the same tree as .metn src-iter , at the same position. The .meta dest-iter argument is returned. .coNP Special Variable @ *tree-fun-whitelist* .desc The .code *tree-fun-whitelist* variable holds a list of function names that may be used in the .code #T tree literal syntax as the .metn keyfun , .meta lessfun or .meta equalfun operations of a tree. The initial value of this variable is a list which holds at least the following three symbols: .codn identity , .code less and .codn equal . The application may change the value of this variable, or dynamically bind it, in order to allow .code #T literals to be processed which specify functions other than these three. 
.SS* Partial Evaluation and Combinators .coNP Macros @ op and @ do .synb .mets (op << form +) .mets (do < oper << form *) .syne .desc Like the .code lambda operator, the .code op macro denotes an anonymous function. Unlike .codn lambda , the arguments of the function are implicit, or optionally specified within the expression, rather than as a formal parameter list which precedes a body. The .meta form arguments of .code op are implicitly turned into a DWIM expression, which means that argument evaluation follows Lisp-1 rules. (See the .code dwim operator). The argument forms of .code op are arbitrary expressions, within which special conventions are permitted regarding the use of certain implicit variables: .RS .meIP >> @ num A number preceded by a .code @ is, syntactically, a metanumber. If it appears inside .code op as an expression, it behaves as a positional argument, whose existence it implies. For instance .code @2 means that the function shall have at least two arguments, the second argument of which is substituted in place of the .codn @2 . .code op generates a function which has a number of required arguments equal to the highest value of .meta num appearing in a .mono .meti >> @ num .onom construct in the body. For instance .code "(op car @3)" generates a three-argument function (which passes its third argument to .codn car , returning the result, and ignores its first two arguments). There is no way to use .code op to generate functions which have optional arguments. The positional arguments are mutable; they may be assigned. .coIP @rest If the meta-symbol .code @rest appears in the .code op syntax as an expression, it explicitly denotes and evaluates to the list of trailing arguments. Like the metanumber positional arguments, it may be assigned. .coIP @rec If the meta-symbol .code @rec appears in the .code op syntax as an expression, it denotes a mutable variable which is bound to the function itself which is generated by that .code op expression.
.coIP "@(rec ...)" If this syntax appears inside .codn op , it specifies a recursive call to the function. .RE .IP Functions generated by .code op are always variadic; they always take additional arguments after any required ones, whether or not the .code @rest syntax is used. If the body does not contain any .mono .meti >> @ num .onom or .code @rest syntax, then .code @rest is implicitly inserted. What this means is that, for example, since the form .code "(op foo)" does not contain any implicit positional arguments like .codn @1 , and does not contain .codn @rest , it is actually a shorthand for .codn "(op foo . @rest)" : a function which applies .code foo to all of its arguments. If the body does contain at least one .mono .meti >> @ num .onom or .codn @rest , then .code @rest isn't implicitly inserted. The notation .code "(op foo @1)" denotes a function which takes any number of arguments, and ignores all but the first one, which is passed to .codn foo . The .code do operator is similar to .codn op , with the following three differences: .RS .IP 1. The first argument of .codn do , namely .metn oper , is an operator. This argument is not processed for the presence of implicit variables. Thus for instance .code "(do @1 ...)" is invalid. By contrast, .code "(op @1 ...)" is possible, and makes sense under the right circumstances. The .meta oper argument may be the name of a macro or special operator, whereas .code op doesn't support the invocation of macros or special operators. For instance .code "(do let ((x @1)) (+ x 1))" is possible. .IP 2. The .meta form arguments of .code do are not implicitly treated as DWIM expressions, but as ordinary expressions. .IP 3. When .code do syntax doesn't contain any references to implicit variables (metanumbers or .codn @rest ) then a variadic function is generated which requires one argument. That argument is added to the form. Thus for instance .code "(do set x)" effectively serves as a shorthand for .codn "(do set x @1)" . 
The corresponding defaulting behavior in .code op is that a variadic function is generated which requires no arguments. All of the available arguments are applied. Thus .code "(op f x)" is effectively a shorthand for .codn "(op f x . @rest)" . .RE .IP Because it accepts operators, .code do can be used with imperative constructs which are not functions, like .codn set . For example, .code "(do set x)" produces an anonymous function which, if called with one argument, stores that argument into .metn x . The actions of .code op and .code do can be understood by the following examples, which convey how the syntax is rewritten to lambda. However, note that the real translator uses generated symbols for the arguments, which are not equal to any symbols in the program. .verb (op) -> invalid (op +) -> (lambda rest [+ . rest]) (op + foo) -> (lambda rest [+ foo . rest]) (op @1 . @rest) -> (lambda (arg1 . rest) [arg1 . @rest]) (op @1 @rest) -> (lambda (arg1 . rest) [arg1 @rest]) (op @1 @2) -> (lambda (arg1 arg2 . rest) [arg1 arg2]) (op foo @1 (@2) (bar @3)) -> (lambda (arg1 arg2 arg3 . rest) [foo arg1 (arg2) (bar arg3)]) (op foo @rest @1) -> (lambda (arg1 . rest) [foo rest arg1]) (do + foo) -> (lambda (arg1 . rest) (+ foo arg1)) (do @1 @2) -> (lambda (arg1 arg2 . rest) (@1 arg2)) ;; invalid! (do foo @rest @1) -> (lambda (arg1 . rest) (foo rest arg1)) .brev Note that if argument .mono .meti >> @ n .onom appears in the syntax, it is not necessary for arguments .code @1 through .mono .meti >> @ n-1 .onom to appear. The function will have .code n arguments: .verb (op @3) -> (lambda (arg1 arg2 arg3 . rest) [arg3]) .brev The .code op and .code do operators can be nested, in any combination. This raises the question: if an expression like .codn @1 , .code @rest or .code @rec occurs in an .code op that is nested within an .codn op , what is the meaning? An expression with a single .code @ always belongs with the innermost .code op or .code do operator. 
So for instance .code "(op (op @1))" means that an .code "(op @1)" expression is nested within an outer .code op expression that contains no references to its implicit variables. The .code @1 belongs to the inner .codn op . There is a way for an inner .code op to refer to the implicit variables of an outer one. This is expressed by adding an extra .code @ prefix for every level of escape. For example in .code "(op (op @@1))" the .code @@1 belongs to the outer .codn op : it is the same as .code @1 appearing in the outer .codn op . That is to say, in the expression .codn "(op @1 (op @@1))" , the .code @1 and .code @@1 are the same thing: both are parameter 1 of the lambda function generated by the outer .codn op . By contrast, in the expression .code "(op @1 (op @1))" there are two different parameters: the first .code @1 is the first argument of the outer function, and the second .code @1 is the first argument of the inner function. If there are three levels of nesting, then three .code @ meta-prefixes are needed to insert a parameter from the outermost .code op into the innermost .codn op . Note that the implicit variables belonging to an .code op can be used in the dot position of a function call, such as: .verb [(op list 1 . @1) 2] -> (1 . 2) .brev This is a consequence of the special transformations described in the paragraph .B "Dot Position in Function Calls" in the subsection .B "Additional Syntax" of the .BR "TXR Lisp" section. The .code op syntax works in conjunction with quasiliterals which are nested within it.
The metanumber notation as well as .code @rest are recognized without requiring an additional .code @ escape, which is effectively optional: .verb (apply (op list `@1-@rest`) '(1 2 3)) -> "1-2 3" (apply (op list `@@1-@@rest`) '(1 2 3)) -> "1-2 3" .brev Though they produce the same result, the above two examples differ in that .code @rest embeds a metasymbol into the quasiliteral structure, whereas .code @@rest embeds the Lisp expression .code @rest into the quasiliteral. Either way, in the scope of .codn op , .code @rest undergoes the macro-expansion which renames it to the machine-generated function argument symbol of the implicit function denoted by the .code op macro form. This convenient omission of the .code @ character isn't supported for reaching the arguments of an outer .code op from a quasiliteral within a nested .codn op : .verb ;; To reach @@1, @@@1 must be written. ;; @@1 Lisp expression introduced by @. (op ... (op ... `@@@1`)) .brev Because the .code do macro may be applied to operators, it is possible to apply it to itself, as well as to .codn op , as in the following example: .verb [[[[(do do do op list) 1] 2] 3] 4] -> (1 2 3 4) .brev The chained application associates right-to-left: the rightmost .code do is applied to .codn op ; the second rightmost .code do is applied to the rightmost one and so on. The effect is that partial application has been achieved. The value .code 1 is passed to the resulting function, which returns another function which takes the next argument. Finally, all these chained argument values are passed to .codn list . Each .cod3 do / op level is processed independently. The following examples show how the list may be permuted into several different orders by referring to an implicit argument at various levels of nesting, making it the first argument of .codn list . The unmentioned arguments implicitly follow, in order. 
This works because mentioning the argument explicitly means that its corresponding .code do operator no longer inserts its argument implicitly into the body of the function which it generates: .verb [[[[(do do do op list @1) 1] 2] 3] 4] -> (4 1 2 3) [[[[(do do do op list @@1) 1] 2] 3] 4] -> (3 1 2 4) [[[[(do do do op list @@@1) 1] 2] 3] 4] -> (2 1 3 4) [[[[(do do do op list @@@@1) 1] 2] 3] 4] -> (1 2 3 4) .brev The following example mentions all arguments at every .cod3 do / op nesting level, thereby explicitly establishing the order in which they are passed to .codn list : .verb [[[[(do do do op list @1 @@1 @@@1 @@@@1) 1] 2] 3] 4] -> (4 3 2 1) .brev .TP* Examples: .verb (let ((c 0)) (mapcar (op cons (inc c)) '(a b c))) --> ((1 . a) (2 . b) (3 . c)) (reduce-left (op + (* 10 @1) @2) '(1 2 3)) --> 123 .brev .coNP Macro @ lop .synb .mets (lop << form +) .syne .desc The .code lop macro is a variant of .code op with special semantics. The .meta form arguments support the same notation as those of the .code op operator. If only one .meta form is given then .code lop is equivalent to .codn op . If two or more .meta form arguments are present, then .code lop generates a variadic function which inserts all of its trailing arguments between the first and second .metn form s. That is to say, trailing arguments coming into the anonymous function become the left arguments of the function or function-like object denoted by the first .meta form and the remaining .metn form s give additional arguments. Hence the name .codn lop , which stands for \(dqleft-inserting .codn op \(dq. This left insertion of the trailing arguments takes place regardless of whether .code @rest occurs in any .metn form . The .meta form syntax determines the number of required arguments of the generated function, according to the highest-valued meta-number. The trailing arguments which are inserted into the left position are any arguments in excess of the required arguments.
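As a hedged illustration of the rule about required arguments, the following sketch applies the documented
.code lop
equivalences to a form which contains a meta-number. The quoted symbol
.code x
and the argument values are arbitrary choices made for this illustration only:
.verb
  ;; @2 makes the first two arguments required; the argument 1
  ;; is not mentioned and so is ignored. The excess arguments
  ;; 3 and 4 are inserted to the left of the remaining forms.
  [(lop list 'x @2) 1 2 3 4] -> (3 4 x 2)
.brev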
The .code lop macro's expansion can be understood via the following equivalences, except that in the real implementation, the symbols .code rest and .code arg1 through .code arg3 are replaced with hygienic, unique symbols. .verb (lop f) <--> (op f) <--> (lambda (. rest) [f . rest]) (lop f x y) <--> (lambda (. rest) [apply f (append rest [list x y])]) (lop f x @3 y) <--> (lambda (arg1 arg2 arg3 . rest) [apply f (append rest [list x arg3 y])]) .brev .TP* Examples: .verb (mapcar (lop list 3) '(a b c)) --> ((a 3) (b 3) (c 3)) (mapcar (lop list @1) '(a b c)) --> ((a) (b) (c)) (mapcar (lop list @1) '(a b c) '(d e f)) --> ((d a) (e b) (f c)) .brev .coNP Macro @ ldo .synb .mets (ldo < oper << form *) .syne .desc The .code ldo macro provides a shorthand notation for uses of the .code do macro which inserts the first argument of the anonymous function as the leftmost argument of the specified operator. The .code ldo syntax can be understood in terms of these equivalences: .verb (ldo f) <--> (do f @1) (ldo f x) <--> (do f @1 x) (ldo f x y) <--> (do f @1 x y) (ldo f x @2 y) <--> (do f @1 x @2 y) .brev The implicit argument .code @1 is always inserted as the leftmost argument of the operator specified by the first form. .TP* Example: .verb ;; push elements of l1 onto l2. (let ((l1 '(a b c)) l2) (mapdo (ldo push l2) l1) l2) --> (c b a) .brev .coNP Macros @, ap @, ip @ ado and @ ido .synb .mets (ap << form +) .mets (ip << form +) .mets (ado << form +) .mets (ido << form +) .syne .desc The .code ap macro is based on the .code op macro and has identical argument conventions. The .code ap macro analyzes its arguments and produces a function .metn f , in exactly the same way as the .code op macro. However, instead of returning .meta f directly, it returns a different function .metn g , which is a one-argument function that accepts a list. The list specifies arguments to which .meta g applies .metn f , and then returns the resulting value.
In other words, the following equivalence holds: .verb (ap form ...) <--> (apf (op form ...)) .brev The .code ap macro nests properly with .code op and .codn do , in any combination, in regard to the .meta ...@@n notation. The .code ip macro is similar to the .code ap macro, except that it is based on the semantics of the function .code iapply rather than .codn apply , according to the following equivalence: .verb (ip form ...) <--> (ipf (op form ...)) .brev The .code ado and .code ido macros are related to the .code do macro in the same way that .code ap and .code ip are related to .codn op . They produce a one-argument function which works as if by applying the function generated by .code do to its own arguments, according to the following equivalences: .verb (ado form ...) <--> (apf (do form ...)) (ido form ...) <--> (ipf (do form ...)) .brev See also: the .code apf and .code ipf functions. .TP* Example: .verb ;; Take a list of pairs and produce a list in which those pairs ;; are reversed. (mapcar (ap list @2 @1) '((1 2) (a b))) -> ((2 1) (b a)) .brev .coNP Macros @, opip @, oand @ lopip and @ loand .synb .mets (opip << clause *) .mets (oand << clause *) .mets (lopip << clause *) .mets (loand << clause *) .syne .desc The .code opip and .code oand operators make it possible to chain together functions which are expressed using the .code op syntax. (See the .code op operator for more information). Both macros perform the same transformation except that .code opip translates its arguments to a call to the .code chain function, whereas .code oand translates its arguments in the same way to a call to the .code chand function. More precisely, these macros perform the following rewrites: .verb (opip arg1 arg2 ... argn) -> [chain {arg1} {arg2} ... {argn}] (oand arg1 arg2 ... argn) -> [chand {arg1} {arg2} ...
{argn}] .brev where the above .code {arg} notation denotes the following transformation applied to each argument: .verb ;; these specific form patterns are left untransformed: (dwim ...) -> (dwim ...) [...] -> [...] (qref ...) -> (qref ...) (uref ...) -> (uref ...) (op ...) -> (op ...) (do ...) -> (do ...) (lop ...) -> (lop ...) (ldo ...) -> (ldo ...) (ap ...) -> (ap ...) (ip ...) -> (ip ...) (ado ...) -> (ado ...) (ido ...) -> (ido ...) (ret ...) -> (ret ...) (aret ...) -> (aret ...) .slot -> .slot .(method ...) -> .(method ...) atom -> atom ;; forms headed by let are treated specially (let sym) -> ;; described below (let (s0 i0) (s1 i1) ....) -> ;; described below (let ((s0 i0) (s1 i1)) -> ;; described below body) ;; other compound forms are transformed like this: (function ...) -> (op function ...) (operator ...) -> (do operator ...) (macro ...) -> (do macro ...) .brev In other words, compound forms whose leftmost symbol is a macro or operator are translated to the .code do notation. Compound forms denoting function calls are translated to the .code op notation. Compound forms which are .code dwim invocations, either explicit or via the DWIM brackets notation, are used without transformation. Used without transformation also are forms denoting struct slot access, either explicitly using .code uref or .code qref or the respective dot notations, forms which invoke any of the .code do family of operators, as well as any atom forms. The .code lopip and .code loand operators are similar to, respectively, .code opip and .codn oand , except that they insert the implicit argument as the leftmost argument. For these macros, the above specification of what transformations are applied to the arguments is modified as follows: .verb ;; other compound forms are transformed like this: (function ...) -> (lop function ...) (operator ...) -> (ldo operator ...) (macro ...) -> (ldo macro ...) 
.brev When a .code let or .code let* expression occurs in .code opip syntax, it denotes a special syntax which is treated as follows. .RS .IP 1. The simple form .mono .meti (let << sym ) .onom where .meta sym is a symbol is transformed into the .mono .meti (let >> ( sym @1)) .onom syntax, which is then handled via the following case (2). .IP 2. The form .mono .meti (let >> {( sym << init )}+) .onom specifies an implicit function which binds the specified variables. The variables are bound sequentially as if by .codn let* , even though the operator is .codn let . Note also that the bindings are not enclosed in a list. An example of the syntax is .code "(let (x @1) (y (+ x @2)))" which specifies a function of two arguments, inside of which .code x will be bound to the first argument, and .code y will be bound to the value of .code x plus the second argument. The remaining elements of the .code opip are incorporated into the body of this function. The value of the first argument, .codn @1 , is injected into the remaining .code opip chain, and that chain is processed in a scope in which the variables bound by .code let are visible. .IP 3. All other .code let forms not matching the above syntax are treated like all special operators. They become a .code do element of the .code opip pipeline. For instance .code "(let ((x @1)) (+ x 1))" denotes a one-argument function which binds .code x to its first argument, then produces the value .code "(+ x 1)" which is passed to the next stage of the .code opip chain. The remaining chain is not evaluated in the scope of the .code x variable. .RE .IP Note: the .code opip and .code oand macros use their macro environment in determining whether a form is a macro call, thereby respecting lexical scoping. Note: an .code opip form with no arguments specifies a function which returns .codn nil , which follows from a documented property of the .code chain function.
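The following is a minimal sketch of
.codn oand ,
relying only on the documented rewrite to
.code chand
and on the standard
.code find
function; the list and the argument values are arbitrary choices made for illustration:
.verb
  ;; (find @1 '(1 2 3)) yields 2, which is then passed to (+ 10)
  [(oand (find @1 '(1 2 3)) (+ 10)) 2] -> 12

  ;; find yields nil, so the chain stops: (+ 10) is not called
  [(oand (find @1 '(1 2 3)) (+ 10)) 7] -> nil
.brev
The second call illustrates why
.code oand
may be preferable to
.code opip
when an early stage can fail: the later stages never see the
.code nil
value.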
.TP* Examples: Take each element from the list .code "(1 2 3 4)" and multiply it by three, then add 1. If the result is odd, collect that into the resulting list: .mono (mappend (opip (* 3) (+ 1) [iff oddp list]) (range 1 4)) .onom The above is equivalent to: .mono (mappend (chain (op * 3) (op + 1) [iff oddp list]) (range 1 4)) .onom The .code "(* 3)" and .code "(+ 1)" terms are rewritten to .code "(op * 3)" and .codn "(op + 1)" , respectively, whereas .code "[iff oddp list]" is passed through untransformed. The following demonstrates the single variable .codn let : .mono (let ((pipe (opip (+ 1) (let x) (+ 2) (let y) (+ 3) (list x y)))) [pipe 1]) -> (2 4 7) .onom The .code x variable intercepts the value coming from .code "(+ 1)" and binds .code x to that value. When the .code opip function is invoked with the argument .codn 1 , that value is .codn 2 . That value also continues to the .code "(+ 2)" element which yields .codn 4 , which is similarly captured by variable .codn y . The final .code list expression lists the values of .code x and .codn y , as well as, implicitly, the value .code @1 coming from the previous element. .coNP Macros @ opf and @ lopf .synb .mets (opf < function << clause *) .mets (lopf < function << clause *) .syne .desc The .code opf and .code lopf macros make available the .codn opip -style functional arguments in conjunction with an arbitrary .metn function . The .meta clause arguments of .code opf and .code lopf are processed exactly like those of .code opip and .codn lopip . The syntax .verb (opf f c1 c2 c3 ...) .brev is converted into a function call of the form: .verb [f {c1} {c2} {c3} ...] .brev where every argument .code {cN} is converted to a form denoting a function, in exactly the same manner as the arguments of .codn opip . The same remarks apply to .code lopf in relation to .codn lopip .
Thus, it is possible to express .code opip using .code opf by choosing .code chain as the .meta function argument, according to this equivalence: .verb (opip c1 c2 c3 ...) <--> (opf chain c1 c2 c3 ...) .brev .TP* Example: .verb ;; Remove values greater than 10 or less than five from list (remove-if (lopf orf (> 10) (< 5)) (range 0 20)) -> (5 6 7 8 9 10) ;; Note: could be expressed as (remove-if (orf (lop > 10) (lop < 5)) (range 0 20)) .brev As the example shows, the .code opf and .code lopf macros provide a way to avoid repeating the .code op and .code lop syntax in every argument of a functional combinator. .coNP Macros @ flow and @ lflow .synb .mets (flow < form << opip-arg *) .mets (lflow < form << lopip-arg *) .syne .desc The .code flow macro passes the value of .meta form through the processing stages described by the .meta opip-arg arguments, yielding the resulting value. The .meta opip-arg arguments follow the semantics of the .code opip macro. The same requirements apply to .codn lflow , except that it is related to the .code lopip macro which inserts the implicit argument into the leftmost position. The following equivalences hold: .verb (flow x ...) <--> [(opip ...) x] (lflow x ...) <--> [(lopip ...) x] .brev That is to say, .code flow is equivalent to the application of an .codn opip -generated function to the value of .metn form , and likewise .code lflow is equivalent to the application of a .codn lopip -generated function. Note: if there are no .meta opip-arg or .meta lopip-arg arguments, then .code flow evaluates the .code x argument and returns .codn nil , which follows from the behavior of the .code opip and .code lopip macros, when those are invoked with no arguments. .TP* Examples: .verb (flow 1 (+ 2) (* 3) (cons 0)) -> (0 .
9) (flow "abc" (upcase-str) (regsub #/B/ "ZTE")) -> "AZTEC" (flow 1 (- 10)) -> 9 (lflow 10 (- 1)) -> 9 .brev .coNP Macro @ ret .synb .mets (ret << form ) .syne .desc The .code ret macro's .meta form argument is treated similarly to the second and subsequent arguments of the .code op operator. The .code ret macro produces a function which takes any number of arguments, and returns the value specified by .metn form . .meta form can contain .code op meta syntax like .code @n and .codn @rest . The following equivalence holds: .verb (ret x) <--> (op identity* x) .brev Thus the expression .code "(ret @2)" returns a function similar to .codn "(lambda (x y . z) y)" , and the expression .code "(ret 42)" returns a function similar to .codn "(lambda (. rest) 42)" . .coNP Macro @ aret .synb .mets (aret << form ) .syne .desc The .code aret macro's .meta form argument is treated similarly to the second and subsequent arguments of the .code op operator. The .code aret macro produces a function which takes any number of arguments, and returns the value specified by .metn form . .meta form can contain .code ap meta syntax like .mono .meti >> @ n .onom and .codn @rest . The following equivalence holds: .verb (aret x) <--> (ap identity* x) .brev Thus the expression .code "(aret @2)" returns a function similar to .codn "(lambda (. rest) (second rest))" , and the expression .code "(aret 42)" returns a function similar to .codn "(lambda (. rest) 42)" . .coNP Macro @ tap .synb .mets (tap << arg +) .syne .desc The .code tap macro is intended for use in conjunction with .codn opip , .code flow and other macros in that family. It is a short-hand for writing a pipeline element which performs a side-effecting operation, but unconditionally returns the original input value. The exact expansion of .code tap is unspecified, but the following equivalence indicates a possible expansion strategy: .verb (tap ...)
<--> (prog1 @1 (...)) .brev Assuming that expansion strategy, the expression .code "(tap put-line `foo: @1`)" would expand to .codn "(prog1 @1 (put-line `foo: @1`))" . Note: .codn tap , in addition to being useful for inserting necessary side effects into pipelines, is also useful for inserting temporary debug print forms. For that purpose, inserting the .code prinl function is often enough: .verb (flow 10 (+ 2) prinl (* 4)) .brev Here, the pipeline will calculate .code "(* 4 (+ 2 10))" with the side effect of the value of .code "(+ 2 10)" being printed. With .codn tap , the output can be customized, allowing multiple output points to be distinguished. .verb (flow 10 (tap put-line `input: @1`) (+ 2) (tap put-line `+ 2: @1`) (* 4) (tap put-line `* 4: @1`)) -> 48 .brev Output produced: .verb input: 10 + 2: 12 * 4: 48 .brev .coNP Function @ dup .synb .mets (dup << func ) .syne .desc The .code dup function returns a one-argument function which calls the two-argument function .meta func by duplicating its argument. .TP* Example: .verb ;; square the elements of a list (mapcar [dup *] '(1 2 3)) -> (1 4 9) .brev .coNP Function @ flipargs .synb .mets (flipargs << func ) .syne .desc The .code flipargs function returns a two-argument function which calls the two-argument function .meta func with reversed arguments. .coNP Functions @ chain and @ chand .synb .mets (chain << func *) .mets (chand << func *) .syne .desc The .code chain function accepts zero or more functions as arguments, and returns a single function, called the chained function, which represents the chained application of those functions, in left-to-right order. If .code chain is given no arguments, then it returns a variadic function which ignores all of its arguments and returns .codn nil . Otherwise, the first function may accept any number of arguments. The second and subsequent functions, if any, must accept one argument.
The chained function can be called with an argument list which is acceptable to the first function. Those arguments are in fact passed to the first function. The return value of that call is then passed to the second function, and the return value of that call is passed to the third function and so on. The final return value is returned to the caller. The .code chand function is similar, except that it combines the functionality of .code andf into chaining. The difference between .code chain and .code chand is that .code chand immediately terminates and returns .code nil whenever any of the functions returns .codn nil , without calling the remaining functions. .TP* Example: .verb (call [chain + (op * 2)] 3 4) -> 14 .brev In this example, a two-element chain is formed from the .code + function and the function produced by .code "(op * 2)" which is a one-argument function that returns the value of its argument multiplied by two. (See the definition of the .code op operator). The chained function is invoked using the .code call function, with the arguments .code 3 and .codn 4 . The chained evaluation begins by passing .code 3 and .code 4 to .codn + , which yields .codn 7 . This .code 7 is then passed to the .code "(op * 2)" doubling function, resulting in .codn 14 . A way to write the above example without the use of the DWIM brackets and the .code op operator is this: .verb (call (chain (fun +) (lambda (x) (* 2 x))) 3 4) .brev .coNP Function @ juxt .synb .mets (juxt << func *) .syne .desc The .code juxt function accepts a variable number of arguments which are functions. It combines these into a single function which, when invoked, passes its arguments to each of these functions, and collects the results into a list. Note: the .code juxt function can be understood in terms of the following reference implementation: .verb (defun juxt (. funcs) (lambda (.
args) (mapcar (lambda (fun) (apply fun args)) funcs))) .brev The .code callf function generalizes .code juxt by allowing the combining function to be specified. .TP* Example: .verb ;; separate list (1 2 3 4 5 6) into lists of evens and odds, ;; which end up juxtaposed in the output list: [(op [juxt keep-if remove-if] evenp) '(1 2 3 4 5 6)] -> ((2 4 6) (1 3 5)) ;; call several functions on 1, collecting their results: [[juxt (op + 1) (op - 1) evenp sin cos] 1] -> (2 0 nil 0.841470984807897 0.54030230586814) .brev .coNP Functions @ andf and @ orf .synb .mets (andf << func *) .mets (orf << func *) .syne .desc The .code andf and .code orf functions are the functional equivalents of the .code and and .code or operators. These functions accept multiple functions and return a new function which represents the logical combination of those functions. The input functions should have the same arity. Failing that, there should exist some common argument arity with which each of these can be invoked. The resulting combined function is then callable with that many arguments. The .code andf function returns a function which combines the input functions with a short-circuiting logical conjunction. The resulting function passes its arguments to the input functions successively, in left-to-right order. As soon as any of the functions returns .codn nil , then .code nil is returned and the remaining functions are not called. If none of the functions return .codn nil , then the value returned by the last function is returned. If the list of functions is empty, then .code t is returned. That is, .code (andf) returns a function which accepts any arguments and returns .codn t . The .code orf function returns a function which combines the input functions with a short-circuiting logical disjunction. The resulting function passes its arguments to the input functions successively, in left-to-right order.
As soon as any of the functions returns a .cod2 non- nil value, that value is returned and the remaining functions are not called. If all of the functions return .codn nil , then .code nil is returned. If the list of functions is empty, then .code nil is returned. That is, .code (orf) returns a function which accepts any arguments and returns .codn nil . .coNP Function @ notf .synb .mets (notf << function ) .syne .desc The .code notf function returns a function which is the Boolean negation of .metn function . The returned function takes a variable number of arguments. When invoked, it passes all of these arguments to .meta function and then inverts the result as if by application of .codn not . .coNP Functions @ nandf and @ norf .synb .mets (nandf << func *) .mets (norf << func *) .syne .desc The .code nandf and .code norf functions are the logical negation of the .code andf and .code orf functions. They are related according to the following equivalences: .verb [nandf f0 f1 f2 ...] <--> (notf [andf f0 f1 f2 ...]) [norf f0 f1 f2 ...] <--> (notf [orf f0 f1 f2 ...]) .brev .coNP Functions @ iff and @ iffi .synb .mets (iff < condfun >> [ thenfun <> [ elsefun ]]) .mets (iffi < condfun < thenfun <> [ elsefun ]) .syne .desc The .code iff function is the functional equivalent of the .code if operator. It accepts functional arguments and returns a function. The resulting function passes its arguments to .metn condfun . If .meta condfun yields true, then the arguments are passed to .meta thenfun and the resulting value is returned. Otherwise the arguments are passed to .meta elsefun and the resulting value is returned. If .meta thenfun is omitted then .code identity is used as default. This omission is not permitted by .codn iffi , only .codn iff . If .meta elsefun needs to be called, but is omitted, then .code nil is returned. The .code iffi function differs from .code iff only in the defaulting behavior with respect to the .meta elsefun argument. 
If .meta elsefun is omitted in a call to .code iffi then the default function is .codn identity . This is useful in situations when one value is to be replaced with another one when the condition is true, otherwise preserved. The following equivalences hold between .code iffi and .codn iff : .verb (iffi a b c) <--> (iff a b c) (iffi a b) <--> (iff a b identity) [iffi a b nilf] <--> [iff a b] [iffi a identity nilf] <--> [iff a] .brev The following equivalences illustrate .code iff with both optional arguments omitted: .verb [iff a] <---> [iff a identity nilf] <---> a .brev .coNP Functions @, tf @ nilf and @ ignore .synb .mets (tf << arg *) .mets (nilf << arg *) .mets (ignore << arg *) .syne .desc The .code tf and .code nilf functions take zero or more arguments, and ignore them. The .code tf function returns .codn t , and the .code nilf function returns .codn nil . The .code ignore function is a synonym of .codn nilf . Note: the following equivalences hold between these functions and the .code ret operator, and .code retf function. .verb (fun tf) <--> (ret t) <--> (retf t) (fun nilf) <--> (ret nil) <--> (ret) <--> (retf nil) .brev In Lisp-1-style code, .code tf and .code nilf behave like constants which can replace uses of .code "(ret t)" and .codn "(ret nil)" : .verb [mapcar (ret nil) list] <--> [mapcar nilf list] .brev Note: the .code ignore function can be used for suppressing unused variable warnings. .TP* Example: .verb ;; tf and nilf are useful when functions are chained together. ;; test whether (trunc n 2) is odd. (defun trunc-n-2-odd (n) [[chain (op trunc @1 2) [iff oddp tf nilf]] n]) .brev In this example, two functions are chained together, and .code n is passed through the chain such that it is first divided by two via the function denoted by .code "(op trunc @1 2)" and then the result is passed into the function denoted by .codn "[iff oddp tf nilf]" . 
The .code iff function passes its argument into .codn oddp , and if .code oddp yields true, it passes the same argument to .codn tf . Here .code tf proves its utility by ignoring that value and returning .codn t . If the argument (the divided value) passed into .code iff is even, then .code iff passes it into the .code nilf function, which ignores the value and returns .codn nil . The following example shows how .code ignore may be used to suppress compiler warnings about unused parameters or other variables: .verb (defun f (x y) (ignore x y)) .brev .coNP Function @ retf .synb .mets (retf << value ) .syne .desc The .code retf function returns a function. That function can take zero or more arguments. When called, it ignores its arguments and returns .metn value . See also: the .code ret macro. .TP* Example: .verb ;; the function returned by (retf 42) ;; ignores 1 2 3 and returns 42. (call (retf 42) 1 2 3) -> 42 .brev .coNP Functions @ apf and @ ipf .synb .mets (apf < function << arg *) .mets (ipf < function << arg *) .syne .desc The .code apf function returns a one-argument function whose argument conventions are similar to those of the .code apply function: it accepts one or more arguments, the last of which should be a list. When that function is called, it applies .meta function to these arguments as if by .codn apply . It then returns whatever .meta function returns. If one or more additional .metn arg s are passed to .codn apf , then these are stored in the function which is returned. When that function is invoked, it prepends all of the stored arguments to the passed arguments, and applies the .meta function to the resulting combined argument list. Thus the .metn arg s become the leftmost arguments of .metn function . The .code ipf function is similar to .codn apf , except that the argument conventions and application semantics of the function returned by .code ipf are based on .code iapply rather than .codn apply . See also: the .code ap macro.
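The stored-argument behavior described above can be sketched as follows. This is only an illustration with arbitrarily chosen values: the extra arguments given to
.code apf
end up to the left of the elements of the list that is passed later.
.verb
  ;; 10 is stored by apf; the list (1 2 3) supplies the
  ;; remaining arguments, as if (+ 10 1 2 3) were called.
  (call [apf + 10] '(1 2 3)) -> 16

  ;; the stored symbol p becomes the leftmost argument of list
  (mapcar [apf list 'p] '((1 2) (3 4))) -> ((p 1 2) (p 3 4))
.brev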
.TP* Example: .verb ;; Function returned by [apf +] accepts the ;; (1 2 3) list and applies + to it, as ;; if (+ 1 2 3) were called. (call [apf +] '(1 2 3)) -> 6 .brev .coNP Function @ callf .synb .mets (callf < main-function << arg-function *) .syne .desc The .code callf function returns a function which applies each .meta arg-function to its arguments, juxtaposing the return values of these calls to form arguments to which .meta main-function is then applied. The return value of .meta main-function is returned. The following equivalence holds, except for the order of evaluation of arguments: .verb (callf fm f0 f1 f2 ...) <--> (chain (juxt f0 f1 f2 ...) (apf fm)) .brev .TP* Example: .verb ;; Keep those pairs which are two of a kind (keep-if [callf eql first second] '((1 1) (2 3) (4 4) (5 6))) -> ((1 1) (4 4)) .brev The following equivalence holds between .code juxt and .codn callf : .verb [juxt f0 f1 f2 ...] <--> [callf list f0 f1 f2 ...] .brev Thus, .code juxt may be regarded as a specialization of .code callf in which the main function is implicitly .codn list . .coNP Function @ mapf .synb .mets (mapf < main-function << arg-function *) .syne .desc The .code mapf function returns a function which distributes its arguments into the .metn arg-function s. That is to say, each successive argument of the returned function is associated with a successive .metn arg-function . Each .meta arg-function is called, passed the corresponding argument. The return values of these functions are then passed as arguments to .meta main-function and the resulting value is returned. If the returned function is called with fewer arguments than there are .metn arg-function s, then only that many functions are used. Conversely, if the function is called with more arguments than there are .metn arg-function s, then the excess arguments are ignored. The following equivalence holds: .verb (mapf fm f0 f1 ...) <--> (lambda (. rest) [apply fm [mapcar call (list f0 f1 ...)
rest]]) .brev .TP* Example: .verb ;; Add the squares of 2 and 3 [[mapf + [dup *] [dup *]] 2 3] -> 13 .brev .SS* Input and Output (Streams) \*(TL supports input and output streams of various kinds, with generic operations that work across the stream types. I/O errors are usually turned into exceptions. When the description of error reporting is omitted from the description of a function, it can be assumed that the function reports errors by throwing an exception. .coNP Special Variables @, *stdout* @, *stddebug* @, *stdin* @ *stderr* and @ *stdnull* .desc These variables hold predefined stream objects. The .codn *stdin* , .code *stdout* and .code *stderr* streams closely correspond to the underlying operating system streams. Various I/O functions require stream objects as arguments. The .code *stddebug* stream goes to the same destination as .codn *stdout* , but is a separate object which can be redirected independently, allowing debugging output to be separated from normal output. The .code *stdnull* stream is a special kind of stream called a null stream. To read operations, the stream appears empty, like a stream open on an empty file. To write operations, it appears as a data sink of infinite capacity which consumes data and discards it. This stream is similar to the .code /dev/null device on Unix, and in fact has a relationship to it. If an attempt is made to obtain the underlying file descriptor of .code *stdnull* using the .code fileno function, then the .code /dev/null device is opened, if the host platform supports it. The resulting file descriptor number is returned, and also retained in the .code *stdnull* device. When .code close-stream is invoked on .codn *stdnull* , that descriptor is closed. This feature of .code *stdnull* allows it to be useful for establishing redirections around the execution of external utilities.
.TP* Example: .verb ;; redirect output of ls *.txt command to /dev/null (let ((*stdout* *stdnull*)) (sh "ls *.txt")) .brev .coNP Special Variables @ *print-flo-format* and @ *pprint-flo-format* .desc The .code *print-flo-format* variable determines the conversion format which is applied when a floating-point value is converted to decimal text by the functions .codn print , .codn prinl , and .codn tostring . The default value is .codn "~s" . The related variable .code *pprint-flo-format* similarly determines the conversion format applied to floating-point values by the functions .codn pprint , .codn pprinl , and .codn tostringp . The default value is .codn "~a" . The format string in either variable must specify the consumption of exactly one .code format argument. The conversion string may use embedded width and precision values: for instance, .code "~3,4f" is a valid value for .code *print-flo-format* or .codn *pprint-flo-format* . .coNP Special Variable @ *print-flo-precision* .desc The .code *print-flo-precision* special variable specifies the default floating-point printing precision which is used when the .code ~a or .code ~s conversion specifier of the .code format function is used for printing a floating-point value, and no precision is specified. Note that since the default value of the variable .code *print-flo-format* is the string .codn "~s" , the .code *print-flo-precision* variable, by default, also determines the precision which applies when floating-point values are converted to decimal text by the functions .codn print , .codn pprint , .codn prinl , .codn pprinl , .code tostring and .codn tostringp . The default value of .code *print-flo-precision* is that of the .code flo-dig variable. Note: to print floating-point values in such a way that their values can be precisely recovered from the printed representation, it is recommended to override .code *print-flo-precision* to the value of the .code flo-max-dig variable.
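
The recommendation above can be sketched as follows. This is a
minimal illustration rather than a prescribed idiom; it assumes
that a dynamic binding established by
.code let
is the desired scope for the override, so that the global default
is left untouched.

.verb
  ;; Within the let, floating-point values are converted
  ;; using flo-max-dig significant figures instead of the
  ;; default flo-dig.
  (let ((*print-flo-precision* flo-max-dig))
    (tostring 0.1))
.brev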
.coNP Special Variable @ *print-flo-digits* .desc The .code *print-flo-digits* special variable specifies the default floating-point printing precision which is used when the .code ~f or .code ~e conversion specifier of the .code format function is used for printing a floating-point value, and no precision is specified. Its default value is .codn 3 . .coNP Special Variable @ *print-base* .desc The .code *print-base* variable controls the base (radix) used for printing integer values. It applies when the functions .codn print , .codn pprint , .codn prinl , .codn pprinl , .code tostring and .code tostringp process an integer value. It also applies when the .code ~a and .code ~s conversion specifiers of the .code format function are used for printing an integer value. The default value of the variable is .codn 10 . Meaningful values are: .codn 2 , .codn 8 , .code 10 and .codn 16 . When base 16 is selected, hexadecimal digits are printed as uppercase characters. .coNP Special Variable @ *print-circle* .desc The .code *print-circle* variable is a Boolean which controls whether the circle notation is in effect for printing aggregate objects: conses, ranges, vectors, hash tables and structs. The initial value of this variable is .codn nil : circle notation printing is disabled. The circle notation works for structs also, including structs which have user-defined .code print methods. When a .code print method calls functions which print objects, such as .codn print , .code pprinl or .code format on the same stream, the detection of circularity and substructure sharing continues in these recursive invocations. However, there are limitations in the degree of support for circle notation printing across .code print methods.
Namely, a .code print method of a struct .meta S must not procure and submit for printing objects which are not part of the ordinary structure that is reachable from the (static or instance) slots of .metn S , if those objects have already been printed prior to invoking the .code print method, and have been printed without a .code #= circle notation label. The "ordinary structure that is reachable from the slots" denotes structure that is directly reachable by traversing conses, ranges, vectors, hashes and struct slots: all printable aggregate objects. .coNP Special Variable @ *read-unknown-structs* .desc The .code *read-unknown-structs* variable controls the behavior of the parser upon encountering structure literal .code #S syntax which specifies an unknown structure type. If this variable's value is .code nil then such a literal is erroneous; an exception is thrown. Otherwise, such a structure is converted not into a structure object, which is impossible, but into a list object whose first element is the symbol .codn sys:struct-lit . The remaining elements are taken from the .code #S syntax. .coNP Function @ format .synb .mets (format < stream-designator < format-string << format-arg *) .syne .desc The .code format function performs output to a stream given by .metn stream-designator , by interpreting the actions implicit in a .metn format-string , incorporating material pulled from additional arguments given by .mono .meti << format-arg *. .onom Though the function is simple to invoke, there is complexity in format string language, which is documented below. The .meta stream-designator argument can be a stream object, or one of the values .code t or .codn nil . The value .code t serves as a shorthand for .codn *stdout* . The value .code nil means that the function will send output into a newly instantiated string output stream, and then return the resulting string. .TP* "Format string syntax:" Within .metn format-string , most characters represent themselves. 
Those characters are simply output. The character .code ~ (tilde) introduces formatting directives, which are denoted by a single character, usually a letter. The special sequence .code ~~ (tilde-tilde) encodes a single tilde. Nothing is permitted between the two tildes. The syntax of a directive is generally as follows: .mono .mets <> ~[ width ] <> [, precision ] < letter .onom In other words, the .code ~ (tilde) character, followed by a .meta width specifier, a .meta precision specifier introduced by a comma, and a .metn letter , such that .meta width and .meta precision are independently optional: either or both may be omitted. No whitespace is allowed between these elements. The .meta letter is a single alphabetic character which determines the general action of the directive. The optional width and precision are specified as follows: .RS .meIP < width The width specifier consists of an optional .code < (left angle bracket) character or .code ^ (caret) character followed by an optional width specification. If the leading .code < character is present, then the printing will be left-adjusted within this field. If the .code ^ character is present, the printing will be centered within the field. Otherwise it will be right-adjusted by default. The width can be specified as a decimal integer with an optional leading minus sign, or as the character .codn * . The .code * notation means that instead of digits, the value of the next argument is consumed, and expected to be an integer which specifies the width. If the width, specified either way, is negative, then the field will be left-adjusted. If the value is positive, but either the .code < or .code ^ prefix character is present in the width specifier, then the field is adjusted according to that character. The padding calculations for alignment and centering take into account character display width, as defined by the .code display-width function. 
For instance, a character string containing four Chinese characters (kanji) has a display width of 8, not 4. The width specification does not restrict the printed portion of a datum. Rather, for some kinds of conversions, it is the precision specification that performs such truncation. A datum's display width (or that of its printed portion, after such truncation is applied) can equal or exceed the specified field width. In this situation it overflows the field: the printed portion is rendered in its entirety without any padding applied on either side for alignment or centering. .meIP < precision The precision specifier is introduced by a leading comma. If this comma appears immediately after the directive's .code ~ character, then it means that .meta width is being omitted; there is only a precision field. The precision specifier may begin with these optional characters, whose effects are as follows: .RS .coIP 0 the "leading zero option": pad with leading zeros; .coIP + print a sign for positive values; .coIP - print a single leading zero in place of a positive sign; and .IP space print a space in place of a positive sign. .RE The precision value influences the printing of values of all types. The precision options apply only when the value being printed is a number; otherwise they are ignored. If the .codn + , .code - or space are multiply specified, the rightmost one takes precedence. The precision specifier itself follows: it must be either a decimal integer or the .code * character indicating that the precision value comes from an integer argument. The leading zero option is only active if accompanied by a precision value, either coming from additional digits in the formatting directive, or from an argument indicated by .codn * . If no precision specifier is present, then the leading zero option is interpreted as a specifier indicating a precision value of zero, rather than requesting leading zeros.
To request zero padding together with zero precision, either two or more zero digits are required, or else the leading zero indicator must be given together with the .code * specifier. For non-numeric values, the precision specifies the maximum number of print positions to occupy, taking into account the display width of each character of the printed representation of the object, as according to the .code display-width function. The object's printed representation is truncated, if necessary, to the maximum number of characters which will not exceed the specified number of print positions. A numeric argument is formatted into the field in two distinct steps, both of which involve the precision value in a different role. The details of the first of these steps, and the role played by precision, depend on which conversion directive is used, as well as whether the argument is integer or floating-point. That first step prepares the printed representation of a number which is then fitted into the field by the second step, and also calculates the .I "effective precision" value, which is based on the original width and precision. The second step works with the effective precision rather than the original precision. Its description follows. First, the length of the printed representation of the number, not including its sign, is calculated. If this part of the number is shorter than the effective precision, then it is padded on the left with spaces or leading zeros so that the length of the resulting string is equal to the effective precision. Next, if the number is negative, or else if adding a positive sign has been requested, then the sign is added. It is added to the left of the padding zeros, or else to the right of padding spaces, whichever the case may be. At this stage, if the number is not yet adorned with a sign, and either the .code - or space precision option had been given, then the appropriate character, the digit .code 0 or a space, is added in the place where the sign would go.
This is done only if the result will not overflow the field width, but without regard for whether the character will overflow the effective precision. Finally, the resulting number is rendered into the field, using the requested left, right or center adjustment, as if it were a character string. If it overflows the field, it is reproduced in its entirety without any adjustment being performed. .RE .TP* "Format directives:" .RS Format directives are case sensitive, so that for example .code ~x and .code ~X have a different effect, and .code ~A doesn't exist whereas .code ~a does. They are: .coIP a Prints any object in an aesthetic way, as if by the .code pprint function. The aesthetic notation violates read-print consistency: this notation is not necessarily readable if it is implanted in \*(TX source code. The field width specifier is honored, including the left-right adjustment semantics. When the .code a specifier is used for numbers, the formatting is performed in two distinct steps: the printed representation of the number is calculated first, and then that representation is set into the field. At the same time, an effective precision is calculated, based on the precision and width, and that effective precision is used in the second step. In the first step, the rendering of a floating-point number to its printed representation, the precision specifies the maximum number of total significant figures, which do not include any digits in the exponent, if one is printed. Numbers are printed in E notation if their magnitude is small, or else if their exponent exceeds their precision. If the precision is not specified, then it is obtained from the .code *print-flo-precision* special variable, whose default value is the same as that of the .code flo-dig variable. The effective precision for the second step is then taken from the original precision, or one less than the width, whichever of these two values is smaller, but no lower than zero.
If the width is unspecified, it is taken as zero. Floating point values which are integers are printed without a trailing .code .0 (point zero). The .code + flag in the precision is honored for rendering an explicit .code + sign on nonnegative values. If a leading zero is specified in the precision, and a nonzero width is specified, then the printed value's integer part will be padded with leading zeros up to one less than the field width. These zeros are placed after the sign. A precision value of zero imposed on floating-point values is equivalent to a value of one; it is not possible to request zero significant figures. Integers are not affected by the precision value in the conversion to text; all of the digits of the integer are taken into the second step. In the case of integers, the effective precision for the second step is taken from the original precision, or one less than the width, whichever of these two values is smaller. However, if the width is not specified, or given as zero, then the unmodified precision value is taken as the effective precision. Thus, in the zero width or missing width case, integers are always padded with spaces or leading zeros due to the precision value, even if such padding overflows the field width. Rationale: the purpose of the elaborate rules for calculating the effective precision is to both obtain consistency in the printing of integers and floating-point values that are integers, as well as to break that consistency when the width is omitted or zero. This break in consistency has two benefits. The common situation of adding leading spaces or zeros to integers can be specified without specifying the width. For instance .str "~,8a" will format an integer right-justified in an eight-character extent, without width having to be used in order to specify a field to accommodate that padding.
The effective padding amount going into the second step is 8, exceeding the zero width, and thus allowing the padding to overflow the field. In the case of floating-point, the common requirement for limiting the number of digits can be expressed by the precision alone, without causing unwanted padding when there are fewer digits. If the above .str "~,8a" is used to format a floating-point value, it will be limited to 8 digits of precision, regardless of its magnitude and the position of its decimal point, or whether or not exponential notation is used. The effective precision for field placement shall then be zero in the second step, so that no padding is generated. However, if a nonzero width is used, then formatting becomes consistent between floating-point and integer so that, for instance, the format directive .str "~8,8a" produces the same output for the argument values 42 and 42.0, namely an eight-character-wide field in which the digits .str 42 appear right-aligned. .coIP s Prints any object in a standard way, as if by the .code print function. Objects for which read-print consistency is possible are printed in a way such that if their notation is implanted in \*(TX source, they are readable. The field width specifier is honored, including the left-right adjustment semantics. The precision field is treated similarly to the .code ~a format directive, except that non-exponentiated floating point numbers that would be mistaken for integers include a trailing .code .0 for the sake of read-print consistency. Objects truncated by precision may not have read-print consistency. For instance, if a string object is truncated, it loses its trailing closing quote, so that the resulting representation is no longer a properly formed string object. For integer objects, the .code *print-base* variable is honored.
Effectively, an integer is printed by the .code s directive as if by the .codn b , .codn o , .codn d , or .code x directive, depending on the value of the variable. .coIP d Requires an argument of integer or character type. The integer value or character code is printed in decimal. Width and precision semantics are as described for the .code a format directive, for integers. .coIP x Requires an argument of character, integer or buffer type. The integer value, character code, or buffer contents are printed in hexadecimal, using lowercase letters for the digits .code a through .codn f . Width and precision semantics are as described for the .code a format directive, for integers. .coIP X Like the .code x directive, but the hexadecimal digits .code a through .code f are rendered in uppercase. .coIP o Like the .code x directive, but octal is used instead of hexadecimal. .coIP b Like the .code x directive, but binary is used instead of hexadecimal. .coIP f The .code f directive prints numbers in a fixed point decimal notation, with a fixed number of digits after the decimal point. It requires a numeric argument. (Unlike .codn x , .code X and .codn o , it does not allow an argument of character type). The formatting performed by .code f is performed in two distinct steps: the printed representation of the number is calculated first, and then that representation is set into the field. The precision parameter coming from the directive is only involved in the first step. In the first step, the precision specifier gives the number of digits past the decimal point. The number is rounded off to the specified precision, if necessary. Furthermore, that many digits are always printed, regardless of the actual precision of the number or its type. If it is omitted, then the value is obtained from the special variable .codn *print-flo-digits* , whose default value is three: three digits past the decimal point.
A precision of zero means no digits past the decimal point, and in this case the decimal point is suppressed (regardless of whether the numeric argument is floating-point or integer). No limit is placed on the number of significant figures in the number by either the precision or width value. When the resulting textual number passes to the second formatting step, the precision value, for the purposes of that step, is calculated by taking one less than the field width, or else zero if the field width is zero. This value is not related to the precision that had been used to determine the number of places past the decimal point. .coIP e The .code e directive prints numbers in E notation. It requires a numeric argument. (Unlike .codn x , .code X and .codn o , it does not allow an argument of character type). The formatting performed by .code e is performed in two distinct steps: the printed representation of the number is calculated first, and then that representation is set into the field. The precision parameter coming from the directive is only involved in the first step. In the first step, the precision specifier gives the number of digits past the decimal point printed in the E notation, not counting the digits in the exponent. Exactly that many digits are printed, regardless of the precision of the number. If the precision is omitted, then the number of digits after the decimal point is obtained from the value of the special variable .codn *print-flo-digits* , whose default value is three. If the precision is zero, then a decimal portion is truncated off entirely, including the decimal point. When the resulting textual number passes to the second formatting step, the precision value, for the purposes of that step, is calculated by taking one less than the field width, or else zero if the field width is zero. This value is not related to the precision that had been used to determine the number of places past the decimal point. 
.coIP p The .code p directive prints a numeric representation in hexadecimal of the bit pattern of the object, which is meaningful to someone familiar with the internals of \*(TX. If the object is a pointer to heaped data, that value has a correspondence to its address. .coIP ! The .code ! directive establishes hanging indentation, and turns on the stream's indentation mode. Subsequent lines printed within the execution of the same .code format call will be automatically indented. If no width is specified, then the directive sets the hanging indentation to the current printing column position. If a width is specified, then it represents an offset (positive or negative). If the .code < prefix character is absent, the hanging indentation is set to the specified offset relative to the current printing column. If the .code < prefix is present on the width field, then the offset is applied relative to the indentation which was saved on entry into the .code format function. The indentation mode and indentation column are automatically restored to their previous values when the .code format function terminates, naturally or via an exception or nonlocal jump. The effect of a precision field (even if zero) combined with the .code ! directive is currently not specified, and reserved for future extension. The precision field is processed syntactically, and no error occurs, however. .RE .coNP Function @ fmt .synb .mets (fmt < format-string << format-arg *) .syne .desc The .code fmt function provides a shorthand for formatting to a string, according to the following equivalence which holds between .code fmt and .codn format : .verb (fmt s arg ...) <--> (format nil s arg ...) .brev .coNP Macro @ pic .synb .mets (pic < format-string << format-arg *) .syne .desc The .code pic macro ("picture based formatting") provides a notation for constructing a character string under the control of .meta format-string which indicates the insertion of zero or more .meta format-arg argument values.
Like the .code fmt function or quasiliteral syntax, the .code pic macro returns a character string. The .code pic macro's .meta format-string notation is different from quasiliterals or from .codn fmt . The .code pic .meta format-string argument isn't an evaluated expression, but syntax. It must be either a string literal or else a string quasiliteral. No other syntax is permitted. If .meta format-string is a string literal, it is scanned left to right in search of .IR "pic patterns" . Any characters not belonging to a pic pattern are copied into the output string verbatim. When a pic pattern is found, it is removed from .meta format-string and applied to the next successive .meta format-arg to perform a conversion and formatting of that value to text. The resulting text is appended to the output string, and the process continues in search of the next pic pattern. When the .meta format-string is exhausted, the constructed string is returned. If .meta format-string is a quasiliteral, then all of the text strings embedded within the quasiliteral are examined in the same way, in left to right order. Each such string is transformed into an expression which produces a character string according to the semantics of the pic patterns it contains, and the resulting expressions are substituted into the original quasiliteral to produce a transformed quasiliteral. There must be exactly as many .meta format-arg arguments as there are pic patterns in .metn format-string . The .code pic macro arranges for the left-to-right evaluation of the .meta format-arg expressions. If .meta format-string is a quasiliteral, the evaluation of these expressions is interleaved into the quasiliteral's expressions and variables, in the order implied by the placement of the corresponding pic patterns relative to the quasiliteral elements.
For instance, if .meta format-string is .code `@(abc)<<<@(xyz)` then the function .code abc is called first, then the .meta format-arg is evaluated which produces a value for the .code <<< pic pattern, after which the .code xyz function is called. There are three kinds of pic patterns: alignment patterns, numeric patterns and escape patterns. Escape patterns consist of a two-character sequence introduced by the .code ~ (tilde) character, which is followed by one of the characters that are special in pic pattern syntax: .verb < > | + - 0 # . ! ~ , ( ) .brev An escape pattern produces the second character as its output. For instance .code ~~ encodes a single .code ~ character, and .code ~# encodes a literal .code # character that is not part of any pattern. Alignment patterns are described next. .RS .coIP <<...<< A sequence of one or more .code < (less than) characters specifies that the corresponding argument is rendered left-aligned in a field whose width is given by the number of .code < characters. If the argument's textual representation doesn't fit into the field, it overflows. .coIP >>...>> A sequence of one or more .code > (greater than) characters specifies that the corresponding argument is rendered right-aligned in a field whose width is given by the number of .code > characters. If the argument's textual representation doesn't fit into the field, it overflows. .coIP ||...|| A sequence of one or more .code | (pipe) characters specifies that the corresponding argument is centered in a field whose width is given by the number of .code | characters. If the argument's textual representation doesn't fit into the field, it overflows. If the argument cannot be precisely centered, because the even-odd parity of its character count is different from the parity of the field width, it is centered slightly to the left: one less space appears on its left side than on its right side.
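.IP
The three alignment patterns can be sketched as follows. The results
shown are derived from the rules above, for a six-character field and
a three-character string argument; they are an illustration under
those assumptions rather than recorded output.

.verb
  (pic "<<<<<<" "abc") -> "abc   "
  (pic ">>>>>>" "abc") -> "   abc"
  ;; parity mismatch: three padding spaces cannot be split
  ;; evenly, so one goes on the left and two on the right
  (pic "||||||" "abc") -> " abc  "
.brev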
.RE .IP The numeric patterns, by means of their visual pattern and several optional prefix codes, specify the parameters for the conversion of a numeric argument, which is rendered right-aligned in a fixed-width field. Numeric patterns that do not contain any commas conform to this simple rule: .mono .mets <> [ sign ] [0] {#}+ >> [ point {#}+ | !] .onom or else if they contain commas, the placement of these commas is governed by the more complicated rule: .mono .mets <> [ sign ] [0 [,]] {#}+ {,{#}+}* >> [ point {#}+ {,{#}+}* | !] .onom Commas may be placed anywhere within the pattern of hash characters, except at the beginning or end, or adjacent to the decimal point. If the leading zero is present, a comma may appear immediately after it, before the first hash. A second form of both of the above patterns is supported, for specifying that negative numbers be shown in parentheses. Instead of the sign, an opening parenthesis may appear, which must be matched by a closing parenthesis which follows a valid pattern interior: .mono .mets ( [0] {#}+ >> [ point {#}+ | !] ) .onom With embedded commas: .mono .mets ( [0 [,]] {#}+ {,{#}+}* >> [ point {#}+ {,{#}+}* | !] ) .onom The pattern consists of an optional .meta sign which is one of the characters .code + (plus) or .code - (minus), or else it may optionally begin with an opening parenthesis, indicating one of the two alternative forms. This is followed by an optional leading zero. After this comes a sequence of one or more .code # (hash) characters, which may contain exactly one .meta point element, which is defined as one of the characters .code . (period) or .code ! (exclamation mark). This .meta point element may appear at most once, and must not be the first or last character, unless it is the exclamation mark, in which case it may appear last. Except if ending in the exclamation mark, a numeric pattern specifies a field width which is equal to the number of characters occurring in the pattern itself.
For instance, the patterns .codn #### , .code +### and .code 0#.# all specify a field width of four. If the numeric pattern ends in an exclamation mark, that character is not counted toward the field width that it specifies. Thus the pattern .code ###! specifies a field width of three. If the leading sign is present, it has the following meanings: .RS .coIP + If the corresponding numeric argument is nonnegative, the .code + character shall appear before the first digit. Otherwise the minus character will appear. .coIP - Like .code + except that when the numeric argument is nonnegative, instead of a .code + character, a space appears before the first digit. This space counts toward the field width and therefore contributes to overflow. .RE .IP If a leading sign is not present, then no extra character appears before the first digit of a positive value, which means that an extra character of field width is available for representing nonnegative values. If the leading zero is present, it specifies that the number is padded with zeros on the left. In combination with the .code - sign, this shall not cause the leading space before a positive value to be overwritten with a zero; leading zeros, if any, begin after that space. The remainder of the pattern specifies the number of digits of the fractional part which is indicated by the number of .code # characters after the .metn point . The number is rounded to that many fractional digits, which are all rendered, even if there are trailing zeros. If no .meta point is specified, then the number of fractional digits is zero. The same is true if .meta point is specified as .code ! as the last character. In both cases, the numeric argument is rounded to integer, and rendered without any decimal point or fractional part. There is a difference between .meta point being specified using the ordinary decimal point character .code . versus the .code ! character. The .code !
character specifies that if the conversion of the numeric argument overflows the field, then instead of showing any digits, the field is filled with a pattern consisting of .code # (hash) characters, and possibly an embedded decimal point. In contrast, the .code . character permits the field's width to increase to accommodate overflowing output. If overflow takes place and the .code ! character appears other than as the rightmost character of the pattern, then the decimal point character .code . appears at the position indicated by that .code ! character. If the .code ! character is the rightmost character of the pattern, then, just as in the case of normal, non-overflowing output, it doesn't contribute to the width of the hash fill, and only hash characters appear. If commas appear in the numeric pattern according to the more complex syntactic rule, they count toward the field width and specify the insertion of digit-separating commas at the indicated locations. Digit separators may be specified on either side of the decimal point, but not adjacent to it. In the output, a digit-separating comma shall not appear if it would be immediately preceded by a .code + or .code - sign or space. In these situations, the sign character or space appears in place of the digit separator. A digit separator that appears in a position occupied by a space is also suppressed in favor of the space. Digit separators are included among leading zeros. It is not logically possible for a digit separator to appear as the first character of a pattern's output, because it may not be the first character of a pattern. However, if a numeric pattern is preceded or followed by a comma, those commas are ordinary characters which are copied to the output. When, due to the presence of .codn ! , an overflowing field is handled by the generation of the hash character fill, the hash characters are treated as digits for the purpose of digit separation.
When the pattern uses parentheses to specify that negative numbers are to be shown with parentheses, the parentheses count toward the field width. The field portion between the parentheses is called the inner field. The parentheses appear in the output when the number is negative, and are placed immediately outside of the inner field, so that if leading zeros are not requested, there may be one or more spaces between the opening parenthesis and the first digit. If the number is nonnegative, then each parenthesis is replaced by one space, flanking the inner field in the same manner as parentheses.
.TP* Examples:
.verb
  ;; numeric formatting
  (pic "######" 1234.1)          ->  "  1234"
  (pic "######.#" 1234.1)        ->  "  1234.1"
  (pic "#######.##" 1234.1)      ->  "   1234.10"
  (pic "#######.##" -1234.1)     ->  "  -1234.10"
  (pic "0######.##" 1234.1)      ->  "0001234.10"
  (pic "+######.##" 1234.1)      ->  "  +1234.10"
  (pic "-######.##" 1234.1)      ->  "   1234.10"
  (pic "+0#####.##" 1234.1)      ->  "+001234.10"
  (pic "-0#####.##" 1234.1)      ->  " 001234.10"
  (pic "#######.##" -1234.1)     ->  "  -1234.10"

  ;; digit separation
  (pic "0,###,###.##" 123.1)     ->  "0,000,123.10"
  (pic "#,###,###.##" 123.1)     ->  "      123.10"

  ;; overflow with !
  (pic "#!#" 1234)               ->  "#.#"
  (pic "#!#" 123)                ->  "#.#"
  (pic "-##!#" -123)             ->  "###.#"
  (pic "+##!#" 123)              ->  "###.#"
  (pic "###!" 1234)              ->  "###"

  ;; negative parentheses
  (pic "(#,###.##)" 1234.56)     ->  " 1,234.56 "
  (pic "(#,###.##)" -234.56)     ->  "(  234.56)"

  ;; alignment, multiple arguments
  (pic "<<<<<< 0#.# >>>>>>>" "foo" (+ 2 2) "bar")
  -->  "foo    04.0     bar"

  ;; quasiliteral
  (let ((a 2) (b "###") (c 13.5))
    (pic `abc@(+ a a)###.##@b>>>>` c "x"))
  -->  "abc4 13.50###   x"

  ;; filename generation
  (mapcar (do pic "foo~-0##.jpg") (rlist 0..5 8 12))
  -->  ("foo-000.jpg" "foo-001.jpg" "foo-002.jpg"
        "foo-003.jpg" "foo-004.jpg" "foo-005.jpg"
        "foo-008.jpg" "foo-012.jpg")
.brev
.coNP Functions @, print @, pprint @, prinl @, pprinl @, tostring and @ tostringp .synb .mets (print < obj >> [ stream <> [ pretty-p ]]) .mets (pprint < obj <> [ stream ]) .mets (prinl < obj <> [ stream ]) .mets (pprinl < obj <> [ stream ]) .mets (tostring << obj ) .mets (tostringp << obj ) .syne .desc The .code print and .code pprint functions render a printed character representation of the .meta obj argument into .metn stream . If the .meta stream argument is not supplied, then the destination is the stream currently stored in the .code *stdout* variable. If the Boolean argument .meta pretty-p is not supplied or is explicitly specified as .codn nil , then the .code print function renders in a way which strives for read-print consistency: an object is printed in a notation which is recognized as a similar object of the same kind when it appears in \*(TX source code. Floating-point objects are printed as if using the .code format function, with formatting controlled by the .code *print-flo-format* variable. If .meta pretty-p is true, then .code print does not strive for read-print consistency. In \*(TX, the term .I "pretty printing" refers to rendering a printed representation of an object without the notational details required to unambiguously delimit the object, and represent its value and type without ambiguity.
For instance, the four-character string .strn abcd , the two-byte buffer object .code #b'abcd' as well as the symbol .code abcd all pretty-print as .codn abcd . To understand the meaning, the user has to refer to the documentation of the specific application which produces that representation. When .meta pretty-p is true, strings are printed by sending their characters to the output stream, as if by the .code put-string function, rather than being rendered in the string literal notation consisting of double quotes, and escape sequences for control characters. Likewise, character objects are printed via .code put-char rather than the .code #\e notation. When .meta pretty-p is true, buffer objects are printed as strings of hexadecimal digit pairs, without being embedded in the .code #b'...' notation, and without any line breaks. This behavior is new in \*(TX 275; see the COMPATIBILITY section. The .meta pretty-p flag causes symbols to be printed without their package prefix, except that symbols from the keyword package are still printed with the leading colon. Floating-point objects are printed as if using the .code format function, with formatting controlled by the .code *pprint-flo-format* variable. When aggregate objects like conses, ranges and vectors are printed, the notations of these objects themselves are unaffected by the .meta pretty-p flag; however, that flag is distributed to the elements. The .code print function returns .metn obj . The .code pprint ("pretty print") function is equivalent to .codn print , with the .meta pretty-p argument hardcoded true. The .code prinl function ("print and new line") behaves like a call to .code print with .meta pretty-p defaulting to .codn nil , followed by issuing a newline character to the stream. The .code pprinl function ("pretty print and new line") behaves like .code pprint followed by issuing a newline to the stream.
The .code tostring and .code tostringp functions are like .code print and .codn pprint , but they do not accept a stream argument. Instead they print to a freshly instantiated string stream, and return the resulting string. The following equivalences hold between calls to the .code format function and calls to the above functions:
.verb
  (format stream "~s" obj)  <-->  (print obj stream)
  (format t "~s" obj)       <-->  (print obj)
  (format t "~s\en" obj)    <-->  (prinl obj)
  (format nil "~s" obj)     <-->  (tostring obj)
.brev
For .codn pprint , .code tostringp and .codn pprinl , the equivalence is produced by using .code ~a in format rather than .codn ~s . .TP* Notes: For floating-point numbers, the above description of the behavior in terms of the format specifiers .code ~s and .code ~a only applies with respect to the default values of the variables .code *print-flo-format* and .codn *pprint-flo-format* . For characters, the print function behaves as follows: most control characters in the Unicode .code C0 and .code C1 range are rendered using the .code #\ex notation, using two hex digits. Codes in the range .code D800 to .codn DFFF , and the codes .code FFFE and .code FFFF are printed in the .code #\exNNNN notation with four hexadecimal digits, and characters above this range are printed using the same notation, but with six hexadecimal digits. Certain characters in the .code C0 range are printed using their names such as .code #\enul and .codn #\ereturn , which are documented in the Character Literals section. The .code DC00 character is printed as .codn #\epnul . All other characters are printed as .mono .meti >> #\e char .onom where .meta char is the actual character. Caution: read-print consistency is affected by trailing material. If additional digits are printed immediately after a number without intervening whitespace, they extend that number. If hex digits are printed after the character .codn x , which is rendered as .codn #\ex , they look like a hex character code.
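.TP* Example:
The following sketch contrasts the two printing styles by capturing the output as strings with
.code tostring
and
.codn tostringp .
The results are derived from the rules described above rather than taken from a recorded session, and they assume that the printer-related special variables have their default values.
.verb
  ;; strings and characters: literal notation versus raw characters
  (tostring "abc")               ->  "\e"abc\e""
  (tostringp "abc")              ->  "abc"
  (tostring #\ea)                 ->  "#\e\ea"
  (tostringp #\ea)                ->  "a"

  ;; the list notation is unchanged; pretty-p is distributed to elements
  (tostring '(1 "two" #\ec))      ->  "(1 \e"two\e" #\e\ec)"
  (tostringp '(1 "two" #\ec))     ->  "(1 two c)"

  ;; keyword symbols keep their leading colon when pretty-printed
  (tostringp :key)               ->  ":key"
.brev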
.coNP Function @ tprint .synb .mets (tprint < obj <> [ stream ]) .syne .desc The .code tprint function prints a representation of .meta obj on .metn stream . If the stream argument is not supplied, then the destination is the stream currently stored in the .code *stdout* variable. For all object types except lists and vectors, .code tprint behaves like .codn pprinl . If .meta obj is a list or vector, then .code tprint recurses: the .code tprint function is applied to each element. An empty list or vector results in no output at all. This effectively means that an arbitrarily nested structure of lists and vectors is printed flattened, with one element on each line. .coNP Function @ display-width .synb .mets (display-width << char ) .mets (display-width << string ) .syne .desc The .code display-width function calculates the number of places occupied by the printed representation of .meta char or .meta string on a monospace display which renders certain characters, such as the East Asian kanji and other characters, using two places. For a .meta string argument, this value is the sum of the individual display widths of the string's constituent characters. The display width of an empty string is zero. Control characters are assigned a display width of zero, regardless of their display control semantics, if any. Characters marked by Unicode as being wide or full width have a display width of two. Other characters have a display width of one. .coNP Function @ streamp .synb .mets (streamp << obj ) .syne .desc The .code streamp function returns .code t if .meta obj is any type of stream. Otherwise it returns .codn nil . .coNP Function @ real-time-stream-p .synb .mets (real-time-stream-p << obj ) .syne .desc The .code real-time-stream-p function returns .code t if .meta obj is a stream marked as "real-time". If .meta obj is not a stream, or not a stream marked as "real-time", then it returns .codn nil .
Only certain kinds of streams accept the real-time attribute: file streams and tail streams. This attribute controls the semantics of the application of .code lazy-stream-cons to the stream. For a real-time stream, .code lazy-stream-cons returns a stream with "naive" semantics which returns data as soon as it is available, at the cost of generating a spurious .code nil item when the stream terminates. The application has to recognize and discard that .code nil item. The ordinary lazy streams read ahead by one line and suppress this extra item, so their representation is more accurate. When \*(TX starts up, it automatically marks the .code *stdin* stream as real-time, if it is connected to a TTY device (a device for which the POSIX function .code isatty reports true). This is only supported on platforms that have this function. The behavior is overridden by the .code -n command-line option. .coNP Function @ open-file .synb .mets (open-file < path <> [ mode-string ]) .syne .desc The .code open-file function creates a stream connected to the file which is located at the given .metn path , which is a string. The .meta mode-string argument is a string which uses the same conventions as the mode argument of the C language .code fopen function, with greater permissiveness, and some extensions. The syntax of .meta mode-string is described by the following grammar. Note that it permits no whitespace characters:
.mono
.mets < mode-string := [ < mode ] [ < options ]
.mets < mode := { < selector [ + ] | + }
.mets < selector := { r | w | a | m | T }
.mets < options := { b | x | l | u | i | n | < digit |
.mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ <> z[ digit ] | < redirection | >> ? fdno }
.mets < digit := { 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 }
.onom
If the .meta mode-string argument is omitted, the behavior is the same as an empty mode string.
The .meta mode part of the mode string generates the following possibilities: .RS .meIP empty If the .meta mode is missing, then a default mode is implied. The default is specific to the particular stream-opening function. In the case of .codn open-file , the default mode is .codn r . .coIP + A .meta mode consisting of just the .code + character is equivalent to .codn r+ . .coIP r This .meta mode means that the file is opened for reading. .coIP r+ The file is opened for reading and writing. It is not created if it doesn't exist. .coIP w The file is opened for writing. If it exists, it is truncated to zero length. If it doesn't exist, it is created. .coIP w+ The file is opened for reading and writing. If it exists, it is truncated to zero length. If it doesn't exist, it is created. .coIP m The file is opened for modification. This is the same as .code w except that the file is not truncated if it exists. .coIP m+ The file is opened for reading and modification. This is the same as .code w+ except that the file is not truncated if it exists. .coIP a The file is opened for writing. If it doesn't exist, it is created. If it exists, the current position is advanced to one byte past the end of the file, so that newly written data are appended. .coIP a+ The file is opened for reading and writing. If it doesn't exist, it is created. The read position is at the beginning of the file, but writes are appended to the end regardless of the position. .coIP T This selector may be used on operating systems which support the .code O_TMPFILE mode of the .code open POSIX C library function. The .meta path must specify a directory to which the calling process has write permission. An anonymous, unlinked file will be created in the filesystem which holds that directory, open for reading and writing. See additional notes at the end of this section. .RE .IP The meanings of the option characters are: .RS .coIP b The file is opened in binary mode: no line-ending translation takes place. 
In the absence of this option, files are opened in text mode, in which newline characters in the stream are an abstract indication of the end of a line, and are translated to a system-specific way of terminating lines in text files. .coIP x The file is created and opened only if it does not already exist. Otherwise, a .code file-error exception is thrown. This option is allowed only with the .code w and .code w+ modes. .coIP l Specifies that the stream will be line buffered. This means that an implicit flush operation takes place whenever the newline character is output. .coIP u Specifies that the stream will be unbuffered. It is erroneous for both .code l and .code u to be specified. .coIP i Specifies that the stream will have the real-time property set. For a description of the semantics, see the .code real-time-stream-p function. Briefly, this property affects the semantics of lazy lists which draw input from the stream. In addition, for a stream opened for writing or reading and writing, the .code i mode letter specifies that the stream will be line buffered, unless specified as unbuffered with .codn u . .coIP n Specifies that the operation shall not block. .meIP digit A decimal digit specifies the stream buffer size as binary exponential buffer size order, such that .code 0 specifies 1024 bytes, .code 1 specifies 2048 and so forth up to .code 9 specifying 524288 bytes. If no such digit is specified, then the stream uses a default buffer size. It is erroneous for the size order digit to be present together with the option .codn u . .coIP z This option specifies .code gzip compression. A .code gzip-stream is opened for the file rather than an ordinary .codn stdio-stream , which performs compression when writing and decompression when reading. The compression uses the Deflate algorithm, and a file format compatible with the .code gzip utility. If .code z is immediately followed by a digit, then that specifies the compression level. The default level is 6.
A value of zero is an invalid level, silently accepted and treated as the default. When .code z is specified, special restrictions apply to the .metn mode-string . If these are violated, an exception is thrown. A .code gzip-stream does not support an update mode; it may not be open simultaneously for reading and writing as, for example, requested by the .str r+ mode. The options .codn l , .code u and .code i are inapplicable. The digit option specifying the buffer size order also may not be used with .codn z . The .code seek-stream function may only be used in the forward direction on .code gzip streams open for writing, and has the semantics of writing a region of zero bytes. Arbitrary seeking is supported in read mode, but via a costly emulation which decompresses data from the beginning of the file to the desired seek point. .meIP redirection This option refers to a special syntax that only has an effect in mode strings that are passed to the .code open-process function; the syntax performs I/O redirections in the child process created by that function, and is described in that function's documentation. .meIP >> ? fdno Like .metn redirection , this option refers to syntax which only has an effect in mode strings that are passed to the .code open-process function. The syntax selects an alternative file descriptor to connect to the returned stream. This is described in the documentation for the .code open-process function. .RE .IP The .code O_TMPFILE flag on which the .str T mode selector depends was introduced by the Linux kernel, and is likely only supported on Linux systems. It is not supported on all filesystem types. The .code T mode offers a way to create temporary files in a robust way in any file system which supports the mechanism. There is no concern about choosing a unique file name, since the file doesn't have one. The file is guaranteed to disappear if the process is terminated in any manner.
In contrast, traditional temporary files which are initially given a name and then unlinked may remain if the process is abruptly terminated before it is able to call .codn unlink . On Linux, it is possible to link a file created with .str T into the filesystem, according to the following pattern:
.verb
  ;; atomically create file called "name" with content "hello"
  (let* ((stream (open-file "." "T"))
         (fd (fileno stream)))
    (put-string "hello\en" stream)
    (flush-stream stream)
    (rlink `/proc/self/fd/@fd` "name"))
.brev
The atomic creation of a file can be simulated by the familiar pattern of writing to a visible temporary file and then renaming. However, the above pattern eliminates the risk that a temporary file will be left behind if the procedure is interrupted for any reason before reaching the .code rlink call. Any reason includes process termination that cannot be intercepted and handled, and operating system failure or power loss. .coNP Function @ open-tail .synb .mets (open-tail < path >> [ mode-string <> [ seek-to-end-p ]]) .syne .desc The .code open-tail function creates a tail stream connected to the file which is located at the given .metn path . The .meta mode-string argument is a string which uses the same conventions as the mode argument of the C language .code fopen function. If this argument is omitted, then .str r is used. See the .code open-file function for a discussion of modes. The .meta seek-to-end-p argument is a Boolean which determines whether the initial read/write position is at the start of the file, or just past the end. It defaults to .codn nil . This argument only makes a difference if the file exists at the time .code open-tail is called. If the file does not exist, and is later created, then the tail stream will follow that file from the beginning.
In other words, .meta seek-to-end-p controls whether the tail stream reads all the existing data in the file, if any, or whether it reads only newly added data from approximately the time the stream is created. A tail stream has special semantics with regard to reading at the end of file. A tail stream never reports an end-of-file condition; instead it polls the file until more data is added. Furthermore, if the file is truncated, or replaced with a smaller file, the tail stream follows this change: it automatically opens the smaller file and starts reading from the beginning (the .meta seek-to-end-p flag only applies to the initial open). In this manner, a tail stream can dynamically follow rotating log files. Caveat: since a tail stream can reopen a new file which has the same name as the original file, it behaves incorrectly if the program changes the current working directory, and the pathname is relative. .coNP Function @ open-directory .synb .mets (open-directory << path ) .syne .desc The .code open-directory function tries to create a stream which reads the directory given by the string argument .metn path . If a filesystem object exists under the path, is accessible, and is a directory, then the function returns a stream. Otherwise, a file error exception is thrown. The resulting stream supports the .code get-line operation. Each call to the .code get-line operation retrieves a string representing the next directory entry. The value .code nil is returned when there are no more directory entries. The .code . and .code .. entries in Unix filesystems are not skipped. .coNP Function @ tmpfile .synb .mets (tmpfile) .syne .desc The .code tmpfile function creates a new temporary binary file which is different from any existing file. It opens a stream for that file and returns the stream. The stream is created with the .code open-file mode .strn w+b . When the stream is closed, or the \*(TX image terminates, the file is deleted.
Note: the .code tmpfile function is implemented using the same-named ISO C and POSIX library function. On POSIX systems of sufficient quality, .code tmpfile deletes the file before returning the open stream, such that the file object continues to exist while the stream is open, but is not known by any name in the file system. POSIX (IEEE Std 1003.1-2017) notes that in some implementations, "a permanent file may be left behind if the process calling tmpfile() is killed while it is processing a call to tmpfile". Notes: if a unique file is required which exists in the file system under a known name until explicitly deleted, the .code mkstemp function may be used. If a unique directory needs to be created, the .code mkdtemp function may be used. These two functions are described in the Unix Filesystem Complex Operations section of the manual. .coNP Function @ make-string-input-stream .synb .mets (make-string-input-stream << string ) .syne .desc The .code make-string-input-stream function produces an input stream object. Character read operations on the stream object read successive characters from .metn string . Output operations and byte operations are not supported. .coNP Function @ make-string-byte-input-stream .synb .mets (make-string-byte-input-stream << string ) .syne .desc The .code make-string-byte-input-stream function produces an input stream object. Byte read operations on this stream object read successive byte values obtained by encoding .meta string into UTF-8. Character read operations are not supported, and neither are output operations. .coNP Function @ make-strlist-input-stream .synb .mets (make-strlist-input-stream << list ) .syne .desc The .code make-strlist-input-stream function produces an input stream object based on a list of strings. Through the character read operations invoked on this stream, the list of strings appears as a list of newline-terminated lines. Output operations and byte operations are not supported. 
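.TP* Example:
The three string-based input stream constructors described above can be compared side by side. This is a sketch which relies on the
.codn get-char ,
.code get-byte
and
.code get-line
functions, documented later in this manual, returning
.code nil
when the data is exhausted; the results are worked out from the descriptions rather than copied from a session.
.verb
  ;; characters of a string
  (let ((s (make-string-input-stream "ab")))
    (list (get-char s) (get-char s) (get-char s)))
  -->  (#\ea #\eb nil)

  ;; bytes of the UTF-8 encoding of a string
  (let ((s (make-string-byte-input-stream "hi")))
    (list (get-byte s) (get-byte s) (get-byte s)))
  -->  (104 105 nil)

  ;; list of strings presented as newline-terminated lines
  (let ((s (make-strlist-input-stream '("alpha" "beta"))))
    (list (get-line s) (get-line s) (get-line s)))
  -->  ("alpha" "beta" nil)
.brev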
.coNP Function @ make-string-output-stream .synb .mets (make-string-output-stream) .syne .desc The .code make-string-output-stream function, which takes no arguments, creates a string output stream. Data sent to this stream is accumulated into a string object. String output streams support both character and byte output operations. Bytes are assumed to represent a UTF-8 encoding, and are decoded in order to form characters which are stored into the string. If an incomplete UTF-8 code is output, and a character output operation then takes place, that code is assumed to be terminated and is decoded as invalid bytes. The UTF-8 decoding machine is reset and ready for the start of a new code. The .code get-string-from-stream function is used to retrieve the accumulated string. If the null character is written to a string output stream, the behavior is unspecified. \*(TX strings cannot contain null bytes. The pseudo-null character .codn #\exDC00 , also notated .codn #\epnul , will produce a null byte when converted to UTF-8 and thus serves as an effective internal representation of the null character in external data. .coNP Function @ get-string-from-stream .synb .mets (get-string-from-stream << stream ) .syne .desc The .meta stream argument must be a string output stream. This function finalizes the data sent to the stream and retrieves the accumulated character string. If a partial UTF-8 code has been written to .metn stream , and then this function is called, the byte stream is considered complete and the partial code is decoded as invalid bytes. After this function is called, further output on the stream is not possible. .coNP Function @ make-strlist-output-stream .synb .mets (make-strlist-output-stream) .syne .desc The .code make-strlist-output-stream function is similar to .codn make-string-output-stream . However, the stream object produced by this function does not produce a string, but a list of strings. 
The data is broken into multiple strings by newline characters written to the stream. Newline characters do not appear in the string list. Also, byte output operations are not supported. .coNP Function @ get-list-from-stream .synb .mets (get-list-from-stream << stream ) .syne .desc The .code get-list-from-stream function returns the string list which has accumulated inside a string output stream given by .metn stream . The string output stream is finalized, so that further output is no longer possible. .coNP Macro @ with-in-string-stream .synb .mets (with-in-string-stream >> ( stream-var << string ) .mets \ \ << body-form *) .syne .desc The .code with-in-string-stream macro binds the symbol .meta stream-var as a variable, initializing it with a newly created string input stream. The string input stream is constructed from .meta string as if by the .mono .meti (make-string-input-stream << string ) .onom expression. Then it evaluates the .metn body-form s in the scope of the variable. The value of the last .meta body-form is returned, or else .code nil if no forms are present. The .meta stream-var argument must be a bindable symbol, as defined by the .code bindable function. The .meta string argument must be a form which evaluates to a character string value. .coNP Macro @ with-in-string-byte-stream .synb .mets (with-in-string-byte-stream >> ( stream-var << string ) .mets \ \ << body-form *) .syne .desc The .code with-in-string-byte-stream macro binds the symbol .meta stream-var as a variable, initializing it with a newly created string byte input stream. The string input stream is constructed from .meta string as if by the .mono .meti (make-string-byte-input-stream << string ) .onom expression. Then it evaluates the .metn body-form s in the scope of the variable. The value of the last .meta body-form is returned, or else .code nil if no forms are present. The .meta string argument must be a form which evaluates to a character string value. 
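.TP* Example:
A minimal sketch of the two input macros described above, assuming only the
.code get-line
and
.code get-byte
functions documented elsewhere in this manual. In each case the value of the last
.meta body-form
becomes the value of the macro form.
.verb
  ;; read two lines out of a string
  (with-in-string-stream (s "one\entwo")
    (list (get-line s) (get-line s)))
  -->  ("one" "two")

  ;; read the first byte of the UTF-8 encoding of a string
  (with-in-string-byte-stream (s "A")
    (get-byte s))
  -->  65
.brev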
.coNP Macro @ with-out-string-stream .synb .mets (with-out-string-stream <> ( stream-var ) << body-form *) .syne .desc The .code with-out-string-stream macro binds the symbol specified by the .meta stream-var argument as a variable, initializing it with a newly created string output stream. The output stream is created as if by the .code make-string-output-stream function. Then it evaluates .metn body-form s in the scope of that variable. After these forms are evaluated, the string is extracted from the string output stream, as if by the .code get-string-from-stream function, and returned as the result value of the form. .coNP Macro @ with-out-strlist-stream .synb .mets (with-out-strlist-stream <> ( stream-var ) << body-form *) .syne .desc The .code with-out-strlist-stream macro binds the symbol specified by the .meta stream-var argument as a variable, initializing it with a newly created string list output stream. The output stream is created as if by the .code make-strlist-output-stream function. Then it evaluates .metn body-form s in the scope of that variable. After these forms are evaluated, the string list is extracted from the string list output stream, as if by the .code get-list-from-stream function, and returned as the result value of the form. .coNP Function @ make-byte-input-stream .synb .mets (make-byte-input-stream << obj ) .syne .desc The .code make-byte-input-stream function creates a stream which supports the .code get-byte operation for traversing a byte-wise representation of .metn obj . The function serves as a generic interface for calling one of several other stream constructing functions based on the type of the .meta obj argument. The .meta obj argument must be either a buffer, in which case .code make-byte-input-stream behaves like .codn make-buf-stream , or else a string, in which case the function behaves like .codn make-string-byte-input-stream .
Note: the repertoire of types handled by .code make-byte-input-stream may expand in future language versions. .coNP Function @ close-stream .synb .mets (close-stream < stream <> [ throw-on-error-p ]) .syne .desc The .code close-stream function performs a close operation on .metn stream , whose meaning depends on the type of the stream. For some types of streams, such as string streams, it does nothing. For streams which are connected to operating system files or devices, it will perform a close of the underlying file descriptor, and dissociate that descriptor from the stream. Any buffered data is flushed first. .code close-stream returns a Boolean true value if the close has occurred without errors, otherwise .codn nil . For most streams, "without errors" means that any buffered output data is flushed successfully. For command and process pipes (see .code open-command and .codn open-process ), success also means that the process terminates normally, with a successful error code, or an unsuccessful one. An abnormal termination is considered an error, as is the inability to retrieve the termination status, as well as the situation that the process continues running in spite of the close attempt. Detecting these situations is platform specific. If the .meta throw-on-error-p argument is specified, and isn't .codn nil , then the function throws an exception if an error occurs during the close operation instead of returning .codn nil . If .code close-stream is called in such a way that it returns a value, without throwing an exception, and that value isn't .codn nil , that value is retained. Additional calls to the function with the same .meta stream object return that same value without having any effect on the stream. These additional calls ignore the .meta throw-on-error-p argument.
The .meta stream may be associated with a process, in one of several ways: implicitly, by the functions .code open-process and .code open-command and related functions, or explicitly by the .code open-fileno function, if a .meta pid argument is specified. In this situation, .code close-stream waits for the termination of that process, after closing the underlying file descriptor. If the process terminates normally, then .code close-stream returns its termination status, which is zero if the termination is successful. If the status of the process cannot be obtained, or indicates an abnormal termination, then the return value is .codn nil . In that situation, if .meta throw-on-error-p is true, an exception is thrown instead. .coNP Macro @ with-stream .synb .mets (with-stream >> ( stream-var << init-form ) .mets \ \ << body-form *) .syne .desc The .code with-stream macro binds the variable whose name is given by the .meta stream-var argument, and arranges for the evaluation of .metn body-form s in the scope of that variable. The variable is initialized with the value produced by the evaluation of .metn init-form , which must be an expression that evaluates to a stream. After all of the .metn body-form s have been evaluated, the stream is closed, as if by the .mono .meti (close-stream << stream-var ) .onom expression. The value of the last .meta body-form then becomes the result value of the form, or else .code nil if these forms are absent. If the evaluation of the .metn body-form s is abandoned, the stream is still closed. That is to say, the closure of the stream is a protected action, as if by the .code unwind-protect operator. .coNP Functions @, get-error @ get-error-str and @ clear-error .synb .mets (get-error << stream ) .mets (get-error-str << stream ) .mets (clear-error << stream ) .syne .desc When a stream operation fails, the .code get-error and .code get-error-str functions may be used to inquire about a more detailed cause of the error.
Not all streams support these functions to the same extent. For instance, string input streams have no persistent state. The only error which occurs is the condition when the string has no more data. The .code get-error inquires .meta stream about its error condition. The function returns .code nil to indicate there is no error condition, .code t to indicate an end-of-data condition, or else a value which is specific to the stream type indicating the specific error type. Note: for some streams, it is possible for the .code t value to be returned even though no operation has failed; that is to say, the streams "know" they are at the end of the data even though no read operation has failed. Code which depends on this will not work with streams which do not thus indicate the end-of-data .IR "a priori" , but by means of a read operation which fails. The .code get-error-str function returns a text representation of the error code. The .code nil error code is represented as the string .codn "no error" ; the .code t error code as .code "eof" and other codes have a stream-specific representation. The .code clear-error function removes the error situation from a stream. On some streams, it does nothing. If an error has occurred on a stream, this function should be called prior to retrying any I/O or positioning operations. The return value is the previous error code, or .code nil if there was no error, or the operation is not supported on the stream. .coNP Functions @, get-line @ get-char and @ get-byte .synb .mets (get-line <> [ stream ]) .mets (get-char <> [ stream ]) .mets (get-byte <> [ stream ]) .syne .desc These fundamental stream functions perform input. The .meta stream argument is optional. If it is specified, it should be an input stream which supports the given operation. If it is not specified, then the .code *stdin* stream is used. The .code get-char function pulls a character from a stream which supports character input. 
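As an illustrative sketch, the following reads from a string input stream
using a mixture of
.code get-char
and
.codn get-line .
The stream is created with
.codn make-string-input-stream ,
which is chosen here only because it requires no external file. The final
call yields
.code nil
because the data is exhausted:

.verb
  (let ((s (make-string-input-stream "ab\encd")))
    (list (get-char s) (get-line s) (get-line s) (get-line s)))
  -> (#\ea "b" "cd" nil)
.brev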
Streams which support character input also support the .code get-line function which extracts a line of text delimited by the end of the stream or a newline character and returns it as a string. (The newline character does not appear in the string which is returned). Character input from streams based on bytes requires UTF-8 decoding, so that .code get-char may actually read several bytes from the underlying low-level operating system stream. The .code get-byte function bypasses UTF-8 decoding and reads raw bytes from any stream which supports byte input. Bytes are represented as integer values in the range 0 to 255. Note that if a stream supports both byte input and character input, then mixing the two operations will interfere with the UTF-8 decoding. These functions return .code nil when the end of data is reached. Errors are represented as exceptions. See also: .code get-lines .coNP Function @ get-string .synb .mets (get-string >> [ stream >> [ count <> [ close-after-p ]]]) .syne .desc The .code get-string function reads characters from a stream, and assembles them into a string, which is returned. If the .meta stream argument is omitted, then the .code *stdin* stream is used. The stream is closed after extracting the data, unless .meta close-after-p is specified as .codn nil . The default value of this argument is .codn t . If the .meta count argument is missing, then all of the characters from the stream are read and assembled into a string. If present, the .meta count argument should be a positive integer indicating a limit on how many characters to read. The returned string will be no longer than .metn count , but may be shorter. .coNP Functions @ unget-char and @ unget-byte .synb .mets (unget-char < char <> [ stream ]) .mets (unget-byte < byte <> [ stream ]) .syne .desc These functions put back, into a stream, a character or byte which was previously read. The character or byte must match the one which was most recently read. 
If the .meta stream argument is omitted, then the .code *stdin* stream is used. If the operation succeeds, the byte or character value is returned. A .code nil return indicates that the operation is unsupported. Some streams do not support these operations; some support only one of them. In general, if a stream supports .codn get-char , it supports .codn unget-char , and likewise for .code get-byte and .codn unget-byte . Streams may require a pushed back byte or character to match the character which was previously read from that stream position, and may not allow a byte or character to be pushed back beyond the beginning of the stream. Space may be available for only one byte of pushback under the .code unget-byte operation. The number of characters that may be pushed back by .code unget-char is not limited. Pushing both a byte and a character, in either order, is also unsupported. Pushing a byte and then reading a character, or pushing a character and reading a byte, are unsupported mixtures of operations. If the stream is binary, then pushing back a byte decrements its position, except if the position is already zero. At that point, the position becomes indeterminate. The behavior of pushing back immediately after a .code seek-stream positioning operation is unspecified. .coNP Functions @, put-string @, put-line @ put-char and @ put-byte .synb .mets (put-string < string <> [ stream ]) .mets (put-line >> [ string <> [ stream ]]) .mets (put-char < char <> [ stream ]) .mets (put-byte < byte <> [ stream ]) .syne .desc These functions perform output on an output stream. The .meta stream argument must be an output stream which supports the given operation. If it is omitted, then .code *stdout* is used. The .code put-char function writes a character given by .code char to a stream. If the stream is based on bytes, then the character is encoded into UTF-8 and multiple bytes are written. 
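As a small sketch of the character-oriented output functions, the following
collects output in a string stream by means of the
.code with-out-string-stream
macro, so that the combined effect of
.codn put-char ,
.code put-string
and
.code put-line
is visible in the returned string:

.verb
  (with-out-string-stream (s)
    (put-char #\ea s)
    (put-string "bc" s)
    (put-line "d" s))
  -> "abcd\en"
.brev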
Streams which support .code put-char also support .code put-line and .codn put-string . The .code put-string function writes the characters of a string out to the stream as if by multiple calls to .codn put-char . The .meta string argument may be a symbol, in which case its name is used as the string. The .code put-line function is like .codn put-string , but also writes an additional newline character. The string is optional in .codn put-line , and defaults to the empty string. The .code put-byte function writes a raw byte given by the .meta byte argument to .metn stream , if .meta stream supports a byte write operation. The byte value is specified as an integer value in the range 0 to 255. All these functions return .codn t . On failure, they do not return, but throw exceptions of type .codn file-error . .coNP Functions @ put-strings and @ put-lines .synb .mets (put-strings < sequence <> [ stream ]) .mets (put-lines < sequence <> [ stream ]) .syne .desc These functions assume .meta sequence to be a sequence of strings, or of symbols, or a mixture thereof. These strings are sent to the stream. The .meta stream argument must be an output stream. If it is omitted, then .code *stdout* is used. The .code put-strings function iterates over .meta sequence and writes each element to the stream as if using the .code put-string function. The .code put-lines function iterates over .meta sequence and writes each element to the stream as if using the .code put-line function. Both functions return .codn t . .coNP Function @ flush-stream .synb .mets (flush-stream <> [ stream ]) .syne .desc The .code flush-stream function is meaningful for output streams which accumulate data which is passed on to the operating system in larger transfer units. Calling .code flush-stream causes all accumulated data inside .meta stream to be passed to the operating system. If called on streams for which this function is not meaningful, it does nothing, and returns .codn nil .
If .meta stream is omitted, the current value of .code *stdout* is used. .coNP Function @ seek-stream .synb .mets (seek-stream < stream < offset << whence ) .syne .desc The .code seek-stream function is meaningful for file streams. It changes the current read/write position within .metn stream . It can also be used to determine the current position: see the notes about the return value below. The .meta offset argument is a positive or negative integer which gives a displacement that is measured from the point identified by the .meta whence argument. Note that for text files, there isn't necessarily a 1:1 correspondence between characters and positions due to line-ending conversions and conversions to and from UTF-8. The .meta whence argument is one of three keywords: .codn :from-start , .code :from-current and .codn :from-end . These denote the start of the file, the current position in the file and the end of the file. If .meta offset is zero, and .meta whence is .codn :from-current , then .code seek-stream returns the current absolute position within the stream, if it can successfully obtain it. Otherwise, it returns .code t if it is successful. If a character has been successfully put back into a text stream with .code unget-char and is still pending, then the position value is unspecified. If a byte has been put back into a binary stream with .codn unget-byte , and the previous position wasn't zero, then the position is decremented by one. On failure, it throws an exception of type .codn stream-error . .coNP Function @ truncate-stream .synb .mets (truncate-stream < stream <> [ length ]) .syne .desc The .code truncate-stream causes the length of the underlying file associated with .meta stream to be set to .meta length bytes. The stream must be a file stream, and must be open for writing. 
If .meta length is omitted, then it defaults to the current position, retrieved as if by invoking .code seek-stream with an .meta offset argument of zero and a .meta whence argument of .codn :from-current . Hence, after the .code truncate-stream operation, that position is one byte past the end of the file. .coNP Functions @ stream-get-prop and @ stream-set-prop .synb .mets (stream-get-prop < stream << indicator ) .mets (stream-set-prop < stream < indicator << value ) .syne .desc These functions get and set properties on a stream. Only certain properties are meaningful with certain kinds of streams, and the meaning depends on the stream. If two or more stream types support a property of the same name, it is expected that the property has the same or similar meaning for all of those stream types, to the maximum extent that similarity is possible. The .code stream-set-prop function sets a property on a stream. The .meta indicator argument is a symbol, usually a keyword symbol, denoting the property, and .meta value is the property value. If the stream understands and accepts the property, the function returns .codn t . Otherwise it returns .codn nil . The .code stream-get-prop function inquires about the value of a property on a stream. If the stream understands the property, then it returns its current value. If the stream does not understand a property, .code nil is returned, which is also returned if the property exists, but its value happens to be .codn nil . The .code :name property is widely supported by streams of various types. It associates the stream with a name. This property is not always modifiable. File, process and stream socket I/O streams have a .code :fd property which can be accessed, but not modified. It retrieves the same value as the .code fileno function. The "real time" property supported by these streams, connected with the .code real-time-stream-p function, also appears as the .code :real-time property.
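The relationship between the
.code :fd
property and
.code fileno
can be sketched as follows, under the assumption that
.code *stdin*
is an ordinary file stream. The
.code :no-such-property
indicator is a hypothetical name, used only to show the
.code nil
result for a property which the stream does not understand:

.verb
  (eql (stream-get-prop *stdin* :fd) (fileno *stdin*)) -> t
  (stream-get-prop *stdin* :no-such-property) -> nil
.brev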
I/O streams also have a property called .code :byte-oriented which, if set, suppresses the decoding of UTF-8 on character input. Rather, each byte of the file corresponds directly to one character. Bytes in the range 1 to 255 correspond to the character code points U+0001 to U+00FF. Byte value 0 is mapped to the code point U+DC00. The logging priority of the .code *stdlog* syslog stream is controlled by the .code :prio property. If .meta stream is a catenated stream (see the function .codn make-catenated-stream ) then these functions transparently operate on the current head stream of the catenation. .coNP Functions @ make-catenated-stream and @ cat-streams .synb .mets (make-catenated-stream << stream *) .mets (cat-streams << stream-list ) .syne .desc The .code make-catenated-stream function takes zero or more arguments which are input streams of the same type, and combines them into a single virtual stream called a catenated stream. The .code cat-streams function takes a single list of input streams of the same type, and similarly combines them into a catenated stream. A catenated stream does not support seeking operations or output, regardless of the capabilities of the streams in the list. If the stream list is not empty, then the leftmost element of the list is called the head stream. The .codn get-char , .codn get-byte , .codn get-line , .code unget-char and .code unget-byte functions delegate to the corresponding operations on the head stream, if it exists. If the stream list is empty, they return .code nil to the caller. If the .codn get-char , .code get-byte or .code get-line operation on the head stream yields .codn nil , and there are more streams in the list, then the head stream is closed and removed from the list, and the next stream becomes the head stream. The operation is then tried again.
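The hand-over from one head stream to the next can be sketched with two
string input streams, which are used here only because they need no
external files. The third
.code get-char
call finds the first stream exhausted, moves on to the second stream, and
returns its first character; the fourth call returns
.code nil
because no data remains:

.verb
  (let ((s (make-catenated-stream (make-string-input-stream "ab")
                                  (make-string-input-stream "c"))))
    (list (get-char s) (get-char s) (get-char s) (get-char s)))
  -> (#\ea #\eb #\ec nil)
.brev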
If any of these operations fail on the last stream, it is not removed from the list, so that a stream remains in place which can take the .code unget-char or .code unget-byte operations. In this manner, the catenated streams appear to be a single stream. Note that the operations can fail due to being unsupported. It is the caller's responsibility to make sure all of the streams in the list are compatible with the intended operations. If the stream list is empty then an empty catenated stream is produced. Input operations on this stream yield .codn nil , and the .code unget-char and .code unget-byte operations throw an exception. .coNP Function @ catenated-stream-p .synb .mets (catenated-stream-p << obj ) .syne .desc The .code catenated-stream-p function returns .code t if .meta obj is a catenated stream. Otherwise it returns .codn nil . .coNP Function @ catenated-stream-push .synb .mets (catenated-stream-push < new-stream << cat-stream ) .syne .desc The .code catenated-stream-push function pushes .meta new-stream to the front of the stream list inside .metn cat-stream . If an .code unget-byte or .code unget-char operation was successfully performed on .meta cat-stream prior to a call to .codn catenated-stream-push , those operations were forwarded to the front stream. If those bytes or characters are still pending, they are pending inside that stream, and thus are logically preceded by the contents of .metn new-stream . .coNP Functions @ open-files and @ open-files* .synb .mets (open-files < path-list >> [ alternative-stream <> [ mode-string ]]) .mets (open-files* < path-list >> [ alternative-stream <> [ mode-string ]]) .syne .desc The .code open-files and .code open-files* functions create a list of streams by invoking the .code open-file function on each element of .metn path-list . By default, the mode string .str r is passed to .codn open-file ; if the .meta mode-string argument is specified, it overrides this default.
In that situation, the specified mode should permit reading. These streams are turned into a catenated stream as if they were the arguments of a call to .codn make-catenated-stream . The effect is that multiple files appear to be catenated together into a single input stream. If the optional .meta alternative-stream argument is supplied, then if .meta path-list is empty, .meta alternative-stream is returned instead of an empty catenated stream. The difference between .code open-files and .code open-files* is that .code open-files creates all of the streams up-front. So if any of the paths cannot be opened, the operation throws. The .code open-files* variant is lazy: it creates a lazy list of streams out of the path list. The streams are opened as needed: before the second stream is opened, the program has to read the first stream to the end, and so on. .TP* Example: Collect lines from all files that are given as arguments on the command line. If there are no files, then read from standard input: .verb @(next (open-files *args* *stdin*)) @(collect) @line @(end) .brev .coNP Function @ path-equal .synb .mets (path-equal < left-path << right-path ) .syne .desc The .code path-equal function determines whether the two paths .meta left-path and .meta right-path are equal under a certain definition of equivalence, whose requirements are given below. The function returns .code t if the paths are equal, otherwise .codn nil . If .meta left-path and .meta right-path are strings which are identical under the .code equal function, then they are considered equal paths. Otherwise, the two paths are equal if the relative path from .meta left-path to .meta right-path is .str . (dot), as would be determined by the .code rel-path function, if it were applied to .meta left-path and .meta right-path as its arguments. If .code rel-path would return the dot path, then the two paths are equal.
If .code rel-path would return any other value, or throw an exception, then the paths are unequal. .TP* Examples: .verb ;; simple case (path-equal "a" "a") -> t (path-equal "a" "b") -> nil ;; trailing slashes don't matter (path-equal "a" "a/") -> t (path-equal "a/" "a/") -> t ;; .. components resolved: (path-equal "a/b/../c" "a/c") -> t ;; . components resolved: (path-equal "a" "a/././.") -> t (path-equal "a/." "a/././.") -> t ;; (On Microsoft Windows) ;; different drive: (path-equal "c:/a" "d:/b/../a") -> nil ;; same drive: (path-equal "c:/a" "c:/b/../a") -> t .brev .coNP Functions @ abs-path-p and @ portable-abs-path-p .synb .mets (abs-path-p << path ) .mets (portable-abs-path-p << path ) .syne .desc The .code abs-path-p and .code portable-abs-path-p functions test whether the argument .meta path is an absolute path, returning a .code t or .code nil indication. The .code portable-abs-path-p function behaves in the same manner on all platforms, implementing a platform-agnostic definition of .IR "absolute path" , as follows. An absolute path is a string which either begins with a slash or backslash character, or which begins with an alphanumeric word, followed by a colon, followed by a slash or backslash. The empty string isn't an absolute path. Examples of absolute paths under .codn portable-abs-path-p : .verb /etc c:/tmp ftp://user@server disk0:/home Z:\eUsers .brev Examples of strings which are not absolute paths: .verb . abc foo:bar/x $:\eabc .brev The .code abs-path-p function is similar to .code portable-abs-path-p except that it reports false for paths which are not absolute paths according to the host platform.
The following paths are not absolute on POSIX platforms: .verb c:/tmp ftp://user@server disk0:/home Z:\eUsers .brev .coNP Function @ pure-rel-path-p .synb .mets (pure-rel-path-p << path ) .syne .desc The .code pure-rel-path-p function tests whether the string .meta path represents a .IR "pure relative path" , which is defined as a path which isn't absolute according to .codn abs-path-p , which isn't the string .str . (single period), which doesn't begin with a period followed by a slash or backslash, and which doesn't begin with an alphanumeric word terminated by a colon. The empty string is a pure relative path. Other examples of pure relative paths: .verb abc.d .tmp/bar 1234 x $:/xyz .brev Examples of strings which are not pure relative paths: .verb . / /etc ./abc .\e foo: $:\eabc .brev .coNP Functions @ dir-name and @ base-name .synb .mets (dir-name << path ) .mets (base-name < path <> [ suffix ]) .syne .desc The .code dir-name and .code base-name functions calculate, respectively, the directory part and base name part of a pathname. The calculation is performed in a platform-dependent way, using the characters in the variable .code path-sep-chars as path component separators. Both functions first remove from any further consideration all superfluous trailing occurrences of the directory separator characters from .metn path . Thus input such as .str "a////" is reduced to just .strn "a" , and .str "///" is reduced to .strn "/" . The resulting trimmed path is the .IR "effective path" . If the effective path is an empty string, then .code dir-name returns .str "." and .code base-name returns the empty string. If the effective path is not empty, and contains no path separator characters, then .code dir-name returns .str "." and .code base-name returns the effective path. Otherwise, the effective path is divided into two parts: the .I "raw directory prefix" and the remainder.
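The two simple cases just described, in which the effective path is empty
or contains no separator, can be sketched as follows. The sketch assumes
that the forward slash is among the characters in
.codn path-sep-chars :

.verb
  (dir-name "") -> "."
  (dir-name "a") -> "."
  (dir-name "a/") -> "."
  (base-name "a/") -> "a"
.brev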
The raw directory prefix is the maximally long prefix of the effective path which ends in a separator character. The .code dir-name function returns the raw directory prefix, if that prefix consists of nothing but a single directory separator character. Otherwise it returns the raw directory prefix, with the trailing path separator removed. The .code base-name function returns the remaining part of the effective path, after the raw directory prefix. If the .meta suffix argument is given to .codn base-name , it specifies a proper suffix to be removed from the returned base name. First, the base name is calculated according to the foregoing rules. Then, if .meta suffix matches a trailing portion of the base name, but not the entire base name, it is removed from the base name. The .meta suffix parameter may be given a .code nil argument, which is treated exactly as if it were absent. Note: this requirement allows for the following idiom to work correctly even in cases when .code p has no suffix: .verb ;; calculate base name of p with short suffix removed (base-name p (short-suffix p)) ;; calculate base name of p with long suffix removed (base-name p (long-suffix p)) .brev .TP* Examples: .verb (base-name "") -> "" (base-name "/") -> "/" (base-name ".") -> "." (base-name "./") -> "." (base-name "a") -> "a" (base-name "/a") -> "a" (base-name "/a/") -> "a" (base-name "/a/b") -> "b" (base-name "/a/b/") -> "b" (base-name "/a/b///") -> "b" ;; with suffix (base-name "" "") -> "" (base-name "/" "/") -> "/" (base-name "/" "") -> "/" (base-name "." ".") -> "." (base-name "." "") -> "." (base-name "./" "/") -> "."
(base-name "a" "a") -> "a" (base-name "a" "") -> "a" (base-name "a.b" ".b") -> "a" (base-name "a.b/" ".b") -> "a" (base-name "a.b/" ".b/") -> "a.b" (base-name "a.b/" "a.b") -> "a.b" .brev .coNP Functions @ long-suffix and @ short-suffix .synb .mets (long-suffix < path <> [ alt ]) .mets (short-suffix < path <> [ alt ]) .syne .desc The .code long-suffix and .code short-suffix functions calculate the .I "long suffix" and .I "short suffix" of .metn path , which must be a string. If .meta path does not contain any occurrences of the character .code . (period) in the role of a suffix delimiter, then .meta path does not have a suffix. In this situation, both functions return the .meta alt argument, which defaults to .code nil if it is omitted. What it means for .meta path to have a suffix delimiter is that the .code . character occurs somewhere in the last component of .metn path , other than as the first character of that component. What constitutes the last component is specified in more detail below. If a suffix delimiter is present, then the long or short suffix is the substring of .meta path which includes the delimiting period and all characters which follow, except that if .meta path ends in a sequence of one or more path separator characters, those characters are omitted from the returned suffix. If multiple periods occur in the last component of the path, the delimiter for the long suffix is the leftmost period and the delimiter for the short suffix is the rightmost period. If the delimiting period is the rightmost character of .metn path , or occurs immediately before a trailing path separator, then the suffix delimited by that period is the period itself. If .meta path contains only one suffix delimiter, then its long and short suffix coincide. For the purpose of identifying the last component of .metn path , if .meta path ends in a sequence of one or more path-separator characters, then those characters are removed from consideration.
If the remaining string contains path-separator characters, then the last component consists of that portion of it which follows the rightmost path-separator character. Otherwise, the last component is the entire string. The suffix, if present, is identified and extracted from this last component. .TP* Examples: .verb (short-suffix "") -> nil (short-suffix ".") -> nil (short-suffix "abc") -> nil (short-suffix ".abc") -> nil (short-suffix "/.abc") -> nil (short-suffix "abc" "") -> "" (short-suffix "abc.") -> "." (short-suffix "abc.tar") -> ".tar" (short-suffix "abc.tar///") -> ".tar" (short-suffix "abc.tar.gz") -> ".gz" (short-suffix "abc.tar.gz/") -> ".gz" (short-suffix "x.y.z/abc.tar.gz/") -> ".gz" (short-suffix "x.y.z/abc.tar.gz//") -> nil (long-suffix "") -> nil (long-suffix ".") -> nil (long-suffix "abc") -> nil (long-suffix ".abc") -> nil (long-suffix "/.abc") -> nil (long-suffix "abc.") -> "." (long-suffix "abc.tar") -> ".tar" (long-suffix "abc.tar///") -> ".tar" (long-suffix "abc.tar.gz") -> ".tar.gz" (long-suffix "abc.tar.gz/") -> ".tar.gz" (long-suffix "x.y.z/abc.tar.gz/") -> ".tar.gz" .brev .coNP Functions @ trim-long-suffix and @ trim-short-suffix .synb .mets (trim-long-suffix << path ) .mets (trim-short-suffix << path ) .syne .desc The .code trim-long-suffix and .code trim-short-suffix functions calculate, respectively, the .I "long suffix" and .I "short suffix" of the string argument .metn path , and return a path with that suffix removed. The .code trim-long-suffix and .code trim-short-suffix functions calculate the suffix in exactly the same manner as .code long-suffix and .codn short-suffix , respectively. If .meta path is found not to contain a suffix, then it is returned. If .meta path contains a suffix, then a new string is returned from which the suffix is deleted. If the suffix is followed by one or more path separator characters, these are preserved in the return value.
.TP* Examples: .verb (trim-short-suffix "") -> "" (trim-short-suffix "a") -> "a" (trim-short-suffix ".") -> "." (trim-short-suffix ".a") -> ".a" (trim-short-suffix "a.") -> "a" (trim-short-suffix "a.b") -> "a" (trim-short-suffix "a.b.c") -> "a.b" (trim-short-suffix "a./") -> "a/" (trim-short-suffix "a.b/") -> "a/" (trim-short-suffix "a.b.c/") -> "a.b/" (trim-long-suffix "a.b.c") -> "a" (trim-long-suffix "a.b.c/") -> "a/" (trim-long-suffix "a.b.c///") -> "a///" .brev .coNP Function @ add-suffix .synb .mets (add-suffix < path << suffix ) .syne .desc The .code add-suffix function combines the string arguments .meta path and .meta suffix in a way which harmonizes with the .code long-suffix and .code short-suffix functions. If .meta path does not end in a path separator character, that category being defined by the .code path-sep-chars variable, then .code add-suffix returns the trivial string catenation of .meta path and .metn suffix . Otherwise, .code add-suffix returns a string formed by inserting .meta suffix into .meta path just prior to the sequence of trailing path separator characters. The returned string is a catenation of that portion of .meta path which excludes the sequence of trailing path separators, followed by .metn suffix , followed by the sequence of trailing path separators. A path separator which occurs as a part of syntax that indicates an absolute pathname is not considered a trailing separator. A path which begins with a separator is absolute. Other platform-specific path patterns may constitute an absolute pathname. Note: in cases when .meta suffix does not begin with a period, or is inserted in such a way that it is the start of a path component, then the functions .code long-suffix and .code short-suffix will not recognize .meta suffix in the resulting path. .TP* Examples: .verb (add-suffix "" "") -> "" (add-suffix "" "a") -> "a" (add-suffix "." "a") -> ".a" (add-suffix "." 
".a") -> "..a" (add-suffix "/" ".b") -> "/.b" (add-suffix "//" ".b") -> "/.b/" (add-suffix "//" "b") -> "/b/" (add-suffix "a" "") -> "a" (add-suffix "a" ".b") -> "a.b" (add-suffix "a/" ".b") -> "a.b/" (add-suffix "a//" ".b") -> "a.b//" ;; On MS Windows (add-suffix "c://" "x") -> "c:/x/" (add-suffix "host://" "x") -> "host://x" (add-suffix "host:///" "x") -> "host://x/" .brev .coNP Function @ path-cat .synb .mets (path-cat >> [ dir-path <> { rel-path }*]) .syne .desc The .code path-cat function joins together zero or more paths, returning the combined path. All arguments are strings. The following description defines the behavior when .code path-cat is given exactly two arguments, which are interpreted as .meta dir-path and .metn rel-path . A description of the variable-argument semantics follows. Firstly, the two-argument .code path-cat is related to the functions .code dir-name and .code base-name in the following way: if .meta p is some path denoting an object in the file system, then .code "(path-cat (dir-name p) (base-name p))" produces a path .meta p* which denotes the same object. The paths .meta p and .meta p* might not be equivalent strings. The .code path-cat function ensures that paths are joined without superfluous path-separator characters, regardless of whether .meta dir-path ends in a separator. If a separator must be added, the character .code / (forward slash) is always used, even on platforms where .code \e (backslash) is also a pathname separator, and even if either argument includes backslashes. The .code path-cat function eliminates trivial occurrences of the .code . (dot) path component. It preserves trailing separators in the following way: if .meta rel-path ends in a path-separator character, then the returned string shall end in that character; and if .meta rel-path vanishes entirely because it is equivalent to the dot, then the returned string is .meta dir-name itself. 
If .meta dir-path is an empty string, then .meta rel-path is returned, and vice versa. The variadic semantics of .code path-cat are as follows. If .code path-cat is called with no arguments at all, it returns the path .str . (period) denoting the relative path of the current directory. If .code path-cat is called with one argument, that argument is returned. If .code path-cat is called with three or more arguments, a left-associative reduction takes place using the two-argument semantics. The first two arguments are catenated into a single path, which is then catenated with the third argument, and so on. The above semantics imply that the following equivalence holds: .verb [reduce-left path-cat list] <--> [apply path-cat list] .brev .TP* Examples: .verb (path-cat "" "") --> "" (path-cat "" ".") --> "." (path-cat "." "") --> "." (path-cat "." ".") --> "." (path-cat "abc" ".") --> "abc" (path-cat "." "abc") --> "abc" (path-cat "./" ".") --> "./" (path-cat "." "./") --> "./" (path-cat "abc/" ".") --> "abc/" (path-cat "./" "abc") --> "abc" (path-cat "/" ".") --> "/" (path-cat "/" "abc") --> "/abc" (path-cat "ab/cd" "ef") --> "ab/cd/ef" (path-cat "a" "b" "c") --> "a/b/c" (path-cat "a" "b" "" "c" "/") --> "a/b/c/" .brev .coNP Function @ trim-path-seps .synb .mets (trim-path-seps << path ) .syne .desc The .code trim-path-seps function removes a consecutive run of one or more trailing separators from the end of the input string .metn path . The function treats the .meta path in a system-independent way: both the backslash and the forward slash are considered separators. The function preserves any necessary trailing separators, such as that of the absolute path .str / or the trailing slashes in volume absolute paths such as .strn c:/ .
.TP* Examples: .verb (trim-path-seps "") -> "" (trim-path-seps "/") -> "/" (trim-path-seps "//") -> "/" (trim-path-seps "a///") -> "a" (trim-path-seps "/a///") -> "/a" (trim-path-seps "\e\e") -> "\e\e" (trim-path-seps "\e\e\e\e") -> "\e\e" (trim-path-seps "\e\ea\e\e\e\e\e\e") -> "\e\ea" (trim-path-seps "c:/") -> "c:/" (trim-path-seps "c://") -> "c:/" (trim-path-seps "c:///") -> "c:/" (trim-path-seps "c:a///") -> "c:a" ;; not a volume prefix: (trim-path-seps "/c:/a///") -> "/c:/a" (trim-path-seps "/c://///") -> "/c:" (trim-path-seps "c:\e\e") -> "c:\e\e" (trim-path-seps "c:\e\e\e\e") -> "c:\e\e" (trim-path-seps "c:a\e\e\e\e\e\e") -> "c:a" ;; mixtures (trim-path-seps "c:/\e\e/\e\e/") -> "c:/" .brev .coNP Function @ rel-path .synb .mets (rel-path < from-path << to-path ) .syne .desc The .code rel-path function calculates the relative path between two file system locations indicated by string arguments .meta from-path and .metn to-path . The .meta from-path is assumed to be a directory. The return value is a relative path which could be used to access an object named by .meta to-path if .meta from-path were the current working directory. The calculation performed by .code rel-path is a pure calculation; it has no interaction with the host operating system. No component of either input path has to exist. Symbolic links are not resolved. This can lead to incorrect results, as noted below. Either both inputs must be absolute paths, or both must be relative; otherwise an error exception is thrown. On the MS Windows platform, if one input specifies a drive letter prefix, the other input must specify the same prefix, or else an error exception is thrown; there is no relative path between locations on different drives. The behavior is unspecified if the arguments are two UNC paths indicating different hosts.
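
As an illustration of the requirement that the inputs agree in kind, the
following sketch contrasts a valid call with one that mixes an absolute
and a relative path. It assumes a POSIX platform. The
.code ignerr
macro, which is not described in this section, is used here only to
convert the error exception into a
.code nil
return value.

.verb
  ;; both paths are relative: a result is calculated
  (rel-path "abc/def" "abc/ghi")     -> "../ghi"

  ;; absolute mixed with relative: the error exception
  ;; is intercepted by ignerr, which then yields nil
  (ignerr (rel-path "/abc" "abc"))   -> nil
.brev
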
The .code rel-path function first splits both paths into components according to the platform-specific pathname separators indicated by the .code path-sep-chars variable. Next, it eliminates all empty components, .code . (dot) components and .code .. (dotdot) components from both separated paths. All dot components are removed, and any component which is neither dot nor dotdot is removed if it is followed by dotdot. Then, a common prefix is determined between the two component sequences, and a relative component sequence is calculated from them as follows: If the component sequence corresponding to .meta from-path is longer than the common prefix, then the excess part of that sequence after the common prefix must not contain any .code .. (dotdot) components, or else an error exception is thrown. Otherwise, every component in this excess part of the .meta from-path component sequence is converted to .code .. in order to express the relative navigation from .meta from-path up to the directory indicated by the common prefix. Next, if the component sequence corresponding to .meta to-path has any components in excess of the common prefix, those excess components are appended to this possibly empty sequence of dotdot components, in order to express navigation from the common prefix down to the .meta to-path object. This excess sequence coming from .meta to-path may include .code .. components. Finally, if the resulting sequence is nonempty, it is joined together using the leftmost path separator character indicated in .code path-sep-chars and returned. If it is empty, then the string .str . is returned. Note: because the function doesn't access the file system and in particular does not resolve symbolic links or other indirection devices, the result may be incorrect. For example, suppose that the current working directory contains a symbolic link called .code up which expands to .code .. (dotdot). 
The expression .code "(rel-path \(dqup/a\(dq \(dq../a\(dq)" is oblivious to this, and calculates .strn ../../../a . The correct result in light of .code up being an alias for .code .. calls for a return value of .strn . . The exact problem is that any symbolic links in the excess part of .meta from-path after the common prefix are assumed by .code rel-path to be simple subdirectory names, which can be navigated in reverse using a .code .. link. This reverse navigation assumption is false for any symbolic link which does not act as an alias for a subdirectory in the same location. In situations where this possibility exists, it is recommended to use the .code realpath function to canonicalize the input paths. The following is an example of the algorithm being applied to arguments .str a/d/../b/x/y/ and .strn a/b/w , where the assumption is that this is on a POSIX platform where the leftmost character in .code path-sep-chars is .codn / : Firstly, both inputs are converted to component sequences, those respectively being: .verb ("a" "d" ".." "b" "x" "y" "") ("a" "b" "w") .brev Next, the empty component is removed, and the .code .. component is removed together with the preceding .code d component which it cancels: .verb ("a" "b" "x" "y") ("a" "b" "w") .brev At this point, the common prefix is identified: .verb ("a" "b") .brev The .meta from-path has two components in excess of the prefix: .verb ("x" "y") .brev which are each replaced by .strn .. . The .meta to-path has one component in excess of the common prefix, .strn w . These two sequences are appended together: .verb (".." ".." "w") .brev The resulting path is then formed by joining these with the separator character, resulting in the relative path .strn "../../w" . .TP* Examples: .verb ;; mixtures of relative and absolute (rel-path "/abc" "abc") -> ;; error (rel-path "abc" "/abc") -> ;; error ;; dotdot in excess part of from path: (rel-path "../../x" "y") -> ;; error (rel-path "." ".") -> "." (rel-path "./abc" "abc") -> "." (rel-path "abc" "./abc") -> "." (rel-path "./abc" "./abc") -> "."
(rel-path "abc" "abc") -> "." (rel-path "." "abc") -> "abc" (rel-path "abc/def" "abc/ghi") -> "../ghi" (rel-path "xyz/../abc/def" "abc/ghi") -> "../ghi" (rel-path "abc" "d/e/f/g/h") -> "../d/e/f/g/h" (rel-path "abc" "d/e/../g/h") -> "../d/g/h" (rel-path "d/e/../g/h" ".") -> "../../.." (rel-path "d/e/../g/h" "a/b") -> "../../../a/b" (rel-path "x" "../../../y") -> "../../../../y" (rel-path "x///" "x") -> "." (rel-path "x" "x///") -> "." (rel-path "///x" "/x") -> "." .brev .coNP Variable @ path-sep-chars .desc The .code path-sep-chars variable holds a string consisting of the characters which the underlying operating system recognizes as pathname separators. If a particular one of these characters is considered preferred on the host platform, that character is placed in the first position of .codn path-sep-chars . Altering the value of this variable has no effect on any \*(TL library function. .coNP Functions @ read and @ iread .synb .mets (read >> [ source .mets \ \ \ \ \ \ >> [ err-stream >> [ err-retval >> [ name <> [ lineno ]]]]]) .mets (iread >> [ source .mets \ \ \ \ \ \ \ >> [ err-stream >> [ err-retval >> [ name <> [ lineno ]]]]]) .syne .desc The .code read function converts text denoting \*(TL structure into the corresponding data structure. The .meta source argument may be either a character string, or a stream. If it is omitted, then .code *stdin* is used as the stream. The .meta source must provide the text representation of one complete \*(TL object. If .meta source is a string and the function being applied is .codn read , then if the object is followed by any non-whitespace material, the situation is treated as a syntax error, even if that material is a syntactically valid additional object. The .code iread function ignores this situation. Other differences between .code read and .code iread are given below. Multiple calls to .code read on the same stream will extract successive objects from the stream.
To parse successive objects from a string, it is necessary to convert it to a string stream. The optional .meta err-stream argument can be used to specify a stream to which diagnostics of parse errors are sent. If absent, the diagnostics are suppressed. The optional .meta name argument can be used to specify the file name which is used for reporting errors. If this argument is missing, the name is taken from the name property of the .meta source argument if it is a stream, or else the word .code string is used as the name if .meta source is a string. The optional .code lineno argument, defaulting to 1, specifies the starting line number. This, like the .meta name argument, is used for reporting errors. If there are no parse errors, the function returns the parsed data structure. If there are parse errors, and the .meta err-retval parameter is present, its value is returned. If the .meta err-retval parameter is not present, then an exception of type .code syntax-error is thrown. The .code iread function ("interactive read") is similar to .code read except that it parses a modified version of the syntax. The modified syntax does not support the application of the dot and dotdot operators on a top-level expression. For instance, if the input is .code a.b or .code "a .. b" then .code iread will only read the .code a token whereas .code read will read the entire expression. This modified syntax allows .code iread to return immediately when an expression is recognized, which is the expected behavior if the input is being read from an interactive terminal. By contrast, .code read waits for more input after seeing a complete expression, because of the possibility that the expression will be further extended by means of the dot or dotdot operators. An explicit end-of-input signal must be given from the terminal to terminate the expression. The special variable .code *rec-source-loc* controls whether these functions record source location info similarly to .codn load . 
Note: if these functions are used to scan data which is evaluated as Lisp code, it may be useful to set .code *rec-source-loc* true in order to obtain better diagnostics. However, source location recording incurs a performance and storage penalty. .coNP Function @ read-objects .synb .mets (read-objects >> [ source .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ >> [ err-stream .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ >> [ err-retval >> [ name <> [ lineno ]]]]]) .syne .desc The .code read-objects function has the same argument syntax and semantics as the .code read function, except that rather than reading one object, it reads all the Lisp objects from the source, and returns a list of these objects. If the stream is empty, then .code read-objects returns the empty list .codn nil , whereas the .code read function treats the situation as an error. .coNP Function @ parse-errors .synb .mets (parse-errors << stream ) .syne .desc The .code parse-errors function retrieves information, from a .metn stream , pertaining to the status of the most recent parsing operation performed on that stream: namely, a previous call to .codn read , .code iread or .codn get-json . If the .meta stream object has not been used for parsing, or else the most recent parsing operation did not encounter errors, then .code parse-errors returns .codn nil . If the most recent parsing operation on .meta stream encountered errors, then .code parse-errors returns a positive integer value indicating the error count. Otherwise it returns .codn nil . If a parse operation encounters a syntax error before obtaining any token from the stream, then the error count is zero and .code parse-errors returns .codn nil . Consequently, .code parse-errors may be used after a failed parse operation to distinguish a true syntax error from an end-of-stream condition.
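
The following sketch, which is not part of the reference description,
shows how
.codn read ,
.code read-objects
and
.code parse-errors
might be used together. The input strings and the
.code :failed
value are arbitrary choices, and it is assumed that passing
.code nil
as
.meta err-stream
suppresses diagnostics in the same way as omitting the argument. The
indicated results follow from the descriptions above rather than from a
recorded session.

.verb
  ;; one object from a string
  (read "(a b c)")                  -> (a b c)

  ;; all objects from a string
  (read-objects "(a b) 42 xyz")     -> ((a b) 42 xyz)

  ;; successive objects require a stream
  (let ((s (make-string-input-stream "alpha (1 2) beta")))
    (list (read s) (read s) (read s)))
  -> (alpha (1 2) beta)

  ;; empty input: read fails, yet no syntax error is counted,
  ;; which identifies the failure as an end-of-stream condition
  (let ((s (make-string-input-stream "")))
    (list (read s nil :failed) (parse-errors s)))
  -> (:failed nil)
.brev
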
.coNP Function @ record-adapter .synb .mets (record-adapter < regex >> [ stream <> [ include-match ]]) .syne .desc The .code record-adapter function returns a new stream object which acts as an .I adapter to the existing .metn stream . If an argument is not specified for .metn stream , then the .code *stdin* stream is used. With the exception of .codn get-line , all operations on the returned adapter transparently delegate to the original .meta stream object. When the .code get-line function is used on the adapter, it behaves differently. A string is extracted from .metn stream , and returned. However, the string isn't a line delimited by a newline character, but rather a record delimited by .metn regex . This record is extracted as if by a call to the .code read-until-match function, invoked with the .metn regex , .meta stream and .meta include-match arguments. All behavior which is built on the .code get-line function is affected by the record-delimiting semantics of a record adapter's .code get-line implementation. Notably, the .code get-lines and .code lazy-stream-cons functions return a lazy list of delimited records rather than of lines. .SS* Stream Output Indentation \*(TL streams provide support for establishing hanging indentations in text output. Each stream which supports output has a built-in state variable called indentation mode, and another variable indicating the current indentation amount. When indentation mode is enabled, then prior to the first character of every line, the stream prepends the indentation: space characters equal in number to the current indentation value. This logic is implemented by the .code put-char and .code put-string functions, and all functions based on these. The .code put-byte function does not interact with indentation. The column position tracking will be incorrect if byte and character output are mixed, affecting the placement of indentation.
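
As a rough illustration, the following sketch enables indentation on
.code *stdout*
for the duration of two output operations, using the
.codn set-indent-mode ,
.code set-indent
and
.code indent-data
names described further below in this section. The choice of stream,
the amount of 4 and the output strings are arbitrary. Each line which
begins at column zero is expected to be preceded by four spaces, and
both stream properties are restored afterward by
.codn with-resources .

.verb
  (let ((s *stdout*))
    ;; each binding pairs the change with the form that undoes it
    (with-resources ((old-mode (set-indent-mode s indent-data)
                               (set-indent-mode s old-mode))
                     (old-amt (set-indent s 4)
                              (set-indent s old-amt)))
      (put-line "alpha" s)
      (put-line "beta" s)))
.brev
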
Indentation mode takes on four numeric values, given by the four variables .codn indent-off , .codn indent-data , .code indent-code and .codn indent-foff . As far as stream output is concerned, the code and data modes represented by .code indent-code and .code indent-data behave the same way: both represent the "indentation turned on" state. The difference between them influences the behavior of the .code width-check function. This function isn't used by any lower-level stream output routines. It is used by the object printing functions like .code print and .code pprint to break up long lines. The .code indent-off and .code indent-foff modes are also treated the same way by lower level stream output, indicating "indentation turned off". The modes are distinguished by .code print and .code pprint in the following way: .code indent-off is a "soft" disable which allows these object-printing routines to temporarily turn on indentation while traversing aggregate objects. Whereas the .code indent-foff ("force off") value is a "hard" disable: the object-printing routines will not enable indentation and will not break up long lines. .coNP Variables @, indent-off @, indent-data @ indent-code and @ indent-foff .desc These variables hold integer values representing output stream indentation modes. The value of .code indent-off is zero. .coNP Functions @ get-indent-mode and @ set-indent-mode .synb .mets (get-indent-mode << stream ) .mets (set-indent-mode < stream << new-mode ) .mets (test-set-indent-mode < stream < compare-mode << new-mode ) .syne .desc These functions retrieve and manipulate the stream indent mode. The .code get-indent-mode retrieves the current indent mode of .metn stream . The .code set-indent-mode function sets the indent mode of .meta stream to .meta new-mode and returns the previous mode. Note: it is encouraged to save and restore the indentation mode, and in a way that is exception safe. 
If a block of code sets up indentation on a stream such as .code *stdout* and is terminated by an exception, the indentation will remain in effect and affect subsequent output. The .code with-resources macro or .code unwind-protect operator may be used. .coNP Functions @ test-set-indent-mode and @ test-neq-set-indent-mode .synb .mets (test-set-indent-mode < stream < compare-mode << new-mode ) .mets (test-neq-set-indent-mode < stream < compare-mode << new-mode ) .syne .desc The .code test-set-indent-mode function sets the indent mode of .meta stream to .meta new-mode if and only if its current mode is equal to .metn compare-mode . Whether or not it changes the mode, it returns the previous mode. The .code test-neq-set-indent-mode only differs in that it sets .meta stream to .meta new-mode if and only if the current mode is .B not equal to .metn compare-mode . .coNP Functions @, get-indent @, set-indent @ inc-indent and @ inc-indent-abs .synb .mets (get-indent << stream ) .mets (set-indent < stream << new-indent ) .mets (inc-indent < stream << indent-delta ) .mets (inc-indent-abs < stream << indent-delta ) .syne .desc These functions manipulate the indentation value of the stream. The indentation takes effect the next time a character is output following a newline character. The .code get-indent function retrieves the current indentation amount. The .code set-indent function sets .metn stream 's indentation to the value .meta new-indent and returns the previous value. Negative values are clamped to zero. The .code inc-indent function sets .metn stream 's indentation relative to the current printing column position, and returns the old value. The indentation is calculated by adding .meta indent-delta to the current column position. If a negative indentation results, it is clamped to zero. The .code inc-indent-abs function sets .metn stream 's indentation relative to the current indentation value. 
The indentation is calculated by adding .meta indent-delta to the current indentation amount. If a negative indentation results, it is clamped to zero. .coNP Function @ width-check .synb .mets (width-check < stream << alt-char ) .syne .desc The .code width-check function examines the state of the stream, taking into consideration the current printing column position, the indentation state, the indentation amount and an internal "force break" flag. It makes a decision either to introduce a line break by printing a newline character, or else to print the .meta alt-char character. If a decision is made not to emit a line break, but .meta alt-char is .codn nil , then the function has no effect at all. The return value is .code t if the function has issued a line break, otherwise .codn nil . .coNP Function @ force-break .synb .mets (force-break << stream ) .syne .desc If the .code force-break function is called on a stream, it sets an internal "force break" flag which affects the future behavior of .codn width-check . The .code width-check function examines this flag. If the flag is set, .code width-check clears it, and issues a line break without considering any other conditions. The .metn stream 's .code force-break flag is also cleared whenever a newline character is output. The .code force-break function returns .codn stream . Note: the .code force-break is involved in line breaking decisions. Whenever a list or list-like syntax is being printed, whenever an element of that syntax is broken into multiple lines, a break is forced after that element, in order to avoid output which resembles the following diagonally-creeping pattern: .verb (a b c (d e f g h i) j (k l m n) o) .brev but instead is rendered in a more horizontally compact pattern: .verb (a b c (d e f g h i) j (k l m n) o) .brev When the printer prints .code "(d e f g h i)" it uses the .code width-check function between the elements; that function issues the break between the .code f and .codn g . 
The printer monitors the return value of .codn width-check ; it knows that since one of the calls returned .codn t , the object had been broken into two or more lines. It then calls .code force-break after printing the last element .code i of that object. Then, due to the force flag, the outer recursion of the printer which is printing .code "(a b c ...)" will experience a break when it calls .code width-check before printing .codn j . Custom .code print methods defined on structure objects can take advantage of .code width-check and .code force-break in the same way so that user-defined output integrates with the formatting algorithm. .SS* Stream Output Limiting Streams have two properties which are used by the \*(TL object printer to optionally truncate the output generated by aggregate objects. A stream can specify a maximum length for aggregate objects via the .code set-max-length function. Using the .code set-max-depth function, the maximum depth can also be specified. This feature is useful when diagnostic output is being produced, and the objects involved are so large that the diagnostic output overwhelms the output device or the user, so as to become uninformative. Output limiting also prevents the printer's non-termination on infinite, lazy structures. It is recommended that functions which operate on streams passed in as parameters save and restore these properties, if they need to manipulate them, for instance using .codn with-resources : .verb (defun output-function (arg stream) ;; temporarily impose maximum length and depth (with-resources ((ml (set-max-length stream 42) (set-max-length stream ml)) (md (set-max-depth stream 12) (set-max-depth stream md))) (prinl arg stream) ...)) .brev .coNP Function @ set-max-length .synb .mets (set-max-length < stream << value ) .syne .desc The .code set-max-length function establishes the maximum length for aggregate object printing.
It affects the printing of lists, vectors, hash tables, strings as well as quasiliterals and quasiword list literals (QLLs). The default value is 0 and this value means that no limit is imposed. Otherwise, the value must be a positive integer. When the list, vector or hash-table object being printed has more elements than the maximum length, then elements are printed only up to the maximum count, and then the remaining elements are summarized by printing the .code ... (three dots) character sequence as if it were an additional element. This sequence is an invalid token; it cannot be read as input. When a character string is printed, and the maximum length parameter is nonzero, a maximum character count is determined as follows. Firstly, if the maximum length value is less than 3, it is taken to be 3. Then it is multiplied by 8. Thus, a maximum length of 10 allows 80 characters, whereas a maximum length of 1 allows 24 characters. If a string which exceeds the maximum number of characters is being printed with read-print consistency, as by the .code print function, then only a prefix of the string is printed, limited to the maximum number of characters. Then, the literal syntax is closed using the character sequence .code \e...\(dq (backslash, dot, dot, dot, double quote) whose leading invalid escape sequence .code \e. (backslash, dot) ensures that the truncated object is not readable. If a string which exceeds the maximum number of characters is being printed without read-print consistency, as by the .code pprint function, then only a prefix of the string is printed, limited to the maximum number of characters. Then the character sequence .code ... is emitted. Quasiliterals are treated using a combination of behaviors. The elements of a quasiliteral are literal sequences of text, embedded variables and embedded expressions.
The maximum length specifies both the maximum number of elements in the quasiliteral, and the maximum number of characters in any element which is a sequence of text. When either limit is exceeded, the quasiliteral is immediately terminated with the sequence .code \e...` (escaped dot, dot, dot, backtick). The maximum character limit is applied to the units of text cumulatively, rather than individually. As in the case of string literals, the limit is determined by multiplying the length by 8, and clamping at a minimum value of 24. When a QLL is printed, the space-separated elements of the literal are individually subject to the maximum character limit as if they were independent quasiliterals. Furthermore, the sequence of these elements is subject to the maximum length. If there are more elements in the QLL, then the sequence .code \e...` (escaped dot, dot, dot, backtick) is emitted and thus the QLL ends. The .code set-max-length function returns the previous value. .coNP Function @ set-max-depth .synb .mets (set-max-depth < stream << value ) .syne .desc The .code set-max-depth function establishes the maximum depth for the printing of nested objects. It affects the printing of lists, vectors, hash tables and structures. The default value is 0 and this value means that no limit is imposed. Otherwise, the value must be a positive integer. The depth of an object not enclosed in any object is zero. The depth of the element of an aggregate is one greater than the depth of the aggregate itself. For instance, given the list .code "(1 (2 3))" the list itself has depth 0, the atom .code 1 has depth 1, as does the sublist .codn "(2 3)" , and the .code 2 and .code 3 atoms have depth 2. When an object is printed whose depth exceeds the maximum depth, then the three-dot character sequence .code ... is printed instead of that object. This notation is an invalid token; it cannot be read as input.
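
As an illustration of the depth rules, the following sketch prints the
list from the above example with a maximum depth of 1. The use of
.code *stdout*
and the limit value are arbitrary choices. Under the rules given in
this section, the atoms
.code 2
and
.code 3
have depth 2 and thus exceed the limit, so that the sublist is expected
to appear in abbreviated form rather than in full, while the atom
.code 1
is printed normally.

.verb
  ;; save the previous limit and restore it afterward
  (with-resources ((old (set-max-depth *stdout* 1)
                        (set-max-depth *stdout* old)))
    (prinl '(1 (2 3))))
.brev
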
Additionally, when a list, vector, hash table or structure is printed which itself doesn't exceed the maximum depth, but whose elements do exceed it, then that object is summarized, respectively, as .codn "(...)" , .codn "#(...)" , .code "H#(...)" and .codn "S#(...)" , rather than repeating the .code ... sequence for each of its elements. The .code set-max-depth function returns the previous value. .SS* Coprocesses .coNP Functions @, open-command @ open-process and @ open-subprocess .synb .mets (open-command < system-command <> [ mode-string ]) .mets (open-process < program < mode-string <> [ argument-list ]) .mets (open-subprocess < program < mode-string .mets \ \ >> [ argument-list <> [ function ]]) .syne .desc These functions spawn external programs which execute concurrently with the \*(TX program. They all return a unidirectional stream for communicating with these programs: either an output stream, or an input stream, depending on the contents of .metn mode-string . In .codn open-command , the .meta mode-string argument is optional, defaulting to the value .str r if it is missing. See the .code open-file function for a discussion of modes. The .code open-command function is implemented using POSIX .codn popen . Those elements of .meta mode-string which are applicable to .code popen are passed to it, and hence their semantics follows from their processing in that function. The .code open-command function accepts, via the .meta system-command string parameter, a system command, which is in a system-dependent syntax. On a POSIX system, this would be in the POSIX Shell Command Language. The .code open-process function specifies a program to invoke via the .meta program argument. This is subject to the operating system's search strategy. On POSIX systems, if it is an absolute or relative path, it is treated as such, but if it is a simple base name, then it is subject to searching via the components of the PATH environment variable.
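
The following sketch contrasts the two ways of specifying what to run.
It assumes a POSIX system on which an
.code echo
utility can be found via PATH; the command and its arguments are
arbitrary choices. In both cases the stream is opened for reading, one
line of the program's output is returned, and the stream is closed even
if reading is interrupted by an exception.

.verb
  ;; a shell command line; mode-string defaults to "r"
  (let ((s (open-command "echo hello")))
    (unwind-protect
      (get-line s)
      (close-stream s)))
  -> "hello"

  ;; a program name plus a list of argument strings
  (let ((s (open-process "echo" "r" '("hello" "world"))))
    (unwind-protect
      (get-line s)
      (close-stream s)))
  -> "hello world"
.brev
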
If .code open-process is not able to find .metn program , or is otherwise unable to execute the program, the child process will exit, using the value of the C variable .code errno as its exit status. This value can be retrieved via .codn close-stream . The .meta argument-list argument is a list of strings which specifies additional optional arguments to be passed to the program. The .meta program argument becomes the first argument, and .meta argument-list becomes the second and subsequent arguments. If .meta argument-list is omitted, it defaults to empty. If a coprocess is open for writing .mono .meti >> ( mode-string .onom is specified as .strn w ), then writing on the returned stream feeds input to that program's standard input file descriptor. Indicating the end of input is performed by closing the stream. If a coprocess is open for reading .mono .meti >> ( mode-string .onom is specified as .strn r ), then the program's output can be gathered by reading from the returned stream. When the program finishes output, it will close the stream, which can be detected as normal end of data. The standard input and error file descriptors of an input coprocess are obtained from the streams stored in the .code *stdin* and .code *stderr* special variables, respectively. Similarly, the standard output and error file descriptors of an output coprocess are obtained from the .code *stdout* and .code *stderr* special variables. These variables must contain streams on which the .code fileno function is meaningful, otherwise the operation will fail. What this functionality means is that rebinding the special variables for standard streams has the effect of redirection. For example, the following two expressions achieve the same effect of creating a stream which reads the output of the .code cat program, which reads and produces the contents of the file .codn text-file . 
.verb ;; redirect input by rebinding *stdin* (let ((*stdin* (open-file "text-file"))) (open-command "cat")) ;; redirect input using POSIX shell redirection syntax (open-command "cat < text-file") .brev The following is erroneous: .verb ;; (let ((*stdin* (make-string-input-stream "abc"))) (open-command "cat")) .brev A string input or output stream doesn't have an operating system file descriptor; it cannot be passed to a coprocess. The streams .codn *stdin* , .code *stdout* and .code *stderr* are not synchronized with their underlying file descriptors prior to the execution of a coprocess. It is up to the program to ensure that previous output to .code *stdout* or .code *stderr* is flushed, so that the output of the coprocess isn't reordered with regard to output produced by the program. Similarly, input buffered in .code *stdin* is not available to the coprocess, even though it has not yet been read by the program. The program is responsible for preventing this situation also. If a coprocess terminates abnormally or unsuccessfully, an exception is raised. The .meta mode-string argument of .code open-process supports a special .meta redirection syntax. This syntax specifies I/O redirections which are done in the context of the child process, before the specified program is executed. Instances of the syntax are considered options; if .meta mode-string specifies a mode such as .code r that mode must precede the redirections. Redirections may be mixed with other options. Up to four redirections may be specified using one of two forms: a short form or the long form. If more than four redirections are specified, the .meta mode-string is considered ill-formed. The short form of the syntax consists of three characters: the prefix character .codn > , a single decimal digit indicating the file descriptor to be redirected, and then a third character which is either another digit, or else one of the two characters .code n or .codn x . 
If the third character is a digit, it indicates the target file descriptor of the redirection. For instance .code >21 indicates that file descriptor 2 is to be redirected to 1 (so that material written to standard error goes to the same destination as that written to standard output). If the third character is .codn n , it means that the file descriptor will be redirected to the file .codn /dev/null . For instance, .code >2n indicates that descriptor 2 (standard error) will be redirected to the null device. If the third character is .codn x , it indicates that the file descriptor shall be closed. For instance .code >0x means to close descriptor 0 (standard input). The long form of the syntax allows file descriptors that require more than one decimal digit. It consists of the same prefix character .code > which is immediately followed by an open parenthesis .codn ( . The parenthesis is immediately followed by one or more digits which give the to-be-redirected file descriptor. This is followed by one or more whitespace characters, and then either another multi-digit decimal file descriptor or one of the two letters .code n or .codn x . This second element must be immediately followed by the closing parenthesis .codn ) . Thus .code >21 and .code >2n may be written in the long form, respectively, as .code ">(2 1)" and .codn ">(2 n)" , while .code ">(32 47)" has no short form equivalent. Multiple redirections may be specified, in any mixture of the long and short form. For instance .code "r>21>0n>(27 31)" specifies a process pipe that is open for reading, capturing the output of the process. In that process, standard error is redirected to standard output, standard input is connected to the null device, and descriptor 27 is redirected to descriptor 31. The .meta mode-string argument of .code open-process also supports a special .mono .meti >> ? fdno .onom syntax. 
This syntax specifies an alternative file descriptor in the process to which the returned stream should be connected. By default, when the process is opened for writing, its standard input descriptor 0 is used, and when it is opened for reading, its standard output descriptor 1 is used. This option overrides the choice of descriptor. The .meta fdno portion of the syntax must be a sequence of decimal digits, immediately following the .code ? character. For example, the mode string .str ?2 specifies that the process is to be open for input, such that the input stream captures the standard error output of that process. In this situation, the standard output will not be captured; it remains unredirected. The .code open-subprocess function is a variant of .codn open-process . This function has all the same argument conventions and semantics as .codn open-process , adding the .meta function argument. If this argument isn't .codn nil , then it must specify a function which can be called with no arguments. This function is called in the child process after any redirections are established, just before the program specified by the .meta program argument is executed. Moreover, the .code open-subprocess function allows .meta program to be specified as .code nil in which case .meta function must be specified. When .meta function returns, the child process terminates as if by a call to .code exit* with an argument of zero. .coNP Functions @, map-command-lines @ map-command-str and @ map-command-buf .synb .mets (map-command-lines < cmd < lines <> [ mode-opts ]) .mets (map-command-str < cmd < str <> [ mode-opts ]) .mets (map-command-buf < cmd < buf >> [ pos >> [ bytes <> [ skip ]]]]]) .syne .desc The .codn map-command-lines , .code map-command-str and .code map-command-buf functions filter data through an external command. The .meta cmd parameter has the same meaning as the corresponding parameter in the .code open-command function.
The command is opened with the .str w mode, which is implied. The .meta mode-opts optional argument, if present, specifies extra mode options, which must be compatible with .codn w . The .meta lines argument in .code map-command-lines must be a sequence of strings. These strings are transmitted to the command as newline-terminated lines, as if by the .code put-lines function. Simultaneously, the output of the command is read and divided into lines as if by the .code get-lines function. The entire output of the command is read before the function terminates, and the list of lines is returned. Similarly, the .meta str argument in .code map-command-str is transmitted to the executing command as its complete input, as if by .codn put-string . Simultaneously, the output of the command is captured as a single string, as if using the .code get-string function. That string is returned. The .meta buf argument in .code map-command-buf must be a buffer. The bytes of the buffer are transmitted to the executing command, whose output bytes are gathered into a new buffer object which is returned. The optional .meta pos argument, which defaults to zero, specifies the starting position within .metn buf . Bytes from that position to the end of the buffer are transmitted to the command. The optional .meta bytes argument specifies a limit on the number of bytes of the command's output that should be accumulated into a buffer. The default is unlimited. The optional .meta skip argument, defaulting to zero, specifies how many initial bytes of the command's output must be discarded prior to reading the bytes that are to be accumulated.
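.TP* Example: The following is a minimal sketch of filtering data through external commands with .code map-command-str and .codn map-command-lines . It assumes that the POSIX utilities .code tr and .code sort are available to the shell; the sample strings are purely illustrative.
.verb
  ;; filter one string through tr; the command's entire
  ;; output is returned as a single string
  (map-command-str "tr a-z A-Z" "hello\en")

  ;; feed three lines to sort; the command's output lines
  ;; are returned as a list of strings
  (map-command-lines "sort" '("pear" "apple" "fig"))
.brev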
.coNP Functions @, map-process-lines @ map-process-str and @ map-process-buf .synb .mets (map-process-lines < program < args < lines <> [ mode-opts ]) .mets (map-process-str < program < args < str <> [ mode-opts ]) .mets (map-process-buf < program < args < buf .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ >> [ pos >> [ bytes <> [ skip ]]]]]) .syne .desc The .codn map-process-lines , .code map-process-str and .code map-process-buf functions are counterparts to .codn map-command-lines , .code map-command-str and .code map-command-buf which specify the external process differently. Instead of the .meta cmd parameter, these functions feature a pair of parameters .meta program and .meta args which have the same semantics as the .meta program and .meta argument-list parameters of .codn open-process . Thus the relationship between these groups of three functions is like that between .code open-command and .codn open-process . In all other regards, these functions are identical to their counterparts. .SS* I/O-Related Convenience Functions The functions in this group create a stream, perform an I/O operation on it, and ensure that it is closed, in one convenient operation. They operate on files or command streams. Several other functions in this category exist, which operate with buffers. They are documented in the Buffer Functions subsection under the FOREIGN FUNCTION INTERFACE section. Many of the functions described in this section take an optional .meta mode-opts argument. If this is specified, it must be a string which follows the .meta options portion of the .meta mode-string syntax described for the .code open-file function. This string must not specify the .code mode part. If specified, the .meta mode-opts must be compatible with the implied .metn mode . Functions that write a file have an implied mode of .strn "w" , those which append have an implied mode of .strn "a" , and those which read have an implied mode of .strn "r" .
For instance, a .meta mode-opts value of .str "x" is useful with .code file-put-string but not .codn file-get-string . .coNP Functions @, file-get @ file-get-string and @ file-get-lines .synb .mets (file-get < name <> [ mode-opts ]) .mets (file-get-string < name <> [ mode-opts ]) .mets (file-get-lines < name <> [ mode-opts ]) .syne .desc The .code file-get function opens a text stream over the file indicated by the string argument .meta name for reading, reads the printed representation of a \*(TL object from it, and returns that object, ensuring that the stream is closed. The .code file-get-string function is similar to .code file-get except that it reads the entire file as a text stream and returns its contents in a single character string. The .code file-get-lines function opens a text stream over the file indicated by .meta name and produces a lazy list of strings representing the lines of text of that file as if by a call to the .code get-lines function, and returns that list. The stream remains open until the list is consumed to the end, as indicated in the description of .codn get-lines . .coNP Functions @, file-put @ file-put-string and @ file-put-lines .synb .mets (file-put < name < obj <> [ mode-opts ]) .mets (file-put-string < name < string <> [ mode-opts ]) .mets (file-put-lines < name < list <> [ mode-opts ]) .syne .desc The .codn file-put , .code file-put-string and .code file-put-lines functions open a text stream over the file indicated by the string argument .metn name , write the argument object into the file in their specific manner, and then close the file. If the file doesn't exist, it is created. If it exists, it is truncated to zero length and overwritten. The .code file-put function writes a printed representation of .meta obj using the .code prinl function. The return value is that of .codn prinl . The .code file-put-string function writes .meta string to the stream using the .code put-string function.
The return value is that of .codn put-string . The .code file-put-lines function writes .meta list to the stream using the .code put-lines function. The return value is that of .codn put-lines . .coNP Functions @, file-append @ file-append-string and @ file-append-lines .synb .mets (file-append < name < obj <> [ mode-opts ]) .mets (file-append-string < name < string <> [ mode-opts ]) .mets (file-append-lines < name < list <> [ mode-opts ]) .syne .desc The .codn file-append , .code file-append-string and .code file-append-lines functions open a text stream over the file indicated by the string argument .metn name , write the argument object into the stream in their specific manner, and then close the stream. These functions are close counterparts of, respectively, .codn file-put , .code file-put-string and .codn file-put-lines . They differ in their behavior when the indicated file already exists. Rather than being truncated and overwritten, the file is extended by appending the new data to its end. .coNP Function @ file-get-objects .synb .mets (file-get-objects < name < [ mode-opts <> [ error-stream ]]) .syne .desc The .code file-get-objects function opens an input text stream over the file indicated by the .meta name argument, which is a string. All Lisp objects are read from the stream. Parse errors are reported to .meta error-stream which defaults to .code *stdnull* (error output is discarded). If there is a parse error, the function throws an exception, otherwise the list of parsed objects is returned. .coNP Functions @ file-put-objects and @ file-append-objects .synb .mets (file-put-objects < name < seq <> [ mode-opts ]) .mets (file-append-objects < name < seq <> [ mode-opts ]) .syne .desc The functions .code file-put-objects and .code file-append-objects open a text stream over the file indicated by the string argument .metn name , and write each of the objects contained in sequence .meta seq into the stream as if using the .code prinl function on each individual element of .metn seq .
The .code file-put-objects function opens the file using the .str w mode, which overwrites the file if it exists, whereas .code file-append-objects uses .strn a , which appends to the file. .coNP Functions @, command-get @ command-get-string and @ command-get-lines .synb .mets (command-get < cmd <> [ mode-opts ]) .mets (command-get-string < cmd <> [ mode-opts ]) .mets (command-get-lines < cmd <> [ mode-opts ]) .syne .desc The .code command-get function opens a text stream over an input command pipe created for the command string .metn cmd , as if by the .code open-command function. It reads the printed representation of a \*(TL object from it, and returns that object, ensuring that the stream is closed. The .code command-get-string function is similar to .code command-get except that it reads the entire output of the command as a text stream and returns it as a single character string. The .code command-get-lines function opens a text stream over an input command pipe created for the command string .meta cmd and produces a lazy list of strings representing the lines of text of the command's output as if by a call to the .code get-lines function, and returns that list. The stream remains open until the list is consumed to the end, as indicated in the description of .codn get-lines . .coNP Functions @, command-put @ command-put-string and @ command-put-lines .synb .mets (command-put < cmd < obj <> [ mode-opts ]) .mets (command-put-string < cmd < string <> [ mode-opts ]) .mets (command-put-lines < cmd < list <> [ mode-opts ]) .syne .desc The .codn command-put , .code command-put-string and .code command-put-lines functions open an output text stream over an output command pipe created for the command specified in the string argument .metn cmd , as if by the .code open-command function. They write the argument object into the stream in their specific manner, and then close the stream.
The .code command-put function writes a printed representation of .meta obj using the .code prinl function. The return value is that of .codn prinl . The .code command-put-string function writes .meta string to the stream using the .code put-string function. The return value is that of .codn put-string . The .code command-put-lines function writes .meta list to the stream using the .code put-lines function. The return value is that of .codn put-lines . .SS* Buffer streams A stream type exists which allows .code buf objects to be manipulated through the stream interface. A buffer stream is created using the .code make-buf-stream function, which can either attach the stream to an existing buffer, or create a new buffer that can later be retrieved from the stream using .codn get-buf-from-stream . Operations on the buffer stream treat the underlying buffer much as if it were a memory-based file. Unless the underlying buffer is a "borrowed buffer" referencing the storage belonging to another object (such as the buffer object produced by the .code buf-d FFI type's get semantics) the stream operations can change the buffer's size. Seeking beyond the end of the buffer and then writing one or more bytes extends the buffer's length, filling the newly allocated area with zero bytes. The .code truncate-stream function is supported also. Buffer streams also support the .code :byte-oriented property. Macros .code with-out-buf-stream and .code with-in-buf-stream are provided to simplify the steps involved in using buffer streams in some common scenarios. Note that in spite of the naming of these macros there is only one buffer stream type, which supports bidirectional I/O. .coNP Function @ make-buf-stream .synb .mets (make-buf-stream <> [ buf ]) .syne .desc The .code make-buf-stream function returns a new buffer stream. If the .meta buf argument is supplied, it must be a .code buf object. The stream is then associated with this object.
If the argument is omitted, a buffer of length zero is created and associated with the stream. .coNP Function @ get-buf-from-stream .synb .mets (get-buf-from-stream << buf-stream ) .syne .desc The .code get-buf-from-stream function returns the buffer object associated with .meta buf-stream which must be a buffer stream. .coNP Macros @ with-out-buf-stream and @ with-in-buf-stream .synb .mets (with-out-buf-stream >> ( var <> [ buf-expr ]) .mets \ \ << body-form *) .mets (with-in-buf-stream >> ( var << buf-expr ) .mets \ \ << body-form *) .syne .desc The .code with-out-buf-stream and .code with-in-buf-stream macros both bind variable .meta var to an implicitly created buffer stream, and evaluate zero or more .metn body-form s in the environment where the variable is visible. The .meta buf-expr argument, which may be omitted in the use of the .code with-out-buf-stream macro, must be an expression which evaluates to a .code buf object. The .meta var argument must be a symbol suitable for naming a variable. The implicitly allocated buffer stream is connected to the buffer specified by .meta buf-expr or, when .meta buf-expr is omitted, to a newly allocated buffer. The code generated by the .code with-out-buf-stream macro, if it terminates normally, yields the buffer object as its result value. The .code with-in-buf-stream macro returns the value of the last .metn body-form , or else .code nil if no forms are specified. .TP* Examples: .verb (with-out-buf-stream (*stdout* (make-buf 24)) (put-string "Hello, world!")) -> #b'48656c6c6f2c2077 6f726c6421000000 0000000000000000' (with-out-buf-stream (*stdout*) (put-string "Hello, world!")) -> #b'48656c6c6f2c2077 6f726c6421' .brev .SS* Foreign Pointers .coNP The @ cptr type Objects of type .code cptr are Lisp values which contain a foreign pointer ("C pointer"). This data type is used by the .code dlopen function and is generally useful in conjunction with the Foreign Function Interface (FFI).
An arbitrary pointer emanating from a foreign function can be captured as a .code cptr value, which can be passed back into foreign code. For this purpose, there also exists a matching FFI type called .codn cptr . The .code cptr type supports a symbolic type tag, which defaults to .codn nil . The type tag plays a role in FFI. The FFI .code cptr type supports a tag attribute. When a .code cptr object is converted to a foreign pointer under the control of the FFI type, and that FFI type has a tag other than .codn nil , the object's tag must exactly match that of the FFI type, or the conversion throws an error. In the reverse direction, when a foreign pointer is converted to a .code cptr object under control of the FFI .code cptr type, the object inherits the type tag from the FFI type. Although .code cptr objects are conceptually non-aggregate values, corresponding to pointers, they are de facto aggregates due to their implementation as references to heap objects. When a .code cptr object is passed to a foreign function by pointer, for instance using a parameter of type .codn "(ptr cptr)" , its internal pointer is potentially updated to the new value coming from the function. .coNP Function @ cptr-int .synb .mets (cptr-int < integer <> [ type-symbol ]) .syne .desc The .code cptr-int function converts .meta integer into a pointer in a system-specific way which is consistent with the system's addressing structure. Then it returns that pointer contained in a .code cptr object. The .meta integer argument must be an integer which is in range for a pointer value. Note: this range is wider than the .code fixnum range; a portion of the range of .code bignum integers can denote pointers. An extended range of values is accepted. The entire addressable space may be expressed by non-negative values. A range of negative values also expresses a portion of the address space, in accordance with the platform's concept of a signed integer.
For instance, on a system with 32-bit addresses, the values 0 to 4294967295 express all of the addresses as a pure binary value. Furthermore, the values -2147483648 to -1 also express the upper part of this range, corresponding, respectively, to the addresses 2147483648 to 4294967295. On that platform, values of .meta integer outside of the range -2147483648 to 4294967295 are invalid. The .meta type-symbol argument should be a symbol. If omitted, it defaults to .codn nil . This symbol becomes the .code cptr object's type tag. .coNP Function @ cptr-obj .synb .mets (cptr-obj < object <> [ type-symbol ]) .syne .desc The .code cptr-obj function converts .meta object directly to a .codn cptr . The .meta object argument may be of any type. The raw representation of .meta object is simply stored in a new instance of .code cptr and returned. The .meta type-symbol argument should be a symbol. If omitted, it defaults to .codn nil . This symbol becomes the .code cptr object's type tag. The lifetime of the returned .code cptr object is independent from that of .metn object . If the lifetime of .meta object reaches its end before that of the .codn cptr , the pointer stored inside the .code cptr becomes invalid. .coNP Function @ int-cptr .synb .mets (int-cptr << cptr ) .syne .desc The .code int-cptr function retrieves the pointer value of the .meta cptr object as an integer. If an integer .meta n is in a range convertible to .code cptr type, then the expression .mono .meti (int-cptr (cptr-int << n )) .onom reproduces .metn n . .coNP Function @ cptr-buf .synb .mets (cptr-buf < buf <> [ type-symbol ]) .syne .desc The .code cptr-buf function returns a .code cptr object which holds a pointer to a buffer object's storage area. The .meta buf argument must be of type .codn buf . The .meta type-symbol argument should be a symbol. If omitted, it defaults to .codn nil . This symbol becomes the .code cptr object's type tag.
The lifetime of the returned .code cptr object is independent from that of .metn buf . If the lifetime of .meta buf reaches its end before that of the .codn cptr , the pointer stored inside the .code cptr becomes invalid. .coNP Function @ cptr-cast .synb .mets (cptr-cast < type-symbol << cptr ) .syne .desc The .code cptr-cast function produces a new .code cptr object which has the same pointer as .meta cptr but whose type is given by .metn type-symbol . Casting .code cptr objects with .code cptr-cast circumvents the safety mechanism which .code cptr type tagging provides. .coNP Function @ copy-cptr .synb .mets (copy-cptr << cptr ) .syne .desc The .code copy-cptr function creates a new .code cptr object similar to .metn cptr , which has the same address and type symbol as .metn cptr . .coNP Function @ cptr-zap .synb .mets (cptr-zap << cptr ) .syne .desc The .code cptr-zap function changes the pointer value of the .meta cptr object to the null pointer. The .meta cptr argument must be of .code cptr type. The return value is .meta cptr itself. Note: it is recommended to use .code cptr-zap when the program has taken some action which invalidates the pointer value stored in a .code cptr object, where a risk exists that the value may be subsequently misused. .coNP Function @ cptr-free .synb .mets (cptr-free << cptr ) .syne .desc The .code cptr-free function passes the .meta cptr object's pointer to the C library .code free function. After this action, it behaves exactly like .codn cptr-zap . The .meta cptr argument must be of .code cptr type. The return value is .meta cptr itself. Note: this function is unsafe. If the pointer didn't originate from the .code malloc family of memory allocation functions, or has already been freed, or copies of the pointer exist which are still in use, the consequences are likely catastrophic. .coNP Function @ cptrp .synb .mets (cptrp << value ) .syne .desc The .code cptrp function tests whether .meta value is a .codn cptr .
It returns .code t if this is the case, .code nil otherwise. .coNP Function @ cptr-type .synb .mets (cptr-type << cptr ) .syne .desc The .code cptr-type function retrieves the .meta cptr object's type tag. .coNP Function @ cptr-get .synb .mets (cptr-get < cptr <> [ type ]) .syne .desc The .code cptr-get function extracts a Lisp value by converting a C object at the memory location denoted by .metn cptr , according to the FFI type .metn type . The external representation at the specified memory location is scanned according to the .meta type and converted to a Lisp value which is returned. If the .meta type argument is specified, it must be a FFI type object. If omitted, then the .code cptr object's type tag is interpreted as a FFI type symbol and resolved to a type; the resulting type, if one is found, is substituted for .metn type . If the lookup fails, an error exception is thrown. The .meta cptr object must be of type .code cptr and point to a memory area suitably aligned for, and large enough to hold, a foreign representation of .metn type . If .meta cptr is a null pointer, an exception is thrown. The .code cptr-get operation is similar to the "get semantics" performed by FFI in order to extract the return value of foreign function calls, and by the FFI callback mechanism to extract the arguments coming into a callback. The .meta type argument may not be a variable length type, such as an array of unspecified size. Note: the functions .code cptr-get and .code cptr-out are useful in simplifying the interaction with "semi-opaque" foreign objects: objects which serve as API handles that are treated as opaque pointers in API argument calls, but which expose some internal members that the application must access directly. The .code cptr objects pass through the foreign API without undergoing conversion, as usual. The application uses these two functions to perform conversion as necessary.
Under this technique, the description of the foreign object need not be complete. Structure members which occur after the last member that the application is interested in need not be described in the FFI type. .coNP Function @ cptr-out .synb .mets (cptr-out < cptr < obj <> [ type ]) .syne .desc The .code cptr-out function converts a Lisp value into a C representation, which is stored at the memory location denoted by .metn cptr , according to the FFI type .metn type . The function's return value is .metn obj . If the .meta type argument is specified, it must be a FFI type object. If omitted, then the .code cptr object's type tag is interpreted as a FFI type symbol and resolved to a type; the resulting type, if one is found, is substituted for .metn type . If the lookup fails, an error exception is thrown. The .meta obj argument must be an object compatible with the conversions implied by .metn type . The .meta cptr object must be of type .code cptr and point to a memory area suitably aligned for, and large enough to hold, a foreign representation of .metn type . If .meta cptr is a null pointer, an exception is thrown. It is assumed that .meta obj is an object which was returned by an earlier call to .codn cptr-get , and that the .meta cptr and .meta type arguments are the same objects that were used in that call. The .code cptr-out function performs the "out semantics" encoding action, similar to the treatment applied to the arguments of a callback prior to returning to foreign code. .coNP Variable @ cptr-null .desc The .code cptr-null variable holds a null pointer as a .code cptr instance. Two .code cptr objects may be compared for equality using the .code equal function, which tests whether their pointers are equal. The .code cptr-null variable compares .code equal to values which have been subject to .code cptr-zap or .codn cptr-free .
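.TP* Example: The following minimal sketch illustrates the comparison against .codn cptr-null . The integer 4096 is an arbitrary illustrative address chosen for this example; it is never dereferenced.
.verb
  (let ((p (cptr-int 4096)))  ;; arbitrary, illustrative address
    (cptr-zap p)              ;; p now holds the null pointer
    (equal p cptr-null))      ;; per the description above, this
                              ;; comparison is expected to yield t
.brev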
A null .code cptr may be produced by the expression .codn "(cptr-obj nil)" ; however, this creates a freshly allocated object on each evaluation. The expression .code "(cptr-int 0)" also produces a null pointer on all platforms where \*(TX is found. .coNP Function @ cptr-size-hint .synb .mets (cptr-size-hint < cptr << bytes ) .syne .desc The .code cptr-size-hint function indicates to the garbage collector that the given .meta cptr object is associated with .meta bytes of foreign memory that are otherwise invisible to the garbage collector. Note: this function should be used if the foreign memory is indirectly managed by the .meta cptr object in cooperation with the garbage collector. Specifically, .meta cptr should have a finalizer registered against it which will liberate the foreign memory. .SS* User-Defined Streams In \*(TL, stream objects aren't structure types, and therefore lie outside of the object-oriented programming system. However, \*(TL supports a delegation mechanism which allows a structure which provides certain methods to be used as a stream. The function .code make-struct-delegate-stream takes as an argument the instance of a structure, which is referred to as the .IR "stream interface object" . The function returns a stream object such that when stream operations are invoked on this stream, it delegates these operations to methods of the stream interface object. A structure type called .code stream-wrap is provided, whose instances can serve as stream interface objects. This structure has a slot called .meta stream which holds a stream, and it provides all of the methods required for the delegation mechanism used by .codn make-struct-delegate-stream . The .code stream-wrap methods simply invoke the ordinary stream operations on the .meta stream slot.
The .code stream-wrap type can be used as a base class for a derived class which intercepts certain operations on a stream (by defining the corresponding methods) while allowing other operations to transparently pass to the stream (via the base methods inherited from .codn stream-wrap ). .coNP Function @ make-struct-delegate-stream .synb .mets (make-struct-delegate-stream << object ) .syne .desc The .code make-struct-delegate-stream function returns a stream whose operations depend on the .metn object , a stream interface object. The .meta object argument must be a structure which implements certain subsets of, or all of, the following methods: .codn put-string , .codn put-char , .codn put-byte , .codn get-line , .codn get-char , .codn get-byte , .codn unget-char , .codn unget-byte , .codn put-buf , .codn fill-buf , .codn close , .codn flush , .codn seek , .codn truncate , .codn get-prop , .codn set-prop , .codn get-error , .codn get-error-str , .code clear-error and .codn get-fd . Implementing .code get-prop is mandatory, and that method must support the .code :name property. Failure to implement some of the other methods will impair the use of certain stream operations on the object. .coNP Method @ put-string .synb .mets << stream .(put-string << str ) .syne .desc The .code put-string method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code put-string stream I/O function. .coNP Method @ put-char .synb .mets << stream .(put-char << chr ) .syne .desc The .code put-char method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code put-char stream I/O function. .coNP Method @ put-byte .synb .mets << stream .(put-byte << byte ) .syne .desc The .code put-byte method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code put-byte stream I/O function. 
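.TP* Example: The following is a minimal sketch of a stream interface object which intercepts the .code put-string method described above and passes all other operations through. The structure name .code upcase-wrap is hypothetical, invented for this illustration; the sketch assumes that deriving from .code stream-wrap supplies the remaining methods, including the mandatory .codn get-prop .
.verb
  ;; hypothetical struct: converts text written via put-string
  ;; to upper case, then forwards it to the wrapped stream
  (defstruct upcase-wrap stream-wrap
    (:method put-string (me str)
      (put-string (upcase-str str) me.stream)))

  ;; wrap *stdout* and write through the delegate stream
  (let ((s (make-struct-delegate-stream
             (new upcase-wrap stream *stdout*))))
    (put-string "hello\en" s))
.brev
Only .code put-string is intercepted in this sketch; output which reaches the stream through other methods such as .code put-char is passed through unchanged by the inherited methods.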
.coNP Method @ get-line .synb .mets << stream .(get-line) .syne .desc The .code get-line method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code get-line stream I/O function. .coNP Method @ get-char .synb .mets << stream .(get-char) .syne .desc The .code get-char method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code get-char stream I/O function. .coNP Method @ get-byte .synb .mets << stream .(get-byte) .syne .desc The .code get-byte method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code get-byte stream I/O function. .coNP Method @ unget-char .synb .mets << stream .(unget-char << chr ) .syne .desc The .code unget-char method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code unget-char stream I/O function. .coNP Method @ unget-byte .synb .mets << stream .(unget-byte << byte ) .syne .desc The .code unget-byte method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code unget-byte stream I/O function. .coNP Method @ put-buf .synb .mets << stream .(put-buf < buf << pos ) .syne .desc The .code put-buf method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code put-buf stream I/O function. Note: there is a severe restriction on the use of the .meta buf argument. The buffer object denoted by the .meta buf argument may be specially allocated and have a lifetime which is scoped to the method invocation. The .code put-buf method shall not permit the .meta buf object to be used beyond the duration of the method invocation. 
.coNP Method @ fill-buf .synb .mets << stream .(fill-buf < buf << pos ) .syne .desc The .code fill-buf method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code fill-buf stream I/O function. Note: there is a severe restriction on the use of the .meta buf argument. The buffer object denoted by the .meta buf argument may be specially allocated and have a lifetime which is scoped to the method invocation. The .code fill-buf method shall not permit the .meta buf object to be used beyond the duration of the method invocation. .coNP Method @ close .synb .mets << stream .(close << throw-on-error-p ) .syne .desc The .code close method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code close-stream stream I/O function. With two exceptions, the value returned from .code close is retained by .codn close-stream , such that repeated calls to .code close-stream then return that value without calling the .code close method. The exceptions are the values .code nil and .code : (the colon symbol). If either of these values is returned, and .code close-stream is invoked again on the same stream object, the .code close method will be called again. Furthermore, if the .code : symbol is returned by the .code close method, this indicates a successful close, and the .code close-stream function returns the .code t symbol rather than the .code : symbol. The rationale for this mechanism is that it supports reference-counted closing. A struct delegate stream may be written which is shared by several owners, which must each call .code close-stream before the underlying real stream is closed. .coNP Method @ flush .synb .mets << stream .(flush) .syne .desc The .code flush method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code flush-stream stream I/O function.
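.TP* Example:
The reference-counted closing scenario mentioned in the description of the
.code close
method might be sketched as follows. This is an illustration under stated assumptions, not a library facility: the structure
.code shared-stream
and its
.code refs
slot are hypothetical, and the owner count is assumed to be set when the object is constructed.

.verb
  ;; Hypothetical wrapper shared by several owners.
  (defstruct shared-stream stream-wrap
    (refs 1)
    (:method close (me throw-on-error-p)
      (if (plusp (dec me.refs))
        ;; Other owners remain: returning the : symbol reports
        ;; success, and lets this method be called again later.
        :
        ;; Last owner: close the underlying stream.
        (close-stream me.stream throw-on-error-p))))
.brev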
.coNP Method @ seek .synb .mets << stream .(seek < offs << whence ) .syne .desc The .code seek method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code seek-stream stream I/O function. .coNP Method @ truncate .synb .mets << stream .(truncate << len ) .syne .desc The .code truncate method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code truncate-stream stream I/O function. .coNP Method @ get-prop .synb .mets << stream .(get-prop << sym ) .syne .desc The .code get-prop method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code get-prop stream I/O function. .coNP Method @ set-prop .synb .mets << stream .(set-prop < sym << nval ) .syne .desc The .code set-prop method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code set-prop stream I/O function. .coNP Method @ get-error .synb .mets << stream .(get-error) .syne .desc The .code get-error method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code get-error stream I/O function. .coNP Method @ get-error-str .synb .mets << stream .(get-error-str) .syne .desc The .code get-error-str method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code get-error-str stream I/O function. .coNP Method @ clear-error .synb .mets << stream .(clear-error) .syne .desc The .code clear-error method is implemented on a stream interface object. It should behave in a manner consistent with the description of the .code clear-error stream I/O function. .coNP Method @ get-fd .synb .mets << stream .(get-fd) .syne .desc The .code get-fd method is implemented on a stream interface object. 
It should behave in a manner consistent with the description of the .code fileno stream I/O function. .coNP Structure @ stream-wrap .synb .mets (defstruct stream-wrap nil .mets \ \ stream .mets \ \ (:method put-string (me str) .mets \ \ \ \ (put-string str me.stream)) .mets \ \ (:method put-char (me chr) .mets \ \ \ \ (put-char chr me.stream)) .mets \ \ (:method put-byte (me byte) .mets \ \ \ \ (put-byte byte me.stream)) .mets \ \ (:method get-line (me) .mets \ \ \ \ (get-line me.stream)) .mets \ \ (:method get-char (me) .mets \ \ \ \ (get-char me.stream)) .mets \ \ (:method get-byte (me) .mets \ \ \ \ (get-byte me.stream)) .mets \ \ (:method unget-char (me chr) .mets \ \ \ \ (unget-char chr me.stream)) .mets \ \ (:method unget-byte (me byte) .mets \ \ \ \ (unget-byte byte me.stream)) .mets \ \ (:method put-buf (me buf pos) .mets \ \ \ \ (put-buf buf pos me.stream)) .mets \ \ (:method fill-buf (me buf pos) .mets \ \ \ \ (fill-buf buf pos me.stream)) .mets \ \ (:method close (me throw-on-error) .mets \ \ \ \ (close-stream me.stream throw-on-error)) .mets \ \ (:method flush (me) .mets \ \ \ \ (flush-stream me.stream)) .mets \ \ (:method seek (me offs whence) .mets \ \ \ \ (seek-stream me.stream offs whence)) .mets \ \ (:method truncate (me len) .mets \ \ \ \ (truncate-stream me.stream len)) .mets \ \ (:method get-prop (me sym) .mets \ \ \ \ (stream-get-prop me.stream sym)) .mets \ \ (:method set-prop (me sym nval) .mets \ \ \ \ (stream-set-prop me.stream sym nval)) .mets \ \ (:method get-error (me) .mets \ \ \ \ (get-error me.stream)) .mets \ \ (:method get-error-str (me) .mets \ \ \ \ (get-error-str me.stream)) .mets \ \ (:method clear-error (me) .mets \ \ \ \ (clear-error me.stream)) .mets \ \ (:method get-fd (me) .mets \ \ \ \ (fileno me.stream))) .syne .desc The .code stream-wrap class provides a trivial implementation of a stream interface. It has a single slot, .code stream which should be initialized with a stream object. 
Each method of .codn stream-wrap , shown in its entirety in the above Syntax section, simply invokes the corresponding stream I/O library function, passing the method arguments and the value of the .code stream slot to that function, and returning whatever that function returns. Note: the .code stream-wrap type is intended to be useful as an inheritance base. A user-defined structure can inherit from .code stream-wrap and provide its own versions of some of the methods, thereby intercepting those operations to customize the behavior. For instance, a function equivalent to the .code record-adapter function could be implemented by constructing an object derived from .code stream-wrap which overrides the behavior of the .code get-line method, and then using the .code make-struct-delegate-stream function to return a stream based on this object.
.TP* Example:
.verb
  ;;; Implementation of my-record-adapter,
  ;;; a function resembling
  ;;; the record-adapter implementation

  (defstruct rec-input stream-wrap
    regex
    include-match-p

    ;; get-line overridden to use regex-based
    ;; extraction using read-until-match
    (:method get-line (me)
      (read-until-match me.regex me.stream
                        me.include-match-p)))

  (defun my-record-adapter (regex stream include-match-p)
    (let ((recin (new rec-input
                      stream stream
                      regex regex
                      include-match-p include-match-p)))
      (make-struct-delegate-stream recin)))
.brev
.SS* Symbols and Packages
\*(TL has a package system inspired by the salient features of ANSI Common Lisp, but substantially simpler. Each symbol has a name, which is a string. A package is an object which serves as a container of symbols; the package associates the name strings with symbols. A symbol which exists inside a package is said to be interned in that package. A symbol can be interned in more than one package. A symbol may also have a home package. A symbol which has a home package is always interned in that package. A symbol which has a home package is called an .IR "interned symbol" .
A symbol which is interned in one or more packages, but has no home package, is a .IR "quasi-interned symbol" . When a quasi-interned symbol is printed, if it is not interned in the package currently held in the .code *package* variable, it will appear in uninterned notation denoted by a .code #: prefix, even though it is interned in one or more packages. This is because in any situation when a symbol is printed with a package prefix, that prefix corresponds to the name of its home package. The reverse isn't true: when a symbol token is read bearing a package prefix, the token denotes any interned symbol in the indicated package, whether or not the package is the home package of that symbol. Packages are held in a global list which can be used to search for a package by name. The .code find-package function performs this lookup. A package may be deleted from the list with the .code delete-package function, but it continues to exist until the program loses the last reference to that package. When a package is deleted with .codn delete-package , its symbols are uninterned from all other packages. A symbol existing in one package can be brought into another package via the .code use-sym function, causing it to be interned in the target package. A symbol which thus exists inside a package which is not its home package is called a .IR "foreign symbol" , relative to that package. The contrasting term with .I "foreign symbol" is .IR "local symbol" , which refers to a symbol, relative to a package, which is interned in that package and that package is also its home. Every symbol interned in a package is either foreign or local. An existing symbol can also be brought into a package under a different name using the .code use-sym-as function, causing it to be interned under an alternative name. This has the effect of creating a local alias for a foreign symbol, and is intended as a renaming mechanism for resolving name clashes. 
If a foreign symbol is introduced into a package, and has the same name as an existing local symbol, the local symbol continues to exist, but is hidden: it is not accessible via a name lookup on that package. While hidden, a symbol loses its home package and is thus degraded to either quasi-interned or uninterned status, depending on whether that symbol is interned in other packages. When a foreign symbol is removed from a package via .codn unuse-sym , then if a hidden symbol exists in that package of the same name, that hidden symbol is reinterned in that package and reacquires that package as its home package, becoming an interned symbol again. Finally, packages have a .IR "fallback package list" : a list of associated packages, which may be empty. The fallback package list is manipulated with the functions .code package-fallback-list and .codn set-package-fallback-list , and with the .code :fallback clause of the .code defpackage macro. The fallback package list plays a role only in three situations: one in the \*(TL parser, one in the printer, and one in the interactive listener. Besides that, two library functions refer to it: .code intern-fb and .codn find-symbol-fb . The parser situation involving the fallback list occurs when the \*(TL parser resolves an unqualified symbol token: a symbol token not carrying a package prefix. Such a symbol name is resolved against the current package (the package currently stored in the .code *package* special variable). If a symbol matching the token is not found in the current package, then the packages in its fallback package list are searched for the symbol. The first matching symbol which is found in the fallback list is returned. If no matching symbol is found in the fallback list, then the token is interned as a new symbol in the current package. The packages in the current package's fallback list may themselves have fallback lists. Those fallback lists are not involved; no such recursion takes place.
The printer situation involving the fallback list is as follows. If a symbol is being printed in a machine-readable way (not "pretty"), has a home package and is not a keyword symbol, then a search takes place through the current package first and then its fallback list. If the symbol is found anywhere in that sequence of locations, and is not occluded by a same-named symbol occurring earlier in that sequence, then the symbol is printed without a package prefix. The listener situation involving the fallback list is as follows. When tab completion is used on a symbol without a package prefix, the listener searches for completions not only in the current package, but in the fallback list also. .TP* "Dialect Notes:" The \*(TL package system doesn't support the ANSI Common Lisp concept of package use, replacing that concept with fallback packages. Though the .code use-package and .code unuse-package functions exist and are similar to the ones in ANSI CL, they actually operate on individual foreign symbols, bringing them in or removing them, respectively. These functions effectively iterate over the local symbols of the used or unused package, and invoke .code use-sym or .codn unuse-sym , respectively. The \*(TL package system consequently doesn't support the concept of shadowing symbols, and conflicts do not exist. When a foreign symbol is introduced into a package which already has a symbol by that name, that symbol is silently removed from that package if it is itself foreign, or else hidden if it is local. The \*(TL package system also doesn't feature the concept of internal and external symbols. The rationale is that this distinction divides symbols into subsets in a redundant way. Packages are already subsets of symbols. A module can use two packages to simulate private symbols. An example of this is given in the Package Examples section below. The \*(TL fallback package list mechanism resembles ANSI CL package use, and satisfies similar use scenarios.
However, this mechanism does not cause a symbol to be considered visible in a package. If a package .code foo contains no symbol .codn bar , but one of the packages in .codn foo 's fallback list does contain .codn bar , that symbol is nevertheless not considered visible in .codn foo . The syntax .code foo:bar will not resolve. The fallback mechanism only comes into play when a package is installed as the current package in the .code *package* variable. It then allows unqualified symbol references to refer across the fallback list. The \*(TL package system does not feature package nicknames, which have been found to be a source of clashes in large Common Lisp software collections, leading to the development of a feature called package local nicknames that is not part of ANSI CL, but supported by a number of implementations. In \*(TL, packages have only one name, accessible via .codn package-name . \*(TL packages are held in a public association list called .codn *package-alist* , which associates string names with packages. The function .codn find-package , which is used by the parser when looking up the package prefix of a qualified symbol, uses only the names which appear as keys in this association list. Usually those names are the same as the names of the package objects. However, it's possible to manipulate this association list to create alias names for packages. Thus, it is possible for .code "(find-package \(dqfoo\(dq)" to return .code "#<package: bar>" if the name .str foo is associated in .code *package-alist* with a package object named .strn bar . The \*(TL package system doesn't feature package local nicknames. There are three reasons for this. One is that it doesn't have global package nicknames. The second is that the mechanism would be cumbersome, and add delay to the resolution of qualified symbols, requiring nicknames in the .code *package* to be searched for a package name, in addition to the dynamic .codn *package-alist* .
The third reason is that package local nicknames do not actually solve the problem of clashing symbols, when an application uses multiple packages that each define a symbol by the same name. Package nicknames only shorten the qualified names required to refer to the symbols. Instead, \*(TL allows a foreign symbol to be interned in a package under a name which is different from its .codn symbol-name . Thus, rather than creating aliases for package names, \*(TL packages can locally rename the actual clashing symbols, which can then be referenced by unqualified names. By manipulating .codn *package-alist* , a \*(TL source file can nevertheless achieve the creation of a de facto package nickname, which is local to a loaded file, as in the following example: .verb ;; make sure that when this file finishes loading, ;; or the loading is interrupted by an exception, ;; the "u" package alias is deleted from *package-alist* (push-after-load (set *package-alist* [remqual "u" *package-alist* car])) ;; push an alias named u for the usr package. (push (cons "u" (find-package "usr")) *package-alist*) ;; u: can now be used, until the end of this file (u:prinl (u:list 1 2 3)) .brev .NP* Package Examples The following example illustrates a simple scenario of a module whose identifiers are in a package, and which also has private identifiers in a private package. .verb ;; Define three packages. (defpackage mod-priv (:fallback usr)) (defpackage mod) (defpackage client (:fallback mod usr) (:use-from mod-priv other-priv)) ;; Switch to mod-priv package (in-package mod-priv) (defun priv-fun (arg) (list arg)) ;; Another function with a name in the mod-priv package. (defun other-priv (arg) (cons arg arg)) ;; Define a function in mod; a public function. ;; Note that we don't have to change to the mod package, ;; to define functions with names in that package. ;; We rely on interning being allowed for the qualified ;; mod:public-fun syntax.
(defun mod:public-fun (arg) (priv-fun arg)) ;; priv-fun here is mod-priv:priv-fun ;; Switch to client package (in-package client) (priv-fun) ;; ERROR: refers to client:priv-fun, not defined (mod:priv-fun) ;; ERROR: mod-priv:priv-fun not used in mod (mod-priv:priv-fun 3) ;; OK: direct reference via qualifier (public-fun 3) ;; OK: mod:public-fun symbol via fallback (other-priv 3) ;; OK: foreign symbol mod-priv:other-priv ;; present in client due to :use-from .brev The following example shows how to create a package called .code custom in which the .code + symbol from the .code usr package is replaced with a local symbol. A function is then defined using the local symbol, which allows strings to be catenated with .codn + : .verb (defpackage custom (:fallback usr) (:local + - * /)) (defmacro outside-macro (x) ^(+ ,x 42)) (in-package custom) (defun binary-+ (: (left 0) (right 0)) (if (and (numberp left) (numberp right)) (usr:+ left right) `@left@right`)) (defun + (. args) [reduce-left binary-+ args]) (+) -> 0 (+ 1) -> 1 (+ 1 "a") -> "1a" (+ 1 2) -> 3 (+ "a") -> "a" (+ "a" "b" "c") -> "abc" ;; macro expansions using usr:+ are not affected (outside-macro "a") -> ;; error: + invalid operands "a" 42 .brev .NP* Packages and the Extraction Language The \*(TX extraction language has a syntax in which certain Lisp symbolic expressions denoting directives .code "@(collect ...)" or .code "@(end)" behave as if they were the tokens of a phrase structure. As a matter of implementation, these are processed specially in the parser and lexical analyzer, and are not read in the same way as ordinary Lisp forms. On the other hand, some directives are not this way. For instance the .codn "@(bind ...)" , syntax is processed as a true Lisp expression, in which the .code bind token is subject to the usual rules for interning a symbol, sensitive to .code *package* in the usual way. The following notes describe the treatment of "special" directives that are involved in phrase structure syntax. 
It applies to all directives which head off a block that must be terminated by .codn "@(end)" , all "punctuation" directives like .code "@(and)" or .code "@(end)" and all subphrase indicators like .code "@(last)" or .codn "@(elif)" . Firstly, each such directive may have a package prefix on its main symbol, yet is still recognized as the same token. That is to say, .code "@(foo:collect)" is still treated by the tokenizer and parser as the .code "@(collect)" token, regardless of the package prefix, and regardless of whether .code foo:end is the same symbol as the .code usr:end symbol. However, this doesn't mean that any .code foo:collect is allowed to denote the .code collect directive. A qualified symbol such as .code foo:collect must correspond to (be the same object as) precisely one of two symbols: either the same-named symbol in the .code usr package, or else the same-named symbol in the .code keyword package. If this condition isn't satisfied, the situation is a syntax error. Note that this check uses the original .code usr and .code keyword packages, not the packages which are currently named .str "usr" or .str "keyword" in the current .codn *package-alist* . A check is also performed for an unqualified symbol. An unqualified symbol like .code collect must also resolve, in the context of the current value of the .code *package* variable, to the same-named symbol in either the original .code usr or .code keyword package. Thus if the current package isn't .codn usr , and .code "@(collect)" is being processed, the current package must be such that .code collect resolves to .codn usr:collect , either because that symbol is present in the current package via import, or else visible via the fallback list. These rules are designed to approximate what the behavior would be if these directives were actually scanned as Lisp forms in the usual way and then recognized as phrase structure tokens according to the identity of their leading symbol.
The additional restriction is added that the directive symbol names are treated as reserved. If there exists a user-defined pattern function called .code mypackage:end it may not be invoked using the syntax .codn "@(mypackage:end)" , which is erroneous; though it is invocable indirectly via the .code "@(call)" directive. .NP* Package Library Conventions Various functions in the package and symbol area of the library have a .meta package parameter. When the argument is optional, it defaults to the current value of the .code *package* special variable. If specified, the argument may be a character string, which is taken as the name of a package. It may also be a symbol, in which case the symbol's name, which is a character string, is used. Thus the objects .codn :sys , .codn usr:sys , .code abc:sys and .str sys all refer to the same package, the system package which is named .strn sys . A .code package parameter may also simply be a package object. Some functions, like .code use-package and .codn unuse-package , accept a list of packages as their first argument. This may be a list of objects which follow the above conventions: strings, symbols or package objects. Also, instead of a list, an atom may be passed: a string, symbol or package object. It is treated as a singleton list consisting of that object. .coNP Variables @, user-package @ keyword-package and @ system-package .desc These variables hold predefined packages. The .code user-package contains all of the public symbols in the \*(TL library. The .code keyword-package holds keyword symbols, which are printed with a leading colon. The .code system-package is for internal symbols, helping the implementation avoid name clashes with user code in some situations. These variables shouldn't be modified. If they are modified, the consequences are unspecified. The names of these packages, respectively, are .strn usr , .strn keyword , and .strn sys .
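.TP* Example:
The package designator conventions described under Package Library Conventions can be illustrated with a short sketch. It assumes that
.code find-symbol
follows those conventions for its
.meta package
argument; under that assumption, each of the three calls is expected to search the same package.

.verb
  ;; Three ways of designating the usr package:
  (find-symbol "cons" "usr")         ;; package name string
  (find-symbol "cons" :usr)          ;; symbol whose name is "usr"
  (find-symbol "cons" user-package)  ;; package object
.brev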
.coNP Special Variable @ *package* .desc This variable holds the current package. The global value of this variable is initialized to a package called .strn pub . The .code pub package has the .code usr package in its fallback list; thus when .code pub is current, all of the .code usr symbols, comprising the content of the \*(TL library, are visible. All forms read and evaluated from the \*(TX command line, in the interactive listener, from files via .code load or .code compile-file or from the \*(TX pattern language are processed in this default .code pub package, unless arrangements are made to change to a different package. The current package is used as the default package for interning symbol tokens which do not carry the colon-delimited package prefix. The current package also affects printing. When a symbol is printed whose home package matches the current package, it is printed without a package prefix. (Keyword symbols are always printed with the colon prefix, even if the keyword package is current.) .coNP Function @ make-sym .synb .mets (make-sym << name ) .syne .desc The .code make-sym function creates and returns a new symbol object. The argument .metn name , which must be a string, specifies the name of the symbol. The symbol does not belong to any package (it is said to be "uninterned"). Note: an uninterned symbol can be interned into a package with the .code rehome-sym function. Also see the .code intern function. .coNP Function @ gensym .synb .mets (gensym <> [ prefix ]) .syne .desc The .code gensym function is similar to .codn make-sym . It creates and returns a new symbol object. If the .meta prefix argument is omitted, it defaults to .strn g . The difference between .code gensym and .code make-sym is that .code gensym creates the symbol's name by combining the .meta prefix with a numeric suffix. The suffix is obtained by incrementing the .code *gensym-counter* and taking the new value.
The name string is then calculated from the prefix and the counter value as if by evaluating a form similar to .codn "(fmt \(dq~a~,04d\(dq prefix counter)" . From this it can be inferred that .meta prefix can be an object of any kind. Note: the generated symbol's name, though varying thanks to the incrementing counter, is not the basis of its uniqueness. The basis of the symbol's uniqueness is that it is a freshly created object, distinct from any other object. The related function .code make-sym still returns unique symbols even if repeatedly called with the same string argument. .coNP Special Variable @ *gensym-counter* .desc This variable is initialized to 0. Each time the .code gensym function is called, it is incremented. The incremented value forms the basis of the numeric suffix which .code gensym uses to form the name of the new symbol. .coNP Function @ make-package .synb .mets (make-package < name <> [ weak ]) .syne .desc The .code make-package function creates and returns a package named .metn name , where .meta name is a string. It is an error if a package by that name exists already. Note: ordinary creation of packages for everyday program modularization should be performed with the .code defpackage macro rather than by direct use of .codn make-package . If the .meta weak parameter is given an argument which is a Boolean true, then the resulting package holds symbols weakly, from a garbage collection point of view. If the only reference to a symbol is that which occurs inside the weak package, then that symbol may be removed from the package and reclaimed by the garbage collector. Note: weak packages address the following problem. The application creates a package for the purpose of reading Lisp data. Symbols occurring in that data therefore are interned into the package. Subsequently, the application retains references to some of the symbols, discarding the others.
If the package isn't weak, then because the application is retaining some of the symbols, and those symbols hold a reference to the package, and the package holds a reference to all symbols that were interned in it, all of the symbols are retained. If a weak package is used, then the discarded symbols are eligible for garbage collection. .coNP Function @ delete-package .synb .mets (delete-package << package ) .syne .desc The .code delete-package breaks the association between a package and its name. After .codn delete-package , the .meta package object continues to exist, but cannot be found using .codn find-package . Furthermore, .code delete-package iterates over all remaining packages. For each remaining package .metn p , it performs the semantic action of the .mono .meti (unuse-package < package << p ) .onom expression. That is to say, all of the remaining packages are scrubbed of any foreign symbols which are the local symbols of the deleted .metn package . .coNP Function @ merge-delete-package .synb .mets (merge-delete-package dst-package <> [ src-package ]) .syne .desc The .code merge-delete-package iterates over all of the local symbols of .meta src-package and rehomes each symbol into .metn dst-package . Then, it deletes .metn src-package . Note: the local symbols are identified as if using .codn package-local-symbols , rehoming is performed as if using .codn rehome-sym , and deleting .meta src-package is performed as if using .codn delete-package . .coNP Function @ packagep .synb .mets (packagep << obj ) .syne .desc The .code packagep function returns .code t if .meta obj is a package, otherwise it returns .codn nil . .coNP Function @ find-package .synb .mets (find-package << name ) .syne .desc The argument .meta name should be a string. If a package called .meta name exists, then it is returned. Otherwise .code nil is returned. 
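.TP* Example:
The relationship between
.codn make-package ,
.code find-package
and
.code delete-package
described above can be sketched as follows. The package name
.str scratch
is a hypothetical name chosen for illustration, and is assumed not to be in use already.

.verb
  (let ((p (make-package "scratch")))
    (prinl (packagep p))                     ;; prints t
    (prinl (eq p (find-package "scratch")))  ;; prints t
    ;; Break the association between the package and its name.
    (delete-package p)
    ;; The object held in p still exists, but lookup by name fails.
    (prinl (find-package "scratch")))        ;; prints nil
.brev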
.coNP Special Variable @ *package-alist* .desc The .code *package-alist* variable contains the master association list which contains an entry about each existing package. Each element of the list is a cons cell whose .code car field is the name of a package and whose .code cdr is a package object. Note: the \*(TL application can overwrite or rebind this variable to manipulate the active package list. This is useful for .IR sandboxing : safely evaluating code that is obtained as an input from an untrusted source, or calculated from such an input. The contents of .code *package-alist* have security implications because textual source code can refer to any symbol in any package by invoking a package prefix. For instance, even if the .code open function's name is not available in the current package (established by the .code *package* variable) that symbol can easily be obtained using the syntax .codn usr:open . However, the entire .code usr package itself can be removed from .codn *package-alist* . In that situation, the syntax .code usr:open is no longer valid. At the same time, selected symbols from the original .code usr can be nevertheless made available via some intermediate package, which is present in .code *package-alist* and which contains a subset of the .code usr symbols that has been curated for safety. That curated package may even be called .codn usr , so that if for instance .code cons is present in that package, it may be referred to as .code usr:cons in the usual way. .coNP Function @ package-alist .synb .mets (package-alist) .syne .desc The .code package-alist function retrieves the value of .codn *package-alist* . Note: this function is obsolescent. There is no reason to use it in new code instead of just accessing .code *package-alist* directly. .coNP Function @ package-name .synb .mets (package-name << package ) .syne .desc The .code package-name function retrieves the name of a package. 
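.TP* Example:
The sandboxing idea described under
.code *package-alist*
might be sketched as follows. This is an illustration under stated assumptions, not a complete or vetted sandbox: the names
.codn safe-pkg ,
.str safe
and
.code read-sandboxed
are hypothetical, the choice of curated symbols is arbitrary, and only reading is shown, not evaluation.

.verb
  ;; Hypothetical curated package holding a few usr symbols.
  (defvarl safe-pkg (make-package "safe"))

  (each ((sym '(cons car cdr list)))
    (use-sym sym safe-pkg))

  ;; Read text while only the curated package and the keyword
  ;; package can be reached through package prefixes.
  (defun read-sandboxed (text)
    (let ((*package-alist* (list (cons "safe" safe-pkg)
                                 (cons "keyword" keyword-package)))
          (*package* safe-pkg))
      (read text)))
.brev

Within the dynamic scope of these bindings, a prefix such as
.code usr:
does not name any package in the list, so that syntax such as
.code usr:open
is not expected to resolve to the original symbol.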
.coNP Function @ package-symbols .synb .mets (package-symbols <> [ package ]) .syne .desc The .code package-symbols function returns a list of all the symbols which are interned in .metn package . .coNP Functions @ package-local-symbols and @ package-foreign-symbols .synb .mets (package-local-symbols <> [ package ]) .mets (package-foreign-symbols <> [ package ]) .syne .desc The .code package-local-symbols function returns a list of all the symbols which are interned in .metn package , and whose home package is that package. The .code package-foreign-symbols function returns a list of all the symbols which are interned in .metn package , which do not have that package as their home package, or do not have a home package at all. The union of the local and foreign symbols contains exactly the same elements as the list returned by .codn package-symbols : the symbols interned in a package are partitioned into local and foreign. .coNP Functions @ package-fallback-list and @ set-package-fallback-list .synb .mets (package-fallback-list << package ) .mets (set-package-fallback-list < package << package-list ) .syne .desc The .code package-fallback-list returns the current .I "fallback package list" associated with .metn package . The .code set-package-fallback-list replaces the fallback package list of .meta package with .metn package-list . The .meta package-list argument must be a list which is a mixture of symbols, strings or package objects. Strings are taken to be package names, which must resolve to existing packages. Symbols are reduced to strings via .codn symbol-name . .coNP Functions @ intern and @ intern-fb .synb .mets (intern < name <> [ package ]) .mets (intern-fb < name <> [ package ]) .syne .desc The argument .meta name must be a string. The optional argument .meta package must be a package. If .meta package is not supplied, then the value taken is that of .codn *package* . The .code intern function searches .meta package for a symbol called .metn name . 
If that symbol is found, it is returned. If that symbol is not found, then a new symbol called .meta name is created and inserted into .metn package , and that symbol is returned. In this case, the package becomes the symbol's home package. The .code intern-fb function is very similar to .code intern except that if the symbol is not found in .meta package then the packages listed in the fallback list of .meta package are searched, in order. Only these packages themselves are searched, not their own fallback lists. If a symbol called .meta name is found, the search terminates and that symbol is returned. Only if nothing is found in the fallback list will .code intern-fb create a new symbol and insert it into .metn package , exactly like .codn intern . .coNP Function @ unintern .synb .mets (unintern < symbol <> [ package ]) .syne .desc The .code unintern function removes .meta symbol from .metn package . The .code nil symbol may not be removed from the .code usr package; an error exception is thrown in this case. If .meta symbol isn't .codn nil , then .meta package is searched to determine whether it contains .meta symbol as an interned symbol (either local or foreign), or a hidden symbol. If .meta symbol is a hidden symbol, then it is removed from the hidden symbol store. Thereafter, even if a same-named foreign symbol is removed from the package via .code unuse-sym or .codn unuse-package , those operations will no longer restore the hidden symbol to interned status. In this case, .code unintern returns the hidden symbol that was removed from the hidden store. If .meta symbol is a foreign symbol, then it is removed from the package. If the package has a hidden symbol of the same name, that hidden symbol is reinterned in the package, and the package once again becomes its home package. In this case, .meta symbol is returned. If .meta symbol is a local symbol, the symbol is removed from the package. In this case also, .meta symbol is returned.
If .meta symbol is not found in the package as either an interned or hidden symbol, then the function has no effect and returns .codn nil . .coNP Functions @ find-symbol and @ find-symbol-fb .synb .mets (find-symbol < name >> [ package <> [ notfound-val ]]) .mets (find-symbol-fb < name >> [ package <> [ notfound-val ]]) .syne .desc The .code find-symbol and .code find-symbol-fb functions search .meta package for a symbol called .metn name . That argument must be a character string. If the .meta package argument is omitted, the parameter defaults to the current value of .codn *package* . If the symbol is found in .meta package then it is returned. If the symbol is not found in .metn package , then the function .code find-symbol-fb also searches the packages listed in the fallback list of .metn package , in order. Only these packages themselves are searched, not their own fallback lists. If a symbol called .meta name is found, the search terminates and that symbol is returned. The function .code find-symbol only searches .metn package , ignoring its fallback list. If a symbol called .meta name isn't found, then these functions return .metn notfound-val , which defaults to .codn nil . Note: an ambiguous situation exists when .meta notfound-val is a symbol, such as its default value .codn nil , because if that symbol is successfully found, it is indistinguishable from .metn notfound-val . .coNP Function @ rehome-sym .synb .mets (rehome-sym < symbol <> [ package ]) .syne .desc The arguments .meta symbol and .meta package must be a symbol and package object, respectively, and .meta symbol must not be the symbol .codn nil . The .code rehome-sym function moves .meta symbol into .metn package . If .meta symbol is already interned in a package, it is first removed from that package. If a symbol of the same name exists in .metn package , that symbol is first removed from .metn package .
Also, if a symbol of the same name exists in the hidden symbol store of .metn package , that hidden symbol is removed. Then .meta symbol is interned into .metn package , and .meta package becomes its home package, making it a local symbol of .metn package . Note: if .meta symbol is currently the hidden symbol of some package, it is not removed from the hidden symbol store of that package. This is a degenerate case. The implication is that if that hidden symbol is ever restored in that package, it will once again have that package as its home package, and consequently it will turn into a foreign symbol of .metn package . .coNP Function @ symbolp .synb .mets (symbolp << obj ) .syne .desc The .code symbolp function returns .code t if .meta obj is a symbol, otherwise it returns .codn nil . .coNP Function @ symbol-name .synb .mets (symbol-name << symbol ) .syne .desc The .code symbol-name function returns the name of .metn symbol . .coNP Function @ symbol-package .synb .mets (symbol-package << symbol ) .syne .desc The .code symbol-package function returns the home package of .metn symbol . If .meta symbol has no home package, it returns .codn nil . .coNP Function @ keywordp .synb .mets (keywordp << obj ) .syne .desc The .code keywordp function returns .code t if .meta obj is a keyword symbol, otherwise it returns .codn nil . .coNP Function @ bindable .synb .mets (bindable << obj ) .syne .desc The .code bindable function returns .code t if .meta obj is a bindable symbol, otherwise it returns .codn nil . All symbols are bindable, except for keyword symbols, and the special symbols .code t and .codn nil . .coNP Functions @ use-sym and @ use-sym-as .synb .mets (use-sym < symbol <> [ package ]) .mets (use-sym-as < symbol < name <> [ package ]) .syne .desc The .code use-sym function brings an existing .meta symbol into .metn package . The .code use-sym-as function is similar, but allows an alternative .meta name to be specified.
The .meta symbol will be interned under that name, rather than under its symbol name. In all cases, both functions return .metn symbol . The following equivalence holds: .verb (use-sym s p) <--> (use-sym-as s (symbol-name s) p) .brev Thus, in the following descriptions, when the remarks are interpreted as applying to .codn use-sym , the .meta name argument is understood as referring to the .code symbol-name of the .meta symbol argument. If .meta package is the home package of .metn symbol , then the function has no effect. Otherwise .meta symbol is interned in .meta package under .metn name . If a symbol is already interned in .meta package under .metn name , then that symbol is replaced. If that replaced symbol is a local symbol of .metn package , meaning that .meta package is its home package, then that replaced symbol turns into a hidden symbol associated with the package. It is placed into a special hidden symbol store associated with .meta package and is stripped of its home package, becoming quasi-interned or uninterned. Note: .code use-sym and .code use-sym-as are the basis for the .code defpackage clauses .code :use-syms and .codn :use-syms-as . Note: if .code use-sym-as is used to introduce a foreign symbol into a package under a different name, that symbol cannot be removed with .codn unintern . It can only be removed using .codn unuse-sym . .coNP Function @ unuse-sym .synb .mets (unuse-sym < symbol <> [ package ]) .syne .desc The .code unuse-sym function removes .meta symbol from .metn package . If .meta symbol is not interned in .metn package , the function does nothing and returns .codn nil . If .meta symbol is a local symbol of .metn package , an error is thrown: a package cannot "unuse" its own symbol. Removing a symbol from its own home package requires the .code unintern function. Otherwise .meta symbol is a foreign symbol interned in .meta package and is removed.
If the package has a hidden symbol of the same name as .metn symbol , that symbol is reinterned into .meta package as a local symbol. In this case, that previously hidden symbol is returned. If the package has no hidden symbol matching the removed .metn symbol , then .meta symbol itself is returned. There are close similarities between the function .code unintern and .codn unuse-sym , but the two are significantly different. Firstly, .code unuse-sym cannot be used to remove a symbol from its home package. As noted above, this requires .codn unintern . Secondly, .code unuse-sym can be used to undo the effect of .code use-sym-as whereby a foreign symbol is introduced into a package under a different name. If .meta symbol is not found under its name, .code unuse-sym will search the package for that symbol to discover whether it is present under a different name, and proceed with the removal using that name. The .code unintern function performs no such secondary check; if .meta symbol is not found in the package under its own name, the operation fails, and so .code unintern cannot be used for undoing the effect of .codn use-sym-as . .coNP Functions @ use-package and @ unuse-package .synb .mets (use-package < package-list <> [ package ]) .mets (unuse-package < package-list <> [ package ]) .syne .desc The .code use-package and .code unuse-package functions are convenience functions which perform a mass import of symbols from one package to another, or a mass removal, respectively. The .code use-package function iterates over all of the local symbols of the packages in .metn package-list . For each symbol .metn s , it performs the semantic action implied by the .mono .meti (use-sym < s << package ) .onom expression. Similarly, .code unuse-package iterates over .meta package-list in the same way, performing, effectively, the semantic action of the .mono .meti (unuse-sym < s << package ) .onom expression.
The .meta package-list argument must be a list which is a mixture of symbols, strings or package objects. Strings are taken to be package names, which must resolve to existing packages. Symbols are reduced to strings via .codn symbol-name . .coNP Macro @ defpackage .synb .mets (defpackage < name << clause *) .syne .desc The .code defpackage macro provides a convenient means to create a package and establish its properties in a single construct. It is intended for the ordinary situations in which packages support the organization of programs into modules. The .meta name argument, giving the package name, may be a symbol or a character string. If it is a symbol, then the symbol's name is taken to be the name for the package. If a package called .meta name already exists, then .code defpackage selects that package for further operations. Otherwise, a new, empty package is created. In either case, this package is referred to as the .I "present package" in the following descriptions. The .meta name may be optionally followed by one or more clauses, which are processed in the order that they appear. Each clause is a compound form headed by a keyword. The supported clauses are as follows: .RS .meIP (:fallback << package-name *) The .code :fallback clause specifies the packages to comprise the fallback list of the present package. If this clause is omitted, or if it is present with no .meta package-name arguments, then the present package has an empty fallback list. Each .meta package-name may be a string or symbol naming an existing package. It is permitted for the present package itself to appear in its own fallback list. This is useful for creating a package with a nonempty fallback list which doesn't actually provide access to any other package. .meIP (:use << package-name *) The .code :use clause specifies packages whose local symbols are to be interned into the present package as foreign symbols.
Each .meta package-name may be a string or symbol naming an existing package. The list of package names is processed as if by a call to .codn use-package . .meIP (:use-syms << symbol *) The .code :use-syms clause specifies individual symbols to be brought into the present package, as if by the .code use-sym function. The arguments are symbols. .meIP (:use-syms-as >> { symbol << name }*) The .code :use-syms-as clause specifies individual symbols to be brought into the present package, as if by the .code use-sym-as function. The arguments constitute a property list consisting of interleaved symbols and names. Each .meta symbol argument is a symbol, and each .meta name is either a symbol or a string. If it is a symbol, then its name is retrieved via .code symbol-name and used in its place. .meIP (:use-from < package-name << symbol-name *) The .code :use-from clause specifies the names of local symbols in a package denoted by .meta package-name to be used in the present package. All arguments of .code :use-from are either strings or symbols which are reduced to strings by mapping to their names. Each .meta symbol-name is interned in the package identified by .metn package-name , which may have the effect of creating that symbol. This symbol is expected to be a local symbol of that package. If that is so, the symbol is brought into the present package via .codn use-sym . Otherwise if the symbol is foreign to package identified by .metn package-name , then an error exception is thrown. .meIP (:local << symbol-name *) The .code :local clause specifies the names of symbols to be interned in the new package as local symbols. Each .meta symbol-name argument must be either a character string or a symbol. If it is a symbol, its name is taken, thereby reducing the argument to a character string. The arguments are processed in the order in which they appear. Each name is first interned in the newly created package using the .code intern function. 
Then, if the resulting symbol is foreign to the package, it is removed with .code unuse-sym and the name is interned again. .RE .coNP Macro @ in-package .synb .mets (in-package << name ) .syne .desc The .code in-package macro causes the .code *package* special variable to take on the package denoted by .metn name . The macro checks, at expansion time, that .meta name is either a string or symbol. An error is thrown if this isn't the case. The .meta name argument expression isn't evaluated, and so must not be quoted. The code generated by the macro performs a search for the package. If the package is not found at the time when the macro's expansion is evaluated, an error is thrown. .SS* Pseudorandom Numbers .coNP Special Variable @ *random-state* .desc The .code *random-state* variable holds an object which encapsulates the state of a pseudorandom number generator. This variable is the default argument value for the .code random-fixnum and .codn "random functions" , for the convenience of writing programs which are not concerned about the management of random state. On the other hand, programs can create and manage random states, making it possible to obtain repeatable sequences of pseudorandom numbers which do not interfere with each other. For instance objects or modules in a program can have their own independent streams of random numbers which are repeatable, independently of other modules making calls to the random number functions. When \*(TX starts up, the .code *random-state* variable is initialized with a newly created random state object, which is produced as if by the call .codn "(make-random-state 42)" . .coNP Special Variable @ *random-warmup* .desc The .code *random-warmup* special variable specifies the value which is used by .code make-random-state in place of a missing .meta warmup-period argument. To "warm up" a pseudorandom number generator (PRNG) means to obtain some values from it which are discarded, prior to use. 
The number of values discarded is the .IR "warm-up period" . The WELL512a PRNG used in \*(TX produces 32-bit values, natively. Thus each warm-up iteration retrieves and discards a 32-bit value. The PRNG has a state space consisting of a vector of sixteen 32-bit words, making the state space 4096 bits wide. Warm up is required because PRNG-s, in particular PRNG-s with large state spaces and long periods, produce fairly predictable sequences of values in the beginning, before transitioning into chaotic behavior. This problem is worse for low complexity seeds, such as small integer values. The sequences are predictable in two ways. Firstly, some initial values extracted from the PRNG may exhibit patterns ("problem 1"). Secondly, the initial values from sequences produced from similar seeds (for instance consecutive integers) may be similar or identical ("problem 2"). .TP* Notes: The default value of .code *random-warmup* is only 8. This is insufficient to ensure good initial PRNG behavior for seeds even as large as 64 bits or more. That is to say, even if as many as eight bytes' worth of true random bits are used as the seed, the PRNG will exhibit predictable behaviors, and a poor distribution of values. Applications which critically depend on good PRNG behavior should choose large warm-up periods into the hundreds or thousands of iterations. If a small warm-up period is used, it is recommended to use larger seeds which initialize more of the 4096-bit state space. \*(TX's PRNG implementation addresses "problem 1" by padding the unseeded portions of the state space with random values (from a static table that doesn't change). For instance, if the integer 1 is used to seed the space, then one 32-bit word of the space is set to the value 1. The remaining 15 are populated from the random table. This helps to ensure that a good PRNG sequence is obtained immediately. 
However, it doesn't address "problem 2": that similar seed values generate similar sequences, when the warm-up period is small. For instance, if 65536 different random state objects are created, from each of the 16-bit seeds in the range [0, 65536), and then a random 16-bit value is extracted from each state, only 1024 unique values result. .coNP Function @ make-random-state .synb .mets (make-random-state >> [ seed <> [ warmup-period ]]) .syne .desc The .code make-random-state function creates and returns a new random state, an object of the same kind as what is stored in the .code *random-state* variable. The seed, if specified, must be an integer value, a buffer, an existing random state object, or else a vector returned from a call to the function .codn random-state-get-vec . Note that the sign of the seed is ignored, so that negative seed values are equivalent to their additive inverses. If seed is not specified, then .code make-random-state produces a seed based on some information in the process environment, such as current time of day. It is not guaranteed that two calls to .code make-random-state that are separated by less than some minimum increment of real time produce different seeds. The minimum time increment depends on the platform. On a platform with a millisecond-resolution real-time clock, the minimum time increment is a millisecond. Calls to .code make-random-state less than a millisecond apart may predictably produce the same seed. If an integer or buffer seed is specified, then the integer value is mapped to a pseudorandom sequence, in a platform-independent way. If an existing random state is specified as a seed, then it is duplicated. The returned random state object is a distinct object which is in the same state as the input object. It will produce the same remaining pseudorandom number sequence, as will the input object.
If a vector is specified as a seed, then a random state is constructed which duplicates the random state object which was captured in that vector representation by the .code random-state-get-vec function. The .meta warmup-period argument specifies the number of values which are immediately obtained and discarded from the newly-seeded generator before it is returned. This procedure is referred to as PRNG .IR warm-up . Warm-up is not performed if .meta seed is a vector or random state object. In this situation, if the .meta warmup-period argument is present, it may still be required to be an integer, even though it is ignored. If warm-up is performed, but the .meta warmup-period argument is missing, then the value of the .code *random-warmup* special variable is used. Note: this variable has a default value which may be too small for some applications of pseudorandom numbers; see the Notes under .codn *random-warmup* . .coNP Function @ random-state-p .synb .mets (random-state-p << obj ) .syne .desc The .code random-state-p function returns .code t if .meta obj is a random state, otherwise it returns .codn nil . .coNP Function @ random-state-get-vec .synb .mets (random-state-get-vec <> [ random-state ]) .syne .desc The .code random-state-get-vec function converts a random state into a vector of integer values. If the .meta random-state argument, which must be a random state object, is omitted, then the value of the .code *random-state* is used. .coNP Functions @, random-fixnum @ random and @ rand .synb .mets (random-fixnum <> [ random-state ]) .mets (random < random-state << modulus ) .mets (rand < modulus <> [ random-state ]) .syne .desc All three functions produce pseudorandom numbers, which are nonnegative integers. The numbers are obtained from a WELL512a PRNG, whose state is stored in the random state object. The .code random-fixnum function produces a random fixnum integer: a reduced range integer which fits into a value that does not have to be heap-allocated.
The .code random and .code rand functions produce a value in the range [0, .metn modulus ). They differ only in the order of arguments. In the .code rand function, the random state object is the second argument and is optional. If it is omitted, the global .code *random-state* object is used. The .meta modulus argument must be a positive integer. If .meta modulus is 1, then the function returns zero without altering the state of the pseudorandom number generator. .coNP Functions @ random-float and @ random-float-incl .synb .mets (random-float <> [ random-state ]) .mets (random-float-incl <> [ random-state ]) .syne .desc The .code random-float function produces a pseudorandom floating-point value in the range [0.0, 1.0). The .code random-float-incl function produces a pseudorandom floating-point value in the range [0.0, 1.0], thus differing from .code random-float by including the 1.0 limit value. The numbers are obtained from a WELL512a PRNG, whose state is stored in the random state object given by the argument to the optional .meta random-state parameter, which defaults to the value of .codn *random-state* . Because the floating-point type does not provide a representation of every real value in the range 0.0 to 1.0, it is not possible to impose the requirement that every value shall occur with equal likelihood. Rather, these functions are intended to produce a uniform distribution of values according to the following pragmatic requirements. A subset .I S of the real values in the specified range, [0.0, 1.0) or [0.0, 1.0] is identified whose elements are representable in the floating-point type and which are uniformly spaced along the interval. Then, a random element is chosen from .I S and returned, such that every element is equally likely to be selected.
Note that these requirements do not correspond to the more mathematically ideal concept of uniformly choosing actual real numbers in the [0, 1] interval of the real number line, and then finding the closest floating-point representation. Such a requirement would mean that the boundary values 0.0 and 1.0 appear in the output half as frequently as all the interior values, because each of these two floating-point values is a representation of a range of numbers, half of which lies outside of the [0, 1] interval. .coNP Function @ random-buf .synb .mets (random-buf < size <> [ random-state ]) .syne .desc The .code random-buf function creates a .code buf object of the specified .metn size , fills it with pseudorandom bytes, and returns it. The bytes are obtained from the random state object given by the optional .meta random-state parameter, which defaults to the value of .codn *random-state* . See the section .B Buffers for a description of .code buf objects. .coNP Function @ random-sample .synb .mets (random-sample < size < seq <> [ random-state ]) .syne .desc The .code random-sample function returns a vector of .meta size randomly selected elements from the sequence .metn seq , using reservoir sampling. If the number of elements in .meta seq is equal to or smaller than .metn size , then the function returns a vector of all the elements of .meta seq in their original order. In other cases, the selected elements are not required to appear in their original order. No element of sequence .meta seq is selected more than once; duplicate values can appear in the output only if .meta seq itself contains duplicates. .SS* Time .coNP Functions @, time @ time-usec and @ time-nsec .synb .mets (time) .mets (time-usec) .mets (time-nsec) .syne .desc The .code time function returns the number of seconds that have elapsed since midnight, January 1, 1970, in the UTC timezone: a point in time called .IR "the epoch" .
The .code time-usec function returns a cons cell whose .code car field holds the seconds measured in the same way, and whose .code cdr field extends the precision by giving number of microseconds as an integer value between 0 and 999999. The .code time-nsec function is similar to .code time-usec except that the returned cons cell's .code cdr field gives a number of nanoseconds as an integer value between 0 and 999999999. Note: on hosts where obtaining nanosecond precision is not available, the .code time-nsec function obtains a microseconds value instead, and multiplies it by 1000. .coNP Functions @ time-string-local and @ time-string-utc .synb .mets (time-string-local < time << format ) .mets (time-string-utc < time << format ) .syne .desc These functions take the numeric time returned by the .code time function, and convert it to a textual representation in a flexible way, according to the contents of the .meta format string. The .code time-string-local function converts the time to the local timezone of the host system. The .code time-string-utc function produces time in UTC. The .meta format argument is a string, and follows exactly the same conventions as the format string of the C library function .codn strftime . The .meta time argument is an integer representing seconds obtained from the time function or from the .code car field of the cons returned by the .code time-usec function. .coNP Functions @ time-str-local and @ time-str-utc .synb .mets (time-str-local < format <> [ time ]) .mets (time-str-utc < format <> [ time ]) .syne .desc The functions .code time-str-local and .code time-str-utc are equivalent, respectively, to .code time-string-local and .code time-string-utc with the arguments reversed. 
Thus the following equivalences hold: .verb (time-str-local F T) <--> (time-string-local T F) (time-str-utc F T) <--> (time-string-utc T F) .brev Additionally, if no argument is supplied to the .code time parameter, its value is obtained by invoking the .code time function. .coNP Functions @ time-fields-local and @ time-fields-utc .synb .mets (time-fields-local <> [ time ]) .mets (time-fields-utc <> [ time ]) .syne .desc These functions take numeric time in the format returned by the .code time function and convert it to a list of seven fields. The .code time-fields-local function converts the time to the local timezone of the host system, whereas the .code time-fields-utc function produces time in UTC. The fields returned as a list consist of six integers, and a Boolean value. The six integers represent the year, month, day, hour, minute and second. The Boolean value indicates whether daylight savings time is in effect (always .code nil in the case of .codn time-fields-utc ). The .meta time parameter is an integer representing seconds obtained from the .code time function. If the argument is absent, the value is obtained by calling .codn time . .coNP Structure @ time .synb .mets (defstruct time nil .mets \ \ year month day hour min sec .mets \ \ wday yday .mets \ \ dst gmtoff zone) .syne .desc The .code time structure represents a time broken down into individual fields. The structure almost directly corresponds to the .code "struct tm" type in the ISO C language. There are differences. Whereas the .code "struct tm" member .code tm_year represents a year since 1900, the .code year slot of the .code time structure represents the absolute year, not relative to 1900. Furthermore, the .code month slot represents a one-based numeric month, such that 1 represents January, whereas the C member .code tm_mon uses a zero-based month. The .code dst slot is a \*(TL Boolean value. 
The slots .codn hour , .codn min , .codn sec , .code wday and .code yday correspond directly to .codn tm_hour , .codn tm_min , .codn tm_sec , .code tm_wday and .codn tm_yday . The slot .code gmtoff represents the number of seconds east of UTC, and .code zone holds a string giving the abbreviated time zone name. On platforms where the C type .code "struct tm" has fields corresponding to these slots, values for these slots are calculated and stored into them by the .code time-struct-local and .code time-struct-utc functions, and also the related .code time-local and .code time-utc methods. On platforms where the corresponding fields are not present in the C language .codn "struct tm" , these slots are unaffected by those functions, retaining the default initial value .code nil or a previously stored value, if applicable. Lastly, the values of .code gmtoff and .code zone are not ignored by functions which accept a .code time structure as a source of input values. .coNP Functions @ time-struct-local and @ time-struct-utc .synb .mets (time-struct-local <> [ time ]) .mets (time-struct-utc <> [ time ]) .syne .desc These functions take numeric time in the format returned by the .code time function and convert it to an instance of the .code time structure. The .code time-struct-local function converts the time to the local timezone of the host system, whereas the .code time-struct-utc function produces time in UTC. The .meta time parameter is an integer representing seconds obtained from the .code time function. If the argument is absent, the value is obtained by calling .codn time . .coNP Functions @, time-parse @ time-parse-local and @ time-parse-utc .synb .mets (time-parse < format << string ) .mets (time-parse-local < format << string ) .mets (time-parse-utc < format << string ) .syne .desc The .code time-parse function scans a time description in .meta string according to the specification given in the .meta format string.
If the scan is successful, a structure of type .code time is returned, otherwise .codn nil . The .meta format argument follows the same conventions as the POSIX C library function .codn strptime . Prior to obtaining the time from .meta format and .metn string , the returned structure is created and initialized with a time which represents time 0 ("the epoch") if interpreted in the UTC timezone as by the .code time-utc method. The .code time-parse-local and .code time-parse-utc functions return an integer time value: the same value that would be returned by the .code time-local and .code time-utc methods, respectively, when applied to the structure object returned by .codn time-parse . Thus, these equivalences hold: .verb (time-parse-local f s) <--> (time-parse f s).(time-local) (time-parse-utc f s) <--> (time-parse f s).(time-utc) .brev Note: the availability of these three functions depends on the availability of .codn strptime . Note: on some platforms, like the GNU C Library, the .code strptime function supports the parsing of numeric and symbolic time zones. The .code gmtoff slot of the structure ends up being set accordingly. If a time zone is specified in this way, the .code time-local and .code time-utc methods take the .code gmtoff slot into account, adjusting the returned time accordingly. .coNP Methods @ time-local and @ time-utc .synb .mets << time-struct .(time-local) .mets << time-struct .(time-utc) .syne .desc The .code time structure has two methods called .code time-local and .codn time-utc . The .code time-local function considers the slots of the .code time structure instance .meta time-struct to be local time, and returns its integer representation as the number of seconds since the epoch. The .code time-utc function is similar, except it considers the slots of .meta time-struct to be in the UTC time zone. Note: these functions work by converting the slots into arguments to which .code make-time or .code make-time-utc is applied.
Note: if the .code gmtoff slot is not .codn nil , its value is subtracted from the returned result. .coNP Method @ time-string .synb .mets << time-struct .(time-string << format ) .syne .desc The .code time structure has a method called .codn time-string . This method accepts a .meta format string argument, which it uses to convert the fields to a character string representation which is returned. The .meta format argument is a string, and follows exactly the same conventions as the format string of the C library function .codn strftime . .coNP Method @ time-parse .synb .mets << time-struct .(time-parse < format << string ) .syne .desc The .code time-parse method scans a time description in .meta string according to the specification given in the .meta format string. If the scan is successful, the structure is updated with the parsed information, and the remaining unmatched portion of .meta string is returned. If all of .meta string is matched, then an empty string is returned. Slots of .meta time-struct which are originally .code nil are replaced with zero, even if these zero values are not actually parsed from .metn string . If the scan is unsuccessful, then .code nil is returned and the structure is not altered. The .meta format argument follows the same conventions as the POSIX C library function .codn strptime . Note: the .code time-parse method may be unavailable if the host system does not provide the .code strptime function. In this case, the .code time-parse static slot of the .code time struct is .codn nil . .coNP Functions @ make-time and @ make-time-utc .synb .mets (make-time < year < month < day .mets \ \ \ \ \ \ \ \ \ \ < hour < minute < second << dst-advice ) .mets (make-time-utc < year < month < day .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ < hour < minute < second << dst-advice ) .syne .desc The .code make-time function returns a time value, similar to the one returned by the .code time function. 
The .code time value is constructed not from the system clock, but from a date and time specified as arguments. The .meta year argument is a calendar year, like 2014. The .meta month argument ranges from 1 to 12. The .meta hour argument is a 24-hour time, ranging from 0 to 23. These arguments represent a local time, in the current time zone. The .meta dst-advice argument specifies whether the time is expressed in daylight savings time (DST). It takes on three possible values: .codn nil , the keyword .codn :auto , or else the symbol .codn t . Any other value has the same interpretation as .codn t . If .meta dst-advice is .codn t , then the time is assumed to be expressed in DST. If the argument is .codn nil , then the time is assumed not to be in DST. If .meta dst-advice is .codn :auto , then the function tries to determine whether DST is in effect in the current time zone for the specified date and time. The .code make-time-utc function is similar to .codn make-time , except that it treats the time as UTC rather than in the local time zone. The .meta dst-advice argument is supported by .code make-time-utc for function call compatibility with .codn make-time . It may or may not have any effect on the output (since the UTC zone by definition doesn't have daylight savings time). .SS* Data Integrity .coNP Function @ crc32-stream .synb .mets (crc32-stream < stream >> [ nbytes <> [ crc-prev ]]) .syne .desc The .code crc32-stream function calculates the CRC-32 sum over the bytes read from .metn stream , starting at the stream's current position. If the .meta nbytes argument is specified, it should be a nonnegative integer. It gives the number of bytes which should be read and included in the sum. If the argument is omitted, then bytes are read until the end of the stream. The optional .meta crc-prev argument defaults to zero. It is fully documented under the .code crc32 function. The .code crc32-stream function returns the calculated CRC-32 as a nonnegative integer.
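.TP* Example:

The following sketch illustrates the
.meta nbytes
and
.meta crc-prev
arguments. It assumes the
.code make-string-byte-input-stream
function, documented elsewhere in this manual, as a convenient source
of a four-byte stream; the variable names
.code s
and
.code part
are illustrative only. Both forms are expected to produce the same
value as
.code crc32
applied to the string
.strn ABCD .

.verb
  ;; whole stream in one call
  (let ((s (make-string-byte-input-stream "ABCD")))
    (crc32-stream s))

  ;; same stream in two reads of two bytes each;
  ;; the second call continues from the first result
  (let* ((s (make-string-byte-input-stream "ABCD"))
         (part (crc32-stream s 2)))
    (crc32-stream s 2 part))
.brev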
.coNP Function @ crc32 .synb .mets (crc32 < obj <> [ crc-prev ]) .syne .desc The .code crc32 function calculates the CRC-32 sum over .metn obj , which may be a character string or a buffer. If .meta obj is a buffer, then the sum is calculated over all of the bytes contained in that buffer, according to its current length. If .meta obj is a character string, then the sum is calculated over the bytes which constitute its UTF-8 representation. The optional .meta crc-prev argument defaults to zero. If specified, it should be a nonnegative integer in the 32-bit range. This argument is useful when a single CRC-32 must be calculated in multiple operations over several objects. The first call should specify a value of zero, or omit the argument. To continue the checksum, each subsequent call to the function should pass as the .meta crc-prev argument the CRC-32 obtained from the previous call. The .code crc32 function returns the calculated CRC-32 as a nonnegative integer. The parameters of the algorithm are as follows. The polynomial is .codn #x04C11DB7 ; the input and result are reflected; the initial value is .codn #xFFFFFFFF ; and the final value is bitwise .IR xor -ed with .codn #xFFFFFFFF . .TP* Examples: .mono ;; Single operation (crc32 "ABCD") --> 3675725989 ;; In two steps, demonstrating crc-prev argument: (crc32 "CD" (crc32 "AB")) --> 3675725989 .onom .coNP Functions @, sha1-stream @ sha256-stream and @ md5-stream .synb .mets (sha1-stream < stream >> [ nbytes <> [ buf ]]) .mets (sha256-stream < stream >> [ nbytes <> [ buf ]]) .mets (md5-stream < stream >> [ nbytes <> [ buf ]]) .syne .desc The .code sha1-stream and .code sha256-stream functions calculate, respectively, the NIST SHA-1 and SHA-256 digests over the bytes read from .metn stream , starting at the stream's current position. The .code md5-stream function calculates the MD5 digest, using the RSA Data Security, Inc. MD5 Message-Digest Algorithm.
If the .meta nbytes argument is specified, it should be a nonnegative integer. It gives the number of bytes which should be read and included in the digest. If the argument is omitted, then bytes are read until the end of the stream. If the .meta buf argument is omitted, the digest value is returned as a new buffer object. This buffer is 20 bytes long in the case of SHA-1, holding a 160-bit digest, 32 bytes long in the case of SHA-256, holding a 256-bit digest, and 16 bytes long in the case of MD5, holding a 128-bit digest. If the .meta buf argument is specified, it must be a buffer that is at least 16 bytes long in the case of MD5, at least 20 bytes long in the case of SHA-1, and at least 32 bytes long in the case of SHA-256. The hash is placed into that buffer, which is then returned. .coNP Functions @, sha1 @ sha256 and @ md5 .synb .mets (sha1 < obj <> [ buf ]) .mets (sha256 < obj <> [ buf ]) .mets (md5 < obj <> [ buf ]) .syne .desc The .code sha1 and .code sha256 functions calculate, respectively, the NIST SHA-1 and SHA-256 digests over .metn obj , which may be a character string or a buffer. Similarly, the .code md5 function calculates the MD5 digest over .metn obj , using the RSA Data Security, Inc. MD5 Message-Digest Algorithm. If .meta obj is a buffer, then the digest is calculated over all of the bytes contained in that buffer, according to its current length. If .meta obj is a character string, then the digest is calculated over the bytes which constitute its UTF-8 representation. If the .meta buf argument is omitted, the digest value is returned as a new buffer object. This buffer is 20 bytes long in the case of SHA-1, holding a 160-bit digest, 32 bytes long in the case of SHA-256, holding a 256-bit digest, and 16 bytes long in the case of MD5, holding a 128-bit digest. If the .meta buf argument is specified, it must be a buffer that is at least 16 bytes long in the case of MD5, at least 20 bytes long in the case of SHA-1, and at least 32 bytes long in the case of SHA-256. The hash is placed into that buffer, which is then returned.
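.TP* Example:

The following sketch illustrates two points from the above description,
using
.code sha256
as the representative function. It assumes the buffer literal notation
and the
.code make-buf
function, both documented elsewhere in this manual; the variable name
.code out
is illustrative only. Both forms are expected to yield
.codn t .

.verb
  ;; a string is digested via its UTF-8 bytes, so a buffer
  ;; holding those same bytes should give an equal digest
  (equal (sha256 "abc") (sha256 #b'616263'))

  ;; caller-supplied destination of at least 32 bytes:
  ;; the digest is stored there and that buffer is returned
  (let ((out (make-buf 32)))
    (eq out (sha256 "abc" out)))
.brev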
.coNP Functions @, sha1-begin @ sha1-hash and @ sha1-end .synb .mets (sha1-begin) .mets (sha1-hash < ctx << obj ) .mets (sha1-end < ctx <> [ buf ]) .syne .desc The three functions .codn sha1-begin , .code sha1-hash and .code sha1-end implement a stateful computation of SHA-1 digest which allows multiple input sources to contribute to the result. Furthermore, the context object may be serially reused for calculating multiple digests. The .code sha1-begin function, which takes no arguments, returns a new SHA-1 digest-producing context object. The .code sha1-hash function updates the state of the SHA-1 digest object .meta ctx by including .meta obj into the digest calculation. The .meta obj argument may be: a character or character string, whose UTF-8 representation is digested; a buffer object, whose contents are digested; or an integer, representing a byte value in the range 0 to 255 included in the digest. The .code sha1-hash function may be called multiple times to include any mixture of strings and buffers into the digest calculation. The .code sha1-end function finalizes the digest calculation and returns the digest in a buffer. If the .meta buf argument is omitted, then a new 20-byte buffer is created for this purpose. Otherwise, .meta buf must specify a .code buf object that is at least 20 bytes long. The digest is stored into this buffer, and the buffer is returned. The .code sha1-end function additionally resets the .meta ctx object into the initial state of a newly created context object, so that it may be used for another digest session. .coNP Functions @, sha256-begin @ sha256-hash and @ sha256-end .synb .mets (sha256-begin) .mets (sha256-hash < ctx << obj ) .mets (sha256-end < ctx <> [ buf ]) .syne .desc The three functions .codn sha256-begin , .code sha256-hash and .code sha256-end implement a stateful computation of SHA-256 digest which allows multiple input sources to contribute to the result.
Furthermore, the context object may be serially reused for calculating multiple digests. The .code sha256-begin function, which takes no arguments, returns a new SHA-256 digest-producing context object. The .code sha256-hash function updates the state of the SHA-256 digest object .meta ctx by including .meta obj into the digest calculation. The .meta obj argument may be: a character or character string, whose UTF-8 representation is digested; a buffer object, whose contents are digested; or an integer, representing a byte value in the range 0 to 255 included in the digest. The .code sha256-hash function may be called multiple times to include any mixture of strings and buffers into the digest calculation. The .code sha256-end function finalizes the digest calculation and returns the digest in a buffer. If the .meta buf argument is omitted, then a new 32-byte buffer is created for this purpose. Otherwise, .meta buf must specify a .code buf object that is at least 32 bytes long. The digest is stored into this buffer, and the buffer is returned. The .code sha256-end function additionally resets the .meta ctx object into the initial state of a newly created context object, so that it may be used for another digest session. .coNP Functions @, md5-begin @ md5-hash and @ md5-end .synb .mets (md5-begin) .mets (md5-hash < ctx << obj ) .mets (md5-end < ctx <> [ buf ]) .syne .desc The three functions .codn md5-begin , .code md5-hash and .code md5-end implement a stateful computation of MD5 digest which allows multiple input sources to contribute to the result. Furthermore, the context object may be serially reused for calculating multiple digests. The .code md5-begin function, which takes no arguments, returns a new MD5 digest-producing context object. The .code md5-hash function updates the state of the MD5 digest object .meta ctx by including .meta obj into the digest calculation.
The .meta obj argument may be: a character or character string, whose UTF-8 representation is digested; a buffer object, whose contents are digested; or an integer, representing a byte value in the range 0 to 255 included in the digest. The .code md5-hash function may be called multiple times to include any mixture of strings and buffers into the digest calculation. The .code md5-end function finalizes the digest calculation and returns the digest in a buffer. If the .meta buf argument is omitted, then a new 16-byte buffer is created for this purpose. Otherwise, .meta buf must specify a .code buf object that is at least 16 bytes long. The digest is stored into this buffer, and the buffer is returned. The .code md5-end function additionally resets the .meta ctx object into the initial state of a newly created context object, so that it may be used for another digest session. .SS* The Awk Utility The \*(TL library provides a macro called .code awk which is inspired by the Unix utility Awk. The macro implements a processing paradigm similar to that of the utility: it scans one or more input streams, which are divided into records and fields, under the control of user-settable regular-expression-based delimiters. The records and fields are matched against a sequence of programmer-defined conditions (called "patterns" in the original Awk), which have associated actions. As in Awk, the default action is to print the current record. Unlike Awk, the .code awk macro is a robust, self-contained language feature which can be used anywhere a \*(TL expression is called for, cleanly nests with itself and can produce a return value when done. By contrast, a function in the Awk language, or an action body, cannot instantiate a local Awk processing machine. The .code awk macro implements some of the most important Awk conventions and semantics, in Lisp syntax, while eschewing others.
It does not implement the Awk convention that variables become defined upon first mention; variables must be defined to be used. It doesn't implement Awk's weak type system. A character string which looks like a number isn't a number, and an empty string or undefined variable doesn't serve as zero in arithmetic expressions enclosed in the macro. All expression evaluation within .code awk is the usual \*(TL evaluation. The .code awk macro also does not provide a library of functions corresponding to those in the Awk library, nor does it provide counterparts for various global variables in Awk such as the .code ENVIRON and .code PROCINFO arrays, or .code RSTART and .codn RLENGTH . Such features of Awk are extraneous to its central paradigm. .coNP Macro @ awk .synb .mets (awk >> {( condition << action *)}*) .syne .desc The .code awk macro processes one or more input sources, which may be streams or files. Each input source is scanned into records, and each record is broken into fields. For each record, the sequence of condition-action clauses (except for certain special clauses) is processed. Every .meta condition is evaluated, and if it yields true, the corresponding .metn action s are evaluated. The .meta condition and .meta action forms are understood to be in a scope in which certain local identifiers exist in the variable namespace as well as in the function namespace. These are called .I "awk functions" and .IR "awk macros" . If .meta condition is one of the following keyword symbols, then it is a special clause, with special semantics: .codn :name , .codn :let , .codn :inputs , .codn :output , .codn :begin , .codn :set , .codn :end , .codn :begin-file , .code :set-file and .codn :end-file . These clause types are explained below. In such a clause, the .meta action expressions are not necessarily forms to be evaluated; the treatment of these expressions depends on the clause.
Otherwise, if .meta condition is not one of the above keyword symbols, the clause is an ordinary condition-action clause, and .meta condition is a \*(TL expression, evaluated to determine a Boolean value which controls whether the .meta action forms are evaluated. In every ordinary condition-action clause which contains no .meta action forms, the .code awk macro substitutes the single action equivalent to the form .codn "(prn)" : a call to the local .code awk function .codn prn . The behavior of this function, when called with no arguments, as above, is to print the current record (contents of the variable .codn rec ) followed by the output record terminator from the variable .codn ors . While the processing loop in .code awk scans an input source, it also binds the special variable .code *stdin* to the open stream associated with that source. This binding is in effect across all ordinary clauses, as well as across the special clauses .code :begin-file and .codn :end-file . The following is a description of the special clauses: .RS .meIP (:name << obj ) The .code :name clause establishes the name of the implicit block contained within the expansion of the .code awk macro to be the object .metn obj , usually a symbol. Forms enclosed in the macro can use .code return-from to abandon the .code awk form, specifying the same object as the argument. If the .code :name form is omitted, the implicit block is named .codn awk . It is an error for two or more .code :name forms to appear. Note: in \*(TX 255 and older, the .code :name clause must have an argument which is a symbol. The symbol .code nil is not permitted. .meIP (:let >> { sym | >> ( sym << init-form )}*) Regardless of what order they appear in relation to other clauses in the same .code awk macro, .code :let clauses are evaluated first before the macro takes any other action. The argument forms of this clause are variables or variable-init forms.
They are treated the same way as analogous forms in the .code let* special form. Note that these are not enclosed in an extra list as they are in that form. The bindings established by the .code :let clause have a scope which extends over all the other clauses in the .code awk macro. If multiple .code :let clauses are present, they are effectively consolidated into a single clause, in the order they appear. Note that the lexical variables, functions and macros established by the .code awk macro (called, respectively, .IR "awk variables" , .I "awk functions" and .IR "awk macros" ) are in an inner scope relative to .code :let bindings. For instance if .code :let creates a binding for a variable called .codn fs , that variable will be visible only to subsequent forms appearing in the same .code :let clause or later .code :let clauses, and also visible in .code :inputs and .code :output clauses. In .codn :begin , .codn :set , .codn :end , and ordinary clauses, it will be shadowed by the .code awk variable .codn fs , which holds the field-separator regular expression or string. .meIP (:fun >> {( name < param-list << function-body-form *)}*) The .code :fun clause introduces named functions which are visible inside the .code awk form, as if bound by a .code labels operator. Variables defined by .code :let are visible to these named functions. The reverse is not true: the functions are not visible to the .metn init-form s of the .code :let clause. This is regardless of the order of appearance of the .code :let and .code :fun clauses in the .code awk macro. Furthermore, functions defined by .code :fun may refer to awk macros, functions and variables. .meIP (:inputs << source-form *) The .code :inputs clause is evaluated by the .code awk macro after processing the .code :let clauses. Each .meta source-form is evaluated and the values of these forms are gathered into a list. This list then comprises the list of input sources for the .code awk processing task.
Each input source must be one of three kinds of objects. It may be a stream object, which must be capable of character input. It may be a list of strings, which .code awk will convert to an input stream as if by the .code make-strlist-input-stream function. Or else it must be a character string, which denotes a filesystem pathname which .code awk will open for reading. If the .code :inputs clause is omitted, then a defaulting behavior occurs for obtaining the list of input sources. If the special variable .code *args* isn't the empty list, then .code *args* is taken as the input sources. Otherwise, the .code *stdin* stream is taken as the one and only input source. If the .code awk macro uses .code *args* via the above defaulting behavior, it copies .code *args* and sets that variable to .codn nil . This is done in order that if .code awk is used from the \*(TX command line, for example using the .code -e command-line option, after .code awk terminates, \*(TX will not try to open the next argument as a script file or treat it as an option. Note: programs which want .code awk not to modify .code *args* can explicitly specify .code *args* as the argument to the .code :inputs keyword, rather than allow .code *args* to be used through the defaulting behavior. Only the defaulting behavior consumes the arguments by overwriting .code *args* with .codn nil . It is an error to specify more than one .code :inputs clause. .meIP (:output << output-form ) The .code :output clause is processed just after the .code :inputs clause. It must have exactly one argument, which is an expression that evaluates to a string, or else to an output stream. If it evaluates to a string, then that string is used as the name of a file to open for writing, and the resulting stream is taken in place of that string. The .code :output clause, if present, has the effect of creating a local binding for the .code *stdout* special variable. 
This new value of .code *stdout* is visible to all forms within the macro. If a .code :let clause is present, it establishes bindings in a scope which is nested within the scope established by .codn :output . Therefore, .metn init-form s in the .code :let may refer to the new value of .code *stdout* established by .codn :output . Furthermore, .code :let can rebind .codn *stdout* , causing the definition provided by .code :output to be shadowed. In the case when the .code :output argument is a string such that a new stream is opened on the file, the .code awk macro will close that stream when it finishes executing. Moreover, that stream is treated uniformly as a member of the set of streams that are implicitly managed by the redirection macros in the same .code awk macro invocation. In brief, the implication is that if .code :output creates a stream for the file pathname .str "out.txt" and somewhere in the same .code awk macro, there is a redirection of the form, or equivalent to .mono (-> "out.txt") .onom then this redirection shall refer to the same stream that was established by .codn :output . Note also that in this example situation, the expression .mono (-> "out.txt" :close) .onom has the effect of closing the .code :output stream. .meIP (:begin << form *) All .code :begin clauses are processed in the order in which they appear, before input processing begins. Each .meta form is evaluated. These forms have in their scope the local .code awk variables and macros. .meIP (:set >> { place << new-value }*) The .code :set clause provides a shorthand which allows the frequently occurring pattern .code "(:begin (set ...))" to be condensed to .codn "(:set ...)" . .meIP (:end << form *) All .code :end clauses are processed, in the order in which they appear, when the input processing loop terminates.
This termination occurs when all records from all input sources are either processed or skipped, or else by an explicit termination via a dynamic nonlocal transfer, such as .codn return-from , or the throwing of an exception. Upon termination, the end clauses are processed in the order they appear. Each .meta form is evaluated, left to right. In the normal termination case, the value of the last .meta form of the last end clause appears as the return value of the .code awk macro. Note that only termination of the .code awk macro initiated from condition-action clauses, .code :begin-file clauses, or .code :end-file clauses triggers .code :end clause processing. If termination of the .code awk macro is initiated from within a .codn :let , .codn :inputs , .code :output or .code :begin clause, then end clauses are not processed. If an .code :end clause performs a nonlocal transfer, the remaining .code :end forms in that clause and .code :end clauses which follow are not evaluated. .meIP (:begin-file << form *) All .code :begin-file clauses are processed in the order in which they appear, before .code awk switches to each new input. If both .code :begin and .code :begin-file forms are specified, then before the first input is processed, .code :begin clauses are processed first, then the .code :begin-file clauses. .meIP (:set-file >> { place << new-value }*) The .code :set-file clause is a shorthand which translates .code "(:set-file ...)" to .codn "(:begin-file (set ...))" . .meIP (:end-file << form *) All .code :end-file clauses are processed after the processing of an input source finishes. If both .code :end and .code :end-file forms are specified, then after the last input is processed, .code :end-file clauses are processed first, then the .code :end clauses.
The .code :end-file clauses are processed unconditionally, no matter how the processing of an input source terminates, whether terminated naturally by running out of records, prematurely by invocation of the .code next-file macro, or via a dynamic nonlocal control transfer such as a block return or exception throw. If a .code :begin-file clause performs a nonlocal transfer, .code :end-file processing is not triggered, because the processing of the input source is deemed not to have taken place. .meIP (:fields >> { sym | >> ( sym <> [ fun ]) | -}*) The .code :fields clause may be specified in order to give symbolic names to fields, and optionally specify conversions for them. Every argument must be one of three expressions. It may be a bindable symbol other than .code - (minus). It may be a list whose first element is a symbol other than .code - optionally followed by the name of a function. Or else it may be the .code - symbol, which has a special meaning. Symbols other than .code - may not be repeated, and the .code :fields clause may appear at most once in a given instance of the .code awk macro. Each argument is understood to correspond to a field expression for a successive field, starting with the leftmost .meta sym corresponding with the first field, .codn "[f 0]" . Each .meta sym other than .code - becomes the name of a symbol macro which denotes its corresponding field expression, expanded over the scope of the .code awk macro. The .code - symbol is a placeholder which doesn't bind a symbol macro to the corresponding field. Additionally, every two-element entry which associates the field symbol .meta sym with a function name .meta fun specifies a field conversion. After each record is read and divided into fields, those fields for which .meta fun is specified are updated by passing their value to this function and replacing them by the returned value.
The .meta fun symbol may also be one of the short-hand symbols available in the .code fconv macro, such as .codn i , .code x and others. If at least one such conversion is specified in a .code :fields clause, then the value of .code rec is updated from the converted fields in the usual manner, as if the fields had been assigned. Furthermore, it is ensured that every field for which a .code :fields clause specifies a conversion exists. Fields with an empty string value are automatically added so that a field exists for the rightmost conversion, and the value of .code nf is updated to include these fields. .meIP >> ( condition << action *) Clauses which do not have one of the specially recognized keywords in the first position are ordinary condition-action clauses. After processing the .code :begin clauses, .code awk enters a loop in which it extracts successive records from the input sources according to the .code rs (record separator) variable. Each record is divided into fields according to the .code fs (field separator) variable, and various .code awk variables are updated. Then, the condition-action clauses are processed, in the order in which they appear. Each .meta condition is evaluated. If the resulting value is a regular expression or a function, then this regular expression or function is invoked on the value stored in the record variable .codn rec , and the result is taken to be the truth value of .metn condition . Otherwise, if the resulting value of .meta condition is other than a function or regular expression, it is taken directly to be the truth value. If the condition is true, then its associated .meta action forms are evaluated. These forms have access to the truth value via the .code res variable, which is freshly bound for each execution of the .meta action forms of that specific clause. 
For each input record, all condition-action clauses are processed in the order they appear, regardless of which of them have a true condition, except in cases when some .meta action invokes the .code next or .code next-file macro, or abandons the execution of the .code awk macro entirely via a non-local exit. When an input source runs out of records, .code awk switches to the next input source. When there are no more input sources, the macro terminates. .RE .coNP Variables @ rec and @ orec .desc The .code awk variable .code rec holds the current record. It is automatically updated prior to the processing of the condition-action clauses. Prior to the extraction of the first record, its value is .codn nil . It is possible to assign to .codn rec . The value assigned to .code rec must be a character string. Immediately upon the assignment, the character string is delimited into fields according to the field separator .code awk variable .codn fs , and these fields are assigned to the field list .codn f . At the same time, the .code nf variable is updated to reflect the new number of fields. Likewise, modification of these variables causes .code rec to be reconstructed by a catenation of the textual representation of the fields in .code f separated by copies of the output field separator .codn ofs . The .code orec variable ("original record") also holds the current record. It is automatically updated prior to the processing of the condition-action clauses at the same time as .code rec with the same contents. Like .codn rec , it is initially .code nil before the first record is read. The .code orec variable is unaffected by modification of the variables .codn rec , .code f and .codn nf . It may be assigned. Doing so has no effect on any other variable. .coNP Variable @ f .desc The .code awk variable .code f holds the list of fields. Prior to the first record being read, its value is .codn nil .
Whenever a new record is read, it is divided into fields according to the field separator variable .codn fs , and these fields are stored in .code f as a list of character strings. If the variable .code f is assigned, the new value must be a sequence. The variable .code nf is automatically updated to reflect the length of this sequence. Furthermore, the .code rec variable is updated by catenating a string representation of the elements of this sequence, separated by the contents of the .code ofs (output field separator) .code awk variable. Note that assigning to a DWIM bracket form which indexes .codn f , such as for instance .code "[f 0]" constitutes an implicit modification of .codn f , and triggers the recalculation of .codn rec . Modifications of the .code f list which do not involve an implicit or explicit assignment to the variable .code f itself do not have this recalculating effect. Unlike in Awk, assigning to the nonexistent field .mono .meti [f << m ] .onom where .meta m >= .code nf is erroneous. .coNP Variable @ nf .desc The .code awk variable .code nf holds the current number of fields in the sequence .codn f . Prior to the first record being read, it is initially zero. If .code nf is assigned, then .code f is modified to reflect the new number of fields. Fields are deleted from .code f if the new value of .code nf is smaller. If the new value of .code nf is larger, then fields are added. The added fields are empty strings, which means that .code f must be a sequence of a type capable of holding elements which are strings. If .code nf is assigned, then .code rec is also recalculated, in the same way as described in the documentation for the .code f variable. .coNP Variable @ nr .desc The .code awk variable .code nr holds the current absolute record number. Record numbers start at 1. Absolute means that this value does not reset to 1 when .code awk switches to a new input source; it keeps incrementing for each record. See the .code fnr variable. 
Prior to the first record being read, the value of .code nr is zero. .coNP Variable @ fnr .desc The .code awk variable .code fnr holds the current record number within the file. The first record is 1. Prior to the first record being read from the first input source, the value of .code fnr is zero. Thereafter, it resets to 1 for the first record of each input source and increments for the remaining records of the same input source. .coNP Variable @ arg .desc The .code awk variable .code arg is an integer which indicates what input source is being processed. Prior to input processing, it holds the value zero. When the first record is extracted from the first input source, it is set to 1. Thereafter, it is incremented whenever .code awk switches to a new input source. .coNP Variable @ fname .desc The .code awk variable .code fname provides access to a character string which, if the current input is a file stream, is the name of the underlying file. Assigning to this variable changes its value, but has no effect on the input stream. Whenever a new input source is used by .codn awk , this variable is set from the file name on which it is opening a stream. When using an existing stream rather than opening a file, .code awk sets this variable from the .code :name property of the stream. Note that the redirection macros .code <- and .code (set f [(opip a b c ...) f]) .brev .TP* Example: .verb ;; convert all fields from string to floating-point (ff (mapcar flo-str)) .brev .coNP Macro @ mf .synb .mets (mf < opip-arg *) .syne .desc The .code awk macro .code mf (map fields) provides a shorthand for mapping each field individually through a pipeline of chained functions expressed using .code opip argument syntax. The following equivalence holds, except that .code f refers to the .code awk variable even if the .code mf invocation occurs in code which establishes a binding which shadows .codn f . .verb (mf a b c ...) <--> (set f (mapcar (opip a b c ...)
f)) .brev .TP* Example: .verb ;; convert all fields from string to floating-point (mf flo-str) .brev .coNP Macro @ fconv .synb .mets (fconv >> { clause | : | - }*) .syne .desc The .code awk macro .code fconv provides a succinct way to request conversions of the textual fields. Conversions are expressed by clauses which correspond with fields. Each .meta clause is an expression which must evaluate to a function. The clause is evaluated in the same manner as an argument of a .code dwim operator, using Lisp-1-style name lookup. Thus, functions may be specified simply by using their name as a .metn clause . Furthermore, several local functions exist in the scope of each .metn clause , providing a shorthand notation. These are described below. Conversion proceeds by applying the function produced by a clause to the field to which that clause corresponds, positionally. The return value of the function applied to the field replaces the field. When a clause is specified as the symbol .code - (minus) it has a special meaning: this minus clause occupies a field position and corresponds to a field, but performs no conversion on its field. The .code : (colon) keyword symbol isn't a clause and does not correspond to a field position. Rather, it acts as a separator among clauses. It need not appear at all. If it appears, it may appear at most twice. Thus, the clauses may be separated into up to three sequences. If the colon does not appear, then all the clauses are .IR "prefix clauses" . Prefix clauses line up with fields from left to right. If there are fewer fields than prefix clauses, the excess clauses are evaluated, but their values are ignored. Vice versa, if there are fewer prefix clauses than fields, then the excess fields are not subject to conversions. If the colon appears once, then the clauses before the colon, if any, are prefix clauses, as described in the previous paragraph. Clauses after the colon, if any, are .IR "interior clauses" .
Interior clauses apply to any fields which are left unconverted by the prefix clauses. All interior clauses are evaluated. If there are fewer fields than interior clauses, then the values of the excess interior clauses are ignored. If there are more fields than clauses, then the clause values are cycled: reused from the beginning against the excess fields, enough times to convert all the fields. If the colon appears twice, then the clauses before the first colon, if any, are prefix clauses, the clauses between the two colons are interior clauses, and those after the second colon are .IR "suffix clauses" . The presence of suffix clauses changes the behavior relative to the one-colon case as follows. After the conversions are performed according to the prefix clauses, the remaining fields are counted. If there are only as many fields as there are suffix clauses, or fewer, then the interior clauses are evaluated, but ignored. The remaining fields are processed against the suffix clauses. If after processing the prefix clauses there are more fields remaining than suffix clauses, then a number of rightmost fields equal to the number of suffix clauses is reserved for those clauses. The interior clauses are applied only to the unreserved middle fields which precede these reserved rightmost fields, using the same repeating behavior as in the one-colon case. Finally, the previously reserved rightmost fields are processed using the suffix clauses. The following special convenience functions are in scope of the clauses, effectively providing a shorthand for commonly-needed conversions: .RS .coIP i Provides conversion to integer. It is identical to the .code toint function, with the default radix. .coIP o Converts a string value holding an octal representation to the integer which it denotes. It is equivalent to .code toint with a .meta radix argument of 8. .coIP x Converts a string value holding a hexadecimal representation to the integer which it denotes.
It is equivalent to .code toint with a .meta radix argument of 16. .coIP b Converts a string value holding a binary (base two) representation to the integer which it denotes. It is equivalent to .code toint with a .meta radix argument of 2. .coIP c Converts a string value holding a C-language-style representation to the integer which it denotes, meaning that the .code 0x prefix denotes a hexadecimal value, a leading zero octal, otherwise decimal. These prefixes follow the .code + or .code - sign, if present. The .code c function is equivalent to .code toint invoked with a .meta radix argument of .codn #\ec . .coIP r Converts a string holding a floating-point representation to the floating-point value which it denotes. It is equivalent to .codn tofloat . .ccIP @, iz @, oz @, xz @, bz @ cz and @ rz Conversion similar to .codn i , .codn o , .codn x , .codn b , .code c and .codn r , but equivalent to using the functions .code tointz and .codn tofloatz . Thus fields which are non-numeric strings or the object .code nil get converted to 0, or 0.0 in the case of .codn rz . .coIP - Performs no conversion: the corresponding field is taken as-is. .RE .IP Because the .code fconv macro destructively operates on the elements of the field list .codn f , it has the same effect as an assignment to the fields: the value of .code rec is updated. The return value of .code fconv is .codn f . Note: because .code f is .code nil when no fields have been extracted, a .code fconv expression can be used as the condition in an .code awk clause which triggers the action if one or more fields have been extracted, and performs conversions on them. Note: although .code fconv is intended for converting textual fields, and the semantic descriptions above consequently make references to string inputs, the behavior of .code fconv with respect to non-string fields can be inferred.
For instance, if a field actually holds the floating-point value 3.14, and the .code i conversion is applied to it, it will produce 3, because it works by means of the .code toint function. Note: a somewhat less flexible mechanism for converting fields, related to .codn fconv , is present in the .code :fields clause of the .code awk macro, which can specify names for the positional fields, along with conversion functions. The .code :fields clause has different syntax, and doesn't support the .code : (colon) separator, instead assuming a fixed number of fields enumerated from the left. .TP* Examples: .verb ;; convert up to first three fields to integer: (awk ((fconv i i i))) ;; convert all fields to floating-point (awk ((fconv : r :))) ;; convert first and second fields to integer ;; from hexadecimal; ;; convert last field to integer from octal; ;; process pairs of fields in between ;; these by leaving the first element of ;; each pair unconverted and converting second ;; to floating-point; (awk ((fconv x x : - r : o))) ;; convert all fields, except the first, ;; to integer, turning empty strings ;; and non-integer junk into zero; ;; leave first field unconverted: (awk ((fconv - : iz))) .brev .coNP Macros @, -> @, ->> @, <- @ !> and @ < path << form *) .mets (->> < path << form *) .mets (<- < path << form *) .mets (!> < command << form *) .mets ( , .code ->> and .code !> evaluate each .meta form in a dynamic environment in which the .code *stdout* variable is bound to a file output stream, for the first two functions, or output command pipe in the case of the last one. Similarly, when at least one .meta form argument is present, the remaining functions .code <- and .code macro indicates that the file named .meta path is to be opened for writing and overwritten, or created if it doesn't exist. The .code ->> macro indicates that the file named by .meta path is to be opened in append mode, created if necessary.
The .code <- macro indicates that the file given by .meta path is to be opened for reading. The .code !> macro indicates that .meta command is to be opened as an output command pipe. The .code [f 2] 5)))) ;; print strictly original lines from orec (awk ((and [f 2] (fconv - - iz) (> [f 2] 5)) (prn orec))) .brev .IP 2. Print every tenth line: .verb (awk ((zerop (mod nr 10)))) .brev .IP 3. Print any line with a substring matching a regex: .verb (awk (#/(G|D)(2\ed[\ew]*)/)) .brev Note the subtle flaw here: the .code [\ew]* portion of the regular expression contributes nothing to what lines are matched. The following example has a similar flaw. .IP 4. Print any line with a substring beginning with a .code G or .code D followed by a sequence of digits and characters: .verb (awk (#/(G|D)([\ed\ew]*)/)) .brev .IP 5. Print lines where the second field matches a regex, while the fourth one doesn't: .verb (awk (:let (r #/xyz/)) ((and [f 3] [r [f 1]] (not [r [f 3]])))) .brev .IP 6. Print lines containing a backslash in the second field: .verb (awk ((find #\e\e [f 1]))) .brev .IP 7. Print lines containing a backslash using a regex constructed from a string. Note that backslash escapes are interpreted twice: once in the string literal, and once in the parsing of the regex, requiring four backslashes to encode one: .verb (awk (:let (r (regex-compile "\e\e\e\e"))) ((and [f 1] [r [f 1]]))) .brev .IP 8. Print penultimate and ultimate field in each record, separating them by a colon: .verb ;; original: {OFS=":";print $(NF-1), $NF} ;; (awk (t (set ofs ":") (prn [f -2] [f -1]))) .brev .IP Note that the above behaves more correctly than the original Awk example because when there is only one field, .code $(NF-1) reduces to .code $0 which refers to the entire record, not to the field. This sort of bug is why the \*(TL .code awk does not imitate the design decision to make the record the first numbered field. .IP 9.
Output the line number and number of fields separated by colon, by producing a single string first: .verb (awk (t (prn `@nr:@nf`))) .brev .IP 10. Print lines longer than 72 characters: .verb (awk ((> (len rec) 72))) .brev .IP 11. Print first two fields in reverse order, separated by .codn ofs : .verb (awk (t (prn [f 1] [f 0]))) .brev .IP 12. Same as 11, but with field separation consisting of a comma, or spaces and tabs, or both in sequence: .verb (awk (:set fs #/,[ \et]*|[ \et]+/) (t (prn [f 1] [f 0]))) .brev .IP 13. Add the values in the first column, then print sum and average: .verb ;; original: ;; {s += $1} ;; END {print "sum is ", s, " average is", s/NR} ;; (awk (:let (s 0) (n 0)) ([f 0] (fconv r) (inc s [f 0]) (inc n)) (:end (prn `sum is @s average is @(/ s n)`))) .brev Note that the original is not robust against blank lines in the input. Blank lines are treated as if they had a first column field of zero, and are counted toward the denominator in the calculation of the average. .IP 14. Print fields in reverse order, one per line: .verb (awk (t (tprint (reverse f)))) .brev .IP 15. Print all lines between occurrences of .code start and .codn stop : .verb (awk ((rng #/start/ #/stop/))) .brev .IP 16. Print lines whose first field is different from the corresponding field in the previous line: .verb (awk (:let prev) ((nequal [f 0] prev) (prn) (set prev [f 0]))) .brev .IP 17. Simulate the .code echo utility: .verb (awk (:begin (prn `@{*args* " "}`))) .brev Note: if this is evaluated in the command line, for instance with the .code -e option, an explicit exit is required to prevent the arguments from being processed by \*(TX after .code awk completes: .verb (awk (:begin (prn `@{*args* " "}`) (exit 0))) .brev .IP 18. 
Print the components of the .code PATH environment variable, one per line: .verb ;; Process variable as if it were a file: (awk (:inputs (make-string-input-stream (getenv "PATH"))) (:set fs ":") (t (tprint f))) ;; Just get, split and print; awk macro is irrelevant (awk (:begin (tprint (split-str (getenv "PATH") ":")))) .brev .IP 19. Given a file called .code input which contains page headers of the format .str "Page #" and a \*(TL file called .code prog.tl which contains: .verb (awk (:let (n (toint n))) (#/Page/ (set [f 1] (pinc n))) (t)) .brev the command line: .verb txr -Dn=5 prog.tl input .brev prints the file, filling in page numbers starting at 5. .RE .SS* Environment Variables and Command Line Note that environment variable names, their values, and command-line arguments are all regarded as being externally encoded in UTF-8. \*(TX performs the encoding and decoding automatically. .coNP Special Variables @, *args-full* @ *args-eff* and @ *args* .desc The .code *args-full* variable holds the original, complete list of arguments passed from the operating system, including the program executable name. During command-line-option processing, \*(TX may transform the argument list. The hash-bang mechanism, and the .code --args and .code --eargs options can inject new command-line arguments, as can code which is executed during argument processing via the .code -e options and others. The .code *args-eff* variable holds the list of .IR "effective arguments" , which is the argument list after these transformations are applied. This variable is established and set to the same value as .code *args-full* prior to command-line processing, but is not updated with its final value until after command-line processing. The .code *args* variable holds a list of strings representing the remaining arguments which follow any options processed by the \*(TX executable, and the script name. This list is a suffix of .codn *args-eff* . 
Thus, the arguments before .code *args* can be calculated using the expression .codn "(ldiff *args-eff* *args*)" . The .code *args* variable is available to \*(TL expressions invoked from the command line via the .codn -p , .code -e and other such options. During these evaluations, .code *args* holds all the remaining options, after the invoking option and its argument expression. In other words, code executed from the command line has access to the remaining arguments which follow it. Furthermore, this code may modify the value of .codn *args* . Such a modification is visible to the option processing code. That is to say code executed from the command line can rewrite the remaining list of arguments, and that list takes effect. .coNP Function @ env .synb .mets (env) .syne .desc The .code env function retrieves the list of environment variables. Each variable is represented by a single entry in the list: a string which contains an .code = (equal) character somewhere, separating the variable name from its value. Multiple calls to .code env may return the same list, or lists which share structure. If a list returned by .code env is modified, the behavior is unspecified. See also: the .code env-hash function. .coNP Function @ env-hash .synb .mets (env-hash) .syne .desc The .code env-hash function returns an .code :equal-based hash whose keys and values are strings. The hash table is populated with the environment variables, represented as key-value character string pairs. The .code env-hash function allocates the hash table when it is first invoked; thereafter, it returns the same hash table. The hash table is updated by the functions .codn setenv , .code unsetenv and .codn getenv . Note: calls to the underlying C library functions .code setenv and .codn getenv , and other direct manipulations of the environment, will not update the hash table. 
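
Note: the following is a minimal sketch of the relationship between
.code setenv
and the hash table returned by
.codn env-hash ,
assuming a platform on which
.code setenv
is available. The variable name
.code EXAMPLE_VAR
is hypothetical. Since the hash table is updated by
.codn setenv ,
the lookup at the end is expected to see the newly assigned value.

.verb
  ;; EXAMPLE_VAR is a made-up name used only for illustration
  (let ((h (env-hash)))
    (setenv "EXAMPLE_VAR" "abc")
    ;; index the same hash object using the variable name as the key
    [h "EXAMPLE_VAR"])
.brev
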
.coNP Functions @, getenv @ setenv and @ unsetenv .synb .mets (getenv << name ) .mets (setenv < name < value <> [ overwrite-p ]) .mets (unsetenv << name ) .syne .desc These functions provide access to, as well as manipulation of, environment variables. Of these three, .code setenv and .code unsetenv might not be available on some platforms, or .code unsetenv might be present in a simulated form which sets the variable .meta name to the empty string rather than deleting it. The .code getenv function searches the environment for the environment variable whose name is .metn name . If the variable is found, its value is returned. Otherwise .code nil is returned. The .code setenv function creates or modifies the environment variable indicated by .metn name . The .meta value string argument specifies the new value for the variable. If .meta value is .codn nil , then .code setenv behaves like .codn unsetenv , except that it observes the .meta overwrite-p argument. That is to say, the meaning of a null .meta value is that the variable is to be removed. If the .meta overwrite-p argument is specified, and is true, then the variable is overwritten if it already exists. If the argument is false, then the variable is not modified if it already exists. If the argument is not specified, it defaults to the value .codn t , effectively giving rise to a two-argument form of .code setenv which creates or overwrites environment variables. A variable removal is deemed to be an overwrite. Thus if both .meta value and .meta overwrite-p are .codn nil , then .code setenv does nothing. The .code setenv function unconditionally returns .meta value regardless of whether or not it overwrites or removes an existing variable. The .code unsetenv function removes the environment variable specified by .metn name , if it exists. On some platforms, it instead sets the environment variable to the empty string.
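
Note: the following sketch walks through the
.meta overwrite-p
and null
.meta value
rules described above. It assumes a platform which provides
.code setenv
with true removal semantics, and
.code EXAMPLE_VAR
is a hypothetical variable name chosen for illustration. The comments
restate what the rules above imply.

.verb
  (setenv "EXAMPLE_VAR" "first")       ;; two-argument form: create or overwrite
  (setenv "EXAMPLE_VAR" "second" nil)  ;; overwrite-p false: existing value kept
  (getenv "EXAMPLE_VAR")               ;; per the rules above: "first"
  (setenv "EXAMPLE_VAR" nil)           ;; null value: variable is removed
  (getenv "EXAMPLE_VAR")               ;; nil once the variable is gone
.brev
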
Note: supporting removal semantics in .code setenv allows for the following simple save/modify/restore pattern: .verb (let* ((old-val (getenv "SOME-VAR"))) (unwind-protect (progn (setenv "SOME-VAR" new-val) ...) (setenv "SOME-VAR" old-val))) .brev This works in the case when .code SOME-VAR exists, as well as in the case that it doesn't exist. In both cases, its previous value or, respectively, non-existence, is restored by the .code unwind-protect cleanup form. These functions interact with the list returned by the .code env function and with the hash table returned by the .code env-hash function as follows. A list previously returned by .code env is not modified. The .code setenv and .code unsetenv functions may cause a subsequent call to .code env to return a different list. The .code getenv function has no effect on the list. The hash table previously returned by .code env-hash is modified by .code setenv in the manner consistent with its semantics. A new entry is created in the table, if required, and an existing entry is overwritten only if the .meta overwrite-p argument is true or omitted. Likewise, if .code setenv is invoked in a way that causes the environment variable to be deleted, it is removed from the hash also. The .code unsetenv function causes the variable to be removed from the hash table also. The .code getenv function accesses the underlying environment and updates the hash table with the name-value pair which is retrieved. .coNP Function @ replace-env .synb .mets (replace-env << env-list ) .syne .desc The .code replace-env function replaces the environment with the environment variables specified in .metn env-list . The argument is a list of character strings, in the same format as the list returned by the .code env function: each element of the list describes an environment variable as a single character string in which the name is separated from the value by the .code = character.
As a special concession, if this character is missing, the .code replace-env function treats that entry as being a name with an empty value. The .code replace-env function first empties the existing environment, rendering it devoid of environment variables. Then it installs the entries specified in .metn env-list . The return value is .metn env-list . Note: .code replace-env may be used to specify an exact environment to child programs executed by functions like .codn open-process , .code sh or .codn run . Note: the previous environment may be saved by calling .code env and retaining the returned list. Then after modifying the environment, the original environment can be restored by passing that retained list to .codn replace-env . .coNP Special Variable @ *child-env* .desc The .code *child-env* variable specifies the list of environment variables established for programs executed via the functions .codn exec , .codn run , .codn sh , .code open-command and .codn open-process . The initial top-level value of this variable is the symbol .code t which indicates that .code *child-env* is to be ignored, such that the executed program inherits the current set of environment variables. If .code *child-env* has any other value, it must be a possibly empty list of environment variables, in the same format as what is returned by the .code env function and accepted by .codn replace-env . That value completely specifies the environment that executed programs shall receive. .TP* Example: .verb (let ((*child-env* '("a=b"))) ;; /usr/bin/env sees only "a" environment variable (get-lines (open-process "/usr/bin/env" "r"))) -> ("a=b") .brev .SS* Command-Line-Option Processing \*(TL provides support for recognizing, extracting and validating the POSIX-style options from a list of command-line arguments. The supported options can be defined as a list of option descriptor objects each of which is constructed by a call to the .code opt function.
Each option can have a long name, a short name, a type, and a description. The .code getopts function takes a list of option descriptors, and a list of arguments, producing a parse, or else throwing an exception of type .code opt-error if an error is detected. The returned object, an instance of struct type .codn opts , can then be queried for specific option values, or for the remaining non-option arguments. The .code opthelp function takes a list of option descriptors and an output stream, and generates help text on that stream. A program supporting a .code --help option can use this to generate that portion of its help text which describes the available options. Also provided are functions .code opthelp-conventions and .codn opthelp-types , which have the same interface as .code opthelp and print additional information. These may be used together with .code opthelp to provide more detailed help under a single .code --help option, or under separate options like .codn --extra-help . The .code define-option-struct macro provides a more streamlined, declarative mechanism built on the same facility. The options are declared in a more condensed way, and using symbols instead of strings. Furthermore, the parsed option values become slot values of an object, named by the same symbols. .NP* Command-Line-Option Conventions A command-line option can have a short or long name. A short name is always one character long, and treated specially in the command-line syntax. Long options have names two or more characters long. An option can have both a long and short name. Options may not begin with the .code - (ASCII dash) character. A long option may not contain the .code = character. Short options are invoked by specifying an argument with a single leading .code - followed by the option character. Multiple short options which take no argument can be "clumped": combined into a single argument consisting of a single .code - followed by multiple short option characters. 
An option can take an argument, in which case the argument is required. An option which takes no argument is Boolean, and a Boolean option never takes an argument: "takes no argument" and "Boolean" effectively mean the same thing. Long options are invoked as an argument which begins with a .code -- (double dash) immediately followed by the name. When a long option takes an argument, it is mandatory. It may be specified in the same argument, separated from the name by the .code = character. If that is omitted, then the next command-line argument is taken as the argument. That argument is removed, and not recognized as an option, even if it looks like one. A Boolean long option can be explicitly specified as false using the .code --no- prefix rather than the .code -- prefix. Short options may be invoked using long name syntax; if .code a is a short option, then it may be referenced on the command line as .code --a and treated as a long option in all other ways, including the use of .code --no- to explicitly specify false for a Boolean option. If a short option takes an argument, it may not clump with other short options. The following command-line argument is taken as the option's argument. That argument is removed and is not recognized as an option even if it looks like one. If the command-line argument .code -- occurs in the command line where an option would otherwise be recognized, it signifies the end of the options. The subsequent arguments are the non-option arguments, even if they resemble options. .NP* Command-Line Processing Examples The following example illustrates a complete \*(TL program which parses command-line options: .verb (defvarl options (list (opt "v" "verbose" :dec "Verbosity level. 
Higher values produce more chatter.") (opt nil "help" :bool "List this help text.") (opt "x" nil :hex "The X factor: a number with a mysterious\e \e interpretation, affecting the program\e \e behavior in strange ways.") (opt "z" nil) ;; undocumented option (opt nil "cee" :cint "C style integer.") (opt "g" "gravity" :float "Gravitational constant. This gives\e \e the gravitational field\e \e strength at the Earth's surface.") (opt "l" "lit" :str "A character string given in TXR Lisp notation.") (opt "c" nil 'upcase-str "Custom treatment: ARG is converted to uppercase.") (opt "b" "bool" :bool "A flag you can flip true."))) (defvarl prog-name *load-path*) (let ((o (getopts options *args*))) (when [o "help"] (put-line "Usage:\en") (put-line ` @{prog-name} [options] arg*`) (opthelp options) (exit 0)) (put-line `args after opts are: @{o.out-args ", "}`)) .brev The next example is equivalent to the previous, but using the .code define-option-struct macro: .verb (define-option-struct prog-opts nil (v verbose :dec "Verbosity level. Higher values produce more chatter.") (nil help :bool "List this help text.") (nil extra-help :bool "List help text with more detailed information.") (x nil :hex "The X factor: a number with a mysterious\e \e interpretation, affecting the program\e \e behavior in strange ways.") ;; undocumented Boolean: (z nil) (nil cee :cint "C style integer.") (g gravity :float "Gravitational constant. 
This gives\e \e the gravitational field\e \e strength at the Earth's surface.") (l lit :str "A character string given in TXR Lisp notation.") (c nil upcase-str "Custom treatment: ARG is converted to uppercase.") (b bool :bool "A flag you can flip true.")) (defvarl prog-name *load-path*) (let ((o (new prog-opts))) o.(getopts *args*) (when (or o.help o.extra-help) (put-line "Usage:\en") (put-line ` @{prog-name} [options] arg*`) o.(opthelp) (when o.extra-help o.(opthelp-types) o.(opthelp-conventions)) (exit -1)) (put-line `args after opts are: @{o.out-args ", "}`)) .brev .coNP Structure @ opt-desc .synb .mets (defstruct opt-desc .mets \ \ short long helptext type .mets \ \ ... < unspecified << slots ) .syne .desc The .code opt-desc structure describes a single command-line option. The .code short and .code long slots are either .code nil or else hold strings. The .code short slot gives the option's short name: a one-character-long string which may not be the ASCII dash character .codn - . The .code long slot gives the option's long name: a string two or more characters long which doesn't begin with a dash. An option must have at least one of these names. The .code helptext slot provides a descriptive string. This string may be long. The .code opthelp function displays this text, formatting into multiple lines as necessary. If .code helptext is .codn nil , the option is considered undocumented. The .code type slot may be a symbol naming a global function which takes one argument, or it may be such a function object. Otherwise it must be one of the following keyword symbols: .RS .coIP :bool This indicates that the type of the option is Boolean. Such an option doesn't take any argument. Its value is .code t or .codn nil . .coIP :dec This indicates that the option requires an argument, which is a decimal integer with an optional positive or negative sign. This argument is converted to an integer object. 
.coIP :hex This type indicates that the option requires an argument consisting of a hexadecimal integer with an optional positive or negative sign. This is converted to an integer object. .coIP :oct This type indicates that the option requires an argument consisting of an octal integer with an optional positive or negative sign. This is converted to an integer object. .coIP :cint This type indicates that the option requires an integer argument whose format conforms to one of three C language conventions in most respects, other than that this integer may have an arbitrary range. All forms may carry an optional positive or negative leading sign at the very beginning. The first convention consists of decimal digits, which must not have a superfluous leading zero. The second convention consists of octal digits which are introduced by an extra leading zero. The third convention consists of hexadecimal digits introduced by the .code 0x prefix. .coIP :float This type indicates a decimal floating-point argument, which is converted to a floating-point number. Its basic form is: an optional leading plus or minus sign, followed by a sequence of one or more digits which may contain a single decimal point anywhere, including the very beginning of the sequence or at the end, optionally followed by the letter .code e or .code E followed by a decimal integer which may have a leading positive or negative sign, and include leading zeros. .coIP :text This type indicates a simple textual argument. The argument is taken as verbatim UTF-8 text, converted to a string without interpreting the characters in any special way. .coIP :str This type indicates that the argument consists of the interior notation of a TXR Lisp character string. It is processed by adding a double quote at the beginning and end, and parsed as a string literal. This parsing must successfully yield a string object, otherwise the argument is ill-formed. 
.meIP (list << type ) If the type is specified as a compound form headed by the .code list symbol, it indicates that the command-line option's argument is a list of elements. The argument appears on the command line as a single string contained within one argument. It may contain commas, and is split into pieces using the comma character as a separator. The pieces are then individually treated as of type .meta type and converted accordingly. The option's argument is then a list object whose elements are the converted pieces. For instance .code "(list :dec)" will convert a list of comma-separated decimal integer tokens into a list of integer objects. The .meta type argument must be a basic type other than .codn :bool . The .code list option type does not nest. .meIP (cumul << type ) If the type is specified as a compound form headed by the .code cumul symbol, it indicates that if the option is specified multiple times, the values coming from the multiple occurrences are accumulated into a list. The .meta type argument must be a .code list type or a basic type other than .codn :bool , for example .code "(cumul (list :dec))" and .codn "(cumul :str)" . This type specifier does not nest: combinations such as .code "(cumul (cumul ...))" and .code "(list (cumul ...))" are invalid. The option values are accumulated in reverse order, so that the rightmost repetition becomes the first item in the list. For instance, if the .code -x option has type .codn "(cumul :dec)" , and the arguments presented for parsing are .codn "(\(dq-x\(dq \(dq1\(dq \(dq-x\(dq \(dq2\(dq)" , then the option's value will be .codn "(2 1)" . If a .codn list -typed option is cumulative, then the option value will be a list of lists. Each repetition of the option produces a list, and the lists are accumulated. .RE .IP If .code type is a function, then the option requires an argument. The argument string is passed to the function, and the value is whatever the function returns. 
The .code opt-desc structure may have additional slots which are not specified. The .code opt convenience function is provided for constructing .code opt-desc objects. .coNP Function @ opt .synb .mets (opt < short < long >> [ type <> [ helptext ]]) .syne .desc The .code opt function provides a slightly condensed syntax for constructing an object of type .codn opt-desc . The required arguments .meta short and .meta long are strings, corresponding to .code opt-desc slots of the same name. The optional parameter .meta type corresponds to the same-named slot and defaults to .codn :bool . The optional parameter .meta helptext corresponds to the same-named slot and defaults to .code nil (no help text provided for the option). The .code opt function follows this equivalence: .verb (opt a b c d) <--> (new opt-desc short a long b type c helptext d) .brev .coNP Structure @ opts .synb .mets (defstruct opts nil .mets \ \ in-args out-args .mets \ \ ... < unspecified << slots ) .syne .desc The .code opts structure represents a parsed command line, containing decoded information obtained from the options and an indication of where the non-option arguments start. The .code opts structure supports direct indexing for option retrieval. That is the only documented interface for accessing the parsed options; the implementation of the information set describing the parsed options is unspecified. The .code in-args slot holds the original argument list. The .code out-args slot holds the tail of the argument list consisting of the non-option arguments. The mechanism by means of which .code out-args is calculated, and by means of which the information about the options is populated, is unspecified. The only interface to that mechanism is the .code getopts function. The .code opts object supports indexing, including indexed assignment. 
If .code o is an instance of .code opts returned by .codn getopts , then the expression .code "[o \(dqv\(dq]" tests whether the option .str v is available in .codn o ; that is, whether it has been specified in the command line. If so, then its associated value is returned, otherwise .code nil is returned. This .code nil is ambiguous: for a Boolean option it indicates that either the option was not specified, or that it was explicitly specified as false. For a Boolean option that was specified (positively), the value .code t is returned. The expression .code "[o \(dqv\(dq dfl]" yields the value of option .str v if that option has been specified. If the option hasn't been specified, then the expression yields the value .codn dfl . Assigning to .code "[o \(dqv\(dq]" is possible. This replaces the value associated with option .strn v . The assignment is erroneous if no such option was parsed from the command line, even if it is a valid option. If an option is defined with both a long form and a short form, and either form of that option occurs in the command line being processed, then the option appears under both names in the index. For instance, if option .str --verbose has the short form .strn -v , and either option occurs, then both the keys .str "v" and .str "verbose" will exist in the .code opts structure returned by .codn getopts . Note that this behavior is different from that of the structure produced by the .code define-option-struct macro. Under that approach, if an option is defined with a long and short name, the structure will have only a single slot for that option, named after the long name. .coNP Function @ getopts .synb .mets (getopts < option-desc-list << arg-list ) .syne .desc The .code getopts function takes a list of .code opt-desc structures and a list of strings .meta arg-list representing command-line arguments. The .meta arg-list is parsed. If the parse is unsuccessful, an exception of type .code opt-error is thrown, derived from .codn error .
If there are problems in .meta option-desc-list itself, then an exception of type .code error is thrown. If the parse is successful, .code getopts returns an instance of the .code opts structure describing the parsed options and listing the non-option arguments. .coNP Functions @, opthelp @ opthelp-types and @ opthelp-conventions .synb .mets (opthelp < opt-desc-list <> [ stream ]) .mets (opthelp-types < opt-desc-list <> [ stream ]) .mets (opthelp-conventions < opt-desc-list <> [ stream ]) .syne .desc The .code opthelp function processes the list of .code opt-desc structures .meta opt-desc-list and compiles a customized body of help describing all of the options which have help text. These are presented in alphabetical order. Options which do not have help text, if any, are simply listed together under a heading which indicates their undocumented status. The text is formatted to fit within 79 columns, and begins and ends with a blank line. Its format consists of headings which begin in the first column, and paragraphs and tables which feature a two space left margin. A blank line follows each section heading. The heading begins with a capital letter. Its remaining words are uncapitalized, and it ends with a colon. The text is sent to .metn stream , if specified. This argument defaults to .codn *stdout* . If there are problems in .meta opt-desc-list itself, then an exception of type .code error is thrown. The .code opthelp-types supplementary help function processes the .metn opt-desc-list , considering only those options which are documented. If any of them have typed arguments, then a legend is printed explaining the types. The legend includes only information about those option argument types which appear in .metn opt-desc-list . The .code opthelp-conventions supplementary help function processes .metn opt-desc-list , considering only those options which are documented.
It prints a guide to the use of options, which includes information only about the kinds of options actually present in .metn opt-desc-list . .coNP Macro @ define-option-struct .synb .mets (define-option-struct < name < super << opt-specifier *) .syne .desc The .code define-option-struct macro defines a struct type, instances of which provide command-line option parsing. The .meta name and .meta super parameters are subject to the same requirements and have the same semantics as the same-named parameters of .codn defstruct . The .meta opt-specifier arguments are lists of between two and four elements: .mono .meti >> ( short-symbol < long-symbol >> [ type <> [ help-text ]]). .onom The .meta short-symbol and .meta long-symbol must be symbols suitable for use as slot names. One of them may be specified as .code nil indicating that the option has no long form, or no short form. If a .meta opt-specifier specifies both a .meta short-symbol and a .meta long-symbol then only a slot named by .meta long-symbol shall exist in the structure. The struct type defined by .code define-option-struct has four methods: .codn getopts , .codn opthelp , .code opthelp-types and .codn opthelp-conventions . It also has two slots: .code in-args and .codn out-args , which function in a manner identical to their same-named counterparts in the .code opts class. The .code getopts method takes a single argument: the argument list to be processed. When the argument list is successfully processed. The .codn opthelp , .code opthelp-types and .code opthelp-conventions methods take an optional stream argument. Note: to encode the option names .str "t" or .strn "nil" , or option names which clash with the slot names .code in-args and .code out-args or the method names such as .code getopts or .codn opthelp , symbols with these names from a package other than .code usr must be used. 
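.TP* Example:
The following is a minimal sketch of the functional interface built around
.code opt
and
.codn getopts ,
as described earlier in this section. The option names, help strings and
argument list are hypothetical, chosen only for illustration. The comments
state what the indexing rules given above for the
.code opts
structure lead one to expect, rather than recorded output.

.verb
  ;; hypothetical options: -v/--verbose, -n/--count, -d/--debug
  (let* ((descs (list (opt "v" "verbose" :bool "Produce verbose output.")
                      (opt "n" "count" :dec "Number of repetitions.")
                      (opt "d" "debug" :bool "Enable debugging.")))
         (args (list "-v" "-n" "3" "file.txt"))
         (o (getopts descs args)))
    ;; -v was given, so the option is also indexed by its long name
    (prinl [o "verbose"])
    ;; the argument of a :dec option is converted to an integer
    (prinl [o "n"])
    ;; -d was not given, so the default value is produced
    (prinl [o "debug" 'unset])
    ;; the non-option arguments which follow the options
    (prinl o.out-args))
.brev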
.SS* System Programming .coNP Accessor @ errno .synb .mets (errno <> [ new-errno ]) .mets (set (errno) << new-value ) .syne .desc The .code errno function retrieves the current value of the C library error variable .codn errno . If the argument .meta new-errno is present and is not .codn nil , then it specifies a value which is stored into .codn errno . The value returned is the prior value. The place form of .code errno does not take an argument. .coNP Function @ strerror .synb .mets (strerror << errno-value ) .syne .desc The .code strerror function returns a character string which provides the host platform's description of the integer .meta errno-value obtained from the .code errno function. If the host platform fails to provide a description, the function returns .codn nil . .coNP Function @ exit .synb .mets (exit <> [ status ]) .syne .desc The .code exit function terminates the entire process (running \*(TX image), specifying the termination status to the operating system. Values of the optional .meta status parameter may be .codn nil , .codn t , or an integer value. The value .code nil indicates an unsuccessful termination status, whereas .code t indicates a successful termination status. An absence of the .meta status argument also specifies a successful termination status. If .meta status is an integer value, it specifies a successful termination if it is .codn 0 , otherwise the interpretation of the value is platform-specific.
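.TP* Example:
The following minimal sketch combines
.codn errno ,
.code strerror
and
.codn exit .
It assumes a hypothetical program which, at this point, treats a nonzero
.code errno
value as a failure to be reported; real programs would normally examine
.code errno
immediately after the operation of interest.

.verb
  (let ((e (errno)))   ;; current value of the C errno variable
    (if (zerop e)
      (exit t)         ;; t requests a successful termination status
      (progn
        (put-line `error @{e}: @(strerror e)` *stderr*)
        (exit nil))))  ;; nil requests an unsuccessful status
.brev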
.coNP Variables @, e2big @, eacces @, eaddrinuse @, eaddrnotavail @, eafnosupport @, eagain @, ealready @, ebadf @, ebadmsg @, ebusy @, ecanceled @, echild @, econnaborted @, econnrefused @, econnreset @, edeadlk @, edestaddrreq @, edom @, edquot @, eexist @, efault @, efbig @, ehostunreach @, eidrm @, eilseq @, einprogress @, eintr @, einval @, eio @, eisconn @, eisdir @, eloop @, emfile @, emlink @, emsgsize @, emultihop @, enametoolong @, enetdown @, enetreset @, enetunreach @, enfile @, enobufs @, enodata @, enodev @, enoent @, enoexec @, enolck @, enolink @, enomem @, enomsg @, enoprotoopt @, enospc @, enosr @, enostr @, enosys @, enotconn @, enotdir @, enotempty @, enotrecoverable @, enotsock @, enotsup @, enotty @, enxio @, eopnotsupp @, eoverflow @, eownerdead @, eperm @, epipe @, eproto @, eprotonosupport @, eprototype @, erange @, erofs @, espipe @, esrch @, estale @, etime @, etimedout @, etxtbsy @ ewouldblock and @ exdev .desc These variables correspond to the POSIX .cod2 \(dq errno constants\(dq, namely .codn E2BIG , .codn EACCES , .code EADDRINUSE and so forth. Variables corresponding to all of the .code <errno.h> constants from the Issue 6 2004 edition of POSIX are included. The variables .code eownerdead and .code enotrecoverable from Issue 7 2018 are subject to the availability of the corresponding constants in the host platform. .coNP Function @ abort .synb .mets (abort) .syne .desc The .code abort function terminates the entire process (running \*(TX image), specifying an abnormal termination status to the process. Note: .code abort calls the C library function .code abort which works by raising the .code SIGABRT signal, known in \*(TX as the .code sig-abrt variable. Abnormal termination of the process is this signal's default action.
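.TP* Example:
The following minimal sketch relates the
.code errno
variables to
.code strerror
and
.codn abort .
The function
.code report-errno
is a hypothetical helper, not part of \*(TX, and the choice to treat
.code enomem
as unrecoverable is an assumption made only for this illustration.

.verb
  ;; hypothetical helper: describe an errno value on *stderr*,
  ;; and terminate abnormally if it indicates memory exhaustion
  (defun report-errno (code)
    (put-line `errno @{code}: @(strerror code)` *stderr*)
    (when (eql code enomem)
      (abort)))

  ;; prints the platform's description of ENOENT and returns
  (report-errno enoent)
.brev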
.coNP Functions @ at-exit-call and @ at-exit-do-not-call .synb .mets (at-exit-call << function ) .mets (at-exit-do-not-call << function ) .syne .desc The .code at-exit-call function registers .meta function to be called when the process terminates normally. Multiple functions can be registered, and the same function can be registered more than once. The registered functions are called in reverse order of their registrations. The .code at-exit-do-not-call function removes all previous .code at-exit-call registrations of .metn function . The .code at-exit-call function returns .metn function . The .code at-exit-do-not-call function returns .code t if it removed anything, .code nil if no registrations of .meta function were found. .coNP Function @ usleep .synb .mets (usleep << usec ) .syne .desc The .code usleep function suspends the execution of the program for at least .meta usec microseconds. The return value is .code t if the sleep was successfully executed. A .code nil value indicates premature wakeup or complete failure. Note: the actual sleep resolution is not guaranteed, and depends on granularity of the system timer. Actual sleep times may be rounded up to the nearest 10 millisecond multiple on a system where timed suspensions are triggered by a 100 Hz tick. .coNP Functions @ mkdir and @ ensure-dir .synb .mets (mkdir < path <> [ mode ]) .mets (ensure-dir < path <> [ mode ]) .syne .desc .code mkdir tries to create the directory named .meta path using the POSIX .code mkdir function. An exception of type .code file-error is thrown if the function fails. Returns .code t on success. The .meta mode argument specifies the requested numeric permissions for the newly created directory. If omitted, the requested permissions are .code #o777 (511): readable and writable to everyone. The requested permissions are subject to the system .codn umask . The function .code ensure-dir also creates a directory named .metn path .
Unlike .codn mkdir , it also attempts to create all the necessary parent directories, and does not throw an error if .meta path refers to an existing object, if that object is a directory or a symbolic link to a directory. Rather, in that case it returns .code nil instead of .codn t . .coNP Function @ chdir .synb .mets (chdir >> { path | < stream | << fd }) .syne .desc .code chdir changes the current working directory to the object specified by the argument, and returns .codn t , or else throws an exception of type .codn file-error . If the argument is a string, it is interpreted as a .metn path , in which case the POSIX .code chdir function is used. If the argument is a .meta stream then an integer file descriptor is retrieved from that stream using the .code fileno function. That descriptor can be specified directly as a .meta fd argument. In the case of these two argument types, the .code fchdir function is used. .coNP Function @ pwd .synb .mets (pwd) .syne .desc The .code pwd function retrieves the current working directory. If the underlying .code getcwd C library function fails with an .code errno other than .codn ERANGE , an exception will be thrown. .coNP Function @ rmdir .synb .mets (rmdir << path ) .syne .desc The .code rmdir function removes the directory named by .codn path . If successful, it returns .codn t , otherwise it throws an exception of type .codn file-error . Note: .code rmdir calls the same-named POSIX function, which requires .code path to be the name of an empty directory. .coNP Function @ remove-path .synb .mets (remove-path < path <> [ throw-on-error-p ]) .syne .desc The .code remove-path function tries to remove the filesystem object named by .metn path , which may be a file, directory or something else. If successful, it returns .codn t . The optional Boolean parameter .meta throw-on-error-p defaults to .codn nil .
A failure to remove the object results in an exception of type .code file-error being thrown, unless the failure reason is that the object indicated by .meta path doesn't exist. In this non-existence case, the behavior is controlled by the .meta throw-on-error-p argument. If that argument is true, the exception is thrown. Otherwise, the function returns normally, producing the value .code nil to indicate that it didn't perform a removal. .coNP Function @ rename-path .synb .mets (rename-path < from-path << to-path ) .syne .desc The .code rename-path function tries to rename filesystem path .metn from-path , which may refer to a file, directory or something else, to the path .metn to-path . If successful, it returns .codn t . A failure results in an exception of type .codn file-error . .coNP Functions @ sh and @ run .synb .mets (sh << system-command ) .mets (run < program <> [ argument-list ]) .syne .desc The .code sh function executes .meta system-command using the system command interpreter. The .code run function spawns a .metn program , searching for it using the system PATH. Using either method, the executed process receives environment variables from the parent. \*(TX blocks until the process finishes executing. If the program terminates normally, then its integer exit status is returned. The value zero indicates successful termination. The return value .code nil indicates an abnormal termination, or the inability to run the process at all. In the case of the .code run function, if the child process is created successfully but the program cannot be executed, then the exit status will be an .code errno value from the failed .code exec attempt. The standard input, output and error file descriptors of an executed command are obtained from the streams stored in the .codn *stdin* , .code *stdout* and .code *stderr* special variables, respectively.
For a detailed description of the behavior and restrictions, see the .code open-command function, whose description of this mechanism applies to the .code run and .code sh functions also. Note: as of \*(TX 120, the .code sh function is implemented using .code run and not by means of the .code system C library function, as previously. The .code run function is used to invoke the system interpreter by name. On Unix-like systems, the string .code /bin/sh is assumed to denote the system interpreter, which is expected to support a pair of arguments .mono .meti -c < command .onom to specify the command to be executed. On MS Windows, the interpreter is assumed to be the relative pathname .code cmd.exe and expected to support .mono .meti /C < command .onom as a way of specifying a command to execute. .coNP Functions @, sh-esc @, sh-esc-all @ sh-esc-dq and @ sh-esc-sq .synb .mets (sh-esc << str ) .mets (sh-esc-all << str ) .mets (sh-esc-dq << str ) .mets (sh-esc-sq << str ) .syne .desc The functions .codn sh-esc , .codn sh-esc-all , .code sh-esc-dq and .code sh-esc-sq transform the argument string .code str for safe insertion into commands. These functions are intended for use on POSIX systems, where the command interpreter used by the functions .code sh and .code open-command and related functions is the POSIX Shell Command Language. The .code sh-esc function adds quoting and escaping into its argument in such a way that the resulting string may be inserted as an argument into a command. The .code sh-esc-all function performs a stricter escaping and quoting, such that the transformed string may be inserted into any syntactic context where a textual operand is required for any reason, such as the .meta pattern in the .mono .meti <2> ${ var % pattern } .onom construct. The .code sh-esc-dq function escapes its argument for insertion into a double-quoted field in a shell command line. It does not add the double quotes themselves.
The .code sh-esc-sq function escapes its argument for insertion into a single-quoted field in a shell command line. It does not add the single quotes themselves. The precise set of characters which, according to the .code sh-esc function, require escaping or quoting, is the following: .verb | & ; < > ( ) $ ` \e " ' tab newline space * ? [ # ~ .brev If none of these characters occur in .metn str , then .code sh-esc returns .metn str . The .code sh-esc-all function considers all the above characters, and also these: .verb = % .brev The .code sh-esc-dq function escapes the following characters by preceding them with the \e (backslash) character: .verb $ ` \e " .brev The .code sh-esc-sq function replaces every occurrence of the .code ' character (single quote, apostrophe) with the sequence .code '\e'' (single quote, backslash, single quote, single quote). This sequence has the effect of terminating the enclosing single-quoted field, then producing a single quote via a backslash escape, and then opening a single-quoted field. .SS* Unix Filesystem Manipulation .coNP Structure @ stat .synb .mets (defstruct stat nil .mets \ \ dev ino mode nlink uid gid .mets \ \ rdev size blksize blocks .mets \ \ atime atime-nsec mtime mtime-nsec .mets \ \ ctime ctime-nsec path) .syne .desc The .code stat structure defines the type of object which is returned by the .code stat and .code lstat functions. Except for .codn path , .codn atime-nsec , .code ctime-nsec and .codn mtime-nsec , the slots are the direct counterparts of the members of POSIX C structure .codn "struct stat" . For instance the slot .code dev corresponds to .codn st_dev . The .code path slot is set by the functions .code stat and .codn lstat . Its value is .code nil when the path is not available. The .codn atime-nsec , .code ctime-nsec and .code mtime-nsec fields give the fractional parts of .codn atime , .code ctime and .codn mtime , respectively.
They are derived from the newer style information in which the POSIX function provides the timestamps in .code "struct timespec" format. If that is not available from the platform, these fields take on values of zero. .coNP Functions @, stat @ lstat and @ fstat .synb .mets (stat >> { path | < stream | << fd } <> [ struct ]) .mets (lstat << path ) .mets (fstat >> { path | < stream | << fd } <> [ struct ]) .syne .desc The .code stat function retrieves information about a filesystem object whose pathname is given by the string argument .metn path , or else about a system object associated with the open stream .metn stream , or one associated with the integer file descriptor .metn fd . If a .meta stream is specified, that stream must be of a kind from which the .code fileno function can retrieve a file descriptor, otherwise an exception of type .code file-error is thrown. If the object is not found or cannot be accessed, an exception is thrown. Otherwise, if the .meta struct argument is missing, information is retrieved and returned, in the form of a new structure of type .codn stat . If the .meta struct argument is present, it must be either: an instance of the .code stat structure type, or of a type derived from that type by inheritance, or else a structure type which has all the same slots as the .code stat type. The retrieved information is stored into .meta struct and that object is returned rather than a new object. If .meta path refers to a symbolic link, the .code stat function retrieves information about the target of the link, if it exists, or else throws an exception of type .codn file-error . The .code lstat function behaves the same as .code stat on objects which are not symbolic links. For a symbolic link, it retrieves information about the link itself, rather than its target. The .code path slot of the returned structure holds a copy of their .meta path argument value.
When information is retrieved using a .meta stream or .meta fd argument, this slot is .codn nil . The .code fstat function is an alias for .codn stat . Note: until \*(TX 231, .code stat and .code fstat were distinct functions: .code stat accepted only .meta path arguments, whereas .code fstat accepted only .meta stream or .meta fd arguments. .coNP Variables @, s-ifmt @, s-iflnk @, s-ifreg @, s-ifblk ..., @ s-ixoth .desc The following variables exist, having integer values. These are bitmasks which can be applied against the value given by the .code mode slot of the .code stat structure returned by the function .codn stat : .codn s-ifmt , .codn s-ifsock , .codn s-iflnk , .codn s-ifreg , .codn s-ifblk , .codn s-ifdir , .codn s-ifchr , .codn s-ififo , .codn s-isuid , .codn s-isgid , .codn s-isvtx , .codn s-irwxu , .codn s-irusr , .codn s-iwusr , .codn s-ixusr , .codn s-irwxg , .codn s-irgrp , .codn s-iwgrp , .codn s-ixgrp , .codn s-irwxo , .codn s-iroth , .code s-iwoth and .codn s-ixoth . These variables correspond to the C language constants from POSIX: .codn S_IFMT , .codn S_IFLNK , .code S_IFREG and so forth. The .code logtest function can be used to test these against values of mode. For example .code "(logtest mode s-irgrp)" tests for the group read permission. .coNP Function @ umask .synb .mets (umask <> [ mask ]) .syne .desc The .code umask function provides access to the Unix C library function of the same name, which controls which permissions are denied when files are newly created. If .code umask is called with no argument, it returns the current value of the mask. If the .meta mask argument is present, it must be an integer specifying the new mask to be installed. The previous mask is returned. If .meta mask is absent, then .code umask returns the previous mask.
Note: the value of the .meta mask argument may be calculated as a bitwise or of the following constants: .codn s-irwxu , .codn s-irusr , .codn s-iwusr , .codn s-ixusr , .codn s-irwxg , .codn s-irgrp , .codn s-iwgrp , .codn s-ixgrp , .codn s-irwxo , .codn s-iroth , .code s-iwoth and .codn s-ixoth , which correspond to the POSIX C constants .codn S_IRWXU , .codn S_IRUSR , .codn S_IWUSR , .codn S_IXUSR , .codn S_IRWXG , .codn S_IRGRP , .codn S_IWGRP , .codn S_IXGRP , .codn S_IRWXO , .codn S_IROTH , .code S_IWOTH and .codn S_IXOTH . Implementation note: since the .code umask C library function provides no way to retrieve the current mask without overwriting with a new one, the \*(TX .code umask function, when given no argument, simulates the pure retrieval of the mask by calling the C function with an argument of .code #o777 to temporarily install the maximally safe mask. The value returned is then reinstated as the mask by another call to .codn umask , and that value is also returned. .coNP Functions @, makedev @ minor and @ major .synb .mets (makedev < minor << major ) .mets (minor << dev ) .mets (major << dev ) .syne .desc The parameters .metn minor , .meta major and .meta dev are all integers. The .code makedev function constructs a combined device number from a minor and major pair (by calling the Unix .code makedev function). This device number is suitable as an argument to the .code mknod function (see below). Device numbers also appear as values of the .code dev slot of the .code stat structure. The .code minor and .code major functions extract the minor and major device number from a combined device number. .coNP Function @ chmod .synb .mets (chmod < target << mode ) .syne .desc The .code chmod function changes the permissions of the filesystem object specified by .metn target . It is implemented in terms of the POSIX functions .code chmod and .codn fchmod . 
If .meta mode is a character string representing a symbolic mode, then the function also makes use of .code stat or .code fstat and .codn umask . The permissions are specified by .metn mode , which must be an integer or a string. An integer .meta mode is a bitwise combination of permission mode bits. The value is passed directly to the POSIX .code chmod or .code fchmod function. Note: to construct a mode value, applications may use .code logior to combine the values of the variables like .code s-irusr or .code s-ixoth or take advantage of the well-known numeric structure of POSIX permissions to express them in octal notation. For instance the mode .code #o750 denotes that the owner has read, write and execute permissions, the group owner has read and execute, others have no permission. This value may also be calculated using .codn "(logior s-irwxu s-irgrp s-ixgrp)" . If the argument to .meta mode is a string, it is interpreted according to the symbolic syntax of the POSIX .code chmod utility. For instance, a .meta mode value of .str a+w,-s means to give all users (owner, group and others) write permission, and remove the setuid and setgid bits. The full syntax and semantics of symbolic .meta mode strings is given in the POSIX standard IEEE 1003.1. The function throws a .code file-error exception if an error occurs, otherwise it returns .codn t . The .meta target argument may be a character string, in which case it specifies a pathname in the filesystem. In this case, the POSIX function .code chmod is invoked. The .meta target argument may also be an integer file descriptor, or a stream. In these two cases, the POSIX .code fchmod function is invoked. For a stream .metn target , the integer file descriptor is retrieved from the stream using the .code fileno function. .TP* Example: .verb ;; Set permissions of foo.txt to "rw-r--r--" ;; (owner can read and write; group owner ;; and other users can only read).
;; numerically: (chmod "foo.txt" #o644) ;; symbolically: (chmod "foo.txt" (logior s-irusr s-iwusr s-irgrp s-iroth)) .brev Implementation note: The implementation of the symbolic .meta mode processing is based on the descriptions given in IEEE 1003.1-2018, Issue 7, and also on the .code chmod program from GNU Coreutils 8.28, on experiments with its behavior, and on its documentation. .coNP Functions @ chown and @ lchown .synb .mets (chown < target < uid << gid ) .mets (lchown < target < uid << gid ) .syne .desc The .code chown and .code lchown functions change the user and group ownership of the filesystem object specified by .metn target . They are implemented in terms of the POSIX functions .codn chown , .code fchown and .codn lchown . The ownership attributes are specified by .meta uid and .metn gid , both integer arguments. The existing ownership attributes may be obtained using the .code stat function. These functions throw a .code file-error exception if an error occurs, otherwise they return .codn t . The .meta target argument may be a character string, in which case it specifies a pathname in the filesystem. In this case, the same-named POSIX function .code chown is invoked by .codn chown , whereas .code lchown likewise invokes its respective same-named POSIX counterpart. The difference is that if .meta target is a pathname denoting a symbolic link, then .code lchown operates on the symbolic link, whereas .code chown dereferences the symbolic link. The .meta target argument may also be an integer file descriptor, or a stream. In these two cases, the POSIX .code fchown function is invoked by either function. For a stream .metn target , the integer file descriptor is retrieved from the stream using the .code fileno function. Note: in most POSIX systems, unprivileged processes may not change the user ownership denoted by .metn uid .
They may change the group ownership indicated in .metn gid , if that value corresponds to the effective group ID of the calling process or one of its supplementary group IDs. To avoid trying to change the user ownership (and therefore failing), the caller should specify a .meta uid value which matches the object's existing owner. .coNP Functions @ utimes and @ lutimes .synb .mets (utimes < target < atime-s < atime-ns < mtime-s << mtime-ns ) .mets (lutimes < target < atime-s < atime-ns < mtime-s << mtime-ns ) .syne .desc The functions .code utimes and .code lutimes change the access and modification timestamps of a file indicated by the .meta target argument. The difference between the two functions is that if .meta target is the pathname of a symbolic link, then .code lutimes operates on the symbolic link itself, whereas .code utimes resolves the symbolic link. Note: the full, complete functionality of these functions requires the platform to provide the POSIX .code futimens and .code utimensat functions. If these functions are not available, then other functions are relied on, with some reductions in functionality, which are documented below. The .meta target argument specifies the file to operate on. It may be an integer file descriptor, an open stream, or a character string representing a pathname. The .meta atime-s and .meta mtime-s parameters specify the whole seconds part of the new access and modification times, expressed as seconds since the epoch. The .meta atime-ns and .meta mtime-ns parameters specify the fractional part of the access and modification times, expressed in nanoseconds. If an integer argument is given to these parameters, it must lie in the range 0 to 999999999, or else the symbols .code nil or .code t may be passed as arguments. If the symbol .code nil is passed as the nanoseconds part of the access or modification time, then the access or modification time, respectively, shall not be modified by the operation.
The corresponding seconds argument is ignored. If the symbol .code t is passed as the nanoseconds part of the access or modification time, then the access or modification time, respectively, shall be obtained from the current system time. The corresponding seconds argument is ignored. If the .code utimensat and .code futimens functions are not available from the host system, then the above .code nil and .code t convention in the nanoseconds arguments is not supported; the function will fail by throwing an exception if an attempt is made to pass these arguments. If the .code utimensat and .code futimens functions are not available from the host system, then operating on a symbolic link with .code lutimes is only possible if the system provides the .code lutimes C library function, otherwise the operation fails by throwing an exception (if given a path argument for .metn target , even if that path isn't a symbolic link). If the implementation falls back on the .codn utimes , .codn futimes , and .code lutimes functions, then the nanoseconds arguments are truncated to microsecond precision. If the implementation falls back on .codn utime , then the nanoseconds arguments are ignored; the times are effectively truncated to whole seconds. .coNP Function @ mknod .synb .mets (mknod < path < mode <> [ dev ]) .syne .desc The .code mknod function tries to create an entry in the filesystem: a file, FIFO, or a device special file, under the name .metn path . If it is successful, it returns .codn t , otherwise it throws an exception of type .codn file-error . The .meta mode argument is a bitwise or combination of the requested permissions, and the type of object to create: one of the constants .codn s-ifreg , .codn s-ififo , .codn s-ifchr , .code s-ifblk or .codn s-ifsock . The permissions are subject to the system .codn umask . 
If a block or character special device .cod2 ( s-ifchr or .codn s-ifblk ) is being created, then the .meta dev argument specifies the major and minor numbers of the device. A suitable value can be constructed from a major and minor pair using the .code makedev function. .TP* Example: .verb ;; make a character device (8, 3) called /dev/foo ;; requesting rwx------ permissions (mknod "/dev/foo" (logior #o700 s-ifchr) (makedev 8 3)) .brev .coNP Function @ mkfifo .synb .mets (mkfifo < path << mode ) .syne .desc The .code mkfifo function creates a POSIX FIFO object. If it is successful, it returns .codn t , otherwise it throws an exception of type .codn file-error . The .meta mode argument is a bitwise or combination of the requested permissions, and is subject to the system .codn umask . Note: the .code mknod function can also create FIFOs, specified via the bitwise combination of the .code s-ififo type and the permission mode bits. .coNP Functions @, symlink @ link and @ rlink .synb .mets (symlink < target << path ) .mets (link < target << path ) .mets (rlink < target << path ) .syne .desc The .code symlink function creates a symbolic link called .meta path whose contents are the absolute or relative path .metn target . .meta target does not actually have to exist. The .code link function creates a hard link. The object at .meta target is installed into the filesystem at .meta path also. The .code rlink function is like .code link except that if .meta target is a symbolic link, it is resolved, and the link is made to the resulting object. On Linux and some other platforms, .code link creates a hard link to the symbolic link itself. The behavior is not specified by POSIX. If these functions succeed, they return .codn t . Otherwise they throw an exception of type .codn file-error .
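.TP* Example:
The following is a minimal sketch of how the three functions relate, not a
definitive recipe. It assumes a writable current directory; the file names
.codn orig.txt ,
.codn sym.txt ,
.code hard.txt
and
.code hard2.txt
are hypothetical and chosen only for illustration.

.verb
  ;; create a regular file, then two additional names for it
  (file-put-string "orig.txt" "hello")
  (symlink "orig.txt" "sym.txt")  ;; link whose contents are "orig.txt"
  (link "orig.txt" "hard.txt")    ;; second name for the same object

  ;; the contents of the symbolic link are the target path as given
  (readlink "sym.txt")

  ;; the hard link and the original name denote the same object
  (path-same-object "orig.txt" "hard.txt")

  ;; rlink resolves sym.txt first, so hard2.txt names the regular
  ;; file rather than the symbolic link
  (rlink "sym.txt" "hard2.txt")
  (path-symlink-p "hard2.txt")
.brev

Under the semantics described above, the
.code readlink
call is expected to yield the target string that was given to
.codn symlink ,
and the final
.code path-symlink-p
test is expected to be false, because
.code rlink
links to the resolved object.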
.coNP Function @ readlink .synb .mets (readlink << path ) .syne .desc If .meta path names a filesystem object which is a symbolic link, the .code readlink function reads the contents of that symbolic link and returns it as a string. Otherwise, it fails by throwing an exception of type .codn file-error . .coNP Function @ realpath .synb .mets (realpath << path ) .syne .desc The .code realpath function provides access to the same-named POSIX function. It processes the input string .meta path by expanding all symbolic links and removing all superfluous .str ".." and .str "." path components, and extra component-separating slash characters, to produce a canonical absolute pathname. If the underlying POSIX function indicates failure, then .code nil is returned. In that situation the .code errno value is available using the .code errno function. .SS* Unix Filesystem Complex Operations Functions in this category are complex functionality implemented using a combination of multiple calls into the host system's POSIX API. .coNP Functions @ copy-file and @ copy-files .synb .mets (copy-file < from-path < to-path >> [ perms-p <> [ times-p ]]) .mets (copy-files < from-list < to-dir >> [ perms-p <> [ times-p ]]) .syne .desc The .code copy-file function creates a replica of the file .meta from-path at the destination path .metn to-path . Both paths are opened using .code open-file in binary mode, as if using .mono .meti (open-file < from-path \(dqb\(dq) .onom and .mono .meti (open-file < to-path \(dqwb\(dq) .onom respectively. Then bytes are read from one stream and written to the other, in blocks whose size is a power of two at least as large as 16384. If the optional Boolean parameter .meta perms-p is specified, and is true, then the permissions of .meta from-path are propagated to .metn to-path . If the optional Boolean parameter .meta times-p is specified, and is true, then the access and modification timestamps of .meta from-path are propagated to .metn to-path .
The .code copy-file function returns .code nil if it is successful, and throws an exception derived from .code file-error on failure. The .code copy-files function copies multiple files, whose pathnames are given by the list argument .meta from-list into the target directory whose path is given by .metn to-dir . The target directory must exist. For each source path in .metn from-list , the .code copy-files function forms a target path by combining the base name of the source path with .metn to-dir . (See the .code base-name and .code path-cat functions). Then, the source path is copied to the resulting target path, as if by the .code copy-file function. The .code copy-files function returns .code nil if it is successful, and throws an exception derived from .code file-error on failure. Additionally, .code copy-files provides an internal catch for the .code retry and .code skip restart exceptions. If the caller, using a handler frame established by .codn handle , catches an error emanating from the .code copy-files function, it can retry the failed operation by throwing the .code retry exception, or continue copying with the next file by throwing the .code skip exception. .TP* Example: .verb ;; Copy all "/mnt/cdrom/*.jpg" files into "images" directory, ;; preserving their time stamps, ;; continuing the operation in the face of ;; file-error exceptions. (handle (copy-files (glob "/mnt/cdrom/*.jpg") "images" nil t) (file-error (throw 'skip))) .brev .coNP Function @ cat-files .synb .mets (cat-files < to-path << from-path *) .syne .desc The .code cat-files function catenates the contents of zero or more files into one file. The destination path is specified by .metn to-path . Regardless of whether there are any .meta from-path arguments, the file named by .meta to-path is created if necessary, or else truncated to zero length.
Then, the files named by each .meta from-path are traversed in left-to-right order; the contents of each file are appended to the destination file. .coNP Function @ copy-path-rec .synb .mets (copy-path-rec < from-path < to-path << option *) .syne .desc The .code copy-path-rec function replicates a file system object identified by the pathname .metn from-path , creating a similar object named .metn to-path . If .meta from-path is a directory, it is recursively traversed and its structure and content are replicated under .metn to-path . The .meta option arguments are keywords, which may be the following: .RS .IP :perms Propagate the permissions of all objects under .meta from-path onto their .meta to-path counterparts. In the absence of this option, the copied objects receive the same permissions as newly created files. On POSIX systems this means: readable and writable to the owner, group and others, by default, subject to the .code umask that is in effect. .IP :times Propagate the modification and access time stamps of all objects under .meta from-path onto their .meta to-path counterparts. .IP :symlinks Copy symbolic links literally rather than dereferencing them. Symbolic links are not altered in any way; their exact content is preserved. Thus, relative symlinks which point outside of the .meta from-path tree may turn into dangling symlinks in the .meta to-path tree. .IP :owner Propagate the ownership of all objects under .meta from-path to their .meta to-path counterparts. Ownership refers to the owner user ID and group ID. Without this option, the ownership of the copied objects is derived from the effective user ID and group ID of the calling process. Note that it is assumed that the host system may require superuser privileges to set both ownership IDs of an object, and to set them to an arbitrary value.
An unprivileged process may not change the user ID of a file, and may only change the group ID of a file which it owns, to one of the groups of which that process is a member, either via the effective GID, or the ancillary list. The .code copy-path-rec function tests whether the application is running under superuser privileges; if not, then it only honors the .code :owner option for those objects under .meta from-path which are owned by the caller, and owned by a group to which the caller belongs. Other objects are copied as if the .code :owner option were not in effect, avoiding an attempt to set their ownership that is likely to fail. .IP :all The .code :all keyword is a shorthand representing all of the options being applied: permissions, times, symlinks and ownership are replicated. .RE .IP The .code copy-path-rec function creates all necessary pathname components required for .meta to-path to come into existence, as if by using the .code ensure-dir function. Whenever an object under .meta from-path has a counterpart in .meta to-path which already exists, the situation is handled as follows: .RS .IP 1. If a directory object is copied to an existing directory object, then that existing directory object is accepted as the copy, and the operation continues recursively within that directory. If any options are specified, then the requested attributes are propagated to that existing directory. .IP 2. If a non-directory object is copied to a directory object, the situation throws an exception: the .code copy-path-rec function refuses to delete an entire directory or subdirectory in order to make way for a file, symbolic link, special device or any other kind of non-directory object. .IP 3. If any object is copied to an existing non-directory object, that target object is removed first, then the copy operation proceeds. .RE .IP Copying of files takes place similarly to what is described for the .code copy-file function.
Special objects such as FIFOs, character devices, block devices and sockets are copied by creating new, similar objects at the destination path. In the case of devices, the major and minor numbers of the copy are derived from the original, so that the copy refers to the same device. However, the copy of a socket or a FIFO is effectively a new, different endpoint because these objects are identified by their pathname. Processes using the copy of a socket or a FIFO will not connect to processes which are working with the original. The .code copy-path-rec function returns .code nil if it is successful. It throws an exception derived from .code file-error when encountering failures. Additionally .code copy-path-rec provides an internal catch for the .code retry and .code skip restart exceptions. If the caller, using a handler frame established by .codn handle , catches an error emanating from the .code copy-path-rec function, it can retry the failed operation by throwing the .code retry exception, or continue copying with the next object by throwing the .code skip exception. .coNP Function @ remove-path-rec .synb .mets (remove-path-rec << path ) .syne .desc The .code remove-path-rec function attempts to remove the filesystem object named by .metn path . If .meta path refers to a directory, that directory is recursively traversed to remove all of its contents, and is then removed. The .code remove-path-rec function returns .code nil if it is successful. It throws an exception derived from .code file-error when encountering failures. Additionally .code remove-path-rec provides an internal catch for the .code retry and .code skip restart exceptions. If the caller, using a handler frame established by .codn handle , catches an error emanating from the .code remove-path-rec function, it can retry the failed operation by throwing the .code retry exception, or continue removing other objects by throwing the .code skip exception.
Skipping a failed remove operation may cause subsequent operations to fail. Notably, the failure to remove an item inside a directory means that removal of that directory itself will fail, and ultimately, .meta path will still exist when .code remove-path-rec completes and returns. .coNP Functions @ chmod-rec and @ chown-rec .synb .mets (chmod-rec < path << mode ) .mets (chown-rec < path < uid << gid ) .syne .desc The .code chmod-rec and .code chown-rec functions are recursive counterparts of .code chmod and .codn lchown . The filesystem object given by .meta path is recursively traversed, and each of its constituent objects is subject to a permission change in the case of .codn chmod-rec , or an ownership change in the case of .codn chown-rec . The .code chmod-rec function alters the permission of each object that is not a symbolic link using the .code chmod function, and .meta mode is interpreted accordingly: it may be an integer or string. Each object which is a symbolic link is ignored. The .code chown-rec function alters the ownership of each object encountered, including symbolic links, using the .code lchown function. These functions establish restart catches, similarly to .code remove-path-rec and .codn copy-path-rec , allowing the caller to retry individual failed operations or skip the objects on which operations have failed. .coNP Function @ touch .synb .mets (touch < path <> [ ref-path ]) .syne .desc The .code touch function updates the modification timestamp of the filesystem object named by .metn path . If the object doesn't exist, it is created as a regular file. If .meta ref-path is specified, then the modification timestamp of the object denoted by .meta path is updated to be equivalent to the modification timestamp of the object denoted by .metn ref-path . Otherwise, if .meta ref-path is absent, the modification timestamp of .meta path is set to the current time.
If .meta path is a symbolic link, it is dereferenced; .code touch operates on the target of the link. .coNP Function @ mkdtemp .synb .mets (mkdtemp << prefix ) .syne .desc The .code mkdtemp function combines the .metn prefix , which is a string, with a generated suffix to create a unique directory name. The directory is created, and the name is returned. If the .meta prefix argument ends with a sequence of one or more .code X characters, the behavior is unspecified. Note: this function is implemented using the same-named POSIX function. Whereas the POSIX function requires the template to end in a sequence of at least six .code X characters, which are replaced by the generated suffix, the \*(TL function handles this detail internally, requiring only the prefix part without those characters. .coNP Function @ mkstemp .synb .mets (mkstemp < prefix <> [ suffix ]) .syne .desc The .code mkstemp function creates a unique file name by adding a generated infix between the .meta prefix and .meta suffix strings. The file is created, and a stream open in .str w+b mode for the file is returned. If either the .meta prefix or .meta suffix contains .code X characters, the behavior is unspecified. If .meta suffix is omitted, it defaults to the empty string. The name of the file is available by interrogating the returned stream's .code :name property using the function .codn stream-get-prop . Notes: this function is implemented using the POSIX function .code mkstemp or, if available, using the .code mkstemps function which is not standardized, but appears in the GNU C Library and some other systems. If .code mkstemps is unavailable, then the suffix functionality is not available: the .meta suffix argument must either be omitted, or must be an empty string.
Whereas the C library functions require the template to contain a sequence of at least six .code X characters, which are replaced by the generated portion, the \*(TL function handles this detail internally, requiring no such characters in any of its inputs. .SS* Unix Filesystem Object Existence, Type and Access Tests Functions in this category perform various tests on the attributes of filesystem objects. The functions all have a .meta path parameter, which accepts four types of arguments. If a character string is specified, it denotes a filesystem path to be probed for properties such as ownership and permissions. The object is probed using the .code stat function except in the case of .code path-symlink-p which uses .codn lstat . If instead a stream is specified as .metn path , then the associated file descriptor is probed for these properties. If an integer value is specified, it is treated as a POSIX open file descriptor that is to be probed. Otherwise, a .code stat structure, for example one returned by the .code stat or .code lstat function may be specified, in which case no system object is probed. The properties to be tested are those given in the .code stat object. Note: in a situation when it is necessary to use any of these functions to probe the properties of a symbolic link itself (other than the function .code path-symlink-p which does so implicitly) it is necessary to first invoke .code lstat on the symlink's path, and then pass the resulting .code stat structure to that function instead of the path. Some of the accessibility tests (functions which determine whether the calling process has certain access rights) may not be perfectly accurate, since they are based strictly on portable information available via .codn stat , together with the basic, portable POSIX APIs for inquiring about security credentials, such as .codn getuid .
They ignore any special permissions which may exist such as operating system and file system specific extended attributes (for example, file immutability connected to a "secure level" and such) and special process capabilities not reflected in the basic credentials. With the exception of two functions, the accessibility tests use the real credentials of the caller, rather than the effective credentials. Thus, in a setuid process, where the real and effective privileges are different, the access tests inquire about whether the real user has the given access, not the effective user. In this aspect, the functions are similar to the POSIX .code access function which also uses real credentials. The functions .code path-private-to-me-p and .code path-strictly-private-to-me-p use effective credentials, because they answer a different question: can the given filesystem object be trusted? The trust has to be determined from the point of view of the effective user, because security-sensitive actions are being performed in their context; and the effective user does not trust the real user. .coNP Function @ path-exists-p .synb .mets (path-exists-p << path ) .syne .desc The .code path-exists-p function returns .code t if .meta path is a string which resolves to a filesystem object. Otherwise it returns .codn nil . If the .meta path names a dangling symbolic link, it is considered nonexistent. If .meta path is an object returned by .code stat or .codn lstat , .code path-exists-p unconditionally returns .codn t . .coNP Functions @, path-file-p @, path-dir-p @, path-symlink-p @, path-blkdev-p @, path-chrdev-p @ path-sock-p and @ path-pipe-p .synb .mets (path-file-p << path ) .mets (path-dir-p << path ) .mets (path-symlink-p << path ) .mets (path-blkdev-p << path ) .mets (path-chrdev-p << path ) .mets (path-sock-p << path ) .mets (path-pipe-p << path ) .syne .desc .code path-file-p tests whether .meta path exists and is a regular file.
.code path-dir-p tests whether .meta path exists and is a directory. .code path-symlink-p tests whether .meta path exists and is a symbolic link. Similarly, .code path-blkdev-p tests for a block device, .code path-chrdev-p for a character device, .code path-sock-p for a socket and .code path-pipe-p for a named pipe. .coNP Function @ path-dir-empty .synb .mets (path-dir-empty << path ) .syne .desc The .code path-dir-empty function returns .code t if .meta path is an empty directory. Implementation note: this function performs a test similar to .codn path-dir-p ; then, if it is confirmed that .meta path is a directory, a directory stream is opened and entries are read. If an entry is seen which has a name other than .str . or .str .. then it is concluded that the directory is not empty and .code nil is returned. If no such entry is seen, then the directory is deemed empty and .code t is returned. .coNP Functions @, path-setgid-p @ path-setuid-p and @ path-sticky-p .synb .mets (path-setgid-p << path ) .mets (path-setuid-p << path ) .mets (path-sticky-p << path ) .syne .desc .code path-setgid-p tests whether .meta path exists and has the set-group-ID permission set. .code path-setuid-p tests whether .meta path exists and has the set-user-ID permission set. .code path-sticky-p tests whether .meta path exists and has the "sticky" permission bit set. .coNP Functions @ path-mine-p and @ path-my-group-p .synb .mets (path-mine-p << path ) .mets (path-my-group-p << path ) .syne .desc .code path-mine-p tests whether .meta path exists, and is effectively owned by the calling process; that is, it has a user ID equal to the real user ID of the process. .code path-my-group-p tests whether .meta path exists, and is effectively owned by a group to which the calling process belongs. This means that the group owner is either the same as the real group ID of the calling process, or else is among the supplementary group IDs of the calling process. 
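.TP* Example:
The following is a minimal sketch showing how the existence and ownership
tests might be combined. The file name
.code notes.txt
is hypothetical, and the messages are purely illustrative.

.verb
  ;; classify the ownership of a hypothetical file relative
  ;; to the real credentials of the calling process
  (let ((path "notes.txt"))
    (cond
      ((not (path-exists-p path))
       (put-line "no such object"))
      ((path-mine-p path)
       (put-line "owned by my real user ID"))
      ((path-my-group-p path)
       (put-line "owned by one of my groups"))
      (t (put-line "owned by another user and group"))))
.brev

Because each test is given a path string, each one probes the filesystem
separately. If a consistent snapshot is preferred, the result of a single
.code stat
call can be passed to the tests in place of the path.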
.coNP Function @ path-readable-to-me-p .synb .mets (path-readable-to-me-p << path ) .syne .desc .code path-readable-to-me-p tests whether the calling process can read the object named by .metn path . If necessary, this test examines the real user ID of the calling process, the real group ID, and the list of supplementary groups. .coNP Function @ path-writable-to-me-p .synb .mets (path-writable-to-me-p << path ) .syne .desc .code path-writable-to-me-p tests whether the calling process can write the object named by .metn path . If necessary, this test examines the real user ID of the calling process, the real group ID, and the list of supplementary groups. .coNP Function @ path-read-writable-to-me-p .synb .mets (path-read-writable-to-me-p << path ) .syne .desc .code path-read-writable-to-me-p tests whether the calling process can both read and write the object named by .metn path . If necessary, this test examines the real user ID of the calling process, the real group ID, and the list of supplementary groups. .coNP Function @ path-executable-to-me-p .synb .mets (path-executable-to-me-p << path ) .syne .desc .code path-executable-to-me-p tests whether the calling process can execute the object named by .metn path , or perform a search (name lookup, not implying sequential readability) on it, if it is a directory. If necessary, this test examines the real user ID of the calling process, the real group ID, and the list of supplementary groups. .coNP Functions @ path-private-to-me-p and @ path-strictly-private-to-me-p .synb .mets (path-private-to-me-p << path ) .mets (path-strictly-private-to-me-p << path ) .syne .desc The .code path-private-to-me-p and .code path-strictly-private-to-me-p functions report whether the calling process can rely on the object indicated by .meta path to be, respectively, private or strictly private to the security context implied by its effective user ID.
"Private" means that beside the effective user ID of the calling process and the superuser, no other user ID has write access to the object, and thus its contents may be trusted to be free from tampering by any other user. "Strictly private" means that not only is the object private, as above, but users other than the effective user ID of the calling process and superuser also do not have read access. The rules which the function applies are as follows: A file to be examined is initially assumed to be strictly private. If the file is not owned by the effective user ID of the caller, or else by the superuser, then it is not private. If the file grants write permission to "others", then it is not private. If the file grants read permission to "others", then it is not strictly private. If the file grants write permission to the group owner, then it is not private if the group contains names other than that of the file owner or the superuser. If the file grants read permission to the group owner, then it is not strictly private if the group contains names other than that of the file owner or the superuser. Note that this interpretation of "private" and "strictly private" is vulnerable to the following time-of-check to time-of-use race condition with regard to the group check. At the time of the check, the group might be empty or contain only the caller as a member. But by the time the file is subsequently accessed, the group might have been innocently extended by the system administrator to include additional users, who can maliciously modify the file. Another issue is that if any components of .meta path can be subverted by another user, the test may not be trusted. It becomes vulnerable to a time-of-check to time-of-use race condition. The .code path-components-safe function is provided to perform a security check on an entire path.
.coNP Function @ path-components-safe .synb .mets (path-components-safe << path ) .syne .desc On Unix platforms, the .code path-components-safe function performs a security check on an entire relative or absolute .metn path , returning .code t if the entire path is examined without encountering an error, and the check passes, otherwise .codn nil . On native Microsoft Windows, the function unconditionally returns true. An exception may be thrown if an inaccessible or nonexistent path component is encountered, too many symbolic links have to be resolved or there is some other problem preventing the traversal of .metn path . The objective of this function is to determine that every portion of .meta path is writable only to the effective user: that if the path is used for filesystem access, its meaning cannot be altered by an adversarial user who is able to control a symbolic link or a directory component. The function expands symbolic links on its own, one level at a time, and walks the components coming from a link target. Note: directories which are owned by root, and have the sticky bit, as is the usual configuration of .code /tmp are considered safe, even though multiple users have write permissions. .coNP Functions @ path-newer and @ path-older .synb .mets (path-newer < left-path << right-path ) .mets (path-older < left-path << right-path ) .syne .desc The .code path-newer function compares two paths or stat results by modification time. It returns .code t if .meta left-path exists, and either .meta right-path does not exist, or has a modification time stamp in the past relative to .metn left-path . The .code path-older function is equivalent to .code path-newer with the arguments reversed. Note: .code path-newer takes advantage of subsecond timestamp resolution information, if available. The implementation is based on using the .code mtime-nsec field of the .code stat structure, if it isn't .codn nil .
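.TP* Example:
The following is a minimal sketch of a staleness test in the style of a build
tool, based on the
.code path-newer
semantics described above. The file names
.code prog.c
and
.code prog.o
are hypothetical.

.verb
  ;; report whether a derived file needs to be regenerated
  (let ((src "prog.c")
        (obj "prog.o"))
    (if (path-newer src obj)
      (put-line "prog.o is missing or out of date")
      (put-line "prog.o is up to date, or prog.c does not exist")))
.brev

The second message covers two cases because
.code path-newer
returns
.code nil
both when the left path does not exist and when the right path is not older
than the left one.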
.coNP Function @ path-same-object .synb .mets (path-same-object < left-path << right-path ) .syne .desc The .code path-same-object function returns .code t if .meta left-path and .meta right-path resolve to the same filesystem object: the same inode number on the same device. .coNP Function @ path-search .synb .mets (path-search < name <> [ search-path ]) .syne .desc The .code path-search function searches for the existence of a filesystem object named by .meta name in the directories specified by .metn search-path . If .meta name is the empty string or one of the two strings .str . (dot) or .str .. (dotdot), then .code nil is returned. If .meta name contains any path separator characters (any of the set of characters found in the .code path-sep-chars string) then the function returns .meta name without performing any search. In all these trivial cases, the .meta search-path argument is ignored. The .meta search-path argument, if present, may be a string or a list of strings. If omitted, then it takes on the value of the .code PATH environment variable if that variable exists, or else takes on the value .code nil indicating an empty search path. If .meta search-path is a string, it is converted to a list of directories by splitting on the separator character, which may be .code : (colon) or .code ; (semicolon) depending on the system. Then, for each directory in the list, .code path-search affixes the .meta name to that component, as if using the .code path-cat function, and tests whether the resulting path refers to an existing filesystem object. If so, then the search terminates and that resulting path is returned. If the entire list is traversed without finding a filesystem object, then .code nil is returned. If any error whatsoever occurs while determining whether the resulting path exists, the situation is treated as nonexistence, and the search continues.
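
The three forms of invocation described above can be sketched as follows.
This is only an illustration: the name
.code sh
and the directories
.code /bin
and
.code /usr/bin
are assumptions about a typical Unix installation, and the results depend on
the host system.

.verb
  ;; search the directories given by the PATH environment variable
  (path-search "sh")

  ;; search an explicit list of directories, in order
  (path-search "sh" '("/bin" "/usr/bin"))

  ;; a name containing a path separator is returned as-is,
  ;; without any search taking place
  (path-search "./sh")
.brev
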
Note: subtle discrepancies may exist between .code path-search and the host platform's mechanisms for searching for an executable program. For instance, since .code path-search is interested in existence only, it may return a path which exists but is not executable, whereas a path-searching implementation which tests for executability will, in that case, continue searching and not return that path. .SS* Unix Credentials .coNP Functions @, getuid @, geteuid @ getgid and @ getegid .synb .mets (getuid) .mets (geteuid) .mets (getgid) .mets (getegid) .syne .desc These functions directly correspond to the POSIX C library functions of the same name. They retrieve the real user ID, effective user ID, real group ID and effective group ID, respectively, of the calling process. .coNP Functions @, setuid @, seteuid @ setgid and @ setegid .synb .mets (setuid << uid ) .mets (seteuid << uid ) .mets (setgid << gid ) .mets (setegid << gid ) .syne .desc These functions directly correspond to the POSIX C library functions of the same name. They set the real user ID, effective user ID, real group ID and effective group ID, respectively, of the calling process. On success, they return .codn t . On failure, they throw an exception of type .codn system-error . .coNP Function @ getgroups .synb .mets (getgroups) .syne .desc The .code getgroups function retrieves the list of supplementary group IDs of the calling process by calling the same-named POSIX C library function. Whether or not the effective group ID retrieved by .code getegid is included in this list is system-dependent. Programs should not depend on its presence or absence. .coNP Function @ setgroups .synb .mets (setgroups << gid-list ) .syne .desc The .code setgroups function corresponds to a C library function found in some Unix operating systems, complementary to the .code getgroups function. The .meta gid-list argument must be a list of numeric group IDs.
If the function is successful, this list is installed as the list of supplementary group IDs of the calling process, and the value .code t is returned. On failure, it throws an exception of type .codn system-error . .coNP Functions @ getresuid and @ getresgid .synb .mets (getresuid) .mets (getresgid) .syne .desc These functions directly correspond to the POSIX C library functions of the same names available in some Unix operating systems. Each function retrieves a three element list of numeric IDs. The .code getresuid function retrieves the real, effective and saved user ID of the calling process. The .code getresgid function retrieves the real, effective and saved group ID of the calling process. .coNP Functions @ setresuid and @ setresgid .synb .mets (setresuid < real-uid < effective-uid << saved-uid ) .mets (setresgid < real-gid < effective-gid << saved-gid ) .syne .desc These functions directly correspond to the POSIX C library functions of the same names available in some Unix operating systems. They change the real, effective and saved user ID or group ID, respectively, of the calling process. A value of -1 for any of the IDs specifies that the ID is not to be changed. Only privileged processes may arbitrarily change IDs to different values. Unprivileged processes are restricted in the following way: each of the new IDs that is replaced must have a new value which is equal to one of the existing three IDs. .SS* Unix Password Database .coNP Structure @ passwd .synb .mets (defstruct passwd nil .mets \ \ name passwd uid gid .mets \ \ gecos dir shell) .syne .desc The .code passwd structure corresponds to the C type .codn "struct passwd" . Objects of this struct are produced by the password database query functions .codn getpwent , .codn getpwuid , and .codn getpwnam . 
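.TP* Example:

A minimal sketch which looks up the password database entry for the real
user ID of the calling process and reports some of the slots of the resulting
.code passwd
structure. The values printed depend on the system, and the lookup can yield
.code nil
if no entry exists for the user ID.
.verb
  (let ((pw (getpwuid (getuid))))
    (if pw
      (put-line (format nil "user ~a: home ~a, shell ~a"
                        pw.name pw.dir pw.shell))
      (put-line "no password database entry for this user ID")))
.brev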
.coNP Functions @, getpwent @ setpwent and @ endpwent .synb .mets (getpwent) .mets (setpwent) .mets (endpwent) .syne .desc The first time the .code getpwent function is called, it returns the first password database entry. On subsequent calls it returns successive entries. Entries are returned as instances of the .code passwd structure. If the function cannot retrieve an entry for any reason, it returns .codn nil . The .code setpwent function rewinds the database scan. The .code endpwent function releases the resources associated with the scan. .coNP Function @ getpwuid .synb .mets (getpwuid << uid ) .syne .desc The .code getpwuid function searches the password database for an entry whose user ID field is equal to the numeric .metn uid . If the search is successful, then a .code passwd structure representing the database entry is returned. If the search fails, .code nil is returned. .coNP Function @ getpwnam .synb .mets (getpwnam << name ) .syne .desc The .code getpwnam function searches the password database for an entry whose user name is equal to .metn name . If the search is successful, then a .code passwd structure representing the database entry is returned. If the search fails, .code nil is returned. .SS* Unix Group Database .coNP Structure @ group .synb .mets (defstruct group nil .mets \ \ name passwd gid mem) .syne .desc The .code group structure corresponds to the C type .codn "struct group" . Objects of this struct are produced by the group database query functions .codn getgrent , .codn getgrgid , and .codn getgrnam . .coNP Functions @, getgrent @ setgrent and @ endgrent .synb .mets (getgrent) .mets (setgrent) .mets (endgrent) .syne .desc The first time the .code getgrent function is called, it returns the first group database entry. On subsequent calls it returns successive entries. Entries are returned as instances of the .code group structure. If the function cannot retrieve an entry for any reason, it returns .codn nil .
The .code setgrent function rewinds the database scan. The .code endgrent function releases the resources associated with the scan. .coNP Function @ getgrgid .synb .mets (getgrgid << gid ) .syne .desc The .code getgrgid function searches the group database for an entry whose group ID field is equal to the numeric .metn gid . If the search is successful, then a .code group structure representing the database entry is returned. If the search fails, .code nil is returned. .coNP Function @ getgrnam .synb .mets (getgrnam << name ) .syne .desc The .code getgrnam function searches the group database for an entry whose group name is equal to .metn name . If the search is successful, then a .code group structure representing the database entry is returned. If the search fails, .code nil is returned. .SS* Unix Password Hashing .coNP Function @ crypt .synb .mets (crypt < key << salt ) .syne .desc The .code crypt function is a wrapper for the Unix C library function of the same name. It calculates a hash over the .meta key and .meta salt arguments, which are strings. The hash is returned as a string. The .meta key and .meta salt arguments are converted into UTF-8 prior to being passed into the underlying platform function. The hash value is assumed to be UTF-8 and converted to Unicode characters, though it is not expected to contain anything but 7 bit ASCII characters. Note: the C library function .code crypt uses a static buffer for its return value. If that function is used, the Lisp string returned by the \*(TL function carries its own copy of that buffer. Where available, the .code crypt_r function is used which avoids static storage. Implementations of the C function vary in their error reporting. Some implementations return a null pointer for invalid salts, whereas others return valid "error token" strings which vary between implementations.
To work consistently across numerous implementations, the \*(TL .code crypt function throws an .code error exception if the C library function returns either a null pointer, or a valid pointer to a string that is less than 13 characters long, regardless of its content. .SS* Unix Signal Handling On platforms where certain advanced features of POSIX signal handling are available at the C API level, \*(TX exposes signal-handling functionality. A \*(TX program can install a \*(TL function (such as an anonymous .codn lambda , or the function object associated with a named function) as the handler for a signal. When that signal is delivered, \*(TX will intercept it with its own safe, internal handler, mark the signal as deferred (in a \*(TX sense) and then dispatch the registered function at a convenient time. Handlers currently are not permitted to interrupt the execution of most \*(TX internal code. Immediate, asynchronous execution of handlers is currently enabled only while \*(TX is blocked on I/O operations or sleeping. Additionally, the .code sig-check function can be used to dispatch and clear deferred signals. These handlers are then safely called as if they were subroutines of .codn sig-check , and not asynchronous interrupts. .coNP Variables @, sig-hup @, sig-int @, sig-quit @, sig-ill @, sig-trap @, sig-abrt @, sig-bus @, sig-fpe @, sig-kill @, sig-usr1 @, sig-segv @, sig-usr2 @, sig-pipe @, sig-alrm @, sig-term @, sig-chld @, sig-cont @, sig-stop @, sig-tstp @, sig-ttin @, sig-ttou @, sig-urg @, sig-xcpu @, sig-xfsz @, sig-vtalrm @, sig-prof @, sig-poll @, sig-sys @, sig-winch @, sig-iot @, sig-stkflt @, sig-io @ sig-lost and @ sig-pwr .desc These variables correspond to the C signal constants .codn SIGHUP , .code SIGINT and so forth. The variables .codn sig-winch , .codn sig-iot , .codn sig-stkflt , .codn sig-io , .code sig-lost and .code sig-pwr may not be available since a system may lack the corresponding signal constants.
See the notes for the variable .codn log-authpriv . The highest signal number is 31. .coNP Functions @ set-sig-handler and @ get-sig-handler .synb .mets (set-sig-handler < signal-number << handling-spec ) .mets (get-sig-handler << signal-number ) .syne .desc The .code set-sig-handler function is used to specify the handling for a signal, such as the installation of a handler function. It updates the signal handling for a signal whose number is .meta signal-number (usually one of the constants like .codn sig-hup , .code sig-int and so forth), and returns the previous value. The .code get-sig-handler function returns the current value. The .meta signal-number must be an integer in the range 1 to 31. Initially, all 31 signal handling specifications are set to the value .codn t . The .meta handling-spec parameter may be a function. If a function is specified, then the signal is enabled and connected to that function until another call to .code set-sig-handler changes the handling for that signal. If .meta handling-spec is the symbol .codn nil , then the function previously associated with the signal, if any, is removed, and the signal is disabled. For a signal to be disabled means that the signal is set to the .code SIG_IGN disposition (refer to the C API). If .meta handling-spec is the symbol .codn t , then the function previously associated with the signal, if any, is removed, and the signal is set to its default disposition. This means that it is set to .code SIG_DFL (refer to the C API). Some signals terminate the process if they are generated while the handling is configured to the default disposition. Note that certain signals like .code sig-stop and .code sig-kill cannot be ignored or handled. Please observe the signal documentation in the IEEE POSIX standard, and your platform. A signal handling function must take two arguments. It is of the form: .mono .mets (lambda >> ( signal << async-p ) ...)
.onom The .meta signal argument is an integer indicating the signal number for which the handler is being invoked. The .meta async-p argument is a Boolean value. If it is .codn t , it indicates that the handler is being invoked asynchronously\(emdirectly in a signal handling context. If it is .codn nil , then it is a deferred call. Handlers may do more things in a deferred call, such as terminate by throwing exceptions, and perform I/O. The return value of a handler is normally ignored. However if it is invoked asynchronously (the .meta async-p argument is true), then if the handler returns a .cod2 non- nil value, it is understood that the handler is requesting that it be deferred. This means that the signal will be marked as deferred, and the handler will be called again at some later time in a deferred context, whereby .meta async-p is .codn nil . This is not guaranteed, however; it's possible that another signal will arrive before that happens, possibly resulting in another async call, so the handler must be prepared to deal with an async call at any time. If a handler is invoked synchronously, then its return value is ignored. In the current implementation, signals do not queue. If a signal is delivered to the process again, while it is marked as deferred, it simply stays deferred; there is no counter associated with a signal, only a Boolean flag. .coNP Function @ sig-check .synb .mets (sig-check) .syne .desc The .code sig-check function tests whether any signals are deferred, and for each deferred signal in turn, it executes the corresponding handler. For a signal to be deferred means that the signal was caught by an internal handler in \*(TX and the event was recorded by a flag. If a handler function is removed while a signal is deferred, the deferred flag is cleared for that signal. Calls to the .code sig-check function may be inserted into CPU-intensive code that has no opportunity to be interrupted by signals, because it doesn't invoke any I/O functions.
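.TP* Example:

The following is a minimal sketch of a handler which defers its work, used
together with
.code sig-check
in a CPU-bound loop. The variable
.code *interrupted*
and the loop bound are hypothetical names and values chosen for this
illustration; they are not part of the signal handling interface.
.verb
  (defvar *interrupted* nil)

  ;; If invoked asynchronously, return t to request deferral;
  ;; in a deferred call, record that the signal occurred.
  (set-sig-handler sig-int
                   (lambda (signal async-p)
                     (if async-p
                       t
                       (set *interrupted* t))))

  ;; A loop which performs no I/O, and so calls sig-check
  ;; to give deferred handlers an opportunity to run.
  (for ((i 0)) ((and (< i 1000000) (not *interrupted*))) ((inc i))
    (sig-check))

  ;; Restore the default disposition.
  (set-sig-handler sig-int t)
.brev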
.coNP Function @ raise .synb .mets (raise << signal ) .syne .desc The .code raise function sends .meta signal to the process. It is a wrapper for the C function of the same name. The return value is .code t if the function succeeds, otherwise .codn nil . .coNP Function @ kill .synb .mets (kill < process-id <> [ signal ]) .syne .desc The .code kill function is used for sending a signal to a process group or process. It is a wrapper for the POSIX .code kill function. If the .meta signal argument is omitted, it defaults to the same value as .codn sig-term . The return value is .code t if the function succeeds, otherwise .codn nil . .coNP Function @ strsignal .synb .mets (strsignal << signal ) .syne .desc The .code strsignal function returns a character string describing the specified signal number. It is based on the same-named POSIX C library function. .SS* Unix Processes .coNP Functions @ fork and @ wait .synb .mets (fork) .mets (wait >> [ pid <> [ flags ]]) .syne .desc The .code fork and .code wait functions are interfaces to the Unix functions .code fork and .codn waitpid . The .code fork function creates a child process which is a replica of the parent. Both processes return from the function. In the child process, the return value is zero. In the parent, it is an integer representing the process ID of the child. If the function fails to create a child, it returns .code nil rather than an integer. In this case, the .code errno function can be used to inquire about the cause. The .code wait function, if successful, returns a cons cell consisting of a pair of integers. The .code car of the cons is the process ID of the process or group which was successfully waited on, and the .code cdr is the status. If .code wait fails, it returns .codn nil . The .code errno function can be used to inquire about the cause. The .meta pid argument, if not supplied, defaults to -1, which means that .code wait waits for any process, rather than a specific process.
Certain other values have special meaning, as documented in the POSIX standard for the .code waitpid function. The .meta flags argument defaults to zero. If it is specified as nonzero, it should be a bitwise combination (via the .code logior function) of the variables .codn w-nohang , .code w-untraced and .codn w-continued . If .code w-nohang is used, then .code wait returns a cons cell whose .code car specifies a process ID value of zero in the situation that at least one of the processes designated by .meta pid exist and are children of the calling process, but have not changed state. In this case, the status value in the .code cdr is unspecified. Status values may be inspected with the functions .codn w-ifexited , .codn w-exitstatus , .codn w-ifsignaled , .codn w-termsig , .codn w-coredump , .codn w-ifstopped , .code w-stopsig and .codn w-ifcontinued . .coNP Functions @, w-ifexited @, w-exitstatus @, w-ifsignaled @, w-termsig @, w-coredump @ w-ifstopped and @ w-stopsig .synb .mets (w-ifexited << status ) .mets (w-exitstatus << status ) .mets (w-ifsignaled << status ) .mets (w-termsig << status ) .mets (w-coredump << status ) .mets (w-ifstopped << status ) .mets (w-stopsig << status ) .mets (w-ifcontinued << status ) .syne .desc These functions analyze process exit values produced by the .code wait function. They are closely based on the POSIX macros .codn WIFEXITED , .codn WEXITSTATUS , and so on. The .meta status value is either an integer, or a cons cell. In the latter case, the cons cell is expected to have an integer in its .code cdr which is used as the status. The .codn w-ifexited , .codn w-ifsignaled , .codn w-coredump , .code w-ifstopped and .code w-ifcontinued functions have Lisp Boolean return semantics, unlike their C language counterparts: they return .code t or .codn nil , rather than zero or nonzero. The others return integer values.
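.TP* Example:

A minimal sketch combining
.codn fork ,
.code wait
and the status inspection functions. The function name
.code run-child
is hypothetical, introduced only for this illustration. The child terminates
by means of the
.code exit*
function, so that it performs no cleanup inherited from the parent.
.verb
  (defun run-child (status)
    (let ((pid (fork)))
      (cond
        ;; fork failed; errno holds the cause
        ((null pid) nil)
        ;; child: terminate at once with the given status
        ((zerop pid) (exit* status))
        ;; parent: wait for that specific child; the cons
        ;; returned by wait is passed directly as the status
        (t (let ((res (wait pid)))
             (if (and res (w-ifexited res))
               (w-exitstatus res)))))))

  ;; expected to yield 3, provided fork and wait succeed
  (run-child 3)
.brev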
.coNP Function @ exec .synb .mets (exec < file <> [ args ]) .syne .desc The .code exec function replaces the process image with the executable specified by string argument .metn file . The executable is found by searching the system path. The .meta file argument becomes the first argument of the executable, argument zero. If .meta args is specified, it is a list of strings. These are passed as the additional arguments of the executable. If .code exec fails, an exception of type .code file-error is thrown. .coNP Function @ exit* .synb .mets (exit* << status ) .syne .desc The .code exit* function terminates the entire process (running \*(TX image), specifying the termination status to the operating system. The .meta status argument is treated exactly like that of the .code exit function. Unlike that function, this one exits the process immediately, cleaning up only low-level operating system resources such as closing file descriptors and releasing memory mappings, without performing userspace cleanup. .code exit* is implemented using a call to the POSIX function .codn _exit . .coNP Functions @ getpid and @ getppid .synb .mets (getpid) .mets (getppid) .syne .desc These functions retrieve the current process ID and the parent process ID respectively. They are wrappers for the POSIX functions .code getpid and .codn getppid . .coNP Function @ daemon .synb .mets (daemon < nochdir << noclose ) .syne .desc This is a wrapper for the function .code daemon which originated in BSD Unix. It returns .code t if successful, .code nil otherwise, and the .code errno variable is set in that case. Unlike in the underlying same-named platform function, the .meta nochdir and .meta noclose arguments are Boolean, rather than integer values. .SS* Unix File Descriptors .coNP Function @ open-fileno .synb .mets (open-fileno < file-descriptor >> [ mode-string <> [ pid ]]) .syne .desc The .code open-fileno function creates and returns a \*(TX stream over a file descriptor.
The .meta file-descriptor argument must be an integer denoting a valid file descriptor. For a description of .metn mode-string , see the .code open-file function. If the .meta pid argument is present, it must be a positive integer corresponding to a process ID. The .code open-fileno function will associate the process ID with the returned stream. When the stream is closed with .codn close-stream , special handling takes place, as documented for that function. .coNP Function @ fileno .synb .mets (fileno << stream ) .syne .desc The .code fileno function returns the underlying file descriptor of .metn stream , if it has one. Otherwise, it returns .codn nil . This is equivalent to querying the stream using .code stream-get-prop for the .code :fd property. .coNP Function @ dupfd .synb .mets (dupfd < old-fileno <> [ new-fileno ]) .syne .desc The .code dupfd function provides an interface to the POSIX functions .code dup or .codn dup2 , when called with one or two arguments, respectively. .coNP Function @ pipe .synb .mets (pipe) .syne .desc The .code pipe function, if successful, returns a pair of integer file descriptors as a cons-cell pair. The descriptor in the .code car field of the pair is the read end of the pipe. The .code cdr holds the write end. If the function fails, it throws an exception of type .codn file-error . .coNP Function @ close .synb .mets (close < fileno <> [ throw-on-error-p ]) .syne .desc The .code close function passes the integer descriptor .meta fileno to the POSIX .code close function. If the operation is successful, then .code t is returned. Otherwise an exception of type .code file-error is thrown, unless the .meta throw-on-error-p argument is present, with a true value. In that case, .code close indicates failure by returning .codn nil . .coNP Function @ poll .synb .mets (poll < poll-list <> [ timeout ]) .syne .desc The .code poll function suspends execution while monitoring one or more file descriptors for specified events. 
It is a wrapper for the same-named POSIX function. The .meta poll-list argument is a sequence of .code cons pairs. The .code car of each pair is either an integer file descriptor, or else a stream object which has a file descriptor (the .code fileno function can be applied to that stream to retrieve a descriptor). The .code cdr of each pair is an integer bit mask specifying the events, whose occurrence the file descriptor is to be monitored for. The variables .codn poll-in , .codn poll-out , .code poll-err and several others are available which hold bitmask values corresponding to the constants .codn POLLIN , .codn POLLOUT , .code POLLERR used with the C language .code poll function. The .meta timeout argument, if absent, defaults to the value -1, which specifies an indefinite wait. A nonnegative value specifies a wait with a timeout, measured in milliseconds. The function returns a list of pairs representing the descriptors or streams which were successfully polled. If the function times out, it returns an empty list. If an error occurs, an exception is thrown. The returned list is similar in structure to the input list. However, it holds only entries which polled positive. The .code cdr of every pair now holds a bitmask of the events which have occurred. .coNP Function @ isatty .synb .mets (isatty << stream ) .mets (isatty << fileno ) .syne .desc The .code isatty function provides access to the underlying POSIX function of the same name. If the argument is a .meta stream object which has a .code :fd property, then the file descriptor number is retrieved. The behavior is then as if that descriptor number were passed as the .meta fileno argument. If the argument is not a .metn stream , it must be a .metn fileno : an integer in the representation range of the C type .codn int . The POSIX .code isatty is invoked on this integer. If that returns 1, then .code t is returned, otherwise .codn nil .
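.TP* Example:

A minimal sketch which exercises several of the descriptor functions
together:
.codn pipe ,
.codn poll ,
.code isatty
and
.codn close .
The variable names are arbitrary. The remarks about the results of
.code poll
state what is normally expected of a freshly created pipe, not a guarantee.
.verb
  (let* ((p (pipe))
         (rd (car p))   ;; read end
         (wr (cdr p)))  ;; write end
    ;; Nothing has been written, so polling the read end for
    ;; input with a zero timeout is expected to time out and
    ;; yield an empty list.
    (poll (list (cons rd poll-in)) 0)
    ;; The write end of an empty pipe is normally writable,
    ;; so this is expected to yield a one-element list.
    (poll (list (cons wr poll-out)) 0)
    ;; A pipe is not a terminal device: nil.
    (isatty rd)
    (close rd)
    (close wr))
.brev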
.SS* Unix File Control .coNP Variables @, o-accmode @, o-rdonly @, o-wronly @, o-rdwr @, o-creat @, o-noctty @, o-trunc @, o-append @, o-nonblock @, o-sync @, o-async @, o-directory @, o-nofollow @, o-cloexec @, o-direct @ o-noatime and @ o-path .desc These variables correspond to the POSIX file mode constants .codn O_ACCMODE , .codn O_RDONLY , .codn O_WRONLY , .codn O_RDWR , .codn O_CREAT , .codn O_NOCTTY , and so forth. The availability of the variables .codn o-async , .codn o-directory , .codn o-nofollow , .codn o-cloexec , .codn o-direct , .code o-noatime and .code o-path depends on the host platform. Some of these flags may be set or cleared on an existing file descriptor using the .code f-setfl command of the .code fcntl function, in accordance with POSIX and the host platform documentation. .coNP Variables @, seek-set @ seek-cur and @ seek-end .desc These variables correspond to the ISO C constants .codn SEEK_SET , .code SEEK_CUR and .codn SEEK_END . These values, usually associated with the ISO C .code fseek function, are also used in the .code fcntl file locking interface as values of the .code whence member of the .code flock structure. .coNP Variables @, f-dupfd @, f-dupfd-cloexec @, f-getfd @, f-setfd @, f-getfl @, f-setfl @, f-getlk @ f-setlk and @ f-setlkw .desc These variables correspond to the POSIX .code fcntl command constants .codn F_DUPFD , .codn F_GETFD , .codn F_SETFD , and so forth. Availability of the .code f-dupfd-cloexec depends on the host platform. .coNP Variable @ fd-cloexec .desc The .code fd-cloexec variable corresponds to the POSIX .code FD_CLOEXEC constant. It denotes the flag which may be set by the .code f-setfd command of the .code fcntl function. .coNP Variables @, f-rdlck @ f-wrlck and @ f-unlck .desc These variables correspond to the POSIX lock type constants .codn F_RDLCK , .code F_WRLCK and .codn F_UNLCK . They specify the possible values of the .code type field of the .code flock structure.
.coNP Structure @ flock .synb .mets (defstruct flock nil .mets \ \ type whence .mets \ \ start len .mets \ \ pid) .syne .desc The .code flock structure corresponds to the POSIX structure of the same name. An instance of this structure must be specified as the third argument of the .code fcntl function when the .meta command argument is one of the values .codn f-getlk , .code f-setlk or .codn f-setlkw . All slots must be initialized with appropriate values before calling .code fcntl with the exception that the .code f-getlk command does not access the existing value of the .code pid slot. .coNP Function @ fcntl .synb .mets (fcntl < fileno < command <> [ arg ]) .syne .desc The .code fcntl function corresponds to the same-named POSIX function. The .meta fileno and .meta command arguments must be integers. The \*(TL .code fcntl function restricts the .meta command argument to the supported values for which symbolic variable names are provided. Other integer .meta command values are rejected by returning -1 and setting the .code errno variable to .codn EINVAL . Whether the third argument is required, and what type it must be, depends on the .meta command value. Commands not requiring the third argument ignore it if it is passed. .code fcntl commands for which POSIX requires an argument of type .code long require .meta arg to be an integer. The file locking commands .codn f-getlk , .code f-setlk and .code f-setlkw require .meta arg to be a .code flock structure. The .code fcntl function doesn't throw an error if the underlying POSIX function indicates failure; the underlying function's return value is converted to a Lisp integer and returned. .SS* Unix Itimers Itimers ("interval timers") can be used in combination with signal handling to execute asynchronous actions. Itimers deliver delayed, one-time signals, and also periodically recurring signals. For more information, consult the POSIX specification.
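.TP* Example:

The combination of an itimer with a signal handler can be sketched as
follows. The variable
.code *ticks*
and the chosen durations are assumptions made for this illustration. The
.code setitimer
function takes the timer, then the interval, then the initial value, both in
microseconds.
.verb
  (defvar *ticks* 0)

  ;; Count timer expiries. If invoked asynchronously, request
  ;; a deferred call rather than doing the work immediately.
  (set-sig-handler sig-alrm
                   (lambda (signal async-p)
                     (if async-p
                       t
                       (inc *ticks*))))

  ;; First expiry after 500000 microseconds, and then
  ;; periodically, every 250000 microseconds.
  (setitimer itimer-real 250000 500000)

  ;; Later: disarm the timer by storing zero values, and
  ;; restore the default signal disposition.
  (setitimer itimer-real 0 0)
  (set-sig-handler sig-alrm t)
.brev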
.coNP Variables @, itimer-real @ itimer-virtual and @ itimer-prof .desc These variables correspond to the POSIX constants .codn ITIMER_REAL , .code ITIMER_VIRTUAL and .codn ITIMER_PROF . Their values are suitable as the .meta timer argument of the .code getitimer and .code setitimer functions. .coNP Functions @ getitimer and @ setitimer .synb .mets (getitimer << timer ) .mets (setitimer < timer < interval << value ) .syne .desc The .code getitimer function returns the current value of the specified timer, which must be .codn itimer-real , .code itimer-virtual or .codn itimer-prof . The current value consists of a list of two integer values, which represent microseconds. The first value is the timer interval, and the second value is the timer's current value. Like .codn getitimer , the .code setitimer function also retrieves the specified timer. In addition, it stores a new value in the timer, which is given by the two arguments, expressed in microseconds. .SS* Unix Syslog On platforms where a Unix-like syslog API is available, \*(TX exports this interface. \*(TX programs can configure logging via the .code openlog function, control the logging mask via .code setlogmask and generate logs via .codn syslog , or using special syslog streams. .coNP Variables @, log-pid @, log-cons @, log-ndelay @, log-odelay @ log-nowait and @ log-perror .desc These variables take on the values of the corresponding C preprocessor constants from the .code <syslog.h> header: .codn LOG_PID , .codn LOG_CONS , etc. These integer values represent logging options used in the .meta options argument to the .code openlog function. Note: .code LOG_PERROR is not in POSIX, and so .code log-perror might not be available. See notes about .code LOG_AUTHPRIV in the documentation for .codn log-authpriv .
.coNP Special Variables @, log-user @, log-daemon @ log-auth and @ log-authpriv .desc These variables take on the values of the corresponding C preprocessor constants from the .code <syslog.h> header: .codn LOG_USER , .codn LOG_DAEMON , .code LOG_AUTH and .codn LOG_AUTHPRIV . These are the integer facility codes specified in the .code openlog function. Note: .code LOG_AUTHPRIV is not in POSIX, and so .code log-authpriv might not be available. For portability use code like .code "(or (symbol-value 'log-authpriv) 0)" to evaluate to 0 if .code log-authpriv doesn't exist, or else check for its existence using .codn "(boundp 'log-authpriv)" . .coNP Variables @, log-emerg @, log-alert @, log-crit @, log-err @, log-warning @, log-notice @ log-info and @ log-debug .desc These variables take on the values of the corresponding C preprocessor constants from the .code <syslog.h> header: .codn LOG_EMERG , .codn LOG_ALERT , etc. These are the integer priority codes specified in the .code syslog function. .coNP Special Variable @ *stdlog* .desc The .code *stdlog* variable holds a special kind of stream: a syslog stream. Each newline-terminated line of text sent to this stream becomes a log message. The stream internally maintains a priority value that is applied when it generates messages. By default, this value is that of .codn log-info . The stream holds the priority as the value of the .code :prio stream property, which may be changed with the .code stream-set-prop function. The latest priority value which has been configured on the stream is used at the time the newline character is processed and the log message is generated, not necessarily the value which was in effect at the time the accumulation of a line began to take place. Messages sent to .code *stdlog* are delimited by newline characters. That is to say, each line of text written to the stream is a new log.
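.TP* Example:

A minimal sketch of logging through the
.code *stdlog*
stream, first at the default priority and then at a changed priority. The
message texts are arbitrary. Where the messages end up depends on the
configuration of the system's logging service.
.verb
  ;; one log message at the default priority, log-info
  (put-line "service started" *stdlog*)

  ;; change the priority which applies to subsequent lines
  (stream-set-prop *stdlog* :prio log-warning)
  (put-line "disk space is low" *stdlog*)
.brev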
.coNP Function @ openlog .synb .mets (openlog < id-string >> [ options <> [ facility ]]) .syne .desc The .code openlog function is a wrapper for the .code openlog C function, and the arguments have the same semantics. It is not necessary to use .code openlog in order to call the .code syslog function or to write data to .codn *stdlog* . The call is necessary in order to override the default identifying string, to set options, such as having the PID (process ID) recorded in log messages, and to specify the facility. The .meta id-string argument is mandatory. The .meta options argument is a bitwise mask (see the logior function) of option values such as .code log-pid and .codn log-cons . If it is missing, then a value of 0 is used, specifying the absence of any options. The .meta facility argument is one of the values .codn log-user , .code log-daemon or .codn log-auth . If it is missing, then .code log-user is assumed. .coNP Function @ closelog .synb .mets (closelog) .syne .desc The .code closelog function is a wrapper for the C function .codn closelog . .coNP Function @ setlogmask .synb .mets (setlogmask << bitmask-integer ) .syne .desc The .code setlogmask function interfaces to the corresponding C function, and has the same argument and return value semantics. The .meta bitmask-integer argument is a mask of priority values to enable. The return value is the prior value. Note that if the argument is zero, then the function doesn't set the mask to zero; it only returns the current value of the mask. Note that the priority values like .code log-emerg and .code log-debug are integer enumerations, not bitmasks. These values cannot be combined directly to create a bitmask. Rather, the .code mask function should be used on these values. 
.TP* Example:
.verb
  ;; Enable LOG_EMERG and LOG_ALERT messages,
  ;; suppressing all others
  (setlogmask (mask log-emerg log-alert))
.brev
.coNP Function @ syslog .synb .mets (syslog < priority < format << format-arg *) .syne .desc The .code syslog function is the interface to the .code syslog C function. The .code printf formatting capabilities of the function are not used; the .meta format argument follows the conventions of the \*(TL .code format function instead. Note in particular that the .code %m convention for interpolating the value of strerror(errno) which is available in some versions of the .code syslog C function is currently not supported. Note that syslog messages are not newline-terminated. .SS* Unix Path Globbing On platforms where the POSIX .code glob function is available \*(TX provides this functionality in the form of a like-named function, and some numeric constants. \*(TX also provides access to the .code fnmatch function, where available. .coNP Variables @, glob-err @, glob-mark @, glob-nosort @, glob-nocheck @, glob-noescape @, glob-period @, glob-altdirfunc @, glob-brace @, glob-nomagic @, glob-tilde @ glob-tilde-check and @ glob-onlydir .desc These variables take on the values of the corresponding C preprocessor constants from the .code <glob.h> header: .codn GLOB_ERR , .codn GLOB_MARK , .codn GLOB_NOSORT , etc. These values are passed as an argument to the optional .meta flags argument of the .code glob function. They are bitmasks and so multiple values can be combined using the .code logior function. Note that the .codn glob-period , .codn glob-altdirfunc , .codn glob-brace , .codn glob-nomagic , .codn glob-tilde , .code glob-tilde-check and .code glob-onlydir variables may not be available. They are extensions in the GNU C library implementation of .codn glob . The standard .code GLOB_APPEND flag is not represented as a \*(TX variable.
The .code glob function uses it internally when calling the C library function multiple times, due to having been given multiple patterns. .coNP Variable @ glob-xnobrace .desc This variable holds an integer bitmask value that may be given as an argument to the optional .meta flags parameter of the .code glob* function. It may be used alone, or combined with the other .code glob mask values using the .code logior function. If used with .codn glob* , it disables brace expansion, which is enabled in .code glob* by default. If used with the .code glob function, it has no effect. This value is a \*(TL extension; it does not appear in the API of the .code glob C function. .coNP Functions @ glob and @ glob* .synb .mets (glob >> { pattern | << patterns } >> [ flags <> [ errfun ]]) .mets (glob* >> { pattern | << patterns } >> [ flags <> [ errfun ]]) .syne .desc The .code glob function is an interface to the Unix function of the same name. The first argument must either be a single .metn pattern , which is a string, or else a sequence of strings specifying multiple .metn patterns . Each string is a glob pattern: a pattern which matches zero or more pathnames, similar to a regular expression. The function tries to expand the patterns and return a list of strings representing the matching pathnames in the file system. If there are no matches, then an empty list is returned. The .code glob* function is a \*(TL extension built on .codn glob . The .code glob* function supports a .code ** ("double star") pattern which matches zero or more path components. The double star match is described in detail below. The .code glob* function also supports brace expansion, independently of whether or not .code glob supports brace expansion. Brace expansion is enabled by default in .code glob* and can be disabled using the .code glob-xnobrace flag. Brace expansion is described in detail below.
Lastly, the .code glob* function performs a path-aware sort of the emerging path names that is not influenced by locale, whereas the sort performed by .code glob is influenced by locale, defaulting to a lexicographic sort in the .str C locale. The optional .meta flags argument defaults to zero. If given, it may be a bitwise combination of the values of the variables .codn glob-err , .codn glob-mark , .code glob-nosort and others. If the .meta errfun argument is specified, it gives a callback function which is invoked when .code glob encounters errors accessing paths. The function takes two arguments: the pathname and the .code errno value which occurred for that pathname. The function's return value is Boolean. If the function returns true, then .code glob will terminate. The .meta errfun may terminate the traversal by a nonlocal exit, such as by throwing an exception or performing a block return. The .meta errfun may not reenter the .code glob function. This situation is detected and diagnosed by an exception. The .meta errfun may not capture a continuation across the error boundary. That is to say, code invoked from the error callback may not capture a continuation up to a prompt which surrounds the .code glob call. Such an attempt is detected and diagnosed by an exception. If a sequence of .meta patterns is specified instead of a single pattern, .code glob makes multiple calls to the underlying C library function. The second and subsequent calls specify the .code GLOB_APPEND flag to add the matches to the result. The following equivalence applies: .verb (glob (list p0 p1 ...) f e) <--> (append (glob p0 f e) (glob p1 f e) ...) .brev Details of the semantics of the .code glob function, and the meaning of all the .meta flags arguments are given in the documentation for the C function. The .code glob* function supports brace expansion, which is enabled by default, and can be disabled with .codn glob-xnobrace .
On some platforms, such as the GNU C Library, the .code glob function also supports brace expansion. If available, then the .code glob-brace variable has a nonzero value and must be included in the .meta flags argument. These two brace expansion features are independent; the \*(TL .code glob* function does not rely on .code glob for brace expansion, even if it is available. The brace expansion supported by .code glob* is a string generation mechanism driven by a syntax which specifies comma-separated elements enclosed in braces. When a single brace expansion appears in a pattern, that pattern turns into a list of patterns. There are as many elements in the list as there are elements between the braces. Each element replaces the braces with a different element from between the braces. For instance, .str x{a,b}y denotes the list of strings .codn "(\(dqxay\(dq \(dqxby\(dq)" . There are two elements in the list because the braces contain two elements. The first string replaces .str {a,b} with .str a and the second replaces it with .strn b . When multiple braces occur in a pattern, then all combinations (Cartesian product) of the braces are produced. Braces may also nest. When the element of a brace itself uses braces, then that element is subject to brace expansion. The elements which emerge then become items of the enclosing brace, as if they were comma-separated elements. For instance .str x{a,{b,c}y}z is equivalent to .str x{a,by,cy}z which then expands to the three strings .strn xaz , .str xbyz and .strn xcyz . Braces may be escaped by a backslash to disable their special meaning. Likewise, the commas may be escaped by a backslash to suppress their special meaning. Brace expansion preserves these backslashes; they appear in the resulting patterns, and must be recognized and removed by subsequent processing. When the .meta pattern arguments of .code glob* use brace expansion, those arguments produce multiple patterns.
The order of these patterns is preserved: the patterns are matched in that order. For each pattern, the matching path names are sorted, unless the .code glob-nosort flag is in effect. The .code ** ("double star") operator recognized by .code glob* matches zero or more path components. It may be used more than once. It cannot be combined with other characters or globbing operators. It is valid for .str ** to be an entire pattern. This expands to the relative path names of all files, directories and other objects in the current directory and its children. Otherwise the double star may appear at the start of a pattern if it is followed by a slash; at the end of a pattern if it is preceded by a slash, or in the middle of a pattern if it is surrounded by two slashes. The double star is not recognized in a bracket-enclosed character class. Thus, the following examples all contain one double star: .verb ** foo/** **/bar here/**/there .brev These do not contain a double star; the two asterisks in these patterns will be passed to the underlying .code glob function without being processed as a double star by .codn glob* , with unspecified consequences: .verb foo** **bar here**/there etc/**conf foo[/**/]bar .brev Each double star matches a maximum of ten path components, and all of the double stars in a single pattern together do not match more than 48 components. Using more than three double stars in a pattern is not recommended for performance reasons. If the double star is followed by a slash, it matches only directories. The .code glob* function sorts paths in such a way that the slash character is ranked lower than all other characters. Thus the path .str test/ sorts before .str test-data/ even though in ASCII and Unicode, the .code - character has a lower code than the .code / character. .TP* Examples: .verb ;; find all jpg and gif paths under the current directory, ;; (up to ten levels deep).
(glob* "**/*.{jpg,gif}") ;; find all "2023" directories under the current directory, ;; which have .jpg or .gif files under them, listing those ;; .jpg and .gif paths: (glob* "**/2023/**/*.{jpg,gif}") ;; find all "2023" directories under the current directory. (glob* "**/2023/**/") .brev .coNP Variables @, fnm-pathname @, fnm-noescape @, fnm-period @, fnm-leading-dir @ fnm-casefold and @ fnm-extmatch .desc These variables take on the values of the corresponding C preprocessor constants from the .code <fnmatch.h> header: .codn FNM_PATHNAME , .codn FNM_NOESCAPE , .codn FNM_PERIOD , etc. These values are bit masks which may be combined with the .code logior function to form the optional third .meta flags argument of the .code fnmatch function. Note that the .codn fnm-leading-dir , .code fnm-casefold and .code fnm-extmatch variables may not be available. They are GNU extensions, found in the GNU C library. .coNP Function @ fnmatch .synb .mets (fnmatch < pattern < string <> [ flags ]) .syne .desc The .code fnmatch function, if available, provides access to the like-named POSIX C library function. The .meta pattern argument specifies a POSIX-shell-style filename-pattern-matching expression. Its exact features and dialect are controlled by .metn flags . If .meta string matches .meta pattern then .code t is returned. If there is no match, then .code nil is returned. If the C function indicates that an error has occurred, an exception is thrown. .SS* Unix Filesystem Traversal On platforms where the POSIX .code nftw function is available \*(TX provides this functionality in the form of the analogous Lisp function .codn ftw , accompanied by some numeric constants. Likewise, on platforms where the POSIX functions .code opendir and .code readdir are available, \*(TX provides the functionality in the form of same-named Lisp functions, a structure type named .code dirent and some accompanying numeric constants.
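.TP* Example:
The following is a minimal sketch, offered as an illustration rather than
as a definitive usage pattern, of how the
.code ftw
function might be used to print the paths of the regular files found under
the current directory. It assumes that the platform provides
.codn nftw ,
and relies on the callback argument order and the default value of
.meta flags
described in the
.code ftw
section of this document.
.verb
  ;; Walk the current directory. The flags argument is
  ;; omitted, so it defaults to the value of ftw-phys.
  (ftw "."
       (lambda (path type stat level base)
         ;; ftw-f classifies the visited object as a regular file
         (when (eql type ftw-f)
           (put-line path))
         ;; a non-integer return value continues the traversal
         nil))
.brev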
.coNP Variables @, ftw-phys @, ftw-mount @, ftw-chdir @ ftw-depth and @ ftw-actionretval .desc These variables hold numeric values that may be combined into a single bitmask value using the .code logior function. This value is suitable as the .meta flags argument of the .code ftw function. These variables correspond to the C constants .codn FTW_PHYS , .codn FTW_MOUNT , etc. Note that .code ftw-actionretval is a GNU extension that is not available on all platforms. If the platform's .code nftw function doesn't have this feature, then this variable is not defined. .coNP Variables @, ftw-f @, ftw-d @, ftw-dnr @, ftw-ns @, ftw-sl @ ftw-dp and @ ftw-sln .desc These variables provide symbolic names for the integer values that are passed as the .code type argument of the callback function called by .codn ftw . This argument classifies the kind of file system node visited, or error condition encountered. These variables correspond to the C constants .codn FTW_F , .codn FTW_D , etc. Not all of them are present on every platform. If the underlying platform doesn't have a given constant, then the corresponding variable doesn't exist in \*(TX. .coNP Variables @, ftw-continue @, ftw-stop @ ftw-skip-subtree and @ ftw-skip-siblings .desc These variables are defined if the variable .code ftw-actionretval is defined. If the value of .code ftw-actionretval is included in the .meta flags argument of .codn ftw , then the callback function can use the values of these variables as return codes. Ordinarily, the callback returns zero to continue the search and nonzero to stop. These variables correspond to the C constants .codn FTW_CONTINUE , .codn FTW_STOP , etc. .coNP Function @ ftw .synb .mets (ftw < path-or-list < callbackfun >> [ flags <> [ nopenfd ]]) .mets >> [ callbackfun < path < type < stat-struct < level << base ] .syne .desc The .code ftw function provides access to the .code nftw POSIX C library function.
Note that the .meta flags and .meta nopenfd arguments are reversed with respect to the C language interface. They are both optional; .meta flags defaults to the value of .code ftw-phys and .meta nopenfd defaults to 20. If an argument is given to .metn flags , then the presence of the .code ftw-phys is no longer implied; the flag must be explicitly included in the argument in order to be present. Compatibility Note: the .meta flags parameter defaults to an argument value of zero in \*(TX versions 283 or lower. The .meta path-or-list argument may be a string specifying the top-level pathname that .code ftw shall visit. Or else, .meta path-or-list may be a list. If it is a list, then .code ftw recursively invokes itself over each of the elements, taking that element as the .meta path-or-list argument of the recursive call, passing down all other argument values as-is. The traversal stops when any recursive invocation of .code ftw returns a value other than .code t or .codn nil , and that value is returned. If .code t or .code nil is returned, the traversal continues with the application of .code ftw to the next list element, if any. If the list is completely traversed, and some recursive invocations of .code ftw return .codn t , then the return value is .codn t . If all recursive invocations return .code nil then .code nil is returned. If the list is empty, .code t is returned. The .code ftw function walks the filesystem, as directed by the .meta path-or-list and .meta flags arguments. For each visited entry, it calls the supplied .meta callbackfun function, which receives five arguments. If this function returns normally, it must return either .codn nil , .codn t , or an integer value in the range of the C type .codn int . The .meta callbackfun function can continue the traversal by returning any non-integer value, or the integer value zero.
If .code ftw-actionretval is included in the .meta flags bitmask, then the only integer code which continues the traversal without any special semantics is .code ftw-continue and only .code ftw-stop stops the traversal. (Non-integer return values behave like .codn ftw-continue ). The .meta path argument of .meta callbackfun gives the path of the visited filesystem object. The .meta type argument is an integer code which indicates the kind of object that is visited, or an error situation in visiting that filesystem entry. See the documentation for .code ftw-f and .code ftw-d for possible values. The .meta stat-struct argument provides information about the filesystem object as a .code stat structure, the same kind of object as what is returned by the .code stat function. The .meta level argument is an integer value representing the directory level depth. This value is obtained from the C structure .code FTW in the .code nftw C API. The .meta base argument indicates the length of the directory part of the .code path argument. Characters in excess of this length are thus the base name of the visited object, and the expression .mono .meti >> [ path << base ..:] .onom calculates the base name. The .code ftw function returns either .code t upon successful completion, or an integer value returned by .metn callbackfun , as described below. On failure it throws an exception derived from .codn file-error , whose specific type is based on analyzing the POSIX .code errno value. The .meta callbackfun may return a value of any type. If it returns a value that is not of integer type, then zero is returned to the .code nftw function and traversal continues. Similarly, traversal continues if the function returns an integer zero. If .meta callbackfun returns an integer value, that value must be in the range of the C type .codn int . That .code int value is returned to .codn nftw . 
If the value is not zero, and is not -1, then .code nftw will terminate, and return that value, which .code ftw then returns. If the value is -1, then .code nftw is deemed to have failed, and .code ftw will throw an exception of type .codn file-error , whose specific type is based on analyzing the POSIX .code errno value. If the value is zero, then the traversal continues. The .meta callbackfun may also terminate the traversal by a nonlocal exit, such as by throwing an exception or performing a block return. The .meta callbackfun may not reenter the .code ftw function. This situation is detected and diagnosed by an exception. The .meta callbackfun may not capture a continuation across the callback boundary. That is to say, code invoked from the callback may not capture a continuation up to a prompt which surrounds the .code ftw call. Such an attempt is detected and diagnosed by an exception. .coNP Structure @ dirent .synb .mets (defstruct dirent nil .mets \ \ name ino type) .syne .desc Objects of the .code dirent structure type are returned by the .code readdir function. The .code name slot is a character string giving the name of the directory entry. If the .code opendir function's .meta prefix-p argument is specified as true, then .code readdir operations produce .code dirent structures whose .code name slot is a path formed by combining the directory path with the directory entry name. The .code ino slot is an integer giving the inode number of the object named by the directory entry. The .code type slot indicates the type of the object, which is an integer code. Support for this member is platform-dependent. If the directory traversal doesn't provide the information, then this slot takes on the .code nil value. In this situation, the .code dirstat function may be used to backfill the missing information.
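.TP* Example:
The following minimal sketch, which is illustrative only, shows one way the
.code type
slot might be backfilled with
.code dirstat
while listing a directory. It uses the
.codn opendir ,
.codn readdir ,
.code closedir
and
.code dirstat
functions and the
.code dt-dir
variable, all of which are documented in the sections that follow. Passing a
true
.meta prefix-p
argument is a choice made for this sketch, so that
.code dirstat
can be called without a
.meta dir-path
argument.
.verb
  ;; A true prefix-p argument makes each name slot a path,
  ;; so dirstat needs no dir-path argument.
  (let ((dir (opendir "." t)))
    (unwind-protect
      (whilet ((ent (readdir dir)))
        ;; backfill the type slot if readdir left it nil
        (unless ent.type
          (dirstat ent))
        (put-line (format nil "~a: ~a" ent.name
                          (if (eql ent.type dt-dir)
                            "directory" "other"))))
      ;; harmless if readdir has already closed the handle
      (closedir dir)))
.brev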
.coNP Variables @, dt-blk @, dt-chr @, dt-dir @, dt-fifo @, dt-lnk @, dt-reg @ dt-sock and @ dt-unknown .desc These variables give the possible type code values exhibited by the .code type slot of the .code dirent structure. If the underlying host platform does not feature a .code d_type field in the .code dirent C structure, then almost all these variables are defined anyway using the values that they have on GNU/Linux. These definitions are useful in conjunction with the .code dirstat function below. If the host platform does not feature a .code d_type field in the .code dirent structure, then the variable .code dt-unknown is not defined. Note: the application can take advantage of this to detect the situation, in order to conditionally define code in such a way that some run-time checking is avoided. .coNP Function @ opendir .synb .mets (opendir < dir-path <> [ prefix-p ]) .syne .desc The .code opendir function initiates a traversal of the directory object named by the string argument .metn dir-path , which must be the name of a directory. If .code opendir is not able to open the directory traversal, it throws an exception of type .codn system-error . Otherwise an object of type .code dir is returned, which is a directory traversal handle suitable as an argument for the .code readdir function. If the .meta prefix-p argument is specified and has a true value, then it indicates that the subsequent .code readdir operations should produce the value of the .code name slot of the .code dirent structure by combining .meta dir-path with the directory entry name using the .code path-cat function. .coNP Function @ readdir .synb .mets (readdir < dir-handle <> [ dirent-struct ]) .syne .desc The .code readdir function returns the next available directory entry from the directory traversal controlled by .metn dir-handle , which must be a .code dir object returned by .codn opendir . If no more directory entries remain, then .code readdir returns .codn nil .
In this situation, the .meta dir-handle is also closed, as if by a call to .codn closedir . Otherwise, the next available directory entry is returned as a structure object of type .codn dirent . The .code readdir function internally skips and does not report the .str . (dot) and .str .. (dotdot) directory entries. If the .meta dirent-struct argument is specified, then it must be a .code dirent structure, or one which has all of the required slots. In this case, .code readdir stores values in that structure and returns it. If .meta dirent-struct is absent, then .code readdir allocates a fresh .code dirent structure. .coNP Function @ closedir .synb .mets (closedir << dir-handle ) .syne .desc The .code closedir function terminates the directory traversal managed by .metn dir-handle , releasing its resources. If this has already been done before, .code closedir returns .codn nil , otherwise it returns .codn t . Further .code readdir calls on the same .meta dir-handle return .codn nil . Note: the .code readdir function implicitly closes .meta dir-handle when the handle indicates that no more directory entries remain to be traversed. .coNP Function @ dirstat .synb .mets (dirstat < dirent-struct >> [ dir-path <> [ struct ]]) .syne .desc The .code dirstat function invokes .code lstat on the object represented by the .code dirent structure .metn dirent-struct , sets the .code type slot of the .meta dirent-struct accordingly, and then returns the value that .code lstat returned. If the .meta struct argument is specified, it is passed to .codn lstat . The .meta dir-path parameter must be specified if the .code name slot of .meta dirent-struct is a simple directory entry name, rather than the full path to the object. If the .code name slot is already a path (due to, for instance, a true value of .meta prefix-p having been passed to .codn opendir ) then .meta dir-path must not be specified. In that case, the slot's value gives the effective path.
If .meta dir-path is specified, then its value is combined with the .code name slot of .meta dirent-struct using .code path-cat to form the effective path. The .code lstat function is invoked on the effective path, and if it succeeds, then type information is obtained from the resulting structure to set the value of the .code type slot of .metn dirent-struct . The same structure that was returned by .code lstat is then returned. .SS* Unix Sockets On platforms where the underlying system interface is available, \*(TX provides a sockets library for communicating over Internet networks, or over Unix sockets. Stream as well as datagram sockets are supported. The classic Version 4 of the Internet protocol is supported, as well as IP Version 6. Sockets are mapped to \*(TX streams. The .code open-socket function creates a socket of a specified type, in a particular address family. This socket is actually a stream (always, even if it isn't used for data transfer, but only as a passive contact point). The functions .codn sock-connect , .codn sock-bind , .codn sock-listen , .code sock-accept and .code sock-shutdown are used for enacting socket communication scenarios. Stream sockets use ordinary streams, reusing the same underlying framework that is used for file I/O and process types. Datagram sockets are implemented using special datagram socket streams. Datagram socket streams eliminate the need for operations analogous to the .code sendto and .code recvfrom socket API functions, even in server programs which handle multiple clients. An overview of datagram socket streams is given in the following section, Datagram Socket Streams. The .code getaddrinfo function is provided for resolving host names and services to IPv4 and IPv6 addresses. Several structure types are provided for representing socket addresses, and options for .codn getaddrinfo .
Various numeric constants are also provided: .codn af-unix , .codn af-inet , .codn af-inet6 , .codn sock-stream , .code sock-dgram and others. .NP* Datagram Socket Streams Datagram socket streams are a new paradigm unique to \*(TX which attempts to unify the programming model of stream and datagram sockets. A datagram socket stream is created by the .code open-socket function, when the .code sock-dgram socket type is specified. Another way in which a datagram socket is created is when .code sock-accept is invoked on a datagram socket, and returns a new socket. I/O is performed on datagram sockets using the regular I/O functions. None of the functions take or return peer addresses. There are no I/O functions which are analogous to the C library .code recvfrom and .code sendto functions which are usually used for datagram programming. Datagram I/O assumes that the datagram socket is connected to a specific remote peer, and that peer is implicitly used for all I/O. Datagram streams solve the message framing problem by considering a single datagram to be an entire stream. On input, a datagram stream holds an entire datagram in a buffer. The stream ends (experiences the EOF condition) after the last byte of this buffer is removed by an input operation. Another datagram will be received and buffered if the EOF condition is first explicitly cleared with the .code clear-error function, and then another input operation is attempted. On output, a datagram stream gathers data into an ever-growing output buffer which isn't subject to any automatic flushing. An explicit .code flush-stream operation sends the buffer contents to the connected peer as a new datagram, and empties the buffer. Subsequent output operations prepare data for a new datagram. The .code close-stream function implicitly flushes the stream in the same way, and thus also potentially generates a datagram.
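.TP* Example:
The following minimal sketch, which is illustrative only, shows how the
output buffering described above might map to datagrams. The loopback
address and the port number 9999 are arbitrary choices made for this
sketch, and it assumes that some receiver is listening there. The
.code sock-connect
function and the
.code sockaddr-in
structure are documented elsewhere in this section.
.verb
  (let ((s (open-socket af-inet sock-dgram))
        (peer (new sockaddr-in addr #x7F000001 port 9999)))
    (sock-connect s peer)
    ;; accumulate output; nothing is transmitted yet
    (put-string "hello" s)
    ;; transmit the buffered bytes as one datagram
    (flush-stream s)
    ;; begin gathering data for a second datagram
    (put-string "world" s)
    ;; closing flushes implicitly, sending the second datagram
    (close-stream s))
.brev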
A client-style datagram stream can be explicitly connected to a peer with the .code sock-connect function. This is equivalent to connecting a datagram socket using the C library .code connect function. Writes on the stream will be transmitted using the C library function .codn send . A client-style datagram stream can also be "soft-connected" to a peer using the .code sock-set-peer function. Writes on the stream will transmit data using the C library function .code sendto to the peer address. A datagram server program which needs to communicate using multiple peers is implemented by means of the .code sock-accept function which, unlike the C library .code accept function, works with datagram sockets as well as stream sockets. The server creates a datagram socket, and uses .code sock-bind to bind it to a local address. Optionally, it may also call .code sock-listen which is a no-op on datagram sockets. Supporting this function on datagram sockets allows program code to be more easily changed between datagram and stream operation. The server then uses .code sock-accept to accept new clients. Note that this is not possible with the C library function .codn accept , which only works with stream sockets. The .code sock-accept function receives a datagram from a client, and creates a new datagram socket stream which is connected to that client, and whose input buffer contains the received datagram. Input operations on this stream consume the datagram. Note that clearing the EOF condition and trying to receive another datagram is not recommended on datagram streams returned by .codn sock-accept , since they share the same underlying operating system socket, which is not actually connected to a specific peer. The receive operation could receive a datagram from any peer, without any indication which peer that is. Datagram servers should issue a new .code sock-accept call for each client datagram, treating it as a new stream. 
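.TP* Example:
The following minimal sketch, which is illustrative only, outlines the
datagram server pattern described above: one
.code sock-accept
call per client datagram, with each returned stream treated as a new
stream. The port number 9999 and the reply format are arbitrary choices
made for this sketch; the address value 0 is assumed here to denote the
wildcard address. The loop runs until interrupted.
.verb
  (let ((srv (open-socket af-inet sock-dgram)))
    (sock-bind srv (new sockaddr-in addr 0 port 9999))
    ;; a no-op on datagram sockets; kept for symmetry
    ;; with stream socket code
    (sock-listen srv)
    (while t
      ;; each accept yields a stream holding one datagram
      (let* ((cli (sock-accept srv))
             (msg (get-line cli)))
        ;; output is buffered until the stream is closed
        (put-line `echo: @msg` cli)
        ;; closing flushes the reply as a single datagram
        (close-stream cli))))
.brev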
Datagram sockets ignore almost all aspects of the .meta mode-string passed in .code open-socket and .codn sock-accept . The only attribute not ignored is the buffer size specified with a decimal digit character; however, it cannot be the only item in the mode string. The string must be syntactically valid, as described under the .code open-file function. The buffer size attribute controls the size used by the datagram socket for receiving a datagram: the capture size. A datagram socket obtains a default capture size if one isn't specified by the .metn mode-string . The default capture size is 65536 bytes for a datagram socket created by .codn open-socket . If a size is not passed to .code sock-accept via its .meta mode-string argument when it is invoked on a datagram socket, that socket's size is used as the capture size of the newly created datagram socket which is returned. .coNP Structure @ sockaddr .synb .mets (defstruct sockaddr nil .mets \ canonname .mets \ (:static family nil)) .syne .desc The .code sockaddr structure represents the abstract base class for socket addresses, from which several other types are derived: .codn sockaddr-in , .code sockaddr-in6 and .codn sockaddr-un . It has a single static slot named .code family and a single instance slot .codn canonname , both initialized to .codn nil . Note: the .code canonname slot is optionally set by the .code getaddrinfo function on address structures that it returns, if requested via the .code ai-canonname flag. The slot only provides information to the application, playing no semantic role in addressing. .coNP Structure @ sockaddr-in .synb .mets (defstruct sockaddr-in sockaddr .mets \ (addr 0) (port 0) (prefix 32) .mets \ (:static family af-inet)) .syne .desc The .code sockaddr-in address represents a socket address used in the context of networking over IP Version 4. It may be used with sockets in the .code af-inet address family.
The .code addr slot holds an integer denoting an abstract IPv4 address. For instance the hexadecimal integer literal constant .code #x7F000001 or its decimal equivalent .code 2130706433 represents the loopback address, whose familiar "dot notation" is .codn 127.0.0.1 . Conversion of the abstract IP address to four bytes in network order, as required, is handled internally. The .code port slot holds the TCP or UDP port number, whose value ranges from 0 to 65535. Zero isn't a valid port; the value is used for requesting an ephemeral port number in active connections. Zero also appears in situations when the port number isn't required: for instance, when the .code getaddrinfo function is used with the aim of looking up the address of a host, without caring about the port number. The .code prefix field is set by the function .codn inaddr-str , when it recognizes and parses a prefix field in the textual representation. The .code family static slot holds the value .codn af-inet . .coNP Structure @ sockaddr-in6 .synb .mets (defstruct sockaddr-in6 sockaddr .mets \ (addr 0) (port 0) (flow-info 0) (scope-id 0) .mets \ (prefix 128) .mets \ (:static family af-inet6)) .syne .desc The .code sockaddr-in6 address represents a socket address used in the context of networking over IP Version 6. It may be used with sockets in the .code af-inet6 address family. The .code addr slot holds an integer denoting an abstract IPv6 address. IPv6 addresses are pure binary integers up to 128 bits wide. The .code port slot holds the TCP or UDP port number, whose value ranges from 0 to 65535. In IPv6, the port number functions similarly to IPv4; see .codn sockaddr-in . The .code flow-info and .code scope-id slots are special IPv6 parameters corresponding to the .code sin6_flowinfo and .code sin6_scope_id members of the .code sockaddr_in6 C language structure. Their meaning and use are beyond the scope of this document.
The .code prefix field is set by the function .codn in6addr-str , when it recognizes and parses a prefix field in the textual representation. The .code family static slot holds the value .codn af-inet6 . .coNP Structure @ sockaddr-un .synb .mets (defstruct sockaddr-un sockaddr .mets \ path .mets \ (:static family af-unix)) .syne .desc The .code sockaddr-un address represents a socket address used for interprocess communication within a single operating system node, using the "Unix domain" sockets of the .code af-unix address family. This structure has only one slot, .code path which holds the rendezvous name for connecting pairs of socket endpoints. This name appears in the filesystem. When the .code sockaddr-un structure is converted to the C structure .codn "struct sockaddr_un" , the .code path slot undergoes conversion to UTF-8. The resulting bytes are stored in the .code sun_path member of the C structure. If the resulting UTF-8 byte string is larger than the .code sun_path array, it is silently truncated. Note: Linux systems have support for "abstract" names which do not appear in the filesystem. These abstract names are distinguished by starting with a null byte. For more information, consult Linux documentation. This convention is supported in the .code path slot of the .code sockaddr-un structure. If .code path contains occurrences of the pseudo-null character U+DC00, these translate to null bytes in the .code sun_path member of the corresponding C structure .codn "struct sockaddr_un" . For example, the path .str "\exDC00;foo" is valid and represents an abstract address consisting of the three bytes .str "foo" followed by null padding bytes. The .code family static slot holds the value .codn af-unix . .coNP Structure @ addrinfo .synb .mets (defstruct addrinfo nil .mets \ \ (flags 0) (family 0) (socktype 0)) .syne .desc The .code addrinfo structure is used in conjunction with the .code getaddrinfo function. 
If that function's .meta hints argument is specified, it is of this type. The purpose of the argument is to narrow down or possibly alter the selection of addresses which are returned. The .code flags slot holds a bitwise or combination (see the .code logior function) of .code getaddrinfo flags: values given by the variables .codn ai-passive , .codn ai-numerichost , .codn ai-v4mapped , .codn ai-canonname , .codn ai-all , .code ai-addrconfig and .codn ai-numericserv . These correspond to the C constants .codn AI_PASSIVE , .code AI_NUMERICHOST and so forth. If .code ai-canonname is specified, then every returned address structure will have its .code canonname member set to a string value rather than .codn nil . This string is a copy of the canonical name reported by the underlying C library function, which that function places only into the first returned address structure. The .code family slot holds an address family, which may be the value of .codn af-unspec , .codn af-unix , .code af-inet or .codn af-inet6 . The .code socktype slot holds a socket type. Socket types are given by the variables .code sock-dgram and .codn sock-stream . .coNP Function @ getaddrinfo .synb .mets (getaddrinfo >> [ node >> [ service <> [ hints ]]]) .syne .desc The .code getaddrinfo function returns a list of socket addresses based on search criteria expressed in its arguments. That is to say, the returned list, unless empty, contains objects of type .code sockaddr-in and .codn sockaddr-in6 . The function is implemented directly in terms of the like-named C library function. All parameters are optional. Omitting any argument causes a null pointer to be passed for the corresponding parameter of the C library function. The .meta node and .meta service parameters may be character strings which specify a host name and service. The contents of these strings may be symbolic, like .str www.example.com and .str ssh or numeric, like .str 10.13.1.5 and .strn 80 .
If an argument is given for the .code hints parameter, it must be of type .codn addrinfo . The .meta node and .meta service parameters may also be given integer arguments. An integer argument value in either of these parameters is converted to a null pointer when calling the C .code getaddrinfo function. The integer values are then simply installed into every returned address as the IP address or port number, respectively. However, if both arguments are numeric, then no addresses are returned, since the C library function is then called with a null node and service. .coNP Variables @, af-unix @ af-inet and @ af-inet6 .desc These variables hold integers which give the values of address families. They correspond to the C constants .codn AF_UNIX , .code AF_INET and .codn AF_INET6 . Address family values are used in the .meta hints argument of the .code getaddrinfo function, and in the .code open-socket function. Note that unlike the C language socket addressing structures, the \*(TX socket addresses do not contain an address family slot. That is because they indicate their family via their type. That is to say, an object of type .code sockaddr-in is an address which is implicitly associated with the .code af-inet family via its type. .coNP Variables @ sock-stream and @ sock-dgram .desc These variables hold integers which give the values of socket types. They correspond to the C constants .code SOCK_STREAM and .codn SOCK_DGRAM . .coNP Variables @, ai-passive @, ai-numerichost @, ai-v4mapped @, ai-all @ ai-addrconfig and @ ai-numericserv .desc These variables hold integers which are bitmasks that combine together via bitwise or, to express the .code flags slot of the .code addrinfo structure. They correspond to the C constants .codn AI_PASSIVE , .codn AI_NUMERICHOST , .code AI_V4MAPPED and so forth. They influence the behavior of the .code getaddrinfo function.
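.TP* Example:
The following is a minimal sketch rather than a tested program. It passes an
.code addrinfo
hints structure to
.code getaddrinfo
in order to restrict the results to stream sockets, and prints each returned address using the
.code str-addr
method. The host and service names are placeholders chosen for illustration.
.verb
  ;; Request only stream-socket addresses, and ask for
  ;; the canonical name to be reported.
  (let ((hints (new addrinfo
                    flags ai-canonname
                    socktype sock-stream)))
    (each ((addr (getaddrinfo "www.example.com" "ssh" hints)))
      (put-line addr.(str-addr))))
.brev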
.coNP Variables @, inaddr-any @, inaddr-loopback @ in6addr-any and @ in6addr-loopback .desc These integer-valued variables provide constants for commonly used IPv4 and IPv6 address values. The value of .code inaddr-any and .code in6addr-any is zero. This address is used in binding a passive socket to all of the external interfaces of a host, so that it can accept connections or datagrams from all attached networks. The .code inaddr-loopback variable is the IPv4 loopback address, the same integer as the hexadecimal constant .codn #x7F000001 . The .code in6addr-loopback is the IPv6 loopback address. Its value is 1. .TP* Example: .verb ;; Construct an IPv6 socket address suitable for binding ;; a socket to the loopback network, port 1234: (new sockaddr-in6 addr in6addr-loopback port 1234) ;; Mistake: IPv4 address used with IPv6 sockaddr. (new sockaddr-in6 addr inaddr-loopback) .brev .coNP Function @ open-socket .synb .mets (open-socket < family < type <> [ mode-string ]) .syne .desc The .code open-socket function creates a socket, which is a kind of stream. The .meta family parameter specifies the address family of the socket. One of the values .codn af-unix , .code af-inet or .code af-inet6 should be used to create a Unix domain, Internet IPv4 or Internet IPv6 socket, respectively. The .meta type parameter specifies the socket type, either .code sock-stream (stream socket) or .code sock-dgram (datagram socket). The .meta mode-string specifies several properties of the stream; for a description of .meta mode-string parameters, refer to the .code open-file function. Note that the defaulting behavior for an omitted .meta mode-string argument is different under .code open-socket from other functions. Because sockets are almost always used for bidirectional data flow, the default mode string is .str r+b rather than the usual .strn r .
The rationale for including the .str b flag in the default mode string is that network protocols are usually defined in a way that is independent of machine and operating system, down to the byte level, even when they are textual. It doesn't make sense for the same \*(TX program to see a network stream differently based on what platform it is running on. Line-ending conversion has to do with how a platform locally stores text files, whereas network streams are almost always external formats. Like other stream types, stream sockets are buffered and marked as non-real-time streams. Specifying the .str i mode in .meta mode-string marks a socket as a real-time stream, and, if it is opened for writing or reading and writing, changes it to use line buffering. .coNP Function @ open-socket-pair .synb .mets (open-socket-pair < family < type <> [ mode-string ]) .syne .desc The .code open-socket-pair function provides an interface to the functionality of the .code socketpair C library function. If successful, it creates and returns a list of two stream objects, which are sockets that are connected together. Note: the Internet address families .code af-inet and .code af-inet6 are not supported. The .code mode-string is applied to each stream. For a description, see .code open-socket and .codn open-file . .coNP Functions @ sock-family and @ sock-type .synb .mets (sock-family << socket ) .mets (sock-type << socket ) .syne .desc These functions retrieve the integer values representing the address family and type of a socket. The argument to the .meta socket parameter must be a socket stream or a file or process stream. For a file stream, both functions return .codn nil . An exception of type .code type-error is thrown for other stream types. .coNP Accessor @ sock-peer .synb .mets (sock-peer << socket ) .mets (set (sock-peer << socket ) << address ) .syne .desc The .code sock-peer function retrieves the peer address which has most recently been assigned to .metn socket .
Sockets which are not connected initially have a peer address value of .codn nil . A socket which is connected to a remote peer receives that peer's address as its .codn sock-peer . If a socket is connected to a remote peer via a successful use of the .code sock-connect function, then its .code sock-peer address is set to match that of the peer. Sockets returned by the .code sock-accept function are connected, and have the remote endpoint address as their .code sock-peer address. Assigning an address to a .code sock-peer form is equivalent to using .code sock-set-peer to set the address. Implementation note: the .code sock-peer function does not use the .code getpeername C library function; the association between a stream and .code sockaddr struct is maintained by \*(TX. .coNP Function @ sock-set-peer .synb .mets (sock-set-peer < socket << address ) .syne .desc The .code sock-set-peer function stores .meta address into .meta socket as that socket's peer. Subsequently, the .code sock-peer function will retrieve that address. If .meta address is not an appropriate address object in the address family of .metn socket , the behavior is unspecified. .coNP Function @ sock-connect .synb .mets (sock-connect < socket < address <> [ timeout-usec ]) .syne .desc The .code sock-connect function connects a socket stream to a peer address. The .meta address argument must be a .code sockaddr object of type matching the address family of the socket. If the operation fails, an exception of type .code socket-error is thrown. Otherwise, the function returns .metn socket . If the .meta timeout-usec argument is specified, it must be a fixnum integer. It denotes a connection timeout period in microseconds. If the connection doesn't succeed within the specified timeout, an exception of type .code timeout-error is thrown. 
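.TP* Example:
The following sketch, which has not been run against a live service, connects a stream socket to a service on the local host using
.codn sock-connect .
The port number 8080 and the two-second timeout are arbitrary values chosen for illustration.
.verb
  (let ((sock (open-socket af-inet sock-stream))
        (addr (new sockaddr-in addr inaddr-loopback port 8080)))
    ;; Throws socket-error on failure, or timeout-error if the
    ;; connection isn't established within 2000000 microseconds.
    (sock-connect sock addr 2000000)
    ;; The peer address is now recorded in the socket.
    (prinl (sock-peer sock))
    (close-stream sock))
.brev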
.coNP Function @ sock-bind .synb .mets (sock-bind < socket << address ) .syne .desc The .code sock-bind function binds a socket stream to a local address after enabling the socket stream's .code so-reuseaddr option. The .meta address argument must be a .code sockaddr object of type matching the address family of the socket. If the operation fails, an exception of type .code socket-error is thrown. Otherwise, the function returns .codn t . .coNP Function @ sock-listen .synb .mets (sock-listen < socket <> [ backlog ]) .syne .desc The .code sock-listen function prepares .meta socket for listening for connections. The .meta backlog parameter, if specified, requires an integer argument. The default value is 16. .coNP Function @ sock-accept .synb .mets (sock-accept < socket >> [ mode-string <> [ timeout-usec ]]) .syne .desc The .code sock-accept function waits for a client connection on .metn socket , which must have been prepared for listening for connections using .code sock-bind and .codn sock-listen . If the operation fails, an exception of type .code socket-error is thrown. Otherwise, the function returns a new socket which is connected to the remote peer. The peer's address may be retrieved from this socket using .codn sock-peer . The .code mode-string parameter is applied to the new socket just like the similar argument in .codn open-socket . It defaults to .strn r+b . If the .meta timeout-usec argument is specified, it must be a fixnum integer. It denotes a timeout period in microseconds. If no peer connects for the specified timeout, .code sock-accept throws an exception of type .codn timeout-error . .coNP Variables @, shut-rd @ shut-wr and @ shut-rdwr .desc The values of these variables are useful as the second argument to the .code sock-shutdown function. 
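.TP* Example:
The passive-side functions
.codn sock-bind ,
.code sock-listen
and
.code sock-accept
described above can be combined as in the following sketch. It is an untested illustration which accepts a single connection on the loopback address and echoes back one line; the port number 1234 is an arbitrary choice.
.verb
  (let ((lsock (open-socket af-inet sock-stream)))
    (sock-bind lsock (new sockaddr-in addr inaddr-loopback port 1234))
    (sock-listen lsock)
    ;; Wait for one client; conn is a new connected socket.
    (let ((conn (sock-accept lsock)))
      (prinl (sock-peer conn))
      ;; Echo a single line, if the client sent one.
      (let ((line (get-line conn)))
        (when line
          (put-line line conn)))
      (close-stream conn))
    (close-stream lsock))
.brev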
.coNP Function @ sock-shutdown .synb .mets (sock-shutdown < sock <> [ direction ]) .syne .desc The .code sock-shutdown function indicates that no further communication is to take place on .meta socket in the specified direction(s). If the operation fails, an exception of type .code socket-error is thrown. Otherwise, the function returns .codn t . The .code direction parameter is one of the values given by the variables .codn shut-rd , .code shut-wr or .codn shut-rdwr . These values shut down communication in the read direction, write direction, or both directions, respectively. If the argument is omitted, .code sock-shutdown defaults to closing the write direction. Notes: shutting down is most commonly requested in the write direction, to perform a "half close". The communicating process thereby indicates that it has written all the data which it intends to write. When the shutdown action is processed on the remote end, that end is unblocked from waiting on any further data, and effectively experiences an "end of stream" condition on its own socket or socket-like endpoint, while continuing to be able to transmit data. Shutting down in the reading direction is potentially abrupt. If it is executed before an "end of stream" indication is received from a peer, it results in an abortive close. .coNP Functions @ sock-recv-timeout and @ sock-send-timeout .synb .mets (sock-recv-timeout < sock << usec ) .mets (sock-send-timeout < sock << usec ) .syne .desc The .code sock-recv-timeout and .code sock-send-timeout functions configure, respectively, receive and send timeouts on socket .metn sock . The .meta usec parameter specifies the value, in microseconds. It must be a .code fixnum integer. When a receive timeout is configured on a socket, then an exception of type .code timeout-error is thrown when an input operation waits for at least .code usec microseconds without receiving input. 
Similarly, when a send timeout is configured, then an exception of type .code timeout-error is thrown when an output operation waits for at least .code usec microseconds for the availability of buffer space in the socket. .coNP Variables @, sol-socket @, ipproto-ip @, ipproto-ipv6 @ ipproto-tcp and @ ipproto-udp .desc These variables represent the protocol levels of socket options and are suitable for use as the .meta level argument of the .code sock-opt and .code sock-set-opt functions. The variables correspond to the POSIX C constants .codn SOL_SOCKET , .codn IPPROTO_IP , .codn IPPROTO_IPV6 , .code IPPROTO_TCP and .codn IPPROTO_UDP . .coNP Variables @, so-acceptconn @, so-broadcast @, so-debug @, so-dontroute @, so-error @, so-keepalive @, so-linger @, so-oobinline @, so-rcvbuf @, so-rcvlowat @, so-rcvtimeo @, so-reuseaddr @, so-sndbuf @, so-sndlowat @ so-sndtimeo and @ so-type .desc These variables represent socket options at the .code sol-socket protocol level and are suitable for use as the .meta option argument of the .code sock-opt and .code sock-set-opt functions. The variables correspond to the POSIX C constants .codn SO_ACCEPTCONN , .codn SO_BROADCAST , .codn SO_DEBUG , etc. Note that the .code sock-recv-timeout and .code sock-send-timeout are a more convenient interface for setting the value of the .code so-rcvtimeo and .code so-sndtimeo socket options. .coNP Variables @, ipv6-join-group @, ipv6-leave-group @, ipv6-multicast-hops @, ipv6-multicast-if @, ipv6-multicast-loop @ ipv6-unicast-hops and @ ipv6-v6only .desc These variables represent socket options at the .code ipproto-ipv6 protocol level and are suitable for use as the .meta option argument of the .code sock-opt and .code sock-set-opt functions. The variables correspond to the POSIX C constants .codn IPV6_JOIN_GROUP , .codn IPV6_LEAVE_GROUP , .codn IPV6_MULTICAST_HOPS , etc. 
.coNP Variable @ tcp-nodelay .desc This variable represents a socket option at the .code ipproto-tcp protocol level and is suitable for use as the .meta option argument of the .code sock-opt and .code sock-set-opt functions. The variable corresponds to the POSIX C constant .codn TCP_NODELAY . .coNP Accessor @ sock-opt .synb .mets (sock-opt < socket < level < option <> [ ffi-type ]) .mets (set (sock-opt < socket < level < option <> [ ffi-type ]) << value ) .syne .desc The .code sock-opt function retrieves the value of the specified socket option, at the specified protocol level, associated with .codn socket , which must be a socket stream. The .code level argument should be one of the protocol levels .codn sol-socket , .codn ipproto-ip , .codn ipproto-ipv6 , .code ipproto-tcp and .codn ipproto-udp . The .code option argument should be one of the socket options .codn so-acceptconn , .codn so-broadcast , .codn so-debug , \&..., .codn ipv6-join-group , \&..., .code ipv6-v6only and .codn tcp-nodelay . The .meta ffi-type argument, which must be a compiled FFI type, specifies the type of the socket option's value. The type is most commonly .code int or .codn uint , but it can be any other fixed-size type, including .codn struct s. (Variable-size types, such as C .code char arrays, are unsupported.) The .meta ffi-type argument defaults to .codn "(ffi int)" . Assigning a value to a .code sock-opt place is equivalent to calling .code sock-set-opt with that value. Note: the .code sock-opt and .code sock-set-opt functions call the POSIX C .code getsockopt and .code setsockopt functions, respectively. Consult the POSIX specification for more information about these functions and in particular the various socket options (and the types they require). 
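.TP* Example:
The following sketch illustrates reading and assigning integer-valued socket options through the
.code sock-opt
accessor, relying on the default
.meta ffi-type
of
.codn "(ffi int)" .
It is an untested illustration; the values retrieved depend on the operating system.
.verb
  (let ((sock (open-socket af-inet sock-stream)))
    ;; Request that small writes be sent without delay.
    (set (sock-opt sock ipproto-tcp tcp-nodelay) 1)
    ;; Read back that option, and the receive buffer size.
    (prinl (sock-opt sock ipproto-tcp tcp-nodelay))
    (prinl (sock-opt sock sol-socket so-rcvbuf))
    (close-stream sock))
.brev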
.coNP Function @ sock-set-opt .synb .mets (sock-set-opt < socket < level < option < value <> [ ffi-type ]) .syne .desc The .code sock-set-opt function sets the value of the specified socket option, at the specified protocol level, associated with .codn socket , which must be a socket stream. See the documentation of the .code sock-opt function for a description of the .metn level , .meta option and .meta ffi-type arguments. Like the .code sock-opt function, .codn sock-set-opt 's .meta ffi-type argument defaults to .codn "(ffi int)" . .coNP Functions @ str-inaddr and @ str-in6addr .synb .mets (str-inaddr address <> [ port ]) .mets (str-in6addr address <> [ port ]) .syne .desc The .code str-inaddr and .code str-in6addr functions convert an IPv4 and IPv6 address, respectively, to textual notation which is returned as a character string. The conversion is done in conformance with RFC 5952, section 4. IPv6 addresses representing IPv6-mapped IPv4 addresses are printed in the hybrid notation exemplified by .codn ::ffff:192.168.1.1 . The .meta address parameter must be a nonnegative integer in the appropriate range for the address type. If the .meta port number argument is supplied, it is included in the returned character string, according to the requirements in section 6 of RFC 5952 pertaining to IPv6 addresses (including IPv6-mapped IPv4 addresses) and section 3.2.3 of RFC 3986 for IPv4 addresses. In brief, IPv6 addresses with ports are expressed as .code [address]:port and IPv4 addresses follow the traditional .code address:port pattern. .coNP Functions @ str-inaddr-net and @ str-in6addr-net .synb .mets (str-inaddr-net < address <> [ width ]) .mets (str-in6addr-net < address <> [ width ]) .syne .desc The functions .code str-inaddr-net and .code str-in6addr-net convert, respectively, IPv4 and IPv6 network prefix addresses to the "slash notation". For IPv6 addresses, the requirements of section 2.3 of RFC 4291 are implemented.
For IPv4, section 3.1 of RFC 4632 is followed. The condensed portion of the IP address is always determined by measuring the contiguous extent of all-zero bits in the least significant position of the address. For instance an IPv4 address which has at least 24 zero bits in the least significant position, so that the only nonzero bits are in the highest octet, is always condensed to a single decimal number: the value of the first octet. If the .meta width parameter is specified, then its value is incorporated into the returned textual notation as the width. No check is made whether this width is large enough to span all of the nonzero bits in the address. If .meta width is omitted, then it is calculated as the number of bits in the address, excluding the contiguous all-zero bits in the least significant position: how many times the address can be shifted to the right before a 1 appears in the least significant bit. .coNP Functions @ inaddr-str and @ in6addr-str .synb .mets (inaddr-str << string ) .mets (in6addr-str << string ) .syne .desc The .code inaddr-str and .code in6addr-str functions recover an IPv4 or IPv6 address from a textual representation. If the parse is successful, the address is returned as, respectively, a .code sockaddr-in or .code sockaddr-in6 structure. If .meta string is a malformed address, due to any issue such as invalid syntax or a numeric value being out of range, an exception is thrown. The .code inaddr-str function recognizes the dot notation consisting of four decimal numbers separated by period characters. The numbers must be in the range 0 to 255. Note: superfluous leading zeros are permitted, though this is a nonstandard extension; not all implementations of this notation support it. A prefix may be specified in the notation as a slash followed by a decimal number, in the range 0 to 32. In this case, the integer value of the prefix appears as the .code prefix member of the returned .code sockaddr-in structure.
Furthermore, the address is masked, so that any bits not included in the prefix are zero. For instance, the address .str 255.255.255.255/1 is equivalent to .strn 128.0.0.0 , except that the .code prefix of the returned structure is 1 rather than 32. When a prefix is not specified, the .code prefix member of the structure retains its default value of 32. When the prefix is specified, the address part need not contain all four octets; it may contain between one and four octets. Thus, .str 192.168/16 is a valid address, equivalent to .strn 192.168.0.0/16 . A port number may be specified in the notation as a colon, followed by a decimal number in the range 0 to 65535. The integer value of this port number appears as the .code port member of the returned structure. An example of this notation is .strn 127.0.0.1:23 . A prefix and port number may both be specified; in this case the prefix must appear first, followed by the port number. For example, .strn "127/8:23" . The .code in6addr-str function recognizes the IPv6 notation consisting of 16-bit hexadecimal pieces separated by colons. If the operation is successful, it returns a .code sockaddr-in6 structure. Each piece must be a value in the range 0 to FFFF. The hexadecimal digits may be any mixture of uppercase and lowercase. Leading zeros are permitted. Up to eight such pieces must be specified. If fewer pieces are specified, then the token .code :: (double colon) must appear in the address exactly once. That token denotes the condensation of a sufficient number of zero-valued pieces to make eight pieces. The token must be in one of three positions: it may be the leftmost element of the address, immediately followed by a hexadecimal piece; it may be the rightmost element of the address, preceded by a hexadecimal piece; or else, it may be in the middle of the address, flanked on both sides by hexadecimal pieces. The .code in6addr-str also recognizes the special notation for IPv6-mapped IPv4 addresses.
This notation consists of the address string .str ::FFFF which may appear in any uppercase/lowercase mixture, possibly with leading zeros, followed by an IPv4 address given in the four-octet dot notation. For example, .strn ::FFFF:127.0.0.1 . A prefix may be specified using a slash, followed by a decimal number in the range 0 to 128. The handling of the prefix is similar to that of .code inaddr-str except that pieces of the address may not be omitted. Condensing the pieces of the IPv6 address is always done by means of the .code :: token, whether or not a prefix is present. Furthermore, the octets specified in the IPv6-mapped IPv4 notation must all be present, regardless of the prefix. A port number may be specified in the notation as follows: the entire address, including any slash-separated prefix, must appear surrounded in square brackets. The closing square bracket must be followed by a colon and one or more digits denoting a decimal number in the range 0 to 65535. For instance .strn "[1:2:3::4/64]:1234" . .coNP Function @ sockaddr-str .synb .mets (sockaddr-str << string ) .syne .desc The function .code sockaddr-str analyzes the .meta string argument to determine whether it represents a valid IPv4, IPv6 or Unix domain address. If so, it constructs an object, representing that address, of type .codn sockaddr-in , .code sockaddr-in6 or .codn sockaddr-un . The slash prefix notation, and port numbers are handled, and represented in the returned structures accordingly. The .code sockaddr-str function works by applying simple tests to the input, and then invoking the functions .code inaddr-str or .codn in6addr-str , or constructing a .code sockaddr-un structure whose .code path slot is .metn string . The precise procedure followed is: .RS .IP 1. If .meta string starts with .str [ then it is handled via .codn in6addr-str . .IP 2. If .meta string starts with .str / then it is assumed to be the path of a Unix socket.
A .code sockaddr-un structure is constructed whose .code path slot is .metn string . This is the only case in which a structure of type .code sockaddr-un is returned. .IP 3. If .meta string contains .str "::" as a substring, it is handled via .codn in6addr-str . .IP 4. If .meta string contains .str . then it is handled via .codn inaddr-str . .IP 5. If the above tests fail, .meta string is passed to .codn in6addr-str , and if that call returns normally, .code sockaddr-str returns that value. .IP 6. Otherwise, .meta string is passed to .codn inaddr-str . .RE .coNP Method @ str-addr .synb .mets << sockaddr .(str-addr) .syne .desc A method named .code str-addr is defined for the struct types .codn sockaddr-in , .code sockaddr-in6 and .codn sockaddr-un . It returns a text representation of the address as a string. If the .code port slot of .code sockaddr-in or .code sockaddr-in6 is a nonzero integer, then it is incorporated into the text representation. Likewise if the .code prefix slot has a non-default value specifying fewer bits than the width of the address, the prefix notation is produced. The intent is that the representations produced are suitable as input to the .code sockaddr-str function which will reproduce an address object of the same type, featuring the same .codn addr , .code port and .codn prefix . In the case of .codn sockaddr-un , the .code sockaddr-str function will reproduce the same address only if the .code path slot is a string starting with .strn / . .SS* Unix Terminal Control \*(TX provides access to the terminal control "termios" interfaces defined by POSIX, and some of the extensions to it in Linux. By using termios, programs can control serial devices, consoles and virtual terminals. Terminal control in POSIX revolves around a C language structure called .codn "struct termios" . This is mirrored in a \*(TL structure also called .codn termios .
Like-named \*(TL functions are provided which correspond to the C functions .codn tcgetattr , .codn tcsetattr , .codn tcsendbreak , .codn tcdrain , .code tcflush and .codn tcflow . These have somewhat different argument conventions. The TTY device is specified last, so that it can conveniently default to the .code *stdin* stream. A TTY device can be specified as either a stream object or a numeric file descriptor. The functions .codn cfgetispeed , .codn cfgetospeed , .code cfsetispeed and .code cfsetospeed are not provided, because they are unnecessary. Device speed (informally, "baud rate") is specified directly as an integer value in the .code termios structure. The \*(TL termios functions automatically convert between integer values and the speed constants (like .codn B38400 ) used by the C API. All of the various termios-related constants are provided, including some nonstandard ones. They appear in lowercase. For instance .code IGNBRK and .code PARENB are simply known as the predefined \*(TL variables .code ignbrk and .codn parenb . .coNP Structure @ termios .synb .mets (defstruct termios nil .mets \ \ iflag oflag cflag lflag .mets \ \ cc ispeed ospeed) .syne .desc The .code termios structure represents the kernel level terminal device configuration. It holds hardware-related settings such as serial line speed, parity and handshaking. It also holds software settings like translations, and settings affecting input behaviors. The structure closely corresponds to the C language .code termios structure which exists in the POSIX API. The .codn iflag , .codn oflag , .code cflag and .code lflag slots correspond to the .codn c_iflag , .codn c_oflag , .code c_cflag and .code c_lflag members of the C structure. They hold integer values representing bitfields. The .code cc slot corresponds to the .code c_cc member of the C structure.
Whereas the C structure's .code c_cc member is an array of the C type .codn cc_t , the .code cc slot is a vector of integers, whose values must have the same range as the .code cc_t type. .coNP Variables @, ignbrk @, brkint @, ignpar @, parmrk @, inpck @, istrip @, inlcr @, igncr @, icrnl @, iuclc @, ixon @, ixany @, ixoff @ imaxbel and @ iutf8 .desc These variables specify bitmask values for the .code iflag slot of the .code termios structure. They correspond to the C language preprocessor symbols .codn IGNBRK , .codn BRKINT , .code IGNPAR and so forth. The .code imaxbel and .code iutf8 variables are specific to Linux and may not be present. Portable code should test for their presence with .codn boundp . The .code iuclc variable is a legacy feature not found on all systems. Note: the .code termios methods .code set-iflags and .code clear-iflags provide a convenient means for setting and clearing combinations of these flags. .coNP Variables @, opost @, olcuc @, onlcr @, ocrnl @, onocr @, onlret @, ofill @, ofdel @, vtdly @, vt0 @, vt1 @, nldly @, nl0 @, nl1 @, crdly @, cr0 @, cr1 @, cr2 @, cr3 @, tabdly @, tab0 @, tab1 @, tab2 @, tab3 @, bsdly @, bs0 @, bs1 @, ffdly @ ff0 and @ ff1 .desc These variables specify bitmask values for the .code oflag slot of the .code termios structure. They correspond to the C language preprocessor symbols .codn OPOST , .codn OLCUC , .code ONLCR and so forth. The variable .code ofdel is Linux-specific. Portable programs should test for its presence using .codn boundp . The .code olcuc variable is a legacy feature not found on all systems. Likewise, whether the following groups of symbols are present is platform-specific: .codn nldly , .code nl0 and .codn nl1 ; .codn crdly , .codn cr0 , .codn cr1 , .code cr2 and .codn cr3 ; .codn tabdly , .codn tab0 , .codn tab1 , .code tab2 and .codn tab3 ; .codn bsdly , .code bs0 and .codn bs1 ; and .codn ffdly , .code ff0 and .codn ff1 . 
Note: the .code termios methods .code set-oflags and .code clear-oflags provide a convenient means for setting and clearing combinations of these flags. .coNP Variables @, csize @, cs5 @, cs6 @, cs7 @, cs8 @, cstopb @, cread @, parenb @, parodd @, hupcl @, clocal @, cbaud @, cbaudex @ cmspar and @ crtscts .desc These variables specify bitmask values for the .code cflag slot of the .code termios structure. They correspond to the C language preprocessor symbols .codn CSIZE , .codn CS5 , .code CS6 and so forth. The following are present on Linux, and may not be available on other platforms. Portable code should test for them using .codn boundp : .codn cbaud , .codn cbaudex , .code cmspar and .codn crtscts . Note: the .code termios methods .code set-cflags and .code clear-cflags provide a convenient means for setting and clearing combinations of these flags. .coNP Variables @, isig @, icanon @, echo @, echoe @, echok @, echonl @, noflsh @, tostop @, iexten @, xcase @, echoctl @, echoprt @, echoke @, flusho @ pendin and @ extproc .desc These variables specify bitmask values for the .code lflag slot of the .code termios structure. They correspond to the C language preprocessor symbols .codn ISIG , .codn ICANON , .code ECHO and so forth. The following are present on Linux, and may not be available on other platforms. Portable code should test for them using .codn boundp : .codn iexten , .codn xcase , .codn echoctl , .codn echoprt , .codn echoke , .codn flusho , .code pendin and .codn extproc . Note: the .code termios methods .code set-lflags and .code clear-lflags provide a convenient means for setting and clearing combinations of these flags. .coNP Variables @, vintr @, vquit @, verase @, vkill @, veof @, vtime @, vmin @, vswtc @, vstart @, vstop @, vsusp @, veol @, vreprint @, vdiscard @, vwerase @ vlnext and @ veol2 .desc These variables specify integer offsets into the vector stored in the .code cc slot of the .code termios structure. 
They correspond to the C language preprocessor symbols .codn VINTR , .codn VQUIT , .code VERASE and so forth. The following are present on Linux, and may not be available on other platforms. Portable code should test for them using .codn boundp : .codn vswtc , .codn vreprint , .codn vdiscard , .code vlnext and .codn veol2 . .coNP Variables @, tcooff @, tcoon @ tcioff and @ tcion .desc These variables hold integer values suitable as the .meta action argument of the .code tcflow function. They correspond to the C language preprocessor symbols .codn TCOOFF , .codn TCOON , .code TCIOFF and .codn TCION . .coNP Variables @, tciflush @ tcoflush and @ tcioflush .desc These variables hold integer values suitable as the .meta queue argument of the .code tcflush function. They correspond to the C language preprocessor symbols .codn TCIFLUSH , .code TCOFLUSH and .codn TCIOFLUSH . .coNP Variables @, tcsanow @ tcsadrain and @ tcsaflush .desc These variables hold integer values suitable as the .meta actions argument of the .code tcsetattr function. They correspond to the C language preprocessor symbols .codn TCSANOW , .code TCSADRAIN and .codn TCSAFLUSH . .coNP Functions @ tcgetattr and @ tcsetattr .synb .mets (tcgetattr <> [ device ]) .mets (tcsetattr < termios >> [ actions <> [ device ]]) .syne .desc The .code tcgetattr and .code tcsetattr functions, respectively, retrieve and install the configuration of the terminal driver associated with the specified device. These functions are wrappers for the like-named POSIX C library functions, but with different argument conventions, and operating using a \*(TL structure. The .code tcgetattr function, if successful, returns a new instance of the .code termios structure. The .code tcsetattr function requires an instance of a .code termios structure as an argument to its .meta termios parameter. 
A program may alter the settings of a terminal device by retrieving them using .codn tcgetattr , manipulating the structure returned by this function, and then using .code tcsetattr to install the modified structure into the device. The .meta actions argument of .code tcsetattr may be given as the value of one of the variables .codn tcsanow , .code tcsadrain or .codn tcsaflush . If it is omitted, the default is .codn tcsadrain . If an argument is given for .meta device it must be either a stream, or an integer file descriptor. In either case, it is expected to be associated with a terminal (TTY) device. If the argument is omitted, it defaults to the stream currently stored in the .code *stdin* stream special variable, expected to be associated with a terminal device. .TP* Notes: The C .code termios structure usually does not have members for representing the input and output speed. \*(TL does not use such members, in any case, even if they are present. The speeds are encoded in the .code cc_iflag and .code cc_lflag bitmasks. When retrieving the settings, the .code tcgetattr function uses the POSIX functions .code cfgetispeed and .code cfgetospeed to retrieve the speed values from the C structure. These values are installed as the .code ispeed and .code ospeed slots of the Lisp structure. A reverse conversion takes place when settings are installed using .codn tcsetattr : the speed values are taken from the slots, and installed into the C structure using .code cfsetispeed and .code cfsetospeed before the structure is passed to the C .code tcsetattr function. On Linux, TTY devices do not have a separate input and output speed. The C .code termios structure stores only one speed which is taken as both the input and output speed, with a special exception. The input speed may be programmed as zero. In that case, it is independently represented.
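.TP* Example:

The retrieve, modify and install sequence described above might be sketched as follows. This is only an illustration, and it assumes that standard input is a TTY. It turns off echo while one line is read, then reinstalls the original settings. The variable names
.code orig
and
.code tio
are arbitrary choices, not part of the API.

.verb
  (let ((orig (tcgetattr))  ;; untouched instance, kept for restoring
        (tio (tcgetattr)))  ;; second instance, to be modified
    tio.(clear-lflags echo)
    (tcsetattr tio tcsaflush)
    (unwind-protect
      (get-line)
      ;; cleanup form: evaluated however get-line terminates
      (tcsetattr orig tcsaflush)))
.brev

Calling
.code tcgetattr
twice is one simple way to obtain two independent structures, since each successful call returns a new instance.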
.coNP Function @ tcsendbreak .synb .mets (tcsendbreak >> [ duration <> [ device ]]) .syne .desc The .code tcsendbreak function generates a break signal on serial devices. The .meta duration parameter specifies the length of the break signal in milliseconds. If the argument is omitted, the value 500 is used. The .meta device parameter is exactly the same as that of the .code tcsetattr function. .coNP Function @ tcdrain .synb .mets (tcdrain <> [ device ]) .syne .desc The .code tcdrain function waits until all queued output on a terminal device has been transmitted. It is a direct wrapper for the like-named POSIX C function. The .meta device parameter is exactly the same as that of the .code tcsetattr function. .coNP Function @ tcflush .synb .mets (tcflush < queue <> [ device ]) .syne .desc The .code tcflush function discards either untransmitted output data, or received and yet unread input data, depending on the value of the .meta queue argument. It is a direct wrapper for the like-named POSIX C function. The .meta queue argument should be the value of one of the variables .codn tciflush , .code tcoflush and .codn tcioflush , which specify the flushing of input data, output data or both. The .meta device parameter is exactly the same as that of the .code tcsetattr function. .coNP Function @ tcflow .synb .mets (tcflow < action <> [ device ]) .syne .desc The .code tcflow function provides bidirectional flow control on the specified terminal device. It is a direct wrapper for the like-named POSIX C function. The .meta action argument should be the value of one of the variables .codn tcooff , .codn tcoon , .code tcioff and .codn tcion . The .meta device parameter is exactly the same as that of the .code tcsetattr function. 
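.TP* Example:

The following sketch shows how the queue and flow control functions described above might be called. It assumes that the streams
.code *stdin*
and
.code *stdout*
are both associated with a terminal device; otherwise the underlying POSIX functions are likely to fail.

.verb
  ;; discard typeahead which has been received but not yet read
  (tcflush tciflush *stdin*)

  ;; suspend output on the terminal, then resume it
  (tcflow tcooff *stdout*)
  (tcflow tcoon *stdout*)

  ;; wait until all queued output has been transmitted
  (tcdrain *stdout*)
.brev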
.coNP Methods @, set-iflags @, set-oflags @, set-cflags @, set-lflags @, clear-iflags @, clear-oflags @ clear-cflags and @ clear-lflags .synb .mets << termios .(set-iflags << flags *) .mets << termios .(set-oflags << flags *) .mets << termios .(set-cflags << flags *) .mets << termios .(set-lflags << flags *) .mets << termios .(clear-iflags << flags *) .mets << termios .(clear-oflags << flags *) .mets << termios .(clear-cflags << flags *) .mets << termios .(clear-lflags << flags *) .syne .desc These methods of the .code termios structure set or clear multiple flags of the four bitmask flag fields. The .meta flags arguments specify zero or more integer values. These values are combined together bitwise, as if by the .code logior function to form a single effective mask. If there are no .meta flags arguments, then the effective mask is zero. The .code set-iflags method sets, in the .code iflag slot of the .meta termios structure, all of the bits which are set in the effective mask. That is to say, the effective mask is combined with the value in .code iflag by a .code logior operation, and the result is stored back into .codn iflag . Similarly, the .codn set-oflags , .code set-cflags and .code set-lflags methods operate on the .codn oflag , .code cflag and .code lflag slots of the structure. The .code clear-iflags method clears, in the .code iflag slot of the .meta termios structure, all of the bits which are set in the effective mask. That is to say, the effective mask is bitwise inverted as if by the .code lognot function, and then combined with the existing value of the .code iflag slot using .codn logand . The resulting value is stored back into the .code iflag slot. Similarly, the .codn clear-oflags , .code clear-cflags and .code clear-lflags methods operate on the .codn oflag , .code cflag and .code lflag slots of the structure. 
Note: the methods .codn go-raw , .code go-cbreak and .code go-canon are provided for changing the settings to raw, "cbreak" and canonical mode. These methods should be preferred to directly manipulating the flag and .code cc slots. .TP* Example In this example, .code tio is assumed to be a variable holding an instance of a .code termios struct: .verb ;; clear the ignbrk, brkint, and various other flags: tio.(clear-iflags ignbrk brkint parmrk istrip inlcr igncr icrnl ixon) ;; set the csize and parenb flags: tio.(set-cflags csize parenb) .brev .coNP Methods @ go-raw and @ go-cbreak .synb .mets << termios .(go-raw) .mets << termios .(go-cbreak) .syne .desc The .code go-raw and .code go-cbreak methods of the .code termios structure manipulate the flag slots, as well as certain elements of the .code cc slot, in order to prepare the terminal settings for, respectively, "raw" and "cbreak" mode, described below. Note that manipulating the .code termios structure doesn't actually put these settings into effect in the terminal device; the settings represented by the structure must be installed into the device using .codn tcsetattr . There is no way to reverse the effect of these methods. To precisely restore the previous terminal settings, the program should retain a copy of the original .code termios structure. "Raw" mode refers to a configuration of the terminal device driver in which input and output is passed transparently and without accumulation, conversion or interpretation. Input isn't buffered into lines; as soon as a single byte is received, it is available to the program. No special input characters such as commands for generating an interrupt or process suspend request are processed by the terminal driver; all characters are treated as input data. Input isn't echoed; the only output which takes place is that generated by program output requests to the device. "Cbreak" mode is named after a concept and function in the "curses" terminal control library. 
It refers to a configuration of the terminal device driver which is less transparent than "raw" mode. Input isn't buffered into lines, and line editing commands are ordinary input characters, allowing character-at-a-time input. However, most input translations are preserved, except that the conversion of CR characters to NL is disabled. The signal-generating characters are processed in this mode. This latter feature of the configuration is the likely inspiration for the word "cbreak". Unless otherwise configured, the interrupt character corresponds to the .key Ctrl-C key, and "break" is another term for an interactive interruption. .coNP Methods @ string-encode and @ string-decode .synb .mets << termios .(string-encode) .mets << termios .(string-decode << string ) .syne .desc The .code string-encode method converts the terminal state stored in a .code termios structure into a textual format, returning that representation as a character string. The .code string-decode method parses the character representation produced by .code string-encode and populates the .meta termios structure with the settings that are encoded in that string. If a string is passed to .code string-decode which wasn't produced by .codn string-encode , the behavior is unspecified. An exception may or may not be thrown, and the contents of .meta termios may or may not be affected. Note: the textual representation produced by .code string-encode is intended to be identical to that produced by the .code -g option of the GNU Coreutils version of the .code stty utility, on the same platform. That is to say, the output of .code "stty -g" may be used as input into .codn string-decode , and the output of .code string-encode may be used as an argument to .codn stty .
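.TP* Example:

Since the effect of
.code go-raw
cannot be reversed, one possible approach is to take a textual snapshot with
.code string-encode
beforehand, and later decode it back into the structure. The following is a sketch under the assumption that standard input is a TTY; the variable names
.code tio
and
.code saved
are arbitrary.

.verb
  (let* ((tio (tcgetattr))
         (saved tio.(string-encode)))  ;; snapshot of the settings
    tio.(go-raw)
    (tcsetattr tio tcsaflush)
    (unwind-protect
      (get-char)  ;; read one character in raw mode
      ;; cleanup forms: repopulate the structure and reinstall it
      tio.(string-decode saved)
      (tcsetattr tio tcsaflush)))
.brev

Retaining a second
.code termios
structure, as suggested in the description of
.codn go-raw ,
achieves the same result; the string form may be more convenient when the saved state has to be stored or passed to another program such as
.codn stty .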
.SS* Unix System Identification .coNP Structure @ utsname .synb .mets (defstruct utsname nil .mets \ \ sysname nodename release .mets \ \ version machine domainname) .syne .desc The .code utsname structure corresponds to the POSIX structure of the same name. An instance of this structure is returned by the .code uname function. .coNP Function @ uname .synb .mets (uname) .syne .desc The .code uname function corresponds to the POSIX function of the same name. It returns an instance of the .code utsname structure. Each slot of the returned structure is initialized with a character string that identifies the corresponding attribute of the host system. The host system might not support the reporting of the NIS domain name. In this case, the .code domainname slot of the returned .code utsname structure will have the value .codn nil . .SS* Unix Resource Limits .coNP Structure @ rlim .synb .mets (defstruct rlim nil .mets \ \ cur max) .syne .desc The .code rlim structure is required by the functions .code getrlimit and .codn setrlimit . It is analogous to the C structure by the same name described in POSIX. .coNP Variables @, rlim-saved-max @ rlim-saved-cur and @ rlim-infinity .desc These variables correspond to the POSIX constants .codn RLIM_SAVED_MAX , .code RLIM_SAVED_CUR and .codn RLIM_INFINITY . They have the same values, and are suitable as slot values of the .code rlim structure. .coNP Variables @, rlimit-core @, rlimit-cpu @, rlimit-data @, rlimit-fsize @, rlimit-nofile @ rlimit-stack and @ rlimit-as .desc These variables correspond to the POSIX constants .codn RLIMIT_CORE , .codn RLIMIT_CPU , .code RLIMIT_DATA and so forth. .coNP Functions @ getrlimit and @ setrlimit .synb .mets (getrlimit < resource <> [ rlim ]) .mets (setrlimit < resource << rlim ) .syne .desc The .code getrlimit function retrieves information about the limits imposed for a particular parameter indicated by the .meta resource integer.
The .code setrlimit function changes the limit information for a resource parameter. The .meta resource parameter is the value of one of the variables .codn rlimit-core , .codn rlimit-cpu , .code rlimit-data and so forth. The .meta rlim argument is a structure of type .codn rlim . If this argument is given to the .code getrlimit function, then it fills in that structure with the retrieved parameters. Otherwise it allocates a new structure and fills that one. In either situation, the filled structure is returned, if the underlying call to the host operating system is successful. In the case of .codn setrlimit , the .code rlim object must have non-negative integer values which are in the range of the platform's .code rlim_t type. If the underlying system call fails, then these functions throw an exception. In the successful case, the .code getrlimit function returns the .code rlim structure, and .code setrlimit returns .codn t . Further information about resource limits is available in the POSIX standard and platform documentation. .SS* Unix Memory Mapping The \*(TL interface to the POSIX .code mmap family of functions is based around the .code carray type. The .code mmap function returns a special variant of a .code carray object which keeps track of the memory mapping. When such an object becomes unreachable and is reclaimed by garbage collection, the mapping is automatically unmapped. In addition to .codn mmap , the functions .codn munmap , .codn mprotect , .code madvise and .code msync are provided, all taking a .code carray as their leftmost argument. The \*(TL functions do not strictly follow the argument conventions of the same-named, corresponding POSIX functions. Adjustments which are likely to be defaulted are moved to the right. For instance, the .code msync operation is often applied to the entire memory mapping. Therefore, the first argument is the .code carray object which keeps track of the mapping. 
The second argument specifies the flags to be applied, which constitute the last argument of the underlying POSIX function. The remaining two arguments are the size and offset. If these are omitted, then .code msync applies to the entire region, whose address and size are known to the .code carray object. Cautionary note: misuse of .code mmap and related functions can easily cause the \*(TX image to receive a fatal signal due to a bad memory access. Care must be taken to prevent such a situation, or else to catch such signals and recover. .coNP Function @ mmap .synb .mets (mmap < ffi-type < length < prot < flags .mets \ \ \ \ \ >> [ source >> [ offset <> [ addr ]]]) .syne .desc The .code mmap function provides access to the same-named POSIX platform function for creating memory mappings. The POSIX function can be used for creating virtual memory views of files and special devices. Views can be read-only, and they can be mutable. Mutable views can be set up in such a way that changes appear only in the mapping itself, or in such a way that the changes are actually propagated to the mapped object. Mappings can be shared among processes, providing a shared memory mechanism: for instance, if .code fork is called, any .code map-shared mappings created by the parent are shared with the child: the child process does not get a copy of a shared mapping, but a reference to it. The function can also be used simply for allocating memory: on some platforms, the POSIX .code mmap function is used as the basis for the .code malloc function. It behaves as a pure allocator when asked to create a mapping which is private, and anonymous (not backed by an object). The \*(TL .code mmap function is integrated with the .code carray type and the FFI type system. A mapping returned by .code mmap is represented by a .code carray object. The required .meta ffi-type argument specifies the element type of the array; it must be a compiled FFI type. Note: this may be produced by the .code ffi macro.
For instance, the type .code int may be specified using the expression .codn "(ffi int)" . The type must be a complete type suitable as the element type of an array; a type with a zero size such as .code void is invalid. The .meta length argument specifies the requested length of the mapping in bytes. Note that .code mmap allocates or configures virtual memory pages, not bytes. Internally to the system, the .meta length argument is converted to a number of pages. If it specifies a fractional number of pages, it is rounded up. For instance, if the page size is 4096 bytes, and .meta length is specified as 5000, it will be internally rounded up to 8192. The returned \*(TL .code carray object is oblivious to this padding: it works with the given 5000-byte size. Note: the .code page-size variable holds the system's page size. However, by the use of .code mmap extensions, it is possible for individual mappings to have their own page size. Mixed page size virtual memory systems exist. The .code mmap function determines the number of elements in the array by dividing the .meta length by the size of .metn ffi-type , using a division that truncates toward zero. The returned .code carray shall have that many elements. If the division is inexact, it means that some bytes from the underlying memory mapping are unused, even if .meta length is a multiple of the page size. The required .meta prot argument must be some bitwise combination of the portable values .codn prot-read , .code prot-write and .codn prot-exec . Additional system-specific .code prot- values may be available also for specifying additional properties. If .meta prot is specified as zero, then the mapping, if successfully created, may be inaccessible: .code prot-read must be present to ensure read access, and .code prot-write to ensure write access. The .meta flags argument is a bitwise combination of values given by various .code map- variables.
At the very least, it must contain exactly one of .code map-shared or .codn map-private , to request a shared or private mapping, respectively. If a mapping is requested which is neither shared nor private, the underlying POSIX function will likely fail. If a .meta source is specified, indicating a filesystem object to be mapped, the .code map-anon flag must be omitted. Vice versa, if .meta source is not specified, this means that the mapping will be anonymous. In this situation, the .code map-anon flag must be present. The .meta source argument may be an integer file descriptor. If so, this value will be passed to the underlying POSIX function directly. The .meta source argument may be a stream object, in which case the .code fileno function will be applied to it, which must retrieve an integer file descriptor which will be passed to the POSIX function. The .meta source argument may be a filename. The specified file is opened as if via .codn open-file , with a .meta mode-string which is .str "r+" if the .meta prot argument includes the .code prot-write flag, otherwise .strn "r" . The integer file descriptor from this open stream is used in the underlying .code mmap call. The file is immediately closed when .code mmap returns. In all cases, the integer file descriptor passed to the POSIX function must be a value suitable for conversion to the .code int type. Note: in the context of .codn mmap , "anonymous" means "not associated with a filesystem object referenced by a descriptor". It does not mean "without a name", but refers to a pure memory allocation from virtual memory. Memory maps do not have a name, whether anonymous or not. Moreover, the filesystem object associated with a memory map itself does not necessarily have a name. An open file that has been deleted from the directory structure is anonymous, yet a memory mapping can be created using its descriptor, and that mapping is not "anonymous". The .meta offset argument is used with a non-anonymous mapping. 
It specifies that the mapping doesn't begin at the start of the file or file-like object, but rather at the specified offset. The offset may not be an arbitrary integer; it must be a multiple of the page size. Unless certain nonportable .meta flags are used to specify an alternative page size, the value of the .code page-size variable may be relied upon to indicate the page size. If an .meta offset is specified for an anonymous mapping, with a nonzero value, the underlying POSIX function may indicate failure. If the .meta length and .meta offset values cause one or more pages to be mapped which are beyond the end of the file, then accessing those pages may produce a signal which is fatal if not handled. The .meta addr argument is used for specifying the address in conjunction with the .code map-fixed flag. Possibly, certain nonportable values in the .meta flags argument may similarly require .metn addr . If no bit is present in .meta flags which requires .metn addr , then .meta addr should either not be specified, or specified as zero. A nonzero value of .meta addr must be a multiple of the page size. The .code mmap function returns a .code carray object if successful. Upon failure, an exception derived from .code error is thrown. Note: when a .code carray object returned by .code mmap is identified by the garbage collector as unreachable, and reclaimed, the memory mapping is unmapped. The .code munmap function can be invoked on the .code carray to release the mapping before the object becomes garbage. The .code carray-free function cannot be used on a mapped .codn carray . .coNP Function @ munmap .synb .mets (munmap << carray ) .syne .desc The .code munmap function releases the memory mapping tracked by .metn carray , which must be an object previously returned by .codn mmap . An exception is thrown if the object is any other kind of .codn carray . Note: the memory mapping is released by means of the same-named POSIX function. 
No provision is made for selectively unmapping the pages of a mapping; the entire mapping associated with a .meta carray is removed. When the memory mapping is released, .code munmap returns .codn t . Thereafter, the .metn carray 's contents may no longer be accessed, subject to .code error exceptions being thrown. If .code munmap is called again on a .code carray on which it had previously been successfully called, the additional calls return .codn nil . .coNP Functions @, mprotect @ madvise and @ msync .synb .mets (mprotect < carray < prot >> [ offset <> [ size ]]) .mets (madvise < carray < advice >> [ offset <> [ size ]]) .mets (msync < carray < flags >> [ offset <> [ size ]]) .syne .desc The functions .codn mprotect , .code madvise and .code msync perform various operations and adjustments on a memory mapping, using the same-named, corresponding POSIX functions. All functions follow the same argument conventions with regard to the .meta carray argument and the optional .meta offset and .meta size arguments. The respective second arguments .metn prot , .meta advice and .meta flags are all integers. Of these, .meta prot and .meta flags are bitmapped flags, whereas .meta advice specifies an enumerated command. The .meta prot argument is a bitwise combination of .code prot- values such as .codn prot-read , .code prot-write and .codn prot-exec . The .code mprotect function adjusts the protection bits of the mapping accordingly. The .meta advice argument of .code madvise should specify one of the following portable values, or else some system-specific nonportable .code madv- value: .codn madv-normal , .codn madv-random , .codn madv-sequential , .code madv-willneed or .codn madv-dontneed . The .meta flags argument of .code msync should specify exactly one of the values .code ms-async and .codn ms-sync . Additional .code ms- values such as .code ms-invalidate may be combined in. 
If .meta offset and .meta size are omitted, they default to zero and the size of the entire mapping, respectively, so the operation applies to the entire mapping. If only .meta size is specified, it must not exceed the mapping size, or an error exception is thrown. The .meta offset argument defaults to zero. If only .meta offset is specified, it must not exceed the length of the mapping, or else an error exception is thrown. The size is calculated as the difference between the offset and the length. It may be zero. If both .meta offset and .meta size are specified, they must not specify a region any portion of which lies outside the mapping. If .meta size is zero, .meta offset may be equal to the length of the mapping. The .meta offset must be a multiple of the page size, or else the operation will fail, since these functions work with virtual memory pages, and not individual bytes. The .meta length is adjusted by the system to a multiple of the applicable page size, as noted in the description of .codn mmap . When any of these three functions succeeds, it returns .codn t . Otherwise, it throws an exception. .coNP Variables @, map-shared @, map-private @ map-anon and @ map-fixed .desc The integer values of these variables are bitmasks, intended to be combined with .code logior to prepare a value for the .meta flags argument of .codn mmap . Additional nonportable, system-dependent .code map- variables may be available. Their names are derived by taking the .codn MAP_ -prefixed symbol from the platform header file, converting it to lowercase and replacing underscores by hyphen characters. Any such variable which exists, but has a value of zero, is present only for compatibility with other systems. For instance .code map-huge-shift may be present in non-Linux ports of \*(TX, but with a zero value; it has a nonzero value on Linux systems to which it is specific. 
Applications critically relying on certain flags should test the corresponding variables for nonzero to make sure they are actually available. .coNP Variables @, prot-none @, prot-read @ prot-write and @ prot-exec .desc The integer values of these variables are bitmasks, intended to be combined with .code logior to prepare a value for the .meta prot argument of .code mmap and .codn mprotect . Additional nonportable, system-dependent .code prot- variables may be available. Their names are derived by taking the .codn PROT_ -prefixed symbol from the platform header file, converting it to lowercase and replacing underscores by hyphen characters. Any such variable which exists, but has a value of zero, is present only for compatibility with other systems. .coNP Variables @, madv-normal @, madv-random @, madv-sequential @ madv-willneed and @ madv-dontneed .desc The integer values of these variables are enumerated commands, exactly one of which is passed as the .meta advice argument of the .code madvise function. They are not bitmasks, and are not intended to be combined. Additional nonportable, system-dependent .code madv- variables may be available. Their names are derived by taking the .codn MADV_ -prefixed symbol from the platform header file, converting it to lowercase and replacing underscores by hyphen characters. Any such variable which exists, but has a value of zero, is present only for compatibility with another system. .coNP Variables @, ms-async @ ms-sync and @ ms-invalidate .desc The integer values of these variables are bitmasks, intended to be combined with .code logior to prepare a value for the .meta flags argument of the .code msync function. As described under .codn msync , exactly one of .code ms-async and .code ms-sync should be present; .code ms-invalidate is optional.
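.TP* Example:

The memory mapping functions and variables described in this section might be used together as in the following sketch, which treats
.code mmap
as a pure allocator. The length of 4096 bytes and the
.code int32
element type are arbitrary choices made for illustration, and the name
.code ca
is hypothetical. The element access functions
.code carray-ref
and
.code carray-refset
belong to the
.code carray
API documented elsewhere.

.verb
  (let ((ca (mmap (ffi int32) 4096
                  (logior prot-read prot-write)
                  (logior map-private map-anon))))
    ;; 4096 bytes divided by 4-byte elements: 1024 elements
    (prinl (length-carray ca))
    (carray-refset ca 0 42)
    (prinl (carray-ref ca 0))
    ;; make the entire mapping read-only
    (mprotect ca prot-read)
    ;; release the mapping without waiting for garbage collection
    (munmap ca))
.brev

Because no
.meta source
is given, the
.code map-anon
flag is included, as required by the description of
.codn mmap .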
.SS* Web Programming Support .coNP Functions @ url-encode and @ url-decode .synb .mets (url-encode < string <> [ space-plus-p ]) .mets (url-decode < string <> [ space-plus-p ]) .syne .desc These functions convert character strings to and from a form which is suitable for embedding into the request portions of URL syntax. Encoding a string for URL use means identifying in it certain characters that might have a special meaning in the URL syntax and representing them using "percent encoding": the percent character, followed by the two-digit hexadecimal ASCII value of the character. Spaces and control characters are also encoded, as are all byte values greater than or equal to 127 (7F hex). The printable ASCII characters which are percent-encoded consist of this set: .verb :/?#[]@!$&'()*+,;=% .brev More generally, strings can consist of Unicode characters, but the URL encoding consists only of printable ASCII characters. Unicode characters in the original string are encoded by expanding into UTF-8, and applying percent-encoding to the UTF-8 bytes, which are all in the range .codn \ex80-\exFF . Decoding is the reverse process: reconstituting the UTF-8 byte sequence specified by the URL-encoding, and then decoding the UTF-8 sequence into the string of Unicode characters. There is an additional complication: whether or not to encode spaces as plus, and to decode plus characters to spaces. In encoding, if spaces are not encoded to the plus character, then they are encoded as .codn %20 , since spaces are reserved characters that must be encoded. In decoding, if plus characters are not decoded to spaces, then they are left alone: they become plus characters in the decoded string. The .code url-encode function performs the encoding process. If the .code space-plus-p argument is omitted or specified as .codn nil , then spaces are encoded as .codn %20 . If the argument is a value other than .codn nil , then spaces are encoded as the character .code + (plus).
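.TP* Example:

The treatment of spaces and plus characters described above can be illustrated with a short string. The input
.str "a b&c"
is an arbitrary choice; the results shown follow from the rules stated above, the ampersand being a member of the percent-encoded set.

.verb
  (url-encode "a b&c")      -> "a%20b%26c"
  (url-encode "a b&c" t)    -> "a+b%26c"
  (url-decode "a+b%26c")    -> "a+b&c"
  (url-decode "a+b%26c" t)  -> "a b&c"
.brev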
The .code url-decode function performs the decoding process. If the .code space-plus-p argument is omitted or specified as .codn nil , then .code + (plus) characters in the encoded data are retained as .code + characters in the decoded strings. Otherwise, plus characters are converted to spaces. .coNP Functions @, html-encode @ html-encode* and @ html-decode .synb .mets (html-encode << text-string ) .mets (html-encode* << text-string ) .mets (html-decode << html-string ) .syne .desc The .code html-encode and .code html-decode functions convert between an HTML and raw representation of text. The .code html-encode function returns a string which is based on the content of .metn text-string , but in which all characters which have special meaning in HTML have been replaced by HTML codes for representing those characters literally. The returned string is the HTML-encoded verbatim representation of .metn text-string . The .code html-decode function converts .metn html-string , which may contain HTML character encodings, into a string which contains the actual characters represented by those encodings. The function composition .code "(html-decode (html-encode text))" returns a string which is equal to .codn text . The reverse composition .code "(html-encode (html-decode html))" does not necessarily return a string equal to .codn html . For instance if html is the string .strn "<p>Hello, world&#33;</p>" , then .code html-decode produces .strn "<p>Hello, world!</p>" . From this, .code html-encode produces .strn "&lt;p&gt;Hello, world!&lt;/p&gt;" . The .code html-encode* function is similar to .code html-encode except that it does not encode the single and double quote characters (ASCII 39 and 34, respectively). Text prepared by this function may not be suitable for insertion into an HTML template, depending on the context of its insertion. It is suitable as text placed between tags but not necessarily as tag attribute material. .coNP Functions @, base64-encode @ base64-decode and @ base64-decode-buf .synb .mets (base64-encode >> [ string | << buf ] <> [ column-width ]) .mets (base64-decode < string) .mets (base64-decode-buf < string) .syne .desc The .code base64-encode function converts the UTF-8 representation of .metn string , or the contents of .metn buf , to Base64 and returns that representation as a string. The Base64 encoding is described in RFC 4648, section 4. The first argument must either be a character string, or a buffer object. The .code base64-decode function performs the opposite conversion; it extracts the bytes encoded in a Base64 string, and decodes them as UTF-8 to return a character string. The .code base64-decode-buf function extracts the bytes encoded in a Base64 string, and returns a new buffer object containing these bytes. The Base64 encoding divides the UTF-8 representation of .meta string or the bytes contained in .meta buf into groups of six bits, each representing the values 0 to 63. Each value is then mapped to the characters .code A to .codn Z , .code a to .codn z , the digits .code 0 to .code 9 and the characters .code + and .codn / . One or two consecutive occurrences of the character .code = are added as padding so that the number of non-whitespace characters is divisible by four. These characters map to the code 0, but are understood not to contribute to the length of the encoded message.
The .code base64-encode function enforces this convention, but .code base64-decode doesn't require these padding characters. Base64-encoding an empty string or zero-length buffer results in an empty string. If the .meta column-width argument is passed to .codn base64-encode , then the Base64 encoded string, unless empty, contains newline characters, which divide it into lines which are .meta column-width characters long, except possibly for the last line. .coNP Functions @ base64-stream-enc and @ base64-stream-dec .synb .mets (base64-stream-enc < out < in >> [ nbytes <> [ column-width ]]) .mets (base64-stream-dec < out << in ) .syne .desc The .code base64-stream-enc and .code base64-stream-dec functions perform, respectively, bulk Base64 encoding and decoding between streams. This format is described in RFC 4648, section 4. The .meta in and .meta out arguments must be stream objects. The .meta out stream must support output. In the decode operation, it must support byte output. The .meta in stream must support input. In the encode operation it must support byte input. The .code base64-stream-enc function reads a sequence of bytes from the .meta in stream and writes characters to the .meta out stream comprising the Base64 encoding of that sequence. If the .meta nbytes argument is specified, it must be a nonnegative integer. At most .meta nbytes bytes will be read from the .meta in stream. If .meta nbytes is omitted, then the operation will read from the .meta in stream without limit, until that stream indicates that no more bytes are available. The optional .meta column-width argument influences the formatting of Base64 output, in the same manner as documented for the .code base64-encode function. The .code base64-stream-dec function reads the characters of a Base64 encoding from the .meta in stream and writes the corresponding byte sequence to the .meta out stream.
It keeps reading and decoding until it encounters the end of the stream, or a character not used in Base64: a character that is not whitespace according to .codn chr-isspace , and isn't any of the Base64 coding characters (that is, not an alphanumeric character, and not one of the characters .codn + , .code / or .codn = ). If the function stops due to a non-Base64 character, that character is pushed back into the .meta in stream. The .code base64-stream-enc function returns the number of bytes encoded; the .code base64-stream-dec function returns the number of bytes decoded. .coNP Functions @, base64url-encode @ base64url-decode and @ base64url-decode-buf .synb .mets (base64url-encode >> [ string | << buf ] <> [ column-width ]) .mets (base64url-decode < string) .mets (base64url-decode-buf < string) .syne .desc The .codn base64url-encode , .code base64url-decode and .code base64url-decode-buf functions conform, in nearly every respect, to the descriptions of, respectively, .codn base64-encode , .code base64-decode and .codn base64-decode-buf . The difference is that these functions use the URL- and filename-safe encoding described in section 5 of RFC 4648, rather than the standard Base64 alphabet. This means that, in the encoding alphabet, instead of the symbols .code + (plus) and .code / (slash) the symbols .code - (minus) and .code _ (underline) are used. .coNP Functions @ base64url-stream-enc and @ base64url-stream-dec .synb .mets (base64url-stream-enc < out < in >> [ nbytes <> [ column-width ]]) .mets (base64url-stream-dec < out << in ) .syne .desc The .code base64url-stream-enc and .code base64url-stream-dec functions conform, in nearly every respect, to the descriptions of, respectively, .code base64-stream-enc and .codn base64-stream-dec . The difference is that these functions use the URL- and filename-safe encoding described in section 5 of RFC 4648, rather than the standard Base64 alphabet. This means that, in the encoding alphabet, instead of the symbols .code + (plus) and .code / (slash) the symbols .code - (minus) and .code _ (underline) are used.
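.TP* Examples:
The following sketch illustrates the string-based Base64 functions described above. The inputs are arbitrary and chosen only for illustration; the results shown follow from the standard and URL-safe alphabets of RFC 4648.
.verb
  ;; standard alphabet, with = padding
  (base64-encode "Hello") -> "SGVsbG8="
  (base64-decode "SGVsbG8=") -> "Hello"

  ;; inputs whose encodings use the last two alphabet symbols
  (base64-encode ">>>") -> "Pj4+"
  (base64-encode "???") -> "Pz8/"

  ;; the URL-safe alphabet substitutes - and _ for + and /
  (base64url-encode ">>>") -> "Pj4-"
  (base64url-encode "???") -> "Pz8_"
.brev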
.SS* Filter Module The filter module provides a trie (pronounced "try") data structure, which is suitable for representing dictionaries for efficient filtering. Dictionaries are unordered collections of string keys, each of which has an associated value that is also a string. A trie can be used to filter text, such that keys appearing in the text are replaced by the corresponding values. A trie supports this filtering operation by providing an efficient prefix-based lookup method which only looks at each input character once, and which does not require knowledge of the length of the key in advance. .coNP Function @ make-trie .synb .mets (make-trie) .syne .desc The .code make-trie function creates an empty trie. There is no special data type for a trie; a trie is some existing type such as a hash table. .coNP Function @ trie-add .synb .mets (trie-add < trie < key << value ) .syne .desc The .code trie-add function adds the string .meta key to the trie, associating it with .metn value . If .meta key already exists in .metn trie , then the value is updated with .metn value . The .meta trie must not have been compressed with .codn trie-compress . A trie can contain keys which are prefixes of other keys. For instance, it can contain .str dog and .strn dogma . When a trie is used for matching and substitution, the longest match is used. If the input presents the text .strn doggy , then the match is .strn dog . If the input is .strn dogmatic , then .str dogma matches. .coNP Function @ trie-compress .synb .mets (trie-compress << trie ) .syne .desc The .code trie-compress function changes the representation of .meta trie to a representation which occupies less space and supports faster lookups. The new representation is returned. The compressed representation of a trie does not support the .code trie-add function.
The .code trie-compress function destructively manipulates .metn trie , and may return an object that is the same object as .metn trie , or it may return a different object, while at the same time still modifying the internals of .metn trie . Consequently, the program should not retain the input object .metn trie , but use the returned object in its place. .coNP Function @ trie-lookup-begin .synb .mets (trie-lookup-begin << trie ) .syne .desc The .code trie-lookup-begin function returns a context object for performing an open-coded lookup traversal of a trie. The .meta trie argument is expected to be a trie that was created by the .code make-trie function. .coNP Function @ trie-lookup-feed-char .synb .mets (trie-lookup-feed-char < trie-context << char ) .syne .desc The .code trie-lookup-feed-char function performs a one-character step in a trie lookup. The .meta trie-context argument must be a trie context returned by .codn trie-lookup-begin , or by some previous call to .codn trie-lookup-feed-char . The .meta char argument is the next character to match. If the lookup is successful (the match through the trie can continue with the given character) then a new trie context object is returned. The old trie context remains valid. If the lookup is unsuccessful, .code nil is returned. Note: determining whether a given string is stored in a trie can be performed by looking up every character of the string successively with .codn trie-lookup-feed-char , using the newly returned context for each successive operation. If every character is found, it means that either that exact string is found in the trie, or that it is a prefix of one or more stored strings. The ambiguity can be resolved by testing whether the trie has a value at the last node using .codn trie-value-at .
For instance, if .str catalog is inserted into an empty trie with value .strn foo , then .str cat will look up successfully, being a prefix of .strn catalog ; however, the value at .str cat is .codn nil , indicating that .str cat is only a prefix of one or more entries in the trie. .coNP Function @ trie-value-at .synb .mets (trie-value-at << trie-context ) .syne .desc The .code trie-value-at function returns the value stored at the node in the trie given by .metn trie-context . Nodes which have not been given a value hold the value .codn nil . .coNP Function @ filter-string-tree .synb .mets (filter-string-tree < filter << obj ) .syne .desc The .code filter-string-tree function returns a tree structure similar to .meta obj in which all of the string atoms have been filtered through .metn filter . The .meta obj argument is a string-tree structure: either the symbol .codn nil , denoting an empty structure; a string; or a list of tree structures. If .meta obj is .codn nil , then .code filter-string-tree returns .codn nil . The .meta filter argument is a filter: it is either a trie, a function, or .codn nil . If .meta filter is .codn nil , then .code filter-string-tree just returns .metn obj . If .meta filter is a function, it must be a function that can be called with one argument. The strings of the string tree are filtered by passing each one into the function and substituting the return value into the corresponding place in the returned structure. Otherwise, if .meta filter is a trie, then this trie is used for filtering the string elements, similarly to a function. For each string, a new string is returned in which occurrences of the keys in the trie are replaced by the values in the trie. .coNP Function @ filter-equal .synb .mets (filter-equal < filter-1 < filter-2 < obj-1 << obj-2 ) .syne .desc The .code filter-equal function tests whether two string trees are equal under the given filters.
The precise semantics can be given by this expression: .mono .mets (equal (filter-string-tree < filter-1 << obj-1 ) .mets \ \ \ \ \ \ (filter-string-tree < filter-2 << obj-2 )) .onom The string tree .meta obj-1 is filtered through .metn filter-1 , as if by the .code filter-string-tree function, and similarly, .meta obj-2 is filtered through .metn filter-2 . The resulting structures are compared using .codn equal , and the result of that is returned. .coNP Function @ regex-from-trie .synb .mets (regex-from-trie << trie ) .syne .desc The .code regex-from-trie function returns a representation of .meta trie as regular-expression abstract syntax, suitable for processing by the .code regex-compile function. The values stored in the trie nodes are not represented in the regular expression. The .meta trie may be one that has been compressed via .codn trie-compress ; in fact, a compressed .meta trie results in more compact syntax. Note: this function is useful for creating a compact, prefix-compressed regular expression which matches a list of strings. .coNP Special Variable @ *filters* .desc The .code *filters* special variable holds a hash table which associates symbols with filters. This hash table defines the named filters used in the \*(TX pattern language. The names are the hash-table keys, and filter objects are the values. Filter objects are one of three representations. The value .code nil represents a null filter, which performs no filtering, passing the input string through. A filter object may be a raw or compressed trie. It may also be a Lisp function, which must be callable with one argument of string type, and must return a string. The application may define new filters by associating symbolic keys in .code *filters* with values which conform to the above representation of filters. 
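.TP* Example:
The following is a minimal sketch of registering an application-defined filter. The
.code :pets
filter name and the dictionary contents are hypothetical, chosen only for illustration. The result shown is what the longest-match rule described under
.code trie-add
implies.
.verb
  ;; build a small dictionary and register it under a new name
  (let ((tr (make-trie)))
    (trie-add tr "dog" "DOG")
    (trie-add tr "dogma" "DOGMA")
    (set [*filters* :pets] (trie-compress tr)))

  ;; the registered object is usable like any other filter
  (filter-string-tree [*filters* :pets]
                      '("doggy" ("dogmatic" "cat")))
  -> ("DOGgy" ("DOGMAtic" "cat"))
.brev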
The behavior is unspecified if any of the predefined filters are removed or redefined, and are subsequently used, or if the .code *filters* variable is replaced or rebound with a hash-table value which omits those keys, or associates them with different values. Note that the functions .codn html-encode , .code html-encode* and .code html-decode use, respectively, the HTML-related filters .codn :tohtml , .code :tohtml* and .codn :fromhtml . .SS* Access To TXR Pattern Language From Lisp It is useful to be able to invoke the abilities of the \*(TX pattern language from \*(TL. An interface for doing this is provided in the form of the .code match-fun function, which is used for invoking a \*(TX pattern function. The .code match-fun function has a cumbersome interface which requires the \*(TL program to explicitly deal with the variable bindings emerging from the pattern match in the form of an association list. To make the interface easier to use, \*(TX provides the macros .codn txr-if , .code txr-when and .codn txr-case . .coNP Function @ match-fun .synb .mets (match-fun < name < args >> [ input <> [ files ]]) .syne .desc The .code match-fun function invokes a \*(TX pattern function whose name is given by .metn name , which must be a symbol. The .meta args argument is a list of expressions. The expressions may be symbols which will be interpreted as pattern variables, and may be bound or unbound. If they are not symbols, then they are treated as expressions (of the pattern language, not \*(TL) and evaluated accordingly. The optional .meta input argument is an object of one of several types. It may be a stream, character string or list of strings. If it is a string, then it is converted to a list containing that string. A list of strings represents zero or more lines of text to be processed. If the .meta input argument is omitted, then it defaults to .codn nil , interpreted as an empty list of lines.
The .meta files argument is a list of filename specifications, which follow the same conventions as files given on the \*(TX command line. If the pattern function uses the .code @(next) directive, it can process these additional files. If this argument is omitted, it defaults to .codn nil . The .code match-fun function's return value falls into three cases. If there is a match failure, it returns .codn nil . Otherwise it returns a cons cell. The .code car field of the cons cell holds the list of captured bindings. The .code cdr of the cons cell is one of two values. If the entire input was processed, the .code cdr field holds the symbol .codn t . Otherwise it holds another cons cell whose .code car is the remainder of the list of lines which were not matched, and whose .code cdr is the line number.
.TP* Example:
.verb
  @(define foo (x y))
  @x:@y
  @line
  @(end)
  @(do
    (format t "~s\en"
            (match-fun 'foo '(a b)
                       '("alpha:beta" "gamma" "omega")
                       nil)))

  Output:
  (((a . "alpha") (b . "beta")) ("omega") . 3)
.brev
In the above example, the pattern function .code foo is called with arguments .codn "(a b)" . These are unbound variables, so they correspond to parameters .code x and .code y of the function. If .code x and .code y get bound, those values propagate to .code a and .codn b . The data being matched consists of the lines .strn alpha:beta , .str gamma and .strn omega . Inside .codn foo , .code x and .code y bind to .str alpha and .strn beta , and then the .code line variable binds to .strn gamma . The input stream is left with .strn omega . Hence, the return value consists of the bindings of .code x and .code y transferred to .code a and .codn b , and the second cons cell which gives information about the rest of the stream: it is the part starting at .strn omega , which is line 3. Note that the binding for the .code line variable does not propagate out of the pattern function .codn foo ; it is local inside it.
.coNP Function @ match-fboundp .synb .mets (match-fboundp << symbol ) .syne .desc The .code match-fboundp function returns .code t if .meta symbol is the name of an existing pattern function, otherwise it returns .codn nil . .coNP Macro @ txr-if .synb .mets (txr-if < name <> ( argument *) < input .mets \ \ \ \ \ \ \ < then-expr <> [ else-expr ]) .syne .desc The .code txr-if macro invokes the \*(TX pattern-matching function .meta name on some input given by the .meta input parameter, whose semantics are the same as the .meta input argument of the .code match-fun function. If .meta name succeeds, then .meta then-expr is evaluated, and if it fails, .meta else-expr is evaluated instead. In the successful case, .meta then-expr is evaluated in a scope in which the bindings emerging from the .meta name function are turned into \*(TL variables. The result of .code txr-if is that of .metn then-expr . In the failed case, .meta else-expr is evaluated in a scope which does not have any new bindings. The result of .code txr-if is that of .metn else-expr . If .meta else-expr is missing, the result is .codn nil . The .meta argument forms supply arguments to the pattern function .metn name . There must be as many of these arguments as the function has parameters. Any argument which is a symbol is treated, for the purposes of calling the pattern function, as an unbound pattern variable. The function may or may not produce a binding for that variable. Every argument which is a symbol also denotes a local variable that is established around .meta then-expr if the function succeeds. For any such pattern variable for which the function produces a binding, the corresponding local variable will be initialized with the value of that pattern variable. For any such pattern variable which is left unbound by the function, the corresponding local variable will be set to .codn nil . Any .meta argument can be a form other than a symbol.
In this situation, the argument is evaluated, and will be passed to the pattern function as the value of the binding for the corresponding argument.
.TP* Example:
.verb
  @(define date (year month day))
  @{year /\ed\ed\ed\ed/}-@{month /\ed\ed/}-@{day /\ed\ed/}
  @(end)
  @(do
     (each ((date '("09-10-20" "2009-10-20"
                    "July-15-2014" "foo")))
       (txr-if date (y m d) date
         (put-line `match: year @y, month @m, day @d`)
         (put-line `no match for @date`))))

  Output:

  no match for 09-10-20
  match: year 2009, month 10, day 20
  no match for July-15-2014
  no match for foo
.brev
.coNP Macro @ txr-when .synb .mets (txr-when < name <> ( argument *) < input << form *) .syne .desc The .code txr-when macro is based on .codn txr-if . It is equivalent to .mono .meti \ \ (txr-if < name <> ( argument *) < input (progn << form *)) .onom If the pattern function .meta name produces a match, then each .meta form is evaluated in the scope of the variables established by the .meta argument expressions. The result of the .code txr-when form is that of the last .metn form . If the pattern function fails then the forms are not evaluated, and the result value is .codn nil . .coNP Macro @ txr-case .synb .mets (txr-case < input-form .mets \ \ >> {( name <> ( argument *) << form *)}* .mets \ \ >> [( t << form *)]) .syne .desc The .code txr-case macro evaluates .meta input-form and then uses the value as an input to zero or more test clauses. Each test clause invokes the pattern function named by that clause's .meta name argument. If the function succeeds, then each .meta form is evaluated, and the value of the last .meta form is taken to be the result value of .codn txr-case , which terminates. If there are no forms, then .code txr-case terminates with a .code nil result.
The forms are evaluated in an environment in which variables are bound based on the .meta argument forms, with values depending on the result of the invocation of the .meta name pattern function, in the same manner as documented in detail for the .code txr-if macro. If the function fails, then the forms are not evaluated, and control passes to the next clause. A clause which begins with the symbol .code t executes unconditionally and causes .code txr-case to terminate. If it has no forms, then .code txr-case yields .codn nil , otherwise the forms are evaluated in order and the value of the last one specifies the result of .codn txr-case . The value of .meta input-form is expected to be one of the same kinds of objects as given by the requirements for the .meta input argument of the .code match-fun function. If .meta input-form evaluates to a stream object according to the .code streamp function, then the stream is converted to a lazy list of lines, as if by invoking the .code get-lines function on that stream; that list then serves as input to the clauses. .coNP Function @ txr-parse .synb .mets (txr-parse >> [ source >> [ error-stream .mets \ \ \ \ \ \ \ \ \ \ \ >> [ error-retval <> [ name ]]]]) .syne .desc The .code txr-parse function converts textual \*(TX query syntax into a Lisp data structure representation. The .meta source argument may be either a character string, or a stream. If it is omitted, then .code *stdin* is used as the stream. The .meta source must provide the text representation of one complete \*(TX query. The optional .meta error-stream argument can be used to specify a stream to which diagnostics of parse errors are sent. If absent, the diagnostics are suppressed. The optional .meta name argument can be used to specify the file name which is used for reporting errors.
If this argument is missing, the name is taken from the name property of the .meta source argument if it is a stream, or else the word .code string is used as the name if .meta source is a string. If there are no parse errors, the function returns the parsed data structure. If there are parse errors, and the .meta error-retval parameter is present, its value is returned. If the .meta error-retval parameter is not present, then an exception of type .code syntax-error is thrown. .SS* Debugging Functions .coNP Functions @ source-loc and @ source-loc-str .synb .mets (source-loc << form ) .mets (source-loc-str < form <> [ alternative ]) .syne .desc These functions map an expression in a \*(TX program to the file name and line number of the source code where that form came from. The .code source-loc function returns the raw information as a cons cell whose .cod3 car / cdr consist of the line number, and file name. The .code source-loc-str function formats the information as a string. Forms which were parsed from a file have source location info tracking to their origin in that file. Forms which are the result of macro-expansion are traced to the form whose evaluation produced them. That is to say, they inherit that form's source location info. More precisely, when a form is produced by macro-expansion, it usually consists of material which was passed to the macro as arguments, plus some original material allocated by the macro, and possibly literal structure material which is part of the macro code. After the expansion is produced, any of its constituent material which already has source location info keeps that info. Those nodes which are newly allocated by the macro-expansion process inherit their source location info from the form which yields the expansion. 
If .meta form is not a piece of the program source code that was constructed by the \*(TX parser or by a macro, and thus it was neither attributed with source location info, nor has it inherited such info, then .code source-loc returns .codn nil . In the same situation, and if its .meta alternative argument is missing, the .code source-loc-str function returns a string whose text conveys that the source location is not available. If the .meta alternative argument is present, it is returned. .coNP Functions @ rlcp and @ rlcp-tree .synb .mets (rlcp < dest-form << source-form ) .mets (rlcp-tree < dest-tree << source-form ) .syne .desc The .code rlcp function copies the source code location info ("rl" means "read location") from the .meta source-form object to the .meta dest-form object. These objects are pieces of list-based syntax. If .meta dest-form already has source code location info, then no copying takes place. The .code rlcp-tree function copies the source code location info from .meta source-form into every cons cell in the .meta dest-tree tree structure which doesn't already have location info. It may be regarded as a recursive application of .code rlcp via .cod3 car / cdr recursion on the tree structure. However, the traversal performed by .code rlcp-tree gracefully handles circular structures. Note: these functions are intended to be used in certain kinds of macros. If a macro transforms .meta source-form to .metn dest-form , these functions can be used to propagate the source code location info also, so that when the \*(TL evaluator encounters errors in transformed code, it can give diagnostics which refer to the original untransformed source code. The macro expander already performs this transfer. If a macro call form has location info, the expander propagates that info to that form's expansion. In some situations, it is useful for a macro or other code transformer to perform this action explicitly.
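.TP* Example:
The following is a minimal sketch of explicit propagation in a macro. The
.code swap-call
macro is hypothetical, and the sketch assumes that the macro call form is obtained through the
.code :form
macro parameter.
.verb
  ;; (swap-call f a b) expands to (f b a)
  (defmacro swap-call (:form whole fun a b)
    (let ((expn ^(,fun ,b ,a)))
      ;; tag every cons of the expansion which lacks
      ;; location info with the location of the call
      (rlcp-tree expn whole)
      expn))
.brev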
.coNP Special Variable @ *rec-source-loc* .desc The Boolean special variable .code *rec-source-loc* controls whether the .code read and .code iread functions record source location info. The variable is .code nil by default, so that these functions do not record source location info. If it is true, then these functions record source location info. Regardless of the value of this variable, source location info is recorded for Lisp forms which are read from files or streams under the .code load function or specified on the \*(TX command line. Source location info is also always recorded when reading the \*(TX pattern language syntax. Note: recording and propagating location info incurs a memory and performance penalty. The individual cons cells and certain other literal objects in the structure which emerges from the parser are associated with source location info via a global weak hash table. .coNP Function @ macro-ancestor .synb .mets (macro-ancestor << form ) .syne .desc The .code macro-ancestor function returns information about the macro-expansion ancestor of .metn form . The ancestor is the original form whose expansion produced .metn form . If .meta form is not the result of macro-expansion, or the ancestor information is unavailable, the function returns .codn nil . .SS* Profiling .coNP Operator @ prof .synb .mets (prof << form *) .syne .desc The .code prof operator evaluates the enclosed forms from left to right similarly to .codn progn , while determining the memory allocation requests and time consumed by the evaluation of the forms. If there are no forms, the prof operator measures the smallest measurable operation of evaluating nothing and producing .codn nil . 
If the evaluation terminates normally (not abruptly by a nonlocal control transfer), then .code prof yields a list consisting of: .mono .mets >> ( value < malloc-bytes < gc-bytes << milliseconds ) .onom where .meta value is the value returned by the rightmost .metn form , or .code nil if there are no forms, .meta malloc-bytes is the total number of bytes of all memory allocation requests (or at least those known to the \*(TX runtime, such as those of all internal objects), .meta gc-bytes is the total number of bytes drawn from the garbage-collected heaps, and .meta milliseconds is the total processor time consumed over the execution of those forms. Notes: The bytes allocated by the garbage collector from the C function .code malloc to create heap areas are not counted as .metn malloc-bytes . .meta malloc-bytes includes storage such as the space used for dynamic strings, vectors and bignums (in addition to their gc-heap-allocated nodes), and the various structures used by the .code cobj type objects such as streams and hashes. Objects in external libraries that use uninstrumented allocators are not counted: for instance the C .code "FILE *" streams. .coNP Macro @ pprof .synb .mets (pprof << form *) .syne .desc The .code pprof (pretty-printing .codn prof ) macro is similar to .codn progn . It evaluates .metn form s, and returns the rightmost one, or .code nil if there are no forms. Over the evaluation of .metn form s, it counts memory allocations, and measures CPU time. If .metn form s terminate normally, then just prior to returning, .code pprof prints these statistics in a concise report on the .codn *stdout* . The .code pprof macro relies on the .code prof operator. .SS* Garbage Collection .coNP Function @ sys:gc .synb .mets (sys:gc <> [ full ]) .syne .desc The .code gc function triggers garbage collection. Garbage collection means that unreachable objects are identified and reclaimed, so that their storage can be reused. 
The function returns .code nil if garbage collection is disabled (and consequently nothing is done), otherwise .codn t . The Boolean .meta full argument, defaulting to .codn nil , indicates whether a full garbage collection should be requested. Even if this argument is .codn nil , a full garbage collection may occur due to having been scheduled. .coNP Function @ sys:gc-set-delta .synb .mets (sys:gc-set-delta << bytes ) .syne .desc The .code gc-set-delta function sets the GC delta parameter. Note: This function may disappear in a future release of \*(TX or suffer a backward-incompatible change in its syntax or behavior. When the amount of new dynamic memory allocated since the last garbage collection equals or exceeds the GC delta, a garbage collection pass is triggered. From that point, a new delta begins to be accumulated. Dynamic memory is used for allocating heaps of small garbage-collected objects such as cons cells, as well as the satellite data attached to some objects: like the storage arrays of vectors, strings or bignum integers. Most garbage collector behaviors are based on counting objects in the heaps. Sometimes a program works with a small number of objects which are very large, frequently allocating new, large objects and turning old ones into garbage. For instance a single large integer could be many megabytes long. In such a situation, a small number of heap objects therefore control a large amount of memory. This requires garbage collection to be triggered much more often than when working with small objects, such as conses, to prevent runaway allocation of memory. It is for this reason that the garbage collector uses the GC delta. There is a default GC delta of 64 megabytes. This may be overridden in special builds of \*(TX for small systems. 
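.TP* Example:
The following is a minimal sketch of invoking these functions. The 16 megabyte figure is an arbitrary value chosen for illustration, not a recommendation.
.verb
  ;; request a full collection; the result is nil if
  ;; garbage collection is currently disabled
  (sys:gc t)

  ;; trigger a collection pass whenever 16 megabytes of
  ;; new dynamic memory have been allocated
  (sys:gc-set-delta (* 16 1024 1024))
.brev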
.coNP Function @ finalize .synb .mets (finalize < object < function <> [ reverse-order-p ]) .syne .desc The .code finalize function registers .meta function to be invoked in the situation when .meta object is identified by the garbage collector as unreachable. A function registered in this way is called a finalizer. If and when this situation occurs, the finalizer .meta function will be called with .meta object as its only argument. Multiple finalizer functions can be registered for the same object, up to an internal limit which is not required to be greater than 255. If the limit is exceeded, .code finalize throws an error exception. All registered finalizers are called when the object becomes unreachable. Finalizers registered against an object may also be invoked and removed using the .code call-finalizers function. If the .meta reverse-order-p argument isn't specified, or is .codn nil , then the finalizer is registered at the end of the list. If .meta reverse-order-p is true, then the finalizer is registered at the front of the list. Finalizers which are activated in the same finalization processing phase are called in the order in which they appear in the registration list. After a finalization call takes place, its registration is removed. However, neither .meta object nor .meta function is reclaimed immediately; they are treated as if they were reachable objects until at least the next garbage collection pass. It is therefore safe for .meta function to store somewhere a persistent reference to .meta object or to itself, thereby reinstating these objects as reachable. A finalizer is itself permitted to call .code finalize to register the original .meta object or any other object for finalization. Finalization processing can be understood as taking place in one or more rounds. At the start of each round, finalizers are identified that are to be called, arranged in order, and removed from the registration list.
If this identification stage produces no finalizers, then finalization ends. Otherwise, those finalizers are processed, and then another round is initiated, to look for eligible finalizers that may have been registered during the previous round. Note: it is possible for the application to create an infinite finalization loop, if one or more objects have finalizers that register new finalizers, which register new finalizers and so on. Note: if a finalizer is invoked by the garbage collector rather than explicit finalization via .codn call-finalizers , and that finalizer calls .code finalize to make a registration, that registration will not be eligible for processing in the same phase, because the criterion for finalization is unreachability. .coNP Function @ call-finalizers .synb .mets (call-finalizers << object ) .syne .desc The .code call-finalizers function invokes and removes the finalizers, if any, registered against .metn object . If any finalizers are called, it returns .codn t , otherwise .codn nil . Finalization performed by .code call-finalizers works in the manner described under the specification of the .code finalize function. It is permissible for a finalizer function itself to call .codn call-finalizers . Such a call can happen in two possible contexts: finalization initiated by garbage collection, or under the scope of a .code call-finalizers invocation from application code. Doing so is safe, since the finalization logic may be reentered recursively. When finalizers are being called during a round of processing, those finalizers have already been removed from the registration list, and will not be redundantly invoked by a recursive invocation of finalization. Under the scope of garbage-collection-driven reclamation, the order of finalizer calls may not be what the application logic expects.
For instance even though a finalizer registered for some object .code A itself invokes .codn "(call-finalizers B)" , it may be the case during GC reclamation that both .code A and .code B are identified as unreachable objects at the same time, and some or all finalizers registered against .code B have already been called before the given .code A finalizer performs the explicit .code call-finalizers invocation against .codn B . Thus the call either has no effect at all, or only calls some remaining .code B finalizers that have not yet been processed, rather than all of them, as the application expects. The application must avoid creating a dependency on the order of finalization calls, to prevent the situation that the finalization actions are only correct under an explicit .code call-finalizers but incorrect under spontaneous reclamation driven by garbage collection. .SS* Stack-Overflow Protection \*(TX features a rudimentary mechanism for guarding against stack overflows, which cause the \*(TX process to crash. This capability is separate from and exists in addition to the possibility of catching a .code sig-segv (segmentation violation) signal upon stack overflow using .codn set-sig-handler . The stack-overflow guard mechanism is based on \*(TX, at certain key places in the execution, checking the current position of the stack relative to a predetermined limit. If the position exceeds the limit, then an exception of type .codn stack-overflow , derived from .codn error , is thrown. The stack-overflow guard mechanism is configured on startup. On platforms where it is possible to inquire the system's actual stack limit, and where the stack limit is at least 512 kilobytes, \*(TX sets the limit to within a certain percentage of the actual value. If it is not possible to determine the system's stack limit, or if the system indicates that the stack size is unlimited, then a default limit is imposed. 
If the system's limit is configured below a certain small value, then that small value is used as the stack limit. The .code get-stack-limit and .code set-stack-limit functions are provided to manipulate the stack limit. The mechanism cannot contain absolutely all sources of stack-overflow threat under all conditions. External functions are not protected, and not all internal functions are monitored. If \*(TX is close to the limit, but a function is called whose stack growth is not monitored, such as an external function or unmonitored internal function, it is possible that the stack may overflow anyway. .coNP Functions @ get-stack-limit and @ set-stack-limit .synb .mets (get-stack-limit) .mets (set-stack-limit << value ) .syne .desc The .code get-stack-limit returns the current value of the stack limit. If the guard mechanism is not enabled, it returns .codn nil , otherwise it returns a positive integer, which is measured in bytes. The .code set-stack-limit configures the stack limit according to .metn value , possibly enabling or disabling the guard mechanism, and returns the previous stack limit in exactly the same manner as .codn get-stack-limit . The .meta value must be a non-negative integer or else the symbol .codn nil . The values zero or .code nil disable the guard mechanism. Positive integer values set the limit. The value may be truncated to a multiple of some denomination or otherwise adjusted, so that a subsequent call to .code get-stack-limit need not retrieve that exact value. If .meta value is too close to the system's stack limit or beyond, the effectiveness of the stack-overflow detection mechanism is compromised. Likewise, if .meta value is too low, the operation of \*(TX shall become unreliable. Values smaller than 32767 bytes are strongly discouraged. .SS* Modularization .coNP Variable @ self-path .desc This variable holds the invocation pathname of a \*(TX program that was specified on the command line. 
The value of .code self-path when \*(TL expressions are being evaluated in command-line arguments is the string .strn cmdline-expr . The value of .code self-path when a \*(TX query is supplied on the command line via the .code -c command-line option is the string .strn cmdline . When a file is being compiled using the .code --compile option, the value of .code self-path is the source file path. When the interactive listener is entered, .code self-path is set to the value .strn listener , even if prior to that, a file was compiled or executed, for which .code self-path had been set to the name of that file. Note that for programs read from a file, .code self-path holds the resolved name, and not the invocation name. For instance if .code foo.tl is invoked using the name .codn foo , whereby \*(TX infers the suffix, then .code self-path holds the suffixed name. Note that the functions .codn load , .code compile-file and .code compile-update-file have no effect on the value of .codn self-path . The variable is set strictly by command line processing. .coNP Variable @ stdlib .desc The .code stdlib variable expands to the directory where the \*(TX standard library is installed. It includes the trailing slash. Note: there is no need to use the value of this variable to load library modules. Library modules are keyed to specific symbols, and lazily loaded. When a \*(TL library function, macro or variable is referenced for the first time, the library module which defines it is loaded. This includes references which occur during the code expansion phase, at "macro time", so it works for macros. In the middle of processing a syntax tree, the expander may encounter a symbol that is registered for autoloading, and trigger the load. When the load completes, the symbol might now be defined as a macro, which the expander can immediately use to expand the given form that is being traversed.
.coNP Function @ load .synb .mets (load < target << load-arg *) .syne .desc The .code load function causes a file containing \*(TL or \*(TX code to be read and processed. The .meta target argument is a string. The function can load \*(TL source files as well as compiled files. Firstly, the value in .meta target is converted to a .I "tentative pathname" as follows. If .meta target specifies a pure relative pathname, as defined by the .code pure-rel-path-p function, then a special behavior applies. If an existing load operation is in progress, then the special variable .code *load-path* has a binding. In this case, .code load will assume that the relative pathname is a reference relative to the directory portion of that pathname. If .code *load-path* has the value .codn nil , then a pure relative .meta target pathname is used as-is, and thus resolved relative to the current working directory. Once the tentative pathname is determined, .code load determines whether the name is suffixed. The name is suffixed if it ends in any of these suffixes: .codn .tlo , .codn .tlo.gz , .codn .tl , .codn .txr , .code .txr-profile or .codn .txr_profile . Depending on whether the tentative pathname exists, and whether or not it is suffixed, .code load tries to make one or more attempts to open several variations of that name. These variations are called .IR "actual paths" . If any attempt fails due to an error other than non-existence, such as a permission error, then no further attempts are made; the error exception propagates to .codn load 's caller. Regardless of whether the tentative pathname is suffixed, .code load tries to open a file by that actual pathname first. If that attempt fails for a suffixed pathname, or fails due to a reason other than non-existence, no other names are tried. If an unsuffixed tentative pathname refers to a nonexistent file, .code .tlo is appended to the name, and an attempt is made to open a file with the resulting path.
If that file is not found, then the suffixes .code .tlo.gz and .code .tl are similarly tried. If the above .I "initial attempts" to find the file fail, and the failure is due to the file not being found rather than some other problem such as a permission error, and .meta target isn't an absolute path according to .codn abs-path-p , then additional attempts are made by searching for the file in the list of directories given in the .code *load-search-dirs* variable. For each directory taken from this variable, the directory is combined with the relative .meta target as if using the .code path-cat function, and the resulting path is tried, with all the same suffix probing that is performed by the initial attempts. If any such path is pure relative, it is interpreted relative to the current working directory, and not relative to .codn *load-path* : only the initial attempts have that special behavior. An exception is thrown if a file is not found, or if any attempt to open a file results in an error other than non-existence. If an unsuffixed file is successfully opened, its contents are treated as interpreted Lisp. Files ending in .code .txr-profile or .code .txr_profile are also treated as interpreted Lisp. Files ending in .code .tlo are treated as compiled Lisp, and those ending in .code .txr are treated as the \*(TX Pattern Language. The .code .tlo.gz suffix denotes a file which is expected to be compressed in the .code gzip format, and to contain compiled Lisp. If the file is treated as \*(TL, then Lisp forms are read from it in succession. Each form is evaluated as if by the .code eval function, before the next form is read. If a syntax error is encountered, an exception of type .code eval-error is thrown. If a file is treated as a compiled \*(TL object file, then the compiled images of top-level forms are read from it, converted into compiled objects, and executed. If the file is treated as \*(TX Pattern Language code, then its contents are parsed in their entirety.
If the parse is successful, the query is executed. Previous \*(TX pattern variable and function bindings are in effect. If the query binds new variables and functions, these emerge from the .code load and take effect. If the parse is unsuccessful, an exception of type .code query-error is thrown. Parser error messages are directed to the .code *stderr* stream. Over the evaluation of either a \*(TL, compiled file, or \*(TX file, .code load establishes a new dynamic binding for several special variables: .RS .coIP *load-path* This variable is bound to the actual pathname being loaded. .coIP *load-args* The values of the .meta load-arg arguments which follow .meta target are combined into a list which is bound to .codn *load-args* . By this mechanism, .code load can pass arguments to the loaded file. .coIP *package* is given a new dynamic binding, whose value is the same as its existing binding. Thus if the processing of the loaded file has the effect of altering the value of .codn *package* , that effect will be undone when the binding is removed after the load completes. .RE .IP Over the evaluation of either a \*(TL, compiled file, or \*(TX file, .code load establishes a block named .codn load , which makes it possible for the loaded module to abort the loading using the .mono .meti (return-from load << expr ) .onom expression. In this situation, the value of .meta expr will appear as the return value of the .code load function. When a \*(TL file, or compiled file, is executed from the \*(TX command line in such a way that \*(TX will terminate when that file's last form has been evaluated, then if that file performs a .code return-from the .code load block, the value of .meta expr will turn into the termination status in exactly the same way as if that value were used as an argument to the .code exit function. However, if \*(TX has been instructed to enter into the Listener after executing the file, then the value of .meta expr is discarded. 
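As a minimal sketch of this mechanism, suppose that a hypothetical file
.str optional.tl
begins with a guard which abandons the load when a hypothetical variable
.code feature-enabled
has no binding:
.verb
  ;; optional.tl: the file name and the variable are assumptions
  (unless (boundp 'feature-enabled)
    (return-from load :skipped))

  (put-line "feature code loaded")
.brev
Under these assumptions, the call
.code "(load \(dqoptional\(dq)"
returns
.code :skipped
when the variable has no binding; otherwise the remaining forms are evaluated
and
.code load
returns
.codn nil .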
A block named .code load is also established by the .code @(load) directive in the pattern language. That directive provides no access to the returned value. The block is also visible to the file processed from the command line. When such a file aborts the load via .codn return , the returned value is discarded. If the interactive option .code -i was specified, the interactive listener will be entered, otherwise the process will terminate successfully. When the .code load function terminates normally after processing a file, it returns .codn nil . If the file contains a \*(TX pattern query which is processed to completion, the matching success or failure of that query has no bearing on the return value of .codn load . Note that this behavior is different from the .code @(load) directive which itself fails if the loaded query fails, causing subsequent directives not to be processed. A \*(TX pattern language file loaded with the Lisp .code load function does not have the usual implicit access to the command-line arguments, unlike a top-level \*(TX query. If the directives in the file try to match input, they work against the .code *stdin* stream. The .code @(next) directive behaves as it does when no more arguments are available. If the source or compiled file begins with the characters .codn #! , usually indicating a hash-bang script, .code load reads the first line of the file and discards it. Processing of the file then begins with the first byte following that line. Two or more .code .tlo files produced by the same version of \*(TX may be catenated together (for instance, using the .code cat-files function) to produce a single .code .tlo file. Such a combined file can be loaded with the .code load function. The same is true of .code .tlo.gz files, because the .code gzip format supports catenation. Mixing is not possible: .code .tlo and .code .tlo.gz files cannot be catenated together.
Note: this is a single .code load operation: all of the binding and unbinding of variables like .code *load-path* and .code *package* is performed once over the entire contents of the combined file, and any .code *load-hooks* are performed one time after the load operation. Therefore it is possible that the load-time behavior differs from that of loading the original files individually. The .code *load-path* is bound to the name of the combined file. .coNP Special Variable @ *load-path* .desc The .code *load-path* special variable has a top-level value which is .codn nil . When a file is being loaded, it is dynamically bound to the pathname of that file. This value is visible to the forms which are evaluated in that file during the loading process. The .code *load-path* variable is bound when a file is loaded from the command line. If the .code -i command-line option is used to enter the interactive listener, and a file to be loaded is also specified, then the .code *load-path* variable remains bound to the name of that file inside the listener. The .code load function establishes a binding for .code *load-path* prior to processing and evaluating all the top-level forms in the target file. After the forms are evaluated, the binding is discarded and .code load returns. The .code compile-file function also establishes a binding for .codn *load-path* . The .code @(load) directive similarly establishes a binding around the parsing and processing of a loaded \*(TX source file. Also, during the processing of the profile file (see Interactive Profile File), the variable is bound to the name of that file. .coNP Special Variable @ *load-search-dirs* .desc The .code *load-search-dirs* variable holds a list of directories which are searched for a file to be loaded by the .code load function, the .code @(load) and .code @(include) directives, as well as by \*(TX's command line processing. Each of these situations first searches for a file in its characteristic way.
If that fails due to the file not being found, and the name is a relative path, then the directories in .code *load-search-dirs* are probed, in order. The variable is initialized to a list which contains exactly one directory: a .code lib/ directory dynamically calculated relative to the location of the \*(TX executable. The intent is that third-party library modules may be installed there, and easily found by .codn load . For more information, see the section Deployment Directory Structure. The .code *load-search-dirs* isn't influenced by any environment variables, which is deliberate. If a system has multiple installations of different versions of \*(TX in different locations, an environment variable intended for one installation could be mistakenly used by the others, resulting in chaos. .coNP Special Variable @ *load-hooks* .desc The .code *load-hooks* variable is at the centre of a mechanism for the deferred execution of actions, associated with the loading of a module or with program termination. The application may push values onto this list which are expected to be functions, or objects that may be called as functions. These objects must be capable of being called with no arguments. In the situations specified below, the list of functions is processed as follows. First, .code *load-hooks* is examined, and the list which it holds is remembered. Then the variable is reset to .codn nil , following which the remembered list is traversed in order. Each of the functions in the list is invoked, with no arguments. The .code *load-hooks* list is processed, as described above, whenever the .code load function terminates, whether normally or by throwing an exception. In this situation, the .code *load-hooks* variable which is accessed is that binding which was established by that invocation of .codn load .
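For instance, a file being loaded might register a deferred action in the
following way. This is only a sketch; the message text is purely illustrative.
.verb
  ;; called with no arguments when the enclosing load terminates,
  ;; whether normally or by throwing an exception
  (push (lambda () (put-line "load finished")) *load-hooks*)
.brev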
The execution of the functions from the .code *load-hooks* list takes place in the dynamic environment of the .codn load : all of the dynamic variable bindings established by that .code load are still visible, including that of .codn *load-hooks* . The .code *load-hooks* list is also processed after processing a \*(TX or \*(TL file that is specified on the command line. If the interactive listener is also being entered, this processing of .code *load-hooks* occurs prior to entering the listener. This situation occurs in the context of the top-level dynamic environment, and so the global value of .code *load-hooks* is referenced. Lastly, .code *load-hooks* is also processed if the \*(TX process terminates normally, regardless of its exit status. This processing executes in whatever dynamic environment is current at the point of exit, using its value of the .code *load-hooks* variable. It is unspecified whether, at exit time, the .code *load-hooks* functions are executed first, or whether the functions registered by .code at-exit-call are executed first. However, their executions do not interleave. Note that .code *load-hooks* is not processed after the listener reads the .code .txr-profile file. Hooks installed by the profile file will activate when the process exits. .coNP Function @ load-args-recurse .synb .mets (load-args-recurse << file-list ) .mets (load-args-recurse << file *) .syne .desc The .code load-args-recurse function loads multiple files, passing down the current .code *load-args* to each one. It may be invoked with a single argument which is a list of files, or else it may be given multiple arguments which are files. Each .meta file is passed to the .code load function, along with extra arguments coming from the current .code *load-args* value.
Note: the purpose of .code load-args-recurse is to support a module organization of system whereby modules have local top-level files that respond to various actions specified via .codn *load-args* , actions such as compiling, loading or cleaning. The .code load-args-recurse function allows such modules to not only perform the actions requested in .code *load-args* locally, but also pass it down to submodules which then do the same. .coNP Function @ load-args-process .synb .mets (load-args-process << file-list ) .mets (load-args-process << file *) .syne .desc The .code load-args-process function performs one of several actions over the specified files, those actions being distinguished by the value in .codn *load-args* . In addition, some of the actions are also performed for the file indicated in the current value of .codn *load-path* . It may be invoked with a single argument which is a list of files, or else it may be given multiple arguments which are files. If there is exactly one argument in .codn *load-args* , the function responds to the following values of that argument: .RS .coIP :compile First, the current file in .code *load-path* is processed with .codn compile-update-file . Then each file in the argument list is also processed with .codn compile-update-file . Whenever that function returns .code nil for any file, that file is loaded with .codn load . No additional arguments are passed to this .code load invocation. .coIP :clean The current file in .code *load-path* as well as the files passed as arguments, are processed with .codn clean-file . .RE .IP Any other value of .code *load-args* causes the function to .code load the files passed in the argument, as if by .codn load-args-recurse . Note: The .code load-args-process function supports a protocol for organizing a program into library modules. .TP* Example: Suppose a module located in the .str path/to/application path consists of the files .strn command .strn data .str reports and .str main . 
Further, suppose that there are two submodules in the .str utils directory relative to this directory: .str database and .strn date . Then the application might have a file called .str "path/to/application/app.tl" with this content: .verb (compile-only (load-args-recurse "utils/database/db" "utils/date/date") (load-args-process "command" "data" "reports" "main")) .brev Furthermore, the .str database module similarly provides a .str "path/to/application/utils/database/db.tl" file with this content: .verb (compile-only (load-args-process "postgres" "mariadb" "sqlite")) .brev Lastly, the .str date module provides a file .str "path/to/application/utils/date/date.tl" with this content: .verb (compile-only (load-args-process "src/date.tl")) .brev Then, to load the application and the submodules, all that is needed is .codn "(load \(dqpath/to/application/app\(dq)" . Furthermore, the modules may be compiled using .codn "(load \(dqpath/to/application/app\(dq :compile)" . Now the .code *load-args* being passed is .code "(:compile)" which tells every .code load-args-process invocation to compile the file in which it occurs as well as its arguments. First, the .code app module's .code load-args-recurse call is executed, causing the .str database and .str date modules to compile. First, the .str database module's .str db.tl top file is compiled, if necessary, and then likewise the .strn postgres.tl , .str mariadb.tl and .str sqlite.tl files. Then the .str date module is similarly processed, due to its own invocation of .codn load-args-process . Finally the .code load-args-process call in the .str app module compiles .strn app.tl , .strn command.tl , .strn data.tl , .str reports.tl and .strn main.tl . If the .code :clean keyword is passed via .code *load-args* instead of .codn :compile , then compiled files are recursively removed. The next time the application is loaded, source files will be loaded rather than compiled files.
Note that the .code load-args-recurse and .code load-args-process forms are placed into a .code compile-only form so that the file compiler refrains from executing them. .coNP Macros @ push-after-load and @ pop-after-load .synb .mets (push-after-load << form *) .mets (pop-after-load) .syne .desc The .code push-after-load and .code pop-after-load macros work with the .code *load-hooks* list. The .code push-after-load macro's arguments are zero or more .metn form s. These forms are converted into the body of an anonymous function, which is pushed onto the .code *load-hooks* list. The return value is the new value of .codn *load-hooks* . The .code pop-after-load macro removes the first item from .codn *load-hooks* . The return value is the new value of .codn *load-hooks* . The following equivalences hold: .verb (push-after-load ...) <--> (push (lambda () ...) *load-hooks*) (pop-after-load) <--> (set *load-hooks* (cdr *load-hooks*)) .brev .coNP Macro @ load-for .synb .mets (load-for >> {( kind < sym < target << load-arg* )}*) .syne .desc The .code load-for macro takes multiple arguments, each of which is a three-element clause. Each clause specifies that a given .meta target file is to be conditionally loaded based on whether a symbol .meta sym has a certain kind of binding. Each argument clause has the syntax .mono .meti >> ( kind < sym < target << load-arg *) .onom where .meta kind is one of the five symbols .codn var , .codn fun , .codn macro , .code struct or .codn pkg . The .meta sym element is a symbol suitable for use as a variable, function or structure name, and .meta target is an expression which is evaluated to produce a value that is suitable as an argument to the .code load function. First, all .meta target expressions in all clauses are unconditionally evaluated in left-to-right order. Then the clauses are processed in that order.
If the .meta kind symbol of a clause is .codn var , then .code load-for tests whether .meta sym has a binding in the variable namespace using the .code boundp function. If a binding does not exist, then the value of the .meta target expression is passed to the .code load function. Otherwise, .code load is not called. Similarly, if .meta kind is the symbol .codn fun , then .meta sym is instead tested using .codn fboundp , if .meta kind is .codn macro , then .meta sym is tested using .codn mboundp , if .meta kind is .codn struct , then .meta sym is tested using .codn find-struct-type , and if .meta kind is .codn pkg , then .meta sym is tested using .codn find-package . When .code load-for invokes the .code load function, it checks whether loading the file has had the expected effect of providing a definition of .meta sym of the right .metn kind . If this isn't the case, an error is thrown. The .code load function is invoked with any .meta load-arg arguments specified in the clause. The .meta load-arg expressions of all clauses are unconditionally evaluated in order before .code load-for performs any other action. The .code load-for macro returns the value returned by the rightmost .code load that was actually performed. If no loads are performed, it returns .codn nil . .coNP Variable @ txr-exe-path .desc This variable holds the absolute pathname of the executable file of the running \*(TX instance. .SS* Function Tracing .coNP Special Variable @ *trace-output* .desc The .code *trace-output* special variable holds a stream to which all trace output is sent. Trace output consists of diagnostics enabled by the .code trace macro. .coNP Macros @ trace and @ untrace .synb .mets (trace << function-name *) .mets (untrace << function-name *) .syne .desc The .code trace and .code untrace macros control function tracing. When .code trace is called with one or more arguments, it considers each argument to be the name of a global function.
For each function, it turns on tracing, if it is not already turned on. If an argument denotes a nonexistent function, or is invalid function name syntax, .code trace terminates by throwing an exception, without processing the subsequent arguments, or undoing the effects already applied due to processing the previous arguments. When .code trace is called with no arguments, it lists the names of functions for which tracing is currently enabled. In other cases it returns .codn nil . When .code untrace is called with one or more arguments, it considers each argument to be the name of a global function. For each function, it turns off tracing, if tracing is enabled. When .code untrace is called with no arguments, it disables tracing for all functions. The .code untrace macro always returns .code nil and silently tolerates arguments which are not names of functions currently being traced. Tracing a function consists of printing a message prior to entry into the function indicating its name and arguments, and another message upon leaving the function indicating its return value, which is syntactically correlated with the entry message, using a combination of matching and indentation. These messages are posted to the .code *trace-output* stream. When traced functions call each other or recurse, these trace messages nest. The nesting is detected and translated into indentation levels. Tracing works by replacing a function definition with a trace hook function, and retaining the previous definition. The trace hook calls the previous definition and produces the diagnostics around it. When .code untrace is used to disable tracing, the previous definition is restored. Methods can be traced; their names are given using .mono .meti (meth < struct << slot ) .onom syntax: see the .code func-get-name function. Macros can be traced; their names are given using .mono .meti (macro << name ) .onom syntax. 
Note that .code trace will not show the destructured internal macro arguments, but only the two arguments passed to the expander function: the whole form, and the environment. The .code trace and .code untrace functions return .codn nil . .SS* Dynamic Library Access .coNP Function @ dlopen .synb .mets (dlopen >> [{ lib-name | nil} <> [ flags ]) .syne .desc The .code dlopen function provides access to the POSIX C library function of the same name. The argument to the optional .meta lib-name parameter may be a character string, or .codn nil . If it is .codn nil , then the POSIX function is called with a null pointer for its name argument, returning the handle for the main program, if possible. The .meta flags argument should be expressed as some bitwise combination of the values of the variables .codn rtld-lazy , .codn rtld-now , or other .code rtld- variables which give names to the .codn dlopen -related flags. If the .meta flags argument is omitted, the default value used is .codn rtld-lazy . If the function succeeds, it returns an object of type .code cptr which represents the open library handle ("dlhandle"). Otherwise it throws an exception, whose message incorporates, if possible, error text retrieved from the .code dlerror POSIX function. The .code cptr handle returned by .code dlopen will automatically be subject to .code dlclose when reclaimed by the garbage collector. .coNP Function @ dlclose .synb .mets (dlclose << dlhandle ) .syne .desc The .code dlclose closes the library indicated by .metn dlhandle , which must be a .code cptr object previously returned by .codn dlopen . The handle is closed by passing the stored pointer to the POSIX .code dlclose function. The internal pointer contained in the .code cptr object is then reset to null. It is permissible to invoke .code dlclose more than once on a .code cptr object which was created by .codn dlopen . 
The first invocation resets the .code cptr object's pointer to null; the subsequent invocations do nothing. The .code dlclose function returns .code t if the POSIX function reports a successful result (zero), otherwise it returns .codn nil . It also returns .code nil if invoked on a previously closed, and hence nulled-out .code cptr handle. .coNP Functions @ dlsym and @ dlvsym .synb .mets (dlsym < dlhandle << sym-name ) .mets (dlvsym < dlhandle < sym-name << ver-name ) .syne .desc The .code dlsym function provides access to the same-named POSIX function. The .code dlvsym function provides access to the same-named GNU C Library function, if available. The .meta dlhandle argument must be a .code cptr handle previously returned by .code dlopen and not subsequently closed by .code dlclose or altered in any way. The .meta sym-name and .meta ver-name arguments are character strings. If these functions succeed, they return a .code cptr value which holds the address of the symbol which was found in the library. If they fail, they return a .code cptr object containing a null pointer. .coNP Functions @ dlsym-checked and @ dlvsym-checked .synb .mets (dlsym-checked < dlhandle << sym-name ) .mets (dlvsym-checked < dlhandle < sym-name << ver-name ) .syne .desc The .code dlsym-checked and .code dlvsym-checked functions are alternatives to .code dlsym and .codn dlvsym , respectively. Instead of returning a null .code cptr on failure, these functions throw an exception. .coNP Variables @, rtld-lazy @, rtld-now @, rtld-global @, rtld-local @, rtld-nodelete @ rtld-noload and @ rtld-deepbind .desc These variables provide the same values as constants in the POSIX C library header .code <dlfcn.h> named .codn RTLD_LAZY , .codn RTLD_NOW , .codn RTLD_LOCAL , etc.
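.TP* Example:

The following sketch opens a shared library, looks up a symbol, and then
closes the handle twice. The library name
.str libc.so.6
is an assumption which holds on typical GNU/Linux systems; other platforms
use different names.
.verb
  ;; assumption: the C library can be opened under the name libc.so.6
  (let ((libc (dlopen "libc.so.6" rtld-now)))
    ;; throws an exception if the symbol is not found
    (prinl (dlsym-checked libc "strlen"))
    ;; first close: t if the POSIX function reports success
    (prinl (dlclose libc))
    ;; second close: nil, since the handle has already been nulled out
    (prinl (dlclose libc)))
.brev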
.SS* Data Interchange Support .coNP Macro @ json .synb .mets (json [quote | sys:qquote] << object ) .syne .desc The .code json macro exists in support of the JSON literal and quasiquote .mono .meti >> #J json-syntax .onom and .mono .meti >> #J^ json-syntax .onom notations, which use the macro as their target abstract syntax. The macro transforms itself by deleting the .code json symbol, producing either the .mono .meti (quote << object ) .onom quote syntax, or else the .mono .meti (sys:qquote << object ) .onom quasiquote syntax, depending on which quoting symbol is present. If the application produces and expands a .code json macro form which does not conform to this syntax, or does not specify one of the above two quoting symbols, the behavior is unspecified. .coNP Functions @ put-json and @ put-jsonl .synb .mets (put-json < obj >> [ stream <> [ flat-p ]]) .mets (put-jsonl < obj >> [ stream <> [ flat-p ]]) .syne .desc The .code put-json function converts .meta obj into JSON notation, and writes that notation into .meta stream as a sequence of characters. If .meta stream is an external stream such as a file stream, then the JSON is rendered by conversion of the characters into UTF-8, in the usual manner characteristic of those streams. The behavior is unspecified if .meta obj or any component of .meta obj is an object incompatible with the JSON representation conventions. An exception may be thrown. An object conforms to the JSON representation conventions if it is: .RS .IP 1. one of the symbols .codn nil , .code t or .codn null , which map to the JSON keywords .codn false , .code true and .codn null , respectively. .IP 2. a floating-point number or integer. .IP 3. a character string. .IP 4. a vector or list of JSON-conforming objects. .IP 5. a hash table whose keys and values are JSON-conforming objects. .RE .IP Note that unless the keys in a hash table are all strings, nonstandard JSON is produced, since RFC 8259 requires JSON object keys to be strings. 
A list of objects is rendered in the same way as a vector, in the JSON .code [] notation. When such JSON notation is parsed, a vector is produced. A structure object is rendered into JSON using the .code {} object notation. The keys of the objects are the names of the symbols of the object type's non-static slots, appearing as a string. The values are the values of the slots. They must be JSON-conforming objects. If the special variable .code *print-json-type* is true, the object includes a key named .str __type whose value is the structure type symbol, appearing as a string. When present, this key occurs first in the printed representation, before any other keys. Both the slot symbols and the type symbol may appear with a package qualifier, depending on the relationship of the symbols to the current package, according to similar rules as if the symbol were printed by the .code print function. When integer objects are output, they may not constitute valid JSON, since the JSON specification supports only IEEE 64 bit floating-point numbers. JSON numbers are read as floating-point. If the .code flat-p argument is present and has a true value, then the JSON is generated without any line breaks or indentation. Otherwise, the JSON output is subject to such formatting. The difference between .code put-json and .code put-jsonl is that the latter emits a newline character after the JSON output. When a string object is output as JSON string syntax, the following rules apply: .RS .IP 1. The characters .code \e (backslash, reverse solidus) and .code \(dq (double quote) are preceded by a backslash escape. .IP 2. The characters U+0008 (BS), U+0009 (TAB), U+000A (LF), U+000C (FF) and U+000D (CR) are rendered as, respectively, .codn \eb , .codn \et , .codn \en , .code \ef and .codn \er . .IP 3. If the character sequence .code --> occurs in a string, then in the JSON representation, the sequence is rendered as .codn -\eu002D> . Instances of .code - (hyphen) in other situations are not encoded.
Rationale: safe embedding in HTML .code script tags. .IP 6. The code point U+DC00 (\*(TX's pseudo-null character) is translated into the .code "\eu0000" escape syntax. .IP 7. The code points U+DC01 through U+DCFF are sent to the stream as-is. If the stream performs UTF-8 encoding, these characters turn into individual bytes in the range 0 to 255. .IP 8. Control characters in the range U+0001 to U+001F other than the ones subject to rule 2 above are rendered as .code \eu escape sequences. Likewise, code points in the range U+007F to U+00BF, the range U+D800 to U+DBFF, U+DD00 to U+DFFF, and the code points U+FFFE and U+FFFF are also encoded as .code \eu escape sequences. .IP 9. A character outside of the BMP (Basic Multilingual Plane) in the range U+10000 to U+10FFFF is encoded as a pair of consecutive .code \eu escape sequences, specifying the code points of a UTF-16 surrogate pair encoding that character. This representation is described in RFC 8259. .RE The .code put-json and .code put-jsonl functions return .codn t . Some of the JSON-related functions carry a .meta mode-opts optional parameter. These functions open a file as if using the .code open-file function, using a .meta mode-string appropriate to their direction of data transfer. If an argument is given to .metn mode-opts , it specifies the .meta options part to be added to the .metn mode-string . .coNP Function @ tojson .synb .mets (tojson < obj <> [ flat-p ]) .syne .desc The .code tojson function converts .meta obj into JSON notation, returned as a character string. The function can be understood as constructing a string output stream, calling the .code put-json function to write the object into that stream, and then retrieving and returning the constructed string. The .meta flat-p argument is passed to .codn put-json .
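
As a brief illustration of the above,
.code tojson
can be paired with
.code get-json
to perform a round trip through a string. This is only a sketch: the sample
data is invented, and no particular output spacing is implied.

.verb
  (let* ((obj #H(() ("name" "widget") ("tags" #("a" "b")) ("ok" t)))
         ;; A true flat-p requests no line breaks or indentation.
         (js (tojson obj t)))
    (put-line js)            ; the JSON text
    (prinl (get-json js)))   ; the text parsed back into a hash table
.brev

Since JSON numbers are read as floating-point, any integers in the original
object come back from such a round trip as floating-point values, unless
.code *read-json-int*
is in effect.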
.coNP Function @ get-json .synb .mets (get-json >> [ source .mets \ \ \ \ \ \ \ \ \ \ >> [ err-stream .mets \ \ \ \ \ \ \ \ \ \ \ >> [ err-retval >> [ name <> [ lineno ]]]]]) .syne .desc The .code get-json function closely resembles the .code read function, and follows the same argument and error reporting conventions. Rather than reading a Lisp object from the input source, it reads a JSON object, with support for \*(TX's JSON extensions. If an object is successfully read, its Lisp representation is returned. JSON numbers produce floating-point number objects. JSON strings produce string objects. The keywords .codn true , .code false and .code null map to the Lisp symbols .codn t , .codn nil , and .codn null , respectively. JSON objects map to hash tables, and JSON arrays to vectors. .coNP Function @ put-jsons .synb .mets (put-jsons < seq >> [ stream <> [ flat-p ]]) .syne .desc The .code put-jsons function writes multiple JSON representations into .metn stream . The objects are specified by the .meta seq argument, which must be an iterable object. The .code put-jsons function iterates over .meta seq and writes each element to the stream as if by using the .code put-jsonl function. Consequently, a newline character is written after each object. If the .meta stream argument is not specified, the parameter takes on the value of .codn *stdout* . The .meta flat-p argument has the same meaning as in .code put-json with regard to the individual elements. If it is specified and true, then exactly as many lines of text are written to .meta stream as there are elements in .metn seq . The .code put-jsons function returns .codn t . .coNP Function @ get-jsons .synb .mets (get-jsons <> [ source ]) .syne .desc The .code get-jsons function reads zero or more JSON representations from .meta source until an end-of-stream or error condition is encountered.
If .meta source is a character string, then the input takes place from a stream created from the character string using .codn make-string-byte-input-stream . Otherwise, if .meta source is specified, it must be an input stream supporting byte input; input takes place from that stream. If the .meta source argument is omitted, it defaults to .codn *stdin* . The objects are read as if by calls to .code get-json and accumulated into a list. If the end-of-stream condition is read, then the list of accumulated objects is returned. If an error occurs, then an exception is thrown and the list of accumulated objects is not available. If an end-of-stream condition occurs before any character is seen other than JSON whitespace, then the empty list .code nil is returned. .coNP Functions @ file-get-json and @ file-get-jsons .synb .mets (file-get-json < name <> [ mode-opts ]) .mets (file-get-jsons < name <> [ mode-opts ]) .syne .desc The .code file-get-json and .code file-get-jsons functions open a text stream over the file indicated by the string argument .meta name for reading. The functions ensure that the stream is closed when they terminate. The .code file-get-json function invokes .code get-json to read a single JSON object, which is returned if that function returns normally. The .code file-get-jsons function invokes .code get-jsons to retrieve a list of JSON objects from the stream, which is returned if that function returns normally. .coNP Functions @ file-put-json and @ file-put-jsons .synb .mets (file-put-json < name < obj >> [ flat-p <> [ mode-opts ]]) .mets (file-put-jsons < name < seq >> [ flat-p <> [ mode-opts ]]) .syne .desc The .code file-put-json and .code file-put-jsons functions open a text stream over the file indicated by the string argument .metn name , using the function .code open-file with a .meta mode-string argument of .strn w , write the argument object into the stream in their specific manner, and then close the stream.
The .code file-put-json function writes a JSON representation of .meta obj using the .code put-json function. The .meta flat-p argument is passed to that function, defaulting to .codn nil . The value returned is that of .codn put-json . The .code file-put-jsons function writes zero or more JSON representations of objects from .metn seq , which must be an iterable object, using the .code put-jsons function. The .meta flat-p argument is passed to that function, defaulting to .codn nil . The value returned is that of .codn put-jsons . .coNP Functions @ file-append-json and @ file-append-jsons .synb .mets (file-append-json < name < obj >> [ flat-p <> [ mode-opts ]]) .mets (file-append-jsons < name < seq >> [ flat-p <> [ mode-opts ]]) .syne .desc The .code file-append-json and .code file-append-jsons functions are identical in almost all requirements to the functions .code file-put-json and .codn file-put-jsons . The only difference is that when these functions open a text stream using .codn open-file , they specify a .meta mode-string argument of .str a rather than .strn w , in order to append data to the target file rather than overwrite it. .coNP Functions @ command-get-json and @ command-get-jsons .synb .mets (command-get-json < cmd <> [ mode-opts ]) .mets (command-get-jsons < cmd <> [ mode-opts ]) .syne .desc The .code command-get-json and .code command-get-jsons functions open a text stream over an input command pipe created for the command string .metn cmd , as if by the .code open-command function. They ensure that the stream is closed when they terminate. The .code command-get-json function calls .code get-json on the stream, and returns the value returned by that function. Similarly, the .code command-get-jsons function calls .code get-jsons on the stream, and returns the value returned by that function.
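
The file and command functions described above might be combined as in the
following sketch. The path
.code /tmp/example.json
and the shell command are hypothetical, chosen only for illustration.

.verb
  ;; Overwrite the file with two objects, one per line.
  (file-put-jsons "/tmp/example.json"
                  (list #H(() ("count" 3)) #(1 2 3)) t)
  ;; Append one more object to the same file.
  (file-append-jsons "/tmp/example.json" (list "third") t)
  ;; Read back only the first object in the file.
  (prinl (file-get-json "/tmp/example.json"))
  ;; Read back all objects in the file, as a list.
  (prinl (file-get-jsons "/tmp/example.json"))
  ;; Read a JSON array from the output of a command.
  (prinl (command-get-json "echo '[1, 2, 3]'"))
.brev

The
.code -jsons
variants are used for writing here because they terminate each object with a
newline, which keeps successive objects in the file visibly separated.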
.coNP Functions @ command-put-json and @ command-put-jsons .synb .mets (command-put-json < cmd < obj >> [ flat-p <> [ mode-opts ]]) .mets (command-put-jsons < cmd < seq >> [ flat-p <> [ mode-opts ]]) .syne .desc The .code command-put-json and .code command-put-jsons functions open an output text stream over an output command pipe created for the command specified in the string argument .metn cmd , using the .code open-command function, write the argument object into the stream, in their specific manner, and then close the stream. The .code command-put-json function writes a JSON representation of .meta obj using the .code put-json function. The .meta flat-p argument is passed to that function, defaulting to .codn nil . The value returned is that of .codn put-json . The .code command-put-jsons function writes zero or more JSON representations of objects from .metn seq , which must be an iterable object, using the .code put-jsons function. The .meta flat-p argument is passed to that function, defaulting to .codn nil . The value returned is that of .codn put-jsons . .coNP Special Variable @ *print-json-format* .desc The .code *print-json-format* variable controls the formatting style exhibited by .code put-json and related functions. The initial value of this variable is .codn nil . If the value is the keyword symbol .codn :standard , then a widely-used format is used, in which the opening and closing braces and brackets of vectors and dictionaries are printed on separate lines, as are the elements of those objects. If the variable has any other value, including the initial value .codn nil , then a default format is used in which braces, brackets and elements appear on the same line, subject to automatic breaking and indentation, similar to the way Lisp nested list structure is printed.
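
The two formatting styles can be compared by printing the same object twice,
as in this small sketch. The sample object is invented for illustration.

.verb
  (let ((obj #H(() ("a" #(1 2 3)))))
    ;; Default format: braces, brackets and elements share lines.
    (put-jsonl obj)
    ;; Standard format: braces, brackets and elements on separate lines.
    (let ((*print-json-format* :standard))
      (put-jsonl obj)))
.brev

Because the variable is dynamically bound by
.codn let ,
the
.code :standard
format applies only to the output produced within that binding.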
.coNP Special Variable @ *print-json-type* .desc The .code *print-json-type* variable, whose initial value is .codn t , controls whether the .str __type field is included when a structure object is printed as JSON. .coNP Special Variable @ *read-bad-json* .desc This dynamic variable, initialized to a value of .codn nil , controls whether the parser is tolerant of certain non-conformances in the syntax of JSON data, which are ordinarily syntax errors. If the value of this variable is true, then the last element in a JSON array or the last element pair in a JSON object may be followed by a spurious trailing comma, which is ignored. Note: in the future, the variable may be extended to enable other instances of tolerance in the area of JSON parsing.
.TP* Example:
.verb
  (get-json "{ 3:4, }") --> ;; syntax error
  (let ((*read-bad-json* t))
    (get-json "{ 3:4, }")) --> #H(() (3.0 4.0))
.brev
.coNP Special Variable @ *read-json-int* .desc This dynamic variable, initialized to a value of .codn nil , controls whether the parser reads some JSON numbers as integer objects. If the value of the variable is true, then whenever a JSON number is scanned which does not contain a .code . (decimal point) character or the letters .code e or .code E indicating an exponent field, it is converted to an integer object rather than a floating-point value. It is unspecified whether the number is converted to integer or floating-point if the exponent .code e or .code E is present, with a positive exponent value. If this variable is .codn nil , then JSON numbers are all converted to floating point. .SH* FOREIGN FUNCTION INTERFACE On platforms where it is supported, \*(TX provides a feature called the .IR "foreign function interface" , or FFI. This refers to the ability to interoperate with programming interfaces which are defined by the binary data type representations and calling conventions of the platform's principal C language compiler.
\*(TX's FFI module provides a succinct Lisp-based type notation for expressing C data types, together with memory-management semantics pertinent to the transfer of data between software components. The notation is used to describe the arguments and return values of functions in external libraries, and of Lisp callback functions that can be called from those libraries. Driven by the compiled representation of the type notation, the FFI module performs transparent conversions between Lisp data types and C data types, and automatically manages memory around foreign calls and incoming callbacks, for many common interfacing conventions. The FFI module consists of a library of functions which provide all of its semantics. On top of these functions, the FFI module provides a number of macros which comprise an expressive, convenient language for defining foreign interfaces. The FFI module supports passing and returning both structures and arrays by value. Passing arrays by value isn't a feature of the C language syntax; from the C point of view, these by-value array objects in the \*(TX FFI type system are equivalent to C arrays encapsulated in .codn struct s. A .code carray type is provided for situations when foreign code generates arrays of undeclared, dynamic length, other than strings, and returns these arrays by the usual convention of pointer to the first element. The handling of .code carray requires more responsibility from the application. .SS* Cautionary Notes The FFI feature is inherently unsafe. If the FFI type language is used to write incorrect type definitions which do not match the actual binary interface of a foreign function, undefined behavior results. Improper use of FFI can corrupt memory, creating instability and security problems. It can also cause memory leaks and/or use-after-free errors due to inappropriate deallocation of memory. The implicit memory management behaviors encoded in the FFI type system are convenient, but risky. 
A minor declarative detail such as writing .code str instead of .code str-d in the middle of some nested type can make the difference between correct code and code which causes a memory leak, or instability by freeing memory which is in use. FFI developers are encouraged to unit test their FFI definitions carefully and use tools such as Valgrind to detect memory misuses and leaks. .SS* Key Concepts .NP* The \fIput\fP operation When a function call takes place from the \*(TL arena into a foreign library function, argument values must be prepared into the foreign representation. This takes place by converting Lisp objects into stack-allocated temporary buffers representing C objects. For aggregate objects containing pointers, additional buffers are allocated dynamically. For instance, suppose a structure contains a string and is passed by value. The structure will be converted to a stack-allocated equivalent C structure, in which the string will appear as a pointer. That pointer may use dynamically allocated (via .codn malloc ) string data. The operation which prepares argument material before a foreign function call is the .I put operation. In FFI callback dispatch, the operation which propagates the callback return value to the foreign caller is also the put operation. .NP* The \fIin\fP operation After a foreign function call returns from a foreign library back to the \*(TL arena, the arguments have to be examined one more time, because two-way communication is possible, and because some of the material has temporary, dynamically allocated buffers associated with it which must be released. For instance a structure passed by pointer may be updated by the foreign function. FFI needs to propagate the changes which the foreign function performed to the C version of the structure, back to the original Lisp structure. Furthermore, a structure passed by pointer uses a dynamically allocated buffer. This buffer must be freed. 
The operation which handles the responsibility for propagating argument data back into \*(TL objects, and frees any temporary memory that had been arranged by the .I put operation is the .I in operation. The in operation has two nuances: the by-value nuance and the by-pointer nuance. Data passed into a function by value such as function arguments or via .code ptr-in are subject to the by-value nuance. Updates to the foreign representation of these objects do not propagate back to the Lisp representation; however, those objects may contain pointers requiring the by-pointer nuance of the in operation of those pointers to be invoked. .NP* The \fIget\fP operation After a foreign call completes, it is also necessary to retrieve the call's return value, convert it to a Lisp object, and free any dynamic memory. This is performed by the .I get operation. The .I get operation is also used by a Lisp callback function, called from a foreign library, to convert the arguments to Lisp objects. .NP* The \fIout\fP operation When a Lisp callback invoked by a foreign library completes, it must provide a return value, and also update any argument objects with new values. The return value is propagated using the put operation. Updates to arguments are performed by the .I out operation. This operation is like the reverse of the in operation. Like that operation, it has a by-value and by-pointer nuance. For instance, if a callback receives a structure by value, upon return, there is no use in reconstructing a new version of the structure from the updated Lisp structure; the caller will not receive the change. However, if the structure contains pointers to data that was updated by the callback, those changes must materialize. This is achieved by triggering the by-value nuance of the structure type's out operation, which will recursively invoke the out operation of embedded pointers, which will in turn invoke the by-pointer nuance.
.SS* The FFI Type System The FFI type system consists of a notation built using Lisp syntax. Basic, unparametrized types are denoted by symbolic atoms. Similarly to a concept in the C language, .code typedef names can be globally defined, using the .code ffi-typedef function, or the .code typedef macro. Like in the C language, .code typedef names are aliases for an existing type, and not distinct types. However, this is of no consequence, since the FFI doesn't perform any type checking between two foreign types, and thus never takes into consideration whether two such types are equal. The main concern in FFI is correspondence between Lisp values and foreign types. For instance, a Lisp string argument will not convert to a foreign function parameter of type .codn int . Compound expressions denote the construction of derived types, or types which are instantiated with parameters. Each such expression has a type constructor symbol in the operator position, from a limited, fixed vocabulary, which cannot be extended. Some constituents of compound type syntax are expressions which evaluate to integer values: the dimension value for array types, the size for buffers, the width for bitfields and the value expressions for enumeration constants are such expressions. These expressions allow full use of \*(TL. They are evaluated without visibility into any apparent surrounding lexical scope. Some predefined types which are provided are in fact typedef names. For instance, the .code size-t type is a typedef name for some other integral type, defined in a platform-specific way. Which type that is may be determined by passing the syntax to the type compiler function using the expression .codn "(ffi-type-compile 'size-t)" . The type compiler converts the .code size-t syntax to the compiled type object, resolving the typedef name to the type which it denotes. The printed representation of that object reveals the identity of the type. 
For instance, it might be .codn "#<ffi-type uint>" , indicating that .code size-t is an alias for the .code uint basic type, which corresponds to the C type .codn "unsigned int" . .SS* Simple FFI Types .coNP FFI types @, char @, zchar @ uchar and @ bchar The first two of these types, .code char and .codn zchar , correspond to the C character type .codn char . The .code uchar and .code bchar types correspond to .codn "unsigned char" . Both Lisp integers and character values convert to these representations if they are in their numeric range. Out-of-range values produce an exception. A foreign .codn char , .codn zchar , and .code bchar value converts to a Lisp character, whereas a .code uchar value converts to an integer. If these types are used for representing individual scalar values, there is no difference among .codn char , .code zchar and .codn bchar . What is different among these three types is that the .code array and .code zarray type constructors treat them specially. Arrays of these types are subject to conversion to and from Lisp strings. The variation among these types expresses different conversion semantics. That is to say, an array of .code bchar converts between the foreign and native Lisp representation differently from an array of .codn zchar , which in turn converts differently from an array of .codn char . Note: it is recommended to avoid using the types .code bchar and .code zchar other than for expressing the element type of an .code array or .codn zarray . .coNP FFI types @, short @, ushort @, int @, uint @ long and @ ulong These types correspond to the C integer types .codn short , .codn "unsigned short" , .codn int , .codn "unsigned int" , .code long and .codn "unsigned long" . Lisp characters and integers convert to these foreign representations, if they are in their numeric range. Foreign values of these types convert to Lisp integers.
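
The integer conversions just described can be observed with a small binding.
The following is a sketch only: it assumes the
.code with-dyn-lib
and
.code deffi
macros of the FFI module, and assumes that the C library function
.code toupper
is visible through the handle of the main program. The Lisp name
.code c-toupper
is invented for illustration.

.verb
  (with-dyn-lib nil
    ;; C prototype: int toupper(int);
    (deffi c-toupper "toupper" int (int)))

  ;; A Lisp character converts to the int parameter, and the
  ;; int return value converts to a Lisp integer, not a character.
  (prinl (c-toupper #\ea))
  ;; A Lisp integer argument is accepted in the same way.
  (prinl (c-toupper 97))
.brev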
.coNP FFI types @ longlong and @ ulonglong These types are .code typedef names for integer types whose representation corresponds to the C types .code "long long" and .codn "unsigned long long" . .coNP FFI types @ int8 and @ uint8 These types correspond to 8-bit signed and unsigned integers. They convert like integer types: both Lisp integers and characters convert to these types, if in a suitable range; and under the reverse conversion, the foreign values become Lisp integers. .coNP FFI types @, int16 @, uint16 @, int32 @, uint32 @ int64 and @ uint64 These types denote precisely sized C integer types. They convert like integer types: both Lisp integers and characters convert to these types, if in a suitable range; and under the reverse conversion, the foreign values become Lisp integers. .coNP FFI types @ float and @ double These types correspond to the same-named C types. They convert Lisp integers, characters and floating-point numbers to these C types. Because the \*(TL .code float is represented as a C .code double it converts directly to .code double without the possibility of range error or loss of precision. A conversion to type .code float is subject to a range check; an exception is thrown if the Lisp floating-point value is out of range of this type. Even when the conversion is possible, it can alter the value, resulting in a loss of precision. In the reverse direction, values of both types convert to the one and only \*(TL .code float type. .coNP FFI type @ bool The type .code bool is a typedef name for the .code uchar instance of the parametrized .code bool type, which is to say, .codn "(bool uchar)" . .coNP FFI type @ val The FFI .code val type denotes the machine representation of a Lisp value cell, which corresponds to a C pointer. Not all cell values are actually pointers, but values that are heap objects, such as vectors and conses, are.
The .code val type transparently converts any Lisp object to a foreign pointer value with no representation change at all; and performs the reverse conversion from pointer to Lisp value. Note: this is utterly dangerous. Lisp values that aren't pointers must not be dereferenced by foreign code. Foreign code must not generate Lisp pointer values that aren't objects which came from a Lisp heap. Interpreting a Lisp value in foreign code requires a correct decoding of its type tag, and, if necessary, stripping the tag bits to recover a heap pointer and interpreting the type code stored in the heap object. The conversion from foreign bit pattern to Lisp value is subject to validity checks; an exception will be thrown if the bit pattern isn't a valid Lisp object. Nevertheless, the checks are subject to false positives: some invalid objects may be admitted into the Lisp realm, possibly with catastrophic results. .coNP FFI type @ cptr This type corresponds to a foreign pointer of any type, including a pointer to a function. The .code cptr type converts between a foreign pointer and a Lisp object of type .codn cptr . Lisp objects of type .code cptr are tagged with a symbolic tag, which may be .codn nil . The unparametrized .code cptr converts foreign pointers to .code cptr objects which are tagged with .codn nil . In the reverse direction, it converts Lisp objects of type .code cptr to foreign pointers, without regard for their type tag. There is a parametrized version of the .code cptr FFI type, which provides a measure of type safety. Note: the .code cptr type, in the context of FFI, is particularly useful for representing C pointers that are used in C library interfaces as "opaque" handles. For instance, an FFI binding for the C functions .code fopen and .code fclose may use the .code cptr to represent the .code "FILE *" type.
That is to say, .code cptr can be specified as the return type for .codn fopen , thereby capturing the stream handle in a .code cptr object when that function is invoked through FFI. Then, the captured .code cptr object can be passed as the argument of .code fclose to close the stream. .coNP FFI types @, str @ str-d and @ str-s These FFI types correspond to the C pointer type .codn "char *" , providing automatic conversion between Lisp strings and null-terminated C strings. The null pointer corresponds to the .B nil symbol. The related types .codn bstr , .codn bstr-d , .codn bstr-s , .codn wstr , .code wstr-d and .code wstr-s are also provided; these are described in the following sections. The .code str type behaves as follows. The put operation allocates, using .codn malloc , a buffer large enough to hold the UTF-8 encoded version of the Lisp string, encodes the string into that buffer, and then stores the .code "char *" pointer into the argument space. The in operation deallocates the buffer. If .code str is passed by pointer, the in operation also takes the current value of the .code "char *" pointer, which may have been replaced by a different pointer, and creates a new Lisp string by decoding UTF-8 from that buffer. The get operation retrieves the C pointer and duplicates a new string by decoding the UTF-8 contents. The type has no out operation: a string is not expected to be modified in-place. The .code str-d type differs in behavior from .code str as follows. Firstly, it has no in operation. Consequently, .code str-d doesn't deallocate the buffer that had been allocated by put. Under the get operation, the .code str-d type assumes that ownership over the C pointer has been granted, and after duplicating a new string from the decoded UTF-8 data in the C string, it deallocates that C string by invoking the C library function .code free on it.
The type .code str-s is similar to .codn str-d ; it also has no in operation, and doesn't deallocate the buffer allocated in the put operation. Under the get operation, the .code str-s type does not assume ownership of memory, and therefore does not free the pointer received from the foreign function. The .code str-s type is intended for receiving strings via a pointer-to-pointer argument, in situations when the string must not be freed. Like other types, the string types combine with the .code ptr type family. Because the .code ptr family has memory management semantics, as does the string family, it is important to understand the memory management implications of the combination of the two. The derived pointer types .code "(ptr str-d)" and .code "(ptr str)" are effectively equivalent. They denote a string passed by pointer, with in-out semantics. The effect is that the string is dynamic in both directions. What that means is that the foreign function either must not free the pointer it was given, or else it must replace it with one which the caller can also free (or with a null pointer). The two are equivalent because .code str-d has no in operation, so its get operation is used instead; but that operation is similar to the in operation of the .code str type: both decode the string currently referenced by the .code "char *" pointer, and then pass that pointer to the C .code free function. Receiving a string by pointer from a foreign function is achieved by treating the situation as a pointer to an array of one element. That is to say, an argument like .code "char **pstr" can be treated as either .code "(ptr-out (array 1 str-d))" if the foreign function passes ownership of the string, or else .code "(ptr-out (array 1 str-s))" if the foreign function retains ownership of the string. In either case, the argument is a vector of one element, which will be updated to the returned string, or else .code nil if the function passes back a null pointer.
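
The ownership difference between
.code str
and
.code str-d
as return types can be sketched with two familiar C library functions. This
example assumes the
.code with-dyn-lib
and
.code deffi
macros of the FFI module, and assumes that
.code getenv
and
.code strdup
are visible through the handle of the main program. The Lisp names
.code c-getenv
and
.code c-strdup
are invented for illustration.

.verb
  (with-dyn-lib nil
    ;; getenv retains ownership of the string it returns, so the
    ;; return type is str: the get operation decodes without freeing.
    (deffi c-getenv "getenv" str (str))
    ;; strdup returns malloc-allocated memory owned by the caller, so
    ;; the return type is str-d: the get operation decodes, then frees.
    (deffi c-strdup "strdup" str-d (str)))

  (prinl (c-getenv "HOME"))    ; nil if getenv returns a null pointer
  (prinl (c-strdup "hello"))
.brev

In both bindings the argument type is plain
.codn str :
the put operation allocates a temporary UTF-8 buffer for the call, and the in
operation releases it afterward. Declaring the return type of
.code getenv
as
.code str-d
would be an error of exactly the kind cautioned against earlier, since FFI
would then free memory which it does not own.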
The type combination .code "(ptr-in str-d)" refers to a string pointer passed to a foreign function by pointer, whereby the foreign function will retain and free the pointer. The type combination .code "(ptr-in str)" passes the string pointer in the same way, but the foreign module mustn't use the pointer after returning. FFI will free the pointer that had been passed. .coNP FFI types @, bstr @ bstr-d @ and @ bstr-s The .code bstr family corresponds to null-terminated .code "char *" C strings, like the .code str family, and the family members have memory management semantics similar to their .code str counterparts. Likewise, under these types also, the null pointer corresponds to .codn nil . The .code b prefix in the naming denotes "byte". It indicates that unlike the .code str family, the .code bstr family does not use UTF-8 encoding; only Lisp strings which contain strictly code points in the range U+0000 to U+00FF may convert to these types; out-of-range characters trigger an error exception. Likewise, in the reverse direction, no UTF-8 decoding is performed: every byte value turns into the corresponding character code. The byte 0 is interpreted as the string terminator. Note: the .code bstr type may be advantageous in situations when character handling is known to be confined to the ASCII range, since UTF-8 conversion is then unnecessary overhead. Because \*(TX strings use wide characters internally, converting to and from the .code bstr type still requires memory management overhead, just like in the case of the .code str type. The .code wstr type described in the next section avoids memory management and conversion overhead. Thus, even in situations in which characters are confined to the ASCII range, if wide functions are available in the foreign API, it may be more efficient to use them, particularly if the foreign component uses that representation internally. 
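The ownership rules of the string types can be illustrated with a minimal sketch, which is not taken from any existing program. It assumes a POSIX-like platform on which
.code "(with-dyn-lib nil ...)"
resolves symbols from the C library that is already linked into the running image. The Lisp-side names
.codn c-getenv ,
.code c-strdup
and
.code c-puts
are hypothetical, chosen only for this illustration.

.verb
  (with-dyn-lib nil
    ;; getenv returns a pointer into storage owned by the C library:
    ;; the get operation of str decodes the text without freeing it.
    (deffi c-getenv "getenv" str (str))
    ;; strdup returns storage obtained from malloc: the get operation
    ;; of str-d decodes the text and then passes the pointer to free.
    (deffi c-strdup "strdup" str-d (str))
    ;; puts only reads its argument; bstr sends each character as a
    ;; single byte, with no UTF-8 encoding.
    (deffi c-puts "puts" int (bstr)))

  (prinl (c-getenv "HOME"))   ;; a Lisp string, or nil if unset
  (prinl (c-strdup "hello"))
  (c-puts "plain ASCII text")
.brev

In all three definitions the argument buffer is allocated by the put operation and released by the in operation, so the foreign functions must not retain the pointer they are given.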
.coNP FFI types @, wstr @ wstr-d and @ wstr-s The FFI type .code wstr corresponds to the C type .code "wchar_t *" pointing to the first character of a null terminated wide string. It converts between Lisp strings and symbols, and C strings. The family members of .code wstr have memory management semantics similar to their .code str counterparts. Likewise, under these types also, the null pointer corresponds to .codn nil . Note: because wide characters do not require UTF-8 conversion, the .code wstr family is more efficient. A .code wstr string passes into foreign code directly: the Lisp object already contains a null-terminated wide character string, and so the pointer to that is given directly to the foreign code. Similarly, ownership transfer in either direction is a pointer passage with no memory management or conversion overheads. Whenever some foreign API offers a choice between UTF-8 strings, and wide strings, the wide version should be targeted by FFI, particularly if the API is known to work with wide strings internally also. .coNP FFI types @ buf and @ buf-d The .code buf type creates a correspondence between the \*(TL .code buf type and a C pointer to a block of arbitrary data. Note that there also exists a parametrized version of the .code buf and .code buf-d type syntax which specifies a size. Under the .code buf type's put operation, no memory allocation takes place. The pointer to the buffer object's data is written into the argument space, so the foreign function can manipulate the buffer directly. If the object isn't a buffer but rather the symbol .codn nil , then a null pointer is written. The .code buf in operation has semantics as follows. In the pass-by-pointer nuance, the buffer pointer currently in the argument space is compared to the original one which had been written there from the buffer object. If they are identical, then the in operation yields the original buffer object.
Otherwise, if the altered pointer is non-null, it allocates a new buffer equal in size to the original one and copies in the new data from the new pointer that was placed into the argument space by the foreign function. If the altered pointer is null, then instead of allocating a new buffer, the object .code nil is returned. The by-value nuance of the in operation does nothing. The get operation is not meaningful for an unsized .codn buf : it yields a zero length .code buf object. For this reason, the parametrized .code buf type should be used for retrieving a buffer with a specific fixed size. The .code buf-d type has different memory management from .codn buf . The put operation of .code buf-d allocates a copy of the buffer and writes into the argument space a pointer to the copy. It is assumed that the foreign function takes ownership of the copy. The in operation of .code buf-d is also different. The by-value nuance of the in operation is a no-op, like that of .codn buf . The by-pointer nuance doesn't attempt to compare the previously written pointer to the current value. Rather, it assumes that if there is any non-null pointer value in the argument space, then it should take ownership of that object and return it as a new buffer. Thus if two-way dynamic buffer passing is requested using .code "(ptr buf-d)" it means that the foreign function must replace the pointer with a null to indicate that it has consumed the buffer. Any non-null value in the argument space indicates that the foreign function has either rejected the pointer (not taken ownership), or has replaced it with a new object, whose ownership is being passed. Unidirectional by-pointer passing of a .code buf-d can be performed using the types .code "(ptr-out buf-d)" or .codn "(ptr-in buf-d)" . The former type will not invoke .codn buf-d 's put operation. It will only allocate a pointer-sized space, without initializing it.
After the foreign call, the by-pointer semantics of the in operation will be triggered. If the foreign function places a non-null pointer into the space, its ownership will be seized by a newly instantiated buffer object. Otherwise the function must place a null pointer, which results in a .code nil value emerging from the in operation as documented above. The latter type will achieve a transfer of ownership in the other direction, by invoking the .code buf-d put operation, which places a copy of the buffer into the pointer-sized location prepared in the argument space. After the call, it will invoke the by-value in semantics of .codn buf-d , which is a no-op: thus no attempt is made to extract a buffer, even if the foreign function alters the pointer. .coNP FFI type @ closure The .code closure type converts three kinds of Lisp objects to a C pointer: the object .codn nil , the .code cptr type, or the special .code ffi-closure type. When the .code nil symbol is converted to a .code closure type, it becomes a null function pointer. A .code cptr object of any kind converts to a .codn closure ; the internal pointer is converted to a function pointer. Instances of the .code ffi-closure type are produced by the .code ffi-make-closure function, or by calls to functions defined by the .code deffi-cb macro. The .code closure type is useful for passing callbacks to foreign functions: Lisp functions which appear to be C functions to foreign code. In the reverse direction, when a .code closure object is converted from the foreign function pointer representation to a Lisp object, it becomes a .code cptr object whose tag is the .code closure symbol. .coNP FFI type @ void The .code void type is useful for indicating the return type of foreign functions and callbacks which return no value. It corresponds to a zero-sized object. It will convert any Lisp value into zero bytes, and convert zero bytes into .codn nil .
.SS* Parametrized FFI Type Operators The following parametrized type operators are available. .coNP FFI type @ enum .synb .mets (enum < name >> {( sym << value ) | << sym }*) .syne .desc The type .code enum specifies an enumerated type, which establishes a correspondence between a set of Lisp symbols and foreign integer values of type .codn int . The .meta name argument must either be .code nil or a symbol for which the .code bindable function returns true. It gives the tag name of the enumerated type. The remaining arguments specify the enumeration constants. In the enumeration constant syntax, each occurrence of .meta sym must be a bindable symbol according to the .code bindable function. The symbols may not repeat within the same enumerated type. Unlike in the C language, different enumerations may use the same symbols; they are in separate spaces. If a .meta sym is given, it is associated with an integer value which is one greater than the integer value associated with the previous symbol. If there is no previous symbol, then the value is zero. If the previous symbol has been assigned the highest possible value of the FFI .code int type, then an error exception is thrown. If .mono .meti >> ( sym << value ) .onom is given, then .meta sym is given the specified value. The .meta value is an expression which must evaluate to an integer value in range of the FFI .code int type. It is evaluated in an environment in which the previous symbols from the same enumeration appear as variables whose bindings are their enumeration values, making it possible to use earlier enumerations in the definition of later enumerations. The FFI .code enum type converts two kinds of Lisp values to the foreign type .codn int : symbols which are in the set defined by the type, and integer values which are in the range which that foreign type can represent.
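As a sketch of how the
.code closure
and
.code void
types might appear together, the following binds the C library function
.code qsort
and passes it a Lisp comparison function. This is an illustration under stated assumptions rather than a definitive recipe: it assumes that
.code "(with-dyn-lib nil ...)"
can resolve
.code qsort
in the running image, and the names
.code c-qsort
and
.code int-cmp-cb
are hypothetical.

.verb
  (with-dyn-lib nil
    ;; void qsort(void *base, size_t n, size_t size,
    ;;            int (*cmp)(const void *, const void *));
    (deffi c-qsort "qsort" void
           ((ptr (array int)) size-t size-t closure)))

  ;; Defines int-cmp-cb: a function which wraps a Lisp function in an
  ;; ffi-closure object having the given C signature. The ptr-in type
  ;; is used because the callback only reads through its pointers.
  (deffi-cb int-cmp-cb int ((ptr-in int) (ptr-in int)))

  (let ((v (vec 3 1 2)))
    (c-qsort v (length v) (sizeof int)
             [int-cmp-cb (lambda (a b)
                           (cond ((< a b) -1)
                                 ((> a b) 1)
                                 (t 0)))])
    (prinl v))
.brev

If the binding behaves as intended, the vector is encoded into a temporary C array, sorted there by the foreign function, and then updated from that array by the in semantics of
.codn ptr ,
so that it is printed in ascending order.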
Out-of-range integer values, symbols not defined in the enumeration, and objects not of symbol or integer type all trigger an exception. In the reverse direction, the .code enum type extracts from the foreign representation values of FFI type .codn int , and converts them, if possible, to symbols. If an integer value occurs which is not assigned to any enumeration symbol, then the conversion produces that integer value itself rather than a symbol. If an integer value occurs which is assigned to multiple enumeration symbols, it is not specified which of those symbols is produced. .coNP FFI type @ enumed .synb .mets (enumed < type < name >> {( sym << value ) | << sym }*) .syne .desc The .code enumed type operator is a generalization of .code enum which allows the base integer type of the enumeration to be specified. The following equivalence holds: .verb (enum n a b c ...) <--> (enumed int n a b c ...) .brev Any integer type or .meta typedef name may be specified for .metn type , including any one of the endian types. The enumeration inherits its size, alignment and other foreign representation details from .metn type . The values associated with the enumeration symbols must be in the representation range of .metn type , which is not checked until the conversion of a symbol through the enumeration is attempted at run time. The .code enumed type is a clone of the underlying type, inheriting most of its properties. In particular, it is possible to derive an .code enumed type from an underlying bitfield type. The resulting type is still a bitfield, and may only be used as a .code struct or .code union member. Moreover, because it is a bitfield type, there is a restriction against creating aliases for it with .codn typedef . An .code enumed bitfield allows the values of a bit field to be specified symbolically. 
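The enumeration rules above can be made concrete with a short sketch. The type names
.codn color ,
.code shirt-size
and
.code color8
are hypothetical, chosen only for illustration; the values noted in the comments follow from the numbering rules described for
.codn enum .

.verb
  ;; red = 0, green = 1, blue = 10, violet = 11
  (typedef color (enum color red green (blue 10) violet))

  ;; An earlier symbol may be used in a later value expression:
  ;; small = 1, large = 4
  (typedef shirt-size (enum shirt-size (small 1) (large (* 4 small))))

  ;; The same constants as color, stored in a uint8 rather than an int.
  (typedef color8 (enumed uint8 color8 red green (blue 10) violet))

  (prinl (sizeof color))    ;; same as (sizeof int)
  (prinl (sizeof color8))   ;; size inherited from uint8

  ;; Round trip through the foreign representation; this should
  ;; recover the symbol blue.
  (prinl (ffi-get (ffi-put 'blue (ffi color)) (ffi color)))
.brev

Because
.code color
and
.code color8
are separate enumerations, they may share the same symbols without conflict.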
.coNP FFI type @ struct .synb .mets (struct < name >> {( slot < type <> [ init-form ])}*) .syne .desc The FFI .code struct type maps between a Lisp .code struct and a C .codn struct . The .meta name argument of the syntax gives the structure type's name, known as the tag. If this argument is the symbol .code nil then the structure type is named by a newly generated uninterned symbol (with .codn gensym ). The .meta name is entered into a global namespace of tags which is shared by structures and unions. The .meta name also specifies the Lisp .code struct name associated with the FFI type. The .meta slot and .meta type pairs specify the structure members. The .meta slot elements must be symbols, and the .meta type elements must be FFI type expressions. A .code struct definition with no members refers to a previously defined .code struct or .code union type which has the same .meta name in the global .cod3 struct / union tag space. If no prior .code struct or .code union exists, then a definition with no slots specifies a new, incomplete structure type. A .code struct definition with no members never causes a Lisp structure type to be created. A .code struct definition that specifies one or more members either defines a new structure type, or completes an existing one. If an incomplete structure or union type which has the same .meta name exists, then the newly appearing definition is understood to provide a completion of that type. If the incomplete type is a .codn union , it is thereby converted to a .code struct type. If a complete structure type which has the same .meta name already exists, then the newly appearing definition replaces that type in the tag namespace. A .code struct definition with members is entered into the .cod3 struct / union tag space immediately as an incomplete type (if it isn't already), before the members are processed. Therefore, the member definitions can refer to the .code struct type.
The type becomes complete when the last member is processed, except in the special situation when that member causes the type to become a flexible structure, described several paragraphs below. A .code struct definition that specifies members causes a Lisp .code struct having the same .meta name to exist, if such a type doesn't already exist. If such a type is created, instance slots are defined for it which correspond to the member definitions in the FFI .code struct definition. For any .meta slot which specifies an .meta init-form expression, that expression is evaluated during the processing of the type syntax, in the global environment. The resulting value then becomes the initial value for the slot. The semantics of this value is similar to that of a quoted object appearing as an .meta init-form in the .code defstruct macro's .meta slot-specifier syntax. For example, if the type expression .codn "(struct s (a int expr))" , which specifies a slot .code a initialized by .codn expr , generates a Lisp struct type, the manner in which that type is generated will resemble that of .code "(defstruct s nil (a (quote [value-of-expr])))" where .code [value-of-expr] denotes the substitution of the value of .code expr which had been obtained by evaluation in the global environment. Note: if more flexible initialization semantics is required, the application must define the Lisp struct type first with the desired characteristics, before processing the FFI struct type. The FFI struct type will then be related to the existing Lisp struct type. Those members whose .meta slot name is specified as .code nil are ignored; no instance slots are created in the Lisp type. If a .meta init-form is specified for such a slot, there exists no situation in which that form will be evaluated. When a Lisp object is converted to a struct, it must, firstly, be of the struct type specified by .metn name . Secondly, that type must have all of the slots defined in the FFI type.
The slots are pulled from the Lisp structure in the order that they appear in the FFI .code struct definition. They are placed into the target memory area in that order, with all required padding between the members, and possibly after the last member, for alignment. Whenever a member is defined using .code nil as the .meta slot name, that member represents anonymous padding. The corresponding .meta type expression is used only to determine the size of the padding. Its data transfer semantics is completely suppressed. When converting from Lisp, the anonymous padding member simply generates a skip of the number of bytes corresponding to the size of its type, plus any necessary additional padding for the alignment of the subsequent member. Structure members may be bitfields, which are described using the .codn ubit , .code sbit and .code bit compound type operators. A structure member must not be an incomplete or zero-sized array, unless it is the last member. If the last member of an FFI structure is an incomplete array, then it is a flexible structure. A structure member must not be a flexible structure, unless it is the last member; the containing structure is then itself a flexible structure. Flexible structures correspond to the C concept of a "flexible array member": the idea that the last member of a structure may be an array of unknown size, which allows for variable-length data at the end of a structure, provided that the memory is suitably allocated. Flexible structures are subject to special restrictions and requirements. See the section Flexible Structures below. In particular, flexible structures may not be passed or returned by value. See also: the .code make-zstruct function and the .code znew macro. .coNP FFI type @ union .synb .mets (union < name >> {( slot << type )}*) .syne .desc The FFI .code union type resembles the .code struct type syntactically. It provides handling for foreign objects of C .code union type.
The .meta name argument specifies the name for the union type, known as a tag. If this argument is the symbol .code nil then the union type is named by a newly generated uninterned symbol (with .codn gensym ). The .meta name is entered into a global namespace of tags which is shared by structures and unions. The .meta slot and .meta type pairs specify the union members. The .meta slot elements must be symbols, and the .meta type elements must be FFI type expressions. A .code union definition with no members refers to a previously defined .code struct or .code union type which has the same .meta name in the global .cod3 struct / union tag space. If no prior .code struct or .code union exists, then a definition with no slots specifies a new, incomplete .code union type. A .code union definition that specifies one or more members either defines a new union type, or completes an existing one. If an incomplete structure or union type which has the same .meta name exists, then the newly appearing definition is understood to provide a completion of that type. If the prior incomplete type is a .codn struct , it is converted to a .code union type. If a complete structure or union type which has the same .meta name already exists, then the newly appearing definition replaces that type in the tag namespace. A .code union definition with members is entered into the .cod3 struct / union tag space immediately as an incomplete type (if it isn't already), before the members are processed. Therefore, the member definitions can refer to the .code union type. The type becomes complete when the last member is processed. Unlike the FFI .code struct type, the .code union type doesn't provide automatic conversion between C and Lisp data. This is because the .code union is inherently unsafe, due to its placement of multiple types into the same storage, and lack of any information to discriminate which type is currently stored.
Instead, the FFI .code union creates a correspondence between a C union that is regarded as just a region of memory, and a \*(TL data type called .codn union . An instance of the Lisp .code union type holds a copy of the C union memory, and also contains type information about the union's members. Functions are provided to store and retrieve the members; it is these functions which provide the conversion between the Lisp types and the foreign representations stored in the C union. This is done under control of the application, because due to the inherent lack of safety of the C .codn union , only the application program knows which member of the union may be accessed. Conversion between the C .code union and the Lisp .code union consists of just a memory copying operation. The following functions are provided for manipulating unions: .code make-union instantiates a new union object; .code union-members retrieves a list of the symbols serving as the union's member names; .code union-get retrieves a specified member from the union's storage, converting it to a Lisp object; .code union-put places a Lisp object into a union, using the specified member's type to convert it to a foreign representation; .code union-in performs the "in semantics" on the specified member of a union, propagating modifications in that member back to a Lisp object; and .code union-out performs "out semantics" on the specified member of a union, propagating modifications done on a previously retrieved Lisp object back into the union. .coNP FFI type @ array .synb .mets (array < dim << type ) .mets (array << type ) .syne .desc The FFI .code array type creates a correspondence between Lisp sequences and "by value" fixed size arrays in C. It converts Lisp sequences to C arrays, and C arrays to Lisp vectors. Arrays passed by value do not exist in the C language syntax. Rather, the C type which corresponds to the FFI array is a C array that is encapsulated in a .codn struct .
For instance the type .code "(array 3 char)" can be visualized as corresponding to the C type .codn "struct { char anonymous[3]; }" . Thus, in the FFI syntax, we can specify arrays as function parameters passed by value and as return values. On conversion from Lisp to the foreign type, the FFI .code array simply iterates over the Lisp sequence, and performs an element for element conversion to .metn type . If the sequence is shorter than the array, then the remaining elements are filled with zero bits. If the sequence is longer than the array, then the excess elements in the sequence are ignored. Since Lisp arrays and C arrays do not share the same representation, temporary buffers are automatically created and destroyed by FFI to manage the conversion. The .meta dim argument is an ordinary Lisp expression expanded and evaluated in the top-level environment. It must produce a nonnegative integer value. In addition, several types are treated specially: when .meta type is one of .codn char , .codn zchar , .code bchar or .codn wchar , the array type establishes a special correspondence with Lisp strings. When the C array is decoded, a Lisp string is created or updated in place to reflect the new contents. This is described in detail below. The second form, whose syntax omits the .meta dim element, denotes a variable length array. It corresponds to the concept of an incomplete array in the C language, except that no implicit array-to-pointer conversion concept is implemented in the FFI type system. This type may not be used as an array element or structure member, other than as the last structure member. It also may not be passed or returned by value, only by pointer. If the last member of a structure has this type, then it is a flexible array member; see the Flexible Structures section below. Since the type has unknown length, it has a trivial get operation which returns .codn nil .
It is useful for passing a variable amount of data into a foreign function by pointer. An array of .code char represents non-null-terminated UTF-8 character data, which converts to and from a Lisp string. Any null bytes in the data correspond to the pseudo-null character .code #\exDC00 also notated as .codn #\epnul . An array of .code zchar represents a field of optionally null-terminated UTF-8 character data. If a null byte occurs in the data then the text terminates before that null byte, otherwise the data comprises the entire foreign array. Thus, null bytes do not occur in the data. A null byte in the array will not generate a pseudo-null character in the Lisp string. An array of .code bchar values represents 8-bit character data that isn't UTF-8 encoded, and is not null terminated. Each byte holds a character whose code is in the range 0 to 255. If a null byte occurs in the data, it is interpreted as a string terminator. .coNP FFI type @ zarray .synb .mets (zarray < dim << type ) .mets (zarray << type ) .syne .desc The .code zarray type is a variant of .codn array . When converting from Lisp to C, it ensures that the array is null-terminated. This means that if the .meta zarray is dimensioned, then the .mono .meti >> [ dim - 1] .onom element of the C array is written out as all zero bytes, ignoring the corresponding Lisp value in the Lisp array. If the .meta zarray is undimensioned, then the size of the C array is deemed to be one greater than the actual length of the Lisp array. The elements in the Lisp array are converted to the corresponding elements of the C array, and then the last element of the C array is filled with null bytes. The .code zarray type is useful for handling null terminated character arrays representing strings, and for null terminated vectors. Unlike .codn array , .code zarray allows the Lisp object to be one element short.
For instance, when a .code "(zarray 5 int)" passed by pointer to a foreign function is converted back to Lisp, the Lisp object is required to have only four elements. If the Lisp object has five elements, then the fifth one will be decoded from the C array in earnest; it is not expected to be null. However, when that Lisp representation is converted back to C, that extra element will be ignored and output as zero bytes. Lastly, the .code zarray further extends the special treatment which the .code array type applies to the types .codn zchar , .codn char , .code wchar and .codn bchar . The .code zarray type assumes, and depends on the incoming data being null-terminated, and converts it to a Lisp string accordingly. The regular .code array type doesn't assume null termination. In particular, this means that whereas .code "(array 42 char)" will decode 42 bytes of UTF-8, even if some of them are null, converting those null bytes to the U+DC00 pseudo-null, in contrast, a .code zarray will treat the 42 bytes as a null-terminated string, and decode UTF-8 only up to the first null. In the other direction, when converting from Lisp string to foreign array, .code zarray ensures null termination. Note that the type combination .code zarray of .code zchar behaves in a manner indistinguishable from a .code zarray of .codn char . The one-argument variant of the .code zarray syntax which omits the .meta dim argument specifies a null-terminated variant of the variable-length array. Like that type, it corresponds to the concept of an incomplete array in the C language. It may not be used as an array element, and may not be used as a structure member other than the last member. It cannot be passed as an argument or returned as a value. If the last member of a structure has this type, then it is a flexible array member; see the Flexible Structures section below.
Unlike the ordinary variable-length .codn array , the .code zarray type supports the get operation, which extracts elements, accumulating them into a resulting vector, until it encounters an element consisting of all zero bytes. That element terminates the decoding, and isn't included in the resulting array. The variable-length .code zarray also has a special in operation. Like the get operation, the in operation extracts all elements until a terminating null, decoding them to a vector. Then, the entire original vector is replaced with the new vector, even if the original vector is longer. .coNP FFI type @ ptr .synb .mets (ptr << type ) .syne .desc The .code ptr type denotes the passage of a value by pointer. The .meta type argument gives the pointer's target type. The .code ptr type converts a single Lisp value, to and from the target type, using a C pointer as the external representation. When used for passing a value to a foreign function, the .code ptr type has in-out semantics: it supports the interfacing concept that the called function can update the datum which has been passed to it "by pointer", thereby altering the caller's object. Since a Lisp value requires a conversion to the FFI external representation, it cannot be directly passed by pointer. Instead, this semantics is simulated. The put semantics of .code ptr allocates a temporary buffer, large enough to hold the representation of .metn type . The Lisp value is then encoded into this buffer, recursively relying on the type's put semantics. After the foreign call, .code ptr triggers the in semantics of .meta type to update the Lisp object from the temporary buffer, and releases the buffer. The get semantics of .code ptr is used in retrieving a .code ptr return value, or, in a FFI callback, for retrieving the values of incoming arguments that are of .code ptr type. The get semantics assumes that the memory referenced by the C pointer is owned by foreign code.
The Lisp object is merely decoded from the data area, which is then not touched. The out semantics of .codn ptr , used by callbacks for updating the values of arguments passed by pointer, assumes that the argument space already contains a valid pointer. The pointer is retrieved from the argument space, and the Lisp value is encoded into the memory referenced by that pointer. Note that only Lisp objects with mutable slots can be meaningfully passed by pointer with in-out semantics. If a Lisp object without mutable slots, such as an integer, is passed using .code ptr the incoming updated value of the external representation will be ignored. Concretely, if a C function has the argument signature .code "(int *)" with in-out semantics such that it updates the .code int object which is passed in, this function can be called as a foreign function using a .code "(ptr int)" FFI type for the argument. However, the argument of the foreign call on the \*(TL side is just an integer value, and that cannot be updated. On the other hand, if a FFI .code struct member is declared as of type .code "(ptr int)" then the Lisp .code struct is expected to have an integer-valued slot corresponding to that member. The slot is then subject to a bidirectional transfer. FFI will create an .codn int -sized temporary data area, encode the slot into that area and place that area's pointer into the encoded structure. After the call, the new value of the .code int will be extracted from the temporary buffer, which will then be released. The Lisp structure's slot will be updated with the new integer. This will happen even if the Lisp structure is being passed as a by-value argument. .coNP FFI type @ ptr-in .synb .mets (ptr-in << type ) .syne .desc The .code ptr-in type is a variation of .code ptr which denotes the passing of a value by pointer into a function, but not out.
The put semantics of .code ptr-in is the same as that of .codn ptr , but after the completion of the foreign function call, the in semantics differs. The .code ptr-in type only frees the temporary buffer, without decoding from it. The out semantics of .code ptr-in differs also. It effectively treats the object as if it were "by value", since the reverse data transfer is ruled out. In other words, .code ptr-in simply triggers the by-value nuance of .metn type 's out semantics. The get semantics of .code ptr-in is the same as that of .codn ptr . .coNP FFI type @ ptr-out .synb .mets (ptr-out << type ) .syne .desc The .code ptr-out type is a variant of .code ptr which denotes a by pointer data transfer out of a function only, not into. The put semantics of .code ptr-out prepares a data area large enough to hold .meta type and stores a pointer to that area into the argument space. The Lisp value isn't encoded into the data area. The in semantics is the same as that of .codn ptr : the by-pointer nuance of .metn type 's in semantics is invoked to decode the external representation to Lisp data. .coNP FFI type @ ptr-in-d .synb .mets (ptr-in-d << type ) .syne .desc The .code ptr-in-d type is a variant of .code ptr-in which transfers ownership of the allocated buffer to the invoked function. That is to say, the in semantics of .code ptr-in-d doesn't involve the freeing of memory that was allocated by put semantics. The .code ptr-in-d type is useful when a function expects a pointer to an object that was allocated by .code malloc and expects to take responsibility for freeing that object. Since the function may free the object even before returning, the pointer must not be used once the function is called. This is ensured by the in semantics of .code ptr-in-d which is the same as that of .codn ptr-in . The .code ptr-in-d type also has get semantics which assumes that ownership of the C object is to be seized. 
FFI will automatically free the C object when get semantics is invoked to retrieve a value through a .codn ptr-in-d . .coNP FFI type @ ptr-out-d .synb .mets (ptr-out-d << type ) .syne .desc The .code ptr-out-d type is a variant of .code ptr-out which is useful for capturing return values or, in a callback, producing return values. The .code ptr-out-d type has empty put semantics. If its put semantics is invoked, it does nothing: no area is allocated for .meta type and no pointer is stored into the argument space. The in semantics is the same as that of .codn ptr : a pointer is retrieved from the argument space, the object is subject to .metn type 's in semantics to recover the updated Lisp value, and then the object is freed. The get semantics of .code ptr-out-d is identical to that of .codn ptr-in-d . The out semantics is identical to that of .codn ptr . .coNP FFI type @ ptr-out-s .synb .mets (ptr-out-s << type ) .syne .desc The .code ptr-out-s type is a variant of .code ptr-out similar to .codn ptr-out-d , which assumes that the C object being received has an indefinite lifetime, and doesn't need to be freed. The suffix stands for "static". Like .codn ptr-out-d , the .code ptr-out-s type has no put semantics. Its in semantics recovers a Lisp value from the external object whose pointer has been stored by the foreign function, but doesn't free the external object. The get semantics retrieves a Lisp value without freeing. .coNP FFI type @ bool .synb .mets (bool << type ) .syne .desc The parametrized type .code bool can be derived from any integer or floating-point type. There is also an unparametrized .code bool which is a .code typedef for the type .codn "(bool uchar)" . The .code bool type family represents Boolean values, converting between a Lisp Boolean and foreign Boolean. A given instance of the .code bool type inherits all of its characteristics from .metn type , such as its size, alignment and foreign representation. It alters the get and put semantics, however. 
The get semantics converts a foreign zero value of .meta type to the Lisp symbol .codn nil , and all other values to the symbol .codn t . The put semantics converts the Lisp symbol .code nil to a foreign value of zero. Any other Lisp object converts to the foreign value one. The .code bool types are not integers, and cannot be used as the basis of bitfields: syntax like .code "(bit 3 (bool uint))" is not permitted. However, Boolean bitfields are possible when this syntax is turned inside out: the .code bool type can be derived from a bitfield type, as exemplified by .codn "(bool (bit 3 uint))" . This simply applies the above described Boolean conversion semantics to a three-bit field. A zero/nonzero value of the field converts to .cod3 nil / t and a .code nil or .cod2 non- nil Lisp value converts to a 0 or 1 field value. .coNP FFI types @ ubit and @ sbit .synb .mets ({ubit | sbit} << width ) .syne .desc The .code ubit and .code sbit types denote C-language-style bitfields. These types can only appear as members of structures. A bitfield type cannot be the argument or return value of a foreign function or closure, and cannot be a foreign variable. Arrays of bitfields and pointers, of any kind, to bitfields are a forbidden type combination that is rejected by the type system. The .code ubit type denotes a bitfield of type .codn uint , corresponding to an .code unsigned bitfield in the C language. The .code sbit type denotes a bitfield of type .codn int . Unlike in the C language, it is not implementation-defined whether such a bitfield represents signed values; it converts between Lisp integers that may be positive or negative, and a foreign representation which is two's complement. Bitfields based on some other types are supported using the more general .code bit operator, which is described below. The .meta width parameter is an expression evaluated in the top-level environment, which indicates the number of bits. 
It may range from zero to the number of bits in the .code uint type. In a structure, bitfields produced by .code sbit and .code ubit are allocated in storage units which have the same width and alignment requirements as a .codn uint . These storage units themselves can be regarded as anonymous members of the structure. When a new unit needs to be allocated in a structure to hold bitfields, it is allocated in the same manner as a named member of type .code uint would be at the same position. A zero-length bitfield is permitted. It may be given a name, but the field will not perform any conversions to and from the corresponding slot in the Lisp structure. Note that in situations when the FFI struct definition causes the corresponding Lisp structure type to come into existence, the Lisp structure type will have slots for all the zero width named bitfields, even though those slots don't participate in any conversions in conjunction with the FFI type. The presence of a zero-length bitfield ensures that a subsequent structure member, whether bitfield or not, is placed in a new storage unit of the size of the bitfield's base type. Details about the algorithm by which bitfields are allocated within a structure are given in the paragraph below entitled .BR "Bitfield Allocation Rules" . A .code ubit field stores values which follow a pure binary enumeration. For instance, a bitfield of width 4 stores values from 0 to 15. On conversion from the Lisp structure to the foreign structure, the corresponding member must be an integer value in this range, or an error exception is thrown. On conversion from the foreign representation to Lisp, the integer corresponding to the bit pattern is recovered. Bitfields follow the bit order of the underlying storage word. That is to say, the most significant binary digit of the bitfield is the one which is closest to the most significant bit of the underlying storage unit. 
If a four-bit field is placed into an empty storage unit and the value 8 is stored, then on a big-endian machine, this has the effect of setting to 1 the most significant bit of the underlying storage word. On a little-endian machine, it has the effect of setting bit 3 of the word (where bit 0 is the least significant bit). The .code sbit field creates a correspondence between a range of Lisp integers, and a foreign representation based on the two's complement system. The most significant bit of the bitfield functions as a sign bit. Values whose most significant bit is clear are positive, and use a pure binary representation just like their .code ubit counterparts. The representation of negative values is defined by the "two's complement" operation, which maps each value to its additive inverse. The operation consists of temporarily treating the entire bitfield as unsigned, and inverting the logical value of all the bits, and then adding 1 with "wraparound" to zero if 1 is added to a field consisting of all 1 bits. (Thus zero maps to zero, as expected.) An anomaly in the two's complement system is that the most negative value has no positive counterpart. The two's complement operation on the most negative value produces that same value itself. A .code sbit field of width 1 can only store two values: -1 and 0, represented by the bit patterns 1 and 0. An attempt to convert any other integer value to a .code sbit field of width 1 results in an error. A .code sbit field of width 2 can represent the values -2, -1, 0 and 1, which are stored as the bit patterns 10, 11, 00 and 01, respectively. .coNP FFI type @ bit .synb .mets (bit < width << type ) .syne .desc The .code bit operator is more general than .code ubit and .codn sbit . It allows for bitfields based on any integer type up to 64 bits wide. When the character types .code char and .code uchar are used as the basis of bitfields, they convert integer values, not characters. 
In the case of .codn char , the bitfield is signed. All remarks about .code ubit and .code sbit apply to .code bit also. Details about the algorithm by which bitfields are allocated within a structure are given in the paragraph below entitled .BR "Bitfield Allocation Rules" . Under the .code bit operator, the endian types such as .code be-int32 or .code le-int16 may also be used as the basis for bitfields. If .meta type is an endian type, the bitfield is then allocated in the same way that a bitfield of the corresponding ordinary type would be allocated on a target machine which has the byte order of that endian type. When a bitfield member follows a member which has a different byte order, the bitfield is placed into a new allocation cell. This is true even if the previous member has the same alignment. Note: the allocation of bits within a bitfield based on byte storage cells also differs between different endian systems. However, the FFI type system does not offer one byte endian types such as .codn be-uint8 . The workaround is to switch to a wider type. Note: endian bitfields may be used to match the image of a C structure which contains bitfields, without having to conditionally define the FFI struct type differently based on whether the current machine is big or little endian. Conditionally defining a structure for two different byte orders adds verbiage to the program and is highly error-prone, since the bitfields change order within an allocation unit. For instance, on a big endian system, the definition of a structure representing an IPv4 packet might begin like this: .verb (struct ipv4-header (ver (bit 4 uint16)) (ihl (bit 4 uint16)) (dscp (bit 6 uint16)) (ecn (bit 2 uint16)) (len uint16) ...) .brev To port this to a little endian system, the programmer has to recognize that the first pair of fields is packed into one byte, and the next pair of fields into a second byte. 
The bytes stay in the same order, but the pairs are reversed: .verb (struct ipv4-header (ihl (bit 4 uint16)) ;; reversed pair (ver (bit 4 uint16)) (ecn (bit 2 uint16)) ;; reversed pair (dscp (bit 6 uint16)) (len be-uint16) ...) .brev Endian bitfields allow this to be defined naturally. The IPv4 header is based on network byte order, which is big-endian, so big endian types are used. The little endian version above already uses .code be-uint16 for the .meta len field. This just has to be done for the bitfields also: .verb (struct ipv4-header (ver (bit 4 be-uint16)) (ihl (bit 4 be-uint16)) (dscp (bit 6 be-uint16)) (ecn (bit 2 be-uint16)) (len be-uint16) ...) .brev .coNP FFI types @ buf and @ buf-d .synb .mets ({buf | buf-d} << size ) .syne .desc The parametrized .code buf and .code buf-d types are variants of the unparametrized .code buf and .codn buf-d , respectively. The .meta size argument is an expression which is evaluated in the top-level environment, and must produce a nonnegative integer. Because they have a size, these types have useful get semantics. The get semantics of .code buf-d is that a Lisp object of type .code buf is created which takes direct ownership of the memory. The get semantics of .code buf is that a Lisp object is created using a dynamically allocated copy of the memory. .coNP FFI type @ carray .synb .mets (carray << type ) .syne .desc The .code carray type corresponds to a C pointer, in connection with the concept of representing a variable length array that is passed and returned as a pointer to the base element. On the Lisp side, the .code carray FFI type corresponds to the .code carray Lisp type. The .code carray Lisp type is similar to .codn cptr , but supports array indexing operations, and some other features. It can be regarded as a semantic cross between .code cptr and .codn buf . 
The get semantics of .code carray is simply that a pointer is retrieved from memory and converted to a freshly allocated .code carray object which holds that pointer, and is marked as having an unknown size. No copy is made of the underlying array. When the application determines the size of the array, it can inform that object by calling the .code carray-set-length function. The put semantics of the .code carray FFI type is simply to write, into the argument space, the pointer which the object holds. The object must be a .code carray whose element type matches that of the FFI type. The .code carray type has in semantics. When a .code carray is passed to a foreign function as an argument to a .code ptr or .code ptr-out parameter to either a .code carray or .code cptr type, what is passed to the function is a pointer to the .codn carray 's pointer. The foreign function may update this pointer to a new value, and this value is stored back into the .code carray object. The array's length is reset to zero. If it is an owned .codn carray , arranged by .codn carray-own , then the current array is freed before the new pointer is assigned, and the object's type is reset to borrowed array. The .code carray object must not be a memory-mapped .code carray coming from the .code mmap function. The .code carray type lacks out semantics, since Lisp code cannot change its address; so there is no new pointer to propagate back to a foreign caller which passes a .code carray to a Lisp callback, and no other memory management tasks to perform. The .code carray type is particularly useful in situations when foreign code generates such an array, and the size of that array isn't known from the object itself. It is also useful, instead of a variable-length .codn zarray , for passing a dynamic array to foreign code in situations when the application benefits from managing the memory for the array. 
The variable-length .code zarray FFI type's disadvantage relative to .code carray is that the .code zarray converts an entire Lisp sequence to a temporarily allocated array, which is used only for one call. By contrast, the .code carray object holds the C representation which Lisp code can manipulate; and that representation is passed directly, just like in the case of .codn buf . Unlike .codn buf , there is no dynamic variant of .codn carray . The transfer of ownership of a .code carray requires the use of explicit operations like .code carray-free and .codn carray-own . It is possible to create a .code carray view over a buffer, using .codn carray-buf . Lastly, the .code carray type is the basis for the \*(TL .code mmap function, which is documented in the section .BR "Unix Memory Mapping" . .coNP FFI type @ cptr .synb .mets (cptr << type-sym ) .syne .desc The parametrized .code cptr type is similar to the unparametrized .codn cptr . It also converts between Lisp objects of type .code cptr and foreign pointers. Unlike the unparametrized type, it provides a measure of type safety, and also supports the conversion of .code carray objects. When a foreign pointer is converted to a Lisp object under control of the parametrized .codn cptr , the resulting Lisp .code cptr object is tagged with the .meta type-sym symbol. In the reverse direction, when a Lisp .code cptr object is converted to the parametrized type, its type tag must match .metn type-sym , or else the conversion fails with an error exception. This rule contains a slight relaxation: a .code cptr object with a .code nil tag can be converted to a foreign representation using any parametrized type, if its value is null. In other situations, the .code cptr-cast function must be used to coerce the pointer object to the matching type. Note that if .meta type-sym is specified as .codn nil , then this is precisely equivalent to the unparametrized .code cptr which doesn't provide the above safety measure. 
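
The following is a minimal sketch of how type tags might be applied; it is
not a definitive binding. It assumes that the C library functions
.code fopen
and
.code fclose
are visible in the running program image, which is why
.code nil
is given as the library argument of
.codn with-dyn-lib .
The tag symbol
.code FILE
is an arbitrary choice made for this illustration:

.verb
  ;; Sketch under stated assumptions: pointers returned by fopen
  ;; are tagged with the symbol FILE, and fclose accepts only
  ;; pointers carrying that tag.
  (with-dyn-lib nil
    (deffi fopen "fopen" (cptr FILE) (str str))
    (deffi fclose "fclose" int ((cptr FILE))))
.brev

Under these declarations, a
.code cptr
object carrying some other tag is rejected with an error exception when it is
passed to
.codn fclose ,
the one exception being a null pointer which has a
.code nil
tag.
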
A .code carray object may also be converted to a foreign pointer under the control of a parametrized .code cptr type. The .code carray object's internal pointer becomes the foreign pointer value. The conversion is only permitted if the following two restrictions are met; otherwise an error exception is thrown. Firstly, the .meta type-sym of the .code cptr type must be the name of an FFI type, at the time when the .code cptr type expression is processed, otherwise the .code cptr is not associated with a type. Secondly, the .code carray object being converted must have an element type which matches the FFI type denoted by the .code cptr object's .metn type-sym . Pointer type safety is useful, because FFI can be used to create bindings to large application programming interfaces (APIs) in which objects of many different kinds are referenced using pointer handles. The erroneous situation can occur that a FFI call passes a handle of one kind to a function expecting a different kind of handle. If all pointer handles are represented by a single .code cptr type, then such a situation proceeds without diagnosis. If handles of different types are all mapped to .code cptr types with different tags, the situation is intercepted and diagnosed with an error exception. .coNP FFI types @ align and @ pack .synb .mets (align <> [ width ] << type ) .mets (pack <> [ width ] << type ) .syne .desc The FFI type operators .code align and .code pack define a type which is a copy of .metn type , but with adjusted alignment requirements. In some cases, .code pack (but not .codn align ) works by replacing itself with a transformed version of the .code type syntax. If the .meta width argument is present, it is an expression which is evaluated in the top-level environment. It must produce a positive integer which is a power of two. If .meta width is absent, a different default value is used depending on which type operator is specified. 
For .codn align , it defaults to some platform-specific maximum useful alignment value, typically 16. For .codn pack , a missing .meta width defaults to 1. The .code align operator can be used to create a version of .meta type which is aligned at least as strictly as the specified .metn width . That is to say, values of .meta width which are less than or equal to .metn type 's existing alignment have no effect on alignment, except when the type is used as a bitfield. The .code pack operator can be used to create a version of .meta type which is less strictly aligned than its existing alignment. Alignment affects the placement of the type as a structure member, and as an array element. A type with alignment 1, like the default alignment for .codn pack , can be placed at any byte offset, and thus is effectively unaligned. A type with alignment 2 can be placed only at even addresses and offsets. Alignment can be applied to all types, including arrays and structs. It may also be applied to bitfields, but special considerations have to be observed to obtain the intended effect, described below. However, out of the elementary types, only the integer and floating point types are required to support a weakening of alignment. Whether a type which corresponds to a pointer, such as a .code str or .codn buf , can be written at an offset which doesn't meet that type's default alignment is machine-dependent. If a FFI struct type is declared with a weakened alignment, whether or not such a structure can be read or written at the misaligned offsets depends on whether the individual members support it. If they are integer or floating-point types, or aggregates thereof, the usage is supported in a machine-independent manner. Alignment interacts with the allocation of bitfields in special ways. If .meta width is greater than 1, or regardless of .meta width if the operator is .codn align , the type is marked with a Boolean attribute indicating that it has altered alignment. 
Then, when a bitfield is based on a type which has altered alignment, that bitfield isn't packed together with the previous field, even if the allocation rules otherwise call for it. Due to the alignment request, the byte offset is first adjusted according to the requested alignment and the bit offset is reset to zero. The bit field is then allocated at the new alignment. This requirement applies even if the requested alignment is 1, which is possible via a combination of both .code pack and .codn align , both specified with a .meta width of 1. If the requested alignment for the type of a bitfield is 1, and the previous member is a bitfield which has left a byte partially filled, then the new bitfield starts on a fresh byte, even if it would otherwise be packed with the previous bitfield. If a named bitfield has weakened alignment, other than one byte alignment produced by .codn pack , the bitfield's original type's alignment is used for the purposes of determining its contribution to the alignment of the structure. When .meta type is one of two kinds of types, the .code pack type operator exhibits special behaviors, as follows. In these situations, the .code pack operator has no semantics other than these behaviors. .RS .IP 1. When .meta type is .code struct or .code union syntax which defines at least one member, then the .code pack operator performs the following syntactic transformation: each member of .meta type is edited, by specifying a .code pack operator around its type, with the given .metn width . The surrounding .code pack operator is deleted. The effect is that .code pack is applied not to the struct or union type itself, but to its members. For example .code "(pack (struct s (x int) (y double)))" is transformed into .codn "(struct s (x (pack 1 int)) (y (pack 1 double)))" . The 1 comes from the defaulting of .metn width . 
The rationale for this behavior is that alignment weakening is often required for all members of a structure, rather than select members. Moreover, specifying weak alignment for a structure type itself, while leaving members with strict alignments, rarely makes sense. Weakening the alignment of a structure will not eliminate the padding between the members or at the end; it will only have any useful effect when that structure is itself used as the member of another structure. An important rationale also is that the GNU C .code packed attribute works this way, and so C structure declarations using that attribute are easier to translate to the \*(TL FFI type system. Deriving a less strictly aligned version of a structure or union type without any effect on the alignment of its members may be obtained by applying the .code pack operator to either a .code typedef name for a structure or union type, or else to syntax which refers to an existing type without defining members. Given the definition .codn "(typedef s (struct s (a int) (b char)))" , the type .code s is an eight byte structure with three bytes of padding at the end, which has four byte alignment. The type expression .code "(pack s)" produces a version of this type which has one byte alignment. The expression .codn "(pack (struct s))" , likewise. The resulting unaligned type is still eight bytes wide, and has three padding bytes. In other words, the .code pack operator does not transform the syntax of a structure which is already defined as an object. .IP 2. When .meta type is an .code align operation, then .code pack transforms the syntax as follows: the .code pack operator surrounding the .code align expression is removed, and introduced around the type expression that is .codn align 's own operand. Thus .code "(pack 2 (align 16 int))" is transformed into .codn "(align 16 (pack 2 int))" . 
The rationale for this transformation is that when both .code align and .code pack are applied to a type, the combination only makes sense when .code pack is first. For a non-structure type like .codn int , .code "(pack x (align y int))" is equivalent to just .codn "(pack x int)" , because .code pack will set the alignment to .code x regardless of the effect of .codn align . Whereas .code "(align y (pack x int))" is meaningful in that the .code align takes precedence over .code pack if .codn "(> y x)" . The main rationale is that .code pack may be applied to structure members via a code transformation. Those members may already have types which use .codn align . This transformation ensures that the semantics is applied in a useful order. For example .code "(pack (struct s (a char) (x (align 2 int))))" is first transformed into .codn "(struct s (a (pack 1 char)) (x (pack 1 (align 2 int))))" . If this is left as-is, then the .code align on .code x is obliterated by the .codn pack , rendering it useless. A further transformation takes place to .codn "(struct s (a (pack 1 char)) (x (align 2 (pack 1 int))))" . Now the .code align directive is increasing the alignment of .code x to 2, so that .code x will be placed at offset 2, leaving one byte of padding after the .code a member. This is how attributes work in GNU C also: the .code aligned attribute on the member of a packed structure can take precedence and increase its alignment. .RE .IP After these transformations are applied, the nested .code pack forms which occur in the transformed syntax may perform more such transformations, depending on their operands. Note that the two-argument form of .code pack with a .meta width value greater than 1 doesn't directly correspond to any single attribute specifier in GNU C. The GNU C .code packed attribute is Boolean, implicitly reducing alignment to 1. 
A combination of the GNU C attributes .code aligned and .code packed is used to produce the effect of .mono .meti (pack < n << type ) .onom for values of .meta n > 1. In GNU C, the .code packed attribute, when applied to a structure, distributes to its members, but isn't capable of distributing an alignment exceeding 1. So the .mono .meti (pack < n (struct ...)) .onom expression, for values of .meta n > 1, doesn't correspond to anything in GNU C; its effect can be simulated by attributing the structure type with .codn packed , and then individually applying the required alignment to the member declarations. .SS* Additional Types .coNP FFI types @, size-t @, ptrdiff-t @, int-ptr-t @, uint-ptr-t @, intmax-t @, uintmax-t @, wint-t @, sig-atomic-t @ time-t and @ clock-t .desc These additional FFI types for common C language types are provided as .code typedef aliases. The .code intmax-t and .code uintmax-t types are provided only if the host platform's .code intmax_t is no wider than 64 bits. If the host platform lacks .code intmax_t then the above two FFI types are defined as aliases for .code longlong and .codn ulonglong , respectively. .coNP FFI type @ qref .synb .mets (qref < struct-type < member1 >> [ member2 ...]) .syne .desc The FFI type operator .code qref provides a way to reference the type of a member of a struct or union. The .meta struct-type argument must be a type expression denoting a struct or union. The .meta member1 argument and any additional arguments must be symbols. If .code S is a struct or union type, and .code M is a member, then .code "(qref S M)" is a type expression denoting the type of .codn M . Moreover, if .code M itself is a struct or union, which has a member named .code N then the type of .code N can be denoted by the expression .codn "(qref S M N)" . Similarly, additional symbols reference through additional struct/union nestings. Note: the referencing dot syntax can be used to write .code qref expressions. 
For instance, .code "(qref S M N)" can be written as .code S.M.N instead. .coNP FFI type @ elemtype .synb .mets (elemtype << type ) .syne .desc The FFI type operator .code elemtype denotes the element type of .metn type , which must be a pointer, array or enum. Note: there is also a macro .codn elemtype . The macro expression .code "(elemtype X)" is equivalent to the expression .codn "(ffi (elemtype X))" . .coNP FFI types @, blkcnt-t @, blksize-t @, clockid-t @, dev-t @, fsblkcnt-t @, fsfilcnt-t @, gid-t @, id-t @, ino-t @, key-t @, loff-t @, mode-t @, nlink-t @, off-t @, pid-t @, ssize-t @ uid-t and @ socklen-t The additional names of various common POSIX types may also be available, depending on platform. They are provided as .code typedef aliases. .SS* Endian Types In addition to the type system described in the previous section, the FFI type system supports .IR "endian types" , which are useful for dealing with data formats defined by networking protocols and other kinds of standards, or data structure definitions from other machines. There are two kinds of .IR endianness : .I "Little endian" refers to the least-significant byte of a data type being stored at the lowest address in memory, lowest offset in a buffer, lowest offset in a file, or earlier byte in a communication stream. .I "Big endian" is the opposite: it refers to the most significant byte occurring at the lowest address, offset or stream position. For each of the signed integral types .code int16 through .codn int64 , the corresponding unsigned types .code uint16 through .codn uint64 , and the two floating-point types .code float and .codn double , the FFI type system provides a big-endian and little-endian version, whose names are derived by adding the .code be- or .code le- prefix to the name of the related type. 
Thus, the exhaustive list of the endian types is: .codn be-int16 , .codn be-uint16 , .codn be-int32 , .codn be-uint32 , .codn be-int64 , .codn be-uint64 , .codn be-float , .codn be-double , .codn le-int16 , .codn le-uint16 , .codn le-int32 , .codn le-uint32 , .codn le-int64 , .codn le-uint64 , .code le-float and .codn le-double . These types have the same size and alignment as their plain, unprefixed counterparts. Alignment can be overridden with the .code align type construction operator to create versions of these types with alternative alignment. Endian types are supported as arguments to functions, return values, members of structs and elements of arrays. \*(TL's FFI performs the automatic conversion from the abstract Lisp integer representation to the foreign representations exhibiting the specified endianness. .SS* Incomplete Types In the \*(TL FFI type system, the following types are .IR incomplete : the type .codn void , arrays of unspecified size, and any .code struct whose last element is of incomplete type. An incomplete type cannot be used as a function parameter type, or a return value type. It may not be used as an array element or union member type. A struct member type may be incomplete only if it is the last member. An incomplete structure whose last member is an array is a .IR "flexible structure" . .SS* Flexible Structures If a FFI .code struct type is defined with an incomplete array (an array of unspecified size) as its last member, then it specifies an incomplete type known as a .IR "flexible structure" . That array is the .IR "terminating array" . The terminating array corresponds to a slot in the Lisp structure; that slot is the .IR "last slot" . A structure which has a flexible structure as its last member is also, effectively, a flexible structure. 
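
As an illustration, a flexible structure might be declared along the
following lines. This is only a sketch: the type name
.code record
and its member names are hypothetical, and the dimensionless form of the
.code array
operator is used to denote the array of unspecified size:

.verb
  ;; Hypothetical length-prefixed record. The last member is an
  ;; array of unspecified size, which makes the struct flexible.
  (typedef record (struct record
                    (len uint16)
                    (data (array uchar))))
.brev

Here, the
.code data
member is the terminating array, and the
.code data
slot of the corresponding Lisp structure is the last slot.
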
When a Lisp structure is being converted to the foreign representation under the control of a flexible structure FFI type, the number of elements in the terminating array is determined from the length of the object stored in the last slot of the Lisp structure. The length includes the terminating null element for .code zarray types. The conversion is consistent with the semantics of an incomplete array that is not a structure member. In the reverse direction, when a foreign representation is being converted to a Lisp structure under the control of a flexible structure FFI type, the size of the array that is accessed and extracted is determined from the length of the object stored in the last slot, or, if the array type is a .code zarray from detecting null-termination of the foreign array. The conversion of the array itself is consistent with the semantics of an incomplete array that is not a structure member. Before the conversion takes place, all of the members of the structure prior to the terminating array, are extracted and converted to Lisp representations. The corresponding slots of the Lisp structure are updated. Then if the Lisp structure type has a .code length method, that method is invoked. The return value of the method is used to perform an adjustment on the object in the last slot. If the existing object in the last slot is a vector, its length is adjusted to the value returned by the method. If the existing object isn't a vector, then it is replaced by a new .codn nil -filled vector, whose length is given by the return value of .codn length . The conversion of the terminating array to Lisp representation then proceeds after this adjustment, using the adjusted last slot object. .SS* Bitfield Allocation Rules The \*(TL FFI type system follows rules for bitfield allocation which were experimentally derived from the behavior of the GNU C compiler on several mainstream architectures. 
The allocation algorithm can be imagined to walk through the structure from the first member to the last, maintaining a byte offset .I O which indicates how many whole bytes have been allocated to members so far, and a bit offset .I B which indicates, additionally, how many bits have been allocated in the byte which follows these .I O bytes, between 0 and 7. When a non-bitfield member is placed, then there are two cases: either .I B is zero (only .I O bytes have been allocated, with no fractional byte) or else .I B is nonzero. In this latter case, .I B is reset to zero and .I O is incremented by one. In either case, .I O is adjusted up to the required alignment boundary for the new member. The member is placed, and .I O is incremented again by the size of that member. When a bitfield member is placed, the algorithm considers the structure to be allocated in units of the base type of that bitfield member. For instance if the bitfield is derived from type .code uint16 then the structure's layout is considered to have been allocated in .code uint16 units. The algorithm examines the value of .I O and .I B to determine the first available unit in which at least one bit of unallocated space remains. Then, if the unit at that offset has enough space to hold the new bitfield, according to the bitfield's width, then the bitfield is placed into that unit. Otherwise, the bitfield is placed into the next available unit. After a bitfield is placed, the values of .I O and .I B are adjusted so that .I O reflects the whole number of bytes which have been allocated to the structure so far, and .I B indicates the 0 to 7 additional bits of any bitfield material protruding past those whole bytes. A zero-width bitfield is also considered with regard to the storage unit size indicated by its type. As in the case of the nonzero-width bitfield, the offset of the first available unit is found which has at least one bit of unallocated space. 
Then, if that unit is entirely empty, the zero-width bitfield has no effect. If that unit is partially filled, then .I O is adjusted to point to the next unit after that, and .I B is reset to zero. Note that according to this semantics, a zero-width bitfield can have an effect even if placed between non-bitfield members, or if it appears as the last member of a structure. Also, a structure containing only a zero-width bitfield has size zero. If, after the placement of all structure members, .I B has a nonzero value, then the offset .I O is incremented by one to cover that byte. As the last allocation step, the size of the structure is then padded up to a size which is a multiple of the alignment of the most strictly aligned member. A named bitfield contributes to the alignment of the structure, according to its type, the same way as a non-bitfield member of the same type. An unnamed bitfield doesn't contribute alignment, or else may be regarded as having the weakest possible alignment, which is byte alignment. If all of the members of a structure are unnamed bitfield members of any type, it exhibits byte alignment. The description isn't complete without a treatment of byte and bit order. Bitfield allocation follows an imaginary "bit endianness" whose direction follows the machine's byte order: most-significant bits are allocated first on big endian, least significant bits first on little endian. If a one-bit-wide bitfield is allocated into a hitherto empty structure, it will be placed into the first byte of that structure, regardless of the machine's endianness, and regardless of the underlying storage unit size for that bitfield. Within that first byte, it will be placed into the most significant bit position on a big-endian machine (bit 7); and on a little-endian machine, it will be placed into the least significant bit position (bit 0). If another one-bit-wide bitfield is allocated, it is placed into bit 6 on big endian, and bit 1 on little endian.
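The walk over the offsets O and B described in this section can be sketched in C as follows. This is an illustrative model written from the prose above, not the FFI's actual implementation. The names layout, place_member, place_bitfield and layout_finish are hypothetical, and bit order within a unit is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Allocation state: O whole bytes plus B extra bits (0 to 7). */
struct layout {
  size_t O;          /* whole bytes allocated so far */
  size_t B;          /* bits used in the byte following those O bytes */
  size_t max_align;  /* strictest alignment seen; initialize to 1 */
};

static size_t round_up(size_t n, size_t align)
{
  return (n + align - 1) / align * align;
}

/* Place a non-bitfield member; returns its byte offset. */
size_t place_member(struct layout *l, size_t size, size_t align)
{
  size_t off;

  if (l->B != 0) {            /* abandon the partially filled byte */
    l->B = 0;
    l->O++;
  }
  l->O = round_up(l->O, align);
  off = l->O;
  l->O += size;
  if (align > l->max_align)
    l->max_align = align;
  return off;
}

/* Place a bitfield whose base type is unit bytes wide with alignment align.
   A width of zero only closes a partially filled unit. Only named
   bitfields contribute alignment. Returns the starting bit offset. */
size_t place_bitfield(struct layout *l, size_t unit, size_t align,
                      size_t width, int named)
{
  size_t unit_bits = unit * 8;
  size_t bits = l->O * 8 + l->B;   /* total bits allocated so far */
  size_t used = bits % unit_bits;  /* bits taken in the first available unit */
  size_t start;

  assert(width <= unit_bits);

  if (width == 0) {
    if (used != 0)                 /* partially filled: move to next unit */
      bits += unit_bits - used;
  } else if (width > unit_bits - used) {
    bits += unit_bits - used;      /* not enough room: use the next unit */
  }

  start = bits;
  bits += width;
  l->O = bits / 8;
  l->B = bits % 8;
  if (named && align > l->max_align)
    l->max_align = align;
  return start;
}

/* Final step: cover a fractional byte, then pad to the strictest alignment.
   Returns the structure size in bytes. */
size_t layout_finish(struct layout *l)
{
  if (l->B != 0) {
    l->B = 0;
    l->O++;
  }
  return round_up(l->O, l->max_align);
}
```

Under this model, a one-byte member followed by a 12-bit and a 6-bit bitfield of a two-byte base type, then a four-byte member aligned to 4, places the bitfields at bit offsets 16 and 32, the last member at byte offset 8, and yields a size of 12.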
More generally, whenever a bitfield is allocated, and the storage unit is determined into which that bitfield shall be placed, the most significant bits of that storage unit are filled first on a big-endian machine, whereas the least significant bits are filled first on a little-endian machine. From this it follows that on either type of machine, that field shall be placed at the lowest-addressed byte or bytes in which unallocated bits remain. .SS* Returning Scalar Objects by Pointer There are situations in which a foreign function takes the address of a storage location, and writes a new value into that location. Informally, this is referred to as an "out parameter" or "in-out parameter", in the case of bidirectional data transfer. In the C language, the familiar pattern looks like this: .verb void function(int *ptr); int val = 0; function(&val); .brev In the case of an aggregate type, such as a structure, being an in-out or out parameter, this pattern is easily handled in FFI because the corresponding Lisp object is also an aggregate, and therefore has reference semantics: it can be updated to receive the new value. In the case of a scalar, however, such as .code int in the above example, this is not possible. A Lisp integer doesn't have the referential semantics required to receive a new value by pointer, and there is no "address-of" concept to create a reference to its location. To understand the following FFI trick, it helps to first rework the C into a different form: .verb void abc_function(int *ptr); int val[1] = { 0 }; abc_function(val); .brev Instead of a scalar value, we can declare an array of 1 element of that same type, and pass the array (which converts into a pointer to that element). This approach inspires a similar trick in the FFI domain: .verb (with-dyn-lib (...)
(deffi abc-function "abc_function" void ((ptr (array 1 int))))) (let ((val (vec 0))) (abc-function val) ;; [val 0] has updated value coming from function ) .brev We define the parameter of .code abc-function as being a pointer to an array of 1 .code int rather than an int, and then pass a vector as the argument. If the parameter is in-out, then the vector must be constructed or initialized to contain a value that will convert to the C type. If the parameter is out only, then the FFI definition can use .code ptr-out and the vector can contain the .code nil value. .SS* FFI Call Descriptors The FFI mechanism makes use of a type-like representation called the "call descriptor". A call descriptor is an object which uses FFI types to describe function arguments and return values. A FFI descriptor is required to call a foreign function, and to create a FFI closure to use as a callback function from a foreign function back into \*(TL. A FFI descriptor object can be constructed from a return value type, and a list of argument types, and several other pieces of information using the function .codn ffi-make-call-desc . This object can then be passed to .code ffi-call to specify the C type signature of a foreign function, or to .code ffi-make-closure to specify the C type signature of a FFI closure to bind to a Lisp function. The FFI macros .code deffi and .code deffi-cb provide a simplified syntax for expressing FFI call descriptors, which includes a notation for expressing variadic calls. A note about variadic foreign functions: although there is support in the call descriptor mechanism for expressing a variadic function, it expresses a particular .B instance of a variadic function, rather than the variadic function's type per se. To call the same variadic function using different variadic arguments, different call descriptors are required. For instance to perform the equivalent of the C function call .mono printf("hello\en") .onom requires a certain descriptor.
To perform the equivalent of .mono printf("hello, %s\en", name) .onom requires a different descriptor. .SS* Foreign Function Type API This group of functions comprises the basic interface to \*(TL's FFI type system module. .coNP Function @ ffi-type-compile .synb .mets (ffi-type-compile << syntax ) .syne .desc The .code ffi-type-compile function produces and returns a compiled type object from a .meta syntax argument which specifies valid FFI syntax. If the type syntax is invalid, or specifies a nonexistent type specifier or operator, an exception is thrown. Note: whenever a function argument is required to be of FFI type, what it means is that it must be a compiled object, and not a Lisp expression denoting FFI syntax. .TP* Examples: .verb (ffi-type-compile 'int) -> # (ffi-type-compile '(array 3 double)) -> # (ffi-type-compile 'blarg) -> ;; error .brev .coNP Function @ ffi-make-call-desc .synb .mets (ffi-make-call-desc < ntotal < nfixed < rettype .mets \ \ < argtypes <> [ name ]) .syne .desc The .code ffi-make-call-desc function constructs a FFI call descriptor. The .meta ntotal argument must be a nonnegative integer; it indicates the number of arguments in the call. If the call denotes a variadic function, the .meta nfixed argument must be an integer at least 1 and less than .metn ntotal , denoting the number of fixed arguments. If the call denotes an ordinary, non-variadic function, then .meta nfixed must either be specified as .code nil or else equal to the .meta ntotal argument. The .meta rettype parameter must be an FFI type. It specifies the function return type. Functions which don't return a value are specified by the (compiled version of) the return type .codn void . The .meta argtypes argument must be a list of types, containing at least .meta ntotal elements. If the function takes no arguments, this list is empty.
If the function is variadic, then the first .meta nfixed elements of this list specify the types of the fixed arguments; the remaining elements specify the variadic arguments. The .meta name argument gives the name of the function for which this description is intended, or some other identifying symbol. This symbol is used in diagnostic messages related to errors in the construction of the descriptor itself or its subsequent use. If this parameter is omitted, then the involved FFI functions use their own names in reporting diagnostics. Note: variadic functions must not be called using a non-variadic descriptor, and vice versa, even if the return types and argument types match. Note: unlike the .code deffi and .code deffi-cb macros, the .code ffi-make-call-desc function doesn't perform any special treatment of variadic parameter types. When any of the types .codn float , .code be-float or .code le-float occur in the variadic portion of .metn argtypes , it is unspecified whether a descriptor is successfully produced and returned or whether an exception is thrown. If a descriptor is successfully produced, and then subsequently used for making or accepting calls, the behavior is undefined. .TP* Example: .verb ;; ;; describe a call to the variadic function ;; ;; type void (*)(char *, ...) ;; ;; with these actual arguments ;; ;; (char *, int) ;; (ffi-make-call-desc 2 ;; two arguments 1 ;; one fixed (ffi-type-compile 'void) ;; returns nothing (list (ffi-type-compile 'str) ;; str -> char * (ffi-type-compile 'int))) ;; int --> # (# #)> .brev .coNP Function @ ffi-type-operator-p .synb .mets (ffi-type-operator-p << symbol ) .syne .desc The .code ffi-type-operator-p function returns .code t if .meta symbol is a type operator symbol: a symbol used in the first position of a recognized compound type form in the FFI type system. Otherwise, it returns .codn nil .
.coNP Function @ ffi-type-p .synb .mets (ffi-type-p << symbol ) .syne .desc The .code ffi-type-p function returns .code t if .meta symbol denotes a type in the FFI type system: either a built-in type or an alias type name established by .codn typedef . Otherwise, it returns .codn nil . .coNP Function @ ffi-make-closure .synb .mets (ffi-make-closure < lisp-fun < call-desc .mets \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ >> [ safe-p <> [ abort-val ]]) .syne .desc The .code ffi-make-closure function binds a Lisp function .metn lisp-fun , which may be a lexical closure, or any callable object, with a FFI call descriptor .meta call-desc to produce a FFI closure. A FFI closure is an object of type .code ffi-closure which is suitable as an argument for the type denoted by the .code closure type specifier keyword in the FFI type language. This type appears as a C function pointer in the foreign code, and may be called as such. When it is called by foreign code, it triggers a call to .metn lisp-fun . The optional .meta safe-p parameter controls whether the closure dispatch is "safe", the meaning of which is described shortly. The default value is .code t so that unsafe closure dispatch must be explicitly requested with a .code nil argument for this parameter. A callback closure which is safely dispatched, firstly, does not permit the capture of delimited continuations across foreign code. Delimited continuations can be captured inside a closure dispatched that way, but the delimiting prompt must be within the callback's local stack frame, without traversing across the foreign stack frames. Secondly, a callback closure which is safely dispatched doesn't permit direct nonlocal control transfers across foreign code, such as exception handling. Such transfers, however, appear to work anyway (with caveats): this is because they are specially handled.
The closure dispatch mechanism intercepts all dynamic control transfers, converts them to an ordinary return from the callback to the foreign code, and resumes the control transfer when the foreign code itself finishes and returns. If the callback returns a value (its return type is other than .codn void ) then in this situation, the callback returns an all-zero-bits return value to the foreign caller. If the .meta abort-val parameter is specified and its value is other than .codn nil , then that value will be used as the return value instead of an all-zero bit pattern. An unsafely dispatched closure permits the capture of continuations from the callback across the foreign code and direct dynamic control transfers which abandon the foreign stack frames. Unsafe closure dispatch is only compatible with foreign code which is designed with that usage in mind. For instance foreign code which holds dynamic resources in stack variables will leak those resources if abandoned this way. There are also issues with capturing continuations across foreign code. Note: the C function pointer is called a "closure" because it carries environment information. For instance, if .code lisp-fun is a lexical closure, invocations of it through the FFI closure occur in its proper lexical environment, even though its external representation is a simple C function pointer. This requires a special trampoline trick: a piece of dynamically constructed machine code with the closure binding embedded inside it, with the C function pointer pointing to the machine code. Note: the same call descriptor can be reused multiple times to create different closures. The same Lisp function can be involved in multiple FFI closures. 
.TP* Example: .verb ;; Package the TXR cmp-str function as a string ;; comparison callback compatible with: ;; ;; int (*)(const char *, const char *) ;; (ffi-make-closure (fun cmp-str) (ffi-make-call-desc 2 nil ;; two args, non-variadic (ffi-type-compile 'int) ;; int return [mapcar ffi-type-compile '(str str)])) ;; args .brev .coNP Function @ ffi-call .synb .mets (ffi-call < fun-cptr < call-desc <> { arg }*) .syne .desc The .code ffi-call function invokes a foreign function. The .meta fun-cptr argument must be a .code cptr object. It is assumed to point to a foreign function. The .meta call-desc argument must be a FFI call descriptor, produced by .codn ffi-make-call-desc . The .meta call-desc must correctly describe the foreign function. The zero or more .meta arg arguments are values which are converted into foreign argument values. There must be exactly as many of these arguments as are required by .metn call-desc . The .code ffi-call function converts every .meta arg to a corresponding foreign object. If these conversions are successful, the converted foreign arguments are passed by value to the foreign function indicated by .metn fun-cptr . An unsuccessful conversion throws an error. When the call returns, the foreign function's return value is converted to a Lisp object and returned, in accordance with the return type that is declared inside .metn call-desc . .coNP Function @ ffi-typedef .synb .mets (ffi-typedef < name << type ) .syne .desc The .code ffi-typedef function installs the compiled FFI type given by .meta type as a typedef name under the symbol given by .metn name . After this registration, whenever the type compiler encounters that symbol being used as a type specifier, it will replace it by the type object it represents. The .code ffi-typedef function returns .metn type .
.TP* Example: .verb ;; define refcount-t as an alias for uint32 (ffi-typedef 'refcount-t (ffi-type-compile 'uint32)) .brev .coNP Function @ ffi-size .synb .mets (ffi-size << type ) .syne .desc The .code ffi-size function returns an integer which gives the storage size of the given FFI type: the amount of storage required for the external representation of that type. Bitfield types do not have a size; it is an error to apply this function to a bitfield. The size is machine-specific. .TP* Example: .verb (ffi-size (ffi-type-compile 'double)) -> 8 (ffi-size (ffi-type-compile 'char)) -> 1 (ffi-size (ffi-type-compile '(array 42 char))) -> 42 .brev .coNP Function @ ffi-alignof .synb .mets (ffi-alignof << type ) .syne .desc The .code ffi-alignof function returns an integer which gives the alignment of the given FFI type. When an instance of .meta type is placed into a structure as a member, it is placed after the previous member at the smallest available offset which is divisible by the alignment. The bytes skipped from the smallest available offset to the smallest available aligned offset are referred to as .IR padding . Bitfield types do not have an alignment; it is an error to apply this function to a bitfield. Bitfields are allocated in storage cells, and those cells have alignment which is the same as that of the type .codn int . The alignment is machine-specific. It may be more strict than what the hardware architecture requires, yet at the same time be smaller than the size of the type. For instance, the size of the type .code double is commonly 8, yet the alignment is often 4, and this is so even on processors like Intel x86 which can load and store a double at a misaligned address. The alignment of an array is the same as that of its element type. The alignment of a structure is that of its member which has the most strict (largest-valued) alignment.
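The placement rule for members, and the resulting padding, can be illustrated on the C side. The following is a small sketch under the assumption of a C11 compiler (for _Alignof). The names pair, aligned_offset and padding_before are hypothetical and serve only this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A struct in which the int member must skip padding after the char. */
struct pair {
  char c;
  int i;
};

/* Smallest offset not less than avail which is divisible by align. */
size_t aligned_offset(size_t avail, size_t align)
{
  assert(align > 0);
  return (avail + align - 1) / align * align;
}

/* Number of padding bytes skipped before a member of the given alignment. */
size_t padding_before(size_t avail, size_t align)
{
  return aligned_offset(avail, align) - avail;
}
```

With these helpers, offsetof(struct pair, i) is expected to equal aligned_offset(sizeof(char), _Alignof(int)) on mainstream C implementations, and the padding between the two members is padding_before(1, _Alignof(int)) bytes.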
It is a property of arrays, derived from requirements governing the C language, that if the first element of an array is at a correctly aligned address, then all elements are. To ensure that this property holds for arrays of structures, structures sometimes must include padding at the end. This is because the size of a structure without any padding might not be a multiple of its alignment, which is derived from the most strictly aligned member. For instance, if we assume an architecture on which the size and alignment of .code int is 4, the size of the structure type .code "(struct ab (a int) (b char))" would be 5 if no padding were included. However, in an array of these structures, the second element's .code a member would be placed at offset 5, rendering it misaligned. To ensure that every .code a is placed at an offset which is a multiple of 4, the struct type is extended with anonymous padding so that its size is 8. .TP* Example: .verb (ffi-alignof (ffi double)) -> 4 .brev .coNP Function @ ffi-offsetof .synb .mets (ffi-offsetof < type << member ) .syne .desc The .code ffi-offsetof function calculates the byte offset of .meta member within the FFI type .metn type . If .meta type isn't a FFI struct type, or if .meta member isn't a symbol naming a member of that type, the function throws an exception. An exception is also thrown if .meta member is a bitfield. .TP* Example: .verb (ffi-offsetof (ffi (struct ab (a int) (b char))) 'b) -> 4 .brev .coNP Function @ ffi-arraysize .synb .mets (ffi-arraysize << type ) .syne .desc The .code ffi-arraysize function reports the number of elements in .metn type , which must be an array type: an .codn array , .code zarray or .codn carray .
.TP* Example: .verb (ffi-arraysize (ffi (array 5 int))) -> 5 .brev .coNP Function @ ffi-elemsize .synb .mets (ffi-elemsize << type ) .syne .desc The .code ffi-elemsize function reports the size of the element type of an array, of the target type of a pointer, or of the base integer type of an enumeration. The .meta type argument must be an array, pointer or enumeration type: a type constructed by one of the operators .codn array , .codn zarray , .codn carray , .codn ptr , .codn ptr-in , .codn ptr-out , .code enum or .codn enumed . .TP* Example: .verb (ffi-elemsize (ffi (array 5 int))) -> 4 ;; (sizeof int) .brev .coNP Function @ ffi-elemtype .synb .mets (ffi-elemtype << type ) .syne .desc The .code ffi-elemtype function retrieves the element type of an array type, target type of a pointer type, or base integer type of an enumeration. The .meta type argument must be an array, pointer or enumeration type: a type constructed by one of the operators .codn array , .codn zarray , .codn carray , .codn ptr , .codn ptr-in , .codn ptr-out , .code enum or .codn enumed . .TP* Example: .verb (ffi-elemtype (ffi (ptr int))) -> # .brev .SS* Foreign Function Macro Language This group of macros provides a higher-level language for working with FFI types and defining foreign function bindings. The macros are implemented using the Foreign Function Type API described in the previous section. .coNP Macro @ with-dyn-lib .synb .mets (with-dyn-lib < lib-expr << body-form *) .syne .desc The .code with-dyn-lib macro works in conjunction with the .codn deffi , .code deffi-sym and .code deffi-var macros. When a .code deffi form appears as one of the .metn body-form s of the .code with-dyn-lib macro, that .code deffi form is permitted to use the simplified forms of the .meta fun-expr argument, to refer to library functions succinctly, without having to specify the library. The same remark applies to .code deffi-sym and .codn deffi-var , regarding their .meta var-expr parameter. 
A form invoking the .code with-dyn-lib macro should be a top-level form. The macro creates a global variable named by a symbol generated by .code gensym whose initializing expression binds it to a dynamic library handle. The macro then creates an environment in which the enclosed .codn deffi , .code deffi-var and .code deffi-sym forms can implicitly refer to that library via the global variable. The .meta lib-expr argument can take on three different forms: .RS .meIP nil If .meta lib-expr is .codn nil , then .code with-dyn-lib arranges for the library to refer to the \*(TX executable itself. .meIP < string If .meta lib-expr is a literal string, then .code with-dyn-lib will arrange for the hidden variable to be initialized with an expression which opens a handle to the specified library. .meIP < form If .meta lib-expr is any other form, then it is assumed to denote syntax for opening the handle to a library. That syntax is used verbatim as the initializing expression for the generated global variable which holds the library handle. .RE .IP The result value of a .code with-dyn-lib form is the symbol which names the generated variable which holds the library handle. .TP* Examples: .verb ;; refer to malloc and free functions ;; in the executable (with-dyn-lib nil (deffi malloc "malloc" cptr (size-t)) (deffi free "free" void (cptr))) ;; refer to "draw" function in fictitious ;; "libgraphics" library: (with-dyn-lib "libgraphics.so.5" (deffi draw "draw" int (cptr cptr))) ;; refer to "init_foo" function via specific ;; library handle. (defvarl foo-lib (dlopen "libfoo.so.1")) (with-dyn-lib foo-lib (deffi init-foo "init_foo" void (void))) .brev .coNP Macro @ deffi .synb .mets (deffi < name < fun-expr < rettype << argtypes ) .syne .desc The .code deffi macro arranges for a Lisp function to be defined, via .codn defun , which calls a foreign function. The .meta name argument must be a symbol suitable as a function name in a .code defun form. 
This specifies the function's Lisp name. The .meta fun-expr parameter specifies the foreign function which is to be called. The syntactic variants permitted for its argument are described below. The .meta rettype argument must specify the return type, using the FFI type syntax, as an unquoted literal. The macro arranges for the compilation of this syntax via .codn ffi-type-compile . The .meta argtypes argument must specify a list of the argument types, as an unquoted literal list, using FFI type syntax. The macro arranges for these types to be compiled. Furthermore, a special convention may be used for specifying a variadic function: if the .code : (colon) keyword symbol appears as one of the elements of .metn argtypes , then the .code deffi form specifies a fixed call to a foreign function which is variadic. The argument types before the colon keyword are the types of the fixed arguments. The types after the colon, if any, are of the variadic arguments. Special considerations apply to some variadic argument types, described below. The following syntactic variants are permitted of the .meta fun-expr argument: .RS .meIP < name-string If .meta fun-expr is a literal string, then the .code deffi form must be enclosed in the .code with-dyn-lib macro, appearing as one of that macro's .metn body-form s. In this situation the literal character string .meta name-string specifies a symbol to be found within the library established by the .code with-dyn-lib macro. .meIP >> ( name-string << ver-string ) This manner of specifying the .meta fun-expr also requires the .code deffi form to be enclosed in a .codn with-dyn-lib . It selects a particular version of a symbol from the library. .meIP < form If .meta fun-expr is any other form, then it must specify an expression which evaluates to a .code cptr object giving the address of a foreign library symbol. If this form is used, then the .code deffi form need not be surrounded by a call to the .code with-dyn-lib macro.
.RE .IP When the FFI type .code float is used as the type of a variadic parameter, .code deffi replaces it by the FFI type .codn double . This treatment is necessary because the C variadic argument mechanism promotes .code float values to .codn double . Note: due to this substitution, it is possible to pass floating-point values which are out of range of the .code float type, without any diagnosis. The behavior is undefined in the Lisp-to-C direction, if the C function extracts an out-of-range .code double argument as if it were of type .codn float . The FFI types .code be-float and .code le-float cannot be used for specifying the types of a variadic argument. If any of these occur in that position, .code deffi throws an error. Rationale: these types are related to the C type .codn float , which requires promotion in variadic passing. Promotion cannot be performed on floating-point values whose byte order has been rearranged, because promotion is a value-preserving conversion. .IP The result value of a .code deffi form is .metn name . .coNP Macros @ deffi-cb and @ deffi-cb-unsafe .synb .mets (deffi-cb < name < rettype < argtypes <> [ abort-val ]) .mets (deffi-cb-unsafe < name < rettype << argtypes ) .syne .desc The .code deffi-cb macro defines, using .codn defun , a Lisp function called .metn name . Thus the .meta name argument must be a symbol suitable as a function name in a .code defun form. The .meta rettype and .meta argtypes arguments are processed exactly as in the corresponding arguments in the .code deffi macro. The .code deffi-cb macro arranges for .meta rettype and .meta argtypes to be compiled into a FFI call descriptor. The generated function called .meta name then serves as a combinator which takes a Lisp function as its argument, and binds it to the FFI call descriptor to produce a FFI closure. That closure may then be passed to foreign functions as a callback.
The .code deffi-cb macro generates a callback which uses safe dispatch, which is explained in the description of the .code ffi-make-closure function. The optional .meta abort-val parameter specifies an expression which evaluates to the value to be returned by the callback in the event that a dynamic control transfer is intercepted. The purpose of this value is to indicate to the foreign code that the callback wishes to abort operation; it is useful in situations when a suitable return value will induce the foreign code to cooperate and itself return to the Lisp code which will then continue the dynamic control transfer. The .code deffi-cb-unsafe macro is a variant of .code deffi-cb with the same argument conventions. The difference is that it arranges for .code ffi-make-closure to be invoked with .code nil for the .meta safe-p parameter. This macro has no .meta abort-val parameter, since unsafe callbacks do not use it. .TP* Example: .verb ;; create a closure combinator which binds ;; Lisp functions to a call descriptor which has the C type ;; signature void (*)(int). (deffi-cb void-int-closure void (int)) ;; use the combinator ;; some-foreign-function's second arg is ;; of type closure, specifying a callback: (some-foreign-function 42 (void-int-closure (lambda (x) (puts `callback! @x`)))) .brev .coNP Macro @ deffi-var .synb .mets (deffi-var < name < var-expr << type ) .syne .desc The .code deffi-var macro defines a global symbol macro which expands to an expression accessing a foreign variable, creating the illusion that the variable is available as a Lisp variable holding a Lisp data type. The .meta name argument gives the name of the symbol macro to be defined. The .meta var-expr argument is one of several permitted syntactic forms which specify the address of the foreign variable. They are described below. The .meta type argument expresses the variable type in FFI type syntax.
Once the variable is defined, accessing the macro symbol .meta name performs a get operation on the foreign variable, yielding the conversion of that variable to a Lisp value. An assignment to the symbol performs a put operation, converting a Lisp object to a value which overwrites the object. Note: FFI memory management is not helpful in the use of variables. Suppose a string value is stored in a variable of type .codn str . This means that FFI dynamically allocates a buffer which stores the UTF-8 encoded version of the string, and this buffer is placed into the foreign variable. Then suppose another such assignment takes place. The previous value is simply overwritten without being freed. The following syntactic variants are permitted of the .meta var-expr argument: .RS .meIP < name-string If .meta var-expr is a literal string, then the .code deffi-var form must be enclosed in the .code with-dyn-lib macro, appearing as one of that macro's .metn body-form s. In this situation the literal character string .meta name-string specifies a symbol to be found within the library established by the .code with-dyn-lib macro. .meIP >> ( name-string << ver-string ) This manner of specifying the .meta var-expr also requires the .code deffi-var form to be enclosed in a .codn with-dyn-lib . It selects a particular version of a symbol from the library. .meIP < form If .meta var-expr is any other form, then it must specify an expression which evaluates to a .code cptr object giving the address of a foreign library symbol. If this form is used, then the .code deffi-var form need not be surrounded by a call to the .code with-dyn-lib macro. .RE .coNP Macro @ deffi-sym .synb .mets (deffi-sym < name < var-expr <> [ type-sym ]) .syne .desc The .code deffi-sym macro defines a global lexical variable called .code name whose value is a .code cptr object that refers to a symbol in a foreign library. The .meta name argument gives the name for the variable to be defined.
This definition takes place as if by the .code defparml macro. The .meta var-expr is syntax which specifies the foreign pointer, using exactly the same conventions as described for the .code deffi-var macro, allowing for a shorthand notation if this form is enclosed in a .code with-dyn-lib macro invocation. The optional .meta type-sym argument must be a symbol. If it is absent, it defaults to .codn nil . This argument specifies the type label for the .code cptr object which holds the pointer to the foreign symbol. The result value of .code deffi-sym is the symbol .metn name . .coNP Macro @ typedef .synb .mets (typedef < name << type-syntax ) .syne .desc The .code typedef macro provides a convenient way to define type aliases. The .meta type-syntax expression is compiled as FFI syntax, and the .meta name symbol is installed as an alias denoting that type. The .code typedef macro yields the compiled version of .meta type-syntax as its value. .coNP Macros @ deffi-struct and @ deffi-union .synb .mets (deffi-struct < name >> {( slot < type <> [ init-form ])}*) .mets (deffi-union < name >> {( slot < type <> [ init-form ])}*) .syne .desc The .code deffi-struct and .code deffi-union macros provide a more compact notation for defining FFI structure and union types together with matching .code typedef names. The semantics follows from these equivalences: .verb (deffi-struct S ...) <--> (typedef S (struct S ...)) (deffi-union U ...) <--> (typedef U (union U ...)) .brev .TP* Example: .verb (deffi-struct point (x double) (y double)) .brev .coNP Macro @ sizeof .synb .mets (sizeof < type-syntax <> [ object-expr ]) .syne .desc The macro .code sizeof calculates the size of the FFI type denoted by .metn type-syntax . The .meta type-syntax expression is compiled to a type using .codn ffi-type-compile . The .meta object-expr expression is evaluated to an object value.
If .meta type-syntax denotes an incomplete array or structure type, and the .meta object-expr argument is present, then a .I "dynamic size" is computed: the actual number of bytes required to store that object value as a foreign representation. The .code sizeof macro arranges for the size calculation to be carried out at macro-expansion time, if possible, so that the .code sizeof form is replaced by an integer constant. This is possible when the .meta object-expr is omitted, or if it is a constant expression according to the .code constantp function. For the type .codn void , incomplete array types, and bitfield types, the one-argument form of .code sizeof reports zero. For incomplete structure types, the one-argument .code sizeof reports a size which is equivalent to the offset of the last member. The size of an incomplete structure does not include padding for the most strictly aligned member. .coNP Macro @ alignof .synb .mets (alignof << type-syntax ) .syne .desc The macro .code alignof calculates the alignment of the FFI type denoted by .meta type-syntax at macro-expansion time, and produces that integer value as its expansion, such that there is no run-time computation. It uses the .code ffi-alignof function. .coNP Macro @ offsetof .synb .mets (offsetof < type-syntax << member-name ) .syne .desc The macro .code offsetof calculates the offset of the structure member indicated by .metn member-name , a symbol, inside the FFI struct type indicated by .metn type-syntax . This calculation is performed by a macro-expansion-time call to the .code ffi-offsetof function, and produces that integer value as its expansion, such that there is no run-time computation. .coNP Macro @ arraysize .synb .mets (arraysize << type-syntax ) .syne .desc The macro .code arraysize calculates the number of elements of the array type indicated by .metn type-syntax .
This calculation is performed by a macro-expansion-time call to the .code ffi-arraysize function, and produces that integer value as its expansion, such that there is no run-time computation. .coNP Macro @ elemsize .synb .mets (elemsize << type-syntax ) .syne .desc The macro .code elemsize calculates the size of the element type of an array type, or the size of the target type of a pointer type indicated by .metn type-syntax . This calculation is performed by a macro-expansion-time call to the .code ffi-elemsize function, and produces that integer value as its expansion, such that there is no run-time computation. .coNP Macro @ elemtype .synb .mets (elemtype << type-syntax ) .syne .desc The macro .code elemtype produces the element type of an array type, or the target type of a pointer type indicated by .metn type-syntax . Note: the .code elemtype macro may be understood in terms of several possible implementations. The form .code "(elemtype X)" is equivalent to .codn "(ffi-elemtype (ffi-type-compile X))" . Since there exists an .code elemtype type operator, the expression is also equivalent to .codn "(ffi-type-compile '(elemtype X))" . .coNP Macro @ ffi .synb .mets (ffi << type-syntax ) .syne .desc The .code ffi macro provides a shorthand notation for compiling a literal FFI type expression to the corresponding type object. The following equivalence holds: .verb (ffi expr) <--> (load-time (ffi-type-compile 'expr)) .brev .SS* Zero-filled Object Support Communicating with foreign interfaces sometimes requires representations to be initialized consisting of all zero bits, or mostly zero bits. \*(TX provides convenient ways to prepare Lisp objects such that when those objects are converted to a foreign representation, they generate zero-filled representations.
.coNP Function @ make-zstruct .synb .mets (make-zstruct < type >> { slot-sym << init-value }*) .syne .desc The .code make-zstruct function provides a convenient means of instantiating a structure for use in foreign function calls, imitating a pattern of initialization often seen in the C language. It instantiates a Lisp .code struct by conversion of zero-filled memory through FFI, thus creating a Lisp structure which appears zero-filled when converted to the foreign representation. This simplifies application code, which is spared from providing individual slot initializations which have this effect. The .meta type argument must be a compiled FFI .code struct type. The remaining arguments must occur pairwise. Each .meta slot-sym argument must be a symbol naming a slot in the FFI .code struct type. The .meta init-value argument which follows it specifies the value for that slot. The .code make-zstruct function operates as follows. Firstly, the Lisp .code struct type is retrieved which corresponds to the FFI type given by .metn type . A new instance of the Lisp type is instantiated, as if by a one-argument call to .codn make-struct . Next, each slot indicated by a .meta slot-sym argument is set to the corresponding .metn init-value . Finally, each slot of the struct which is not initialized via a .meta slot-sym and .meta init-value pair, and which is known to the FFI type, is reinitialized by a conversion from a foreign object of all-zero bits to a Lisp value. The .code struct object is then returned. Note: the .code znew macro provides a less verbose notation based on .codn make-zstruct . Note: slots which are not known to the FFI .code struct type may be initialized by .codn make-zstruct . Each .meta slot-sym must be a slot of the Lisp .code struct type; but need not be declared as a member in the FFI .code struct type.
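.TP* Example:

The following sketch illustrates the steps described above. The
.code point
type is a hypothetical one, defined here only for illustration; the
remark about the
.code y
slot assumes that all-zero bits of type
.code double
convert to a floating-point zero.

.verb
  ;; hypothetical FFI struct type with two double members
  (typedef point (struct point (x double) (y double)))

  ;; x is set explicitly; y is not mentioned, so it is
  ;; reinitialized from an all-zero foreign representation
  (make-zstruct (ffi point) 'x 1.5)
.brev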
.coNP Macro @ znew .synb .mets (znew < type-syntax >> { slot-sym << init-value }*) .syne .desc The .code znew macro provides a convenient way of using .codn make-zstruct , using syntax which resembles that of the .code new macro. The .code znew macro generates a .code make-zstruct call, arranging for the .meta type-syntax argument to be compiled to a FFI type object, and applies quoting to every .meta slot-sym argument. The following equivalence holds: .verb (znew s a i b j ...) <--> (make-zstruct (ffi s) 'a i 'b j ...) .brev .TP* Example Given the following FFI type definition .verb (typedef foo (struct foo (a (cptr bar)) (b uint) (c bool))) .brev the following results are observed: .verb ;; ordinary instantiation (new foo) -> #S(foo a nil b nil c nil) ;; Under znew, a is null cptr of correct type: (znew foo) -> #S(foo a # b 0 c nil) ;; value of b is specified; others come from zeros: (znew foo b 42) -> #S(foo a # b 42 c nil) .brev .coNP Function @ zero-fill .synb .mets (zero-fill < type << obj ) .syne .desc The .code zero-fill function invokes the by-reference in semantics of FFI type .meta type against a zero-filled buffer, and a Lisp object .metn obj . This means that if .meta obj is an aggregate such as a vector, list or structure, it is updated as if from an all-zero-bit foreign representation. In that situation, .meta obj is also returned. An object which has by-value semantics, such as an integer, is not updated. In this case, nevertheless, the return value is a Lisp object produced by converting an all-zero-bit buffer to .metn type . .SS* Foreign Unions The following group of functions provides the means for working with foreign unions, in conjunction with the .code union FFI type. 
.coNP Function @ make-union .synb .mets (make-union < type >> [ initval <> [ member ]]) .syne .desc The .code make-union function instantiates a new object of type .codn union , based on the FFI type specified by the .meta type parameter, which must be a compiled FFI .code union type. The object provides storage for the foreign representation of .metn type , and that storage is initialized to all zero bytes. Additionally, if .meta initval is specified, but .meta member is not, then .meta initval is stored into the union via the first member, as if by .codn union-put . If the union type has no members, an error exception is thrown. If both .meta initval and .meta member are specified, then .meta initval is stored into the union using the specified member, as if by .codn union-put . .coNP Function @ union-members .synb .mets (union-members << union ) .syne .desc The .code union-members function retrieves the list of symbols which name the members of .metn union . These are derived from the object's FFI type. It is unspecified whether the list is freshly allocated on each call, or whether the same list is returned; applications shouldn't destructively manipulate this list. .coNP Function @ union-get .synb .mets (union-get < union << member ) .syne .desc The .code union-get function performs the get semantics (conversion from a foreign representation to Lisp) on the member of .meta union which is specified by the .meta member argument. That argument must be a symbol corresponding to one of the member names. The .meta union object's storage buffer is treated as an object of the foreign type indicated by that member's type information, and converted accordingly to a Lisp object that is returned.
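.TP* Example:

A minimal sketch of the three functions described so far, assuming a
hypothetical union type
.code num
which is defined here only for illustration:

.verb
  ;; hypothetical union with an int and a float member
  (typedef num (union num (i int) (f float)))

  ;; no member is given, so 42 is stored via the first member, i
  (let ((u (make-union (ffi num) 42)))
    ;; member names, derived from the FFI type
    (union-members u)
    ;; get semantics applied through member i
    (union-get u 'i))
.brev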
.coNP Function @ union-put .synb .mets (union-put < union < member << new-value ) .syne .desc The .code union-put function performs the put semantics (conversion from a Lisp object to foreign representation) on the member of .meta union which is specified by the .meta member argument. That argument must be a symbol corresponding to one of the member names. The object given as .meta new-value is converted to the foreign representation according to the type information of the indicated member, and that representation is placed into the .meta union object's storage buffer. The return value is .metn new-value . .coNP Functions @ union-in and @ union-out .synb .mets (union-in < union < memb << memb-obj ) .mets (union-out < union < memb << memb-obj ) .syne .desc The .code union-in and .code union-out functions perform the FFI in semantics and out semantics, respectively. These semantics are involved in two-way data transfers between foreign representations and Lisp objects. The .meta union argument must be a .code union object and the .meta memb argument a symbol which matches one of that object's member names. In the case of .codn union-in , .meta memb-obj is a Lisp object that was previously stored into .meta union using the .code union-put operation, into the same member that is currently indicated by .metn memb . In the case of .codn union-out , .meta memb-obj is a Lisp object that was previously retrieved from .meta union using the .code union-get operation, from the same member that is currently indicated by .metn memb . The .code union-in function performs the by-value nuance of the in semantics on the indicated member: if the member contains pointers to any objects, those objects are updated from their counterparts in .meta memb-obj using their respective by-reference in semantics, recursively.
Similarly, .code union-out performs the by-value nuance of the out semantics on the indicated member: if the member contains pointers to any objects, those objects are updated with their Lisp counterparts in .meta memb-obj using their respective by-reference out semantics, recursively. Note: .code union-in is intended to be used after a FFI call, on a union-typed by-value argument, or a union-typed object contained in an argument, in situations when the function is expected to have updated the contents of the union. The .code union-out function is intended to be used in a FFI callback, on a union-typed callback argument or union-typed object contained in such an argument, in cases when the callback has updated the Lisp object corresponding to a union member, and that change needs to be propagated to the foreign caller. .SS* FFI-type-driven I/O Functions These functions provide a way to perform I/O on streams using the foreign representation of Lisp objects, performing conversion between the Lisp representations in memory and the foreign representations in a stream. The .meta stream argument used with these functions must be a stream object which, in the case of input functions, supports .code get-byte and, in the case of output, supports .codn put-byte . .coNP Function @ put-obj .synb .mets (put-obj < object < type <> [ stream ]) .syne .desc The .code put-obj function encodes .meta object into a foreign representation, according to the FFI type .metn type . The bytes of the foreign representation are then written to .metn stream . If .meta stream is omitted, it defaults to .codn *stdout* . If the operation successfully writes all bytes of the representation to .metn stream , the value .code t is returned. A partial write causes the return value to be .codn nil . All other stream error situations throw exceptions.
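.TP* Example:

A small sketch of
.code put-obj
writing to an in-memory stream. It assumes that the stream returned by
.code make-buf-stream
supports
.codn put-byte ,
and that the bytes written to it can be retrieved with
.codn get-buf-from-stream ;
neither function is described in this section.

.verb
  (let ((s (make-buf-stream)))
    ;; encode 65 as a single uint8 byte and write it to s;
    ;; a complete write is reported by a return value of t
    (put-obj 65 (ffi uint8) s)
    ;; retrieve the bytes written so far as a buf object
    (get-buf-from-stream s))
.brev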
.coNP Function @ get-obj .synb .mets (get-obj < type <> [ stream ]) .syne .desc The .code get-obj function reads from .meta stream the bytes corresponding to a foreign representation according to the FFI type .metn type . If .meta stream is omitted, it defaults to .codn *stdin* . If the read is successful, these bytes are decoded, producing a Lisp object, which is returned. If the read is incomplete, the value returned is .codn nil . All other stream error situations throw exceptions. .coNP Function @ fill-obj .synb .mets (fill-obj < object < type <> [ stream ]) .syne .desc The .code fill-obj function reads from .meta stream the bytes corresponding to a foreign representation according to the FFI type .metn type . If the read is successful, then .meta object is updated, if possible, from that representation, using the by-value in semantics of the FFI type and returned. If a by-value update of .meta object isn't possible, then a new object is decoded from the data and returned. If the read is incomplete, the value returned is .codn nil . All other stream error situations throw exceptions. .SS* Buffer Functions Functions in this area provide a way to perform conversion between Lisp objects and foreign representation to and from objects of the .code buf type. .coNP Functions @ ffi-put and @ ffi-put-into .synb .mets (ffi-put < obj << type ) .mets (ffi-put-into < dst-buf < obj < type <> [ offset ]) .syne .desc The .code ffi-put function encodes the Lisp object .meta obj according to the FFI type .meta type and returns a new buffer object of type .code buf which holds the foreign representation. The .code ffi-put-into function is similar, except that it uses an existing buffer .meta dst-buf which must be large enough to hold the foreign representation. The .meta type argument must be a compiled FFI type. If .meta type has a variable length, then the actual size of the foreign representation is calculated from .metn obj .
The .meta obj argument must be an object compatible with the conversions implied by .metn type . The optional .meta offset argument specifies a byte offset from the beginning of the data area of .meta dst-buf where the foreign representation of .meta obj is stored. The default value is zero. These functions perform the "put semantics" encoding action similar to what happens to the arguments of an outgoing foreign function call. Caution: incorrect use of this function, or its use in isolation without a matching .code ffi-in call, can cause memory leaks, because, depending on .metn type , temporary resources may be allocated, and pointers to those resources will be stored in the buffer. .coNP Function @ ffi-out .synb .mets (ffi-out < dst-buf < obj < type < copy-p <> [ offset ]) .syne .desc The .code ffi-out function performs the "out semantics" encoding action, similar to the treatment applied to the arguments of a callback prior to returning to foreign code. It is assumed that .code obj is an object that was returned by an earlier call to .codn ffi-get , and that the .meta dst-buf and .meta type arguments are the same objects that were used in that call. The .meta copy-p argument is a Boolean flag which is true if the buffer represents a datum that is being passed by pointer. If .meta copy-p is true, then .meta obj is converted to a foreign representation which is stored into .metn dst-buf . If it is false, it indicates that the buffer itself is a pass-by-value object. This means that the object itself will not be copied, but if it is an aggregate which contains pointers, the operation will recurse on those objects, invoking their "out semantics" action with pass-by-pointer semantics. The required pointers to these indirect objects are obtained from .metn dst-buf . 
The optional .meta offset argument specifies a byte offset from the beginning of the data area of .meta dst-buf where the foreign representation of .meta obj is understood to be stored, and where it is updated if requested by .metn copy-p . The default value is zero. The .code ffi-out function returns .metn dst-buf . .coNP Function @ ffi-in .synb .mets (ffi-in < src-buf < obj < type < copy-p <> [ offset ]) .syne .desc The .code ffi-in function performs the "in semantics" decoding action, similar to the treatment applied to the arguments of a foreign function call after it returns, in order to free temporary resources and recover the new values of objects that have been modified by the foreign function. It is assumed that .meta src-buf is a buffer that was prepared by a call to .code ffi-put or .codn ffi-put-into , and that .meta type and .meta obj are the same values that were passed as the corresponding arguments of those functions. The .code ffi-in function releases the temporary memory resources that were allocated by .code ffi-put or .codn ffi-put-into , which are obtained from the buffer itself, where they appear as pointers. The function recursively performs the in semantics across the entire type, and the entire object graph rooted at the buffer. The .meta copy-p argument is a Boolean flag which is true if the buffer represents a datum that is being passed by pointer. If it is false, it indicates that the buffer itself is a pass-by-value object. Under pass-by-pointer semantics, either a whole new object is extracted from the buffer and returned, or else the slots of .meta obj are updated with new values from the buffer. Under pass-by-value semantics, no such extraction takes place, and .meta obj is returned. However, regardless of the value of .codn copy-p , if the object is an aggregate which contains pointers, the recursive treatment through those pointers involves pass-by-pointer semantics. 
This is consistent with the idea that we can pass a structure by value, but that structure can have pointers to objects which are updated by the called function. Those indirect objects are passed by pointer. They get updated, but the parent structure cannot. If .meta type has a variable length, then the actual size of the foreign representation is calculated from .metn obj . The optional .meta offset argument specifies a byte offset from the beginning of the data area of .meta src-buf from which the foreign representation of .meta obj is taken. The .code ffi-in function returns either .meta obj or a new object which is understood to have been produced as its replacement. .coNP Function @ ffi-get .synb .mets (ffi-get < src-buf < type <> [ offset ]) .syne .desc The .code ffi-get function extracts a Lisp value from buffer .meta src-buf according to the FFI type .metn type . The .meta src-buf argument is an object of type .code buf large enough to hold a foreign representation of .metn type , at the byte offset indicated by the .meta offset argument. The .meta type argument is a compiled FFI type. The optional .meta offset argument defaults to zero. The external representation in .meta src-buf at the specified offset is scanned according to .meta type and converted to a Lisp value which is returned. The .code ffi-get operation is similar to the "get semantics" performed by FFI in order to extract the return value of foreign function calls, and by the FFI callback mechanism to extract the arguments coming into a callback. The .meta type argument may not be a variable length type, such as an array of unspecified size. .SS* Foreign Arrays Functions in this area provide a means for working with foreign arrays, in connection with the FFI .code carray type.
.coNP Functions @ carray-vec and @ carray-list .synb .mets (carray-vec < vec < type <> [ null-term-p ]) .mets (carray-list < list < type <> [ null-term-p ]) .syne .desc The .code carray-vec and .code carray-list functions allocate storage for the representation of a foreign array, and return a .code carray object which holds a pointer to that storage. The argument .metn type , which must be a compiled FFI type, is retained as the .code carray object's element type. Prior to returning, the functions initialize the foreign array by converting the elements of .meta vec or, respectively, .meta list into elements of the foreign array. The conversion is performed using the put semantics of .metn type . The length of the returned .code carray is determined from the length of .meta vec or .meta list and from the value of the Boolean argument .metn null-term-p . If .meta null-term-p is .codn nil , then the length of the .code carray is the same as that of the input .meta vec or .metn list . A true value of .meta null-term-p indicates null termination. This causes the length of the .code carray to be one greater than that of .meta vec or .metn list , and the extra element allocated to the foreign array is filled with zero bytes. .coNP Function @ carrayp .synb .mets (carrayp << object ) .syne .desc The .code carrayp function returns .code t if .meta object is a .codn carray , otherwise it returns .codn nil . .coNP Function @ carray-blank .synb .mets (carray-blank < length << type ) .syne .desc The .code carray-blank function allocates storage for the representation of a foreign array, filling that storage with zero bytes, and returns a .code carray object which holds a pointer to that storage. The argument .metn type , which must be a compiled FFI type, is retained as the .code carray object's element type.
The .meta length argument must be a nonnegative integer; it specifies the number of elements in the foreign array and is retained as the .code carray object's length. The size of the foreign array is the product of the size of .meta type as reported by the .code ffi-size function, and of .metn length . .coNP Function @ carray-buf .synb .mets (carray-buf < buf < type <> [ offset ]) .syne .desc The .code carray-buf function creates a .code carray object which refers to the storage provided and managed by the buffer object .metn buf , providing a view of that storage, and manipulation thereof, as an array. The optional .meta offset parameter specifies an offset from the start of the buffer to the location which is interpreted as the start of the .codn carray , which extends from that offset to the end of the buffer. The default value is zero: the .code carray covers the entire buffer. If a value is specified, it must be in the range zero to the length of .metn buf . The .meta type argument must be a compiled FFI type whose size is nonzero. The .code carray is overlaid onto the storage of .meta buf as follows: First, .meta offset is subtracted from the bytewise length of .metn buf , as reported by the .code length-buf function, to produce the effective length of the storage to be used for the array. The effective length is divided by the size of .metn type , as reported by .codn ffi-size . The resulting quotient represents the length (number of elements) of the .code carray object. Note: the returned .code carray object holds a reference to .metn buf , preventing .meta buf from being reclaimed by garbage collection, thereby protecting the underlying storage from becoming invalid. A subsequent invocation of the .code carray-dup operation releases this reference.
Note: the relationship between the .code carray object and .meta buf is inherently unsafe: if .meta buf is subsequently subject to operations which reallocate the storage, such as .codn buf-set-length , the pointer stored inside the referencing .code carray object becomes invalid, and operations involving that pointer have undefined behavior. Note: if the length of the buffer is not evenly divisible by the size of the type, the calculated number of elements is rounded down. The trailing portion of the buffer corresponding to the division remainder, being insufficient to constitute a whole array element, is excluded from the array view. .coNP Function @ carray-buf-sync .synb .mets (carray-buf-sync << carray ) .syne .desc The .code carray-buf-sync function requires .meta carray to be a .code carray object which refers to a .code buf object for its storage. Such objects are created by the function .codn carray-buf . The .code carray-buf-sync function retrieves and returns the buffer object associated with .meta carray and at the same time also updates the internal properties of .meta carray using the current information: the pointer to the data, and the length of .meta carray are altered to reflect the current state of the buffer. .coNP Function @ buf-carray .synb .mets (buf-carray << carray ) .syne .desc The .code buf-carray function duplicates the underlying storage of .meta carray and returns that storage represented as an object of .code buf type. The storage size is calculated by multiplying the .code carray object's element size by the number of elements. Only that extent of the storage is duplicated. .coNP Function @ carray-cptr .synb .mets (carray-cptr < cptr < type <> [ length ]) .syne .desc The .code carray-cptr function creates a .code carray object based on a pointer derived from a .code cptr object. The .meta cptr argument must be of type .codn cptr . The object's .code cptr type tag is ignored.
The .meta type argument must specify a compiled FFI type, which will become the element type of the returned .codn carray . If .meta length is specified as .codn nil , or not specified, then the returned .code carray object will be of unknown length. Otherwise, .meta length must be a nonnegative integer which will be taken as the length of the array. Note: this conversion is inherently unsafe. .coNP Function @ cptr-carray .synb .mets (cptr-carray < carray <> [ type-symbol ]) .syne .desc The .code cptr-carray function returns a .code cptr object which holds a pointer to a .code carray object's storage area. The .meta carray argument must be of type .codn carray . The .meta type-symbol argument should be a symbol. If omitted, it defaults to .codn nil . This symbol becomes the .code cptr object's type tag. The lifetime of the returned .code cptr object is independent from that of .metn carray . If the lifetime of .meta carray reaches its end before that of the .codn cptr , the pointer stored inside the .code cptr becomes invalid. .coNP Function @ length-carray .synb .mets (length-carray << carray ) .syne .desc The .code length-carray function returns the length of the .meta carray argument, which must be an object of type .codn carray . If .meta carray has an unknown length, then .code nil is returned. .coNP Function @ copy-carray .synb .mets (copy-carray << carray ) .syne .desc The .code copy-carray function returns a duplicate of .metn carray . The duplicate has the same element type and length, but has its own copy of the underlying storage. This is true whether or not .meta carray owns its storage. In either case, the duplicate owns .I its copy of the storage. .coNP Function @ carray-set-length .synb .mets (carray-set-length < carray << length ) .syne .desc The .code carray-set-length function attempts to change the length of .metn carray , which must be an object of .code carray type.
The .meta length argument indicates the new length, which must be a nonnegative integer. The operation throws an .code error exception if .meta length is negative. An .code error exception is also thrown if .meta carray is an object which owns the underlying storage. There is no provision in the .code carray type to change the storage size. It is permissible to change the length of a .code carray object which acts as a view into a buffer (as constructed via the .code carray-buf operation). This creates a potentially unsafe situation in which the length requires a larger amount of backing storage than is provided by the buffer. .coNP Accessor @ carray-ref .synb .mets (carray-ref < carray << idx ) .mets (set (carray-ref < carray << idx ) << new-val ) .syne .desc The .code carray-ref function accesses an element of the foreign array .metn carray , converting that element to a Lisp value, which is returned. The .meta idx argument must be a nonnegative integer. If .meta carray has a known length, .meta idx must be less than the length. If .meta carray has an unknown length, then the access is permitted regardless of how large the value of .meta idx is. Whether the access has well-defined behavior depends on the actual extent of the underlying array storage. The validity of any access to the underlying storage depends on the validity of the pointer to that storage. The access to the array storage proceeds as follows. Every .code carray object has an element type, which is a compiled FFI type. A byte offset address is calculated by multiplying the size of the element type of .meta carray by .metn idx . Then, the get semantics of the element type is invoked to convert, to a Lisp object, a region of data starting at the calculated byte offset in the array storage. The resulting object is returned. Assigning a value to a .code carray-ref form is equivalent to using .code carray-refset to store the value.
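.TP* Example:

The access procedure can be sketched as follows, using
.code carray-vec
to prepare a small array of
.code int
elements. The element values are arbitrary.

.verb
  (let ((ca (carray-vec #(10 20 30) (ffi int))))
    ;; byte offset is the size of int times 1;
    ;; get semantics converts that element to 20
    (carray-ref ca 1)
    ;; assignment form: same effect as (carray-refset ca 1 25)
    (set (carray-ref ca 1) 25)
    ;; the element now converts to 25
    (carray-ref ca 1))
.brev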
.coNP Function @ carray-refset .synb .mets (carray-refset < carray < idx << new-val ) .syne .desc The .code carray-refset function accesses an element of the foreign array .metn carray , overwriting that element with a new value obtained from a conversion of the Lisp value .metn new-val . The return value is .metn new-val . The .meta idx argument must be a nonnegative integer. If .meta carray has a known length, .meta idx must be less than the length. If .meta carray has an unknown length, then the access is permitted regardless of how large the value of .meta idx is. Whether the access has well-defined behavior depends on the actual extent of the underlying array storage. The validity of any access to the underlying storage depends on the validity of the pointer to that storage. The access to the array storage proceeds as follows. Every .code carray object has an element type, which is a compiled FFI type. A byte offset address is calculated by multiplying the size of the element type of .meta carray by .metn idx . Then, the put semantics of the element type is invoked to convert .meta new-val to a foreign representation, which is written into the array storage starting at the calculated byte offset. If .meta new-val has a type which is not compatible with the element type, or a value which is out of range or otherwise unsuitable, an exception is thrown. .coNP Functions @ carray-dup and @ carray-own .synb .mets (carray-dup << carray ) .mets (carray-own << carray ) .syne .desc The .code carray-dup function acts upon a .code carray object which doesn't own its underlying array storage. It allocates a duplicate copy of the array storage referenced by .metn carray , and assigns to .meta carray the new copy. Then it marks .meta carray as owning that storage. Lastly, if .meta carray references another object, that reference is removed; .meta carray no longer prevents the other object from being reclaimed by the garbage collector.
If .meta carray already owns its storage, then this function has no effect. If .meta carray has an unknown size, then an error exception is thrown. A .code carray produced by the functions .code carray-vec or .code carray-blank already owns its storage. A .code carray object does not own its storage if it is produced by .code carray-buf or by the conversion of a foreign pointer under the control of the .code carray FFI type. Because .code carray objects derived from foreign pointers via FFI have an unknown size, before using .codn carray-dup , the application must determine the length of the array, and call .code carray-set-length to establish that length. After .codn carray-dup , the length may not be altered. The .code carray-dup function returns .code t if it has performed the duplication operation. If it has done nothing, it returns .codn nil . The .code carray-own function resembles .codn carray-dup , differing from that function only in two ways. Instead of allocating a duplicate copy of the underlying array storage, .code carray-own causes .meta carray to .B assume ownership of the existing storage. Secondly, it is an error to use .code carray-own on a .meta carray which references a buffer object. The .code carray-own function always returns .codn nil . In all other regards, the descriptions of .code carray-dup apply to .codn carray-own . .coNP Function @ carray-free .synb .mets (carray-free << carray ) .syne .desc If .meta carray is a .code carray object which owns the storage to which it refers, then the .code carray-free function liberates that storage by passing the pointer to the C library function .codn free . It then replaces that pointer with a null pointer, and changes the size to zero. If .meta carray doesn't own the storage, an exception is thrown. .coNP Function @ carray-type .synb .mets (carray-type << carray ) .syne .desc The .code carray-type function returns the element type of .metn carray , a compiled FFI type.
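.TP* Example:

The ownership rules given above for
.code carray-dup
can be sketched as follows. This is an illustration under stated assumptions,
not a recommended idiom: the variable names
.code owned
and
.code view
are hypothetical, and the buffer is sized with
.code sizeof
so that it holds exactly three elements of type
.codn int .

.verb
  (let ((owned (carray-vec #(1 2 3) (ffi int)))
        (view (carray-buf (make-buf (* 3 (sizeof int))) (ffi int))))
    (list (carray-dup owned)   ;; already owns its storage: nothing done
          (carray-dup view)))  ;; buffer view: storage is duplicated
  ;; -> (nil t)
.brev

After the second call,
.code view
owns a private copy of the storage and no longer references the buffer, so it
would also be a valid argument to
.codn carray-free .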
.coNP Functions @ vec-carray and @ list-carray .synb .mets (vec-carray < carray <> [ null-term-p ]) .mets (list-carray < carray <> [ null-term-p ]) .syne .desc The .code vec-carray and .code list-carray functions convert the array storage of .meta carray to a freshly constructed object representation: vector, and list, respectively. The new vector or list is returned. The .meta carray object must have a known size; an .code error exception is thrown if these functions are invoked on a .code carray object of unknown size. The effective length of the new vector or list is derived from the length of .metn carray , taking into account the value of .metn null-term-p . The .meta null-term-p Boolean parameter defaults to .codn nil . If specified as true, then it has the effect that the effective length of the returned vector or list is one less than that of .metn carray : in other words, a true value of .meta null-term-p indicates that .meta carray holds storage which represents a null-terminated array, and the terminating null element is to be excluded from the conversion. If .meta null-term-p is true, but the length of .meta carray is already zero, then it has no effect; the effective length remains zero, and a zero-length vector or list is returned. Conversion of the foreign array to the vector or list is performed by iterating over all of its elements, starting from element zero, up to the element before the effective length. .coNP Functions @ carray-get and @ carray-getz .synb .mets (carray-get << carray ) .mets (carray-getz << carray ) .syne .desc The .code carray-get and .code carray-getz functions treat the contents of .meta carray as a FFI .code array and .code zarray type, respectively. They invoke the get semantics to convert the FFI array to a Lisp object, and return that object. If the element type is one of .codn char , .code bchar or .codn wchar , then the expected string conversion semantics applies. 
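.TP* Example:

The effect of the
.meta null-term-p
argument of
.code vec-carray
and
.codn list-carray ,
described earlier, can be sketched as follows. The array contents and the
variable name
.code ca
are assumptions made for the sake of illustration.

.verb
  (let ((ca (carray-vec #(1 2 3 0) (ffi int))))
    (list (vec-carray ca)        ;; all four elements converted
          (list-carray ca t)))   ;; terminating zero element excluded
  ;; -> (#(1 2 3 0) (1 2 3))
.brev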
.coNP Functions @ carray-put and @ carray-putz .synb .mets (carray-put < carray << new-val ) .mets (carray-putz < carray << new-val ) .syne .desc The .code carray-put and .code carray-putz functions treat the contents of .meta carray as a FFI .code array and .code zarray type, respectively. They invoke the put semantics to convert the Lisp object .meta new-val to the foreign array representation, which is placed into the array storage referenced by .metn carray . If the element type is one of .codn char , .code bchar or .codn wchar , then the expected string conversion semantics applies. Both of these functions return .metn carray . .coNP Accessor @ carray-sub .synb .mets (carray-sub < carray >> [ from <> [ to ]]) .mets (set (carray-sub < carray >> [ from <> [ to ]]) << new-val ) .syne .desc The .code carray-sub function extracts a subrange of a .meta carray object, returning a new .code carray object denoting that subrange. The semantics of .meta from and .meta to work exactly like the corresponding arguments of the .code sub accessor, following the same conventions. The returned .code carray has the same element type as the original and shares the same array storage. If, subsequently, elements of the original array are modified which lie in the range, then the modifications will affect the previously returned subrange .codn carray . The returned .code carray references the original object, to ensure that as long as the returned object is reachable by the garbage collector, so is the original. This relationship can be severed by invoking .code carray-dup on the returned object, after which the two no longer share storage, and modifications in the original are not reflected in the subrange. If .code carray-sub is used as a syntactic place, the argument expressions .metn carray , .metn from , .meta to and .meta new-val are evaluated just once.
The prior value, if required, is accessed by calling .code carray-sub and .meta new-val is then stored via .codn carray-replace . .coNP Function @ carray-replace .synb .mets (carray-replace < carray < item-sequence >> [ from <> [ to ]]) .syne .desc The .code carray-replace function is a specialized version of .code replace which works on .code carray objects. It replaces a sub-range of .meta carray with elements from .metn item-sequence . The replacement sequence need not have the same length as the range which it replaces. The semantics of .meta from and .meta to work exactly like the corresponding arguments of the .code replace function, following the same conventions. The semantics of the .code carray-replace operation itself differs from the .code replace semantics on sequences in one important regard: the .code carray object's length always remains the same. The range indicated by .meta from and .meta to is deleted from .meta carray and replaced by elements of .metn item-sequence , which undergo conversion to the foreign type that defines the elements of .metn carray . If this operation would make the .code carray longer, any elements in excess of the object's length are discarded, whether they are the original elements, or whether they come from .metn item-sequence . Under no circumstances does .code carray-replace write an element beyond the length of the underlying storage. If this operation would make the .meta carray shorter (the range being replaced is longer than .metn item-sequence ) then the downward relocation of items above the replacement range creates a gap at the end of .meta carray which is filled with zero bytes. The return value is .meta carray itself. .coNP Function @ carray-pun .synb .mets (carray-pun < carray < type >> [ offset <> [ size-limit ]]) .syne .desc The .code carray-pun function creates a new .code carray object which provides an aliased view of the same data that is referenced by the original .meta carray object.
The .meta type argument specifies the element type used by the returned aliasing array. If the .meta offset argument is specified, then the aliased view is displaced by that many bytes from the start of the .meta carray object. The .meta offset argument must not be larger than the bytewise length of the array, or an error exception is thrown. The bytewise length of the array is the product of the number of elements and the element size. The default value of .meta offset is zero: no displacement. If .meta size-limit is specified, it indicates the size, in bytes, of the aliased view. This limit must not be such that the aliased view would extend beyond the array, or an error exception is thrown. If omitted, .meta size-limit defaults to the entire remainder of the array, after the offset. The number of elements of the returned array is then calculated from .metn size-limit . The .code carray-pun function calculates how many elements of .meta type fit into .metn size-limit . This value becomes the length of the aliasing array which is returned. Since the returned aliasing array and the original refer to the same storage, modifications performed in one view are reflected in the other. The aliasing array holds a reference to the original, so that as long as it is reachable by the garbage collector, so is the original. That relationship is severed if .code carray-dup is invoked on the aliasing array. The meaning of the aliasing depends entirely on the bitwise representations of the types involved. Note: .code carray-pun does not check whether .meta offset is a value that is suitably aligned for accessing elements of .metn type ; on some platforms that must be ensured. The .code carray-pun function may be invoked on an object that was itself returned by .codn carray-pun .
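.TP* Example:

The length calculation and the sharing of storage described above can be
sketched as follows. The names
.codn words ,
.code bytes
and
.code tail
are hypothetical. All four bytes of the first element are overwritten with the
same value, so that the result does not depend on the byte order of the
machine.

.verb
  (let* ((words (carray-vec #(1 2) (ffi uint32)))
         (bytes (carray-pun words (ffi uchar)))      ;; whole array as bytes
         (tail (carray-pun words (ffi uchar) 4)))    ;; view displaced by 4 bytes
    ;; writes through the byte view are visible through the original
    (set (carray-ref bytes 0) 255
         (carray-ref bytes 1) 255
         (carray-ref bytes 2) 255
         (carray-ref bytes 3) 255)
    (list (carray-length bytes)
          (carray-length tail)
          (carray-ref words 0)))
  ;; -> (8 4 4294967295)
.brev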
.coNP Functions @ carray-uint and @ carray-int .synb .mets (carray-uint < number <> [ type ]) .mets (carray-int < number <> [ type ]) .syne .desc The .code carray-uint and .code carray-int functions convert .metn number , an integer, to a binary image, which is then used as the underlying storage for a .codn carray . The .meta type argument, a compiled FFI type, determines the element type for the returned .codn carray . If it is omitted, it defaults to the .code uchar type, so that the array is effectively of bytes. Regardless of .metn type , these functions first determine the number of bytes required to represent .meta number in a big endian format. Then the number of elements is determined for the array, so that it provides at least that many bytes of storage. The representation of .meta number is then placed into this storage, such that its least significant byte coincides with the last byte of that storage. If the number is smaller than the storage provided by the array, it is extended with padding bytes on the left, near the beginning of the array. In the case of .codn carray-uint , .meta number must be a nonnegative integer. An unsigned representation is produced which carries no sign bit. The representation is as many bytes wide as are required to cover the number up to its most-significant bit whose value is 1. If any padding bytes are required due to the array being larger, they are always zero. The .code carray-int function encodes negative integers also, using a variable-length two's complement representation. The number of bits required to hold the number is calculated as the smallest width which can represent the value in two's complement, including a sign bit. Any unused bits in the most-significant byte are filled with copies of the sign bit: in other words, sign extension takes place up to the byte size.
The sign extension continues through the padding bytes if the array is larger than the number of bytes required to represent .metn number ; the padding bytes are filled with the value .code #b11111111 (255) if the number is negative, or else 0 if it is nonnegative. .coNP Functions @ uint-carray and @ int-carray .synb .mets (uint-carray << carray ) .mets (int-carray << carray ) .syne .desc The .code uint-carray and .code int-carray functions treat the storage bytes of the .meta carray object as the representation of an integer. The .code uint-carray function simply treats all of the bytes as a big-endian unsigned integer in a pure binary representation, and returns that integer, which is necessarily always nonnegative. The .code int-carray function treats the bytes as a two's complement representation. The returned number is negative if the first storage byte of .meta carray has a 1 in the most significant bit position: in other words, is in the range .code #x80 to .codn #xFF . In this case, the two's complement of the entire representation is calculated: all of the bits are inverted, the resulting positive integer is extracted. Then 1 is added to that integer, and it is negated. Thus, for example, if all of the bytes are .codn #xFF , the value -1 is returned. .coNP Functions @ fill-carray and @ put-carray .synb .mets (fill-carray < carray >> [ pos <> [ stream ]]) .mets (put-carray < carray >> [ pos <> [ stream ]]) .syne .desc The .code fill-carray and .code put-carray functions perform stream input and output, respectively, using the .code carray object as a buffer. The semantics of these functions is as follows. A temporary buffer is created which aliases the storage of .meta carray and this buffer is used as an argument in an invocation of, respectively, the buffer I/O function .code fill-buf or .codn put-buf . The value returned by the buffer I/O function is returned.
The .meta pos and .meta stream arguments are defaulted exactly in the same manner as by .code fill-buf and .codn put-buf , and have the same meaning. In particular, .meta pos indicates a byte offset into the .code carray object's storage, not an array index. .NP* C Non-Local Jumps \*(TL supports interfacing with modules that make use of the C .code setjmp and .code longjmp feature across their boundaries. It is possible to save a jump location in Lisp code with the .code setjmp macro, such that a foreign function can perform a .code longjmp to that saved context. The jump context buffer, known as the type .code jmp_buf in C, is modelled as a .code carray object whose element type is .codn uchar . The function .code jmp-buf returns such an object. Foreign functions that return a pointer to a .code jmp_buf may be suitably defined via .code deffi such that the pointer is mapped to a .code carray object whose element type is .codn uchar . The resulting object will then be usable as a jump buffer. The features described here are unsafe. When used in certain incorrect ways, the behavior is undefined. Using the .code setjmp macro and .code longjmp function as control primitives in Lisp code not interacting with foreign functions is strongly discouraged. There are situations in which the foreign function calling mechanism allocates temporary dynamic memory for converting between Lisp and C objects. These situations occur when objects are referenced by pointers, and so are outside of the stack-based argument space. In such a situation, if the foreign function performs a .code longjmp terminating in a .code setjmp macro in Lisp code, that temporary storage will leak. .coNP Function @ jmp-buf .synb .mets (jmp-buf) .syne .desc The .code jmp-buf function returns a new .code carray object suitable for use as a jump buffer with the .code setjmp macro and .code longjmp function.
.coNP Function @ longjmp .synb .mets (longjmp < jmp-buf << value ) .syne .desc The .code longjmp function restores the context saved into the .meta jmp-buf object by the .code setjmp macro. If that macro already terminated, the behavior is undefined. The .meta value must be an integer in range of the FFI type .codn int . That value will be observed in the .code setjmp form, as described. If .meta value is .code 0 (zero) the value .code 1 is used instead. This is a behavior of the underlying .code longjmp C library function. Note: a context abandoned via .code longjmp will not perform unwinding, similarly to .codn sys:abscond* . The form which is abandoned by .code longjmp should not be using scoped management of resources that relies on .code unwind-protect for clean-up. .coNP Macro @ setjmp .synb .mets (setjmp < jmp-buf < result-var < main-form << longjmp-form *) .syne .desc The .code setjmp macro saves the jump context into the .meta jmp-buf object, and evaluates the .meta main-form expression. If the .meta main-form expression terminates normally then the value it produces becomes the result of .codn setjmp , which terminates. If the .meta main-form performs a .code longjmp to the context saved in .metn jmp-buf , then that form is abruptly terminated, without performing any unwinding. Then, the zero or more .metn longjmp-form s are evaluated. The .code setjmp form terminates, yielding the value of the last .meta longjmp-form or else .codn nil . The .metn longjmp-form s are evaluated in a scope in which the .meta result-var symbol is bound as a variable, taking on the integer value passed to .codn longjmp , which is never zero. The .meta jmp-buf argument must be a .code carray object suitable for use as a jump buffer. The .meta result-var argument must be a bindable symbol. Once .code setjmp terminates, the contents of .meta jmp-buf become indeterminate. Any .code longjmp attempt using an indeterminate .meta jmp-buf is undefined behavior.
.TP* Example: .verb (let ((jb (jmp-buf))) (setjmp jb result (progn (put-line "setjmp") ;; "setjmp" is printed (longjmp jb 42)) (put-line `result is: @result`))) ;; "result is: 42" is printed .brev .IP Note: this example is for illustration only. Using .code setjmp and .code longjmp as Lisp control flow constructs in code not interacting with foreign functions is strongly discouraged. .SH* LISP COMPILATION .SS* Overview \*(TX supports two modes of processing of Lisp programs: evaluation and compilation. Expressions entered into the listener, loaded from source files via .codn load , processed by the .code eval function, or embedded into the \*(TX pattern language, are processed by the .IR evaluator . The evaluator expands all macros, and then interprets the program by traversing its raw syntax tree structure. It uses an inefficient representation of lexical variables consisting of heap-allocated environment objects which store variable bindings as Lisp association lists. Every time a variable is accessed, the chain of environments is searched for the binding. \*(TX also provides a compiler and virtual machine for more efficient execution of Lisp programs. In this mode of processing, top-level expressions are translated into the instructions of a Lisp-oriented virtual machine. The virtual machine language is traversed more efficiently compared to the traversal of the cons cells of the original Lisp syntax tree. Moreover, compiled code uses a much more efficient representation for lexical variables which doesn't involve searching through an environment chain. Lexical variables are always allocated on the stack (the native one established by the operating system). They are transparently relocated to dynamic storage only when captured by lexical closures, and without sacrificing access speed. \*(TX provides the function .code compile for compiling individual functions, both anonymous and named. File compilation is supported via the function .codn compile-file .
The function .code compile-toplevel is provided for compiling expressions in the global environment. This function is the basis for both .code compile and .codn compile-file . The .code disassemble function is provided to list the compiled code in a more understandable way; .code disassemble takes a compiled code object and decodes it into an assembly language presentation of its virtual-machine code, accompanied by a dump of the various information tables. File compilation via .code compile-file refers to a processing step whereby a source file containing \*(TL forms (typically named with a .code .tl file name suffix) is translated into an object file (named with a .code .tlo suffix) containing a compiled version of those forms. The compiled object file can then be loaded via the .code load function instead of the source file. Usually, loading the compiled file produces the same effect as if the source file were loaded. However, note that the behavior of compiled code can differ from interpreted code in a number of ways. Differences in behavior can be deliberately induced. Certain erroneous or dubious situations can also cause compiled code to behave differently from interpreted code. Compilation not only provides faster execution; compiled files also load much faster than source files. Moreover, they can be distributed unaccompanied by the source files, and resist reverse engineering. .SS* Top-Level Forms An important concept in file compilation via .code compile-file is that of the .IR "top-level form" , and how that term is defined. The file compiler individually processes top-level forms; for each such form, it emits a translated image. In the context of file compilation, a top-level form isn't simply any Lisp form which is not enclosed by another one. Rather, in this specific context, it has this specific definition, which allows some enclosed forms to still be considered top-level forms: .IP 1. 
If a form appearing in a \*(TL source file isn't enclosed in another form, it is a top-level form. .IP 2. If a .code progn form is a top-level form, then each of its constituent forms is also a top-level form. .IP 3. If a .code compile-only form is a top-level form, then each of its constituent forms is also a top-level form. .IP 4. If an .code eval-only form is a top-level form, then each of its constituent forms is also a top-level form. .IP 5. If a .code load-time form is a top-level form, then its argument is a top-level form. .IP 6. When a macro form is identified as a top-level form, it is macro-expanded as if by .code macroexpand before considering whether it contains top-level forms under rules 2\(en5. .IP 7. Rules 2\(en6 are applied recursively. .IP 8. No other forms are top-level forms. .RE .IP A top-level form is a .I primary top-level form if it doesn't contain any other top-level forms. This means that it is not a form based on any of the operators .codn progn , .code compile-only or .codn eval-only . Note that the constituent body forms of a .code macrolet or .code symacrolet top-level form are not individual top-level forms, even if the expansion of the construct combines the expanded versions of those forms with .codn progn . Note: the .code eval function implements a similar concept, specially recognizing .codn progn , .code compile-only and .code eval-only top-level forms, taking care to macro-expand and evaluate their constituents separately. In turn, the .code load function, when processing Lisp source, evaluates each primary top-level form as if by using the .code eval function. The result is that the behavior of loaded source and compiled files is consistent in this regard. .SS* File Compilation Model The file compiler reads each successive form from a file, performs a partial expansion on that form, then traverses it to identify all of the top-level forms which it contains.
Each top-level form is subject to three actions, either of the latter two of which may be omitted: compilation, execution and emission. Compilation refers to the translation to compiled form. Execution is the invocation of the compiled form. Emission refers to appending an externalized representation of the compiled form (its image) to the output which is written into the compiled file. By default, all three actions take place for every top-level form. Using the operators .code compile-only or .codn eval-only , execution or emission, or both, may be suppressed. If both are suppressed, then compilation isn't performed; the forms processed in this mode are effectively ignored. When a compiled file is loaded, the images of compiled forms are read from it and converted back to compiled objects, which are executed in sequence. Partial expansion means that file compilation doesn't fully expand each form that is encountered. Rather, an incremental expansion is performed, similar to the algorithm used by the .code eval function: .RS .IP 1. First, if .meta form is a macro, it is macro-expanded as if by an application of the function .codn macroexpand . .IP 2. If the resulting expanded form is a .codn progn , .codn compile-only , or .code eval-only form, then .code compile-file iterates over that form's argument expressions, compiling each expression recursively as if it were a separate expression. .IP 3. Otherwise, if the expanded form isn't one of the above three kinds of expressions, it is subject to a full expansion and compilation. .RE .IP Note: the structure of these three processing rules above closely resembles that of the three rules given in the description of the .code eval function, which is the basis for handling source files in .codn load . Consequently, macro expansion behaves consistently between .code compile-file and .code load of a source file. .SS* Treatment of Literals Programs specify not only code, but also data. 
Data embedded in a program is called .IR "literal data" . There are restrictions on what kinds of object may be used as literal data in programs subject to file compilation. Programs which stray outside of these restrictions will produce compiled files which load incorrectly or fail to load. Literal objects arise not only from the use of literals such as numbers, characters and strings, and not only from quoted symbols or lists. For instance, compiled forms which define or reference free variables or global functions require the names of these variables or functions to be represented as literals. An object used as a literal in file-compiled code must be .I externalizable which means that it has a printed representation which can be scanned to produce a similar object. An object which does not have a readable printed representation will give rise to a compiled file which triggers an exception. Literals which are themselves read from program source code naturally meet this restriction; however, with the use of macros, it is possible to embed arbitrary objects into program code. If the same object appears in two or more places in the code specified in a single file, the file compilation and loading mechanism ensures that the multiple occurrences of that object in the compiled file become a single object when the compiled file is loaded. For example, if macros are used in such a way that the compiled file defines a function which has a name generated by .codn gensym , and there are calls to that function throughout that file, this will work properly: the multiple occurrences of the gensym will appear as the same symbol. However: that symbol in the loaded file will not be identical to any other symbol in the \*(TX image; it will be newly allocated each time the compiled file is loaded. Interned symbols are recorded in a compiled file by means of their textual names and package prefixes.
When a compiled file is loaded, the interned symbols which occur as literals in it are entered into the specified packages under the specified names. The value of the .code *package* special variable has no influence on this. Circular structures in compiled literals are preserved; on loading, similar circular structures are reproduced. .SS* Treatment of the Hash-Bang Line \*(TX supports the hash-bang mechanism in compiled .code .tlo files, thereby allowing compiled scripts to be executable. When a source file begins with the .code #! (hash bang) character sequence, the file compiler propagates that line (all characters up to and including the terminating newline) to the compiled file, subject to the following transformation steps: .IP 1 The line is divided into arguments, on the assumption that they are separated by exactly one space. .IP 2 Then, all occurrences of the argument .str --lisp are replaced by .strn --compiled . .IP 3 Next, all arguments which end in the suffix .str "txrlisp" have that suffix replaced by .strn "txrvm" . .IP 4 The hash bang line is then reconstituted by joining the arguments with a single space. .PP Furthermore, certain permissions are propagated from a hash-bang source file to the target file. If the source file is executable to its owner, then the target file is made executable as if by using .code chmod with the .code +x mode: all the executable permissions that are allowed by the current .code umask are enabled on the target file. If the target file is thus being marked executable, then additional permissions are also treated as follows. If the target file has the same owner as the source file, and the source file's setuid permission bit is set, then this is propagated to the target file. Similarly, if the target file has the same group owner as the source file, and the source file's group execute bit and setgid permission bit are set, then the setgid bit is set on the target file.
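.TP* Example:

The four transformation steps applied to the hash-bang line can be modeled
by the following sketch. It is only an approximation written for illustration
and is not the code used by the file compiler: the function name
.code hash-bang-xform
is hypothetical, and the terminating newline of the line is assumed to have
been removed already.

.verb
  (defun hash-bang-xform (line)
    (let ((args (split-str line " ")))          ;; step 1: split on one space
      (cat-str (mapcar (lambda (arg)
                         (cond
                           ;; step 2: --lisp becomes --compiled
                           ((equal arg "--lisp") "--compiled")
                           ;; step 3: txrlisp suffix becomes txrvm
                           ((ends-with "txrlisp" arg)
                            (cat-str (list (sub-str arg 0 -7) "txrvm") ""))
                           (t arg)))
                       args)
               " ")))                           ;; step 4: rejoin with spaces

  (hash-bang-xform "#!/usr/bin/txr --lisp")
  ;; -> "#!/usr/bin/txr --compiled"
  (hash-bang-xform "#!/usr/bin/env txrlisp")
  ;; -> "#!/usr/bin/env txrvm"
.brev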
.SS* Compiled File Compatibility \*(TX's virtual-machine architecture for executing compiled code is evolving, and that evolution has implications for the compatibility between compiled files and the \*(TX executable image. The basic requirement is that a given version of \*(TX can load and execute the compiled files which that same version has produced. Furthermore, these files are architecture-independent, except that their encoding is in the local byte order ("endianness") of the host machine. The byte order is explicitly indicated in the files, and the .code load function resolves it. Thus a file produced by \*(TX running on a 64-bit big-endian Power PC can be loaded by \*(TX running on 32-bit x86, which is little endian. A given \*(TX version may also be capable of loading files produced by an older version, or even ones produced by a newer version. Whether this is possible depends on the versions involved. Furthermore, there is a general issue at play: code compiled by newer versions of \*(TX may require functions that are not present in older versions, preventing that code from running. Newer \*(TX may support new syntax not recognized by older \*(TX, and that syntax may end up in compiled files. Compiled files contain a minor and major version number (which is independent of the \*(TX version). The .code load function examines these numbers and decides whether the file is loadable, or whether it must be rejected. The first version of \*(TX which featured the compiler and virtual machine was 191. Older versions therefore cannot load compiled files. Versions 191 and 192 produce version 1 compiled files, and load only that version. Versions 193 through 198 produce version 2 compiled files and load only that version. Version 199 produces version 3 files and loads versions 2 and 3. Versions 200 through 215 produce version 4 files and load versions 2, 3 and 4. 
Versions 216 through 243 produce version 5.0 files and load versions 2, 3, 4 and 5, regardless of minor version. Versions 244 through 251 produce version 5.1 files and load versions 2, 3, 4 and 5, regardless of minor version. Versions 252 through 259 produce version 6.0 files and load only version 6, regardless of minor version. Versions 260 through 297 produce version 7.0 files and load versions 6 and 7, regardless of minor version. Version 261 introduces JSON .code #J syntax. Compiled code which contains embedded JSON literals is not loadable by \*(TX 260 and older. .SS* Recommendations for Unused Variable Diagnostics By default, the .code unused diagnostic option is enabled in .codn *compile-opts* , causing unused variables to be diagnosed. The first step in resolving an unused variable diagnostic is to determine whether it is caused by a bug in the code. If so, the resolution is to address the bug. If the situation isn't a bug, then the diagnostic is a false positive, and may be silenced. There are multiple ways to do that, seven of which are given here: .IP 1. .BR "Disable the diagnostic" : for instance, compile .str foo.tl with unused warnings disabled: .verb (with-compile-opts (nil unused) (compile-file "foo.tl")) .brev .IP 2. .B "Use the" .code ignore .BR function : the compiler specially recognizes the .code ignore function such that when any of its arguments are lexical variables, they are marked used: .verb (defun stub-function (arg1 arg2) (ignore arg1 arg2)) .brev Note that an .code ignore call may be elided if it occurs in dead code, in which case it won't have the right effect: .verb (defun unused-arg (arg) ;; diagnosed as unused (when (= (+ 2 2) 5) (ignore arg) ;; wrongly placed (dead-code))) (defun unused-arg (arg) ;; no diagnostic (ignore arg) ;; correctly placed (when (= (+ 2 2) 5) (dead-code))) .brev .IP 3. .B "Use the" .code use .BR function .
In the following code, parameter .code arg is diagnosed as unused on platforms in which the equality being tested is false, since the expression is constant. In situations like this, the variable is not unused, but only conditionally so. Therefore the name of the .code ignore function doesn't express the intent very well. The .code use function may be stylistically preferred: .verb (defun platform-specific-action (arg) (use arg) (if (eql (sizeof wchar) 2) (do-something arg))) .brev However, unlike .codn ignore , .code use takes exactly one argument, and returns that argument rather than .codn nil . .IP 4. .BR "Use an uninterned symbol: " unused variable diagnostics are not reported against variables named by uninterned symbols. .verb (lambda (x y) y) ;; unused x diagnosed (lambda (#:x y) y) ;; no diagnostic .brev .IP 5. .BR "In destructuring and pattern matching, put catch-all variable to use" : Examples: .verb (tree-case obj ((a b) (calculate-something a b)) (else (transform obj))) ;; unused else (tree-case obj ((a b) (calculate-something a b)) (else (transform else))) ;; diagnostic gone (match-case obj ((@a @b) (calculate-something a b)) (@else (transform obj))) ;; unused else (match-case obj ((@a @b) (calculate-something a b)) (@else (transform else))) ;; diagnostic gone .brev .IP 6. In pattern matching, use the .code @nil pattern: .verb (match-case obj ((@a @nil) (calculate-something a)) (@nil (transform obj))) .brev .IP 7. .B "In macro parameter lists use t symbol" : in macro-style parameter lists, any variable may be replaced by the .code t symbol to consume a value without binding a variable. This is intended for suppressing unused variable warnings: .verb (defmacro foo (x y) ;; y unused ^(a b c ,x)) (defmacro foo (x t) ;; no diagnostic ^(a b c ,x)) (tree-bind (a . (b . c)) obj ;; a, b, unused c) (tree-bind (t . (t . 
c)) obj ;; no diagnostic c) .brev .SS* Semantic Differences between Compilation and Interpretation The .code compile-only and .code eval-only operators can be used to deliberately produce code which behaves differently when compiled and interpreted. In addition, unwanted differences in behavior can also occur. The situations are summarized below. .coNP Differences Due to @ load-time Forms evaluated by .code load-time are treated differently by the compiler. When a top-level form is compiled, its embedded .code load-time forms are factored out such that the compiled image of the top-level form will evaluate these forms before other evaluations take place. The interpreter doesn't perform this factoring; it evaluates a .code load-time form when it encounters it for the first time. .coNP Treatment of literals The compiler identifies multiple occurrences of equivalent strings and bignum integers that occur as literals, and condenses each one to a single instance, within the scope of the compilation. The scope is possibly as wide as a file. If the literal .str abc appears in multiple places in the same file that is processed by .codn compile-file , in the resulting compiled file, there may be just a single .str abc object. For instance, if the file contains two functions: .verb (defun f1 () "abc") (defun f2 () "abc") .brev when compiled, these will return the same object such that .verb (eq (f1) (f2)) -> t .brev No such de-duplication is performed for interpreted code. Consequently, code which depends on multiple occurrences of these objects to be distinct objects may behave correctly when interpreted, but misbehave when compiled. Or vice versa. One example is code which modifies a string literal. Under compilation, the change will affect all occurrences of that literal that have been merged into one object. 
Another example is an expression like .codn "(eq \(dqabc\(dq \(dqabc\(dq)" , which yields .code nil under interpretation because the two strings are distinct objects in spite of appearing side by side in the syntax, but .code t when compiled, since they denote the same string object. In the future, objects other than strings and bignums may be similarly consolidated, such as lists and vectors, which means that interpreted code which works today when compiled may misbehave in the future. Note that objects which are literally notated in source code are not the only kinds of objects considered to be literals. Objects which are constructed by macros and inserted into macro-expansions are also literals. Literals are self-evaluating objects that appear as expressions in the syntax which remains after macro-expansion, as well as arguments of the .code quote operator. If a macro calculates a new string each time it is expanded, and inserts it into the expansion as a literal, the compiler will identify and consolidate groups of such strings that are identical. .coNP Treatment of symbols A source file may contain unqualified symbol tokens which are interned in the current package. In contrast, a compiled file encodes symbols with full package qualification. When a compiled file is loaded, the current package at that time has no effect on the symbols in the compiled file, even if those symbols were specified as unqualified in the original source file. This difference can lead to surprising behaviors. Suppose a source file contains references to functions or variables or other entities which do not exist. Furthermore, suppose the entities were referenced, in that file, using unqualified symbols which didn't exist, and were expected to come from a different package from the one where they ended up interned. For instance, suppose the file is being processed in a package called .code abc and is expecting to use a function .code calc which should come from the .code xyz package.
Unfortunately, no such symbol exists. Therefore, the symbol is interned as .code abc:calc and not .codn xyz:calc . In that case, it should be sufficient to ensure that the .code xyz:calc function exists, and then reload the source file. The unqualified symbol token .code calc in that file will be correctly resolved to .code xyz:calc this time. However, if the file is compiled, reloading will not be sufficient. Even though the symbol .code xyz:calc exists, the file will continue to try to refer to a function using the symbol .code abc:calc which comes from a fully qualified representation stored in the compiled file. The file will have to be recompiled to fix the issue. .coNP Treatment of unbound variables Unbound variables are treated differently by the compiler. A reference to an unbound variable is treated as a global lexical access. This means that if a variable access is compiled first and then a .code defvar is processed which introduces the variable as a dynamically scoped ("special") variable, the compiled code will not treat the variable as special; it will refer to the global binding of the variable, even when a dynamic binding for that variable exists. The interpreter treats all variable references that do not have lexical bindings as referring to dynamic variables. The compiler treats a variable as dynamic if a .code defvar has been processed which marked that variable as special. .coNP Unbound Symbols in @ dwim Arguments of a .code dwim form (or the equivalent bracket notation) which are unbound symbols are treated differently by the compiler. The code is compiled under the assumption that all such symbols refer to global functions. For instance, if neither .code f nor .code x is defined, then .code "[f x]" will be compiled under the assumption that they are functions. If they are later defined as variables, the compiled code will fail because no function named .code x exists.
The interpreter resolves each symbol in a .code dwim form at the time the form is being executed. If a symbol is defined as a variable at that time, it is accessed as a variable. If it is defined as a function, it is accessed as a function. .coNP Bound symbols in @ dwim The symbolic arguments of a .code dwim form that refer to global bindings are also treated differently by the compiler. For each such symbol, the compiler determines whether it refers to a function or variable and, further, whether the variable is global lexical or special. This treatment of the symbol is then cemented in the compiled code; the compiled code will treat that symbol that way regardless of the run-time situation. By contrast, the interpreter performs this classification each time the arguments of a .code dwim form are evaluated. The rules are otherwise the same: if the symbol is bound as a variable, it is treated as a variable. If it is bound as a function, it is treated as a function. If it has both bindings, it is treated as a variable. The difference is that this is resolved at compile time for compiled code, and at evaluation time for interpreted code. .coNP File-Wide Insertion of Gensyms The following degenerate situation occurs, illustrated by example. Suppose the following definitions are given: .verb (defvarl %gensym%) (defmacro define-secret-fun ((. args) . body) (set %gensym% (gensym)) ^(defun ,%gensym% (,*args) ,*body)) (defmacro call-secret-fun (. args) ^(,%gensym% ,*args)) .brev The idea is to be able to define a function whose name is an uninterned symbol and then call it. An example module might use these definitions as follows: .verb (define-secret-fun (a) (put-line `a is @a`)) (call-secret-fun 42) .brev The effect is that the second top-level form calls the function, which prints a message containing 42 to standard output. This works both interpreted and compiled with .codn compile-file . Each of these two macro calls generates a top-level form into which the same gensym is inserted.
This works under file compilation due to a deliberate strategy in the layout of compiled files, which allows such uses. Namely, the file compiler combines multiple top-level forms into a single object, which is read at once, and which uses the circle notation to unify gensym references. However, suppose the following change is introduced: .verb (define-secret-fun (a) (put-line `a is @a`)) (defpackage foo) ;; newly inserted form (call-secret-fun 42) .brev This still works when interpreted, and compiles successfully. However, when the compiled file is loaded, the compiled version of the .code call-secret-fun form fails with an error complaining that the .code #:g0039 (or other gensym name) function is not defined. This is because for this modified source file, the file compiler is not able to combine the compiled forms into a single object. It would not be correct to do so in the presence of the .code defpackage form, because the evaluation of that form affects the subsequent interpretation of symbols. After the package definition is executed, it is possible for a subsequent top-level form to refer to a symbol in the .code foo package such as .codn foo:bar , which would be erroneous if the package didn't exist. The file compiler therefore arranges for the compiled forms after the .code defpackage to be emitted into a separate object. But that division in the output file consequently prevents the occurrences of the gensym from resolving to the same symbol object. In other words, the strategy for allowing global gensym use is in conflict with support for forms which have a necessary read-time effect such as .codn defpackage . The solution is to rearrange the file to unravel the interference, or to use interned symbols instead of gensyms. .coNP Delimited Continuations There are differences in behavior between compiled and interpreted code with regard to delimited continuations. This is covered in the Delimited Continuations section of the manual.
.SS* Compilation Library .coNP Function @ compile-toplevel .synb .mets (compile-toplevel < form << expanded-p ) .syne .desc The .code compile-toplevel function takes the Lisp form .meta form and compiles it. The return value is a .I "virtual-machine description" object representing the compiled form. This object isn't of function type, but may be invoked as if it were a function with no arguments. Invoking the compiled object is expected to produce the same effect as evaluating the original .meta form using the .code eval function. The .meta expanded-p argument indicates that .meta form has already been expanded and is to be compiled without further expansion. If .meta expanded-p is .codn nil , then it is subject to a full expansion. Note: in spite of the name, .code compile-toplevel makes no consideration whether or not .meta form is a "top-level form" according to the definition of that term as it applies to .code compile-file processing. Note: a form like .code "(progn (defmacro foo ()) (foo))" will not be processed by .code compile-toplevel in a manner similar to the processing by .code eval or .codn compile-file . In this example, the .code defmacro form will not be evaluated prior to the expansion of .code "(foo)" (and in fact not evaluated at all) and so the latter expression isn't correctly referring to that macro. The form .code "(progn (macro-time (defmacro foo ())) (foo))" can be processed by .codn compile-toplevel ; however, the macro definition now takes place during expansion, and isn't compiled. The .code compile-file function has no such issue when it encounters such a form at the top-level, because that function will consider a top-level .code progn form to consist of multiple top-level forms that are compiled individually, and also executed immediately after being compiled.
.TP* Example .verb ;; compile (+ 2 2) form and execute to calculate 4 ;; (defparm comp (compile-toplevel '(+ 2 2))) (call comp) -> 4 [comp] -> 4 .brev .coNP Function @ compile .synb .mets (compile << function-name ) .mets (compile << lambda-expression ) .mets (compile << function-object ) .syne .desc The .code compile function compiles functions. It can compile named functions when the argument is a .metn function-name . A function name is a symbol denoting an existing interpreted function, or compound syntax such as .mono .meti (meth < type << name ) .onom to refer to methods. The code of the interpreted function is retrieved, compiled in a manner which produces an anonymous compiled function, and then that function replaces the original function under the same name. If the argument is a lambda expression, then that function is compiled. If the argument is a function object, and that object is an interpreted function, then its code and lexical environment are retrieved and compiled. In all cases, the return value of .code compile is the compiled function. Note: when an interpreted function object is compiled, the compiled environment does not share bindings with the original interpreted environment. Modifications to the bindings of either environment have no effect on the other. However, the objects referenced by the bindings are shared. Shared bindings may be arranged using the .code hlet or .code hlet* macros. .coNP Functions @ compile-file and @ compile-update-file .synb .mets (compile-file < input-path <> [ output-path ]) .mets (compile-update-file < input-path <> [ output-path ]) .syne .desc The .code compile-file function reads forms from an input file, and produces a compiled output file. First, .meta input-path is converted to a .I "tentative pathname" as follows. If .meta input-path specifies a pure relative pathname, as defined by the .code pure-rel-path-p function, then a special behavior applies. 
If an existing load operation is in progress, then the special variable .code *load-path* has a binding. In this case, .code compile-file will assume that the relative pathname is a reference relative to the directory portion of that pathname. If .code *load-path* has the value .codn nil , then a pure relative .meta input-path pathname is used as-is, and thus resolved relative to the current working directory. The tentative pathname is converted to an .I "actual input pathname" as follows. Firstly, if the tentative pathname ends with one of the suffixes .code .tl or .code .txr then it is considered suffixed, otherwise it is considered unsuffixed. If it is suffixed, then the actual pathname is the same as the tentative pathname. In the unsuffixed case, two possible actual input pathnames are considered. First, if the unsuffixed path refers to a file that can be opened, then that unsuffixed path is taken as actual path. Otherwise, the suffix .code .tl is added to the tentative pathname, and that becomes the actual path. If the actual path ends in the suffix .code .txr then the behavior is unspecified. If the .meta output-path parameter is given an argument, then that argument specifies the output path. Otherwise the output path is derived from the tentative input path as follows. If the tentative input path is unsuffixed, then .code .tlo is added to it to produce the output path. Otherwise, the suffix is removed from the tentative input path and replaced with the .code .tlo suffix. The .code compile-file function binds the variables .code *load-path* and .code *package* similarly to the .code load function. Over the compilation of the input file, .code compile-file establishes a new dynamic binding for several special variables. The variable .code *load-path* is given a new binding containing the actual input pathname. The .code *package* variable is also given a new dynamic binding, whose value is the same as the existing binding.
Thus if the compilation of the file has the side effect of altering the value of .codn *package* , that effect will be undone when the binding is removed after the compilation completes. Compilation proceeds according to the File Compilation Model. If the compilation process fails to produce a successful translation for each form in the input file, the output file is removed. The .code compile-update-file function differs from .code compile-file in the following regard: compilation is performed only if the input file is newer than the output file, or else if the output file doesn't exist. The .code compile-file function always returns .code t if it terminates normally, which occurs if it successfully translates every form in the input file, depositing the translation into the output file. If compilation fails, .code compile-file terminates by throwing an exception. The .code compile-update-file function returns .code t if it successfully compiles, similarly to .codn compile-file . If compilation is skipped, the function returns .codn nil . Note: the following idiom may be used to load a file, compiling it if necessary: .verb (or (compile-update-file "file") (load-file "file")) .brev However, note that it relies on the effect of compiling a source file being the same as the effect of loading the compiled file. This can only be true if the source file contains no .code compile-only or .code eval-only top-level forms. Two or more compiled files that are compiled by the same version of \*(TX may be catenated together to produce a single .code .tlo file. Such a file may be loaded by the .code load function. The behavior of loading such a file may differ from loading the individual files, because such a .code load is treated as a single operation. .coNP Special Variable @ *opt-level* .desc The special variable .code *opt-level* provides control over compiler optimizations. The variable takes on integer values. If the value is .codn nil , it is interpreted as zero.
The meaningful range is from 0 to 7. The initial value of the variable is 7. The meanings of the values are as follows: .RS .IP 0 Almost all optimizations are disabled, except for some strength reductions of instances of the .code equal function, to take advantage of certain conditional instructions. .IP 1 Constant folding is applied, as well as algebraic reductions to list processing and arithmetic code. Two-argument calls to several common arithmetic operators are translated into calls to more efficient two-argument internal functions. .IP 2 Blocks which can be easily confirmed not to be used as exit points are removed. Variable frames in which no lexically captured variables are bound, and no dynamic variables are bound, are eliminated. .IP 3 Lambda expressions and calls to combinator functions such as .code chain and .code andf are lifted to load time, if possible. .IP 4 Control flow optimizations are applied: jump threading and elimination of unreachable code. Some peephole optimizations are applied to improve certain instruction patterns. .IP 5 Data flow optimizations are applied, such as elimination of dead register moves, or useless propagations of values from one register to another. More peephole optimizations are applied. .IP 6 Additional iterations of the levels 4 and 5 optimizations are performed, if the previous iterations have coalesced some basic blocks of the program graph. Also, at this level, .code chain expressions containing lambdas are inlined, eliminating the closures. These expressions arise out of .code opip syntax and its derivatives. .IP 7 Certain more rarely applicable optimizations are applied which reduce code size by merging some identical code blocks, or improving some more rarely occurring instruction patterns. .RE .coNP Function @ clean-file .synb .mets (clean-file << path ) .syne .desc The .code clean-file function removes a previously compiled file associated with .metn path , if such a file exists.
In situations when it successfully removes a file, it returns .codn t , otherwise .codn nil . The function may also throw an exception, in situations such as encountering a nonexistent directory component or permission problem. First, if .meta path specifies a pure relative pathname, as defined by the .code pure-rel-path-p function, and if the .code *load-path* variable contains a value other than .codn nil , then .code clean-file calculates the directory name of .code *load-path* as if by using .code dir-name and catenates that directory name with .meta path to produce an intermediate path. Otherwise .meta path is considered to be the intermediate path. Next, the suffix of the intermediate path is examined. If it ends with .str .tlo or .strn .tlo.gz , then an attempt is made to remove that path, and the function terminates. If the intermediate path ends with .str .tl or .strn .txr , then two attempts are made to remove a file: first, the suffix is replaced with .str .tlo and that is attempted to be removed. If that fails due to non-existence, then the suffix .str .tlo.gz is tried. Otherwise, if the intermediate path doesn't have any of the above suffixes, then an attempt is made to remove the path with the .str .tlo suffix added, and then with the .str .tlo.gz suffix added. Note: no attempt is made to remove the unmodified intermediate path itself, except in the cases when it ends with .str .tlo or .strn .tlo.gz , because that risks removing a source file rather than a compiled file. .coNP Macro @ with-compilation-unit .synb .mets (with-compilation-unit << form *) .syne .desc When a file is processed by .codn compile-file , certain actions, such as the issuance of diagnostics about undefined functions and variables, are delayed until the file is completely processed. The .code with-compilation-unit macro allows these actions to be collectively deferred until multiple files are completely processed. 
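For instance, the following minimal sketch compiles two files as one compilation unit, so that diagnostics about undefined functions and variables are issued only after both files have been processed. The file names
.str foo.tl
and
.str bar.tl
are hypothetical, chosen only for illustration:

.verb
  ;; deferred diagnostics from both compile-file calls
  ;; are held until the with-compilation-unit form ends
  (with-compilation-unit
    (compile-file "foo.tl")
    (compile-file "bar.tl"))
.brev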
The macro evaluates each enclosed .meta form in a single compilation environment. After the last .meta form is evaluated, deferred actions of any enclosed .code compile-file forms are performed, and then the value of the last .meta form is returned. It is permissible to nest .code with-compilation-unit forms, lexically or dynamically. The outermost invocation of .code with-compilation-unit dominates; all deferred .code compile-file actions are held until the outermost enclosing .code with-compilation-unit terminates. .coNP Operators @ compile-only and @ eval-only .synb .mets (compile-only << form *) .mets (eval-only << form *) .syne .desc These operators take on a special behavior only when they appear as top-level forms in the context of file compilation. When a .code compile-only or .code eval-only form is processed by the evaluator rather than the compiler, or when it is processed outside of file compilation, or when it appears as other than a top-level form even under file compilation, then these operators behave in a manner identical to .codn progn . When a .code compile-only form appears as a top-level form under file compilation, it indicates to the file compiler that the .metn form s enclosed in it are not to be evaluated. By default, the file compiler executes each top-level form after compiling it. The .code compile-only operator suppresses this evaluation. When an .code eval-only form appears as a top-level form under file compilation, it indicates to the file compiler that the .metn form s enclosed in it are not to be emitted into the output file. By default, the file compiler includes the compiled image in the output written to the output file. The .code eval-only operator suppresses this inclusion. Forms which are surrounded by both an .code eval-only form and a .code compile-only form are neither executed nor emitted into the output file. In this situation, the forms are skipped entirely; no compilation takes place.
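The combined case may be sketched as follows. This is an illustration only; the message text is arbitrary. Under file compilation, the enclosed form is skipped entirely, whereas when the source file is processed by the evaluator, both operators behave like
.code progn
and so the enclosed form is evaluated:

.verb
  ;; under compile-file: neither executed nor emitted
  ;; when the source file is interpreted: prints the line
  (compile-only
    (eval-only
      (put-line "seen only by the interpreter")))
.brev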
.TP* Notes: The .code compile-file function not only compiles, but also executes every form for the following reason: the correct compilation of forms can depend on the execution of earlier forms. For instance, code may depend on macros. Macros may in turn depend on functions and variables. All those definitions are required in order to compile the dependent code. Those dependencies may be in a separate file which is loaded by a .code load form; that .code load form must be executed. Note that execution of a form implies that the .code load-time forms that it contains are evaluated (prior to other evaluations). Suppression of the execution of a form also suppresses the evaluation of .code load-time forms. Situations in which .code compile-only is useful are those in which it is desirable to stage the execution of some top-level form into the compiled file, and not have it happen during compilation. For instance: .verb ;; in a main module (compile-only (start-application)) .brev It is not desirable to have the file compiler try to start the application as a side effect of compiling the main module. The right behavior is to compile the .code "(start-application)" top-level form so that this will happen when that module is loaded. Situations in which .code eval-only is useful are those which call for forms that have a compile-time effect only, and are not propagated into the compiled file. For example, since the correct treatment of literal symbols occurring in a compiled file does not depend on the .code *package* variable, in many cases, the .code in-package invocation in the file can be wrapped with .codn eval-only : .verb (eval-only (in-package app)) .brev The .code in-package form must be evaluated during compilation so that the remaining forms are read in the correct package. However the loading of the compiled versions of those forms doesn't require that package to be in effect; thus a compiled image of the .code in-package form need not appear in the compiled file.
Macro definitions may be treated with .code eval-only if the intent is only to make the expanded code available in the compiled file, and not to propagate compiled versions of the macros which produced it. .coNP Structure @ compile-opts .synb .mets (defstruct compile-opts () .mets \ \ shadow-fun shadow-var shadow-cross unused .mets \ \ log-level constant-throws) .syne .desc The .code compile-opts structure represents compiler options: its slots are variables which affect compiler behavior. The compiler expects the special variable .code *compile-opts* to hold a .code compile-opts structure. It is recommended to manipulate options using the .code with-compile-opts macro. Currently, all of the options are diagnostic. In the future, there may be other kinds of options. Diagnostic options which are Boolean take on the values .codn nil , .codn t , .code :warn or .codn :error . Numeric options take integer values. The .code t and .code :warn values are synonyms. A value of .code nil means that the option is disabled. The .code t and .code :warn values mean that the diagnostic controlled by the option will be emitted as a warning. The .code :error value indicates that the diagnostic will be an error. The slots of .code compile-opts are as follows: .RS .coIP shadow-fun Diagnostic option, off by default. This option controls whether a diagnostic is emitted whenever a lexical function shadows another lexical function, a global function or a global macro. Note: shadowing of local macros is not diagnosed, because the compiler operates on code in which macros no longer exist. .coIP shadow-var Diagnostic option, off by default. This option controls whether a diagnostic is emitted whenever a lexical variable shadows another lexical variable, a global variable, or a global macro. Note: shadowing of local macros is not diagnosed, because the compiler operates on code in which macros no longer exist. Note: special variables are not diagnosed for shadowing.
.coIP shadow-cross Diagnostic option, off by default. This option controls whether a diagnostic is emitted whenever a lexical function shadows a variable, or vice versa: whether a lexical variable shadows a function. .coIP unused Diagnostic option, set to warn by default. This option controls whether a diagnostic is emitted whenever a lexical variable is defined whose value is not used. Variables whose names are uninterned symbols are exempt from this diagnostic. The rationale is that uninterned symbols are used for naming machine-generated variables, in generated code such as macro expansions. Situations in which a machine-generated variable is unused arise fairly often, and are not the result of a programming error. For the purpose of this diagnostic, what constitutes use of a variable is an access to its value, which isn't optimized away before being noted. Storing a value isn't use. An example of an access which is optimized away before being noted is an access which occurs in trivially dead code: for instance .code "(if nil a)" does not access .code a because the compiler discards the .meta then expression of an .code if whose .meta test expression is constantly false. The discarded expression is never traversed in a way that would cause it to be noted as accessing the .code a variable. .coIP log-level Diagnostic option, .code nil by default. When set to a positive integer value, it enables logging, with increasing values implying more detailed logging. The value 1 causes .code compile-file and .code compile-update-file to emit an informational message whenever a file is compiled. The value 2 causes informational messages to be emitted for each top-level form that is compiled, if it is a compound form beginning with a symbol. .coIP constant-throws Diagnostic option, .code t by default. Controls whether the compiler issues diagnostics when it encounters a constant expression, whose evaluation throws an exception, such as .codn "(/ 0 0)" .
.RE .coNP Special Variable @ *compile-opts* .desc The special variable .code *compile-opts* holds a value of type .code compile-opts which is a structure type. It is recommended to manipulate options using the .code with-compile-opts macro. .coNP Macro @ with-compile-opts .synb .mets (with-compile-opts >> {( value << option *) | << form }*) .syne .desc The .code with-compile-opts macro takes zero or more arguments. Each argument is either a clause which affects compiler options or else an ordinary .meta form which is processed in a context in which the .code *compile-opts* variable has been affected by all of the previous clauses. It is unspecified whether the clauses operate destructively on .code *compile-opts* or freshly bind it. However, the macro dynamically binds .code *compile-opts* at least once, so that when it terminates, its previous value is restored. This binding is performed using .codn compiler-let . When .code with-compile-opts occurs in code processed by the compiler, all of the clause-driven compile option manipulation is performed in the compiler's own context. The changes to the .code *compile-opts* variable are not visible to the code being compiled. Thus the macro may be used to transparently change compiler options over individual subexpressions in compiled code. When .code with-compile-opts occurs in interpreted code, the manipulations of .code *compile-opts* are visible to the .metn form s. This allows interpreted build steps to configure compiler options around functions such as .codn compile-file . The clauses which operate on options have list syntax consisting of a value followed by one or more symbols which must be the names of options which are compatible with that value. The clause indicates that all those options take on that value. The possible values are: .codn nil , .codn t , .code :warn and .codn :error . These values are documented under the description of the .code compile-opts structure.
.TP* Example: The following expression specifies that the file .str foo.tl is to be compiled with function and variable shadowing treated as errors, but unused variable checking disabled. Then .str bar.tl is to be compiled with unused variable checking enabled. .verb ;; this form must be interpreted in order for ;; the compile-file call to "see" the effect of the ;; option manipulation. (with-compile-opts (:error shadow-var shadow-fun) (nil unused) (compile-file "foo.tl") (:warn unused) (compile-file "bar.tl")) ;; when the following form is compiled, the unused ;; variable warning will be disabled just around ;; the (let (y) x). (lambda (x) (with-compile-opts (nil unused) (let (y) x))) ;; Show detailed traces of what forms are ;; compiled in these two files. (with-compile-opts (2 log-level) (compile-file "foo.tl") (compile-file "bar.tl")) .brev .coNP Operator @ compiler-let .synb .mets (compiler-let >> ({( sym << init-form )}*) << body-form *) .syne .desc The .code compiler-let operator strongly resembles .code let* but has different semantics, relevant to compilation. It also has a stricter syntax in that variables may not be symbols without a .metn init-form : only variable binding specifications of the form .mono .meti >> ( sym << init-form ) .onom are allowed. Symbols bound using .code compiler-let are expected to be special variables. For every .metn sym , the expression .mono .meti (special-var-p << sym ) .onom should be true. The behavior is unspecified for any .meta sym which doesn't name a special variable. When the compiler encounters the .code compiler-let construct, the compiler itself establishes a dynamic scope in which the implied special variable bindings are in effect. This effect is not incorporated into the compiled code. The compiler then implicitly places the .metn body-form s into a .code progn form, and compiles that form. While the implicit .code progn is being compiled, the dynamic bindings established by .code compiler-let are in scope. 
Thus .code compiler-let may be used to bind special variables which influence compiler behavior. The .code compiler-let form is treated like .code let* by the interpreter, provided that every .meta sym names a special variable. .coNP Macro @ load-time .synb .mets (load-time << form ) .syne .desc The .code load-time macro makes it possible for a program to evaluate a form, such that, subsequently, the value of that form is then treated as if it were a literal object. Literals are pieces of the program syntax which are not evaluated at all. On the other hand, the values of expressions are not literals. From time to time, certain situations benefit from the program being able to perform an evaluation, and then have the result of that evaluation treated as a literal. The .code macro-time macro makes this possible in its particular manner: that macro allows one or more expressions to be evaluated during macro expansion. The result of the .code macro-time is then quoted and substituted in place of the expression. That result then appears as a true literal to the executing code. The .code load-time macro similarly arranges for the single form .meta form to be evaluated. However, this evaluation doesn't take place at expansion time. It is delayed until the program executes. What exactly "delayed until the program executes" means depends on whether .code load-time is used in compiled or interpreted code, and in what situation it is compiled. If the .code load-time form appears in interpreted code, then the exact time when .meta form is evaluated is unspecified. The evaluator may identify all .code load-time forms which occur anywhere in a top-level expression, and perform their evaluations immediately, before evaluating the form itself. Then, when the .code load-time forms are encountered again during the evaluation of the form, they simply retrieve the previously evaluated values as if they were literal. 
Or else, the evaluation may be performed late: when the .code load-time form itself is encountered during normal evaluation. In that case, .meta form will still be evaluated only once and then its value will be inserted as a literal in subsequent reevaluations of that .code load-time form, if any. If a .code load-time form appears in a non-top-level expression which is compiled, the compiler arranges for the compiled version of .meta form to be executed when the compiled version of the entire expression is executed. This execution occurs early, before the execution of forms that are not wrapped in .codn load-time . The value produced by .meta form is entered into the static data vector associated with the compiled top-level expression, which also holds ordinary literals. Whenever the value of that .code load-time form is required, the compiled code references it from the data vector as if it were a true literal. When a .code load-time top-level form is processed by .codn compile-file , it has no unusual semantics; the effect is that it is replaced by its argument .metn form , which is in that case also considered a top-level form. The implications of the translation scheme may be understood separately from the perspective of code processed with .codn compile-toplevel , .code compile and .codn compile-file . A .code load-time form appearing in a form passed to .code compile-toplevel is translated such that its embedded .meta form will be executed each time the virtual-machine description returned by .code compile-toplevel is executed, and the execution of all such forms is placed ahead of other code. A .code load-time form appearing in an interpreted function which is processed by .code compile is evaluated immediately, and its value becomes a literal in the compiled version of the function. 
A .code load-time form appearing as a non-top-level form inside a file that is processed by .code compile-file is compiled along with that form and deposited into the object file. When the object file is loaded, each compiled top-level form is executed. Each compiled top-level form's .code load-time calculations are executed first, and the corresponding .meta form values become literals at that point. This execution order is individually ensured for each top-level form. Thus, the .code load-time forms in a given top-level form may rely on the side-effects of prior top-level forms having taken place. Note that, by default, .code compile-file also immediately executes each top-level form which it compiles and deposits into the output file. This execution is equivalent to a load; it causes .code load-time forms to be evaluated. The .code compile-only operator must be used around .code load-time forms which must be evaluated only when the compiled file is loaded, and not at compile time. In all situations, the evaluation of .meta form takes place in the global environment. Even if the .code load-time form is surrounded by constructs which establish lexical bindings, those lexical bindings aren't visible to .metn form . Which dynamic bindings are visible to .meta form depends on the exact situation. If a .code load-time form occurs in code that had been processed by .code compile-file and is now being loaded by .codn load , then the dynamic environment in effect is the one in which the .code load occurred, with any modifications to that environment that were performed by previously executed forms. If a .code load-time form occurs in code that had been processed by .codn compile-toplevel , then .meta form is evaluated in the dynamic environment of the caller which invokes the execution of the resulting compiled object. 
When a .code load-time form occurs in the code of a function being processed by .codn compile , then .meta form is evaluated in the dynamic environment of the caller which invokes .codn compile . If a .code load-time form occurs in a form processed by the evaluator, it is unspecified whether it takes place in the original dynamic environment in which the evaluator was invoked, or whether it is in the dynamic environment of the immediately enclosing form which surrounds the .code load-time form. A .code load-time form may be nested inside another .code load-time form. In this situation, two cases occur. If the two forms are not embedded in a .codn lambda , or else are embedded in the same .codn lambda , then the inner .code load-time form is superfluous due to the presence of the outer .codn load-time . That is to say, the inner .mono .meti (load-time << form ) .onom expression is equivalent to .metn form , because the outer form already establishes its evaluation to be in a load-time context. If the inner .code load-time form occurs in a .codn lambda , but the outer form occurs outside of that .codn lambda , then the semantics of the inner .code load-time form is relevant and necessary. This is because expressions occurring in a .code lambda are evaluated when the .code lambda is called, which may take place from a non-load-time context, even if the .code lambda itself was produced in a load-time context. An expression being embedded in a .code lambda means that it appears either in the .code lambda body, or else in the parameter list as the initializing expression for an optional parameter. .TP* Notes: When interpreted code containing .code load-time is evaluated, a mutating side effect may take place on the tree structure of that code itself as a result of the .code load-time evaluation. If that previously evaluated code is subsequently compiled, the compiled translation may be different from compiling the original unevaluated code. 
Specifically, the compiler may take advantage of the .code load-time evaluation which had already taken place in the interpreter, and simply take that value, and avoid compiling .meta form entirely. This also has implications on the dynamic environment that is in effect when .meta form is evaluated. If .meta form is evaluated by the interpreter, then it interacts with the dynamic environment which was in effect in that situation; then when the compiler later just takes the result of that evaluation, the compiler's dynamic environment is irrelevant since .meta form isn't being evaluated any more. If .metn form , when evaluated multiple times, potentially produces a different value on each evaluation, this has implications for the situation when an object produced by .code compile-toplevel is invoked multiple times. Each time such an object is invoked, the .code load-time forms are evaluated. If they produce different values, then it appears that the values of literals are changing. All lexical closures derived from the same compiled object share the same literal data. The .code load function never evaluates a compiled expression more than once. If the same compiled file is loaded more than once, a new compiled object instance is produced from each compiled expression, carrying its own storage area for literals. The .code compile function also never evaluates a compiled expression more than once; it produces a compiled object, and then executes it once in order to obtain a lexical closure which is returned. Invoking the closure doesn't cause the .code load-time expressions to be evaluated. The .code load-time form is subject to compiler optimizations. A top-level expression is assumed to be evaluated at load time, so .code load-time does nothing in a top-level expression. It becomes active inside forms embedded in .code lambda expressions. 
Since .code load-time may be used to hoist calculations outside of loops, .code load-time is also active in those parts of loops which are repeatedly evaluated. The use of .code load-time is similar to defining a variable and then referring to the variable. For instance, a file containing this: .verb (defvarl a (list 1 2)) (defun f () (cons 0 a)) .brev is similar to .verb (defun f () (cons 0 (load-time (list 1 2)))) .brev When either file is loaded, in source or compiled form, the .code list expression is evaluated at load time, and then when .code f is invoked, it retrieves the list. Both approaches have advantages. The variable-based approach gives the value a name. The semantics of the variable is straightforward. The variable .code a can easily be assigned a new value. Using its name, the variable can be inspected from the interactive listener. The variable can be referenced from multiple top-level forms directly; it is not a static datum tied to a table of literal values that is tied to a single top-level form. Furthermore, the use of .cod3 defvar / defvarl versus .cod3 defparm / defparml controls whether the variable gets replaced with a new value when the file is reloaded. The advantage of .code load-time is that it doesn't require a separate top-level form to achieve its load-time effect: the expression is simply nested at the point where it is needed. The .code load-time form can therefore be generated by macros, whose expansions cannot inject extra top-level forms into the site where they are invoked. If a macro writer would like some form to be evaluated at load time and its value accessible in a macro expansion that appears arbitrarily nested in code, then .code load-time may provide the path to a straightforward implementation strategy. 
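
As a rough illustration of the macro use case, the following sketch shows
a macro whose expansion relies on
.code load-time
to obtain a storage cell without injecting any top-level form. The names
.code call-count
and
.code f
are hypothetical, chosen only for this example; the sketch assumes that the
object produced by the
.code load-time
form may be mutated like any other cons cell.

.verb
  ;; Hypothetical macro: every place where it is used gets
  ;; its own cons cell, allocated once by load-time.
  (defmacro call-count ()
    ^(let ((cell (load-time (list 0))))
       (inc (car cell))))

  ;; The (list 0) expression is evaluated once, rather than
  ;; on every call, so the count persists across calls to f.
  (defun f ()
    (call-count))
.brev

The same effect could be had with a global variable, but the macro would
then need a separate top-level definition for each use site.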
Access to a .code load-time value is fast because it doesn't involve referencing through a variable binding; compiled code accesses the value directly via its fixed position in the static data table associated with that code. This advantage is insignificant, however, because access to lexical variables in compiled code is similarly fast, and a value can easily be propagated from a global variable to a lexical for the sake of speed. That said, .code load-time eliminates that copying step too. A .code load-time is also useful when the value is not required, and instead the form produces a useful effect, which should be hoisted to load time. For instance, consider a macro which produces the following expansion: .verb (progn (load-time (defvar #:g0025)) (other-logic ... #:g0025)) .brev No matter where this expansion is inserted, .code compile-file and .code load will ensure that the .code defvar is executed once, when the compiled file is loaded, as if that .code defvar appeared on its own as a top-level form. Then the .code other-logic form can refer to the variable, without the .code defvar being evaluated on each execution of the .codn progn . The author of a macro can use .code load-time to stage the evaluation of global effects that the macro expansion depends on simply by bundling these effects into the expansion, wrapped in .codn load-time . .TP* "Dialect Note:" The .code load-time macro is similar to the ANSI Common Lisp .code load-time-value special operator. It doesn't support the .meta read-only-p argument featured in the ANSI CL operator. The semantics of .code load-time is somewhat more precisely specified in terms of concrete implementation concepts. The ANSI CL .code load-time-value may evaluate .meta form more than once in interpreted code; effectively, the ANSI CL implementation may treat .code "(load-time-value x)" as .codn "(progn x)" . This is not true of \*(TL's .code load-time which requires once-only evaluation even in interpreted code. 
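
The once-only requirement can be observed with a small sketch such as the
following. The function name
.code stamp
is hypothetical. When these forms are evaluated as interpreted source code,
the message is expected to appear a single time, although the exact moment
at which it appears is unspecified, as described above.

.verb
  ;; Hypothetical function whose body contains a load-time
  ;; form with a visible side effect.
  (defun stamp ()
    (load-time (progn (put-line "evaluating")
                      (list 1 2))))

  ;; Both calls retrieve the value produced by the single
  ;; evaluation, so the two results are the same object.
  (eq (stamp) (stamp))
.brev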
The name .code load-time is used instead of .code load-time-value for several reasons. Firstly, .code load-time is useful for staging effects, like definitions, to load time, even when the resulting value is not used. Secondly, unlike \*(TL, ANSI CL features multiple values: a form can yield zero or more values. The ANSI CL .code load-time-value operator, however, is restricted to yielding a single value, and its name may have been chosen to emphasize this aspect/restriction. That doesn't apply in the context of \*(TL in which all expressions which terminate normally yield exactly one value, making .str -value a suffix that adds no value. Lastly, .code load-time is shorter, and harmonizes with .codn macro-time , which existed earlier. .coNP Function @ disassemble .synb .mets (disassemble << function-name ) .mets (disassemble << function ) .mets (disassemble << compiled-expression ) .syne .desc The .code disassemble function presents a disassembly listing of the virtual-machine code of a compiled function or form. It also presents the literal data contained in that compiled object in a tabular form which is readily cross-referenced with the disassembly listing. If the argument is a .meta function-name then the function object is retrieved from the binding indicated by the name, in the global namespace. That object is then treated as if it were the .meta function argument. A .meta function argument is one that is a function object. Only compiled virtual-machine functions can be disassembled; other kinds of functions are rejected by .codn disassemble . The .code disassemble function will also process the .meta compiled-expression object that is returned by the .code compile-toplevel function. In the case of .metn function , the entire compiled form containing .meta function is disassembled. That form usually contains code which is external to the function, even possibly other functions. 
The disassembly listing indicates the entry point in the code block where the execution of .meta function begins. The .code disassemble function returns its argument. .coNP Function @ dump-compiled-objects .synb .mets (dump-compiled-objects < stream << object *) .syne .desc The .code dump-compiled-objects function writes compiled objects into .meta stream in the same format as the .code compile-file function. Unlike under .codn compile-file , the output is written into an arbitrary stream rather than a named file. The objects aren't specified by the to-be-compiled syntax processed from a source file, but rather as zero or more arguments which specify objects that are already compiled. Each .meta object must be one of three kinds of values: .RS .IP 1. a virtual-machine-description object returned by the .code compile-toplevel function; or .IP 2. a compiled function object, satisfying the function .codn vm-fun-p ; or else .IP 3. the name of a compiled function object, which may take any of the forms suitable as arguments to the .code symbol-function function. .RE .IP First, .code dump-compiled-objects writes some preamble information into .metn stream . Then, for each .meta object that is not already a virtual-machine description, its corresponding virtual-machine description is retrieved. The virtual-machine description is converted into the externalized format required for the object format and that externalized format is written into .metn stream . The .meta object arguments are thus processed in left-to-right order. If exactly one call to .code dump-compiled-objects is used to populate an initially empty file, and no other data are written into the file, then that file is a valid compiled file. If that file is processed by .code load-file then each of the externalized forms is converted to a virtual-machine description and executed. Note that virtual-machine descriptions are not functions. 
A function's virtual-machine description is the compiled version of the top-level form whose evaluation produced that function. For example, if the following top-level form is compiled and executed, two functions are defined: .verb (let () (defun a ()) (defun b ())) .brev Then, the following two expressions both have the same effect on stream .codn s : .verb (dump-compiled-objects s 'a) (dump-compiled-objects s 'b) .brev Whether the .code a or .code b symbol is used to specify the object to be dumped, the same virtual-machine description is externalized and deposited into the stream. That machine description, when loaded and executed, defines two functions. .SH* INTERACTIVE LISTENER .SS* Overview On some target platforms, \*(TX provides an interactive listener, which is invoked using the .code -i command-line option, or by executing .code txr with no arguments. The interactive listener provides features like visual editing of the command line, tab completion on \*(TL symbols, and history recall. .SS* Basic Operation The interactive listener prints a numbered prompt. The number in the prompt increments with every command. The first command line is numbered 1, the second one 2 and so forth. The listener accepts input characters from the terminal. Characters are either interpreted as editing commands or other special characters, or else are inserted into the editing buffer. However, control characters which don't correspond to commands are silently rejected. The carriage return character generated by the .key Enter key indicates that a complete line has been entered, and it is to be interpreted. The listener parses the line as a \*(TL expression, evaluates it, and prints the resulting value. If the evaluation of the line throws an exception, the listener intercepts the exception and prints information about it preceded by two asterisks and a space. These asterisks distinguish an exception from a result value. 
If an empty line is entered, or a line containing only spaces, tabs or embedded carriage returns or linefeeds, the prompt is repeated without incrementing the number. Such a line is not entered into the history. A line which only contains a \*(TL comment (optional spaces, tabs or embedded carriage returns or linefeeds, followed by a semicolon), also causes the prompt to be repeated without incrementing the number. However, such a line .B is entered into the history. The listener does not allow lines containing certain bad syntax to be submitted with .keyn Enter . If the buffer contains an expression with unbalanced parentheses or brackets, or unterminated literals, then .key Enter generates a newline character which is inserted into the buffer. In that situation, if that newline character is being added at the very end of the buffer, the listener flashes the exclamation mark character (!) two times to warn the user that the line has not been submitted: no computation is taking place, and the listener is waiting for more input. It is possible to force the submission of an unbalanced line using the sequence .key Ctrl-X .keyn Ctrl-F . .SS* Limitations The interactive listener can only accept up to 4095 abstract characters of input in a single command line. Though the edit buffer is referred to as the "command line", it may contain multiline input. The carriage return characters which separate multiple lines count as one abstract character each, and are understood to occupy two display positions. Until \*(TX 286, the command line had to contain exactly one complete \*(TL expression, or a comment. Multiple expressions were not evaluated. This restriction has been lifted: multiple expressions in the command line are parsed as one unit, and evaluated as if they were placed into a .code progn form. If all the expressions evaluate and terminate normally, the value of the last expression is printed. 
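
As a small illustration, assuming a version of \*(TX in which the
single-expression restriction has been lifted, a command line holding two
expressions behaves like a single
.code progn
form, so that only the value of the second expression is printed:

.verb
  1> (+ 1 1) (* 3 3)
  9
.brev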
In multiline mode, if the number of lines exceeds the number of lines of the terminal display, the editing experience is adversely affected in unspecified ways. The screen updating logic in the listener is based on the assumption that the display terminal uses ANSI emulation. No other terminal emulation is supported. The .code TERM environment variable is ignored. .SS* Ways to Quit Pressing .key Ctrl-D in a completely empty command line terminates the listener. Another way to quit is to enter the .code :quit keyword symbol. When the form input into the listener consists of this symbol, the listener will terminate: .verb 1> (+ 2 2) 4 2> :quit os-shell $ .brev Another way to terminate is to evaluate a call to the .code exit function. This method allows a termination status to be specified: .verb 1> (exit 1) os-shell $ .brev However, if a \*(TX interactive session is terminated this way, it will not save the listener history. Raising a fatal signal with the .code raise function is another way to quit: .verb 1> (raise sig-abrt) Aborted (core dumped) os-shell $ .brev The previous remark about not saving the listener history applies here also. .SS* Interrupting Evaluation .key Ctrl-C typed while editing a command line is interpreted as an editing command which causes that command line to be canceled. The listener prints the string .str ** intr and repeats the same prompt. If a command line is submitted for evaluation, the evaluation might take a long time or block for input. In these situations, typing .key Ctrl-C will issue an interrupt signal. The listener has installed a handler for this signal which generates an exception of type .code error which is caught by the listener. The exception's message is the string .str intr so that the listener ends up printing .str ** intr like in the case of the .key Ctrl-C editing command. In this situation, though, a new command-line prompt is issued with an incremented number, and the exception is recorded as a value. 
.SS* Listener Variables .coNP Variables @, *0 @, *1 @, *2 ..., @ *99 .desc The listener provides useful variables which allow commands to reference the results of previous commands. As noted previously, the commands are enumerated with an incrementing number. Each command's number, modulo 100, corresponds to one of the variables .codn *0 , .codn *1 , .codn *2 , \&..., .codn *99 . Thus, up to the previous hundred results can be referenced: .verb ... 99> (+ 2 2) ;; stored in *99 4 100> (* 3 2) ;; stored in *0 6 101> (+ *99 *0) ;; i.e. (+ 4 6) 10 .brev .coNP Symbol Macros @, *-1 @, *-2 ..., @ *-20 The listener provides a small number of symbol macros for referencing the results of previous commands in a relative manner. The macro .code *-1 refers to the value of the immediately previous command. The macro .code *-2 refers to the value of the command before that one and so on. Note: each of these macros expands to a reference to the .code *r hash table, according to the following pattern: .mono *-1 --> [*r (mod (- *v 1) 100)] *-2 --> [*r (mod (- *v 2) 100)] ... *-20 --> [*r (mod (- *v 20) 100)] .onom .coNP Variable @ *n .desc The listener variable .code *n evaluates to the current command-line number: the number of the command in which the variable occurs: .verb 5> *n 5 6> (* 2 *n) 12 .brev .coNP Variable @ *v .desc The listener variable .code *v evaluates to the current variable number: the command number modulo 100: .verb 103> *v 3 104> *v 4 .brev .coNP Variable @ *r .desc The listener variable .code *r evaluates to a hash table which associates variable numbers with command results: .verb 213> 42 42 214> [*r 13] 42 .brev The result hash allows relative addressing. For instance the expression .code "[*r (mod (pred *v) 100)]" refers to the result of the previous command. .SS* Exceptions The interactive listener catches all exceptions. 
Each caught exception is associated with the command's variable number, and stored as a value in the appropriate listener variable as well as the .code *r result hash. Exceptions are turned into values by creating a cons cell whose .code car is the exception symbol and whose .code cdr holds the exception's arguments. For each caught exception, a message is printed beginning with the sequence .strn "** " . Exactly how the message appears depends on the type and content of the exception. .SS* Editing The following sections describe the interactive editing commands available in the listener. Terminals can often be configured with different choices of cursor shape: such as a block-shaped cursor, an underline cursor or a vertical line or "I-beam" cursor. In the following sections, the phrase "character under the cursor" refers to the character that is currently covered by a block cursor, underlined by an underline cursor, or that is immediately to the right of an I-beam cursor. .NP* Move Left and Right Moving within the line is achieved using the left and right arrow keys .key \[<-] and .keyn \[->] . In addition, .key Ctrl-B ("back") and .key Ctrl-F ("forward") perform this movement. .NP* Jump to Beginning and End of Line The .key Ctrl-A command moves to the beginning of the line. ("A" is the beginning of the alphabet). The .key Ctrl-E ("end") command jumps to the end of the line, such that the last character of the line is to the left of the cursor position. On terminals which have the Home and End keys, these may also be used instead of .key Ctrl-A and .keyn Ctrl-E . In line mode, these commands move the cursor to the beginning or end of the edit buffer. In multiline mode, if the cursor is not already at the beginning of a physical line, then .key Ctrl-A moves it to the first character of the physical line. Otherwise, .key Ctrl-A moves the cursor to the beginning of the edit buffer. 
Similarly, in multiline mode, if the cursor is not already at the end of a physical line, .key Ctrl-E moves it there. Otherwise, the cursor moves to the end of the edit buffer. .NP* Jump to Matching Parenthesis If the cursor is on an opening or closing parenthesis, brace or bracket, the .key Ctrl-] command tries to jump to the matching character. The logic for finding the matching character is identical to that of the Parenthesis Matching feature. If no matching character is found, then no movement takes place. If the cursor is not on an opening or closing parenthesis, brace or bracket, then the closest such character is found. The cursor is moved to that character and then an attempt is made to jump to the matching one from that new position. If the cursor is equidistant to two such characters, then one of them is chosen as follows. If the two characters are oriented in the same way (both are opening or both are closing), then that one is chosen whose convex side faces the cursor position. Thus, effectively, an inner enclosure is favored over an outer one. Otherwise, if the two characters have opposite orientation (one is opening and the other closing), then the one which is to the right of the cursor position is chosen. Note: the .key Ctrl-] character can be produced on some terminals using .key Ctrl-5 (using the keyboard home row 5, not the numeric keypad 5). This is the same key which produces the % character when Shift is used. The % character is used in the Vi editor for parenthesis matching. .NP* Character Swap The .key Ctrl-T (twiddle) command exchanges the character under the cursor with the previous character. .NP* Delete Character Left The Backspace key erases the character to the left of the cursor, and moves the cursor to the position which that character occupied. It doesn't matter whether this key generates ASCII characters 8 (BS) or 127 (DEL): either one is acceptable. The .key Ctrl-H command also performs the same action, since it corresponds to ASCII BS. 
.NP* Delete Character Right The .key Ctrl-D command deletes the character under the cursor, if the cursor is block-shaped, or to the right of the cursor if the cursor is an I-beam. The cursor maintains its current character position relative to the start of the line. In multiline mode, if .key Ctrl-D is used at the end of a line that is not the last line, it deletes the newline character, causing the following line to be joined to the end of the current line. If the cursor is at the end of the buffer, then .key Ctrl-D does nothing, except if the buffer is completely empty, in which case it is a quit indication. The Delete key, if available on the terminal, is a near synonym of .keyn Ctrl-D . It performs all the same functions, except that it does not act as a quit indication; Delete has no effect when the buffer is empty. When a visual selection is in effect, then .key Ctrl-D and .key Del delete that selection, and copy it to the clipboard. .NP* Delete Word Left The .key Ctrl-W ("word") command deletes the word to the left of the cursor position. More precisely, this command first deletes any consecutive whitespace characters (spaces or tabs) to the left of the cursor. Then, it deletes consecutive non-whitespace characters. Material under the cursor or to the right remains. The deleted material is copied into the clipboard. .NP* Delete to Beginning of Line The .key Ctrl-U ("undo typing") command is a "super backspace" operation: it deletes all characters to the left of the cursor position. The cursor is moved to the leftmost position. In multiline mode, .key Ctrl-U deletes only to the beginning of the current physical line, not all the way to the first position of the buffer. .key Ctrl-U copies the deleted material into the clipboard. .NP* Delete to End of Line The .key Ctrl-K ("kill") command deletes the character under the cursor position and all subsequent characters. The cursor position doesn't change. 
In multiline mode, .key Ctrl-K deletes only until the end of the current physical line, not the entire buffer. The material deleted by .key Ctrl-K is copied into the clipboard. .NP* Verbatim Character Insert The .key Ctrl-V ("verbatim") command places the listener's input editor into a mode in which the next character is interpreted literally and inserted into the line, even if that character is a special character such as .keyn Enter , or a command character. .NP* Verbatim Insert Mode The two-character sequence .key Ctrl-X .key Ctrl-V ("extended verbatim", "super paste") enters into a verbatim insert mode, useful for entry of free-form text. It is particularly useful in multiline mode. In this mode, almost every character is inserted verbatim, including .keyn Enter . The only commands recognized are: .keyn Ctrl-X , which terminates this mode, .key Backspace (whether that key generates ASCII BS or DEL) and arrow key navigation. .key Enter inserts a line break, which appears as such in multiline mode, or as .code ^M in line mode. .NP* Delete Current Line The .key Ctrl-X .key Ctrl-K command sequence may be used in multiline mode to delete the entire physical line under the cursor. Any lines below that line move up to close the gap. In line mode, the command has no effect, other than canceling select mode. The deleted line, including the terminating newline character, if it has one, is copied into the clipboard. .NP* History Recall By default, the most recent 500 lines submitted to the interactive listener are remembered in a history. This history is available for recall, making it convenient to repair mistakes, or compose new lines which are based on previous lines. Note that the history suppresses consecutive, duplicate lines. The number of lines retained may be customized using the .code *listener-hist-len* variable. If the .key \[ua] key is used while editing a line, the contents of the line are placed into a temporary save area. 
The line display is then updated to show the most recent line of history. Using .key \[ua] additional times will recall successively less recent lines. The .key \[da] key navigates in the opposite direction: from older lines to newer lines. When .key \[da] is invoked on the most recent history line, then the current line is restored from the temporary save area. Instead of .key \[ua] and .keyn \[da] , the commands .key Ctrl-P ("previous") and .key Ctrl-N ("next") may be used. If the .key Enter key is pressed while a recalled history line is showing, then that line will be submitted as if it were a newly composed line. The originally edited line which had been placed in the save area is discarded. When a recalled line is showing, it may be edited. There are two important behaviors to note here. If a recalled history line is edited, and then .key \[ua] or .key \[da] or a navigation command is used to show a different history line, or to restore the original current line, then the edit is made permanent: the edited line replaces its original version in the same position in the history. This feature allows corrections to be made to the history. The edit is recorded in the line's undo history as a single change; if the edited line is visited again, then a single .key Ctrl-O command will revert all the edits that were made. However, if a recalled line is edited and submitted without navigating to another line, then it is submitted as a newly composed line, without replacing the original in the history. Each submitted line is entered into the history, if it is different from the most recent line already in history. This is true whether it is a freshly composed line, a recalled history line, or an edited history line. .NP* History Search It is possible to search backwards through the history interactively for a line containing a substring. The .key Ctrl-R command is used to initiate search. 
The command prompt is replaced with the prefix .code search: next to which a pair of empty square brackets appears, indicating that the listener is in search mode. The square brackets are the search box, enclosing the search text, which is initially empty. In search mode, characters may be typed. They accumulate inside the search box, and constitute the string to search for. The listener instantly navigates to the most recent line which contains a substring match for the search string, and places the cursor on the first character of the match. Control characters entered directly are ignored. The .key Ctrl-V command can be used to add a character verbatim, as in edit mode. To remove characters from the search box, Backspace can be used. The search is not repeated with the shortened search text: the same line continues to show until a character is added, at which point a new search is issued. Search mode has a "home position": a starting point for searches. The initial home position is whatever line of history is selected when search mode is initiated. Searches work backward in history from that line. If search text is edited by deleting characters and then adding new ones, the new search proceeds from the home position. The .key Ctrl-R command can be used in search mode. It registers the currently showing line as the new home position, and then repeats the search using the existing search text backwards from the new position. If the search text is empty, .key Ctrl-R has no effect. The .key Ctrl-C command leaves search mode at any time and causes the listener to resume editing the original input at the original character position. The .key Enter key accepts the result of a search and submits it as if it were a newly composed line. Navigation and editing keys may be used in search mode. A navigation or editing key immediately cancels search mode, and is processed in edit mode, using whatever line was located by the search, at the matching character position. 
The .key Ctrl-L (Clear Screen and Refresh), as well as .key Ctrl-Z (Suspend to Background) commands are available in search mode. Their effects take place without leaving search mode. Navigating to a history line manually using .key \[ua] or .key \[da] (or .key Ctrl-P and .keyn Ctrl-N ) has the same net effect as locating that line using .key Ctrl-R search. .NP* Submit and Stay in History Normally when the .key Enter key is used on a recalled history line, the next time the listener is reentered, it jumps back to the newest history position where a new line is about to be composed. The alternative command sequence .key Ctrl-X .key Enter provides a useful alternative behavior. After the submitted line is processed, the listener doesn't jump to the newest history position. Instead, it stays in the history, advancing forward by one position to the successor of the submitted line. .key Ctrl-X .key Enter can be used to conveniently submit a range of lines from the history, one by one, in their original order. .NP* Insert Previous Word The equivalent command sequences .key Ctrl-X .key w and .key Ctrl-X .key Ctrl-W insert a word from the previous line at the cursor position. A word is defined as a sequence of non-whitespace characters, separated from other words by whitespace. By default, the last word of the previous line is inserted. Between the .key Ctrl-X and the following .key Ctrl-W or .keyn w , a decimal number can be entered. The number 1 specifies that the last word is to be inserted, 2 specifies the second last word, 3 the third word from the right and so on. Only the most recent three decimal digits are retained, so the number can range from 0 to 999. A value of 0, or a value which exceeds the number of words causes the .key Ctrl-W or .key w to do nothing. Note that "previous line" means relative to the current location in the history. If the 42nd most recent history line is currently recalled, this command takes material from the 43rd history line. 
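The word selection rule of the Insert Previous Word command can be illustrated with a short sketch. It is written in Python purely for illustration and is not the listener's implementation; the function names previous_word and numeric_argument are hypothetical, as is the convention that an empty digit string stands for "no numeric argument typed".

```python
from typing import Optional


def previous_word(prev_line: str, count: int = 1) -> Optional[str]:
    """Return the count-th word from the right of prev_line, or None."""
    words = prev_line.split()  # words are runs of non-whitespace characters
    if count < 1 or count > len(words):
        return None  # 0 or out of range: the command does nothing
    return words[-count]


def numeric_argument(digits: str) -> int:
    """Model the digits typed between Ctrl-X and w: only the most
    recent three are retained; no digits means the default of 1."""
    return int(digits[-3:]) if digits else 1
```

Under these assumptions, previous_word("(list 1 2 3)", 2) yields the string "2", and typing the digits 12345 selects word number 345.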
.NP* Insert Previous Atom The equivalent command sequences .key Ctrl-X .key a and .key Ctrl-X .key Ctrl-A insert an atom from the previous line at the cursor position. A line only makes atoms available if it expresses a valid \*(TX form, free of syntax errors. A line containing only whitespace or a comment makes no atoms available. For the purposes of this editing feature, an atom is defined as the printed representation of a Lisp atom taken from the Lisp form specified in the previous line. The line is flattened into atoms as if by the .code flatcar function. By default, the last atom is extracted. A numeric argument typed between the .key Ctrl-X and .key Ctrl-A or a can be used to select an atom by position from the end. The number 1 specifies the last atom, 2 the second last and so on. Only the most recent three decimal digits are retained, so the number can range from 0 to 999. A value of 0, or a value which exceeds the number of atoms causes the .key Ctrl-A or a to do nothing. Note that "previous line" has the same meaning as for the .key Ctrl-X .key Ctrl-W (insert previous word) command. .NP* Insert Previous Line The command sequences .key Ctrl-X .key Ctrl-R ("repeat") and .key Ctrl-X .keyn r , which are equivalent, insert an entire line of history into the current buffer. By default, the previous line is inserted. A less recent line can be selected by typing a numeric argument between the .key Ctrl-X and the .key Ctrl-R or .keyn r . The immediately previous history line is numbered 1, the one before it 2 and so on. If this command is used during history navigation, it references previous lines relative to the currently recalled history line. .NP* Symbolic Completion If the Tab key is pressed while editing a line, it is interpreted as a request for completion. There is a second completion command: the sequence .key Ctrl-X .keyn Tab . 
When completion is invoked with .key Tab or .key Ctrl-X .keyn Tab , the listener looks at a few of the trailing characters to the left of the cursor position to determine the applicable list of completions. Completions are determined from among the \*(TL symbols which have global variable, function, macro and symbolic macro bindings, as well as the static and instance slots of structures. Symbols which have operator bindings are also taken into consideration. If a package-qualified symbol is completed, then completion is restricted to that package. Keyword symbol completion is restricted to the contents of the keyword package. The namespaces which are searched for symbols are restricted according to preceding character syntax. For instance if the characters .code ".(" or .code ".[" immediately precede the prefix, then only those symbols are considered which are methods: that is, each is the static slot of at least one structure, in which that static slot holds a function. The difference between .key Tab and .key Ctrl-X .key Tab is that Tab completion looks only for prefix matches among the eligible identifiers. Thus it is a pure completion in the sense that it suggests additional material that may follow what has been typed. If the buffer contains .code (list it will only suggest completions which can be endings for .code list such as .codn list* , .codn listp , and .codn list-str . It will not suggest identifiers which rewrite the .code list prefix. By contrast, the .key Ctrl-X .key Tab completion suggests not only pure completions but also alternatives to the partial identifier, by looking for substring matches. For instance .code copy-list is a possible completion for .codn list , as is .codn proper-list-p . If no completions are found, then the BEL character is sent to the terminal to generate a beep or a visual alert indication. The listener returns to editing mode. If completions are found, the listener enters into completion selection mode. 
The first available completion is placed into the line as if it had been typed in. The other completions may be viewed one by one using the Tab key. (Note that the .key Ctrl-X is not used, only Tab, even if completion mode had been entered via .key Ctrl-X .keyn Tab ). When the completions are exhausted, the original uncompleted line is shown again, and Tab can continue to be used to cycle through the completions again. In completion mode, the .key Ctrl-C character acts as a command to cancel completion mode and return to editing the original uncompleted line. Any other input character causes the listener to keep the currently shown completion, and return to edit mode, where that character is immediately processed as if it had been typed in edit mode. .NP* Edit with External Editor The two character command .key Ctrl-X .key Ctrl-E launches an external editor to edit the current command line. The command line is stored in a temporary file first, and the editor is invoked on this file. When the editor terminates, the file is read into the editing buffer. The editor is determined from the .code EDITOR environment variable. If this variable is unset or empty, the command does nothing. The temporary file is created in the home directory, if that can be determined. Otherwise it is created in the current working directory. If the creation of the file fails, then the command silently returns to edit mode. The home directory is determined from the .code HOME environment variable in POSIX environments. On MS Windows, the .code USERPROFILE variable is probed for the user's directory. If the command line contains embedded carriage returns (which denote line breaks in multiline mode) these are replaced with newline characters when written out to the file. Conversely, when the edited file is read back, its newlines are converted to carriage returns, so that multiline content is handled properly. (See the following section, Multiline Mode.) 
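The handling of line breaks and the choice of directory for the temporary file, as described for the Edit with External Editor command, can be sketched as follows. This is an illustrative Python model under stated assumptions, not the listener's code: the function names are hypothetical, the environment is passed in as a plain dictionary, and details such as the treatment of a trailing newline in the edited file are not modeled.

```python
def buffer_to_file_text(buffer: str) -> str:
    """Carriage returns in the edit buffer become newlines in the file."""
    return buffer.replace("\r", "\n")


def file_text_to_buffer(text: str) -> str:
    """Newlines read back from the file become carriage returns."""
    return text.replace("\n", "\r")


def temp_dir_for_edit(env: dict, cwd: str, windows: bool = False) -> str:
    """Pick the home directory if the relevant variable is set and
    nonempty (an assumption of this sketch); otherwise fall back on
    the current working directory."""
    home = env.get("USERPROFILE" if windows else "HOME")
    return home if home else cwd
```

The two conversions are inverses of each other for buffers that contain no newline characters, which is why multiline content survives a round trip through the editor.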
.NP* Undo Editing The listener provides an undo feature. The .key Ctrl-O command ("old", "oops") restores the edit buffer contents and cursor position to a previous state. There is a single undo history which records up to the 200 most recent edit states. However, the states are associated with history lines, so that it appears that each line has its own, independent undo history. Undoing the edits in one line has no effect on the undo history of another line. Undo also records edits for lines that have been canceled with .key Ctrl-C and are not entered into the history, making it possible to recall canceled lines. The undo history is lost when \*(TX terminates. Undo doesn't save and restore previous contents of the clipboard buffer. There is no redo. When undo removes an edit to restore to a prior edit state, the removed edit is permanently discarded. Note that if undo is invoked on a historic line, each undo step updates that history entry instantly to the restored state, not only the visible edit buffer. This is in contrast to the way new edits work. New edits are not committed to history until navigation takes place to a different history line. Also note that when new edits are performed on a historic line and it is submitted with .key Enter without navigating to another line, the undo information for those edits is retained, and belongs to the newly submitted line. The historic line hasn't actually been modified, and so it has no new undo information. However, if a historic line is edited, and then navigation takes place to a different historic line, then the undo information is committed to that line, because the modifications to the line have been placed back in the history entry. .SS* Visual Selection Mode The interactive listener supports visual copy and paste operation. Text may be visually selected for copying into a clipboard or for deletion. 
In visual selection mode, the actions of some editing commands are modified so that they act upon the selection instead of their usual target, or upon both the target and the selection. .NP* Making a Selection The .key Ctrl-S command enters into visual selection mode and marks the starting point of the selection, which is considered the position immediately to the left of the current character. While in visual selection mode, it is possible to move around using the usual movement commands. The ending point of the selection tracks the movement. The selected text is displayed in reverse video. Typing .key Ctrl-S again while in visual selection mode cancels the mode. Tab completion, history navigation, history search and editing in an external editor all cancel visual selection mode. By default, the selection excludes the character which lies to the right of the rightmost endpoint. Thus, the selection simply consists of the text between these two positions, whether or not they are reversed. This style of selection pairs excellently with an I-beam style cursor, and has clear semantics. The endpoints are referenced to the positions between the characters, and everything between them is selected. The selection behavior may be altered using the Boolean configuration variable .codn *listener-sel-inclusive-p* . This variable is .code nil by default. If it is changed to true, then the selection includes the character to the right of the rightmost endpoint, if there is such a character within the current line. This style of selection pairs well with a block-shaped cursor. It creates the apparent semantics that the endpoints of the selection are characters, rather than points between characters, and that these characters are included in the selection. .NP* Selection Endpoint Toggle In visual selection, the starting point of the selection remains fixed, while the ending point tracks the movement of the cursor. The .key Ctrl-^ command will exchange the two points. 
The effect is that the cursor jumps to the opposite end of the selection. That end is now the ending point which tracks the cursor movement. .NP* Visual Copy The .key Ctrl-Y command ("yank") copies the selected text into a clipboard buffer. The previous contents of the clipboard buffer, if any, are discarded. Unlike the history, the clipboard buffer is not persisted. If \*(TX terminates, it is lost. .NP* Visual Cut If the .key Ctrl-D command is invoked while a selection is in effect, then instead of deleting the character under the cursor, it deletes the selection, and copies it to the clipboard. The Delete key has the same effect. .key Ctrl-D and .key Del have no effect on the clipboard when visual selection is not in effect, and they operate on just one character. .NP* Clipboard Paste The .key Ctrl-Q command ("quote the clipboard") inserts text from the clipboard at the current cursor position. The cursor position is updated to be immediately after the inserted text. The clipboard text remains available for further pasting. If nothing has yet been copied to the clipboard in the current session, then this command has no effect. .NP* Clipboard Swap Paste The .key Ctrl-X .key Ctrl-Q command sequence ("exchange quote") exchanges the selected text with the contents of the clipboard. The selection is copied into the clipboard as if by .key Ctrl-Y and replaced by the previous contents of the clipboard. If nothing has yet been copied to the clipboard in the current session, then this command behaves like .keyn Ctrl-Y : text is yanked into the clipboard, but not deleted. .NP* Visual Replace In visual selection mode, editing commands may be used which insert new text, or a character may be typed in order to insert it. When this happens, the selection is first deleted. Then the insertion takes place and visual mode is canceled. 
The effect is that the newly inserted text replaces the selected text. This applies to the Clipboard Paste .key Ctrl-Q command also. If a selection is in effect when .key Ctrl-Q is invoked, the selected text is replaced with the clipboard buffer contents. When a selection is replaced in this manner, the contents of the clipboard are unaffected. .NP* Delete in Selection Mode In visual mode, it is possible to issue commands which delete text. One such command is .keyn Ctrl-D . Its special behavior in selection mode, Visual Cut, is described above. The .key Backspace key and .key Ctrl-H also have a special behavior in select mode. If the cursor is at the rightmost endpoint of the selection, then these commands delete the selection and nothing else. If the cursor is at the leftmost endpoint of the selection, then these commands delete the selection, and take their usual effect of deleting a character also. In both cases, selection mode is canceled. The clipboard is not affected. The .key Ctrl-W command for deleting the previous word, when used in visual selection mode, deletes the selection and cancels selection mode, and then deletes the word before the selection. Only the deleted selection is copied into the clipboard, not the deleted word. All other deletion commands such as .key Ctrl-K simply cancel visual selection mode and take their usual effect. .SS* Multiline Mode The listener operates in one of two modes: line mode and multiline mode. This is determined by the special variable .code *listener-multi-line-p* whose default value is .code t (multiline mode). It is possible to toggle between line mode and multiline mode using the .key Ctrl-J command. In line mode, all input given to a single prompt appears to be on a single line. When the line becomes longer than the screen width, it scrolls horizontally. In line mode, carriage return characters embedded in a line are displayed as .codn ^M . 
In multiline mode, when the input exceeds the screen width, it simply wraps to take up additional lines rather than scrolling horizontally. Furthermore, multiline mode not only wraps long lines of input onto multiple lines of the display, but also supports true multiline input. In multiline mode, carriage return characters embedded in input are treated as line breaks rather than being rendered as .codn ^M . Because carriage returns are not line terminators in text files, lines which contain embedded carriage returns are correctly saved into and retrieved from the persistent history file. When .key Enter is typed in multiline mode, the listener tries to determine whether the current input, taken as a whole, is an incomplete expression which requires closing punctuation for elements like compound expressions and string literals. If the input appears incomplete, then the .key Enter is inserted verbatim at the current cursor position, rather than signaling that the line is being submitted for evaluation. The .key Ctrl-X .key Enter command sequence also has this behavior. .SS* Reading Forms Directly from the Terminal In addition to multiline mode, the listener provides support for directly parsing input from the terminal, suitable for processing large amounts of pasted material. If the .code :read keyword is entered into the listener, it will temporarily suspend interactive editing and allow the \*(TL parser to read directly from standard input. The reading stops when an error occurs, or EOF is indicated by entering .keyn Ctrl-D . In direct parsing mode, each expression which is read is evaluated, but its value is not printed. However, the value of the last form evaluated is returned to the interactive listener, which prints the value and accepts it as if it were the result value of the .code :read command. Note that none of the material read from the terminal is entered into the interactive history. 
Only the .code :read command which triggers this parsing mode appears in the history. .SS* Clear Screen and Refresh The .key Ctrl-L command clears the screen and redraws the line being edited. This is useful when the display is disturbed by the output of some background process, or serial line noise. .SS* Suspend to Background The .key Ctrl-Z ("Zzzz... (sleep)") command causes \*(TX to be placed into the background in a suspended state, and control to be returned to the system shell. Bringing the suspended \*(TX back into the foreground is achieved with a shell job-control command such as the .code fg command in GNU Bash. When \*(TX is resumed, the interactive listener will redisplay the edited line and restore the previous cursor position. Making full use of this feature requires a POSIX job control shell, in the sense that without job control support in the shell, there may not be a way to restore \*(TX into the terminal session's foreground, causing the user to lose interactive control over that \*(TX instance. .SS* Editing Help The .key Ctrl-X .key ? command shows a summary of commands, in a four-line display which temporarily replaces the editing area. The help text is divided into several pages. .key Ctrl-C dismisses the display, and returns to editing. The .keyn Ctrl-P , .key \[<-] and .key \[ua] keys return to the previous screen. The .key Ctrl-Z and .key Ctrl-L commands are available, having their usual meaning of suspending and refreshing the display. Any other key advances to the next screen. Advancing from the last screen dismisses the display, and returns to editing. Navigating to the previous screen when the first screen is being shown also dismisses the display and returns to editing. .SS* Print the Prompt The .code :prompt command prints the current prompt, followed by a newline, without incrementing the prompt number. The .code :p command prints just the current prompt number, followed by a newline, without incrementing the number. 
In plain mode, the .code :prompt-on command enables the printing of prompts. The full prompt is printed before reading each new command line. An abbreviated prompt is printed before reading the continuation lines of an incomplete expression. The printing of prompts is automatically enabled if the input device is an interactive terminal. None of these prompt-related commands are entered into the history. .SS* Plain Mode When the input device isn't an interactive terminal, or if the .code -n or .code --noninteractive command-line options are used when invoking \*(TX, the listener operates in .IR "plain mode" . It reads input without providing any of the editing features of visual mode: no completion, history recall, selection, or copy and paste. Only the line editing features provided by the operating system are available. Prompts appear if standard input is an interactive terminal, or if explicitly enabled. There is still an incrementing counter, and the numbered variables .codn *1 , .codn *2 , .code ... for accessing evaluation results are established. Lines are still entered into the history, and the interactive profile is still processed, as usual. Plain mode reads whole lines of input, yet recognizes multi-line expressions. Whenever a line of input is read which represents incomplete syntax, another line of input is read and appended to that line. This repeats until the accumulated input represents complete syntax, and is then processed as a unit. Like in visual mode, each unit of input may contain multiple expressions. These are parsed as a unit and evaluated as if they were the elements of a .code progn expression. The resulting value which is printed is that of the last expression. 
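The way plain mode accumulates lines until the input forms complete syntax can be modeled with a small sketch. It is written in Python for illustration only. The completeness test is_incomplete is a deliberately crude, hypothetical stand-in that counts brackets and tracks double-quoted string literals; the real listener relies on the \*(TL parser, which this sketch does not attempt to reproduce.

```python
def is_incomplete(text: str) -> bool:
    """Crude stand-in for the parser: reports unclosed brackets
    or an unterminated double-quoted string literal."""
    depth = 0
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
    return in_string or depth > 0


def read_units(lines):
    """Group an iterable of input lines into complete units."""
    pending = []
    for line in lines:
        pending.append(line)
        unit = "\n".join(pending)
        if not is_incomplete(unit):
            yield unit  # complete syntax: processed as one unit
            pending = []
    # Incomplete input left over at end of input is dropped here;
    # that choice belongs to this sketch only.
```

A unit may hold several expressions on one line, for example "(list 1 2) (list 3)", mirroring the statement that such a unit is evaluated as if its expressions were the elements of a progn form.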
.SS* Interactive Profile File Unless the .code --noprofile option has been used, when the listener starts up, it looks for a file called .code .txr-profile in the user's home directory, as determined by the .code HOME environment variable in POSIX environments or the .code USERPROFILE environment variable on MS Windows. If that variable doesn't exist, no further attempt is made to locate this file. If the .code .txr-profile file does not exist, but .code .txr_profile exists, then that file is taken as the profile file instead. Falling back on .code .txr_profile is obsolescent and will be removed in some future version of \*(TX. The switch to .code .txr-profile was introduced in \*(TX 297. If the profile file exists, it is subject to security checks. First, the function .code path-components-safe is applied to its path name. The function validates that no component of the path name is a directory that is writable to another user, or a symbolic link that could be rewritten by another user. If that check passes, the file is then checked with the function .code path-strictly-private-to-me-p which requires that other users have no read or write permission. If the checks fail, then an error message is displayed and the file is not loaded. If the file passes the security check, it is expected to be readable and to contain \*(TL forms, which are read and evaluated. Syntax errors encountered while reading the profile file are displayed on standard output, and any exceptions thrown that are derived from .code error are caught and displayed. The interactive listener starts in spite of these situations. Exceptions not derived from .code error will terminate the process. The profile file is not read by noninteractive invocations of \*(TX: that is, when the .code -i option isn't present. .SS* History Persistence The history is maintained in a text file called .code .txr-history in the user's home directory. 
Whenever the interactive listener terminates, this file is updated with the history contents stored in the listener's memory. The next time the listener starts, it first reloads the history from this file, making the most recent .code *listener-hist-len* expressions of a previous session available for recall. If the .code .txr-history file does not exist, but a file called .code .txr_history exists, then that file is loaded instead and that same file will be written to when the history is saved. This behavior is obsolescent; support for recognizing a .code .txr_history file will be removed in a future release of \*(TX. The switch to .code .txr-history was introduced in \*(TX 297. The history file is maintained in a way that is somewhat robust against the loss of history arising from the situation that a user manages multiple simultaneous \*(TX sessions. When a session terminates, it doesn't blindly overwrite the history file, which may have already been updated with new history produced by another session. Rather, it appends new entries to the history file. New entries are those that had not been previously read from the history file, but have been newly entered into the listener. An effort is made to keep the history file trimmed to no more than twice the number of entries specified in .codn *listener-hist-len* . The terminating session first makes a temporary copy of the existing history, which is trimmed to the most recent .code *listener-hist-len* entries. New entries are then appended to this temporary file. Finally, the actual history file is replaced with this temporary file by a .code rename-path operation. This algorithm doesn't use locking, and is therefore not robust against the situation when two or more interactive \*(TX sessions belonging to the same user terminate at around the same time. 
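The trim, append and rename sequence described above can be sketched as follows. This is a Python model for illustration, not the listener's implementation: the names read_entries and save_history are hypothetical, the history is assumed to hold one entry per newline-terminated line, and the standard os.replace call stands in for the rename operation. Like the algorithm it models, the sketch uses no locking.

```python
import os
import tempfile


def read_entries(path):
    """Read history entries, one per line. Only the newline character
    terminates an entry, so embedded carriage returns are preserved."""
    try:
        with open(path, encoding="utf-8", newline="\n") as f:
            text = f.read()
    except FileNotFoundError:
        return []
    return [line for line in text.split("\n") if line]


def save_history(path, new_entries, hist_len):
    """Trim the existing file to hist_len entries, append the new
    entries, then replace the file by renaming a temporary copy."""
    existing = read_entries(path)
    kept = existing[-hist_len:] if hist_len > 0 else []
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the
    # final rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".hist-tmp-")
    with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as tmp:
        for entry in kept + list(new_entries):
            tmp.write(entry + "\n")
    os.replace(tmp_path, path)
```

Because the existing entries are trimmed to hist_len before the new ones are appended, the resulting file holds at most twice hist_len entries whenever a session contributes no more than hist_len new ones.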
The home directory is determined from the contents of the .code HOME environment variable in POSIX environments or .code USERPROFILE on MS Windows. If this variable doesn't exist, or the user doesn't have permissions to write to this directory or to an existing history file in that directory, then the history isn't saved. It is possible to save the history without terminating the interactive session, using the .code :save command. This saves the history in the manner described above. Each invocation of .code :save only adds to the history file new input since the most recent .code :save command. When the history file is loaded, security checks take place, in exactly the same way that the .str .txr-profile file is validated. First the path of the history file is checked using the function .codn path-components-safe , which determines that no component of the path name can be subverted by another user, other than the superuser. If that check passes, then the file is checked using .code path-strictly-private-to-me-p which requires that other users have no read or write permission. If the checks fail, then an error message is displayed and the history file is not loaded. .SS* Parenthesis Matching A feature of the listener is visual parenthesis matching in the form of a brief forward or backward jump of the cursor. This provides a hint to the programmer, helping to prevent avoid parenthesis balancing errors. When any of the three closing characters .codn ) , .code ] or .code } is inserted, the listener scans backward for the matching opening character. Likewise, if any of the three opening characters .codn ( , .code [ or .code { is inserted in the middle of text, the listener scans forward for the matching closing character. If the matching character is found, the cursor jumps to that character and then returns to the original position a brief moment later. 
If a new character is typed during the brief time delay, the delay is immediately canceled, so as not to hinder rapid typing. This back-and-forth jump behavior also occurs when a character is erased using Backspace, and the cursor ends up immediately to the right of a parenthesis. Note that the matching is unsophisticated; it doesn't observe the lexical conventions and syntax of the \*(TL programming language. For instance, a closing parenthesis outside a string literal may match an opening one inside a string literal. .SS* Listener Configuration Variables The listener's behavior can be influenced through values of certain global variables. The settings can be made persistent by means of setting these variables in the interactive profile file. .coNP Special Variable @ *listener-hist-len* .desc This special variable determines how many lines of history are retained by the listener. Changing this variable from within the listener has an instant effect. If the number is reduced from its current value, history lines are immediately discarded. The default value is 500. .coNP Special Variable @ *listener-multi-line-p* .desc This is a Boolean variable which indicates whether the listener is in multiline mode. The default value is .codn nil . Changing this variable from within the listener takes effect immediately for the next line of input. If multiline mode is toggled interactively from within the listener, the variable is updated to reflect the latest state. This happens when the command is submitted for evaluation. .coNP Special Variable @ *listener-sel-inclusive-p* .desc This Boolean variable controls the behavior of visual selection. It is .code nil by default. A visual selection is determined by endpoints, which are abstract positions understood as being between characters. When a visual selection begins, it marks an endpoint immediately to the left of a block-shaped cursor, or precisely at the in-between position of an I-beam cursor.
The end of the visual selection is similarly determined from the ending cursor position. The selection consists of those characters which lie between these positions. This style of selection pairs well with an I-beam style cursor shape. If the .code *listener-sel-inclusive-p* variable is set true, then the selection also includes one more character to the right of the rightmost endpoint, if there is such a character within the current line, giving rise to the appearance that the selection is determined by the starting and ending character, and includes them. This type of selection pairs well with a block-shaped cursor. .coNP Special Variable @ *listener-pprint-p* .desc This Boolean variable controls how the listener prints the results of evaluations. It is .code nil by default. When the variable is .codn nil , the evaluation result of each line entered into the listener is printed using the .code prinl function. Thus values are rendered in a machine-readable syntax, ensuring read/print consistency. If the variable is set true, the evaluation result of each line is printed using the .code pprinl function. .coNP Special Variable @ *listener-greedy-eval-p* .desc The special variable .code *listener-greedy-eval-p* controls whether or not a "greedy evaluation" feature is enabled in the listener. The default value is .codn nil , disabling the feature. Greedy evaluation means that after the listener evaluates the input expressions successfully and prints the value of the last one, it then checks whether that value is an expression that may be further subject to nontrivial evaluation. If so, it evaluates that expression, and prints the resulting value. The process is then repeated with the resulting value. It keeps repeating until evaluation throws an error, or produces a self-evaluating object. These additional evaluations are performed in such a way that all warnings are suppressed and all other exceptions are intercepted.
Greedy evaluation doesn't affect the state of the listener. Only the original expression is entered into the history. Only the value of the original expression is saved in the result hash or a numbered variable. The command-line number .code *n is incremented by one. The additional evaluations are only performed for the purpose of producing useful output. The evaluations may have side effects.
.TP* Example:
.verb
  1> (set *listener-greedy-eval-p* t)
  t
  2> 'a
  a
  3> (defvar b 2)
  b
  2
  4> (defvar c '(+ 2 2))
  c
  (+ 2 2)
  4
  5> (defvar d '(list '+ 2 2))
  d
  (list '+ 2 2)
  (+ 2 2)
  4
.brev
The .code "(defvar d ...)" form produces the .code d symbol as its result value. That symbol has a variable binding as a result of that .code defvar and so evaluates; that evaluation produces .codn "(list '+ 2 2)" , the contents of .codn d . That object is a Lisp expression and is evaluated, producing .code "(+ 2 2)" and that is also an expression, which reduces to .codn 4 . The object .code 4 is self-evaluating, and so the greedy evaluation process stops. .coNP Special Variable @ *listener-auto-compound-p* .desc The special variable .code *listener-auto-compound-p* controls whether or not the listener is operating in "auto compound expression" mode. The default value is .codn nil , disabling the feature. Normally, an input line can contain multiple expressions, which are treated as if they were combined into a single expression by .codn progn . Thus all the expressions are evaluated, and the value from the last one is printed. In auto compound mode, the behavior changes. An input line which consists of multiple expressions is turned into a compound form whose constituents are those items. Thus, for instance, the input .code "+ 2 2" is treated as the compound expression .code "(+ 2 2)" resulting in .code 4 being calculated. When a single expression is input, it is evaluated as-is, and thus in that case auto compound expression mode makes no difference.
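.TP* Example:
The following hypothetical session is a sketch of the behavior just described. It assumes a freshly started listener in which the other listener variables have their default values; the prompt numbers are illustrative.
.verb
  1> (set *listener-auto-compound-p* t)
  t
  2> + 2 2
  4
  3> (+ 2 2)
  4
.brev
Input line 2 consists of three expressions, and so is treated as the compound form
.codn "(+ 2 2)" .
Input lines 1 and 3 each consist of a single expression, which is evaluated as-is.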
.coNP Special Variable @ *doc-url* .desc The special variable .code *doc-url* holds a character string representing a web URL intended to point to the HTML version of this document. The initial value points to the publicly hosted document on the Internet. The user may change this to point to another location, such as a locally hosted copy of the document. This variable is used by the .code doc function. .SS* Listener-Related Functions .coNP Function @ doc .synb .mets (doc <> [ symbol ]) .syne .desc The .code doc function provides help for the library symbol .metn symbol . If information about .meta symbol is available in the HTML version of this document, and is indexed, then this function causes that document to be opened using a web browser, such that the browser navigates to the appropriate section of the manual. If the .meta symbol argument is omitted, then the document is opened without navigating to a particular section. The base URL for the document is configured by the .code *doc-url* variable. If .meta symbol is successfully found, or else not specified, and .code doc successfully invokes the URL-opening mechanism, it returns .codn t . Otherwise, it throws an error exception. The web browser is invoked using a system-dependent strategy. On MS Windows, the .code ShellExecuteW function is relied upon to open the URL. On other platforms, if the .code BROWSER environment variable exists and is nonempty, its value is assumed to indicate the name or path of the web-browsing program which can accept the URL as an argument. If this variable doesn't exist or is empty, then .code doc searches for a system-dependent URL-opening utility, such as .codn xdg-open . If this utility is not found, then .code doc falls back to searching for a browser using one of several names. If no URL-opening mechanism is identified using the above strategies, an error exception is thrown. 
However, if the mechanism is identified, but does not successfully dispatch the URL to a browser, there is no requirement to throw an error exception. It may appear that the .code doc function returns .code t but has no effect. .coNP Function @ quip .synb .mets (quip) .syne .desc The .code quip function returns a randomly selected string containing a humorous quip, quote or witticism. The following code may be added to .code .txr-profile to produce a random quip on startup: .verb (put-line (quip)) .brev The .code quip function was introduced in \*(TX 244. If the .code .txr-profile is used with installations of older \*(TX versions, it is recommended to use the following, to avoid calling the undefined function, as well as to prevent a warning: .verb (if (fboundp 'quip) (put-line (quip)) (defun quip ())) .brev In addition, older \*(TX versions require the profile file to be named .codn .txr_profile . .SH* SETUID/SETGID OPERATION On platforms with the Unix filesystem and process security model, \*(TX has support for executing setuid/setgid scripts, even on platforms whose operating system kernel does not honor the setuid/setgid bit on hash-bang scripts. On these systems, taking advantage of the feature requires \*(TX to be installed as a setuid/setgid executable. For this reason, \*(TX is aware when it is executed setuid and takes care to manage privileges. The following description of the handling of setuid also applies to the parallel handling of setgid. When \*(TX starts, early in its execution it determines whether or not it is executing setuid. If so, it temporarily drops privileges, as a precaution. This is done before processing the command-line arguments. When \*(TX determines that it is executing a setuid script (a file marked executable to its owner and attributed with the set-user-ID bit), it then attempts to impersonate the owner of the script file by changing its effective user ID to that owner just before executing the file.
It retains the real and saved user ID. If the attempt to assume that user ID is unsuccessful, then \*(TX permanently drops setuid privileges before executing the script. Likewise, before executing any code other than a setuid script, \*(TX also drops privileges. \*(TX tries to honor and implement the setuid permissions on a script whether or not it is running setuid. When not running setuid, it nevertheless tries to change its effective user ID to that of the owner of the setuid script. This will succeed if it has sufficient permissions to do so. To rephrase: in order for \*(TX to execute a file which is setuid root, it has to be running with a root effective user ID somehow. In order to execute a file which is setuid to a non-root user, \*(TX has to be running effectively as root or else as that user. It doesn't matter whether these privileges are achieved effectively using the setuid mechanism, or whether \*(TX is running with the required user ID as its real ID. However, if \*(TX is running setuid, it takes special care to temporarily drop the privileges as early as possible, and eventually to drop the privileges permanently before executing any code, other than the setuid script. If the setuid script cannot be executed with the privileges it calls for, \*(TX also drops privileges and executes it anyway, strictly as the real user who invoked the \*(TX executable. What it means to drop privileges is to change the effective user ID and the saved user ID to be equal to the real user ID. On platforms where the .code setresuid function is available, \*(TX uses that function to drop privileges. On platforms where .code setresuid is not available, \*(TX tries to drop privileges using the C language function call .codn "setuid(r)" , where .code r is the previously noted real user ID obtained from .codn getuid() . On some platforms, this only works for dropping root privileges: it overwrites the real and saved ID only if the caller is effectively root.
On those platforms, this approach does not drop non-root privileges. \*(TX tries to detect whether this approach worked by evaluating the C language expression .codn "seteuid(e)" , where .code e is the previously noted effective user ID. In other words, it attempts to regain the dropped privilege by recovering the previous effective ID. If this attempt succeeds, \*(TX immediately aborts. Dropping setgid privileges is similar. Where .code setresgid is available it is used, otherwise an attempt is made with .code "setgid(r)" where .code r is the previously noted real group ID. Then a test using .code "setegid(e)" is performed using the original effective group ID as .codn e . This is done after dropping any setuid root user ID privilege which would allow such a test to succeed. If \*(TX is running both setuid and setgid, and executes a script which is setuid only, it will still drop group privileges, and vice versa: if it executes a setgid script, it will drop user privileges. For instance, if a root-owned \*(TX runs a setgid script which is owned by user .code 10 and group-owned by group .codn 20 , that script will run with an effective group ID of 20. The effective user ID will be that of the user who invoked the script: \*(TX will drop the root privilege to the original real ID of the user, while for the setgid operation, it will change to the group ID of the script. The setuid/setgid privilege machinery in \*(TX does not manipulate the list of supplementary ("ancillary", in the language of POSIX) group IDs. It is unnecessary for security because the list does not change while running with setuid privilege. No group IDs are added to the list which need to be retracted when privileges are dropped. The supplementary groups also persist across the execution of a setuid/setgid script. .SH* STANDALONE APPLICATION SUPPORT The \*(TX executable image supports a general mechanism by means of which a custom program can be packaged as an apparent standalone executable.
.SS* The Internal Argument String The \*(TX executable contains a 128 byte data area preceded by the seven-byte ASCII character sequence .strn @(txr): . The 128 byte data area which follows this identifying prefix represents a null-terminated UTF-8 string. In the stock executable, this area is filled with null bytes. If the \*(TX executable is edited such that this area is replaced with a nonempty, null-terminated UTF-8 string, the program will, for the purposes of command-line-argument processing, treat this string as if it were the one and only command-line argument. (The original command-line arguments are still retained in the .code *args* and .code *args-full* variables). The function .code save-exe creates a copy of the \*(TX executable with a custom internal argument. .TP* Example: Suppose that \*(TX is copied to an executable in the same directory called .code myapp (or .code myapp.exe on an operating system which requires the .code .exe suffix). Also suppose that in the same directory, there exists a file called .codn main.tl . This .code myapp executable can then be edited so that the data area which follows the .code @(txr): bytes contains the following string: .verb --args|-e|(load (path-cat (dir-name txr-exe-path) "main.tl")) .brev When the .code myapp executable is invoked, it will process the above string as a single command-line argument, causing the .code main.tl \*(TL source file to be loaded. Any arguments passed to .code myapp are ignored for the purposes of command-line processing, but remain available to .code main.tl via the .code *args* variable. .SS* Deployment Directory Structure The \*(TX executable may require library files, depending on the functionality invoked by the program code. Library files are located relative to the installation directory, called the .IR sysroot . The executable tries to dynamically determine the sysroot from its own location, according to the following directory structure.
The executable may be renamed; it need not be called .codn txr : .verb /path/to/sysroot/bin/txr .../share/txr/stdlib/cadr.tl .../stdlib/cadr.tlo .../stdlib/except.tl ... .../share/txr/lib/... .brev The above structure is assumed if the executable finds itself in a directory named .strn bin . Otherwise, if the executable finds itself in a directory not named .strn bin , the following structure is expected: .verb /path/to/installation/txr .../stdlib/cadr.tl .../stdlib/cadr.tlo .../stdlib/except.tl ... .../lib/... .brev The .strn lib/ directory shown above is for third-party libraries. This is the directory indicated in the default value of the .code *load-search-dirs* special variable. The directory is not required to exist. Note that this structure changed starting in \*(TX 264. Older versions of \*(TX, when the executable is not in a directory named .strn bin , expect the following structure: .verb /path/to/installation/txr .../share/txr/stdlib/cadr.tl .../share/txr/stdlib/cadr.tlo .../share/txr/stdlib/except.tl ... .brev When a custom application is deployed using a possibly renamed .code txr executable, one of the above structures should be observed: either the sysroot with a .code bin subdirectory where the executable is located, on the same level with the .code share directory, or else the second structure in which the .code stdlib directory is a direct subdirectory of the executable directory. If one of these structures is not observed, the application may fail due to the failure of a library file to load. If the executable discovers that its name ends in the suffix .str lisp (or else .str lisp.exe on the MS Windows platform) then the behavior is as if the .code --lisp command line option had been given. Similarly, if the executable finds that its name ends in .str vm (or .str vm.exe on MS Windows) it behaves as if the .code --compiled option had been given.
.coSS Function @ save-exe .synb .mets (save-exe < path << arg-string ) .syne .desc The .code save-exe function produces an edited copy of the \*(TX executable at the specified .metn path , inserting .meta arg-string as the internal argument string. In order for the copied executable to be useful, the required installation directory structure must be provided around it, as described in the previous section, Deployment Directory Structure. The return value of .code save-exe is unspecified. The .code arg-string should encode to 127 bytes of UTF-8 or less, or else it will be abruptly truncated, possibly in the middle of a UTF-8 sequence. .TP* Example: Create a copy of \*(TX called .code myapp which will load a file called .code main.tl that is located in the same directory. .verb (save-exe "myapp" "--args|-e|(load (path-cat (dir-name txr-exe-path) \e \e \e"main.tl\e"))") .brev .SH* DEBUGGER \*(TX had a simple, crude, built-in debugger, which was removed. .SH* COMPATIBILITY .SS* Overview New \*(TX versions are usually intended to be backward-compatible with prior releases in the sense that documented features will continue to work in the same way. Due to new features, new versions of \*(TX will supply new behaviors where old versions of \*(TX would have produced an error, such as a syntax error. Though, strictly speaking, this means that something is working differently in a new version, replacing an error situation with functionality is usually not considered a deviation from backward-compatibility. There is one notable deviation from this general requirement for backwards compatibility: the handling of compiled files. For pragmatic reasons, from time to time \*(TX may break backward compatibility, such that a newer version of \*(TX will not load compiled files produced by an older version. The files will have to be recompiled with the new \*(TX. More details are given in the section .B "Compiled File Compatibility" under the major section .BR "LISP COMPILATION" . 
The rationale for not requiring backward compatibility support for older compiled files is that older files require the older implementation of the virtual machine which they target. In some cases the differences between the older virtual machine and the new one are so great that \*(TX would have to carry a whole separate virtual-machine implementation for the sake of the older files, which is a significant burden. .coSS The @ -C compatibility option When a change is introduced which is not backward compatible, \*(TX's .code -C option can be used to request emulation of old behavior. The option was introduced in \*(TX 98, and so the oldest \*(TX version which can be emulated is \*(TX 97. Side effects occur in the processing of the option. If the option is specified multiple times, the behavior is unspecified. .coSS Environment variable @ TXR_COMPAT If the .code TXR_COMPAT environment variable exists, and its value is not an empty string, it must contain a decimal integer. Its value is taken by \*(TX as a request to emulate old behaviors, just like the value of the .code -C option. If the variable has incorrect contents or an out-of-range value, \*(TX will print an error diagnostic and exit. If both .code -C and the .code TXR_COMPAT environment variable are supplied, the behavior is unspecified. .SS* Compatibility Version Values The following are the version values which have a special meaning as arguments to the .code -C option, along with a description of what behaviors are affected. For each of these version values, the described behaviors are provided if .code -C is given an argument which is equal or lower. For instance .code "-C 103" selects the behaviors described below for version 105, but not those for 102. .IP 294 Until \*(TX 294, the .code pprint function rendered a buffer object simply by sending its raw bytes to the destination stream, rather than rendering the object as a stream of hexadecimal digit pairs.
The old behavior is restored with compatibility values of 294 or lower. .IP 289 Until \*(TX 289, the .code replace function had different semantics in the handling of the .meta index-list argument (now called .metn index-seq ) and the .meta replacement-sequence argument. When the .meta index-list contained more indices than elements of .meta replacement-sequence then the replacement of elements in the main sequence would stop. No deletion of elements was performed. This behavior is restored by selecting 289 or lower compatibility. Note, however, that this breaks the ability of the .code del macro to delete items from a sequence by .metn index-list . That behavior didn't work in version 289 or older, and is supported by the new semantics of .metn replace , which is capable of deleting items specified by .metn index-list . .IP 288 Integers and ranges callable like functions are a new feature introduced after \*(TX 288. The latter, callable ranges, are a breaking change; certain expressions with a range in the function position interpreted the range as a sequence. Using this compatibility value disables ranges being callable, restoring the old behaviors. .IP 283 In \*(TX 283 and older versions, the .meta flags parameter of the .code ftw function defaults to zero, rather than .codn ftw-phys . .IP 275 In \*(TX 275 and older versions, the FFI type operator .code align can weaken the alignment of a type. The current behavior is that it can only increase the strictness of alignment, which mimics the .code aligned type attribute found in GNU C. For instance .code "(align 2 int)" will not have an effect, because 2 is lower than the alignment of .codn int . The .code pack type operator must be used instead to specify any alignment, including lower. A compatibility value of 275 or lower restores the ability of .code align to specify weaker alignment.
.IP 273 In \*(TX 273 and older versions, .code lazy-str-get-trailing-list has a flaw, which causes it to produce an extra empty string. Because the .code @(freeform) directive in the pattern language is based on lazy strings, and depends on this function, it is affected by this issue. The extra empty string is produced because the materialized prefix of the lazy string is split on the terminator without regard for the fact that it ends in the terminator, producing an extra empty piece. For instance, if the terminator is .strn \en the materialized prefix of the lazy string is .strn foo\en and the remaining list of not-yet-materialized lazy string material is .codn "(\(dqbar\(dq \(dqbaz\(dq)" , then the returned list is .codn "(\(dqfoo\(dq \(dq\(dq \(dqbar\(dq \(dqbaz\(dq)" , rather than .codn "(\(dqfoo\(dq \(dqbar\(dq \(dqbaz\(dq)" . Whenever the lazy string's .meta terminator is non-empty, this issue reproduces in almost all instances, because the materialized prefix, unless it is empty, is always terminated by the .meta terminator and so the split always produces the extra empty string. This is not a rare edge case. Compatibility values of 273 and lower restore this behavior. .IP 272 The compatibility version value 272 restores old behaviors in the pattern language with regard to the regex and function cases of positive match variables. In \*(TX 273, several semantic improvements took place in this area, which can break existing code. Pattern variables of the form .mono .meti >> @{ bident >> ( fun >> [ args ...])} .onom can now invoke a vertical function against the full input, and the variable can consequently be bound to multiple lines. Previously this syntax invoked only horizontal functions or else vertical functions in a single-line horizontal mode. That behavior is restored by 272 or lower compatibility. Secondly, the function is now always invoked, whether or not the variable has a binding.
The variable is then matched against the text spanned by the function to either give it a new binding or match the existing binding. The old behavior, restored by 272 or lower compatibility, is that the function is not invoked when the variable has a binding; the variable's value is instead used to match text. Lastly, a similar change took place in positive match regular expression variables of the .mono .meti >> @{ bident <> / regex /} .onom form. Prior to 273, when a variable of this form has an existing binding, the regex is ignored, and the situation is treated as a match for the variable content. This old behavior is also restored. .IP 265 Until \*(TX 265, the .code with-resources macro exhibited an undocumented behavior: the three-element binding expression .mono .meti >> ( var < init << cleanup ) .onom immediately caused the .code with-resources form to terminate with a return value of .code nil if the .meta init form returned .codn nil . Neither the .meta cleanup in the same expression, nor any subsequent binding expressions or the body of the construct, would be evaluated. Prior cleanup forms would be evaluated in reverse order, as documented. A compatibility value of 265 or less restores this behavior. .IP 262 Selecting 262 compatibility restores a wrong behavior which existed between versions 191 and 262 due to a regression. The wrong behavior is that the .code defsymacro operator macro-expanded the replacement form, instead of associating the macro symbol with the unexpanded form. This makes a crucial difference to symbol macros which rely on expansion-time effects, such as producing a different expansion each time they are used. .IP 258 Selecting 258 or lower compatibility causes .code abs-path-p to behave like .codn portable-abs-path-p . .IP 257 Until \*(TX 257, the function .code lexical-var-p returned .code t for not only lexical variables, but also for locally bound special variables, which are not lexical.
The behavior is restored if 257 or older compatibility is selected. .IP 251 Until \*(TX 251, the syntax .code "obj.[fun arg]" was equivalent to .codn "[obj.fun arg]" , providing little utility. A compatibility value of 251 or lower restores that behavior. The new behavior is that .code "obj.[fun arg]" is equivalent to .codn "[obj.fun obj arg]" , with .code obj evaluated only once, performing method dispatch. .IP 248 Until \*(TX 248, the .code hash-revget function defaulted to using .code eql equality for searching the hash table for matching values rather than the current .codn equal . Also, until 248, the .code @ token for denoting meta-expressions was treated with a low precedence relative to the range dot .code .. token. This led to strange results, such as .code @(a)..@(b) parsing in a way equivalent to .code "@(rcons a @(b))" rather than .codn "(rcons @(a) @(b))" . Not only is that undesirable due to the lack of symmetry, it's also inconsistent with .code "@a..@b" denoting .codn "(rcons @a @b)" . The latter is because in that case the .code @ is handled as part of the symbol token, and not as a separate operator. A compatibility value of 248 or lower restores the above old behaviors of .code @ and .codn hash-revget . .IP 244 Until \*(TX 244, the .code env-hash function returned a new hash table each time it was called. The behavior is restored if 244 or older compatibility is selected. .IP 243 Two mistakes in the pseudorandom number generator (PRNG) were discovered, affecting \*(TX 243 and older. Using this compatibility value, or lower, will restore the buggy behavior, allowing pseudorandom number sequences produced by those older versions to be reproduced. The PRNG is intended to be an implementation of the WELL512a PRNG described by Panneton and L'Ecuyer. The coding mistakes, however, resulted in the PRNG being an implementation of something other than WELL512a.
.IP 242 In \*(TX 242 and older, the instantiation of an object whose type inherits from the same supertype more than once resulted in duplicate execution of the supertype's initialization. This was a documented behavior. After 242, duplicate initialization is suppressed. For more information, see the section .BR "Duplicate Supertypes" . A compatibility value of 242 or lower restores the duplicate initialization behavior. .IP 237 Compatibility values of 237 or lower restore the destructive behavior of the .code sort and .code shuffle functions. .IP 234 In \*(TX 234 and older versions, the exception throwing functions .code throw and .code throwf did not return, regardless of the exception type. All unhandled exceptions triggered internal handling leading to unwinding and termination. The current behavior is that only .code error exceptions lead to termination. When a non-error exception isn't intercepted by a catch or handler, the .code throw or .code throwf returns normally, yielding the value .codn nil . If a compatibility value equal to or lower than 234 is requested, the old behavior occurs: all unhandled exceptions terminate. .IP 231 Versions of \*(TX until 231 contained an undocumented feature: some library functions which are documented as having parameters that must be of string type were allowing the arguments to be symbols. For such symbolic arguments, the name of the symbol obtained from .code symbol-name was implicitly taken as the required string value. This behavior was removed: passing symbolic arguments to library function parameters documented as strings will cause an exception to be thrown. If a compatibility value of 231 or lower is specified, however, the tolerant behavior is restored. .IP 227 In \*(TX 227 and older versions, the functions .codn carray-uint , .codn carray-int , .code uint-carray and .code int-carray had different names, namely .codn carray-unum , .codn carray-num , .code unum-carray and .codn num-carray , respectively.
If 227 or lower compatibility is selected, these functions become available under their old names in addition to their new names. .IP 225 After \*(TX 225, the behavior of the .code do operator was adjusted. Previously, a form like .code "(do set x)" which contains no variable references like .codn @1 , .code @2 or .codn @rest , generated a function similar to .codn "(lambda (. rest) (set x))" . This was contrary to documentation, which states that .code "(do set x)" should produce a variadic function which has one required argument, and which assigns that argument to the variable .code x when invoked. The current implementation is that .code "(do set x)" is equivalent to .code "(do set x @1)" which produces the documented behavior. If 225 or lower compatibility is selected, then the old behavior of .code do takes effect. .IP 224 After \*(TX 224, the treatment of certain special structure functions has changed. Selecting 224 compatibility or lower restores that behavior. The specification given in the .B "Special Structure Functions" paragraph has always stated that special functions must be static slots, and that the behavior is unspecified if they are instance slots. The behavior of \*(TX 224 and earlier was that these functions worked anyway if they were instance slots; after \*(TX 224, some special functions are no longer recognized if bound to instance slots. .IP 222 After \*(TX 222, the behavior of .code :vars in .code @(collect) was subject to an adjustment. Previously, if the collect body didn't bind any variables, and both required and optional variables were specified in .codn :vars , it would still bind all of the optional ones to their default values. This was a poor behavior which violated the idea that .code :vars enforces an all-or-nothing binding discipline to keep the collected lists consistent. Selecting 222 compatibility or lower restores this behavior.
.IP 215 After \*(TX 215, the behavior of the .code load function changed with respect to its treatment of the .code *load-path* variable. In cases where .code load resolved the path by adding a suffix, .code *load-path* was bound to the unsuffixed name, which was a documented behavior. After \*(TX 215, also, the behavior of the .code sub-str function changed. When the arguments implicate the entire string, .code sub-str started just returning the original string, and not making a copy. The old behavior was to always make a copy. The above old behaviors of .code load and .code sub-str are restored if 215 or lower compatibility is requested. Note, however, that the restoration of the .code sub-str behavior in response to the compatibility option was only introduced in \*(TX 251. In \*(TX 249 and older, the compatibility value has no effect on the behavior of .codn sub-str . .IP 202 Up to \*(TX 202, the .code logxor function was incorrectly implemented, producing wrong results when both arguments are the same fixnum integer, or the same bignum object. The incorrect behavior is restored if 202 or earlier compatibility is requested. After 202, the behavior of the .code print function changed with regard to symbols in the keyword package. Regardless of the .meta pretty-p flag, keywords are printed with the leading colon. Compatibility with 202 or earlier restores the behavior that when the .meta pretty-p flag is true, symbols are printed without package prefixes. .IP 199 After \*(TX 199, certain global variables that had been deprecated for a long time, and no longer documented, were removed. Requesting 199 or earlier compatibility restores those variables. .IP 190 Until \*(TX 190, the .code reset-struct function neglected to perform .code :postinit initializations, and didn't invoke finalization on the structure object if an exception was thrown during reinitialization. 
Thus, contrary to documented requirements, reinitialization of a structure didn't behave like fresh construction. Also, until \*(TX 190, macro parameter lists implemented the requirement that a .code : (colon keyword symbol) argument to an optional parameter was treated as a missing argument, triggering argument-defaulting behavior. That requirement was removed; the colon symbol behaves as an ordinary value under destructuring with macro parameter lists. Moreover, until \*(TX 190, the .code pub symbol package didn't exist; the .code *package* variable was initialized to the user package and so symbols introduced by application code were interned in the same package as the \*(TL library. Until \*(TX 190, .code defmacro and .code defsymacro forms were evaluated immediately during macro expansion; in \*(TX 191 or later, this eager evaluation was abandoned. Unfortunately, this change introduced a regression, causing the replacement form of a .code defsymacro to be macro-expanded at the time that form is traversed by the expander, so that the macro is associated with the expanded version of that form. This is something which had been fixed in 137. It went unnoticed until much later, after the 262 release. All the above old behaviors are restored in compatibility with version 190 or earlier. Finally, one more change after \*(TX 190 that is controlled by the compatibility mechanism was a critical redesign of the requirements for the behavior of the .code ldiff function. Version 190 compatibility causes the .code ldiff symbol to refer to the old implementation of .codn ldiff . .IP 188 Until \*(TX 188, .codn equal -based hash tables printed using the notation .code "#H((:equal-based ...) ...)" whereas .codn eql -based hash tables simply omitted the .code :equal-based keyword. Changes were introduced in \*(TX 187 which gave rise to a read/print inconsistency with printing behavior.
In \*(TX 189, further changes were introduced to fix this inconsistency: .codn equal -based hash tables print without any keyword indicating equality, and .codn eql -based hash tables print as .codn "#H((:eql-based) ...)" . If 188 or lower compatibility is selected, hash tables are printed in the old way. .IP 187 Until \*(TX 187, hash tables constructed by the .code hash function were based on .code eql equality by default; the .code :equal-based keyword argument had to be specified to override this default, and the .code :eql-based keyword didn't exist. Selecting 187 or lower compatibility restores the behavior of .code eql equality being default, and the .code :eql-based keyword being unrecognized. This affects all functions which implicitly rely on .codn hash , those being: .codn uniq , .codn unique , and .codn group-by . In spite of these changes, the printed representation of hash tables continues to use the .code :equal-based keyword to indicate hash tables based on .code equal and its absence to indicate .code eql equality. The new .code :eql-based keyword may be used in hash literals (unless 187 compatibility is in effect, in which case it is ignored). .IP 184 A value of 184 or lower switches to the old implementation of the .code op and .code do macros which was replaced starting in \*(TX 185. Also, this has the effect of disabling the special recognition of meta-expressions and meta-symbols in the dot position of function calls, and the macro expansion of meta-symbols in quasiliterals. This is because the old .code op implementation implements these behaviors itself. The implication is that user code which binds custom macros to .code sys:var or .code sys:expr may be affected by 184 or lower compatibility. .IP 185 A value of 185 or lower restores the old precedence of the double dot notation for expressing ranges, relative to the referencing dot. Until \*(TX 185, the expression .code a.b..c.d parsed as .codn "(qref a (rcons b c) d)" .
What is worse, it parsed this way even if written as .codn "a.b .. c.d" . Starting in \*(TX 186, .code .. has a lower precedence, producing the more useful and intuitive parse .codn "(rcons (qref a b) (qref c d))" : in other words, the range with endpoints given by .code a.b and .codn c.d . .IP 183 A value of 183 or lower restores an inconsistent behavior in the .code "@(bind)" directive and other places in the \*(TX pattern language where binding takes place. Prior to version 184, a string-tree match was only tried in both directions when the left-hand side of a binding (the "pattern") was a variable. For non-variable pattern terms, such as Lisp expressions or atoms, the string-tree match was tried in one direction only: a string tree arising out of the pattern could match a string atom value on the right side. A string tree is a nested list structure whose leaves are strings: a list of strings, a list of lists of strings, and so on, in any mixture. Concretely, before \*(TX 184, .mono @(bind "a" ("a" "b" "c")) .onom didn't match, but .mono @(bind ("a" "b" "c") "a") .onom did. However, if the variable .code a contained .strn a then .mono @(bind a ("a" "b" "c")) .onom did match: an inconsistency. .IP 177 A value of 177 or lower causes the emulation of a bug which was present in the .code rng awk macro. A range whose start and end condition matched on the same record failed to activate for that record, even though .code rng is inclusive. The behavior is incompatible with POSIX Awk. .IP 174 A value of 174 or lower restores a previous behavior of variable substitution in the .code output directive and in quasiliterals in both the \*(TX pattern language and \*(TL. The behavior in question is the evaluation of the element indexing or range selection modifier, exemplified by .codn "@{a [2]}" . The previous behavior was that if the variable is of any type other than list, it is converted to a string (unless it already is one). The indexing then applies to the string. 
If it is a list then the indexing or range selection applies to the original list value, prior to conversion to text. The current behavior is that indexing and range selection are applied to the original value if that value is any sequence type which satisfies the .code seqp function, otherwise to the string representation. .IP 172 A value of 172 or lower restores a behavior of the \*(TX pattern matching language when matching a variable followed by a directive, such as .codn "@a@(fun b)" . The old behavior is that the scan for a match for the directive takes place in an environment in which a binding for .code a has not yet been established. The new behavior is that the variable is always bound prior to the processing of the directive. During the search, it is bound to the range of text spanning between the starting position and the position being tried. .IP 170 A value of 170 or lower disables the behavior that \*(TX scans standard input when no input sources are specified on the command line. Standard input must be requested explicitly using the .code - argument. This is how it was in all versions of \*(TX up to 170. Some programs may behave differently because of this. Specifically, programs which do not take any arguments, and do not select an input source using the .code @(next) directive, or suppress the use of an input source using .codn "@(next nil)" , may now accidentally read from standard input. Until version 170, the functions .codn split , .codn split* , .code partition and .code partition* ignored negative indices in their .meta index-list argument (now called .metn index-seq ). The new behavior is that the length of the input sequence is added to any negative index values. The resulting values are then ignored if they are still negative. .IP 165 A value of 165 restores the following behaviors, which changed starting in 166. There was a change in Lisp evaluation support of the \*(TX pattern language.
Specifically, Lisp argument forms were not subject to expansion prior to evaluation in these directives: .codn output , .codn mod , .codn modlast , .codn skip , .codn fuzz , .codn load , .codn close , .codn call , .code cat and .codn next . .IP 161 Version 161 was the last version in which a bug existed in the .code handle macro. In spite of the documentation claiming that .code handle has the same syntax as .codn catch , the clauses of .code handle were being passed the exception symbol as the leftmost argument, followed by the exception arguments. This convention is different from .code catch clauses which do not receive the exception symbol, only the arguments. The discrepancy was corrected by making .code handle behave like .codn catch , as documented. Requesting compatibility with 161 or earlier restores the previous behavior of the .code handle macro. .IP 156 After version 156, two behaviors changed in the macro expander for .codn caseq , .code caseql and .codn casequal : one outright bug was fixed, and one hitherto undocumented behavior was changed and specified in the documentation at the same time. Selecting a compatibility value of 156 or less restores the previous behaviors. The bug was that single-atom case keys were undergoing evaluation. For instance .code "(caseql x (a 0))" would arrange for the evaluation of .code a as a variable, rather than treating it as the symbol .code a itself. Though the compatibility mechanism restores the behavior, applications depending on the evaluating behavior should be changed to instead use .codn caseq* , .code caseql* or .codn casequal* . A workaround for this bug for \*(TX versions 156 or older is to replace simple keys with a key list of length one, exemplified by a rewrite of the foregoing expression to .codn "(caseql x ((a) 0))" . Here .code a is not evaluated. The undocumented behavior was that a matching clause which has no forms to be evaluated was producing a result value of .codn t .
For example .code "(caseql 1 (1))" previously yielded .codn t , but now yields .codn nil , and this behavior is documented. .IP 155 After version 155, the .code tok-str and .code tok-where functions changed semantics. Previously, these functions exhibited the flaw that under some conditions they extracted an empty token immediately following a nonempty token. This behavior was working as designed and documented, but the design was flawed, creating a major difficulty in simple tokenizing tasks when tokens may be empty strings. Requesting compatibility with version 155 or earlier restores the behavior. .IP 154 After version 154, changes were introduced in the semantics of struct literals. Previously, the syntax .code "#S(abc x a y b)" denoted the construction of an instance of .code abc with .code "x a y b" as the constructor parameters, similarly to .codn "(new abc x 'a y 'b)" . The new behavior is that .code abc is constructed using no parameters, as if by .code "(new abc)" and then the slot values are assigned. This means that the values specified in the literal override any manipulations of those slots by the type's user-defined .code :postinit handlers. Also, after 154, .code print methods are expected to take three arguments and are invoked for both pretty printing and regular machine-readable printing. Until 154, a struct's .code print method was called only when that struct was being pretty-printed, and only with two arguments; ordinary printing side-stepped the method and rendered the standard .code #S syntax featuring all instance slots. .IP 151 After version 151, changes were implemented to the way static slots work in \*(TL structs. Selecting compatibility with 151 restores most of the behaviors. Until 151, each structure type had its own instance of static slots whether they were newly defined or inherited. Under the new scheme, a derived struct shares one instance of each inherited static slot with its base type.
Under the old scheme, a struct inherits the static initialization functions of its bases (the .meta static-initfun argument passed in .codn make-struct-type ). These are invoked because they are relied upon by the .code defstruct macro to perform the initializations of all the inherited static slots. Under the new scheme, the static initialization functions are not inherited. Only the type's own .meta static-initfun is invoked to initialize its newly defined static slots that it doesn't share with the parent. The inherited static slots simply preserve the values they have in the base type; their values are untouched by the introduction of a derived type. The .code static-slot-ensure function also changed semantics after version 151. The old behavior was problematic because it affected all static slots throughout the inheritance hierarchy matching the name passed in by argument. Since this function is the basis for redefining methods, its behavior broke the semantics of overriding. Selecting 151 compatibility only restores the behavior of this function and macros based on it like .codn defmeth : in the situation when it introduces a new static slot into one or more struct types, in compatibility mode it introduces the slot separately into each type without sharing, and it recurses over the entire type hierarchy, storing .meta new-val into all static slots which match .metn name . .IP 150 Until version 150, the .code match-regex function behaved in a different way from what was documented. Rather than returning the length of the match, it returned the index one past the last matching character. In the case when the starting position is zero, these values coincide; they are different if the match begins at some position inside the string. Compatibility with 150 restores the behavior.
The .code match-regst function was also affected by this issue; however, since it returned a nonsense result not corresponding to the matching text, it was repaired without backward compatibility. Also affected by version 150 compatibility are the .code match-regex-right and .code match-regst-right functions. These functions worked as documented; however, their specification changed after version 150 to a semantics which is more useful and less surprising to the programmer. .IP 148 Up until version 148, the .code :postinit handlers specified in a .code defstruct were executed in derived-to-base order, opposite to the order of execution of .code :init handlers. Though described in terms of .code defstruct syntax and concepts, this is actually a change in how .code make-struct-type treats its .meta postinitfun argument. Specifying 148 or earlier compatibility provides this old behavior. Also, until version 148, the .code trim-str function stripped from a string leading and trailing whitespace consisting of not only spaces, tabs and newlines, but also carriage returns, vertical tabs and form feeds. .IP 145 In versions 144 and 145, \*(TX opened files in text mode on Cygwin, enabling conversion between CR-LF line endings and abstract newline characters. This behavior change was retracted, so that files on Cygwin are opened without specifying text mode, causing the streams to be effectively binary. The intended "Windows native" behavior of streams being text mode is instead provided in the Windows version of \*(TX by the Cygnal library. .IP 143 Until version 143, the .code stdlib variable didn't include the trailing slash. The .code makunbound function semantics changed after version 143 to be more compatible with ANSI Common Lisp. Until 143, that function removed only the global binding, leaving the dynamic rebinding of a variable intact.
The .code defsymacro operator neglected to remove the symbol's special variable mark, if the symbol was previously defined as a special variable. Also, until version 143 many more places in the \*(TX pattern language used bind expressions rather than Lisp expressions. The compatibility option restores these behaviors. .IP 142 Until version 142, the \*(TX pattern language supported a prefix convention on data sources. Data sources beginning with the character .code ! were treated as system command pipes, and data sources beginning with .code $ indicated that a directory is to be scanned. This convention was recognized for command-line arguments, for the arguments of the .code @(next) directive, and for those of the .code @(output) directive, whether or not the argument was a literal or a computed value. This feature was dropped from the language after version 142. Also, until version 142, the .code @(next) directive recognized the name .str - as denoting standard input, and .code @(output) recognized it as standard output. These behaviors were also removed; versions after 142 recognize this convention only when it appears as a command-line argument. Lastly, until version 142, the .code @(output) directive evaluated the .meta destination argument as an expression of the \*(TX pattern language, requiring .code @ to be used to denote a Lisp expression. This is no longer required. All these old behaviors are provided if compatibility with 142 or earlier is requested. .IP 139 After \*(TX 139, changes were implemented in the area of pseudorandom number generation. Compatibility with 139 brings back the previous seeding algorithm used by .codn make-random-state , allowing the old pseudorandom sequences to be reproduced. This is only the case if the default value of 8 is used for the .meta warmup-period argument of that function (which didn't exist in 139 or earlier versions).
.IP 138 After \*(TX 138, the variable name lookup rules in the \*(TX pattern language changed for greater utility and consistency. Compatibility with 138 or earlier restores the previous rules under which most accesses to a \*(TL variable from the \*(TX pattern language require the .code @ prefix denoting Lisp evaluation, but some do not. .IP 137 Compatibility with \*(TX 137 restores the behavior of not expanding symbol macros in the dot position of a function call form. For instance if .code x is a symbol macro, in this compatibility mode it is not recognized in a form like .codn "(list 1 2 . x)" . This preserves the behavior of code which depends on .code x in such a form to refer to a variable that is being otherwise shadowed by the symbol macro. \*(TX 137 compatibility also restores a particular behavior of the global and local macro defining operators .code defsymacro and .codn symacrolet : in compatibility mode, these operators macro-expand the replacement forms of symbol macros at expansion time, and then bind the resulting expanded forms to their respective macro symbols. The forms are then potentially expanded again when the symbol macros are substituted. This wrong behavior was never implied by the documentation. The .code with-slots macro is also affected by this, because it is implemented in terms of .codn symacrolet . Lastly, \*(TX 137 compatibility mode also restores another behavior of the dot position in function call forms: if the dot position of a function call form produces a sequence that is not a list, that sequence is converted to a list so that .mono (list . "abc") .onom produces .codn "(#\ea #\eb #\ec)" . After 137, no such treatment is applied to the value and the same form now yields .strn abc . .IP 136 A request for compatibility with \*(TX 136 or earlier restores the old behavior of the .code if directive, which used to be syntactic sugar for a .code cases directive with .code require at the top of each block.
Though semantically well-defined and working as documented, the behavior was confusing, since failed matching caused potential evaluation of multiple clauses, whereas programmers expect an if/elif/else ladder to select exactly one clause. .IP 128 Compatibility with \*(TX 128 or earlier brings back the behavior that expressions in quasiliterals are evaluated according to \*(TX evaluation rules for quasiliterals which occur in the \*(TX pattern language. Similarly, expressions in .code @(output) blocks are treated as \*(TX pattern language expressions. .IP 127 In versions of \*(TX until 127, the functions .codn symbol-function , .code fboundp and .code fmakunbound behaved similarly to their Common Lisp counterparts. See the Dialect Notes under these functions. .IP 124 In \*(TX 124 and earlier versions, the .code @(next) directive didn't evaluate the .meta source argument as a Lisp expression, but as a \*(TX pattern language expression. Lisp expressions thus had to be delimited by .codn @ . The current behavior is that the argument is treated as Lisp. If the compatibility option is set to 124 or lower, the old behavior is restored. However, even without the presence of the compatibility option, if the .meta source argument is a meta-expression or meta-symbol (denoted by the .code @ prefix in front of a compound expression or symbol, respectively) it is also treated in the old way. This latter behavior is obsolescent and will eventually disappear, and the compatibility option will be the only way to get the old behavior. .IP 123 In \*(TX 123 and earlier, the variable initialization forms of a .code for or .code for* loop were evaluated outside of the scope of the implicit .code nil block. They are now inside the block. The compatibility option will restore the old behavior. .IP 121 In \*(TX 121 and earlier versions, \*(TL expressions evaluated in the pattern language were placed in a lexical environment in which the pattern variables were visible as lexical variables.
This meant that these variables could be directly captured in lexical closures. On the other hand, it meant that a Lisp function defined in a .code @(do) block could not access a variable established by a later .codn @(bind) . It doesn't make sense for dynamically captured variables to be lexical, so the rule was changed. The backward compatibility switch will enable the old scoping behavior. Capturing the values of pattern variables in closures is possible indirectly under the new rule: simply bind new lexical variables with their values. .IP 118 The .code slot-p function's name changed to .code slotp after 118. The compatibility option causes .code slot-p to be defined also. .IP 117 The arguments of the .code make-struct-type function changed after version 117. 117 compatibility brings back the old interface. .IP 114 \*(TX until version 114 reported parse errors in this format: .verb ./txr: (file.txr:123): syntax error .brev The new format omits the program name prefix and parentheses. Also, the .code kill function returned an integer, obtained from the return value of the underlying C function, rather than converting that value to a Boolean. The old behavior was not documented, and 114 compatibility restores it. Lastly, prior to 115, random state objects were of type .code *random-state* (the same symbol as the special variable name) rather than of type .codn random-state . This is a bug whose behavior is simulated by 114 compatibility. .IP 113 Version 113 is the last version in which the .codn stat , .codn lstat , and .code fstat functions returned a property list rather than a structure. Requesting 113 compatibility restores the behavior of returning a property list. However, the filesystem testing functions like .code path-exists-p will not work, because they rely on these functions returning a structure. .IP 109 The optional trailing semicolon on hex and octal codes in the \*(TX pattern language was introduced in 110.
The feature is disabled with 109 or lower compatibility, so that .code @\ex21;a encodes .code !;a rather than the current behavior of encoding .codn !a . Also, in 109 and earlier, newlines were allowed in word list literals and word list quasiliterals. They were treated as a word-separating space. A backslash-escaped newline, and all whitespace around it, was deleted just like in ordinary literals, and did not separate words. The old behavior is emulated. .IP 107 Up through \*(TX 107, by accident, there was a function called .code flip as well as an operator by the same name. The function was renamed to .codn flipargs . Version 107 compatibility or earlier provides the function under the original name also. Also, up until this version, \*(TX allowed functions and macros to be defined with the same names as built-in operators and macros. Newer versions reject this as an error. Requesting compatibility to 107 or earlier suppresses the rejection, though without introducing any requirement that redefinition will work as expected. .IP 105 Provides the behavior that the .code open-file function automatically marks a stream open on a TTY device as a real-time stream (subject to the availability of the POSIX .code isatty function). Also allows unrecognized backslash escape sequences in regular expression syntax to simply denote the escaped character literally, as was historically the case prior to \*(TX 106, so that .code \ez for instance denotes .codn z . As of \*(TX 106, these are diagnosed as errors. .IP 102 Up to \*(TX 102, the .code get-string function did not close the stream. This old behavior is emulated. .IP 101 Up to \*(TX 101, the .code make-like function incorrectly returned .code nil when converting the empty list .code nil to string type. This affects numerous generic sequence functions, causing their result to be .code nil instead of an empty string. .IP 100 Up to \*(TX 100, the .code split-str function had an undocumented behavior.
When the .code sep argument was an empty string, it split the string into individual characters as if by calling .codn list-str . This behavior changed to the currently documented behavior starting in \*(TX 101. Also, the arguments of the .code where function, which was introduced in \*(TX 91, were reversed starting in \*(TX 101. .IP 99 Up to \*(TX 99, the substitution of TXR Lisp expressions in .code @(output) directives and in the quasistrings of the pattern language exhibited the buggy behavior that if the TXR Lisp expression produced a list, the list was rendered as a parenthesized representation, or the text .code nil in the empty list case. Moreover, in the .code @(output) case, the value of TXR Lisp expressions was not subject to filtering. Starting with \*(TX 100, these issues are fixed, making the behavior consistent with that of TXR Lisp quasiliterals. .IP 97 Up to \*(TX 97, the error exception symbols such as .code file-error were named with underscores, as in .codn file_error . These error symbols existed: .codn type_error , .codn internal_error , .codn numeric_error , .codn range_error , .codn query_error , .code file_error and .codn process_error . .coSS Variables @ txr-version and @ lib-version .desc The .code txr-version variable gives the version of the \*(TX executable. Programs can express conditional behavior based on detecting the version. The .code lib-version variable gives the version of the installed library of \*(TX code accompanying the executable. It is expected that these two variables have an identical value. Any discrepancy in their value indicates an installation in which the library and the \*(TX executable were upgraded independently. Should such a situation arise in any system and cause a problem, \*(TX programs can be defensively coded against it with the help of these variables. Some features of the \*(TX library are built into the executable, whereas others are in the library directory.
This aspect of library symbols isn't specified in this manual; knowing which of these two variables is relevant to a library feature requires familiarity with the implementation. .SH* APPENDIX .SS* A. NOTES ON EXOTIC REGULAR EXPRESSIONS Users familiar with regular expressions may not be familiar with the complement and intersection operators, which are often absent from text processing tools that support regular expressions. The following remarks are offered in the hope that they may be of some use. .TP* "Equivalence to Sets" Regexp intersection is not essential; it may be obtained from complement and union as follows, since De Morgan's law applies to regular-expression algebra: .code (R1)&(R2) = .codn ~(~(R1)|~(R2)) . (The complement of the union of the complements of R1 and R2 constitutes the intersection.) This law works because the regular expression operators denote set operations in a straightforward way. A regular expression denotes a set of strings (a potentially infinite one) in a condensed way. The union of two regular expressions .code R1|R2 denotes the union of the set of texts denoted by .code R1 and that denoted by .codn R2 . Similarly .code R1&R2 denotes a set intersection, and .code ~R denotes a set complement. Thus algebraic laws that apply to set operations apply to regular expressions. It's useful to keep in mind this relationship between regular expressions and sets in understanding intersection and complement. Given a finite set of strings, like the set .mono { "abc", "def" } .onom which corresponds to the regular expression .codn (abc|def) , the complement is the set which contains an infinite number of strings: it consists of all possible strings except .str abc and .strn def . It includes the empty string, all strings of length 1, all strings of length 2, all strings of length 3 other than .str abc and .strn def , all strings of length 4, etc. 
This means that a "harmless looking" expression like .code ~(abc|def) can actually match arbitrarily long inputs. .TP* "Set Difference" How about matching only three-character-long strings other than .str abc or .strn def ? To express this, regex intersection can be used: these strings are the intersection of the set of all three-character strings, and the set of all strings which are not .str abc or .strn def . The straightforward set-based reasoning leads us to this: .codn ...&~(abc|def) . This .code A&~B idiom is also called set difference, sometimes notated with a minus sign: .code A-B (which is not supported in \*(TX regular-expression syntax). Elements which are in the set .codn A , but not .codn B , are those elements which are in the intersection of .code A with the complement of .codn B . This is similar to the arithmetic rule .codn "A - B = A + -B" : subtraction is equivalent to addition of the additive inverse. Set difference is a useful tool: it enables us to write a positive match which captures a more general set than what is intended (but one whose regular expression is far simpler than a positive match for the exact set we want), then we can intersect this over-generalized set with the complemented set of another regular expression which matches the particulars that we wish excluded. .TP* "Expressiveness versus Power" It turns out that regular expressions which do not make use of the complement or intersection operators are just as powerful as expressions that do. That is to say, with or without these operators, regular expressions can match the same sets of strings (all regular languages). This means that for a given regular expression which uses intersection and complement, it is possible to find a regular expression which doesn't use these operators, yet matches the same set of strings. But, though they exist, such equivalent regular expressions are often much more complicated, which makes them difficult to design. 
Such expressions do not necessarily . B express what it is they match; they merely capture the equivalent set. They perform a job, without making it obvious what it is they do. The use of complement and intersection leads to natural ways of expressing many kinds of matching sets, which not only demonstrate the power to carry out an operation, but also easily express the concept. .TP* "Example: Matching C Language Comments" For instance, using complement, we can write a straightforward regular expression which matches C language comments. A C language comment is the digraph .codn /* , followed by any string which does not contain the closing sequence .codn */ , followed by that closing sequence. Examples of valid comments are .codn /**/ , .code "/* abc */" or .codn /***/ . But C comments do not nest (cannot contain comments), so that .code "/* /* nested */ */" actually consists of the comment .codn "/* /* nested */" , which is followed by the trailing junk .codn */ . Our simple characterization of the interior part of a C comment as a string which does not contain the terminating digraph makes use of the complement, and can be expressed using the complemented regular expression like this: .codn (~.*[*][/].*) . That is to say, strings which contain .code */ are matched by the expression .codn .*[*][/].* : zero or more arbitrary characters, followed by .codn */ , followed by zero or more arbitrary characters. Therefore, the complement of this expression matches all other strings: those which do not contain .codn */ . These strings make up the inside of a C comment between the .code /* and .codn */ . The equivalent simple regex is quite a bit more complicated. Without complement, we must somehow write a positive match for all strings such that we avoid matching .codn */ . Obviously, sequences of characters other than .code * are included: .codn [^*]* . 
Occurrences of .code * are also allowed, but only if followed by something other than a slash, so let's include this via union: .verb ([^*]|[*][^/])* .brev Alas, we already have a bug in this expression. The subexpression .code [*][^/] can match .codn ** , since a .code * is not a .codn / . If the next character in the input is .codn / , we have missed a comment close. To fix the problem we revise to this: .verb ([^*]|[*][^*/])* .brev (The interior of a C language comment is any mixture of zero or more non-asterisks, or digraphs consisting of an asterisk followed by something other than a slash or another asterisk.) Oops, now we have a problem again. What if two consecutive asterisks occur in a comment? They are not matched by .codn [^*] , and they are not matched by .codn [*][^*/] . Actually, our regex must not simply match asterisk-non-asterisk digraphs, but rather runs of asterisks followed by a character which is neither an asterisk nor a slash: .verb ([^*]|[*]*[^*/])* .brev This is still not right, because, for instance, it fails to match the interior of a comment which is terminated by asterisks, including the simple test cases where the comment interior is nothing but asterisks. We have no provision in our expression for this case; the expression requires all runs of asterisks to be followed by something which is not a slash or asterisk. The way to fix this is to add on a subexpression which optionally matches a run of zero or more interior asterisks before the comment close: .verb ([^*]|[*]*[^*/])*[*]* .brev Thus the semi-final regular expression is .verb [/][*]([^*]|[*]*[^*/])*[*]*[*][/] .brev (Interpretation: a C comment is an interior string enclosed in .codn "/* */" , where this interior part consists of a mixture of non-asterisk characters, as well as runs of asterisk characters which are terminated by a character other than a slash, except for possibly one rightmost run of asterisks which extends to the end of the interior, touching the comment close. Phew!)
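.PP
As a hedged cross-check, the following Python sketch exercises the
semi-final expression above against the sample comments discussed earlier.
It is an illustration only: Python's
.code re
module is a backtracking matcher which has no complement or intersection
operators, and it is used here merely because it accepts the positive
expression unchanged. The names
.codn FIRST_TRY ,
.code SEMI_FINAL
and
.code leading_comment
are hypothetical, and wrapping the first attempt's interior in the comment
delimiters is an assumption made for the sake of comparison.
.verb
import re

# Both patterns use the bracket notation from the text, which Python accepts.
# FIRST_TRY wraps the first (buggy) interior in /* and */ (an assumption).
FIRST_TRY = re.compile("[/][*]([^*]|[*][^/])*[*][/]")
SEMI_FINAL = re.compile("[/][*]([^*]|[*]*[^*/])*[*]*[*][/]")

def leading_comment(rx, text):
    # Return the text matched at the start of the input, or None.
    m = rx.match(text)
    return m.group(0) if m else None

# The three valid comments from the text are each matched in full.
for sample in ["/**/", "/* abc */", "/***/"]:
    print(sample, leading_comment(SEMI_FINAL, sample) == sample)

# Comments do not nest: only "/* /* nested */" is matched.
print(leading_comment(SEMI_FINAL, "/* /* nested */ */"))

# The first attempt runs past the real close; the semi-final one stops at it.
print(leading_comment(FIRST_TRY, "/* a **/ x /* b */"))
print(leading_comment(SEMI_FINAL, "/* a **/ x /* b */"))
.brev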
One final simplification is possible: the tail part .code [*]*[*][/] can be reduced to .code [*]+[/] such that the final run of asterisks is regarded as part of an extended comment terminator which consists of one or more asterisks followed by a slash. The regular expression works, but it's cryptic; to someone who has not developed it, it isn't obvious what it is intended to match. Working out complemented matching without complement support from the language is not impossible, but it may be difficult and error-prone, possibly requiring multiple iterations of trial-and-error development involving numerous test cases, resulting in an expression that doesn't have a straightforward relationship to the original idea. .TP* "The Non-Greedy Operator" The non-greedy operator .code % is actually defined in terms of a set difference, which is in turn based on intersection and complement. The uninteresting case .code (R%) where the right operand is empty reduces to .codn (R*) : if there is no trailing context, the non-greedy operator matches .code R as far as possible, possibly to the end of the input, exactly like the greedy operator. The interesting case .code (R%T) is defined as "syntactic sugar" which expands to the expression .code ((R*)&(~.*(T&.+).*))T which means: match the longest string which is matched by .codn R* , but which does not contain a non-empty match for .codn T ; then, match .codn T . This is a useful and expressive notation. With it, we can write the regular expression for matching C language comments simply like this: .code [/][*].%[*][/] (match the opening sequence .codn /* , then match a sequence of zero or more characters non-greedily, and then the closing sequence .codn */ ). With the non-greedy operator, we don't have to think about the interior of the comment as a set of strings which excludes .codn */ . Though the non-greedy operator appears expressive, its apparent simplicity may be deceptive.
It looks as if it works "magically" by itself; "somehow" this .code .% part "knows" only to consume enough characters so that it doesn't swallow an occurrence of the trailing context. Care must be taken that the trailing context passed to the operator really is the correct text that should be excluded by the non-greedy match. For instance, take the expression .codn .%abc . If you intend the trailing context to be merely .codn a , you must be careful to write .codn (.%a)bc . Otherwise, the trailing context is .codn abc , and this means that the .code .% match will consume the longest string that does not contain .codn abc , when in fact what was intended was to consume the longest string that does not contain .codn a . The change in behavior of the .code % operator upon modifying the trailing context is not as intuitive as that of the .code * operator, because the trailing context is deeply involved in its logic. On a related note, for single-character trailing contexts, it may be a good idea to use a complemented character class instead. That is to say, rather than .codn (.%a)bc , consider .codn [^a]*abc . The set of strings which don't contain the character .code a is adequately expressed by .codn [^a]* .
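.PP
The effect of the trailing context can be sketched with a small Python model.
This is not how \*(TX implements the operator: it is a brute-force reading of
the definition given above, restricted to the form
.code .%T
where
.code T
is a literal, non-empty string, with the match anchored at the start of the
input, and assuming input without newlines. All function names are
hypothetical. Python's own lazy quantifiers are deliberately not used, since
they are defined by backtracking priority rather than by set difference.
.verb
import re

def dot_nongreedy(t, s):
    # All anchored matches of .%T for a literal, non-empty T:
    # a head which does not contain T, followed by T itself.
    return [s[:i + len(t)] for i in range(len(s) + 1)
            if t not in s[:i] and s.startswith(t, i)]

def longest(matches):
    # Longest candidate, or None when nothing matched.
    return max(matches, key=len) if matches else None

def match_dot_abc(s):
    # Models .%abc : the trailing context is abc.
    return longest(dot_nongreedy("abc", s))

def match_dot_a_then_bc(s):
    # Models (.%a)bc : the trailing context is only a.
    return longest([m + "bc" for m in dot_nongreedy("a", s)
                    if s.startswith("bc", len(m))])

def match_class(s):
    # The complemented character class alternative [^a]*abc.
    m = re.match("[^a]*abc", s)
    return m.group(0) if m else None

for text in ["xxabc", "xaxabc"]:
    print(text, match_dot_abc(text),
          match_dot_a_then_bc(text), match_class(text))
.brev
On the input
.codn xaxabc ,
the model of
.code .%abc
matches the whole string, because the head
.code xax
does not contain
.codn abc ,
whereas the models of
.code (.%a)bc
and
.code [^a]*abc
fail to match, because the head may not extend past the first
.codn a .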