From 3c04bc8c55b8ce961b830126846da18ce4767d7b Mon Sep 17 00:00:00 2001 From: Kaz Kylheku Date: Thu, 6 Sep 2012 22:35:27 -0700 Subject: * txr.1: Documented string library. --- ChangeLog | 4 + txr.1 | 372 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 375 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index a549dba9..c9ed4ec2 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,7 @@ +2012-09-06 Kaz Kylheku + + * txr.1: Documented string library. + 2012-09-02 Kaz Kylheku * eval.c (eval_init): Follow function renames. diff --git a/txr.1 b/txr.1 index 8f73618c..6323f6a9 100644 --- a/txr.1 +++ b/txr.1 @@ -1073,7 +1073,7 @@ followed by a character name, the letter x followed by hex digits, the letter o followed by octal digits, or a single character. Valid character names are: nul, alarm, backspace, tab, linefeed, newline, vtab, page, return, esc, space. This convention for character literals is similar to that of the -Scheme language. +Scheme language. Note that #\elinefeed and #\enewline are the same character. .SS String Literals @@ -7471,56 +7471,421 @@ which is interposed between the catenated strings. .SS Function split-str +.TP +Syntax: + + (split-str ) + +.TP +Description: + +The split-str function breaks the into pieces, returning a list +thereof. The argument must be a string. It specifies the separator +character sequence within . All non-overlapping occurrences of + within are identified in left to right order, and are removed +from . The string is broken into pieces according to the gaps left +behind by the removed separators. + +Adjacent occurrences of within are considered to be separate +gaps which come between empty strings. + +This operation is nondestructive: is not modified in any way. + .SS Function split-str-set +.TP +Syntax: + + (split-str-set ) + +.TP +Description: + +The split-str-set function breaks the into pieces, returning a list +thereof. The argument must be a string. It specifies a set of +characters. 
All occurrences of any of these characters within are +identified, and are removed from . The string is broken into pieces +according to the gaps left behind by the removed separators. + +Adjacent occurrences of characters from within are considered to +be separate gaps which come between empty strings. + +This operation is nondestructive: is not modified in any way. + .SS Function list-str +.TP +Syntax: + + (list-str ) + +.TP +Description: + +The list-str function converts a string into a list of characters. + .SS Function trim-str +.TP +Syntax: + + (trim-str ) + +.TP +Description: + +The trim-str function produces a copy of from which leading and +trailing whitespace is removed. Whitespace consists of spaces, tabs, +carriage returns, linefeeds, vertical tabs and form feeds. + .SS Function string-lt +.TP +Syntax: + + (string-lt ) + +.TP +Description: + +The string-lt function returns t if is lexicographically prior +to . The behavior does not depend on any kind of locale. + +Note that this function forces (fully instantiates) any lazy string arguments, +even if doing so is not necessary. + .SS Function chrp +.TP +Syntax: + + (chrp ) + +.TP +Description: + +Returns t if is a character, otherwise nil. + .SS Function chr-isalnum +.TP +Syntax: + + (chr-isalnum ) + +.TP +Description: + +Returns t if is an alpha-numeric character, otherwise nil. Alpha-numeric +means one of the upper or lower case letters of the English alphabet found in +ASCII, or an ASCII digit. This function is not affected by locale. + .SS Function chr-isalpha +.TP +Syntax: + + (chr-isalpha ) + +.TP +Description: + +Returns t if is an alphabetic character, otherwise nil. Alphabetic +means one of the upper or lower case letters of the English alphabet found in +ASCII. This function is not affected by locale. + .SS Function chr-isascii +.TP +Syntax: + + (chr-isascii ) + +.TP +Description: + +This function returns t if the code of character is in the range +0 to 127, inclusive. 
For characters outside of this range, it returns nil. + .SS Function chr-iscntrl +.TP +Syntax: + + (chr-iscntrl ) + +.TP +Description: + +This function returns t if the character is a character whose code +ranges from 0 to 31, or is 127. In other words, any non-printable ASCII +character. For other characters, it returns nil. + .SS Function chr-isdigit +.TP +Syntax: + + (chr-isdigit ) + +.TP +Description: + +This function returns t if the character is an ASCII digit. +Otherwise, it returns nil. + .SS Function chr-isgraph +.TP +Syntax: + + (chr-isgraph ) + +.TP +Description: + +This function returns t if is a non-space printable ASCII character. +It returns nil if it is a space or control character. + +It also returns nil for non-ASCII characters: Unicode characters with a code +above 127. + .SS Function chr-islower +.TP +Syntax: + + (chr-islower ) + +.TP +Description: + +This function returns t if is an ASCII lower case letter. Otherwise it returns nil. + .SS Function chr-isprint +.TP +Syntax: + + (chr-isprint ) + +.TP +Description: + +This function returns t if is an ASCII character which is not a +control character. It returns nil for all non-ASCII characters: Unicode +characters with a code above 127. + .SS Function chr-ispunct +.TP +Syntax: + + (chr-ispunct ) + +.TP +Description: + +This function returns t if is an ASCII character which is not a +control character, space, digit or letter. It returns nil for all non-ASCII characters: Unicode +characters with a code above 127. + .SS Function chr-isspace +.TP +Syntax: + + (chr-isspace ) + +.TP +Description: + +This function returns t if is an ASCII whitespace character: any of the +characters in the set #\espace, #\etab, #\elinefeed, #\enewline, #\ereturn, +#\evtab, and #\epage. For all other characters, it returns nil. + .SS Function chr-isupper +.TP +Syntax: + + (chr-isupper ) + +.TP +Description: + +This function returns t if is an ASCII upper case letter. Otherwise it returns nil. 
+ .SS Function chr-isxdigit +.TP +Syntax: + + (chr-isxdigit ) + +.TP +Description: + +This function returns t if is a hexadecimal digit: one of the ASCII +letters A through F, or their lower-case equivalents, or an ASCII digit 0 +through 9. + .SS Function chr-toupper +.TP +Syntax: + + (chr-toupper ) + +.TP +Description: + +If character is a lower case ASCII letter, this function +returns the upper case equivalent character. If it is some other +character, then it just returns . + .SS Function chr-tolower +.TP +Syntax: + + (chr-tolower ) + +.TP +Description: + +If character is an upper case ASCII letter, this function +returns the lower case equivalent character. If it is some other +character, then it just returns . + .SS Functions num-chr and chr-num +.TP +Syntax: + + (num-chr ) + (chr-num ) + +.TP +Description: + +The argument must be a character. The num-chr function returns that +character's Unicode code point value as an integer. + +The argument must be a fixnum integer in the range 0 to #\e10FFFF. +The chr-num function takes the argument to be a Unicode code point value and returns the +corresponding character object. + .SS Function chr-str +.TP +Syntax: + + (chr-str ) + +.TP +Description: + +The chr-str function performs random access on string to retrieve +the character whose position is given by integer , which must +be within range of the string. + +The index value 0 corresponds to the first (leftmost) character of the string +and so non-negative values up to one less than the length are possible. + +Negative index values are also allowed, such that -1 corresponds to the +last (rightmost) character of the string, and so negative values down to +the additive inverse of the string length are possible. + +An empty string cannot be indexed. A string of length one supports index 0 and +index -1. A string of length two is indexed left to right by the values 0 and +1, and from right to left by -1 and -2. 
+ +.TP +Notes: + +Direct use of chr-str is equivalent to the DWIM bracket notation except +that must be a string. The following relation holds: + + (chr-str s i) --> [s i] + +Since [s i] <--> (ref s i), this also holds: + + (chr-str s i) --> (ref s i) + .SS Function chr-str-set +.TP +Syntax: + + (chr-str-set ) + +.TP +Description: + +The chr-str-set function performs random access on string to overwrite +the character whose position is given by integer , which must +be within range of the string. The character at is overwritten +with character . + +The argument works exactly as in chr-str. + +The argument must be a modifiable string. + +.TP +Notes: + +Direct use of chr-str-set is equivalent to the DWIM bracket notation except +that must be a string. The following relation holds: + + (chr-str-set s i c) --> (set [s i] c) + +Since (set [s i] c) <--> (refset s i c), this also holds: + + (chr-str-set s i c) --> (refset s i c) + .SS Function span-str +.TP +Syntax: + + (span-str ) + +.TP +Description: + +The span-str function determines the longest prefix of string which +consists only of the characters in string , in any combination. + .SS Function compl-span-str +.TP +Syntax: + + (compl-span-str ) + +.TP +Description: + +The compl-span-str function determines the longest prefix of string which +consists only of the characters which do not appear in , in any +combination. + .SS Function break-str +.TP +Syntax: + + (break-str ) + +.TP +Description: + +The break-str function returns an integer which represents the position of the +first character in string which appears in string . + +If there is no such character, then nil is returned. + .SH VECTORS .SS Function vector @@ -7639,6 +8004,11 @@ bracket syntax: (refset seq idx new) <--> (set [seq idx] new) +The difference is that ref and refset are first class functions which +can be used in functional programming as higher order functions, whereas the +bracket notation is syntactic sugar, and set is an operator, not a function. 
+Therefore the brackets cannot replace all uses of ref and refset. + .SS Function sort .TP -- cgit v1.2.3