author | Kaz Kylheku <kaz@kylheku.com> | 2014-06-15 21:48:04 -0700 |
---|---|---|
committer | Kaz Kylheku <kaz@kylheku.com> | 2014-06-15 21:48:04 -0700 |
commit | 4a223e77f8bf67c9236232bce354d60951b25bed (patch) | |
tree | 9e9a1bb72884b56dfaa240763f979ca630cea23e /txr.1 | |
parent | 548dd7697516a2fea8930d3fa9e88ea48d5ab630 (diff) | |
* lib.c (obj_print): Render character DC00 as "pnul".
Clean up code which chooses rendering for characters.
Print C0 and C1 control characters, as well as D800-DFFF,
FFFE and FFFF and characters above FFFF using hex;
others are printed using the #\<char> notation.
* parser.y (char_from_name): map "pnul" to DC00.
* txr.1: Documented pnul, clarified character
printing rules, and added a cautionary note about
possible ambiguity in printing.
Diffstat (limited to 'txr.1')
-rw-r--r-- | txr.1 | 24 |
1 files changed, 22 insertions, 2 deletions
@@ -1123,8 +1123,12 @@ Character literals are introduced by the #\e syntax, which is either
 followed by a character name, the letter x followed by hex digits,
 the letter o followed by octal digits, or a single character. Valid character
 names are: nul, alarm, backspace, tab, linefeed, newline, vtab, page, return,
-esc, space. This convention for character literals is similar to that of the
-Scheme language. Note that #\elinefeed and #\enewline are the same character.
+esc, space and pnul. This convention for character literals is similar to that
+of the Scheme language. Note that #\elinefeed and #\enewline are the same
+character. The #\epnul character is specific to TXR and denotes the U+DC00
+code in Unicode; the name stands for "pseudo-null", which is related to
+its special function. For more information about this, see the section
+"Character Handling and International Characters".
 
 .SS String Literals
 
@@ -12136,6 +12140,22 @@ and calls to these functions:
 For pprint, tostringp and pprinl, the equivalence is produced by using "~a" in
 format rather than "~s".
 
+Note: for characters, the print function behaves as follows: most control
+characters in the Unicode C0 and C1 range are rendered using the #\ex notation,
+using two hex digits. Codes in the range D800 to DFFF, and the codes
+FFFE and FFFF are printed in the #\exNNNN with four hexadecimal digits, and
+charater above this range are printed using the same notation, but with six
+hexadecimal digits. Certain characters in the C0 range are printed using
+their names such as #\enul and #\ereturn, which are documented
+in the Character Literals section not far from the start of this document.
+The DC00 character is printed as #\epnul. All other characters are printed as
+#\e<char>, where <char> is the actual character.
+
+Caution: read-print consistency is affected by trailing material. If additional
+digits are printed immediately after a number without intervening whitespace,
+they extend that number. If hex digits are printed after the character x,
+which is rendered as #\ex, they look like a hex character code.
+
 .SS Function streamp
 
 .TP
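The printing rules documented in the second hunk can be sketched in C roughly as follows. This is an illustration written from the manual text, not the code from lib.c: the function name print_char_repr, the name table, and the buffer interface are hypothetical. The choice of "newline" over "linefeed", the printing of space by name, and the use of uppercase hex digits are assumptions that the commit text does not settle. In the manual source, \e is the troff spelling of a backslash, so #\ex there corresponds to #\x here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not TXR's obj_print): writes the printed
   representation of code point ch into buf, following the rules
   described in the txr.1 note. A 16-byte buffer is large enough
   for every form produced here. */
static void print_char_repr(unsigned long ch, char *buf, size_t size)
{
    /* Named characters. Printing "newline" rather than "linefeed",
       and printing space by name, are assumptions. */
    static const struct { unsigned long code; const char *name; } names[] = {
        { 0x00, "nul" }, { 0x07, "alarm" }, { 0x08, "backspace" },
        { 0x09, "tab" }, { 0x0A, "newline" }, { 0x0B, "vtab" },
        { 0x0C, "page" }, { 0x0D, "return" }, { 0x1B, "esc" },
        { 0x20, "space" }, { 0xDC00, "pnul" },
    };
    size_t i;

    assert(buf != NULL && size > 0);

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (names[i].code == ch) {
            snprintf(buf, size, "#\\%s", names[i].name);
            return;
        }
    }

    if (ch < 0x20 || (ch >= 0x80 && ch <= 0x9F)) {
        /* Remaining C0 and C1 controls: two hex digits. */
        snprintf(buf, size, "#\\x%02lX", ch);
    } else if ((ch >= 0xD800 && ch <= 0xDFFF) || ch == 0xFFFE || ch == 0xFFFF) {
        /* Surrogate range and the two noncharacters: four hex digits. */
        snprintf(buf, size, "#\\x%04lX", ch);
    } else if (ch > 0xFFFF) {
        /* Everything above FFFF: six hex digits. */
        snprintf(buf, size, "#\\x%06lX", ch);
    } else {
        /* All other characters print as themselves, UTF-8 encoded.
           ch is below 0x10000 here, so at most three bytes are needed. */
        char utf8[4];
        if (ch < 0x80) {
            utf8[0] = (char) ch;
            utf8[1] = 0;
        } else if (ch < 0x800) {
            utf8[0] = (char) (0xC0 | (ch >> 6));
            utf8[1] = (char) (0x80 | (ch & 0x3F));
            utf8[2] = 0;
        } else {
            utf8[0] = (char) (0xE0 | (ch >> 12));
            utf8[1] = (char) (0x80 | ((ch >> 6) & 0x3F));
            utf8[2] = (char) (0x80 | (ch & 0x3F));
            utf8[3] = 0;
        }
        snprintf(buf, size, "#\\%s", utf8);
    }
}
```

Under this sketch the letter x prints as #\x. If the digits 41 are written immediately after it with no whitespace, the combined text #\x41 is indistinguishable from the hex notation for U+0041 (the letter A), which is the ambiguity the cautionary note describes.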