path: root/txr.1
author Kaz Kylheku <kaz@kylheku.com> 2014-06-15 21:48:04 -0700
committer Kaz Kylheku <kaz@kylheku.com> 2014-06-15 21:48:04 -0700
commit 4a223e77f8bf67c9236232bce354d60951b25bed
tree 9e9a1bb72884b56dfaa240763f979ca630cea23e /txr.1
parent 548dd7697516a2fea8930d3fa9e88ea48d5ab630
* lib.c (obj_print): Render character DC00 as "pnul". Clean up code which chooses rendering for characters. Print C0 and C1 control characters, as well as D800-DFFF, FFFE and FFFF and characters above FFFF using hex; others are printed using the #\<char> notation.
* parser.y (char_from_name): Map "pnul" to DC00.
* txr.1: Documented pnul, clarified character printing rules, and added a cautionary note about possible ambiguity in printing.
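The printing rules described in this commit message can be illustrated with a small standalone C sketch. This is not the code from lib.c: the names char_name and render_char are hypothetical, the output goes to a caller-supplied byte buffer rather than a TXR stream, and the use of upper-case hex digits, UTF-8 output for literal characters, and printing the space character by name are assumptions.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: name used for a code point, or NULL if it has none.
   The set of names follows the txr.1 list; treating space as a printed
   name is an assumption of this sketch. */
static const char *char_name(unsigned long ch)
{
  switch (ch) {
  case 0x00: return "nul";
  case 0x07: return "alarm";
  case 0x08: return "backspace";
  case 0x09: return "tab";
  case 0x0A: return "newline";
  case 0x0B: return "vtab";
  case 0x0C: return "page";
  case 0x0D: return "return";
  case 0x1B: return "esc";
  case 0x20: return "space";
  case 0xDC00: return "pnul";   /* pseudo-null, specific to TXR */
  default: return NULL;
  }
}

/* Hypothetical sketch: writes the printed notation of ch into buf
   (at most size bytes, including the terminator) and returns buf. */
char *render_char(unsigned long ch, char *buf, size_t size)
{
  const char *name = char_name(ch);

  if (name != NULL) {
    snprintf(buf, size, "#\\%s", name);
  } else if (ch < 0x20 || (ch >= 0x80 && ch < 0xA0)) {
    /* remaining C0 and C1 control characters: two hex digits */
    snprintf(buf, size, "#\\x%02lX", ch);
  } else if ((ch >= 0xD800 && ch < 0xE000) || ch == 0xFFFE || ch == 0xFFFF) {
    /* surrogate range and the two noncharacters: four hex digits */
    snprintf(buf, size, "#\\x%04lX", ch);
  } else if (ch > 0xFFFF) {
    /* characters above FFFF: six hex digits */
    snprintf(buf, size, "#\\x%06lX", ch);
  } else if (ch < 0x80) {
    /* everything else prints as itself; one UTF-8 byte */
    snprintf(buf, size, "#\\%c", (int) ch);
  } else if (ch < 0x800) {
    /* two UTF-8 bytes */
    snprintf(buf, size, "#\\%c%c",
             (int) (0xC0 | (ch >> 6)),
             (int) (0x80 | (ch & 0x3F)));
  } else {
    /* three UTF-8 bytes */
    snprintf(buf, size, "#\\%c%c%c",
             (int) (0xE0 | (ch >> 12)),
             (int) (0x80 | ((ch >> 6) & 0x3F)),
             (int) (0x80 | (ch & 0x3F)));
  }

  return buf;
}
```

Under these assumptions, render_char('x', buf, sizeof buf) produces #\x. If the digits 41 are written directly after that output, the combined text #\x41 reads back as a hex character code, which is the ambiguity the txr.1 change cautions about.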
Diffstat (limited to 'txr.1')
-rw-r--r-- txr.1 24
1 files changed, 22 insertions, 2 deletions
diff --git a/txr.1 b/txr.1
index d52c2a84..b0accf49 100644
--- a/txr.1
+++ b/txr.1
@@ -1123,8 +1123,12 @@ Character literals are introduced by the #\e syntax, which is either
followed by a character name, the letter x followed by hex digits,
the letter o followed by octal digits, or a single character. Valid character
names are: nul, alarm, backspace, tab, linefeed, newline, vtab, page, return,
-esc, space. This convention for character literals is similar to that of the
-Scheme language. Note that #\elinefeed and #\enewline are the same character.
+esc, space and pnul. This convention for character literals is similar to that
+of the Scheme language. Note that #\elinefeed and #\enewline are the same
+character. The #\epnul character is specific to TXR and denotes the U+DC00
+code in Unicode; the name stands for "pseudo-null", which is related to
+its special function. For more information about this, see the section
+"Character Handling and International Characters".
.SS String Literals
@@ -12136,6 +12140,22 @@ and calls to these functions:
For pprint, tostringp and pprinl, the equivalence is produced by using "~a" in
format rather than "~s".
+Note: for characters, the print function behaves as follows: most control
+characters in the Unicode C0 and C1 range are rendered using the #\ex notation,
+using two hex digits. Codes in the range D800 to DFFF, and the codes
+FFFE and FFFF are printed in the #\exNNNN notation with four hexadecimal digits, and
+characters above this range are printed using the same notation, but with six
+hexadecimal digits. Certain characters in the C0 range are printed using
+their names such as #\enul and #\ereturn, which are documented
+in the Character Literals section not far from the start of this document.
+The DC00 character is printed as #\epnul. All other characters are printed as
+#\e<char>, where <char> is the actual character.
+
+Caution: read-print consistency is affected by trailing material. If additional
+digits are printed immediately after a number without intervening whitespace,
+they extend that number. If hex digits are printed after the character x,
+which is rendered as #\ex, they look like a hex character code.
+
.SS Function streamp
.TP