author: Kaz Kylheku <kaz@kylheku.com> 2021-05-28 06:52:26 -0700
committer: Kaz Kylheku <kaz@kylheku.com> 2021-05-28 06:52:26 -0700
commit: 8bc5fc7a77eb1a6707f3c742235ab38ca210f55e
tree: bd5f91229ca61ff30dad2868b64c836a421468f1 /txr.1
parent: c0c5e8836a89c8439a675bbd52d6fed134792477
json: handling for bad UTF-8 bytes, NUL and \u0000.
* parser.l <JLIT>: Convert \u0000 sequence to U+DC00 code point, the pseudo-null. Also include JLIT in the rule for catching bad bytes that are not matched by {UANYN}.
* txr.1: Document this treatment as extensions to JSON.
* lex.yy.c.shipped: Updated.
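The byte treatment this commit documents can be illustrated outside of TXR. The following is a minimal Python sketch, not TXR's implementation: the function name decode_txr_style is hypothetical, and Python's "surrogateescape" error handler is used only because it happens to map each undecodable byte to U+DC00 plus the byte value, which approximates the documented mapping. Edge cases may differ from TXR's decoder.

```python
def decode_txr_style(data: bytes) -> str:
    """Approximate the documented mapping (hypothetical helper, not TXR code).

    Each byte that is not part of valid UTF-8 becomes the code point
    U+DC00 + byte value, and decoding resumes after it. A literal NUL
    becomes U+DC00, the pseudo-null.
    """
    # surrogateescape maps each undecodable byte 0x80..0xFF to U+DC80..U+DCFF.
    text = data.decode("utf-8", errors="surrogateescape")
    # NUL is valid UTF-8, so it has to be remapped separately.
    return text.replace("\x00", "\udc00")


if __name__ == "__main__":
    # Print code points rather than the characters themselves, since
    # lone surrogates cannot be written to most output streams.
    for sample in (b"a\xffb", b"a\x00b"):
        print([hex(ord(ch)) for ch in decode_txr_style(sample)])
```

Bytes below 0x80 are always valid UTF-8 on their own, so in this sketch only NUL and bytes in the range 0x80 to 0xFF are ever remapped.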
Diffstat (limited to 'txr.1')
-rw-r--r-- txr.1 16
1 file changed, 15 insertions, 1 deletion
diff --git a/txr.1 b/txr.1
index 7293f30f..2b26b832 100644
--- a/txr.1
+++ b/txr.1
@@ -12348,7 +12348,7 @@ Introduces a JSON literal.
Introduces a JSON quasiliteral, allowing unquoting and splicing of Lisp expressions.
The implementation of JSON syntax is based on, and intended to conform with
-the IETF RFC8259 document. Only \*(TX's extensions to JSON syntax are described
+the IETF RFC 8259 document. Only \*(TX's extensions to JSON syntax are described
in this manual, as well as the correspondence between JSON syntax and Lisp.
The
@@ -12407,6 +12407,20 @@ symbol is bound as a macro, which is expanded when a
.code #J
expression is evaluated.
+The following extensions to JSON are supported.
+
+When an invalid UTF-8 byte is encountered inside a JSON string, its value is
+mapped into the code point range U+DC01 to U+DCFF. That byte is consumed, and
+decoding continues with the next byte. This treatment is consistent with the
+treatment of invalid UTF-8 bytes in \*(TL literals and I/O streams. If the
+valid UTF-8 byte U+0000 (ASCII NUL) occurs in a JSON string, it is also mapped
+to U+DC00, \*(TX's pseudo-null character. This treatment is consistent with
+\*(TX string literals and I/O streams.
+
+The JSON escape sequence
+.code "\eu0000"
+denoting the U+0000 NUL character is also converted to U+DC00.
+
\*(TL does not impose the restriction that the keys in a JSON object
must be strings:
.code "#J{1:2,true:false}"