author | Kaz Kylheku <kaz@kylheku.com> | 2021-05-28 06:52:26 -0700 |
---|---|---|
committer | Kaz Kylheku <kaz@kylheku.com> | 2021-05-28 06:52:26 -0700 |
commit | 8bc5fc7a77eb1a6707f3c742235ab38ca210f55e (patch) | |
tree | bd5f91229ca61ff30dad2868b64c836a421468f1 /txr.1 | |
parent | c0c5e8836a89c8439a675bbd52d6fed134792477 (diff) | |
json: handling for bad UTF-8 bytes, NUL and \u0000.
* parser.l <JLIT>: Convert \u0000 sequence to U+DC00
code point, the pseudo-null. Also include JLIT
in the rule for catching bad bytes that are not
matched by {UANYN}.
* txr.1: Document this treatment as extensions to JSON.
* lex.yy.c.shipped: Updated.
Diffstat (limited to 'txr.1')
-rw-r--r-- | txr.1 | 16 |
1 files changed, 15 insertions, 1 deletions
@@ -12348,7 +12348,7 @@ Introduces a JSON literal.
 Introduces a JSON quasiliteral, allowing unquoting and splicing of
 Lisp expressions.
 The implementation of JSON syntax is based on, and intended to conform with
-the IETF RFC8259 document. Only \*(TX's extensions to JSON syntax are described
+the IETF RFC 8259 document. Only \*(TX's extensions to JSON syntax are described
 in this manual, as well as the correspondence between JSON syntax and Lisp.
 
 The
@@ -12407,6 +12407,20 @@
 symbol is bound as a macro, which is expanded when a
 .code #J
 expression is evaluated.
+The following extensions to JSON are supported.
+
+When an invalid UTF-8 byte is encountered inside a JSON string, its value is
+mapped into the code point range U+DC01 to U+DCFF. That byte is consumed, and
+decoding continues with the next byte. This treatment is consistent with the
+treatment of invalid UTF-8 bytes in \*(TL literals and I/O streams. If the
+valid UTF-8 byte U+0000 (ASCII NUL) occurs in a JSON string, it is also mapped
+to U+DC00, \*(TX's pseudo-null character. This treatment is consistent with
+\*(TX string literals and I/O streams.
+
+The JSON escape sequence
+.code "\eu0000"
+denoting the U+0000 NUL character is also converted to U+DC00.
+
 \*(TL does not impose the restriction that the keys in a JSON object must be
 strings:
 .code "#J{1:2,true:false}"
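
The mapping documented in the added manual text can be approximated outside of TXR. The following is a minimal Python sketch, not TXR's implementation: the function names `decode_bytes_sketch` and `json_string_sketch` are hypothetical, and Python's `surrogateescape` error handler is used as a stand-in because it also maps an undecodable byte 0xNN to the code point U+DC00 + 0xNN. How the two decoders treat every malformed multi-byte sequence is not compared here.

```python
import json

# Code point that the manual text calls the pseudo-null.
PSEUDO_NULL = "\udc00"


def decode_bytes_sketch(raw: bytes) -> str:
    """Decode UTF-8, keeping undecodable bytes as U+DCxx code points."""
    # surrogateescape maps each undecodable byte 0xNN to U+DC00 + 0xNN
    # and then resumes decoding with the following byte.
    text = raw.decode("utf-8", errors="surrogateescape")
    # A literal NUL byte decodes to U+0000; remap it to the pseudo-null.
    return text.replace("\x00", PSEUDO_NULL)


def json_string_sketch(literal: str) -> str:
    """Parse a JSON string literal, remapping an escaped NUL to U+DC00."""
    # json.loads turns the \u0000 escape into U+0000, which is then remapped.
    return json.loads(literal).replace("\x00", PSEUDO_NULL)


if __name__ == "__main__":
    # ascii() is used so the lone surrogates are shown as escapes.
    print(ascii(decode_bytes_sketch(b"a\xffb")))
    print(ascii(decode_bytes_sketch(b"a\x00b")))
    print(ascii(json_string_sketch('"a\\u0000b"')))
```

Since every byte value gets a distinct code point in the U+DC00 block, the original byte can be recovered from the decoded string, which is presumably why the same convention is shared with string literals and I/O streams.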