author: Kaz Kylheku <kaz@kylheku.com> 2021-06-03 06:29:10 -0700
committer: Kaz Kylheku <kaz@kylheku.com> 2021-06-03 06:29:10 -0700
commit: 4edb6eac41259a43a547772c4249ba2d6cc68106
tree: 5ef739853974cf1f148f7733d0d734f0ced27d3e /txr.1
parent: 0f66ac2b5412eb432f165b680fb495f972f33917
json: improve escaping for script tags.
* lib.c (out_json_str): Strengthen the test for escaping the forward slash. It has to occur in the sequence </script rather than just </. Recognize <!-- and --> in the string, and encode them.
* tests/010/json.tl: Cover this area with some tests.
* txr.1: Documented.
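
The three escaping rules described in this commit can be illustrated with a minimal sketch. This is not the code of out_json_str in lib.c; the function name escape_script_sequences is hypothetical, and the sketch assumes that the match on "script" is case-sensitive and that only these three rules are applied (quoting, control characters and surrogate handling are left out).

```python
import json


def escape_script_sequences(s: str) -> str:
    """Hypothetical helper: apply only the </script, <!-- and --> rules.

    Each decision looks at the original string around position i, so
    overlapping sequences such as "<!-->" have both rules applied.
    """
    out = []
    for i, ch in enumerate(s):
        prev = s[i - 1] if i > 0 else ""
        if ch == "/" and prev == "<" and s.startswith("script", i + 1):
            # "</script" becomes "<\/script"; other slashes stay as-is.
            out.append("\\/")
        elif ch == "!" and prev == "<" and s.startswith("--", i + 1):
            # "<!--" becomes "<\u0021--"; other exclamation marks stay as-is.
            out.append("\\u0021")
        elif ch == "-" and prev == "-" and s.startswith(">", i + 1):
            # "-->" becomes "-\u002D>"; other hyphens stay as-is.
            out.append("\\u002D")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for text in ["</script>", "a/b </p>", "<!-- x -->"]:
        escaped = escape_script_sequences(text)
        # A standard JSON parser decodes the escapes back to the original.
        assert json.loads('"' + escaped + '"') == text
        print(escaped)
```

Because \/ and \uXXXX are ordinary JSON string escapes, any conforming parser recovers the original string, while the serialized text no longer contains the sequences that would interfere with an enclosing HTML script element.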
Diffstat (limited to 'txr.1')
-rw-r--r-- txr.1 | 45
1 files changed, 34 insertions, 11 deletions
diff --git a/txr.1 b/txr.1
index 5aaf35e2..d3e3a769 100644
--- a/txr.1
+++ b/txr.1
@@ -72200,27 +72200,50 @@ and
.codn \er .
.IP 3.
If the character sequence
-.code "</"
+.code "</script"
occurs in a string, then in the JSON representation the slash is escaped, such
that the sequence is rendered as
-.codn "<\e/" .
-Instances of the
+.codn "<\e/script" .
+Instances of
.code /
-(forward slash, solidus) occurs not preceded by
-.code <
-(less than) are unescaped. Rationale: this is a feature of JSON which allows
-for safer embedding of the resulting JSON into HTML
+(forward slash, solidus) in other situations are unescaped. Rationale: this is
+a feature of JSON which allows for safer embedding of the resulting
+JSON into HTML
.code script
tags.
-.IP 4.
+.IP 4.
+If the character sequence
+.code <!--
+occurs in a string, then in the JSON representation, the sequence is
+rendered as
+.codn <\eu0021-- .
+Instances of
+.code !
+(exclamation mark) in other situations are not encoded. Rationale: safe
+embedding in HTML
+.code script
+tags.
+.IP 5.
+If the character sequence
+.code -->
+occurs in a string, then in the JSON representation, the sequence is
+rendered as
+.codn -\eu002D> .
+Instances of
+.code -
+(hyphen) in other situations are not encoded. Rationale: safe
+embedding in HTML
+.code script
+tags.
+.IP 6.
The code point U+DC00 (\*(TX's pseudo-null character) is translated into the
.code "\eu0000"
escape syntax.
-.IP 5.
+.IP 7.
The code points U+DC01 through U+DCFF are send to the stream as-is.
If the stream performs UTF-8 encoding, these characters turn into individual
bytes in the range 0 to 255.
-.IP 6.
+.IP 8.
Control characters in the U+0001 to U+001F other than the ones subject
to rule 1 above are rendered as
.code \eu
@@ -72229,7 +72252,7 @@ the range U+D800 to U+DBFF, U+DD00 to U+DFFF, and the code points
U+FFFE and U+FFFF are also encoded as
.code \eu
escape sequences.
-.IP 7.
+.IP 9.
A character outside of the BMP (Basic Multilingual Plane) in the range
U+10000 to U+10FFFF is encoded using as a pair of consecutive
.code \eu