path: root/txr.1
author: Kaz Kylheku <kaz@kylheku.com>, 2012-03-13 12:57:21 -0700
committer: Kaz Kylheku <kaz@kylheku.com>, 2012-03-13 12:57:21 -0700
commit: 5961f0de80abce4645ec2f022b2346e24b6479ed
tree: 2ae5ac4b66170ddf43bbc968f1160cd9aa9f2be9 /txr.1
parent: aa62864118b755f10b30fd58a3b7fd8407ce8c6c
Implementing URL filtering.

* eval.c (eval_init): New intrinsic functions: url-encode, url-decode.
* filter.c (tourl_k, fromurl_k): New keyword variables.
(is_url_reserved, digit_value): New static functions.
(url_encode, url_decode): New functions.
(filter_init): Initialize new keyword variables and register new
:tourl and :fromurl filters.
* filter.h (tourl_k, fromurl_k, url_encode, url_decode): Declared.
* txr.1: Updated.
* txr.vim: Likewise.

Diffstat (limited to 'txr.1')
-rw-r--r-- txr.1 | 21
1 files changed, 21 insertions, 0 deletions
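
For readers who want to check the documented behavior by hand, here is a minimal Python sketch of the rules that the txr.1 hunk in this commit states for :tourl and :fromurl. It is not the C code from filter.c; it follows only the manual text. The names to_url, from_url, RESERVED and HEX_DIGITS are hypothetical. Two behaviors are assumptions, because the manual text does not specify them: a % that is not followed by two hexadecimal digits is passed through unchanged, and invalid UTF-8 is decoded with replacement characters.

```python
# Sketch of the :tourl / :fromurl rules as described in the man page text.
# All names here are illustrative, not TXR's.

# RFC 3986 "reserved" set: gen-delims plus sub-delims.
RESERVED = frozenset(b":/?#[]@!$&'()*+,;=")
HEX_DIGITS = "0123456789abcdefABCDEF"


def to_url(text: str) -> str:
    """Percent-encode text following the documented :tourl rules."""
    out = []
    for byte in text.encode("utf-8"):
        # Encode bytes 0..32 and 127..255, reserved characters, and '%'.
        if byte <= 32 or byte >= 127 or byte in RESERVED or byte == 0x25:
            out.append(f"%{byte:02X}")  # most significant nybble first, upper case
        else:
            out.append(chr(byte))       # all other bytes pass through as characters
    return "".join(out)


def from_url(text: str) -> str:
    """Decode percent-encoded text following the documented :fromurl rules."""
    out = []
    pending = bytearray()  # run of decoded bytes awaiting UTF-8 decoding

    def flush():
        if pending:
            # Assumption: malformed UTF-8 becomes U+FFFD rather than an error.
            out.append(pending.decode("utf-8", errors="replace"))
            pending.clear()

    i = 0
    while i < len(text):
        if (text[i] == "%" and i + 2 < len(text)
                and text[i + 1] in HEX_DIGITS and text[i + 2] in HEX_DIGITS):
            pending.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            # Assumption: anything else, including a stray '%', is copied as-is.
            flush()
            out.append(text[i])
            i += 1
    flush()
    return "".join(out)


if __name__ == "__main__":
    encoded = to_url("a b/c?d=é")
    print(encoded)
    print(from_url(encoded))
```

Under these rules, characters such as - . _ ~ are left alone, while a space, the RFC 3986 reserved characters, the % sign itself, and every byte of a multi-byte UTF-8 sequence are written as %XX triplets.
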
diff --git a/txr.1 b/txr.1
index fbc61d4d..e5e63478 100644
--- a/txr.1
+++ b/txr.1
@@ -3638,6 +3638,25 @@ Convert the 26 lower case letters of the English alphabet to upper case.
.IP :downcase
Convert the 26 upper case letters of the English alphabet to lower case.
+.IP :fromurl
+Decode URL-encoded (a.k.a. percent-encoded) text. Character triplets consisting
+of the % character followed by a pair of hexadecimal digits (case insensitive)
+are converted to bytes having the value represented by the hexadecimal
+digits (most significant nybble first). Sequences of one or more such bytes are
+treated as UTF-8 data and decoded to characters.
+
+.IP :tourl
+Convert to URL encoding according to RFC 3986. The text is first converted
+to UTF-8 bytes. The bytes are then converted back to text as follows.
+Bytes in the range 0 to 32, and 127 to 255 (note: including the ASCII DEL),
+bytes whose values correspond to ASCII characters which are listed by RFC 3986
+as being in the "reserved set", and the byte value corresponding to the
+ASCII % character are encoded as a three-character sequence consisting
+of the % character followed by two hexadecimal digits derived from the
+byte value (most significant nybble first, upper case). All other bytes
+are converted directly to characters of the same value without any such
+encoding.
+
Example: to escape HTML characters in all variable substitutions occurring in an
output clause, specify :filter :to_html in the directive:
@@ -6754,6 +6773,8 @@ Certain object types have a custom equal function.
.SS Function match-fun
+.SS Functions url-encode and url-decode
+
.SH APPENDIX A: NOTES ON EXOTIC REGULAR EXPRESSIONS
Users familiar with regular expressions may not be familiar with the complement