author | Kaz Kylheku <kaz@kylheku.com> | 2012-03-30 16:36:57 -0700 |
---|---|---|
committer | Kaz Kylheku <kaz@kylheku.com> | 2012-03-30 16:36:57 -0700 |
commit | cdc4a396d2a0d59798c87d4d8afcabb122e8047a (patch) | |
tree | 949034e5090a599d389cf7b5ff71cbdbdbd05fa0 | |
parent | 0e42089c32336d536e886072966d795df14b6700 (diff) | |
* txr.1: Documenting the debugger with an example session.
-rw-r--r-- | ChangeLog | 4 |
-rw-r--r-- | txr.1 | 192 |
2 files changed, 196 insertions, 0 deletions
@@ -1,5 +1,9 @@
 2012-03-30 Kaz Kylheku <kaz@kylheku.com>

+	* txr.1: Documenting the debugger with an example session.
+
+2012-03-30 Kaz Kylheku <kaz@kylheku.com>
+
 	Version 63

 	* txr.c (version): Bumped.
@@ -99,6 +99,9 @@ standard error device (but if the situations occur, they still fail the
 query). This option does not suppress error generation during the parsing of
 the query, only during its execution.

+.IP -d
+Invoke the interactive txr debugger. See the DEBUGGER section.
+
 .IP -v
 Verbose operation. Detailed logging is enabled.

@@ -7401,6 +7404,195 @@ followed by a period, e or E.

 .SS Functions url-encode and url-decode

+.SH DEBUGGER
+
+.B TXR
+has a simple, crude, built-in debugger. The debugger is invoked by adding
+the -d command line option to an invocation of txr.
+In this debugger it is possible to step through code, set breakpoints,
+and examine the variable binding environment.
+
+Prior to executing any code, the debugger waits at the txr> prompt,
+allowing for the opportunity to set breakpoints.
+
+Help can be obtained with the h or ? command.
+
+Whenever the program stops at the debugger, it prints the Lisp-ified
+piece of syntax tree that is about to be interpreted.
+It also shows the context of the input being matched.
+
+The s command can be used to step into a form; n to step over.
+Sometimes the behavior seems counter-intuitive. For instance, stepping
+over a @(next) directive actually means skipping everything which follows
+it. This is because the query material after a @(next) forms the child
+nodes of the next directive's node in the abstract syntax tree.
+
+.SS Sample Session
+
+Here is an example of the debugger being applied to a web scraping program
+which connects to a US Navy clock server to retrieve a dynamically-generated
+web page, from which the current time is extracted, in various time zones.
+The handling of the web request is done by the wget command; the txr
+query opens a wget command and scans the body of the HTTP response containing
+HTML. This is the code, saved in a file called navytime.txr:
+
+  @(next `!wget -c http://tycho.usno.navy.mil/cgi-bin/timer.pl -O - 2> /dev/null`)
+  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final"//EN>
+  <html>
+  <body>
+  <TITLE>What time is it?</TITLE>
+  <H2> US Naval Observatory Master Clock Time</H2> <H3><PRE>
+  @(collect :vars (MO DD HH MM SS (PM " ") TZ TZNAME))
+  <BR>@MO. @DD, @HH:@MM:@SS @(maybe)@{PM /PM/} @(end)@TZ@/\t+/@TZNAME
+  @ (until)
+  </PRE>@/.*/
+  @(end)
+  </PRE></H3><P><A HREF="http://www.usno.navy.mil"> US Naval Observatory</A>
+
+  </body></html>
+  @(output)
+  @ (repeat)
+  @MO-@DD @HH:@MM:@SS @PM @TZ
+  @ (end)
+  @(end)
+
+This is the debug session:
+
+  $ txr -d navytime.txr
+  stopped at line 1 of navytime.txr
+  form: (next (sys:quasi "!wget -c http://tycho.usno.navy.mil/cgi-bin/timer.pl -O - 2> /dev/null"))
+  depth: 1
+  data (nil):
+  nil
+
+The user types s to step into the (next ...) form.
+
+  txr> s
+  stopped at line 2 of navytime.txr
+  form: (sys:text "<!DOCTYPE" (#<sys:regex: 95e4590> 1+ #\space) "HTML" (#<sys:regex: 95e4618> 1+ #\space) "PUBLIC" (#<sys:regex: 95e46a8> 1+ #\space) "\"-//W3C//DTD" (#<sys:regex: 95e4750> 1+ #\space) "HTML" (#<sys:regex: 95e47d8> 1+ #\space) "3.2" (#<sys:regex: 95e4860> 1+ #\space) "Final\"//EN>")
+  depth: 2
+  data (1):
+  "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final\"//EN>"
+  txr> s
+
+The current form now is a sys:text form which is an internal representation of
+a block of horizontal material. The pattern matching is in vertical mode at
+this point, and so the line of data is printed without an indication of
+character position.
+
+  stopped at line 2 of navytime.txr
+  form: (sys:text "<!DOCTYPE" (#<sys:regex: 95e4590> 1+ #\space) "HTML" (#<sys:regex: 95e4618> 1+ #\space) "PUBLIC" (#<sys:regex: 95e46a8> 1+ #\space) "\"-//W3C//DTD" (#<sys:regex: 95e4750> 1+ #\space) "HTML" (#<sys:regex: 95e47d8> 1+ #\space) "3.2" (#<sys:regex: 95e4860> 1+ #\space) "Final\"//EN>")
+  depth: 3
+  data (1:0):
+  "" . "<!DOCTYPE HTML PUBLIC \e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+
+The user types s to step in.
+
+  txr> s
+  stopped at line 2 of navytime.txr
+  form: "<!DOCTYPE"
+  depth: 4
+  data (1:0):
+  "" . "<!DOCTYPE HTML PUBLIC \e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+
+Now, the form about to be processed is the first item of the (sys:text ...),
+the string "<!DOCTYPE".
+
+The input is shown broken into two quoted strings with a dot in between.
+The dot indicates the current position. The left string is empty, meaning
+that this is the leftmost position. The programmer steps:
+
+  txr> s
+  stopped at line 2 of navytime.txr
+  form: (#<sys:regex: 95e4590> 1+ #\espace)
+  depth: 4
+  data (1:9):
+  "<!DOCTYPE" . " HTML PUBLIC \e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+
+Control has now passed to the second element of the (sys:text ...),
+a regular expression which matches one or more spaces, generated by
+a single space in the source code according to the language rules.
+
+The input context shows that "<!DOCTYPE" was matched in the input, and the
+position moved past it.
+
+  txr> s
+  stopped at line 2 of navytime.txr
+  form: "HTML"
+  depth: 4
+  data (1:10):
+  "<!DOCTYPE " . "HTML PUBLIC \e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+
+Now, the regular expression has matched and moved the position past
+the space; the facing input is now "HTML ...".
+
+The programmer then repeats the s command by hitting Enter.
+
+  txr>
+  stopped at line 2 of navytime.txr
+  form: (#<sys:regex: 95e4618> 1+ #\espace)
+  depth: 4
+  data (1:14):
+  "<!DOCTYPE HTML" . " PUBLIC \e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+  txr>
+  stopped at line 2 of navytime.txr
+  form: "PUBLIC"
+  depth: 4
+  data (1:15):
+  "<!DOCTYPE HTML " . "PUBLIC \e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+  txr>
+  stopped at line 2 of navytime.txr
+  form: (#<sys:regex: 95e46a8> 1+ #\espace)
+  depth: 4
+  data (1:21):
+  "<!DOCTYPE HTML PUBLIC" . " \e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+  txr>
+  stopped at line 2 of navytime.txr
+  form: "\e"-//W3C//DTD"
+  depth: 4
+  data (1:22):
+  "<!DOCTYPE HTML PUBLIC " . "\e"-//W3C//DTD HTML 3.2 Final\e"//EN>"
+  txr>
+  stopped at line 2 of navytime.txr
+  form: (#<sys:regex: 95e4750> 1+ #\espace)
+  depth: 4
+  data (1:34):
+  "<!DOCTYPE HTML PUBLIC \e"-//W3C//DTD" . " HTML 3.2 Final\e"//EN>"
+
+It is not evident from the session transcript, but during interactive use,
+the input context appears to be animated. Whenever the programmer hits
+Enter, the new context is printed and the dot appears to advance.
+
+Eventually the programmer becomes bored and places a breakpoint on line 15,
+where the @(output) block begins, and invokes the c command to continue the
+execution:
+
+  txr> b 15
+  txr> c
+  stopped at line 15 of navytime.txr
+  form: (output (((repeat nil (((sys:var MO nil nil) "-" (sys:var DD nil nil) " " (sys:var HH nil nil) ":" (sys:var MM nil nil) ":" (sys:var SS nil nil) " " (sys:var PM nil nil) " " (sys:var TZ nil nil))) nil nil nil nil nil nil))))
+  depth: 2
+  data (16):
+  ""
+
+The programmer issues a v command to take a look at the variable bindings,
+which indicate that the @(collect) has produced some lists.
+
+  txr> v
+  bindings:
+  0: ((PM " " "PM" "PM" "PM" "PM" "PM" "PM") (TZNAME "Universal Time" "Eastern Time" "Central Time" "Mountain Time" "Pacific Time" "Alaska Time" "Hawaii-Aleutian Time") (TZ "UTC" "EDT" "CDT" "MDT" "PDT" "AKDT" "HAST") (SS "35" "35" "35" "35" "35" "35" "35") (MM "32" "32" "32" "32" "32" "32" "32") (HH "23" "07" "06" "05" "04" "03" "01") (DD "30" "30" "30" "30" "30" "30" "30") (MO "Mar" "Mar" "Mar" "Mar" "Mar" "Mar" "Mar"))
+
+Then the programmer issues a continue command, which finishes the program,
+and its output appears:
+
+  txr> c
+  Mar-30 23:22:52 UTC
+  Mar-30 07:22:52 PM EDT
+  Mar-30 06:22:52 PM CDT
+  Mar-30 05:22:52 PM MDT
+  Mar-30 04:22:52 PM PDT
+  Mar-30 03:22:52 PM AKDT
+  Mar-30 01:22:52 PM HAST
+
 .SH APPENDIX A: NOTES ON EXOTIC REGULAR EXPRESSIONS

 Users familiar with regular expressions may not be familiar with the complement
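
Reviewer note: the DEBUGGER text added by this patch says that stepping over a @(next) directive skips everything that follows it, because the following query material is made up of child nodes of the next directive. The Python sketch below is only a toy model of that behavior, not TXR's implementation. The `Node` class, the `stops` function and the shape of the example tree are all hypothetical names and assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One form in a toy syntax tree (hypothetical, not TXR's real representation)."""
    form: str
    children: list = field(default_factory=list)


def stops(node, command):
    """Return the forms at which a toy debugger would stop.

    command is "s" (step into) or "n" (step over), issued at every stop.
    """
    visited = [node.form]  # the debugger stops before this node is interpreted
    if command == "s":
        # Stepping into: each child becomes a new stopping point.
        for child in node.children:
            visited.extend(stops(child, command))
    # Stepping over: the node and its whole subtree run without further stops.
    return visited


# Assumed shape: the material after @(next) hangs underneath the next node.
query = Node("next", [
    Node("text"),
    Node("collect", [Node("text"), Node("until")]),
    Node("output"),
])

print(stops(query, "s"))
print(stops(query, "n"))
```

In this model, issuing n at the very first stop leaves no further stopping points, which matches the behavior the manual text describes for @(next). Issuing s instead visits each nested form in turn.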