summaryrefslogtreecommitdiffstats
path: root/txr.1
diff options
context:
space:
mode:
authorKaz Kylheku <kaz@kylheku.com>2012-03-13 22:02:48 -0700
committerKaz Kylheku <kaz@kylheku.com>2012-03-13 22:02:48 -0700
commit7f0f22c4e455f457d37ddf542b36c49db20d16af (patch)
treeae4b2ed4afa05a8714c86f80ce5a9d4a8d4afe05 /txr.1
parentbe33de3909100a204726cdde52bf6077e890ca6d (diff)
downloadtxr-7f0f22c4e455f457d37ddf542b36c49db20d16af.tar.gz
txr-7f0f22c4e455f457d37ddf542b36c49db20d16af.tar.bz2
txr-7f0f22c4e455f457d37ddf542b36c49db20d16af.zip
Change: @(block) requires @(end) from now on.
Blocks no longer extend to the end of the surrounding scope. * match.c (v_block): Rewrite for new syntax. * parser.l (BLOCK): New token type handled. * parser.y (BLOCK): New token. (block_clause): New nonterminal grammar symbol. (clause): Collateral fix: replaced a bunch of list(X, nao) forms with cons(X, nil). Introduced block_clause as a constituent of clause. * txr.1: Revamped documentation of block, and wrote about using blocks for reducing nested skips and reducing backtracking in general.
Diffstat (limited to 'txr.1')
-rw-r--r--txr.1101
1 files changed, 80 insertions, 21 deletions
diff --git a/txr.1 b/txr.1
index e5e63478..b51813d2 100644
--- a/txr.1
+++ b/txr.1
@@ -1136,8 +1136,9 @@ end of a line. Also Fails if no data remains (there is no current line).
Continue matching in another file or other data source.
.IP @(block)
-The remaining query is treated as an anonymous or named block.
-Blocks may be referenced by @(accept) and @(fail) directives.
+Groups to gether a sequence of directives into a logical name block,
+which can be explicitly terminated from within using
+the @(accept) and @(fail) directives.
Blocks are discussed in the section BLOCKS below.
.IP @(skip)
@@ -1524,6 +1525,65 @@ will keep scanning even though it has found the correct match, then backtrack
to the last good match once it runs out of data. The regular skip with explicit
@(eof) will stop when the @(eof) matches.
+.SS Reducing Backtracking with Blocks
+
+Skip can consider considerable CPU time when multiple skips are nested.
+Consider:
+
+ @(skip)
+ A
+ @(skip)
+ B
+ @(skip)
+ C
+
+This is actually nesting: the second a third skips occur within the body of the
+first one, and thus this creates nested iteration. TXR is searching for the
+combination of skips which find match the pattern of lines A, B and C, with
+backtracking behavior. The outermost skip marches through the data until it
+finds A, followed by a pattern match for the second skip. The second skip
+iterates within to find B, followed by the third skip, and the third skip
+iterates to find C. If there is only one line A, and one B, then this is
+reasonably fast. But suppose there are many lines matching A and B,
+giving rise to a large number combinations of skips which match A and B, and
+yet no match for C, triggering backtracking. The nested stepping which tries
+the combinations of A and B can give rise to a considerable running time.
+
+One way to deal with the problem is to unravel the nesting with the help of
+blocks. For example:
+
+ @(block)
+ @ (skip)
+ A
+ @(end)
+ @(block)
+ @ (skip)
+ B
+ @(end)
+ @(skip)
+ C
+
+Now the scope of each skip is just the remainder of the block in which it
+occurs. The first skip finds A, and then the block ends. Control passes to the
+next block, and backtracking will take place to a block which completed (unless
+all these blocks are enclosed in some larger construct which backtracks,
+causing the blocks to be re-executed.
+
+Of course, this rewrite is not equivalent, and cannot be used for instance
+in backreferencing situations such as:
+
+ @;
+ @; Find some three lines which are the same.
+ @;
+ @(skip)
+ @line
+ @(skip)
+ @line
+ @(skip)
+ @line
+
+This example depends on the nested search-within-search semantics.
+
.SS The Trailer Directive
The trailer directive introduces a trailing portion of a query or subquery
@@ -2627,28 +2687,36 @@ to @(block nil).
The @(skip) and @(collect) directives introduce implicit anonymous blocks,
as do function bodies.
+Blocks are useful for terminating parts of a pattern matchin search
+prematurely, and escaping to a higher level. This makes blocks not only
+useful for simplifying the semantics of certain pattern matches,
+but also an optimization tool.
+
+Judicious use of blocks and escapes can reduce or eliminate the amount
+backtracking that TXR performs.
+
.SS Block Scope
The names of blocks are in a distinct namespace from the variable binding
space. So @(block foo) has no interaction with the variable @foo.
A block extends from the @(block ...) directive which introduces it,
-to the end of the subquery in which that directive is contained. For instance:
+until the matching @(end), and may be empty. For instance:
@(some)
abc
@(block foo)
xyz
@(end)
+ @(end)
Here, the block foo occurs in a @(some) clause, and so it extends to the @(end)
-which terminates that clause. After that @(end), the name foo is not
-associated with a block (is not "in scope"). A block which is not contained in
-any subquery extends to the end of the overall query. Blocks are never
-terminated by @(end).
+which terminates the block. After that @(end), the name foo is not
+associated with a block (is not "in scope"). The second @(end) terminates
+the @(some) block.
The implicit anonymous block introduced by @(skip) has the same scope
-as the @(skip): they extends over all of the material which follows the skip,
+as the @(skip): it extends over all of the material which follows the skip,
to the end of the containing subquery.
.SS Block Nesting
@@ -2659,11 +2727,15 @@ which they are nested. For instance:
@(block)
@(block)
...
+ @(end)
+ @(end)
is a nesting of two anonymous blocks, and
@(block foo)
@(block foo)
+ @(end)
+ @(end)
is a nesting of two named blocks which happen to have the same name.
When a nested block has the same name as an outer block, it creates
@@ -2671,19 +2743,6 @@ a block scope in which the outer block is "shadowed"; that is to say,
directives which refer to that block name within the nested block refer to the
inner block, and not to the outer one.
-A more complicated example of nesting is:
-
- @(skip)
- abc
- @(block)
- @(some)
- @(block foo)
- @(end)
-
-Here, the @(skip) introduces an anonymous block. The explicit anonymous
-@(block) is nested within skip's anonymous block and shadows it.
-The foo block is nested within both of these.
-
.SS Block Semantics
A block normally does nothing. The query material in the block is evaluated