path: root/stdlib
Commit message | Author | Age | Files | Lines
* compiler: new late-peephole case. | Kaz Kylheku | 2021-11-29 | 1 | -0/+11
    This is related to the pattern in the previous commit. When we have a situation like this:

        lab1
          mov tn nil
        lab2
          ifq tn nil lab4
        lab3
          gcall tn ...

    We know that if lab1 is entered, then lab2 will necessarily fall through: the lab4 branch is not taken because tn is nil. But then, tn is clobbered immediately in lab3 by the gcall tn. In other words, the value stored into tn by lab1 is never used. Therefore, we can remove the "mov tn nil" instruction and move the lab1 label.

        lab2
          ifq tn nil lab4
        lab1
        lab3
          gcall tn ...

    There are 74 hits for this pattern in stdlib.

    * stdlib/optimize.tl (basic-blocks late-peephole): Implement the above pattern.
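The rewrite described in this commit can be modelled outside the compiler. The following is a minimal Python sketch, not the TXR Lisp code in stdlib/optimize.tl: it assumes a flat list in which labels are strings and instructions are tuples, and the names `is_insn`, `matches` and `drop_dead_nil_move` are hypothetical. The extra check that the register does not also appear among the gcall arguments is an assumption of this sketch, not something the commit states.

```python
def is_insn(item, op):
    """True if item is an instruction tuple with the given opcode."""
    return isinstance(item, tuple) and len(item) > 0 and item[0] == op


def matches(window):
    """Check a six-item window against: lab1, mov, lab2, ifq, lab3, gcall."""
    lab1, mov, lab2, ifq, lab3, gcall = window
    if not all(isinstance(lab, str) for lab in (lab1, lab2, lab3)):
        return False
    if not (is_insn(mov, "mov") and len(mov) == 3 and mov[2] == "nil"):
        return False
    reg = mov[1]
    return (is_insn(ifq, "ifq") and len(ifq) == 4 and ifq[1:3] == (reg, "nil")
            and is_insn(gcall, "gcall") and len(gcall) >= 2 and gcall[1] == reg
            # Sketch-only assumption: the gcall must not read reg as an argument.
            and reg not in gcall[2:])


def drop_dead_nil_move(code):
    """Remove the dead "mov reg nil" and move lab1 below the ifq."""
    out, i = [], 0
    while i < len(code):
        window = code[i:i + 6]
        if len(window) == 6 and matches(window):
            lab1, _mov, lab2, ifq, lab3, gcall = window
            out += [lab2, ifq, lab1, lab3, gcall]
            i += 6
        else:
            out.append(code[i])
            i += 1
    return out
```

A path that used to enter at lab1 now lands directly on the gcall, which overwrites the register anyway.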
* compiler: revise no-longer-matching late peephole case. | Kaz Kylheku | 2021-11-29 | 1 | -14/+3
    * stdlib/optimize.tl (basic-blocks late-peephole): This pattern doesn't match any more because of code removed by the previous commit. If we shorten it by removing the lab1 block, then it matches. Because the pattern is shorter, the reduction being performed by the replacement is no longer needed; it has already been done. The remaining value is that it threads the jump from lab3 to lab4. This missing threading is what I noticed when evaluating the effects of the previous patch; this restores it.
* compiler: replace late-peephole pattern with real approach. | Kaz Kylheku | 2021-11-26 | 1 | -12/+8
    * stdlib/optimize.tl (basic-blocks merge-jump-thunks): For each group of candidate jump-blocks, search the entire basic block list for one more jump block which is identical to the others, except that it doesn't end in a jmp, but rather falls through to the same target that the group jumps to. That block is then included in the group, and also becomes the default leader since it is pushed to the front.
    (basic-blocks late-peephole): Remove the peephole pattern which tried to attack the same problem. The new approach is much more effective: when compiling stdlib, 77 instances occur in which such a block is identified and added! The peephole pattern only matched six times.
* quips: new ones about syntactic sugar. | Kaz Kylheku | 2021-11-26 | 1 | -0/+2
    * stdlib/quips.tl (sys:%quips%): New entries.
* compiler: another late peephole pattern. | Kaz Kylheku | 2021-11-26 | 1 | -0/+12
    There are six hits for this in stdlib, two of them in optimize.tl itself.

    The situation is like:

        label1
          (instruction ...)
          (jmp label3)
        label2
          (instruction ...)
        label3

    where (instruction ...) is identical in both places. label1 and label2 are functionally identical blocks, which means that the pattern can be rewritten as:

        label1
        label2
          (instruction ...)
        label3

    When the label1 path is taken it's faster due to the elimination of the jmp, and code size is reduced by two instructions.

    This pattern may possibly be the result of an imperfection in the design of the basic-blocks method merge-jump-thunks. The label1 and label2 blocks are functionally identical. But merge-jump-thunks looks strictly for blocks that end in a jmp instruction. It's possible that there was a jmp instruction at the end of the label2 block, which got eliminated before merge-jump-thunks, which is done late, just before late-peephole.

    * stdlib/optimize.tl (basic-blocks late-peephole): New rule for the above pattern.
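As an illustration of the rewrite above, here is a small Python model. It is a sketch under stated assumptions, not the rule in stdlib/optimize.tl: code is a flat list where labels are strings and instructions are tuples, and the names `is_match` and `merge_duplicate_tail` are hypothetical.

```python
def is_match(window):
    """Check: label1, insn, (jmp label3), label2, same insn, label3."""
    label1, insn, jmp, label2, dup, label3 = window
    return (all(isinstance(lab, str) for lab in (label1, label2, label3))
            and isinstance(insn, tuple)
            and jmp == ("jmp", label3)
            and dup == insn)


def merge_duplicate_tail(code):
    """Collapse two identical one-instruction blocks into one shared block."""
    out, i = [], 0
    while i < len(code):
        window = code[i:i + 6]
        if len(window) == 6 and is_match(window):
            label1, insn, _jmp, label2, _dup, _label3 = window
            out += [label1, label2, insn]
            i += 5  # keep label3 in the input so it is emitted on the next pass
        else:
            out.append(code[i])
            i += 1
    return out
```

The output is two instructions shorter per match: the duplicate instruction and the jmp both disappear, which agrees with the size reduction stated in the commit message.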
* buffers: use unbuffered I/O in convenience functions. | Kaz Kylheku | 2021-11-17 | 1 | -2/+2
    * stdlib/getput.tl (file-get-buf, command-get-buf): If the number of bytes to read is specified, we use an unbuffered stream. A buffered stream can read more bytes in order to fill a buffer, which is undesirable when dealing with a device or pipe.
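The over-reading effect is not specific to TXR. The following Python sketch reproduces it with an ordinary pipe; it only illustrates the general behaviour of buffered streams and says nothing about how file-get-buf is implemented. The helper name `leftover_after_short_read` is hypothetical.

```python
import os


def leftover_after_short_read(buffering):
    """Read 2 bytes through a stream, then report what is still in the pipe."""
    r, w = os.pipe()
    os.write(w, b"0123456789")
    os.close(w)  # closing the writer lets the final os.read return at EOF
    with open(r, "rb", buffering=buffering) as stream:
        head = stream.read(2)
        # Bypass the stream and look at the descriptor directly.
        rest = os.read(r, 100)
    return head, rest
```

With `buffering=0` the remaining eight bytes are still in the pipe for the next reader. With the default buffering (`-1`) the stream has already pulled them into its own buffer, so the descriptor reports end of file.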
* Version 272. (tag: txr-272) | Kaz Kylheku | 2021-11-11 | 1 | -1/+1
    * RELNOTES: Updated.

    * configure (txr_ver): Bumped version.

    * stdlib/ver.tl (lib-version): Bumped.

    * txr.1: Bumped version and date.

    * txr.vim, tl.vim: Regenerated.
* compiler: late-peephole match for a wasteful register move. | Kaz Kylheku | 2021-11-10 | 1 | -0/+13
    I've noticed a wasteful instruction pattern in the compiled code for the sys:awk-code-move-check function:

         7: 2C020007 movsr t2 t7
         8: 3800000E if t7 14
         9: 00000007
        10: 20050002 gcall t2 1 t9 d1 t8 t6 t7
        11: 00090001
        12: 00080401
        13: 00070006
        14: 10000002 end t2

    Here, the t2 register can be replaced with t7 in the gcall and end instructions, and the movsr t2 t7 instruction can be eliminated.

    It looks like something that could somehow be targeted more generally with a clever peephole pattern assisted by data-flow information, but for now I'm sticking in a dumb late-peephole pattern which just looks for this very specific pattern.

    * stdlib/optimize.tl (basic-blocks late-peephole): Add new pattern for eliminating the move, as described above. There are several hits for this in the standard library in addition to the awk module: in the path-test, each-prod and getopts files.
* compiler: avoid eval of unsafe constantp in some situations. | Kaz Kylheku | 2021-11-09 | 1 | -6/+12
    In situations when the compiler evaluates a constant expression in order to make some code generating decision, we don't just want to be using safe-const-eval. While that prevents the compiler from blowing up, and issues a diagnostic, it causes incorrect code to be generated: code which does not incorporate the unsafe expression. Concrete example:

        (if (sqrt -1) (foo) (bar))

    If we simply evaluate (sqrt -1) with safe-const-eval, we get a diagnostic, and the value nil comes out. The compiler will thus constant-fold this to (bar). Though the diagnostic was emitted, executing the compiled code does not produce the exception from (sqrt -1) any more, but just calls bar.

    In certain cases where the compiler relies on the evaluation of a constant expression, we should bypass those cases when the expression is unsafe.

    In cases where the expression will be integrated into the output code, we can test with constantp. The same is true in some other mitigating circumstances. For instance if we test with constantp, and then require safe-const-eval to produce an integer, we are okay, because a throwing evaluation will not produce an integer.

    * stdlib/compiler.tl (safe-constantp): New function.
    (compiler (comp-if, comp-ift, lambda-apply-transform)): Use safe-constantp rather than constantp for determining whether an expression is suitable for compile-time evaluation.
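The miscompilation can be shown with a toy constant folder. This is a Python sketch of the idea only, not the logic in stdlib/compiler.tl: constant expressions are modelled as zero-argument callables, and the names `safe_const_eval`, `fold_if_naive` and `fold_if_checked` are hypothetical stand-ins that merely echo the commit's vocabulary.

```python
import math


def safe_const_eval(expr):
    """Evaluate a constant expression; on an exception return (None, False)."""
    try:
        return expr(), True
    except Exception:
        return None, False


def fold_if_naive(test, then, otherwise):
    """Fold on the value alone: a throwing test looks like a false test."""
    value, _ok = safe_const_eval(test)
    return then if value else otherwise


def fold_if_checked(test, then, otherwise):
    """Refuse to fold when evaluating the test throws; keep the whole form."""
    value, ok = safe_const_eval(test)
    if not ok:
        return ("if", test, then, otherwise)
    return then if value else otherwise
```

With a test such as `lambda: math.sqrt(-1)`, the naive folder silently selects the else branch, while the checked folder leaves the conditional intact so that the exception still occurs when the code runs.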
* compiler: handle constant expressions that throw. | Kaz Kylheku | 2021-11-08 | 1 | -17/+76
    When the compiler evaluates constant expressions, it's possible that they throw, for instance (/ 1 0). We now handle it better; the compiler warns about it and is able to keep working, avoiding constant-folding the expression.

    * stdlib/compiler.tl (eval-cache-entry): New struct type.
    (%eval-cache%): New hash table variable.
    (compiler (comp-arith-form, comp-fun-form)): Add some missing rlcp calls to track locations for rewritten arithmetic expressions, so we usefully diagnose a (sys:b/ ...) and such.
    (compiler (comp-if, comp-ift, comp-arith-form, comp-apply-call, reduce-constant, lambda-apply-transform)): Replace instances of eval of constantp expressions with safe-const-eval, and instances of the result of eval being quoted with safe-const-reduce.
    (orig-form, safe-const-reduce, safe-const-eval, eval-cache-emit-warnings): New functions.
    (compile-top-level, with-compilation-unit): Call eval-cache-emit-warnings to warn about constant expressions that threw.
* doc: spelling, doc-syms refresh. | Kaz Kylheku | 2021-11-03 | 1 | -2/+2
    * txr.1: Fix spelling errors that have crept in due to read-once and the quasiliteral fixes to matching.

    * stdlib/doc-syms.tl: Forgotten refresh, needed by the fix to the wrong random-float-inc name.
* compiler: rephrase length check with tree-case. | Kaz Kylheku | 2021-11-02 | 1 | -7/+6
    * stdlib/compiler.tl (compiler comp-arith-neg-form): Instead of the length check on the form, we can use a tree-case to require three arguments.
* compiler: bug: invalid transformation of (- x y ...). | Kaz Kylheku | 2021-11-02 | 1 | -6/+2
    * stdlib/compiler.tl (compiler comp-arith-neg-form): Remove algebraically incorrect transformation.
* compiler: remove excess call to reduce-constant. | Kaz Kylheku | 2021-11-02 | 1 | -11/+10
    * stdlib/compiler.tl (compiler comp-arith-form): There is no need here to pass the form through reduce-constant, since we are about to divide up its arguments and individually reduce them, much like what that function does.
* compiler: catch bugfix. | Kaz Kylheku | 2021-11-02 | 1 | -8/+7
    Commit c8f12ee44d226924b89cdd764b65a5f6a4030b81 tried to fix an aspect of this problem. I ran into an issue where the try code produced a D register as its output, and this was clobbered by the catch code.

    In fact, the catch code simply must not clobber the try fragment's output register. No matter what register that is, it is not safe. A writable T register could hold a variable.

    For instance, this infinitely looping code is miscompiled such that it terminates:

        (let ((x 42))
          (while (eql x 42)
            (catch (progn (throw 'foo) x)
              (foo () 0))))

    When the exception is caught by the (foo () 0) clause x is overwritten with that 0 value.

    The variable x is assigned to a register like t13, and since the progn form returns x as its value, it compiles to a fragment (tfrag) which indicates t13 as its output register. The catch code wrongly borrows this as its own output register, placing the 0 value into it.

    * stdlib/compiler.tl (compiler comp-catch): Get rid of the coreg local variable, replacing all its uses with oreg.
* compiler: don't lift top-level lambdas. | Kaz Kylheku | 2021-11-01 | 1 | -1/+5
    The compiler is lifting top-level lambdas, such as those generated by defun, using the load-time mechanism. This has the undesirable effect of unnecessarily placing the lambdas into a D register.

    * stdlib/compiler.tl (*top-level*): New special variable. This indicates that the compiler is compiling code that is outside of any lambda.
    (compiler comp-lambda-impl): Bind *top-level* to nil when compiling lambda, so its interior is no longer at the top level.
    (compiler comp-lambda): Suppress the unnecessary lifting optimization if the lambda expression is in the top-level, outside of any other lambda, indicated by *top-level* being true.
    (compile-toplevel): Bind *top-level* to t.
* compiler: compile-toplevel: bind *load-time* to t. | Kaz Kylheku | 2021-11-01 | 1 | -0/+1
    * stdlib/compiler.tl (compile-toplevel): Recently, I removed the binding of *load-time* to t from this function. That is not quite right; we want to positively bind it to nil. A new top-level compile starts out in non-load-time. Suppose that some compile-time evaluation recurses into the compiler.
* rel-path, path-equal: native Windows fixes. | Kaz Kylheku | 2021-11-01 | 1 | -19/+43
    The checks for native Windows are incorrect, plus there are some issues in the path-volume function.

    We cannot check for native Windows at macro-expansion time simply by calling (find #\\ path-sep-chars) because we compile on Cygwin where that is false. What we must do is check for being on Windows at macro-expansion time, and then in the "yes" branch of that decision, the code must perform the path-sep-char test at run-time. In the "no" branch, we can output smaller code that doesn't deal with Windows.

    * stdlib/copy-file.tl (if-windows, if-native-windows): New macros, which give a clear syntax to the above described testing.
    (path-split): Use if-native-windows.
    (path-volume): Use if-native-windows. In addition, fix some broken tests. The tests for a UNC path "//whatever" cannot just test that the first components are "", because that also matches the path "/". It has to be that the first two components are "", and there are more components. A similar issue occurs in the situation when there is a drive letter. We cannot conclude that if the component after the drive letter is "", then it's a drive absolute path, because that situation occurs in a path like "c:" which is relative. We also destructively manipulate the path to splice out the volume part and turn it into a simple relative or absolute path. This is because the path-simplify function doesn't deal with the volume prefix; its logic, such as eliminating .. navigations from the root, does not work if the prefix component is present.
    (rel-path): We handle a missing error case here: one path has volume prefix and the other doesn't. Also the error cases that can only occur on Windows are wrapped with if-windows to remove them at compile time.
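The UNC test described for path-volume can be checked on split components. The sketch below is a Python model under assumptions, not the code in stdlib/copy-file.tl: it assumes paths are split on both / and \ with empty components preserved, and the names `split_path`, `looks_like_unc_naive` and `looks_like_unc` are hypothetical.

```python
import re


def split_path(path):
    """Split on either separator, keeping empty components."""
    return re.split(r"[/\\]", path)


def looks_like_unc_naive(parts):
    """Flawed test: only checks that the first two components are empty."""
    return parts[:2] == ["", ""]


def looks_like_unc(parts):
    """Corrected test: two empty leading components and something after them."""
    return parts[:2] == ["", ""] and len(parts) > 2
```

The root path "/" splits into two empty components, so the naive test mistakes it for a UNC path; requiring further components rules it out while still accepting "//whatever".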
* match: fix quasiliteral issue. | Kaz Kylheku | 2021-10-26 | 1 | -0/+1
    * stdlib/match.tl (compile-match): Handle the (sys:expr (sys:quasi ...)) case by recursing on the (sys:quasi ...) part, thus making them equivalent. This fixes the newly introduced broken test cases, and meets the newly documented requirements.
* pic: use ifa to remove repeated array access. | Kaz Kylheku | 2021-10-26 | 1 | -2/+2
    * stdlib/pic.tl (insert-commas): Use ifa to bind the anaphoric variable it to [num (pred i)]. With the new ifa behavior involving read-place, this now prevents two accesses to the array.
* ifa: take advantage of read-once. | Kaz Kylheku | 2021-10-26 | 1 | -1/+1
    * stdlib/ifa.tl (ifa): When the form bound to the it anaphoric variable is a place, such that we use placelet, wrap the place in (read-once ...) so that multiple evaluations of it don't cause multiple accesses of the place.

    * txr.1: Documented.
* places: new accessor read-once. | Kaz Kylheku | 2021-10-26 | 2 | -0/+12
    * lisplib.c (place_set_entries): Trigger autoload on read-once.

    * stdlib/place.tl (read-once): New function and place.

    * txr.1: Documented.

    * stdlib/doc-syms.tl: Updated.
* random: new function random-float-incl. | Kaz Kylheku | 2021-10-25 | 1 | -1/+2
    This function includes the 1.0 value excluded by random-float.

    * rand.c (random_float_incl): New static function.
    (rand_init): Register random_float_incl intrinsic.

    * txr.1: Document, and add discussion about uniformity requirements and what they mean and do not mean.

    * stdlib/doc-syms.tl: Updated.
* compiler: improvement in wasteful jmp elimination. | Kaz Kylheku | 2021-10-23 | 2 | -5/+31
    * stdlib/compiler.tl (compiler optimize): After the dataflow-driven peephole optimization, call elim-dead-code again.

    * stdlib/optimize.tl (basic-blocks check-bypass-empty): New method.
    (basic-blocks elim-dead-code): After eliminating unreachable blocks from the list, we use check-bypass-empty to squeeze out any empty blocks: blocks that have no instructions in their list, other than the leading label. This helps elim-next-jmp to find more opportunities to eliminate a wasteful jump, because sometimes these jumps straddle over empty blocks. Furthermore, elim-next-jmp can generate more empty blocks itself; so we check for this situation, delete the blocks and iterate.
* compiler: also clear .next before re-linking graph. | Kaz Kylheku | 2021-10-23 | 1 | -0/+1
    * stdlib/optimize.tl (basic-blocks elim-dead-code): When clearing the links before recalculating the graph, also clear the next field of every block, because link-graph only sets this if necessary, assuming that the value is already nil. Thus by not resetting it, we risk leaving stale values in these .next fields. The code reachability calculation relies on next fields, so if they falsely point to dead blocks, those blocks could be falsely retained.
* compiler: fix failing load-time tests. | Kaz Kylheku | 2021-10-22 | 1 | -2/+1
    * stdlib/compiler.tl (usr:compile-toplevel): Do not bind *load-time* to t at the top level. The idea behind this binding was to treat load-time as a transparent form that does nothing if it occurs in the top-level since the top-level is already at load-time. However, this is problematic because it breaks the expectation that load-time calculations are factored out of a form and done prior to its evaluation, even if that form is top-level.
* ffi: deffi, deffi-cb: eliminate generated globals. | Kaz Kylheku | 2021-10-22 | 1 | -31/+21
    The immediate problem is that with-dyn-lib creates a defvarl, but deffi uses load-time forms to refer to that. In compiled code, these load-time evaluations will occur before the defvarl exists.

    The conceptual problem is that with-dyn-lib might not be a top-level form. It can be conditionally executed, as it happens in stdlib/doc-syms.tl, which is now broken.

    Let's not use load-time, but straight lexical environments.

    * stdlib/ffi.tl (with-dyn-lib): Translate to a simple let which binds sys:ffi-lib as a lexical variable.
    (sys:with-dyn-lib-check): Use lexical-var-p to test that sys:ffi-lib is lexically bound as a variable.
    (deffi, sys:deffi-cb-expander): Instead of global defvarl variables, bind the needed pieces to lexical variables, placing the generated defun into that scope.
* path-equal: enable and fix failing tests. | Kaz Kylheku | 2021-10-20 | 1 | -2/+3
    * stdlib/copy-file.tl (path-simplify): If the incoming path's first component is "", it is absolute; in that case swallow any components that go above.

    * tests/018/path-equal.tl: Uncomment two previously failing tests.
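The rule for absolute paths can be sketched on a list of components. This Python model is an illustration under assumptions rather than the TXR path-simplify function: it assumes that an absolute path is marked by a leading "" component, and the name `simplify` is hypothetical.

```python
def simplify(components):
    """Resolve "." and ".." in a component list; "" first means absolute."""
    absolute = bool(components) and components[0] == ""
    out = []
    for comp in (components[1:] if absolute else components):
        if comp in ("", "."):
            continue  # empty and "." components carry no information
        if comp == "..":
            if out and out[-1] != "..":
                out.pop()  # cancel the previous real component
            elif not absolute:
                out.append("..")  # relative paths may climb upward
            # absolute path: a ".." above the root is swallowed
        else:
            out.append(comp)
    return ([""] if absolute else []) + out
```

For a relative path the leading ".." components must be kept, since they change the result; for an absolute path there is nothing above the root, so they are dropped.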
* pic: support parenthesis negative notation. | Kaz Kylheku | 2021-10-19 | 1 | -1/+22
    * pic.tl (add-neg-parens): New system function.
    (expand-neg-parens): New macro.
    (expand-pic): New numeric pattern with parentheses. Also support escaping of parentheses.
    (pic): Recognize parenthesized numeric pattern here also.

    * tests/018/format.tl: New tests.
* pic: bug: handle ! in digit separator logic | Kaz Kylheku | 2021-10-18 | 1 | -1/+1
    * stdlib/pic.tl (comma-positions): Must also look for ! point if the . point is not found.
* pic: preserve decimal period in ### overflow fill. | Kaz Kylheku | 2021-10-18 | 1 | -1/+4
    * pic.tl (expand-pic-num): If the overflowing field specifies a decimal point other than in the rightmost position, then stick one into the fill pattern. The motivation for this is that it harmonizes with the digit separators. The new digit separator insertion logic will treat the # characters like digits, and requires the embedded decimal in order to work properly. Allowing digit separation to work in the fill pattern will make for better looking output in column displays. That's the same reason why we insert digit separators among leading zeros.

    * tests/018/format.tl: Overflow test cases updated in light of this requirement change.

    * txr.1: Documented.
* doc: doc-syms refresh. | Kaz Kylheku | 2021-10-18 | 1 | -0/+1
    * stdlib/doc-syms.tl: Updated.
* pic: new feature: digit-separating commas. | Kaz Kylheku | 2021-10-18 | 1 | -3/+51
    This allows for pic patterns like #,###,###.### which incorporate digit separating commas into the output.

    * stdlib/pic.tl (comma-positions, insert-commas, expand-pic-num-commas): New system functions.
    (expand-pic): Recognize comma as a character which can be escaped using the tilde. Recognize a more complicated numeric pattern with commas. If the matched token contains commas, treat it using expand-pic-num-commas.
    (pic): Propagate a copy of the new numeric pattern here, where it is used for separation into tokens.

    * txr.1: Documented.
* quips: five new ones: quippy day today. | Kaz Kylheku | 2021-10-15 | 1 | -0/+5
    * stdlib/quips.tl: New quips about rights, Lisp smugness, macros and Reddit.
* path-equal: propagate fixes from rel-path. | Kaz Kylheku | 2021-10-11 | 1 | -12/+10
    * stdlib/copy-file.tl (path-equal): This function is based on rel-path and so suffers the same bugs. Retarget it to use the new functions and approach to volumes from rel-path, so it benefits from the fixes.
* rel-path: multiple bugs for native Windows. | Kaz Kylheku | 2021-10-11 | 1 | -21/+55
    The first bug is that we are using the spl function with path-sep-chars. But spl does not take a set of characters; we need the sspl function.

    Other bugs involve the handling of drive letters and UNC paths on Windows.

    * stdlib/copy-file.tl (path-split, path-volume): New functions.
    (rel-path): Split path properly. Diagnose for all bad combinations of mismatching absolute/relative paths with or without a volume or incompatible volumes.
* New path-equal function. | Kaz Kylheku | 2021-10-10 | 1 | -0/+14
    * lisplib.c (copy_file_set_entries): Add path-equal to autoload symbols.

    * stdlib/copy-file.tl (path-equal): New function.

    * tests/018/path-equal.tl: New file.

    * txr.1: Documented.
* rel-path: refactor, fix diagnostic message. | Kaz Kylheku | 2021-10-10 | 1 | -32/+29
    * stdlib/copy-file.tl (path-simplify): New function.
    (rel-path): Get rid of macrolet by using macro-time expression; remove flet since canon is now path-simplify at the top level. Fix diagnostic.
* Version 271. (tag: txr-271) | Kaz Kylheku | 2021-10-05 | 1 | -1/+1
    * RELNOTES: Updated.

    * configure (txr_ver): Bumped version.

    * stdlib/ver.tl (lib-version): Bumped.

    * txr.1: Bumped version and date.

    * txr.vim, tl.vim: Regenerated.

    * protsym.c: Likewise.
* awk: :fields specifies conversions. | Kaz Kylheku | 2021-10-04 | 2 | -43/+80
    * stdlib/awk.tl (sys:awk-compile-time): Slot field-names renamed to field-name-conv.
    (sys:awk-expander): Parse the new syntax which allows (sym fn) pairs with optional fn, creating a list of normalized items in the field-name-conv slot of the compile-time structure.
    (sys:awk-symac-let): Adjust the code to the pair representation in field-name-conv.
    (sys:awk-field-name-code): New function for generating the field conversion code.
    (awk): Now that we have two optional pieces of code to wrap around p-actions form, we factor that out of the awk-lambda, to a series of conditional assignments. Here we handle the generation of the field conversions.

    * conv.tl (sys:conv-expand-sym): New macro, used in sys:awk-field-name-code and sys:conv-let.
    (sys:conv-let): Simplify with sys:conv-expand-sym. Drop optional argument from i; it connects with no documented feature, and is not usable from fconv.

    * tests/015/awk-fields.tl: New tests.

    * txr.1: Updated, including cruft in fconv documentation.

    Change-Id: Ie42819f58af039fdbcdb1ae365c89dc1add55c93
* ffi: add cptr-carray function. | Paul A. Patience | 2021-10-02 | 1 | -0/+1
    * ffi.c (cptr_carray): New function.
    (ffi_init): Register cptr-carray intrinsic.

    * ffi.h (cptr_carray): Declared.

    * txr.1: Documented.

    * stdlib/doc-syms.tl: Updated.
* awk: new :fields feature for named fields. | Kaz Kylheku | 2021-10-01 | 1 | -30/+48
    * stdlib/awk.tl (sys:awk-compile-time): New slot, field-names.
    (sys:awk-expander): Validate and store field-names into compile-time structure.
    (sys:awk-symac-let): New macro.
    (awk): Wrap sys:awk-symac-let around code to generate field name macros.

    * tests/015/awk-fields.tl: New file.

    * txr.1: Documented.
* compiler: peephole: recalc and rescan in a few more cases. | Kaz Kylheku | 2021-09-30 | 1 | -0/+9
    * stdlib/optimize.tl (basic-block peephole-block): In a few more cases, we should be setting the recalc flag to recalculate liveness, and adding some block to the rescan list.
* compiler: fix up linkage and recalc liveness in one peephole case. | Kaz Kylheku | 2021-09-30 | 1 | -8/+11
    * stdlib/optimize.tl (basic-blocks peephole-block): Rearrange the code a bit so we don't calculate the xbl, which potentially performs the cut-block, if there is no ybl. We set the bb.recalc flag since we may have cut a block into two and have redirected a jump, and also update the links for that reason.
* compiler: eliminate some redundant hash lookups. | Kaz Kylheku | 2021-09-30 | 1 | -11/+12
    * stdlib/optimize.tl (basic-blocks thread-jumps-block, basic-blocks peephole-block): Streamline various cases of [bb.hash jlabel] being wastefully called twice to look up the same block referenced by the same label.
* compiler: eliminate basic-block next-block method. | Kaz Kylheku | 2021-09-30 | 1 | -23/+19
    The next-block method performs a linear search through the basic block list, which is physically ordered, to find the physically next block. This is actually not needed in several places that use the method; they want the logically next block, which is nil if the last instruction of the current block doesn't potentially fall through to the next block. In the one place where we need the physical next block, in the elim-next-jump method, the caller can dynamically provide this, since it walks the list.

    * stdlib/optimize.tl (basic-block next-block): Method removed.
    (basic-block link-graph): We revise the logic here a little bit. All of the cases now consistently use the mechanism of setting link-next to nil to indicate that they don't fall through to the next block. The special case handling of the close instruction is clearer.
    (basic-block (thread-jumps-block, peephole-block)): Several cases here referred to the physically next block via the next-block method. This can be replaced by just using the next pointer, which will be the same.
    (basic-blocks elim-next-jump): This method now takes the next block as an argument, since there is no next-block method it can call to get the physically next block. The argument is guaranteed non-null, so we don't need the .? null-safe slot access syntax.
    (basic-blocks elim-dead-code): Iterate over the next blocks simultaneously, and pass the next block into elim-next-jump. We no longer iterate over the last block, which has no physical next block.
* compiler: cosmetic: merge set assignments. | Kaz Kylheku | 2021-09-30 | 1 | -7/+7
    * stdlib/optimize.tl (basic-blocks join-block): Merge set forms into one.
    (basic-blocks elim-dead-code): Likewise.
* compiler: improvement in next-block linking. | Kaz Kylheku | 2021-09-29 | 1 | -3/+3
    * stdlib/optimize.tl (basic-blocks link-graph): Do not search the entire list for a block's successor. Iterate over the cdr of the list in parallel, so that the next block is directly available at each iteration.
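The change in iteration strategy can be illustrated in a few lines of Python. This is only a model of the idea, not the link-graph method: blocks are stand-in strings, and the function names `successors_by_search` and `successors_in_parallel` are hypothetical.

```python
def successors_by_search(blocks):
    """Quadratic version: locate each block in the list to find its successor."""
    result = {}
    for block in blocks:
        i = blocks.index(block)  # linear search on every iteration
        result[block] = blocks[i + 1] if i + 1 < len(blocks) else None
    return result


def successors_in_parallel(blocks):
    """Linear version: walk the list and its tail side by side."""
    result = {block: nxt for block, nxt in zip(blocks, blocks[1:])}
    if blocks:
        result[blocks[-1]] = None  # the last block has no physical successor
    return result
```

Pairing the list with its own tail plays the role of iterating over the cdr in parallel: each step already has the next block in hand, so no search is needed.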
* compiler: remove impossible cases in jump threading. | Kaz Kylheku | 2021-09-29 | 1 | -8/+4
    * stdlib/optimize.tl (basic-blocks thread-jumps-block): There can't be any instructions in a basic block after an if or ifq, so in these cases, jrest is always nil. Let's ignore that nil efficiently with @nil, and get rid of the cut-block branches of the code.

    There is a similar case in peephole-block, but the target of the jump is an (end ...) which doesn't necessarily end a basic block. I temporarily put in an (assert (null jrest)), and, surprisingly, it never went off during a rebuild of the library or running of the test case. Still, only a jend ends a basic block; it would not be correct to simplify it like these two cases in thread-jumps-block.
* compiler: peephole: merge basic blocks when jmp removed. | Kaz Kylheku | 2021-09-29 | 1 | -5/+16
    When a jmp instruction is removed from (necessarily) the end of a basic block, that basic block can be merged with the next one, and marked for re-scanning. A test case where this eliminates a wasteful register-register move instruction is (match #(@a) #(3) a).

    * stdlib/optimize.tl (basic-blocks): New slot, tryjoin.
    (basic-blocks join-block): Null out the instruction list of the joined block. This helps if we do this during peephole processing, because it happens in the middle of an iteration over a list of blocks which can still visit the next block that has been merged into its predecessor; we don't want to be processing instructions that are no longer relevant.
    (basic-blocks peephole-block): In the one case where a conditional instruction is deleted from the end of the basic block, we add the block to the rescan list, and also to the tryjoin list. If the block can be merged with the next one, that can create more opportunities for peephole optimization.
    (basic-blocks peephole): Use zap in a few places to condense the logic of sampling a state variable that needs to be nulled out. Add the processing of the tryjoin list: pop basic blocks from the list, and try to merge them with their successor, if possible. We handle cases here where the next block could itself be in tryjoin. Also, if we join any blocks, we set the recalc flag to recalculate the liveness info.