Commit message | Author | Age | Files | Lines
...
* lib: remove unnecessary braces in funcall FVM cases.Kaz Kylheku2021-03-251-25/+15
| | | | | * lib.c (funcall, funcall1, funcall2, funcall3, funcall4): Remove unnecessary braces.
* compiler: new jump skid optimization case.Kaz Kylheku2021-03-251-0/+9
| | | | | | | | | | | | | | ifq tN t0 label0 ifq tN t0 label1 ... ... label0: --> ... if tN label2 label1: label1: ... ... * share/txr/stdlib/optimize.tl (basic-blocks thread-jumps-block): Implement case. There are 56 "hits" for this in the standard library.
* vm: allow signals during vm execution.Kaz Kylheku2021-03-251-1/+6
| | | | | | | | | | I have experimented with several approaches, and found that this one has an immeasurably small effect on performance, below the noise floor. * vm.c (vm_jmp): Call sig_check_fast whenever there is a backwards branch. (vm_execute): Check for signals once, on entry.
* METALICENSE: fix typo.Kaz Kylheku2021-03-241-1/+1
| | | | * METALICENSE: yeras typo reported by P. A. Patience.
* MacOS: adjust socket-basic test for dgram size.Kaz Kylheku2021-03-241-4/+7
| | | | | | | | | * tests/014/socket-basic.tl (%iters%): New variable. 2000 on MacOS, 5000 elsewhere. (client, server): Use %iters% instead of hard-coded 5000. (test): Rename to sock-test, since it clashes with the test macro coming from common.tl, which we need for the os-symbol function.
* M1: Fix sed issue.Kaz Kylheku2021-03-241-1/+5
| | | | | | | | | | | | | | | | | The MacOS sed doesn't accept semicolon termination for commands involving defining labels or branching to labels. This is not a new issue, but it is revealed on newer MacOS because the sed now complains about unused labels. In the sed command ':x; /whatever/ { ...; tx }', everything after the initial : is interpreted as a label. * Makefile (DEPGEN): Split the sed command's syntax up into logical lines using multiple -e commands, applying some formatting with indentation to try to keep it readable. It looks like multiple -e options just glue together to make a program, as if they were lines of code; one -e can define a label referenced by another, and even the closing brace can be treated as a separate command.
* M1: disable -nopie.Kaz Kylheku2021-03-241-4/+2
| | | | | * configure: Don't try to use -nopie in Darwin environments. With newer clang on the M1, it generates nuisance warnings.
* build: port to Apple M1.Kaz Kylheku2021-03-232-2/+15
| | | | | | | | | | * configure (lit_align): Also don't calculate lit_align as 2 on Darwin/arm64; that is a hack needed on 32 bit x86 Mac OS. * jmp.S: Define DEFUN for Arm64 on Darwin. We have to use %% as the statement separator; semicolons are comments. Use the same __aarch64__ code for __arm64__. We just need some Apple specific tidbits before and after.
* build: use yacc pattern rule with "y" as stem.Kaz Kylheku2021-03-221-9/+1
| | | | | | | | | | | | | | | | Thanks to a recent discussion in the GNU Make mailing list about some issue related to grouped patterns, I hit upon the realization that the rule y.tab.h y.tab.c: parser.y has a common stem, y, which can be exploited to write a pattern rule. This is only active in maintainer mode; in user mode, the y.tab.h and y.tab.c are separately produced from .shipped files. * Makefile (y.tab.h): Eliminate rule which detects a removed y.tab.h. (y.tab.c): Delete rule, replacing with new rule. (%.tab.c %.tab.h): New pattern rule, which groups the targets together.
* ffi: support float type as variadic argument.Kaz Kylheku2021-03-222-8/+69
| | | | | | | | | | | | | | | | | | | The float type promotes to double when passed as a variadic argument. This patch adds an internal FFI type which models that promotion. It uses double for its C type, while still performing the range checks for float. Also, the types be-float and le-float are rejected from being variadic arguments. * share/txr/stdlib/ffi.tl (analyze-argtypes): Rewrite function, adding validation and substitution for the variadic part of the argument type list. Map float type to double, and reject be-float and le-float. * txr.1: Documented.
* ffi: fix missing support for misaligned ushort.Kaz Kylheku2021-03-221-0/+2
| | | | | | | | | * ffi.c (ffi_ushort_get): Add the missing align_sw_get macro that checks for and handles misalignment. Commit e539fbc2af3bd4a5bd4ca3d9b567862005509aa1 made on 2017-06-05 added these macros and fixes to numerous functions. The commit message claims that ffi_ushort_get was among the functions fixed, but no change was made to the function.
* ffi: ffi_uchar_put: statement after declaration.Kaz Kylheku2021-03-221-1/+1
| | | | * ffi.c (ffi_uchar_put): Fix for C90 compat and consistency.
* compiler: improve end-propagating optimization.Kaz Kylheku2021-03-171-7/+13
| | | | | | | | | | | | | | | | | | | | When a jmp instruction is replaced by the end instruction that it jumps to, or possibly by a two-instruction sequence ending in end, there can be more opportunities to optimize. For instance, the second reduction in a sequence like this: mov t3, t5 mov t3, t5 jmp label --> end t3 --> end t5 ... label: end t3 * share/txr/stdlib/optimize.tl (basic-blocks peephole-block): If the end-propagation is done, we change the linkage of the current block to indicate that it has no next blocks. We add it to the rescan list and set the recalc flag so the liveness information is updated.
* compiler: use registers for function parameters.Kaz Kylheku2021-03-171-67/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | If a function has nothing but parameters that are not captured in lexical closures, they can be converted to registers. The function then doesn't need a variable frame for its parameters. This is similar to the eliminate-frame optimization, and borrows the same code and logic. * share/txr/stdlib/compiler.tl (compiler eliminate-frame): We no longer assume that the code coming in starts with a frame instruction we can eliminate using (cdr code) and an end instruction we can eliminate with a trailing pattern. This is because when this function is used for a lambda, this is not the case; a lambda's variable frame is implicit, created by the VM for any lambda with a nonzero frame size, rather than by a frame instruction. (compiler comp-let): In the call to eliminate-frame, we now trim away the first and last instruction, to get rid of the (frame ...) and (end ...). (compiler comp-lambda-impl): Install a closure spy against the variable frame to detect which variables are captured in closures, similarly to in comp-let. Under the right conditions, pass the code through eliminate-frame to turn the variables into registers. The close instruction has to be rewritten, because the frame size is now zero, and the number of t registers has changed.
* poll: iterate sequences efficiently.Kaz Kylheku2021-03-162-7/+11
| | | | | | | * sysif.c (poll_wrap): Use seq_iter for efficiency when poll_list is a vector or other generalized sequence. * txr.1: Change wording to say that poll-list is a sequence.
* compiler: split variable spies into two types.Kaz Kylheku2021-03-161-33/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have the situation that there are effectively two kinds of spies: let constructs plant spies only in order to learn about what variables are being captured, whereas lambdas plant spies in order to intercept variable accesses (and to inform the spies that are interested in what is captured). Let us split these up into two separate types, with different methods, in different stacks. * share/txr/stdlib/compiler.tl (struct var-spy): Renamed to closure-spy. (var-spy accessed, var-spy assigned): Methods removed: the closure-spy type has only the captured method. (struct capture-var-spy): Renamed to access-spy. (capture-var-spy var-spies): Renamed to access-spy closure-spies. (capture-var-spy captured): Method removed; the access spy doesn't receive captured calls. (struct compiler): Slot var-spies removed, replaced with closure-spies and access-spies. (with-var-spy): Macro removed. (with-spy): New function. (with-closure-spy, with-access-spy): New macros. (compiler push-var-spy, compiler pop-var-spy): Methods removed. (compiler push-closure-spy, compiler pop-closure-spy, compiler push-access-spy, compiler pop-access-spy): New methods. (compiler comp-var, compiler comp-setq, compiler comp-lisp1-setq, compiler comp-lisp1-value): Walk new access-spies list rather than var-spies to report about accesses and assignments. (compiler comp-let): Use with-closure-spy macro rather than with-var-spy. The spy object is now a closure-spy type, and the variable is cspy rather than vspy. (compiler comp-lambda-impl): Use with-access-spy instead of with-var-spy. The spy object is now of type access-spy. It refers to the current me.closure-spies from the compiler.
* tests: fix retest logic for parallel operation.Kaz Kylheku2021-03-161-7/+5
| | | | | | | | | I'm giving up and just using a recursive make invocation for the "retest" target. * Makefile (retest): Depend on nothing. Remove the tst directory and invoke make tests recursively. (tests.clean): Target removed.
* compiler: trim unused accumulation from var-spy.Kaz Kylheku2021-03-161-8/+2
| | | | | | | | | | | The var-spy structure is only being used for detecting captured variables, not assigned or accessed variables. So the information about accessed and assigned is being wastefully accumulated, never used. * share/txr/stdlib/compiler.tl (struct var-spy): Remove slots acc-vars and set-vars. (var-spy accessed, var-spy assigned): Methods become no-ops.
* compiler: eliminate unused list in optimizer.Kaz Kylheku2021-03-161-12/+9
| | | | | | | * share/txr/stdlib/optimize.tl (basic-blocks elim-dead-code): Ironically, we have a dead variable named unreachable in this function. Let's remove it and the build form which calculates it.
* doc: improve predicate pattern description.Kaz Kylheku2021-03-151-5/+10
| | | | | * txr.1: Fix wrong word; clarify what is first form and second form.
* doc: pattern matching: revise, take out of beta.Kaz Kylheku2021-03-151-67/+140
| | | | | | | | * txr.1: structural pattern matching intro is divided into sections, rearranged and reworded, with additional detail, especially in regard to variables. Also, "a the" typo fixed in list pattern notation section. Pattern matching is now taken out of beta; the paragraph about the beta status is removed.
* doc: formally document meta-syntax.Kaz Kylheku2021-03-151-7/+84
| | | | | | | | | * txr.1: Add a missing section defining meta-symbols, meta-numbers and meta-expressions, which explains the sys:var and sys:expr shorthand. Also documented are corner cases involving deceptive floating-point syntax. Everywhere in the document, we normalize "meta-variable" as "meta-symbol".
* cat-str: seq_iter conversion.Kaz Kylheku2021-03-142-13/+30
| | | | | | | * lib.c (cat_str): Traverse sequences of strings efficiently using seq_iter framework. * txr.1: Document.
* lib: fix neglect to use self variable.Kaz Kylheku2021-03-145-31/+33
| | | | | | | | | | | | | | | | * arith.c (toint): Use self instead of repeating function name in diagnostic. * ftw.c (ftw_wrap): Likewise. * glob.c (glob_wrap): Likewise. * rand.c (make_random_state, random): Likewise. * lib.c (nreverse, reverse, remove_if, int_str, less, chr_str, chr_str_set, unintern, rehome_sym, in): Likewise. (vector, obj_print_impl): Pass self to helper function instead of repeating identical literal.
* doc: rewrite macro intro.Kaz Kylheku2021-03-121-20/+31
| | | | * txr.1: Rewrote Macros section intro paragraphs.
* doc: mistaken reference to command-getKaz Kylheku2021-03-121-1/+1
| | | | | * txr.1: The description of command-get-buf mistakenly refers to command-get.
* lib: fix hard-coded cat-str in diagnostic.Kaz Kylheku2021-03-121-2/+2
| | | | | | * lib.c (cat_str_measure): Use the self parameter in diagnostics rather than cat-str, so errors are reported against the correct function.
* Version 254txr-254Kaz Kylheku2021-03-106-1231/+1262
| | | | | | | | | | * RELNOTES: Updated. * configure, txr.1: Bumped version and date. * share/txr/stdlib/ver.tl: Bumped. * txr.vim, tl.vim: Regenerated.
* compiler: eliminate unused global symbol accesses.Kaz Kylheku2021-03-101-1/+1
| | | | | | * share/txr/stdlib/optimize.tl (basic-blocks peephole-blocks): Extend dead reg mov pattern to also handle the getlx, getv, getf and getfb instructions.
* compiler: use effect-free criterion for elimination.Kaz Kylheku2021-03-102-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | When the result of a function call is not used, the call can be eliminated if the function has no effects. Effect-free functions are a superset of constant-foldable functions. Not all effect-free functions are const-foldable because in some cases, creating an object at run-time is a documented semantics which cannot be constant-folded. For instance (list 1) cannot be constant folded, because it may be relied upon to generate a fresh object each time it is called. However, bringing a new list to life is not an effect. If the value is not used, we can safely eliminate it. The same reasoning applies to (gensym "abc"). It must generate a unique symbol each time, and so cannot be constant-folded. But a call to gensym whose value is not used can be eliminated. * share/txr/stdlib/compiler.tl (%effect-free-funs%): List of side-effect-free functions registered in eval.c. (%effect-free%): Hash of effect-free functions, incorporating %const-foldable% also. * share/txr/stdlib/optimize.tl (basic-blocks peephole-block): Refer to %effect-free% rather than %const-foldable%.
* compiler: more const-foldable functions.Kaz Kylheku2021-03-101-6/+7
| | | | | * share/txr/stdlib/compiler.tl (%const-foldable-funs%): Add rest, nilf, tf, join, join-with, empty.
* compiler: eliminate unused closures.Kaz Kylheku2021-03-101-1/+10
| | | | | | | | | | | | | | | | | | | In this patch, we optimize away closures that are not used. Unused closures that have been hoisted to the top level are not affected. We look for close instructions which produce a dead treg, and rewrite these to jmp instructions to the same label. When this happens, we set a flag for a dead code elimination pass to be done again, to actually remove the now unreachable closure code. * share/txr/stdlib/optimize.tl (struct basic-blocks): New slot, reelim, indicating that dead code elimination is to be done again after peephole since the control flow graph has changed. (basic-blocks peephole-block): New pattern for eliminating a dead register, targeting the close instruction. (basic-blocks peephole): After peephole, check whether the reelim flag has been set and do elim-dead-code again.
* compiler: eliminate dead calls.Kaz Kylheku2021-03-102-74/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main idea in this patch is to identify calls to functions whose values are not used and which have no side effects. For this purpose, we borrow the same set of functions that we use as targets for constant folding: those in the compiler's %const-foldable% hash. A call is dead if it is a gcall or gapply instruction which calls one of these functions, and the destination register is dead. To maximize the opportunities for this elimination, whenever we eliminate such an instruction, we mark the block for re-scanning, and we re-calculate the liveness info for that block and then globally. * share/txr/stdlib/compiler.tl (struct compiler): New slots, datavec and symvec. (compiler get-datavec): Cache the calculated datavec in the new datavec slot. (compiler get-symvec): Cache the calculated symvec in the new symvec slot. (compiler optimize): Pass the symvec to the basic-blocks BOA constructor; it is now needed for identifying functions that are referenced by symvec index number in the gcall and gapply instructions. * share/txr/stdlib/optimize.tl (struct basic-blocks): New symvec slot, added to the BOA parameter list also. New slot recalc, indicating re-calculation of liveness is needed. (basic-blocks cut-block): Use pushnew to put bl onto the rescan list, to prevent duplicates. Also push the new block onto the rescan list. (basic-blocks local-liveness): This method can now be called again to re-calculate local liveness, so we must reset the live slot to nil. (basic-blocks calc-liveness): Take an optional list of blocks to scan for local liveness, defaulting to all of them. (basic-blocks peephole-block): Factor out some register liveness tests to local functions. Add a pattern targeting gcall and gapply instructions, testing for a dead register from a call to a %const-foldable% function. Use pushnew everywhere to add to the rescan list.
Set the recalc flag when the liveness-based reductions are applied. (basic-blocks peephole): If there are blocks to be scanned again, then if the recalc flag is set, recalculate local liveness for all the blocks to be re-scanned and re-do global liveness.
* sort: bugfix: broken for vectors/strings.Kaz Kylheku2021-03-101-1/+2
| | | | | * lib.c (sort): Return the copied and sorted object, not the original.
* lib: new functions join, join-with.Kaz Kylheku2021-03-095-17/+85
| | | | | | | | | | | | | | | | | | | | | | | | | That old cat-str function is often a pain, requiring the pieces as a list. We have a sys:fmt-join that is undocumented. That function is now exposed as usr:join, and documented. Also introducing join-with that takes a separator as the leftmost argument. Thus (op join-with "::") gives us a function that joins pieces with :: in between. * eval.c (eval_init): Register fmt_join function under join symbol also. Register join-with. * lib.c (join_with): New function. (fmt_join): Now a wrapper for join_with that passes a nil separator. * lib.h (join_with): Declared. * share/txr/stdlib/optimize.tl (basic-blocks join-blocks): Rename the local function join, which now triggers a warning about a standard function being redefined. * txr.1: Redocumented cat-str, and documented join-with and join.
* compiler: optimization control.Kaz Kylheku2021-03-084-107/+188
| | | | | | | | | | | | | | | | | | | | | | | | | | | * lisplib.c (compiler_set_entries): Register *opt-level* symbol for auto-loading. * share/txr/stdlib/compiler.tl (*opt-level*): New special variable. (compiler comp-let): Eliminate frames only at level 3. (compiler comp-lambda-impl): Lift load time at level 3. (compiler comp-arith-form): Constant-folding only at level 1. (compiler comp-fun-form): Algebraic substitutions and reductions and constant-folding only at level 1. (compiler comp-apply-call): Constant folding at level 1. (compiler optimize): Optimizations off if level zero. Thread jumps and eliminate dead code at level 2. Flow-analysis based optimizations at level 3. Additional optimizations at level 4. (compiler comp-block): Block elimination at level 3. (compile-toplevel): Rebind *opt-level*, giving it value zero if it is previously nil. * share/txr/stdlib/optimize.tl (basic-blocks get-insns): Just retrieve the instructions, letting caller decide whether to call late-peephole or not. * txr.1: Documented *opt-level*.
* Version 253txr-253Kaz Kylheku2021-03-066-297/+337
| | | | | | | | | | * RELNOTES: Updated. * configure, txr.1: Bumped version and date. * share/txr/stdlib/ver.tl: Bumped. * txr.vim, tl.vim: Regenerated.
* hashing: bug: hash-equal zero: floats and bignums.Kaz Kylheku2021-03-051-2/+2
| | | | | | * hash.c (equal_hash): Do not multiply by the seed if it is zero; substitute 1. Otherwise under the default seed of zero, the hash becomes zero.
* compiler: streamline load-time hoisting of calls.Kaz Kylheku2021-03-041-17/+23
| | | | | | | | * share/txr/stdlib/compiler.tl (compiler comp-fun-form): Rearrange the logic so that we only try the speculative compilation when the three main conditions are right, not before. This drastically reduces the number of times we need to take the compiler snapshot.
* compiler: bug: duplicate code in load-time lifting.Kaz Kylheku2021-03-041-5/+22
| | | | | | | | | | | | | | | | | | | | | | | This issue affects the original code which lifts lambdas to load-time, as well as the new, recently added code for similarly lifting functional combinator expressions. The problem is that the trick works by compiling an expression twice. The result of the first compile is thrown away in the case when we compile it again in the load-time context. But compiling has a side effect: the expression itself may have an embedded load-time-liftable expression, which gets deposited into the load-time fragment list. Thus garbage ends up in the list of load-time fragments. We likely want to save and restore other things, like allocated D registers. * share/txr/stdlib/compiler.tl (compiler snapshot, compiler restore): New methods. (comp-lambda-impl, comp-fun): Save a snapshot of the compiler state before doing the speculative compilation. If we don't use that compilation, we restore the state from the snapshot.
* compiler: frame depth bug in load-time.Kaz Kylheku2021-03-041-6/+3
| | | | | | | | | | | | | | * share/txr/stdlib/compiler.tl (compile-in-toplevel): This macro must not save and restore the nlev variable of the compiler. The nlev variable is a maximum: it measures the maximum environment depth, establishing the depth of the display needed when executing the code of the resulting VM description. By saving and restoring this variable around the compilation of a load-time form, we cause the load-time form's contribution to the maximum to be ignored. If the load-time form perpetrates nesting that is deeper than other code, it will execute with an under-sized display, causing an assertion crash or corruption.
* math: defend against locale decimal separator.Kaz Kylheku2021-03-041-0/+14
| | | | | | | | | | | | | | The int_flo function also has a sensitivity to locale because the bignum handling is text-based, involving printing a floating-point value, and then assuming it contains a period decimal separator. * arith.c (int_flo): If CONFIG_LOCALE_TOLERANCE is enabled, look for dec_point rather than '.' in the formatted number. When matching and destructuring the number with sscanf, don't look for the '.' character, but rather a complemented character set which can match nothing but the separator, whatever that is.
* lib: defend against locale-specific wcstod.Kaz Kylheku2021-03-043-3/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The wcstod function has locale-specific behavior. It uses a locale-specific decimal separator character, which may not be the period. Though in TXR we never call setlocale in order to activate localization, it is a good idea to put in code to defend against this. If locale is ever unleashed on the code, it really botches our floating-point handling. However, let's keep that defensive logic disabled for now using the preprocessor. The strategy is to inquire about the locale's decimal character at startup. Then, in the flo_str function, we preprocess the input by converting the decimal period to the locale-specific character before calling wcstod. On the output side, we also deal with it in the format function; we call sprintf, and then convert the locale-specific character to period. I tested all this by temporarily introducing the setlocale call, and switching to a locale with a comma as the separator, getting make tests to pass, and doing some interactive testing. This is not going to receive coverage in the test suite. * lib.c (dec_point): New global variable. (flo_str): If dec_point isn't '.', we must copy the string into a local buffer, which we get from alloca, and edit any '.' that it contains to dec_point. (locale_init): New function; initializes dec_point. (init): Call locale_init. * lib.h (dec_point): Declared. * stream.c (formatv): Don't look for a '.' in the result of printing a double using %e; look for dec_point. Post-process floating-point sprintf output by substituting occurrences of dec_point with the period.
* compiler: another late-peephole pattern.Kaz Kylheku2021-03-031-0/+21
| | | | | | | | | | | * share/txr/stdlib/optimize.tl (late-peephole): Add a 6-instruction pattern which is seen in pattern-matching code in optimize.tlo and compiler.tlo. This can be tidied up a little bit, reducing the instructions to 5, eliminating redundant comparisons and threading a jump. At this point we are probably working too hard for too little gain. More effort in other areas like common subexpression elimination could produce bigger wins.
* compiler: new late-peephole pass.Kaz Kylheku2021-03-031-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | I noticed a lot of this in compiled code: (if reg label1) label2 (jmp label3) label1 by inverting the test, this can be shortened to (ifq reg (t 0) label2) label1 We must keep label1 in case it is the target of other branches. Also, we only perform this if label2 is unreachable: the (jmp label3) instruction is reached only when the previous if branch is not taken. * share/txr/stdlib/optimize.tl (basic-blocks get-insns): Call new late-peephole method. (basic-blocks late-peephole): New method, incorporating the above pattern.
* compiler: lift uslot and umethod forms too.Kaz Kylheku2021-03-031-1/+1
| | | | | | | | | | | | | | | | The uslot and umethod functions produce functions; and should be lifted to load-time, if possible. For instance, the .foo expression [mapcar .foo list] translates to a (uslot 'foo) function call. This references no variables, and so is liftable to load-time. The umethod function is an extension of uslot that allows partially applied method arguments to be carried. If those arguments are all functional, the umethod call is liftable. * share/txr/stdlib/compiler.tl (%functional-funs%): Include umethod and uslot.
* lib: remove unnecessary load-time forms.Kaz Kylheku2021-03-032-10/+9
| | | | | | | | | | | | | | Because of the previous optimization, some load-time forms that appear in the library are unnecessary. * share/txr/stdlib/optimize.tl (basic-blocks merge-jump-thunks): Remove load-time around functional combinators. * share/txr/stdlib/socket.tl (sys:in6addr-condensed-text): Remove one load-time that is now unnecessary, and one around an op which was unnecessary even at the time it was written, since the lambda is lifted without it.
* compiler: lift functional expressions to load-time.Kaz Kylheku2021-03-031-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea behind this optimization is that certain expressions that only calculate functions can be hoisted to load time. These expressions meet these criteria: 1. Are not already in a top-level or load-time context. 2. Are function calls to a standard library functional operator like chain, andf, orf, juxt, ... 3. Do not access any variables. 4. Do not access any functions other than public (usr package) global functions in the standard library. An example of such an expression might be: [chain cdr [iff symbolp list]] If such an expression is embedded in a function, we don't want the function to re-calculate it every time, which requires time and generates garbage. We can transform it to the equivalent of: (load-time [chain cdr [iff symbolp list]]) to have it calculated once. * share/txr/stdlib/compiler.tl (%functional-funs%, %functional%): New global variables. (compiler comp-fun-form): After compiling the function call, check for the conditions for lifting. If so, compile the form again as a load-time literal. The logic is similar to how lambdas are lifted to load-time, though the conditions are different.
* compiler: merge duplicate jump blocks.Kaz Kylheku2021-03-022-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've noticed that some duplicate short blocks are generated, which look like this. Often they are consecutive, which is why they are noticeable. These can become one block: mov t17 d3 jmp label17 mov t17 d3 jmp label17 mov t17 d3 jmp label17 We identify identical blocks by looking for short instruction sequences that end in an unconditional jmp, and then we group duplicates in a hash table keyed on the instruction sequence. We must ignore the label: the first instruction in each block is a unique label. We have to be careful about which ones to delete. Any block that is entered from the top must be preserved. When these blocks are identified, at least one block must remain that is removable for the optimization to be able to do anything. If all blocks are removable, we pick a leader which is preserved. Otherwise we pick a leader from the preserved blocks. The non-preserved blocks are deleted, and all jumps to them from other blocks are redirected to jumps to the leader. * share/txr/stdlib/compiler.tl (optimize): Call merge-jump-thunks as the last pass. * share/txr/stdlib/optimize.tl (basic-blocks merge-jump-thunks): New method.
* compiler: basic-block print method for debugging.Kaz Kylheku2021-03-021-1/+10
| | | | | | * share/txr/stdlib/optimize.tl (basic-block print): Print basic blocks such that related blocks are printed by their label rather than the whole graph.