When functions are optimized away due to constant folding,
instead of replacing them with a nil, we now compact the
table to close the gaps and renumber the references in the
code.
* stdlib/compiler.tl (compiler null-stab): Method removed.
(compiler compact-dregs): Renamed to compact-dregs-and-syms.
Now compacts the symbol table also. This is combined with
D-reg compacting because it makes just two passes through
the instructions: a pass to identify the used D registers
and symbol indices, and then another pass to edit the
instructions with the renamed D registers and renumbered
symbol indices.
(compiler optimize): Remove the call to null-unused-data
on the basic-blocks object; nulling out D regs and symbol
table entries is no longer required. Follow the rename of
compact-dregs to compact-dregs-and-syms which is called
the same way otherwise.
* stdlib/optimize.tl (basic-blocks null-unused-data):
Method removed; no longer used.
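To illustrate the two-pass scheme, here is a minimal C sketch; it is not the actual TXR Lisp code in compiler.tl, and the instruction representation is hypothetical. Pass one marks the referenced D registers and assigns dense new indices; pass two rewrites the operands through that map.

  /* Illustrative sketch only, not TXR's code: two-pass compaction
     of a D-register table, over a made-up instruction format in
     which each instruction carries an array of D-reg operands. */
  #include <stddef.h>

  struct insn {
      size_t ndregs;
      size_t dreg[4];      /* indices into the D-register table */
  };

  size_t compact_dregs(struct insn *code, size_t ninsns,
                       size_t ndregs_old, size_t *map /* ndregs_old */)
  {
      size_t i, j, next = 0;

      for (i = 0; i < ndregs_old; i++)
          map[i] = (size_t) -1;            /* -1 means unused */

      for (i = 0; i < ninsns; i++)         /* pass 1: mark used regs */
          for (j = 0; j < code[i].ndregs; j++)
              map[code[i].dreg[j]] = 0;

      for (i = 0; i < ndregs_old; i++)     /* assign dense new indices */
          if (map[i] != (size_t) -1)
              map[i] = next++;

      for (i = 0; i < ninsns; i++)         /* pass 2: rewrite operands */
          for (j = 0; j < code[i].ndregs; j++)
              code[i].dreg[j] = map[code[i].dreg[j]];

      return next;                         /* new D-register count */
  }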
We now have some constant folding in the optimizer too, not
just in the front end compiler pass. This is leaving behind
dead D registers that are not referenced in the code.
Let's compact the D register table to close the gap.
* stdlib/compiler.tl (compiler get-dreg): In this function
we no longer check that we have allocated too many D
registers. We let the counter blow past %lev-size%,
because this creates a fighting chance that the compaction
of D regs will reduce their number to %lev-size% or less.
By doing this, we allow code to be compilable that otherwise
would not be: code that allocates too many D regs which
are then optimized away.
(compiler compact-dregs): New function. Does all the work.
(compiler optimize): Compact the D regs at optimization
level 5 or higher.
(compile-toplevel): Check for an overflowing D reg count
here, after optimization.
* stdlib/optimize.tl (basic-blocks null-unused-data):
Here, we no longer have to do anything with the D registers.
* stdlib/compiler.tl (compiler get-dreg): Fix
indentation problem.
* stdlib/optimize.tl (basic-block fill-treg-compacting-map):
Likewise.
* hash.c (hash_iter_peek): The loop here must be a top-test
while loop, not a bottom-test do loop. In the chained hashing
implementation, this was a do loop, but it also had a test
with a break for the index.
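A toy sketch of why the test must come first (illustrative C, not the actual hash.c code): when the iterator's index already equals the table size, a bottom-test do loop would examine a slot before checking the bound, reading past the end of the table.

  /* Minimal sketch: peeking at the next nonempty slot of an
     open-addressed table. The top-test while checks the bound
     before every access; a do-while would touch table[index]
     even when index == size. */
  #include <stddef.h>

  struct toy_iter {
      void **table;     /* slots; NULL means empty */
      size_t size;
      size_t index;     /* next slot to examine */
  };

  void *toy_iter_peek(struct toy_iter *it)
  {
      size_t i = it->index;

      while (i < it->size) {        /* test before each access */
          if (it->table[i] != NULL)
              return it->table[i];
          i++;
      }
      return NULL;                  /* exhausted */
  }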
* stdlib/path-test.tl (path-volume): Don't return
:abs for a path whose empty first component
isn't followed by any more items. Otherwise
we return :abs for a path formed by splitting
the empty string, and then calls like (rel-path "" "a")
complain about a mixture of absolute and relative.
With this change, empty paths given to rel-path
behave as if they were ".".
* tests/018/rel-path.tl: New test cases.
This commit does two things. The replace function, implemented
under the hood by four specializations: replace-list, replace-vec,
replace-str and replace-buf, will handle the index-list case
a little differently. This is needed to fix the ability of the
del macro to work on a place designated by an index list, such as:
(del [sequence '(1 3 5 6)])
which now deletes elements 1, 3, 5 and 6 from the sequence,
and returns a sequence of those items. The underlying
implementation uses replace with an index-list, which is now
capable of deleting items. Previously, replace would stop
processing the index list when the replacement-sequence
corresponding to the index list ran out of items. Now,
when the replacement-sequence runs out of items, the
remaining index-list sequence elements specify items to
be deleted. For instance if str holds "abcdefg" then:
(set [str '(1 3 5)] "xy")
will change str to "axcyeg". Elements 1 and 3 are replaced
by x and y, respectively. Element 5, the letter f, is
deleted, because the replacement "xy" has no element
corresponding to 5.
* lib.c (replace_list, replace_str, replace_vec): Implement
new deletion semantics for the case when the replacement
sequence runs out of items.
* buf.c (replace_buf): Likewise.
* tests/010/seq.txr: Some new test cases here for
deletion.
* tests/010/seq.expected: Updated.
* txr.1: Documented new semantics of replace, including
a new restriction that if elements are being deleted,
the indices should be monotonically increasing regardless
of the type of the sequence (not only list).
A value of 289 for the -C option is documented; it restores
the previous behavior of replace (breaking deletion by
index-list, unfortunately: you don't always get to
simulate an old version of TXR while using new features).
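A sketch of the new semantics on a plain C string (hypothetical helper; the real implementation is the replace_* specializations operating on TXR sequences, and the indices must be monotonically increasing):

  #include <stdio.h>
  #include <string.h>

  void replace_by_indices(char *str, const size_t *idx, size_t nidx,
                          const char *repl)
  {
      size_t nrepl = strlen(repl);
      size_t len = strlen(str);
      size_t i, out = 0, k = 0;       /* k walks the index list */

      for (i = 0; i < len; i++) {
          if (k < nidx && i == idx[k]) {
              if (k < nrepl)
                  str[out++] = repl[k];  /* replacement item available */
              /* else: replacement ran out; element is deleted */
              k++;
          } else {
              str[out++] = str[i];
          }
      }
      str[out] = 0;
  }

  int main(void)
  {
      char str[] = "abcdefg";
      size_t idx[] = { 1, 3, 5 };
      replace_by_indices(str, idx, 3, "xy");
      puts(str);                      /* prints "axcyeg" */
      return 0;
  }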
* txr.1: Fix misspellings of "arithmetic" and "algorithm"
introduced by recent updates: the new math functions,
and the documentation of CRC-32 parameters.
* stdlib/optimize.tl (basic-blocks do-peephole-block): Extend
constant-folding case to recognize gapply as well as gcall.
We just have to take care in how we pass the apply arguments
to the actual function to get the value.
Let's consider the dwim expression [a . b c]. Without
this change we get:
syms:
0: a
1: b
2: c
code:
0: 98020000 getf t2 0
1: 98030001 getf t3 1
2: 98040002 getf t4 2
3: 1C020002 apply t2 t2 t3 t4
4: 00030002
5: 00000004
6: 10000002 end t2
With this change:
syms:
0: a
1: b
2: c
code:
0: 98030001 getf t3 1
1: 98040002 getf t4 2
2: 24020002 gapply t2 0 t3 t4
3: 00030000
4: 00000004
5: 10000002 end t2
There are 17 hits for this optimization in optimize.tl
alone!
* stdlib/optimize.tl (basic-blocks do-peephole-block):
New pattern here. We recognize an instruction sequence
which begins with a (getf treg idx) and ends in
an (apply dest treg ...), without any instructions in
between accessing or redefining treg. Additionally,
the treg value must not be used after the apply,
unless the apply redefines it. In that case, we
rewrite this pattern to eliminate that initial getf
instruction, and substitute (gapply dest idx ...)
for the apply.
* eval.c (op_defsymacro, rt_defsymacro, makunbound,
fmakunbound): Don't call vm_invalidate_binding if there
is no binding for the symbol.
* lib.c (obj_print_impl): Check for this case and handle it.
The #nil syntax is not readable.
* eval.c (op_defsymacro, rt_defsymacro): We must call
vm_invalidate_binding so the VM forgets a cached binding
for this variable.
* tests/019/redef.tl: Test added.
There was a bug in rt_defun in that it was not calling
vm_invalidate_binding. This meant that compiled functions
were not picking up redefinitions. This bug is fixed now
because rt_defun now calls sethash on the top_fb directly,
which modifies the existing binding cell; it is not
allocating a new cell.
We put in new test cases to confirm the proper redefinition
behaviors.
The proper redefinition behavior exposes an issue in
pattern matching.
* tests/019/redef.tl: New file.
* stdlib/match.tl (transform-quote): This function's compiled
image, when deposited into a .tlo file, becomes incorrect
because (sys:hash-lit) turns into #H() syntax, which reads
back as something else. In other words, (sys:hash-lit)
doesn't have print-read consistency and so doesn't
externalize. To fix this right we would need a print mode
which ensures machine readability rather than human
readability, like in Common Lisp. For now, we just break up
the pattern so that it's not a literal match. This bug was
hidden due to the redefinition issue. When match.tl is
being compiled, it defines non-triv-pat-p twice. Due to
redefinitions not kicking in properly, the first definition
of non-triv-pat-p remains in effect for some functions.
When transform-qquote is being expanded, the (sys:hash-lit)
pattern is treated as non-trivial, even though it
is trivial, and so it is turned into pattern matching code.
The code doesn't contain a (sys:hash-lit) literal and so
the issue doesn't occur.
As was done with the function and variable top-level
environments, we simplify the macro ones.
* eval.c (func_get_name, lookup_mac, lookup_symac,
lookup_symac_lisp1, op_defsymacro, rt_defsymacro,
rt_defmacro, op_defmacro, reg_mac, reg_symacro):
Adjust to simpler representation where the hash cell
itself is the binding cell, rather than holding a
binding cell in its cdr.
Since their inception, the top_vb and top_fb hashes associated
symbols with binding cells, causing an extra cons cell to be
allocated for each entry. I don't remember why this is. It
might have been that way so that gethash(top_fb, sym) would
return a cell when the variable exists, or else nil. This was
before we had functions like gethash_e and inhash that return
the hash cell itself. A hash cell is also a cons and can serve
as a binding just fine by itself. Let's make it so.
For now, the macro and symbol macro environments stay the
way they are; I will likely convert them also.
* eval.c (env_fbind, env_vbind, lookup_global_var, lookup_sym_lisp1,
lookup_fun, func_get_name, rt_defv, rt_defun, set_symbol_value,
reg_fun, reg_varl): Update all these functions so they treat the
hash cell from top_vb or top_fb as the binding cell, rather than
putting or expecting the cdr of that cell (i.e. the hash value)
to be a binding cell.
* hash.[ch] (gethash_d): New function. Just gethash_e without the
pesky self argument, which would only be needed for error
reporting if we pass an object that isn't a hash.
* stdlib/place.tl (sys:get-fun-getter-setter, sys:get-vb):
These two functions must be adjusted since they refer to the
top-fb and top-vb variables. sys:get-vb isn't used anywhere;
it continues to exist in order to provide run-time support
to files that were compiled with a buggy version of the
symbol-value place.
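The representation change, sketched with a toy cell type (illustrative only; TXR's actual cons and hash structures differ):

  /* A hash entry is itself a cons-like cell (key . value), so it
     can serve directly as a binding: no extra cons per entry. */
  struct cell { void *car, *cdr; };   /* car: symbol, cdr: value */

  /* Before: gethash(top_fb, sym) yielded the hash *value*, a
     separate binding cons. After: a gethash_e/gethash_d-style
     lookup yields the hash cell itself, which already pairs the
     symbol with its value. */
  static void *binding_lookup(struct cell **tab, unsigned n, void *sym)
  {
      unsigned i;
      for (i = 0; i < n; i++)
          if (tab[i] && tab[i]->car == sym)
              return tab[i];          /* the hash cell is the binding */
      return 0;
  }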
* eval.c (rt_defv): Report as sys:rt-defv, not sys:defv.
* stdlib/optimize.tl (basic-blocks do-peephole-block): The
constant folding case should fire even if some of the
arguments of the call aren't D registers but T0.
* stdlib/compiler.tl (%effect-free-funs%, %effect-free%,
%functional-funs%, %functional%): Move variables
into stdlib/constfun.tl
* stdlib/constfun.tl (%effect-free-funs%, %effect-free%,
%functional-funs%, %functional%): Moved here.
* stdlib/optimize.tl: Use load-for to express dependency
on constfun module; don't depend on the compiler having
loaded it.
The compiler handles trivial constant folding over the
source code, as a source to source transformation.
However, there are more opportunities for constant folding
after data flow optimizations of the VM code.
Early constant folding will not fold, for instance,
(let ((a 2) (b 3)) (* a b))
but we can reduce this to an end instruction that returns
the value of a D register that holds 6. Data flow optimizations
will propagate the D registers for 2 and 3 into the gcall
instruction. We can then recognize that we have a gcall with
nothing but D register operands, calling a constant-foldable
function. We can allocate a new D register to hold the result
of that calculation and just move that D register's value
into the target register of the original gcall.
* stdlib/compiler.tl (compiler get-dreg): When allocating
a new D reg, we must invalidate the datavec slot which is
calculated from the data hash. This didn't matter before,
because until now, get-datavec was called after compilation,
at which point no new D regs will exist. That is changing;
the optimizer can allocate D regs.
(compiler null-dregs, compiler null-stab): New methods.
(compiler optimize): Pass self to constructor for basic-blocks.
basic-blocks now references back to the compiler.
At optimization level 5 or higher, constant folding can
now happen, so we call the new method in the optimizer to
null the unused data. This overwrites unused D registers
and unused parts of the symbol vector with nil.
* stdlib/optimize.tl (basic-blocks): Boa constructor now takes
a new leftmost param, the compiler.
(basic-blocks do-peephole-block): New optimization case:
gcall instruction invoking const-foldable function, with
all arguments being dregs.
(basic-blocks null-unused-data): New method.
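A sketch of the folding step on a hypothetical three-address VM (the real optimizer is TXR Lisp working on basic blocks; all names here are made up): if a call names a const-foldable function and its operands are all D registers, compute the result now, park it in a fresh D register, and rewrite the call as a move.

  enum opcode { OP_GCALL, OP_MOVE };

  struct vinsn {
      enum opcode op;
      int dest;                /* target register */
      int fn;                  /* function table index (OP_GCALL) */
      int arg[2];              /* assumed to be D-register operands */
      int src;                 /* source register (OP_MOVE) */
  };

  typedef long (*pure_fn)(long, long);

  /* dregs: the constant table; *ndregs may grow by one. */
  void fold_gcall(struct vinsn *in, long *dregs, int *ndregs,
                  const pure_fn *fntab, const int *foldable)
  {
      if (in->op == OP_GCALL && foldable[in->fn]) {
          long result = fntab[in->fn](dregs[in->arg[0]],
                                      dregs[in->arg[1]]);
          int d = (*ndregs)++;      /* allocate a new D register */
          dregs[d] = result;
          in->op = OP_MOVE;         /* gcall becomes a plain move */
          in->src = d;              /* dest unchanged: the gcall's target */
      }
  }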
* configure: Detect all the new functions, with separate
tests for the unary and binary ones.
* arith.c (cbrt_s, erf_s, erfc_s, exp10_s, exp2_s,
expm1_s, gamma_s, j0_s, j1_s, lgamma_s, log1p_s, logb_s,
nearbyint_s, rint_s, significand_s, tgamma_s, y0_s, y1_s,
copysign_s, drem_s, fdim_s, fmax_s, fmin_s, hypot_s,
jn_s, ldexp_s, nextafter_s, remainder_s, scalb_s, scalbln_s,
yn_s, r_copysign_s, r_drem_s, r_fdim_s, r_fmax_s, r_fmin_s,
r_hypot_s, r_jn_s, r_ldexp_s, r_nextafter_s, r_remainder_s,
r_scalb_s, r_scalbln_s, r_yn_s): New symbol variables.
(not_available): New static function.
(cbrt_wrap, erf_wrap, erfc_wrap, exp10_wrap, exp2_wrap,
expm1_wrap, gamma_wrap, j0_wrap, j1_wrap, lgamma_wrap,
log1p_wrap, logb_wrap, nearbyint_wrap, rint_wrap,
significand_wrap, tgamma_wrap, y0_wrap, y1_wrap,
copysign_wrap, drem_wrap, fdim_wrap, fmax_wrap,
fmin_wrap, hypot_wrap, jn_wrap, ldexp_wrap,
nextafter_wrap, remainder_wrap, scalb_wrap, scalbln_wrap,
yn_wrap): New static functions.
(arith_set_entries, arith_instantiate): New static functions.
(arith_init): Initialize symbols and instantiate functions
via autoload mechanism. In a program that doesn't use the
functions, we suffer only the overhead of interning the symbols.
* lib.h (UNUSED): New macro for GCC unused attribute.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
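The configure-driven pattern, sketched with a hypothetical HAVE_ERF macro and a stand-in error function (the real wrappers operate on Lisp values and throw Lisp exceptions rather than aborting):

  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* stand-in for the error path; the real not_available would
     throw a Lisp exception */
  static void not_available(const char *name)
  {
      fprintf(stderr, "%s: not available on this platform\n", name);
      abort();
  }

  #if HAVE_ERF
  static double erf_wrap(double x)
  {
      return erf(x);               /* the C library provides it */
  }
  #else
  static double erf_wrap(double x)
  {
      (void) x;
      not_available("erf");        /* never returns */
      return 0.0;
  }
  #endif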
* eval.c (eval_init): We have a repeat_s variable; no need
to call intern(lit("repeat"), user_package).
* chksums/md5.c (encode, decode): Fix these functions so they
actually swap the byte order of 4 byte words on big endian.
Tested on PPC64.
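For reference, byte-order-independent encode/decode of MD5's 32-bit words looks like this (illustrative, not the actual chksums/md5.c code): MD5 defines its words as little endian, so serialization must be explicit rather than a memcpy of host-order words.

  #include <stdint.h>
  #include <stddef.h>

  static void encode(uint8_t *out, const uint32_t *in, size_t nwords)
  {
      size_t i;
      for (i = 0; i < nwords; i++) {
          out[4*i + 0] = (uint8_t)(in[i] >>  0);  /* LSB first */
          out[4*i + 1] = (uint8_t)(in[i] >>  8);
          out[4*i + 2] = (uint8_t)(in[i] >> 16);
          out[4*i + 3] = (uint8_t)(in[i] >> 24);
      }
  }

  static void decode(uint32_t *out, const uint8_t *in, size_t nwords)
  {
      size_t i;
      for (i = 0; i < nwords; i++)
          out[i] = (uint32_t)in[4*i + 0]
                 | (uint32_t)in[4*i + 1] << 8
                 | (uint32_t)in[4*i + 2] << 16
                 | (uint32_t)in[4*i + 3] << 24;
  }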
* eval.c (me_load_for): An object which is not one of the
valid clause symbols is not necessarily a symbol; don't
call it one in the diagnostic.
* stdlib/struct.tl (sys:check-slot): Similarly, an object that
isn't the name of a struct slot isn't necessarily a
symbol; don't call it one.
* lib.c (unique): Use seq_iter_t rather than dividing into
list and vector cases.
* hash.c (group_reduce): Use seq_iter_t instead of
obsolete vector and list iteration.
* txr.1: Use 1..11 range in one group-reduce
example instead of (range 1 10).
* hash.c (group_by): Replace obsolete list/vector switching
with seq_iter_t iteration. The group-by function is limited
otherwise.
* txr.1: Update group-by documentation not to assert that
the sequence argument is required to be a list or vector.
Examples for group-by and group-map now use a 0..11 range
instead of (range 0 10).
* genchksum.txr, chksum.c: The declarations of the symbol
variables, COBJ class handles, and most of the chksum_init
function are also generated now. A comment is added explaining
what parts of the file are generated, and what parts are
scraped, and giving some advice to the maintainer.
A giant macro that generates numerous functions and variables
is hell for anyone who has to debug the code. The right
approach is external code generation, where the result is
a clean source file that is checked into the repository.
* genchksum.txr: New file.
* chksum.c: Now generated.
(chksum_impl): Macro and its invocations removed.
(sha1_stream_impl, sha1_stream, sha1_szmax_upd,
sha1_buf, sha1_str, sha1, sha1_ops,
sha1_begin,
sha1_utf8_byte_callback, sha1_hash,
sha1_end): These are directly defined again.
(sha256_stream_impl, sha256_stream, sha256_szmax_upd,
sha256_buf, sha256_str, sha256, sha256_ops, sha256_begin,
sha256_utf8_byte_callback, sha256_hash, sha256_end): Likewise.
(md5_stream_impl, md5_stream, md5_szmax_upd,
md5_buf, md5_str, md5, md5_ops, md5_begin,
md5_utf8_byte_callback, md5_hash, md5_end): Likewise.
* txr.1: SHA-1 functions documented.
* stdlib/doc-syms.tl: Updated.
* txr.1: SHA256 should be SHA-256 in several places.
* txr.1: Specify which crc32 is implemented: which polynomial,
bit direction, and complementation of the output.
* tests/013/chksum.tl: New file.
* chksums/sha1.c, chksums/sha1.h: New files.
* LICENSE, METALICENSE: Mention new code.
* chksum.c (sha1_ctx_s, sha1_ctx_cls): New static variables.
(chksum_init): Register the sha1-ctx symbol, and the sha1_ctx_cls
COBJ class. Register sha1-stream, sha1, sha1-begin, sha1-hash
and sha1-end intrinsics.
(sha1_stream_impl, sha1_stream, sha1_szmax_upd,
sha1_buf, sha1_str, sha1, sha1_ops, sha1_begin,
sha1_utf8_byte_callback, sha1_hash, sha1_end): These
functions and variables are generated by a call to the
chksum_impl macro.
* Makefile (OBJS): add chksums/sha1.o object file.
* chksum.c (chksum_impl): New macro. Two calls to
this macro generate the implementations of SHA256 and MD5.
The code is identical.
(sha256_stream_impl, sha256_stream, sha256_szmax_upd,
sha256_buf, sha256_str, sha256, sha256_ops, sha256_begin,
sha256_utf8_byte_callback, sha256_hash, sha256_end):
Directly written definitions removed, now generated by macro.
(md5_stream_impl, md5_stream, md5_szmax_upd,
md5_buf, md5_str, md5, md5_ops, md5_begin,
md5_utf8_byte_callback, md5_hash, md5_end): Likewise.
Some OS distros have switched to a library called libxcrypt,
which, instead of returning null on failure, like the Glibc
crypt, returns a valid "failure token" string starting with
a * character, which is guaranteed to be different from the
salt argument.
We should check for this so that we handle errors uniformly.
Users are reporting failing crypt tests that are coming
up with "*0" instead of throwing an exception.
* sysif.c (crypt_wrap): Only accept a non-null result if
it isn't one of the two strings "*0" and "*1".
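The check described above amounts to this shape (illustrative sketch; the real crypt_wrap deals in Lisp strings and raises a Lisp exception):

  #include <string.h>
  #include <crypt.h>   /* glibc/libxcrypt; some systems declare crypt() in unistd.h */

  const char *crypt_checked(const char *key, const char *salt)
  {
      const char *hash = crypt(key, salt);

      /* null: classic glibc failure; "*0"/"*1": libxcrypt failure tokens */
      if (hash == 0 || strcmp(hash, "*0") == 0 || strcmp(hash, "*1") == 0)
          return 0;    /* caller reports the error uniformly */
      return hash;
  }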
* RELNOTES: Updated.
* configure (txr_ver): Bumped version.
* stdlib/ver.tl (lib-version): Bumped.
* txr.1: Bumped version and date.
* txr.vim, tl.vim: Regenerated.
* protsym.c: Regenerated.
* lib.c (seq_kind_tab): Map the RNG type to SEQ_VECLIKE
rather than SEQ_NOTSEQ. Before this one-liner change,
ranges were needlessly rejected for iteration.
For instance, we can now do (take 42 "00".."99").
Note that ranges are not literally vector-like; they
don't support the ref operation. The effect of this change
is simply that they are not rejected due to being
SEQ_NOTSEQ. The iteration code already handles RNG
specially, so the SEQ_VECLIKE classification doesn't
actually matter.
* lib.c (dwim_set): Handle seq argument being an integer
or range.
* tests/012/callable.tl: A few tests.
* txr.1: Documented.
* lib.c (do_generic_funcall): Allow integers and ranges
to be function callable. They take one argument and
index into it or extract a slice. In the case of ranges,
this is a breaking change. Ranges can already be used
in the function position in some limited ways that are
not worth preserving.
* tests/012/callable.tl: New file.
* tests/012/iter.tl: Here we fix two instances of
breakage. Using txr -C 288 will restore the
behaviors previously tested here.
* txr.1: Documented.
These functions are useful when sorting a sequence
using an expensive keyfun.
* autoload.c (csort_set_entries, csort_instantiate):
New static functions.
(autoload_init): Register autoloading of csort module
via new functions.
* stdlib/csort.tl: New file.
* tests/012/sort.tl: csort functions included in tests.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
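The cached-key idea is classic decorate-sort-undecorate: compute the expensive key once per element, sort the (key, element) pairs, then strip the keys. A C sketch (illustrative; the real csort functions live in stdlib/csort.tl and work on Lisp sequences):

  #include <stdlib.h>

  struct pair { double key; void *elem; };

  static int cmp_pair(const void *a, const void *b)
  {
      double ka = ((const struct pair *) a)->key;
      double kb = ((const struct pair *) b)->key;
      return (ka > kb) - (ka < kb);
  }

  /* keyfun is called exactly once per element, instead of once per
     comparison as with a keyfun-recomputing comparator. */
  void csort(void **elems, size_t n, double (*keyfun)(void *))
  {
      struct pair *tmp = malloc(n * sizeof *tmp);
      size_t i;

      if (!tmp) return;                      /* allocation failed */
      for (i = 0; i < n; i++) {
          tmp[i].key = keyfun(elems[i]);     /* decorate */
          tmp[i].elem = elems[i];
      }
      qsort(tmp, n, sizeof *tmp, cmp_pair);  /* sort by cached key */
      for (i = 0; i < n; i++)
          elems[i] = tmp[i].elem;            /* undecorate */
      free(tmp);
  }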
hash-map converts a function mapping over a sequence
into a hash table.
* hash.[ch] (hash_map): New function.
* tests/010/hash.tl: Test case.
* genman.txr: The hash-map identifier introduces
a hash collision. We have to deal with that somehow now.
(colli): We put the conflicting entries into a new hash called
colli which maps them to an increment value.
(hash-title): Increment the hash code h by the amount
indicated in colli, if the title is found there.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.
* lib.c (equal): Several cases which react to the
type of the left argument have a default path which
wrongly short-circuits to an early return.
All these cases must break through to the logic
at the end of the function which tests the right side
for a possible equality substitution.
* tests/012/struct.tl: One breaking test case added.
equal was found to return nil for two structures
that have equal lists as their equality substitute.
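The control-flow shape of the fix, with toy types (not lib.c's actual equal): left-keyed cases must not default to an early return of nil; they must break to the common tail that tries the right operand's equality substitute.

  typedef struct obj { int type; struct obj *equal_sub; } obj;

  int equal_sketch(obj *left, obj *right)
  {
      if (left == right)
          return 1;

      switch (left->type) {
      case 1:
          /* ... type-specific tests; return 1 on a definite match ... */
          break;        /* the bug was: return 0; short-circuiting here */
      default:
          break;
      }

      /* common tail: give the right side a chance to substitute */
      if (right->equal_sub != 0)
          return equal_sketch(left, right->equal_sub);
      return 0;
  }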
* gc.[ch] (gc_prot_array_alloc): Return the COBJ via
new pointer argument.
* lib.c (ssort_vec): Capture the object from gc_prot_array_alloc
into a local variable. That makes it visible to the garbage
collector, so it won't be prematurely reclaimed. Since we don't
use or return that object, we need to use gc_hint.
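The liveness issue, sketched in toy code (the signatures of gc_prot_array_alloc and gc_hint are assumed here, not quoted): without the local, the collector could reclaim the backing object mid-sort; without the hint, the compiler may end the variable's live range early because nothing reads it afterwards.

  #include <stddef.h>

  typedef void *val;

  extern void *gc_prot_array_alloc(size_t n, val *out_obj);
  extern void gc_hint(val v);   /* assumed: forces v to stay live here */

  void ssort_sketch(size_t n)
  {
      val guard;                /* keeps the COBJ reachable */
      void **arr = gc_prot_array_alloc(n, &guard);

      /* ... sort using arr ... */

      gc_hint(guard);           /* guard considered live until this point */
  }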
With this change we fix the bug that the debugger commands
yield their Lisp forms rather than evaluating them.
* eval.c (eval_intrinsic): Takes one more argument,
the macro environment. This is passed into env_to_menv
as the root macro environment.
(eval_init): Update registration of eval intrinsic
to have two optional arguments.
* eval.h (eval_intrinsic): Declaration updated.
* parser.c (read_file_common, read_eval_ret_last): Pass
nil argument to new parameter of eval_intrinsic.
(repl): Pass the env parameter as the new menv
parameter of eval_intrinsic, rather than the existing
env parameter. This fixes the command dispatch in
the debugger, since the command table consists of
symbol macros, and not variables. For instance the
backtrace command bt is a binding of the bt symbol
to the form (sys:print-backtrace), which has to be
substituted for it and executed. When that environment
is used as the ordinary environment, bt looks like
a variable whose value is the list (sys:print-backtrace).
* parser.y (elem, check_parse_time_action): Fix
eval_intrinsic calls.
* txr.c (txr_main): Likewise.
* txr.1: Documented.
* y.tab.c.shipped: Updated.
* stdlib/doc-syms.tl: Updated.
* hash.c (hash_remove): In the algorithm that moves cells
forward in the cluster in order to avoid the use of
tombstones, we use slightly better variables. The cell
we are looking to replace is called wipe rather than
start. The cryptic variable name k is renamed to iprobe,
because it is the initial probe location for the cell
at position i.
The original chained hashing scheme makes certain
guarantees in situations where a hash table that is being
iterated is also undergoing insertions or deletions.
The original scheme meets these requirements simply,
by putting a freeze against hash table growth while
there are outstanding iterations. Chained hashing
gracefully handles load factors above 1.
Load factors above 1 are not possible under open
addressing (and we don't even want to approach 1)
but we would like to preserve the requirements.
The requirements are:
1. If an iterator has already visited an item, it
will not see that item again, regardless of insertions
which cause the table to be reorganized.
2. It is not specified/required that an iterator will
visit newly inserted items. It may visit some of those
items, but not others.
3. If an iterator has not yet visited an item, and
that item is deleted, it will not see that item,
regardless of any insertions that reorganize the table.
In this commit, we implement a "table stack" scheme.
1. When a table is resized due to insertions, and
it is being iterated (h->usecount > 0), in that situation
it will push the existing small table onto a stack,
the h->tblstack (table stack).
2. Iterators take a hash table's current table and its
size, and work with that snapshot of the table.
If the original hash table grows, existing iterators
work with the original table as it existed just before
the reorganization. So after that they do not see any
new insertions.
3. Whenever the delete operation (hash_remove) finds
the item and removes it from the current table,
it also walks the table stack, searches for the item
in every table in the stack and nulls it out.
This search is oblivious to nil; it's a blind search
that goes around the table starting at the first
probe position, looking for the identical cons cell
to take out. This algorithm ensures that iterators
will not see a deleted item, unless they already visited
it before the deletion, of course.
* hash.h (struct hash_iter): New members table, and mask.
* hash.c (struct hash): New member, tblstack.
(hash_grow): We drop the vec argument and recreate it
locally (not essential to this commit).
If we find that the usecount is positive, we push the
existing table onto the table stack. Otherwise,
we take the opportunity to obliterate the table stack.
(hash_insert): Drop the restriction that hash_grow is
only called when the use count is zero. Adjust calls
to hash_grow to drop the vec argument.
(hash_remove): When an item is found and deleted, and
the table is in use by iterators, walk the table stack
and delete it from each previous table. Otherwise,
if the table is not in use by iterators, obliterate
the table stack.
(hash_mark): Exit early also if there is a table stack,
and mark that stack.
(do_make_hash, make_similar_hash, copy_hash): Initialize
table stack in new hash.
(hash_iter_mark): Mark the iterator's table. This is
likely not necessary since we also mark the hash table,
which should have a pointer to that same table.
Omitting the mark wouldn't be defensive programming, though.
(hash_iter_init, us_hash_iter_init): Initialize table and mask.
(hash_iter_next_impl, hash_iter_peek): These functions
have to walk the table snapshot taken by the iterator,
using the captured mask, and not the current table.
(hash_reset): If the target table's use count drops to zero,
obliterate its table stack. We add a missing setcheck here;
this operation potentially stores a different hash into
an existing iterator, which was not being done safely with
regard to generational GC.
* tests/010/hash.tl: New tests.
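A condensed sketch of the grow/remove interplay described above (toy structures, not hash.c; the new table is assumed already rehashed): grow pushes the old table when iterators are outstanding, and remove blanks the cell out of every stacked table so no iterator can still reach it.

  #include <stddef.h>

  struct table { void **slot; size_t mask; struct table *next; };

  struct toyhash {
      struct table *tbl;        /* current table */
      struct table *tblstack;   /* older tables still seen by iterators */
      int usecount;             /* outstanding iterators */
  };

  void grow(struct toyhash *h, struct table *bigger)
  {
      if (h->usecount > 0) {
          h->tbl->next = h->tblstack;  /* iterators keep their snapshot */
          h->tblstack = h->tbl;
      }
      /* else: no iterators; stacked tables could be freed here */
      h->tbl = bigger;
  }

  /* After deleting cell from the current table: blindly walk each
     stacked table from the cell's initial probe position, looking
     for the identical cell, and null it out. */
  void remove_from_stack(struct toyhash *h, void *cell, size_t hashval)
  {
      struct table *t;
      for (t = h->tblstack; t != 0; t = t->next) {
          size_t i = hashval & t->mask, start = i;
          do {
              if (t->slot[i] == cell) { t->slot[i] = 0; break; }
              i = (i + 1) & t->mask;
          } while (i != start);
      }
  }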
* hash.h (struct hash_iter): chain and cons members removed;
index member added.
* hash.c (struct hash_ops): acons_new_c_fun member removed.
(hash_ops_init): Reduced to three arguments.
(hash_grow): Function removed.
(hash_find_slot, hash_lookup, hash_grow, hash_insert,
hash_remove): New static functions.
(hash_mark): Iterate the simple open table, which directly
contains pointers to the hash entry cons cells, rather
than lists of these.
(hash_acons_new_c, hash_aconsql_new_c, hash_aconsq_new_c):
Static functions removed.
(hash_eq_ops, hash_eql_ops, hash_equal_ops): Initializer
macro calls reduced to three arguments.
(copy_hash_chain): Function removed.
(copy_hash): Logic altered to duplicate
the hash cells, like copy_hash_chain did.
(gethash_c, gethash_e, remhash): Retargeted to the new
static functions that do open addressing with linear
probing.
(hash_iter_mark): No cons member to mark any more.
(hash_iter_init, us_hash_iter_init): Initialization updated
to new representation.
(hash_iter_next_impl, hash_iter_peek): Rewritten to traverse simple table.
Loses second argument.
(hash_iter_next, hash_next): Don't pass second argument
to hash_iter_next_impl.
(do_weak_tables): Adjusted to walk open table.
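The core of open addressing with linear probing, in the spirit of the new static functions (toy code; hash.c's cells are Lisp conses and the real functions take Lisp values). The table size is a power of two, so wrap-around is a mask; a real table keeps its load factor below 1, so an empty slot always terminates the probe.

  #include <stddef.h>

  struct kv { void *key; void *value; };

  /* Return the slot containing key, or the first empty slot where
     it would be inserted. */
  struct kv **find_slot(struct kv **table, size_t mask,
                        void *key, size_t hashval)
  {
      size_t i = hashval & mask;

      for (;;) {
          if (table[i] == 0 || table[i]->key == key)
              return &table[i];     /* empty slot or matching cell */
          i = (i + 1) & mask;       /* linear probe, wrapping */
      }
  }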
Hash tables are supposed to start with a mask of 255,
which is the 256 modulus less one. However, due to a
coding mistake, they get a mask of 252. Now this still
works. It just has a few zero bits in it: 1111 1100
rather than 1111 1111. When this grows by doubling in size,
we get 1 1111 1001, 11 1111 0011 and so on.
Now this still all works, because hash values AND-ed
with such a mask are no greater than that mask. The
table isn't accessed out of bounds. It's just inefficiently
used.
* hash.c (do_make_hash, make_similar_hash, clearhash):
When converting the power-of-two modulus, which is a Lisp
word, to the integer mask, subtract 1 from the integer
value, not the original Lisp word.
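A toy illustration of this class of bug (the tag scheme here is hypothetical, so the wrong answer differs from the 252 seen in TXR, but the failure mode is the same): with fixnums stored as tagged words, raw word arithmetic corrupts the encoding, so the value must be untagged before subtracting.

  #include <stdio.h>

  #define TAG_BITS 2
  #define TAG_NUM  1u

  typedef unsigned long word;

  static word tag_num(unsigned long n) { return (n << TAG_BITS) | TAG_NUM; }
  static unsigned long c_num(word w)   { return w >> TAG_BITS; }

  int main(void)
  {
      word modulus = tag_num(256);

      printf("%lu\n", c_num(modulus - 1)); /* wrong: the 1 came out of the tag */
      printf("%lu\n", c_num(modulus) - 1); /* right: untag, then subtract */
      return 0;
  }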
* stdlib/quips.tl (%quips%): New one.
* stdlib/match.tl (match-cond): New macro.
* autoload.c (match_set_entries): match-cond triggers
autoload of match module.
* tests/011/patmatch.tl: Tests.
* txr.1: Documented.
* stdlib/doc-syms.tl: Updated.