| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
* ffi.c (ffi_be_i16_rput, ffi_be_i16_rget, ffi_be_u16_rput,
ffi_be_u16_rget, ffi_be_i32_rput, ffi_be_i32_rget, ffi_be_u32_rput,
ffi_be_u32_rget, ffi_le_i16_rput, ffi_le_i16_rget, ffi_le_u16_rput,
ffi_le_u16_rget, ffi_le_i32_rput, ffi_le_i32_rget, ffi_le_u32_rput,
ffi_le_u32_rget): Rewritten to avoid memory clearing,
memsets, pointer arithmetic and use of helper functions. The big endian rput
and rget functions just wrap the non-endian versions.
The ones which need byte swapping work in terms of a full ffi_arg
word. For instance to prepare a 16 bit big endian unsigned return
value we byte swap the uint16_t, then convert to ffi_arg.
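The byte-swapping case can be sketched roughly as follows. This is only
an illustration, not the ffi.c code: ffi_arg_t, swap16, be_u16_to_arg and
arg_to_be_u16 are hypothetical names, ffi_arg is assumed to be a 64 bit
unsigned word, and a little endian host is assumed.

```c
#include <stdint.h>

/* Hypothetical stand-in for libffi's ffi_arg; assumed 64 bits wide. */
typedef uint64_t ffi_arg_t;

/* Exchange the two bytes of a 16 bit value. */
static uint16_t swap16(uint16_t v)
{
  return (uint16_t) ((v << 8) | (v >> 8));
}

/* "rput" direction: swap first, then widen to the full word.
   No memset and no pointer arithmetic is needed. */
ffi_arg_t be_u16_to_arg(uint16_t native)
{
  return (ffi_arg_t) swap16(native);
}

/* "rget" direction: narrow the full word, then swap back. */
uint16_t arg_to_be_u16(ffi_arg_t word)
{
  return swap16((uint16_t) word);
}
```

Because the value is widened by an ordinary conversion, the upper bits
of the word are defined without clearing memory separately.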
|
|
|
|
|
|
|
| |
* ffi.c (ffi_be_i16_rput): We need to memset the remaining
parts of the 64 bit word to 0, like in all the other
ffi_be_xxx_put functions that are less than 64 bits wide. Also,
the (void) tft cast is removed since tft is used.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* ffi.c (ffi_i8_rput, ffi_i8_rget, ffi_u8_rput, ffi_u8_rget):
These functions are not doing the correct job; they are just
casting the pointer to the target type, like on little endian.
The big endian rget must fetch the entire 64 bit word
(ffi_arg) and convert its value to the target type. If
it's a character value, the actual bits are found at
*(src + 7) not at *src. The rput function must do the
reverse; convert the value to the 64 bit ffi_arg and
store that.
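A minimal sketch of the whole-word approach, assuming ffi_arg is a
64 bit unsigned word. The names ffi_arg_t, u8_rget and u8_rput are
hypothetical and this is not the actual ffi.c code. Because the value
goes through the full word, the same code is correct on both big and
little endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for libffi's ffi_arg; assumed 64 bits wide. */
typedef uint64_t ffi_arg_t;

/* Fetch the entire return word, then convert its value to the
   target type, rather than reading the byte at *src. */
unsigned char u8_rget(const unsigned char *src)
{
  ffi_arg_t word;
  memcpy(&word, src, sizeof word);
  return (unsigned char) word;
}

/* The reverse: convert the value to the full word and store that. */
void u8_rput(unsigned char *dst, unsigned char val)
{
  ffi_arg_t word = val;
  memcpy(dst, &word, sizeof word);
}
```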
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only the JMP instruction is checking for a backwards branch
and calling sig_check_fast() so that a loop can be interrupted
by Ctrl-C. The compiler can optimize that so that a backwards
jump is performed by an instruction in the IF family.
* vm.c (vm_if, vm_ifq, vm_ifql): Check for a backwards
branch and call sig_check_fast. Also, eliminate the redundant
call to vm_insn_bigop, which is just a masking macro.
The ip variable is already the result of vm_insn_bigop.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similarly to what was done in get_csv, we optimize the
use of get_char and unget_char. I see a 7.5% speed
improvement in a simple benchmark of the awk macro
with the record separator rs set to #/\n/.
* regex.c (scan_until_common): Obtain the strm_ops
operations of the stream, and pull the low level
get_char and unget_char virtual operations from it.
Call these directly in the loop. Thereby, we avoid
all the type checking overhead in these functions.
|
|
|
|
|
|
|
|
|
|
|
| |
Another 5-6% gained from this.
* stream.c (us_get_char, us_unget_char): Static functions
removed.
(get_csv): Retrieve the get_char and unget_char pointers from
the strm_ops structure outside of the loop, and then just
call these pointers. Careful: the unget_char virtual has
reversed parameters compared to the global function.
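The technique of fetching the virtual operations once, outside the
loop, can be sketched like this. The struct layouts, the string-backed
stream and the skip_spaces function are all hypothetical, made up for
illustration; only the names strm_ops, get_char and unget_char come
from the description above. Note the stream-first parameter order of
the unget_char virtual.

```c
#include <stddef.h>

struct stream;

/* Hypothetical operations table; the virtual unget_char takes the
   stream first, then the character. */
struct strm_ops {
  int (*get_char)(struct stream *);
  int (*unget_char)(struct stream *, int ch);
};

/* Hypothetical stream reading from a null-terminated string. */
struct stream {
  const struct strm_ops *ops;
  const char *data;
  size_t pos;
};

static int str_get_char(struct stream *s)
{
  return s->data[s->pos] ? (unsigned char) s->data[s->pos++] : -1;
}

static int str_unget_char(struct stream *s, int ch)
{
  if (s->pos > 0)
    s->pos--;
  return ch;
}

const struct strm_ops str_ops = { str_get_char, str_unget_char };

/* Skip leading spaces, returning how many were consumed. The two
   function pointers are loaded once, before the loop, so no checks
   or table lookups are repeated per character. */
size_t skip_spaces(struct stream *s)
{
  int (*get)(struct stream *) = s->ops->get_char;
  int (*unget)(struct stream *, int) = s->ops->unget_char;
  size_t n = 0;
  int ch;

  while ((ch = get(s)) == ' ')
    n++;
  if (ch != -1)
    unget(s, ch); /* push back the first non-space character */
  return n;
}
```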
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Another almost 16% speedup.
* lib.c (us_length_STR): New static function.
(string_extend): Use us_length_STR, since we know the
object is of type STR.
(us_string_extend_STR_CHR): New function.
(length_str): Handle STR case via us_length_STR.
* lib.h (us_string_extend_STR_CHR): Declared.
* stream.c (get_csv): Use us_string_extend_STR_CHR
instead of string_extend.
|
|
|
|
|
|
|
| |
* lib.c (string_extend): We know that num_fast + delta is in
the fixnum range, because we checked this condition. So
we can just assign it without informing the garbage collector.
This yields about a 16% speedup in get-csv.
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/awk (awk-state upd-rec-to-f): Handle a new case
of fs being the keyword symbol :csv, producing a
field-splitting lambda that calls get-csv.
* tests/015/awk-basic.tl: Several new test cases for
this CSV feature.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
| |
I'm seeing about a 6% improvement in get-csv from this.
* stream.c (us_get_char, us_unget_char): New static functions,
which assume all arguments have correct type.
(get_csv): If we use source_opt, validate that it's a stream
with class_check. Use us_get_char and us_unget_char.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Nowhere in the image do we have cobj_class inheritance
deeper than one. No class has a superclass which itself
has a superclass. Based on this, we can eliminate loops
coded to handle the general case.
* lib.c (sutypep, cobjclassp): Do not iterate to chase
the chain of super pointers. Do the subclass check
based on the assumption that there is at most a super
pointer to a class which itself then has no super.
(cobj_register_super): Assert if the situation occurs
that a class is registered with a super that is not
a root. All these calls take place on startup, so if
the assumption is wrong, the assert will be 100%
reproducible.
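A rough sketch of a loop-free subclass check under the depth-one
assumption. The struct layout and the names class_is_a and
register_super are hypothetical, not the lib.c definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class descriptor with a single super pointer. */
struct cobj_class {
  struct cobj_class *super;
};

/* With inheritance at most one level deep, a class matches the
   target if it is the target, or if its direct super is. There is
   no chain of super pointers to chase. */
int class_is_a(const struct cobj_class *cls,
               const struct cobj_class *target)
{
  return cls == target || cls->super == target;
}

/* Registration enforces the assumption: the super must be a root.
   Since registration happens at startup, a violation would trip
   the assertion on every run. */
void register_super(struct cobj_class *cls, struct cobj_class *super)
{
  assert(super->super == NULL);
  cls->super = super;
}
```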
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like in a recent commit for mkstring, we impose a minimum
allocation size of 6 for vectors, which means 8 cells
together with the two information words at the base of
the vector.
* lib.c (vec_own): Take an alloc parameter in addition
to the length, which is stored in v[vec_alloc].
(vector): Impose a minimum alloc size of 6.
(copy_vec, nested_vec_of_v): Pass alloc parameter
to vec_own which is the same as the length parameter;
i.e. no behavior change for these functions.
|
|
|
|
|
|
|
| |
* lib.c (string_extend): When more space is needed in the
string, grow by 50% rather than 25%. This speeds up code
at the expense of some wasted space. Wasted space can be
dealt with by the final flag in programs where it matters.
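To give a feel for the tradeoff, the following sketch counts how many
reallocations a buffer needs to grow from one size to another under a
given growth fraction. The function count_reallocs is hypothetical and
much simpler than string_extend; it only models the growth rule.

```c
#include <stddef.h>

/* Count the growth steps needed for alloc to reach target, when each
   step grows the allocation by alloc / divisor: divisor 4 models the
   old 25% policy, divisor 2 the new 50% policy. */
unsigned count_reallocs(size_t alloc, size_t target, size_t divisor)
{
  unsigned n = 0;

  while (alloc < target) {
    size_t step = alloc / divisor;
    alloc += step ? step : 1; /* always make progress */
    n++;
  }
  return n;
}
```

Starting from 8 and growing to at least 100, the 50% rule takes 7
steps where the 25% rule takes 13, at the cost of a larger final
allocation.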
|
|
|
|
|
|
| |
* lib.c (mkstring): Do not allocate less than 8 characters,
including null terminator, to the string. This speeds up code
which builds up strings from empty, one character at a time.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The malloc_usable_size use in the STR type actually makes
operations like string_extend substantially slower. It is
faster to store the allocated size locally.
Originally, on platforms that have malloc_usable_size,
we were able to use the word of memory reclaimed from
the string type to store a cached hash code. But that logic
was reverted in 2022, so there is no such benefit.
* configure (have_malloc_usable_size): Variable
removed. Test for the malloc_usable_size function
removed.
(HAVE_MALLOC_USABLE_SIZE, HAVE_MALLOC_NP): Do not define
these preprocessor symbols.
* lib.c (HAVE_MALLOC_NP_H): Do not test for this symbol
to include <malloc_np.h>.
(string_own, string, string_utf8, mkstring, mkustring,
string_extend, string_finish, string_set_code,
string_get_code, length_str): Eliminate #ifdefs
on HAVE_MALLOC_USABLE_SIZE.
* lib.h (struct wstring): Eliminate #ifdef on
HAVE_MALLOC_USABLE_SIZE, so alloc member is unconditionally
defined on all platforms.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handle field separations with lambdas, similarly to record
separation. The idea is that we replace the rec-to-f method,
which contains a cond statement checking the variables for
which field separation discipline applies, with a lambda which
is updated whenever any of those variables change.
* awk.tl (awk-state): New instance slot, rec-to-f.
(awk-state :postinit): Call new upd-rec-to-f method
so that rec-to-f is populated with the default field
separating lambda.
(awk-state rec-to-f): Method removed.
(awk-state upd-rec-to-f): New method, based on rec-to-f.
This doesn't perform the field separation, but returns
a lambda which will perform it.
(awk-state loop): We must call upd-rec-to-f whenever
we change par-mode, because it influences field separation.
(awk-mac-let): Replace the symbol macros fs, ft, fw and
kfs with new implementations that use the reactive slot
mechanism provided by rslot. Whenever the awk macro assigns
any of these, the upd-rec-to-f method will be called.
* tests/015/awk-basic.tl: New file. These basic tests of
field separation pass before and after this change.
* tests/common.tl (otest, motest): New macros.
|
|
|
|
|
| |
* txr.1: The example possible value "~3,4f" should be shown
as a string literal in quotes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stream.c (get_csv): Let's add a new state init. If get_char
returns nil and we are in the init state, let's bail to a
nil return. While we are at it, let's not allocate the record
or string until we read at least one character. If we read
a character in the init state, let's allocate those two
objects, and then change to the rfield state and fall through
to it to handle the character.
* tests/010/csv.tl: Fix one incorrect test: (tocsv "") now
returns nil, as it should. Add tests for multiple record
extraction, also covering missing line termination on the last
record as well as CR-LF termination.
* txr.1: Documented nil return conditions.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stream.c (put_csv, tocsv): New functions.
(stream_init): put-csv and tocsv intrinsics registered.
* stream.h (put_csv, tocsv): Declared.
* tests/010/csv.tl (mtest-pcsv): New macro.
New test cases.
* txr.1: Documented.
|
|
|
|
| |
* txr.1: Split Data Interchange Support into JSON and CSV.
|
|
|
|
|
|
|
| |
* stream.c (get_csv): All cases handle end-of-stream
the same way, so we check for nil outside of the case
switch. Then only characters need to be handled, so we
can call c_chr(ch) and switch on it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* autoload.c (csv_set_entries, csv_instantiate): Functions
removed.
(autoload_init): Autoload registration for stdlib/csv
removed.
* stdlib/csv.tl: File removed.
* stream.c (get_csv): New function.
(stream_init): Register get-csv intrinsic.
* stream.h (get_csv): Declared.
|
|
|
|
|
|
| |
* csv.tl (get-csv): Since there are only three states, there
is no jump table optimization. We might as well use keyword
symbols for the states rather than integers.
|
|
|
|
|
|
|
|
|
| |
* stdlib/csv.tl (get-csv): Pre-process the input by a small
state machine that maps CR-LF sequences to LF. Then
we don't have to recognize #\return anywhere in the state
machine and can delete the cr and qcr states, as well
as all the code recognizing #\return and branching to those
states.
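The preprocessing idea can be sketched in C as below. The original is
TXR Lisp in stdlib/csv.tl; this is only an illustration of the same
two-state technique, and the function name crlf_to_lf and its
buffer-based interface are hypothetical. A lone CR that is not
followed by LF is passed through unchanged here, which is an
assumption about the intended behavior.

```c
#include <stddef.h>

/* Copy in to out, mapping each CR-LF pair to a single LF. The out
   buffer must have room for strlen(in) + 1 characters. Returns the
   number of characters written, not counting the terminator. */
size_t crlf_to_lf(const char *in, char *out)
{
  enum { NORMAL, SEEN_CR } state = NORMAL;
  size_t n = 0;

  for (; *in; in++) {
    char ch = *in;

    if (state == SEEN_CR) {
      state = NORMAL;
      if (ch != '\n')
        out[n++] = '\r'; /* the held CR was not part of CR-LF */
    }

    if (ch == '\r')
      state = SEEN_CR;   /* hold the CR until we see what follows */
    else
      out[n++] = ch;
  }

  if (state == SEEN_CR)
    out[n++] = '\r';     /* CR at the very end of input */

  out[n] = '\0';
  return n;
}
```

A consumer placed after this filter only ever has to recognize LF as
the line terminator.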
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* autoload.c (csv_set_entries, csv_instantiate): New
static functions.
(autoload_init): Register autoload of stdlib/csv
module via new functions.
* stdlib/csv.tl: New file.
* tests/010/csv.tl: Likewise.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* autoload.c (enum_set_entries, enum_instantiate): New static
functions.
(autoload_init): Register autoload of stdlib/enum module
via new functions.
* stdlib/enum.tl: New file.
* tests/016/enum.tl: Likewise.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Makefile (clean): Do not remove the run.sh file here.
That is a temporary file that install-tests should be cleaning
away when done, which we can remove in distclean.
(clean-doc, clean-tests): New targets, which let us clean
up generated documentation files and the test-generated
state information in tst.
(distclean): Do not remove the documentation stuff here; rely
on clean-doc. Also depend on clean-tests.
Do remove run.sh here.
(install-tests): Add missing .PHONY. Remove run.sh.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In an opip pipeline, only the first pipeline element can
receive multiple arguments. The subsequent elements
receive the single return value from the previous element.
Therefore if it is a left-inserting pipeline created
by lopip, only the first element needs to use lop.
The others can use lop1, resulting in an optimization.
Furthermore in the flow/lflow macros, even the first
function in the pipeline is called with one argument:
the result of the input expression. So in the case of lflow,
every element of the pipe that would translate to lop
can go to lop1 instead.
* stdlib/op.tl (sys:opip-expand): Calculate a local
variable called opsym-rest which determines which op
symbol we use for the recursive call. This is the
same as the incoming opsym, except in the case when
opsym is lop, in which case we substitute lop1.
(sys:lopip1): New macro, like lopip but uses lop1
for the first element also.
(lflow): Expand to sys:lopip1 rather than lopip.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* autoload.c (op_set_entries): Autoload on lop1 symbol.
* stdlib/op.tl (sys:op-expand): Add lop1 case.
(sys:opip-expand): Add lop1 to the list of operators
that are recognized and specially treated.
(lop1): New macro.
* tests/012/op.tl: New tests.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stream.h (struct json_opts): Member flat removed.
I noticed that !jo.flat was always being tested together
with jo.fmt == json_fmt_standard. Except for a few
places where the code only tested for json_fmt_standard,
resulting in flat output, but some extra spaces.
What distinguishes flat mode now is simply that we
disable stream indentation.
* lib.c (out_json_rec): Remove tests for !jo.flat.
(out_json): Remove initialization of jo.flat member.
In this function we set up indentation on the stream
resulting in multi-line mode (existing behavior).
(put_json): Remove initialization of jo.flat member.
If flat mode is requested, then it overrides the
format to json_fmt_default. I.e. json_fmt_standard
corresponding to :standard is only in effect if flat
is not requested.
In this function we set up indentation on the stream
if flat mode isn't requested, otherwise we disable
indentation (existing behavior, enough to make flat
work).
* tests/010/json.tl: Tests for flat mode, :standard
formatting, and combination of both.
|
|
|
|
|
|
|
|
|
|
| |
This also affects put_jsonl and tojson.
* lib.c (put_json): The flat argument must be properly
defaulted. Without this we are treating it as true when it
is missing due to the convention that missing args are
signaled by the : symbol. This bug breaks the ability
to use the :standard value for *print-json-format*.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lop macro is inconsistent with op in that it
inserts the trailing function arguments on the
left even if arguments are explicitly given in the
form via @1, @2, ... or @rest. This change makes
lop equivalent to op in all situations when these
metas are given.
* stdlib/op.tl (compat-225, compat-298): New top-level
variables.
(op-expand): local variable compat replaced by references to
compat-225. If compat-298 is *not* in effect, then metas
are checked for first in the cond, preventing the lop
transformation from taking place.
* tests/012/op.tl: Test cases for lop, combinations of
do with lop and a few for op also.
* txr.1: Redocumented, added compat notes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* LICENSE, LICENSE-CYG, METALICENSE, Makefile, alloca.h,
args.c, args.h, arith.c, arith.h, autoload.c, autoload.h,
buf.c, buf.h, cadr.c, cadr.h, chksum.c, chksum.h,
chksums/crc32.c, chksums/crc32.h, combi.c, combi.h, configure,
debug.c, debug.h, eval.c, eval.h, ffi.c, ffi.h, filter.c,
filter.h, ftw.c, ftw.h, gc.c, gc.h, glob.c, glob.h, gzio.c,
gzio.h, hash.c, hash.h, itypes.c, itypes.h, jmp.S,
lex.yy.c.shipped, lib.c, lib.h, linenoise/linenoise.c,
linenoise/linenoise.h, match.c, match.h, parser.c, parser.h,
parser.l, parser.y, protsym.c, psquare.h, rand.c, rand.h,
regex.c, regex.h, signal.c, signal.h, socket.c, socket.h,
stdlib/arith-each.tl, stdlib/asm.tl, stdlib/awk.tl,
stdlib/build.tl, stdlib/cadr.tl, stdlib/comp-opts.tl,
stdlib/compiler.tl, stdlib/constfun.tl, stdlib/conv.tl,
stdlib/copy-file.tl, stdlib/csort.tl, stdlib/debugger.tl,
stdlib/defset.tl, stdlib/doloop.tl, stdlib/each-prod.tl,
stdlib/error.tl, stdlib/except.tl, stdlib/expander-let.tl,
stdlib/ffi.tl, stdlib/getopts.tl, stdlib/getput.tl,
stdlib/glob.tl, stdlib/hash.tl, stdlib/ifa.tl,
stdlib/keyparams.tl, stdlib/load-args.tl, stdlib/match.tl,
stdlib/op.tl, stdlib/optimize.tl, stdlib/package.tl,
stdlib/param.tl, stdlib/path-test.tl, stdlib/pic.tl,
stdlib/place.tl, stdlib/pmac.tl, stdlib/quips.tl,
stdlib/save-exe.tl, stdlib/socket.tl, stdlib/stream-wrap.tl,
stdlib/struct.tl, stdlib/tagbody.tl, stdlib/termios.tl,
stdlib/trace.tl, stdlib/txr-case.tl, stdlib/type.tl,
stdlib/vm-param.tl, stdlib/with-resources.tl,
stdlib/with-stream.tl, stdlib/yield.tl, stream.c, stream.h,
struct.c, struct.h, strudel.c, strudel.h, sysif.c, sysif.h,
syslog.c, syslog.h, termios.c, termios.h, time.c, time.h,
tree.c, tree.h, txr.1, txr.c, txr.h, unwind.c, unwind.h,
utf8.c, utf8.h, vm.c, vm.h, vmop.h, win/cleansvg.txr,
y.tab.c.shipped: Copyright bumped to 2025.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stdlib/match.tl (match-case-to-casequal): the (do inc
dfl-cnt) action has a problem: it inserts an implicit extra
parameter to the invocation of inc, which crashes the +
addition due to that parameter being the matching @nil object.
We don't need this entire case because it handles @nil,
which also matches the following case for (sys:var ...),
since @nil is (sys:var nil). That case has the same action
of incrementing dfl-cnt.
* tests/011/patmatch.tl: Test case added.
|
|
|
|
|
|
|
|
|
|
| |
* RELNOTES: Updated.
* configure (txr_ver): Bumped version.
* stdlib/ver.tl (lib-version): Bumped.
* txr.1: Bumped version and date.
|
|
|
|
|
| |
* parser.c (load_rcfile): When neither ~/.txr-profile nor ~/.txr_profile
exists, bail. Do not pass nil to abs-path-p and other functions.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* RELNOTES: Updated.
* configure (txr_ver): Bumped version.
* stdlib/ver.tl (lib-version): Bumped.
* txr.1: Bumped version and date.
* txr.vim, tl.vim: Regenerated.
|
|
|
|
|
|
|
|
| |
* lib.c (length_str_range): On platforms where wchar_t is
unsigned, we calculate bogus values for reversed ranges.
On Android, gcc warns about the code, and the recently
added tests fail. Let's cast the characters to long before
doing the subtraction, which is the argument type of labs.
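The pitfall can be sketched as follows. The function char_distance is
a hypothetical reduction of the calculation, not the code of
length_str_range. Where wchar_t is an unsigned 32 bit type,
subtracting a larger character from a smaller one wraps around to a
huge value before labs ever sees it; converting each operand to long
first keeps the difference signed.

```c
#include <stdlib.h>
#include <wchar.h>

/* Absolute distance between two characters. Each operand is cast to
   long, the argument type of labs, before the subtraction, so that a
   reversed range yields a small negative number rather than a
   wrapped unsigned one. */
long char_distance(wchar_t from, wchar_t to)
{
  return labs((long) to - (long) from);
}
```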
|
|
|
|
|
|
|
|
|
| |
* lib.c (seq_iter_init_with_info): String ranges are
inclusive. We must not assume that a range whose endpoints are
the same is empty; we must check that case for the endpoints
being strings.
* tests/012/seq.tl: New tests.
|
|
|
|
| |
* tests/012/seq.tl: New tests.
|
|
|
|
|
|
| |
* eval.c (map_common): use the all_zero_init macro, defined
differently for C and C++ for initializing a struct to all
zero.
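A sketch of what such a macro might look like. The definition shown
is an assumption about its general shape, not a copy of the project's
header, and struct map_state and zeroed_ok are hypothetical names
used only to exercise it.

```c
/* Assumed shape of the macro: C uses { 0 }, which zero-initializes
   every member; C++ uses empty braces, which avoids warnings about
   missing initializers or mismatched first members. */
#ifdef __cplusplus
#define all_zero_init { }
#else
#define all_zero_init { 0 }
#endif

/* Hypothetical struct with members of several kinds. */
struct map_state {
  int count;
  void *ptr;
  double ratio;
};

/* Returns 1 if every member of a freshly initialized struct is zero. */
int zeroed_ok(void)
{
  struct map_state st = all_zero_init;
  return st.count == 0 && st.ptr == 0 && st.ratio == 0.0;
}
```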
|
|
|
|
| |
* stdlib/quips.tl (%quips%): New one.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The profile and history files should have used hyphens
from the beginning. Let's switch to that but continue
to work with the old files if present, as an obsolescent
feature.
* parser.c (open_txr_file): Treat files with .txr-profile
suffix as Lisp.
(load_rcfile): Arguments rearranged. This function now needs
the home directory and the existence test function, but
does not need the profile file name. It tries .txr-profile
first, then .txr_profile.
(repl): Call load_rcfile in the new way. Try two history
files: first .txr-history and then .txr_history. Remember
which one was used so the same one is saved. If neither file
existed at startup, then save new history into .txr-history.
* txr.1: Documented.
|
|
|
|
|
|
|
|
|
|
|
| |
* ffi.c (make_ffi_type_struct): We must calculate the size
of a flexible structure the way GCC does it. We cannot simply
truncate it at the offset of the member. Rather, the size
is calculated in the usual way. The alignment of the array is
taken into account for the purpose of determining what is the
most aligned member of the structure, and then padding is
added, if required. Thus, the size may exceed the offset of
that member.
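The effect is easy to observe from C itself. In the sketch below, the
struct names and the helper functions are made up for illustration.
In the first struct the size exceeds the offset of the flexible
member because of trailing padding; in the second, the array's
element type supplies the alignment of the whole struct.

```c
#include <stddef.h>

/* Flexible member of char after a double and a char: the array
   begins right after c, but the struct is padded out to the
   alignment of its most aligned member, the double. */
struct flex_a {
  double d;
  char c;
  char tail[];
};

/* Here the flexible array itself is the most aligned member, so it
   determines the alignment of the whole struct. */
struct flex_b {
  char c;
  double tail[];
};

size_t flex_a_offset(void) { return offsetof(struct flex_a, tail); }
size_t flex_a_size(void)   { return sizeof(struct flex_a); }
size_t flex_b_offset(void) { return offsetof(struct flex_b, tail); }
size_t flex_b_size(void)   { return sizeof(struct flex_b); }
```

With GCC on a typical 64 bit target, one would expect struct flex_a
to have the array at offset 9 and a size of 16, though the exact
numbers depend on the platform ABI.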
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ranges are iterable, denoting abstract sequences. The copy
function now copies a range by constructing the array.
This is useful when copy is used for the purpose of obtaining
a mutable copy. For example, (shuffle 0..100) will now work,
returning a shuffled vector of the integers from 0 to 99.
* lib.c (copy): Handle RNG case via vec_seq.
* tests/012/seq.tl,
* tests/012/sort.tl: New test cases.
* txr.1: Documented. Documentation for the copy function
improved.
|
|
|
|
|
|
|
| |
* lib.c (refset, replace): Word the bad object diagnostic in
terms of it not being a modifiable sequence. This covers
cases when the object is something abstractly iterable, like
a range. We don't want to say that it's not a sequence.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (nfa_closure, nfa_move_closure): Do not add epsilon
states to the output array. We only need to add them to the
stack for the spanning traversal. Epsilon states are not real
states; they are just a representation of the concept of
transitioning to multiple states at the same time.
When we add them to the output, they just end up being ignored
when nfa_move_closure is called again on that set, since it
only cares about states that have real transitions for a
character.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* regex.c (nfa_has_transitions): The logical disjunction here
is wrong. We would like to test whether a state has
transitions if it is not an epsilon state. The code which uses
this macro doesn't care about epsilon states, even though they
have transitions; those are squeezed out by transitive
closure. The wrong condition here makes the code think that
an NFA set has transitions when it does not, preventing the
result code REGM_MATCH_DONE from being produced, which can spare
a character from being consumed from a stream.
|
|
|
|
|
|
| |
* regex.c (nfa_move_closure): Fix misleading wording in
comment. The state which matches the character (has a
transition for it) is not the one added to the move set.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 9aa751c8a4f845ef2d2bba091c81ffeded941afd
broke things.
This fix affects the functions read-until-match,
scan-until-match and count-until-match which share
implementation.
* regex.c (scan_until_common): In the REGM_MATCH_DONE
and REGM_MATCH cases, we must push the character onto
the local stack, before doing the match = stack
assignment. Otherwise, it's possible that the stack
is empty and so no match is recorded. The REGM_FAIL
case will then behave as if no match was found, consuming
a character and continuing.
* txr.1: Codify an existing behavior: only non-empty matches
for the regex are considered by read-until-match.
* tests/015/regex.tl: New file. I am amazed to discover
that we don't seem to have a test suite for regexes at all.
Putting the tests here which confirm this fix and provide
coverage for some edge cases in read-until-match.
|