path: root/regex.c
Commit message | Author | Age | Files | Lines
* Support negative positions in regex matching funs. | Kaz Kylheku | 2016-09-21 | 1 | -1/+9
  * regex.c (match_regex, match_regex_right): Detect a negative start or end position, respectively, and add the string length to it. If it is still negative, bail with nil.
  * txr.1: Documented.
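The position handling described in this commit can be sketched as follows. This is a minimal illustration rather than the code in regex.c: the helper name `normalize_pos` is hypothetical, and a return value of -1 stands in for nil.

```c
#include <assert.h>

/* Hypothetical helper, not taken from regex.c: resolve a possibly negative
 * position against a string of length len. A negative position counts from
 * the end of the string. Returns the resolved index, or -1 (standing in for
 * nil) when the position is still negative after adding the length. */
long normalize_pos(long pos, long len)
{
    assert(len >= 0);
    if (pos < 0)
        pos += len;              /* -1 refers to the last character */
    return pos < 0 ? -1 : pos;   /* still negative: caller bails out */
}
```

Under these assumptions, `normalize_pos(-1, 5)` resolves to index 4, while `normalize_pos(-6, 5)` reports failure.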
* Move regex intrinsic registrations to regex.c. | Kaz Kylheku | 2016-09-21 | 1 | -0/+14
  * eval.c (eval_init): Remove all regex-related function registrations from here.
  * regex.c (regex_init): Move regex-related function registrations here.
* regex: optimize double complement. | Kaz Kylheku | 2016-09-16 | 1 | -40/+46
  * regex.c (reg_optimize): Implement ~~R -> R reduction.
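The ~~R -> R reduction can be illustrated on a toy syntax tree. The sketch below is an assumption-laden stand-in: `struct rx`, `enum kind` and `reduce_double_compl` are hypothetical names, and the real reg_optimize works on Lisp S-expressions and recurses through the whole expression, which this sketch does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature regex tree, not the representation in regex.c. */
enum kind { RX_CHAR, RX_COMPL };

struct rx {
    enum kind kind;
    int ch;            /* used when kind == RX_CHAR */
    struct rx *arg;    /* used when kind == RX_COMPL */
};

/* Strip pairs of directly nested complements at the root: ~~R -> R.
 * An odd number of complements leaves exactly one in place. */
struct rx *reduce_double_compl(struct rx *r)
{
    assert(r != NULL);
    while (r->kind == RX_COMPL && r->arg != NULL &&
           r->arg->kind == RX_COMPL && r->arg->arg != NULL)
        r = r->arg->arg;
    return r;
}
```

The reduction is sound because complementing a language twice yields the original language, so no matching behavior changes.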
* regex: add case to complement optimization. | Kaz Kylheku | 2016-09-15 | 1 | -0/+2
  * regex.c (reg_optimize): Based on the reasoning in the previous commit, we can also statically optimize a complement whose argument is the t regex: match nothing. We convert that to match everything: the .* regex. Now (regex-compile "~[]") -> #/.*/.
* regex: fix broken complement operator. | Kaz Kylheku | 2016-09-15 | 1 | -1/+3
  The form (match-regex "xy" #/~ab/) should return 2 (full match) because "xy" is in the complement of the set { "ab" }. It wrongly returns 1.
  * regex.c (reg_derivative): Handle the case when the derivative of the complement's constituent expression yields nil. This means that the complemented regex matches the input. In this case, the complement must lapse to the .+ regex: match one or more characters. That is to say, if the input has at least one more character, there is a match, which covers all such characters. Otherwise there is no match: the input matches the complemented regex. In the t case, the return value is also wrong. If the complemented regex hits a brick wall (matches nothing, not even the empty string), the correct complement is "match everything": the .* regex. Not the match empty string regex!
* NFA regex optimization: use just one set array. | Kaz Kylheku | 2016-07-19 | 1 | -48/+31
  We don't have to flip between two arrays, since nfa_closure and nfa_move_closure can write the output set into the same array.
  * regex.c (struct nfa_machine): Replace flip and flop members with a single set. (nfa_closure, nfa_move_closure): out array parameter removed; in renamed to set. References to in and out simply replaced with set. (nfa_run): Allocate one set instead of two, plus the stack. Remove code to swap the two pointers on each iteration. (regex_machine_reset): Prepare initial closure in the one and only set array. (regex_machine_init): Allocate set array, rather than flip and flop. (regex_machine_cleanup): Free set array and null out pointer rather than flip and flop arrays. (regex_machine_feed): Pass just the set to the nfa_move_closure function. Remove flip/flop pointer swapping.
* NFA regex optimization: combine move and closure. | Kaz Kylheku | 2016-07-19 | 1 | -37/+90
  * regex.c (struct nfa_machine_t): Remove move and clos array pointers, replace with flip and flop. Remove nmove member. (nfa_move): Static function removed. (nfa_move_closure): New static function, based on nfa_move and logic from nfa_closure. (nfa_run): Use nfa_move_closure and flip between two arrays. (regex_machine_reset): Remove reference to nmove member in nfa_machine_t. Prepare initial closure in flip array. (regex_machine_init): Allocate flip and flop arrays, rather than removed move and clos. (regex_machine_cleanup): Free flip and flop arrays and zero out the pointers, rather than removed move and clos. (regex_machine_feed): Replace nfa_move and nfa_closure with combined nfa_move_closure from flip to flop, and exchange of flip and flop arrays.
* New --free-all option for freeing memory on exit. | Kaz Kylheku | 2016-06-07 | 1 | -4/+16
  Although we are garbage-collected, being able to clean up on shutdown is nevertheless useful for uncovering leaks. Leaks can occur, for instance, due to neglect to free out-of-heap satellite data from objects that are reclaimed by gc. This feature is long overdue.
  * arith.c, arith.h (arith_free_all): New function.
  * gc.c, gc.h (gc_free_all): New function.
  * lib.c (init): Remove program name parameter and redundant initialization of progname global variable.
  * lib.h (progname): Superfluous declaration removed. This is already declared in txr.h. (init): Declaration updated.
  * regex.c (char_set_destroy): Do not check the static allocation flag here; just destroy the object. Do check for a null pointer, though. (char_set_cobj_destroy): This cobj destructor now checks the static flag of the char set object and avoids freeing it. Thus our char set singletons are left alone by gc, but our global freeing function takes care of them. (wide_cs): New static variable moved out of wide_display_char_p to static scope. (regex_free_all): New function.
  * regex.h (regex_free_all): Declared.
  * txr.c (progname): const qualifier and initializer removed. (main): Ensure progname is always dynamically allocated, even in the argv[0] == 0 case. Do not pass progname to init; it doesn't take that argument any more. (free_all): New static function. (txr_main): Implement --free-all option.
  * txr.h (progname): Declaration updated.
* Some streamlining in the cons recycling. | Kaz Kylheku | 2016-05-15 | 1 | -1/+1
  * lib.c (rcyc_pop): Just assume that *plist points to a cons and access the fields directly. (rcyc_cons): Don't bother with rplacd. (rcyc_list): Don't bother with set macro.
  * regex.c (read_until_match): Defensive coding: locally ensure that rcyc_pop won't be called on a nil stack, which will now segfault.
* Recycle conses in unget-char and read-until-match. | Kaz Kylheku | 2016-04-20 | 1 | -3/+7
  * regex.c (read_until_match): Use rcyc_pop instead of pop to move the conses to the recycle list. We know these are not shared with anything. Adding additional logic to completely recycle the stack.
  * socket.c (dgram_get_char): Use rcyc_pop to get the character from the push-back list.
  * stream.c (stdio_get_char): Likewise.
* read-until-match can optionally keep matched text. | Kaz Kylheku | 2016-04-20 | 1 | -21/+19
  * regex.c (read_until_match): New argument, include_match. Three times repeated termination code refactored into block reached by forward goto. (regex_init): Registration of read-until-match updated.
  * regex.h (read_until_match): Declaration updated.
  * stream.c (struct record_adapter_base): New member, include_match. (record_adapter_get_line): Pass match to read_until_match as new argument. (record_adapter): New argument, include_match. (stream_init): Update registration of record-adapter.
  * stream.h (record_adapter): Declaration updated.
  * txr.1: Updated.
* Fix broken read_until_match. | Kaz Kylheku | 2016-04-19 | 1 | -17/+51
  * regex.c (read_until_match): Completely rewrite broken, unsalvageable, garbage logic.
* Header file cleanup. | Kaz Kylheku | 2016-01-22 | 1 | -1/+0
  * arith.c, cadr.c, debug.c, eval.c, filter.c, gencadr.txr, glob.c, hash.c, linenoise/linenoise.c, lisplib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Remove unnecessary header files.
* Regex printing not escaping [ and ]. | Kaz Kylheku | 2016-01-12 | 1 | -1/+2
  * regex.c (print_rec): Handle '[' and ']' in backslash-adding switch.
* Print control chars in regexes using \x. | Kaz Kylheku | 2016-01-12 | 1 | -53/+70
  * lib.c (out_str_char): Static function becomes extern.
  * lib.h (out_str_char): Declared.
  * regex.c (puts_clear_flag, putc_clear_flag): New static functions. (print_class_char): Take semicolon flag argument. Use out_str_char to render characters not escaped locally. Clear the semicolon flag. (paren_print_rec): Take semicolon flag argument, and pass it down. Clear it when printing parentheses. (print_rec): Take semicolon flag argument, and pass down to lower level functions. Use putc_clear_flag and puts_clear_flag instead of put_string and put_char. Use out_str_char for char object not escaped locally. (regex_print): Define semi_flag and pass it down to print_rec.
* regex_print: [ and ] in char class must be escaped. | Kaz Kylheku | 2016-01-12 | 1 | -1/+1
  * regex.c (print_class_char): Add missing character cases.
* Record-delimiting stream adapter. | Kaz Kylheku | 2016-01-01 | 1 | -0/+41
  * regex.c (read_until_match): New function. (regex_init): Registered read-until-match intrinsic.
  * regex.h (read_until_match): Declared.
  * stream.c (struct delegate_base): New struct type. (delegate_base_mark, delegate_put_string, delegate_put_char, delegate_put_byte, delegate_get_char, delegate_get_byte, delegate_unget_char, delegate_unget_byte, delegate_close, delegate_flush, delegate_seek, delegate_truncate, delegate_get_prop, delegate_set_prop, delegate_get_error, delegate_get_error_str, delegate_clear_error, make_delegate_stream): New static functions. (struct record_adapter_base): New struct type. (record_adapter_base_mark, record_adapter_mark_op, record_adapter_get_line): New static functions. (record_adapter_ops): New static structure. (record_adapter): New function. (stream_init): Registered record-adapter intrinsic.
  * stream.h (record_adapter): Declared.
  * txr.1: Documented read-until-match and record-adapter.
* Copyright year bump. | Kaz Kylheku | 2015-12-31 | 1 | -1/+1
  * LICENSE, METALICENSE, Makefile, args.c, args.h, arith.c, arith.h, cadr.c, cadr.h, combi.c, combi.h, configure, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, glob.c, glob.h, hash.c, hash.h, jmp.S, lib.c, lib.h, lisplib.c, lisplib.h, match.c, match.h, parser.c, parser.h, parser.l, parser.y, rand.c, rand.h, regex.c, regex.h, share/txr/stdlib/cadr.tl, share/txr/stdlib/except.tl, share/txr/stdlib/hash.tl, share/txr/stdlib/ifa.tl, share/txr/stdlib/path-test.tl, share/txr/stdlib/place.tl, share/txr/stdlib/struct.tl, share/txr/stdlib/txr-case.tl, share/txr/stdlib/type.tl, share/txr/stdlib/with-resources.tl, share/txr/stdlib/with-stream.tl, share/txr/stdlib/yield.tl, signal.c, signal.h, stream.c, stream.h, struct.c, struct.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.1, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Add 2016 copyright.
  * linenoise/LICENSE, linenoise/linenoise.c, linenoise/linenoise.h: Bump one principal author's copyright from 2014 to 2015. The code is based on a snapshot of 2015 upstream work.
* range-regex returns range, not cons. | Kaz Kylheku | 2015-12-07 | 1 | -2/+2
  * regex.c (range_regex): Return range. (search_regst): Use appropriate accessors on range returned by range_regex.
  * lib.c (tok_where): Destructure range returned by range_regex, using range_bind.
  * txr.1: Documented changed behavior.
* Fix serious regression in search_regex. | Kaz Kylheku | 2015-11-06 | 1 | -3/+1
  * regex.c (search_regex): In the Sep 7 2015 commit titled "Don't use prot1 for temporary gc protection", a rel1 call was left behind, causing an assert whenever the function is used for a successful "from end" search.
* Stop using C library setjmp/longjmp. | Kaz Kylheku | 2015-10-25 | 1 | -1/+0
  TXR is moving to custom assembly-language routines. This is mainly motivated by a very dubious thing done in the GNU C Library setjmp and longjmp in the name of security. Evidently, glibc's setjmp "mangles" certain pointer values which are stored into the jmp_buf buffer. It's been that way since 2005, evidently. This means that, firstly, all along, the use of setjmp in gc.c to get registers into a buffer so they can be scanned has not actually worked properly. More importantly, this pointer mangling in setjmp and longjmp is very hostile to a stack copying implementation of delimited continuations. The reason is that continuations contain jmp_buf buffers, which get relocated in the process of capturing and reviving a continuation. Any pointers in a jmp_buf which point into the captured stack segment have to be fixed up to point into the relocated location. Mangled pointers make this difficult, requiring hacks which are specific to glibc and the machine architecture. We might as well implement a clean, well-behaved setjmp and longjmp.
  * Makefile (jmp.o): New object file. (dbg/%.o, opt/%.o): New rules for .S prerequisites.
  * args.c, arith.c, cadr.c, combi.c, debug.c, eval.c, filter.c, glob.c, hash.c, lib.c, match.c, parser.c, rand.c, regex.c, signal.c, stream.c, struct.c, sysif.c, syslog.c, txr.c, unwind.c, utf8.c: Removed <setjmp.h> include.
  * gc.c: Switch to struct jmp and jmp_save, instead of jmp_buf and setjmp.
  * jmp.S: New source file.
  * signal.h (struct jmp): New struct type. (jmp_save, jmp_restore): New function declarations denoting assembly language routines in jmp.S. (extended_jmp_buf): Uses struct jmp instead of setjmp. (extended_setjmp): Use jmp_save instead of setjmp. (extended_longjmp): Use jmp_restore instead of longjmp.
* Additional reductions for and. | Kaz Kylheku | 2015-09-29 | 1 | -0/+6
  * regex.c (reg_optimize): If the empty regex is and-ed with another regex, that other regex must be nullable, otherwise the and matches nothing. This is captured in some new reductions for the and operator.
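The reasoning behind this reduction can be sketched on a toy tree: intersecting the empty regex with R leaves the empty regex when R is nullable, and nothing otherwise. All names below (`struct rx`, `nullable`, `and_with_empty`) are hypothetical and only loosely modeled on the commit text; the actual reductions operate on S-expressions inside reg_optimize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature regex tree. RX_EMPTY matches only the empty
 * string; RX_NONE matches nothing at all (the "t regex" in the log). */
enum kind { RX_EMPTY, RX_NONE, RX_CHAR, RX_STAR, RX_CAT, RX_OR };

struct rx {
    enum kind kind;
    const struct rx *a, *b;   /* operands; unused ones are NULL */
};

/* Does r match the empty string? */
int nullable(const struct rx *r)
{
    assert(r != NULL);
    switch (r->kind) {
    case RX_EMPTY:
    case RX_STAR:
        return 1;
    case RX_NONE:
    case RX_CHAR:
        return 0;
    case RX_CAT:
        return nullable(r->a) && nullable(r->b);
    case RX_OR:
        return nullable(r->a) || nullable(r->b);
    }
    return 0;
}

/* Kind of the reduced form of (and <empty> other): the only string the
 * empty regex matches is "", so the result hinges on nullability. */
enum kind and_with_empty(const struct rx *other)
{
    return nullable(other) ? RX_EMPTY : RX_NONE;
}
```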
* Simplify and optimization. | Kaz Kylheku | 2015-09-29 | 1 | -4/+1
  * regex.c (reg_optimize): No need to check reg_matches_all in and optimization case because the argument object has already been reduced that way by reg_optimize recursion.
* Optimize some cases of the regex branch operator. | Kaz Kylheku | 2015-09-29 | 1 | -0/+43
  * regex.c (reg_compl_char_p): New static function. (reg_optimize): Optimize various cases of the or operator: (R|) -> R?, (a|b) -> [ab] and others.
* Some optimizations for * ? and +. | Kaz Kylheku | 2015-09-29 | 1 | -4/+21
  * regex.c (regex_optimize): Simplify compounded uses of repetition operators: RR* -> R, R+? -> R* and so on.
* Regex printer fails on \w, \s or \d in char class. | Kaz Kylheku | 2015-09-29 | 1 | -0/+2
  * regex.c (print_rec): Bugfix: handle symbols in character class syntax.
* More complement optimizations. | Kaz Kylheku | 2015-09-28 | 1 | -0/+19
  * regex.c (reg_optimize): Transform ~.*c to (.*[^c])? and ~c.* to ([^c].*)? where c is a single-character match.
* Streamline some regex optimizations. | Kaz Kylheku | 2015-09-28 | 1 | -15/+48
  * regex.c (reg_single_char_p, invert_single): New static functions. (reg_optimize): Simplify complement operator optimizations using new functions.
* Optimization for one-character range. | Kaz Kylheku | 2015-09-27 | 1 | -2/+7
  * regex.c (reg_optimize): [a] -> a. Also take advantage of this where the complement case generates [a].
* Optimize complement operator more. | Kaz Kylheku | 2015-09-27 | 1 | -0/+28
  * regex.c (reg_optimize): Recognize and transform several cases: ~c -> ([^c]?|..+); ~[^c] -> ([c]?|..+); and ~.*c.* -> [^c]*.
* S-exp level regex optimization. | Kaz Kylheku | 2015-09-27 | 1 | -32/+156
  * regex.c (dv_compile_regex): Replaced by two functions, reg_expand_nongreedy and reg_compile_csets. (reg_expand_nongreedy, reg_compile_csets): New static functions. (reg_optimize): New static function. (regex_compile): Expand nongreedy syntax in incoming regex, and then optimize it before deciding whether to use NFA or derivatives. If derivatives are used, compile the character sets in the regex to character set objects. (regex_init): Register some intrinsic functions for debugging, sys:reg-expand-nongreedy and sys:reg-optimize.
* Support t regex in NFA compiler and in printer. | Kaz Kylheku | 2015-09-27 | 1 | -1/+16
  The t regex means "match nothing". This patch allows the NFA compiler to handle it. This will be necessary for an upcoming regex optimizer which can put out such an object. Also, the recursive regex printer can print the object now.
  * regex.c (nfa_kind_t): New enum member, nfa_reject. (nfa_state_reject): New static function. (nfa_compile_regex): Compile t regex into a reject state which cannot reach its corresponding acceptance state. (nfa_map_states): Handle nfa_reject case in switch, similarly to nfa_accept: nothing to transition into. (print_rec): Render the t regex as the empty character class [].
* Replace internal_error with exception throws in regex. | Kaz Kylheku | 2015-09-27 | 1 | -7/+7
  * regex.c (nfa_compile_regex, dv_compile_regex, reg_nullable, reg_matches_all, reg_derivative, regex_requires_dv): Throw an exception for the bad operator case.
* Bug in complement case of reg_matches_all. | Kaz Kylheku | 2015-09-27 | 1 | -1/+2
  * regex.c (reg_matches_all): A complement matches all if its argument matches nothing, not if its argument is anything but the empty match nil.
* regex: major optimization for complement operator. | Kaz Kylheku | 2015-09-24 | 1 | -1/+46
  This change is a huge improvement for expressions that use complement, directly or via the non-greedy % operator.
  * regex.c (reg_matches_all): New static function. (reg_derivative): When the derivative is applied to a complement expression, identify situations when the remaining expression cannot possibly match anything, and convert them to the t expression.
* Regex state-marking counter wraparound bug. | Kaz Kylheku | 2015-09-15 | 1 | -1/+28
  When a NFA regex goes through more than 4.29 billion state transitions, the state coloring "visited" marker wraps around. There could still exist states with old values at or near zero, which destroys the correctness of the closure calculations.
  * regex.c (nfa_handle_wraparound): New static function. The wraparound situation is handled by detecting when the next marker value is UINT_MAX. When this happens, we visit all states, marking them to UINT_MAX. Then we visit them again, marking them to zero, and set the next marker value to 1. (nfa_free): Added comment about why we don't have a wraparound check, in case it isn't obvious. (nfa_run): Check for wraparound before every nfa_closure call. (regex_machine_reset): Check for wraparound before nfa_closure call. Fix: store the counter back in the start state's visited field. (regex_machine_init): Initialize the n.visited field of the regex machine structure to zero. Not strictly necessary, since it's initialized moments later in regex_machine_reset, but good form. (regex_machine_feed): Check for wraparound before nfa_closure call.
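The wraparound guard can be sketched in simplified form. This is not the nfa_handle_wraparound function itself: the sketch assumes the states sit in a flat array, so a single reset pass suffices, whereas the commit describes two graph traversals (first marking to UINT_MAX, then to zero) because the real states are reached by following links. The names `struct state` and `fresh_marker` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical state record carrying only the "visited" coloring field. */
struct state {
    unsigned visited;
};

/* Advance the traversal counter and return the marker for the next
 * traversal. If the next marker would be UINT_MAX, clear every state's
 * mark and restart the numbering at 1, so that no stale mark can ever
 * equal a future marker value. */
unsigned fresh_marker(struct state *states, size_t nstates, unsigned *counter)
{
    unsigned next;
    size_t i;

    assert(states != NULL && counter != NULL);
    assert(*counter < UINT_MAX);

    next = *counter + 1;
    if (next == UINT_MAX) {
        for (i = 0; i < nstates; i++)
            states[i].visited = 0;
        next = 1;
    }
    *counter = next;
    return next;
}
```

A caller would obtain a marker from `fresh_marker` before each closure computation and treat a state as already visited only when its `visited` field equals that marker.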
* Use alloca for some temporary arrays in regex module. | Kaz Kylheku | 2015-09-15 | 1 | -11/+5
  * regex.c (nfa_free): Use alloca for array of all states. (nfa_run): Use alloca for move, closure and stack arrays.
* Remove limit on NFA state size and allocate tightly. | Kaz Kylheku | 2015-09-15 | 1 | -62/+63
  * regex.c (struct regex): New member, nstates. (NFA_SET_SIZE): Preprocessor symbol removed. (struct nfa_machine): New member, nstates. (nfa_all_states): Function removed. (nfa_map_states): New static function. (nfa_count_one, nfa_count_states, nfa_collect_one): New static functions. (nfa_free): Takes nstates argument. Calculate array of all states using nfa_map_states over nfa_collect_one rather than nfa_all_states. The array is tightly allocated. Also the spanning tree traversal needs just one root, nfa.start. It's not clear why nfa_all_states used nfa.start and nfa.accept as roots. (nfa_closure): Takes nstates parameter; array bounds checking performed tightly against nstates rather than NFA_SET_SIZE. (nfa_move): Check against NFA_SET_SIZE removed. (nfa_run): Take nstates argument. Allocate arrays tightly. Pass nstates to nfa_closure. (regex_destroy): Pass regex->nstates to nfa_free. (regex_compile): Initialize regex->nstates. (regex_run): Pass regex->nstates to nfa_run. (regex_machine_reset): Pass nstates to nfa_closure. (regex_machine_init): Initialize n.nstates member of regex machine. Allocate arrays tightly. (regex_machine_feed): Pass nstates to nfa_closure.
* Fix memory leak in regexes. | Kaz Kylheku | 2015-09-14 | 1 | -1/+1
  * regex.c (nfa_free): The visited marker must be incremented, otherwise nfa_all_states will only collect start and accept.
* Don't use prot1 for temporary gc protection. | Kaz Kylheku | 2015-09-07 | 1 | -3/+1
  * lib.c (split_str, split_str_set, list_str, int_str): Use gc_hint rather than prot1/rel1. More efficient, doesn't use space in the prot_stack array.
  * regex.c (search_regex): Likewise.
  * stream.c (vformat_str, formatv, run): Likewise. In formatv, rel1 wasn't being called in the uw_unwind block, so this fixes a bug.
* Count East Asian Wide and Full Width chars as two columns. | Kaz Kylheku | 2015-08-10 | 1 | -0/+66
  * regex.c (create_wide_cs): New static function. (wide_display_char_p): New function.
  * regex.h (wide_display_char_p): Declared.
  * stream.c (put_string, put_char): Use wide_display_char_p to determine whether an extra column need be counted. Also bugfix: iswprint evidently cannot be relied on to work over the entire Unicode range, at least not in the C locale. Glibc's version is reporting valid Japanese characters as unprintable on Ubuntu. As a hack we instead check for control characters and invert the result: control chars are unprintable.
  * tests/009/json.expected: Updated.
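Column counting with double-width characters might look roughly like the sketch below. It swaps in a small sorted range table for the character set object that the commit builds in create_wide_cs, and the table is only a partial excerpt of the East Asian Wide and Fullwidth ranges, not the full list. The names `is_wide` and `display_width` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct range {
    unsigned long lo, hi;   /* inclusive code point bounds */
};

/* Partial, illustrative excerpt of Wide/Fullwidth code point ranges. */
static const struct range wide_ranges[] = {
    { 0x1100, 0x115F },   /* Hangul Jamo leading consonants */
    { 0x3041, 0x3096 },   /* Hiragana letters */
    { 0x30A1, 0x30FA },   /* Katakana letters */
    { 0x4E00, 0x9FFF },   /* CJK Unified Ideographs */
    { 0xAC00, 0xD7A3 },   /* Hangul Syllables */
    { 0xFF01, 0xFF60 },   /* Fullwidth ASCII variants */
};

/* Nonzero if ch falls in one of the listed double-width ranges. */
int is_wide(unsigned long ch)
{
    size_t i;
    for (i = 0; i < sizeof wide_ranges / sizeof wide_ranges[0]; i++)
        if (ch >= wide_ranges[i].lo && ch <= wide_ranges[i].hi)
            return 1;
    return 0;
}

/* Count display columns for n code points: two per wide character,
 * one for everything else (control characters are not special-cased). */
size_t display_width(const unsigned long *s, size_t n)
{
    size_t i, cols = 0;
    assert(s != NULL || n == 0);
    for (i = 0; i < n; i++)
        cols += is_wide(s[i]) ? 2 : 1;
    return cols;
}
```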
* Pass pretty flag to cobj print operation. | Kaz Kylheku | 2015-08-01 | 1 | -2/+3
  * hash.c (hash_print_op): Take third argument, and call cobj_print_impl rather than cobj_print.
  * lib.c (cobj_print_op): Take third argument. The object class is printed with obj_print_impl. (obj_print_impl): Static function becomes extern. Passes its pretty flag argument to cobj print virtual function.
  * lib.h (cobj_ops): print takes third argument. (cobj_print_op): Declaration updated. (obj_print_impl): Declared.
  * regex.c (regex_print): Takes third argument, and ignores it.
  * stream.c (stream_print_op, stdio_stream_print, cat_stream_print): Take third argument, and ignore it.
  * stream.h (stream_print_op): Declaration updated.
* Correction to COBJ initialization pattern. | Kaz Kylheku | 2015-07-30 | 1 | -2/+2
  In fact, the previously documented process is not correct and still leaves a corruption problem under generational GC (which has been the default for some time).
  * HACKING: Document flaw in the initialization pattern previously thought to be correct, and show fix.
  * hash.c (copy_hash): Fix instance of incorrect pattern.
  * regex.c (regex_compile): Likewise.
* Bugfix: throwing error when trying to print valid regexps. | Kaz Kylheku | 2015-04-19 | 1 | -1/+1
  * regex.c (print_rec): Only diagnose "bad object in regex syntax" for some atom other than nil, which denotes an empty (sub)expression, like what results from #// or #/a|/ and such.
* * regex.c (match_regex_right): Bugfix: zero length matches | Kaz Kylheku | 2015-02-20 | 1 | -1/+1
  should return zero length, rather than nil. This is achieved by trying the match at one past the last character.
* String-returning wrappers for some regex matching functions. | Kaz Kylheku | 2015-02-20 | 1 | -0/+21
  * eval.c (eval_init): Register search-regst, match-regst and match-regst-right intrinsics.
  * regex.c (search_regst, match_regst, match_regst_right): New functions.
  * regex.h (search_regst, match_regst, match_regst_right): Declared.
  * txr.1: Documented new variants.
* * regex.c (print_rec): A compound must use parentheses for | Kaz Kylheku | 2015-02-15 | 1 | -2/+8
  elements which have a higher precedence than catenation.
* Update copyright notices from 2014 to 2015. | Kaz Kylheku | 2015-02-01 | 1 | -1/+1
  * arith.c, arith.h, combi.c, combi.h, debug.c, debug.h, eval.c, eval.h, filter.c, filter.h, gc.c, gc.h, hash.c, hash.h, lib.c, lib.h, match.c, match.h, parser.h, rand.c, rand.h, regex.c, regex.h, signal.c, signal.h, stream.c, stream.h, sysif.c, sysif.h, syslog.c, syslog.h, txr.c, txr.h, unwind.c, unwind.h, utf8.c, utf8.h: Update.
  * LICENSE, METALICENSE: Likewise.
* Use macro to initialize cobj_ops. | Kaz Kylheku | 2015-01-29 | 1 | -14/+10
  * lib.h (cobj_ops_init): New macro.
  * hash.c (hash_ops, hash_iter_ops): Initialize with cobj_ops_init.
  * rand.c (random_state_ops): Likewise.
  * regex.c (char_set_obj_ops, regex_obj_ops): Likewise.
* * Makefile: Removing trailing spaces. | Kaz Kylheku | 2014-10-24 | 1 | -16/+16
  (GREP_CHECK): New macro. (enforce): Rewritten using GREP_CHECK, with new checks.
  * arith.c, combi.c, debug.c, eval.c, filter.c, gc.c, hash.c, lib.c, lib.h, match.c, parser.l, parser.y, rand.c, regex.c, signal.c, signal.h, stream.c, syslog.c, txr.c, unwind.c, utf8.c: Remove trailing spaces.