From 93e3d055337822ceff671710ea0611e3bbf46ae2 Mon Sep 17 00:00:00 2001
From: Kaz Kylheku
Date: Thu, 22 Sep 2016 21:36:22 -0700
Subject: Semantics change in match-regex-right.

The way the end-position argument works in match-regex-right
and match-regst-right is poorly considered. It basically
enforces a constraint that there is a match which ends at
that position and does not go beyond.

This patch changes it to work right: the functions test that the
regex matches up to that position, as if the string ended
there.

* regex.c (match_regex_right_old): New static function,
identical to the previous match_regex_right. Since
we won't ever be using this inside TXR from any other module,
we don't make it external.
(match_regex_right): Rewritten to new semantics.
(match_regst_right_old): New static function; provides the
semantics of the old match_regst_right based on
match_regex_right_old.
(regex_init): Register match-regex-right and match-regst-right
intrinsics to the match_regex_right_old and
match_regst_right_old functions if compatibility <= 150
is requested. Otherwise they go to the rewritten
new functions.

* txr.1: Documentation updated, and compat notes added.
---
 txr.1 | 50 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 19 deletions(-)

(limited to 'txr.1')

diff --git a/txr.1 b/txr.1
index 60cbee60..91b1a4a0 100644
--- a/txr.1
+++ b/txr.1
@@ -31869,17 +31869,19 @@ matching substring of
 .syne
 .desc
 The
-.code match-regex
-function tests whether
+.code match-regex-right
+function tests whether some substring of
 .meta string
-contains a match which ends
-precisely on the character just before
-.metn end-position .
+which terminates at the character position just before
+.meta end-position
+matches
+.metn regex .
 If
 .meta end-position
 is not specified, it defaults to the length of the string, and the function performs a right-anchored regex match.
+
 The
 .meta end-position
 argument can be a negative integer, in which case it denotes
@@ -31890,23 +31892,26 @@ of the string, then
 .code nil
 is returned.
+If
+.meta end-position
+is a positive value beyond the length of
+.metn string ,
+then, likewise,
+.code nil
+is returned.
+
 If a match is found, then the length of the match is returned.
-The match must terminate just before
-.meta end-position
-in the sense that
-additional characters at
+A more precise way of articulating the role of
 .meta end-position
-and beyond can no longer satisfy the
-regular expression. More formally, the function searches, starting from
-position zero, for positions where there occurs a match for the regular
-expression, taking the longest possible match. The length of first such a match
-which terminates on the character just before
+is that for the purposes of matching,
+.code string
+is considered to terminate just before
+.metn end-position :
+in other words, that
 .meta end-position
-is returned.
-If no such a match is found, then
-.code nil
-is returned.
+is the length of the string. The match is then anchored to the
+end of this effective string.
 The
 .code match-regst-right
@@ -31914,7 +31919,7 @@ differs from
 .code match-regst-right
 in the representation of the return value in the matching case.
 Rather than returning the length of the match, it returns
-matching substring of
+the matching substring of
 .metn string .
 .TP* Examples:
@@ -45715,6 +45720,13 @@ the behavior. The
 function was also affected by this issue; however, since it
 returned nonsense result not corresponding to the matching text,
 it was repaired without backward compatibility.
+Also affected by version 150 compatibility are the
+.code match-regex-right
+and
+.code match-regst-right
+functions. These functions worked as documented; however, their
+specification changes after version 150 to a semantics which is
+more useful and less surprising to the programmer.
 .IP 148
 Up until version 148, the
 .code :postinit
-- 
cgit v1.2.3
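
For illustration only, the new end-position semantics described in the
documentation hunks above can be modelled outside of TXR. The following is
a minimal Python sketch, not the regex.c implementation: it uses Python's
re engine rather than TXR's, the Python function names and argument order
are hypothetical stand-ins, and it assumes that a negative end-position is
added to the string length and that the leftmost start position which
yields a match wins.

```python
import re


def _resolve_end(string, end_position):
    """Map end_position to an absolute index, or None if out of range."""
    length = len(string)
    if end_position is None:
        return length                      # default: end of the string
    # Assumption: a negative value counts back from the end of the string.
    end = length + end_position if end_position < 0 else end_position
    if end < 0 or end > length:
        return None                        # before the start or past the end
    return end


def match_regex_right(string, regex, end_position=None):
    """Length of a match anchored at end_position, or None.

    The string is treated as if it terminated just before end_position,
    and the match must reach the end of that effective string.
    """
    end = _resolve_end(string, end_position)
    if end is None:
        return None
    pattern = re.compile(regex)
    # Try each start position from the left; the first hit is the longest.
    for start in range(end + 1):
        if pattern.fullmatch(string, start, end):
            return end - start
    return None


def match_regst_right(string, regex, end_position=None):
    """Like match_regex_right, but returns the matching substring."""
    n = match_regex_right(string, regex, end_position)
    if n is None:
        return None
    end = _resolve_end(string, end_position)
    return string[end - n:end]
```

In this model, matching "b+" against "aabbb" with end-position 4 gives a
length of 2, because the string is treated as "aabb". Going by the wording
removed in this patch, the old semantics would yield nil for the same call,
since every longest match of "b+" runs through to position 5.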