Re: regular expression repetition operator

Author: Kaz Kylheku
Date:  
To: vapnik spaknik
CC: txr-users
Subject: Re: regular expression repetition operator

On 2019-06-30 19:52, Kaz Kylheku wrote:

On 2019-06-30 16:58, vapnik spaknik wrote:

Hi,
    the regexp syntax for txr doesn't seem to include the postfix repetition operator {M,N} for matching between M & N repetitions of an object. 
This is severely limiting; I need to match between 5 & 10 spaces in files I am trying to extract data from. 
How can I achieve this with txr? Is it possible?

Hi vapnik,

The R{M,N} syntax denotes between M and N consecutive repetitions of R, so it is just shorthand for an alternation of the expanded forms. For instance, X{2,4} is shorthand for (XX|XXX|XXXX). A match for five to ten spaces can be coded similarly.

Addendum: of course, this can be algebraically factored. For instance, in the above, we can factor out the common prefix XX to obtain XX(|X|XX). (Note the empty first alternative in the parentheses, which brings the count to three: zero, one or two additional X.)

So to match 5 to 10 spaces, first match 5 spaces, then from 0 to 5 additional: SSSSS(|S|SS|SSS|SSSS|SSSSS). For S, substitute whatever space is appropriate: the plain ASCII space or \s or whatever.
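
Writing these expansions out by hand gets tedious for larger ranges, so here is a minimal sketch of a generator for them. It is written in Python rather than TXR purely for illustration, and expand_repeat is a hypothetical helper name, not part of TXR. It assumes the atom is a single character or an already parenthesized group, so that plain string repetition is valid. The check at the end uses Python's own re module, which happens to accept the same empty-alternative form; that shows the expansion is equivalent to {5,10} there, but says nothing about any particular TXR version.

```python
import re


def expand_repeat(atom: str, lo: int, hi: int) -> str:
    """Expand atom{lo,hi} using only concatenation and alternation.

    Produces the factored form: lo mandatory copies of the atom,
    followed by a group offering 0 .. (hi - lo) additional copies.
    The atom must be a single character or a parenthesized group.
    """
    if not 0 <= lo <= hi:
        raise ValueError("need 0 <= lo <= hi")
    required = atom * lo
    # The first alternative (k = 0) is the empty string.
    optional = "|".join(atom * k for k in range(hi - lo + 1))
    return f"{required}({optional})"


# The example from the message: X{2,4} factored.
print(expand_repeat("X", 2, 4))

# Five to ten plain ASCII spaces.
pattern = expand_repeat(" ", 5, 10)

# Sanity check against Python's regex engine: exactly the lengths
# 5 through 10 should match in full.
matching = [n for n in range(13) if re.fullmatch(pattern, " " * n)]
print(matching)
```

The first print should reproduce the XX(|X|XX) form given above. The resulting pattern string can then be pasted into a regex in place of the unsupported {M,N} operator.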