rgx - Regular expressions

Module Description

The rgx module implements regular expressions. It provides words for compiling patterns and for matching and searching strings. Expression building and matching are delegated to the [nfe] module.


    This module uses the following syntax:
     .   Match any char [incl. newline]     *   Match zero or more
     +   Match one or more                  ?   Match zero or one
     |   Match alternatives                 []  Class
     ()  Group or subexpression

    Backslash characters:
     \.  Character .                       \*   Character *
     \+  Character +                       \?   Character ?
     \|  Character |                       \\   Backslash
     \[  Character [

     \r  Carriage return                   \n   Line feed
     \t  Horizontal tab                    \e   Escape

     \d  Digits class: [0-9]               \D   Non-digits class: [^0-9]
     \w  Word class: [0-9a-zA-Z_]          \W   Non-word class: [^0-9a-zA-Z_]
     \s  Whitespace                        \S   Non-whitespace

     All other backslash sequences match the character that follows the
     backslash, but this may change in future versions.

     Classes:
      [abc]  - match a or b or c
      [^abc] - match everything except a or b or c
      [a-z]  - match a or b or .. z
      [-abc] - match - or a or b or c
      []abc] - match ] or a or b or c
      [\d\n] - match digit or line feed

     Backslash characters in classes:
      \r  Carriage return                \n    Line feed
      \t  Horizontal tab                 \e    Escape
      \]  Character ]                    \-    Character -
      \d  Digits class: [0-9]            \w    Word class: [0-9a-zA-Z_]
      \s  Whitespace

     All other backslash sequences match the character that follows the
     backslash, but this may change in future versions.

Module Words

Regular expression structure

rgx% ( -- n )
Get the required space for a rgx variable

Regular expression creation, initialisation and destruction

rgx-init ( rgx -- )
Initialise the regular expression
rgx-create ( "<spaces>name" -- ; -- rgx )
Create a named regular expression in the dictionary
rgx-new ( -- rgx )
Create a new regular expression on the heap
rgx-free ( rgx -- )
Free the regular expression from the heap

Regular expression words

rgx-compile ( c-addr u rgx -- true | n false )
Compile a pattern as a regular expression; return true on success, or the error offset n and false on failure
rgx-cmatch? ( c-addr u rgx -- flag )
Match a string case-sensitively against the regular expression, return the match result
rgx-imatch? ( c-addr u rgx -- flag )
Match a string case-insensitively against the regular expression, return the match result
rgx-csearch ( c-addr u rgx -- n )
Search a string case-sensitively for the first match of the regular expression, return the offset in the string, or -1 if not found
rgx-isearch ( c-addr u rgx -- n:index )
Search a string case-insensitively for the first match of the regular expression, return the offset in the string, or -1 if not found
rgx-result ( n rgx -- n1 n2 )
Get the match result of the nth grouping: n1 is the match end and n2 (top of stack) is the match start

Inspection

rgx-dump ( rgx -- )
Dump the regular expression

Examples

include ffl/rgx.fs

\ Create a regular expression variable rgx1 

rgx-create rgx1

\ Compile a regular expression and check the result

s" ((a*)b)*" rgx1 rgx-compile [IF] 
  .( Expression successfully compiled) cr
[ELSE]
  .( Compilation failed on position:) . cr
[THEN]

\ Match a test string case-sensitively
 
s" abb" rgx1 rgx-cmatch? [IF]
  .( Test string matched) cr
[ELSE]
  .( No match) cr  
[THEN]
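
\ Inspect the grouping results of the match above. This is only a sketch:
\ it assumes that group 0 refers to the whole match and that rgx-result
\ is valid after a successful rgx-cmatch?. Following the stack diagram
\ of rgx-result, the start offset is on top of the stack, the end below.

0 rgx1 rgx-result
.( Group 0 start:) . .( end:) . cr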



\ Create a regular expression variable on the heap
 
rgx-new value rgx2

\ Compile a regular expression for matching a float number

s" [-+\s]?\d+(\.\d+)?" rgx2 rgx-compile [IF]
  .( Expression successfully compiled) cr
[ELSE]
  .( Compilation failed on position:) . cr
[THEN]

\ Match a float number

s" -12.47" rgx2 rgx-cmatch? [IF]
  .( Float number matched) cr
[ELSE]
  .( No match) cr
[THEN]

\ Free the variable from the heap

rgx2 rgx-free
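
\ Search for a pattern inside a longer string. This is only a sketch:
\ the name rgx3 and the test string are illustrative and not part of
\ the module.

rgx-create rgx3

s" \d+" rgx3 rgx-compile [IF]
  .( Expression successfully compiled) cr
[ELSE]
  .( Compilation failed on position:) . cr
[THEN]

\ rgx-csearch returns the offset of the first match, or -1 if not found

s" abc 123" rgx3 rgx-csearch dup 0< [IF]
  drop .( Not found) cr
[ELSE]
  .( Found at offset:) . cr
[THEN]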


generated 03-Jun-2010 by ofcfrth-0.10.0