Using Regex in Golf


Regex (short for "REGular EXpressions") is a powerful way to search strings and to replace parts of them. Regular expressions are usually short, which makes such operations compact and reliable to write; at the same time, there is a learning curve to becoming proficient. However, since regex is ubiquitous, it is worth learning at least some of it, as it can come in handy.

Golf implements regex via the match-regex statement, which allows both searching and replacing in a single statement. It can also cache the compiled regex, which can increase performance by up to 500%.
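The general idea behind caching a compiled regex can be illustrated outside Golf. The following is a minimal Python sketch, not Golf code: it assumes Python's standard "re" module as a stand-in, and the names ABC_TRIPLE and count_matches are hypothetical. It compiles a pattern once and reuses the compiled object for several inputs; it says nothing about how Golf implements its cache or how large the speedup is.

```python
import re

# Compile the pattern once; the compiled object is reused for every input.
ABC_TRIPLE = re.compile(r"[abc]{3}")

def count_matches(text: str) -> int:
    """Count non-overlapping matches using the precompiled pattern."""
    return len(ABC_TRIPLE.findall(text))

inputs = ["'aaa' or 'abc'", "'aa' or 'bc'", "'cab'"]
counts = [count_matches(s) for s in inputs]
print(counts)  # [2, 0, 1]
```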

This article shows a few basic ways to use the match-regex statement in your Golf code.

Create a directory for your application:
mkdir -p regex
cd regex

Create the "reg" application:
gg -k reg

Copy the following code to the file "reg.golf":
 %% /reg public
     // Use backreferences to swap two words, with "Reverse order word" as a result
     match-regex "(word)\\s+(order)" in "Reverse word order" replace-with "\\2 \\1" result res
     print-out res new-line

     // Recognize a pattern, in this case 3 found
     match-regex "[abc]{3}" in "Recognize 'aaa' or 'aa' or 'abc' or 'cab'" status st
     print-out st new-line

     // Recognize a pattern, in this case not found
     match-regex "[abc]{3}" in "Recognize 'aa' or 'aa' or 'bc' or 'ca'" status st
     print-out st new-line

     // Use case insensitive search to recognize a pattern, in this case 3 found
     match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" case-insensitive status st
     print-out st new-line

     // Use case insensitive search to recognize a pattern and replace, with "Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'" as a result
     match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" replace-with "XXX" result res case-insensitive status st
     print-out res new-line
     print-out st new-line
 %%

Build your application server as a native executable:
gg -q

Run it from the command line:
gg -r --req="/reg" --exec --silent-header

The result is:
Reverse order word
3
0
3
Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
3

Let's go over this one by one.
Backreferences
The first statement uses backreferences, which are a way to refer to parts of the string that the pattern matched:
 // Use backreferences to swap two words
 match-regex "(word)\\s+(order)" in "Reverse word order" replace-with "\\2 \\1" result res
 print-out res new-line

Here, "word" and "order" are found. Since they are in parentheses (meaning within "()"), they can be used as backreferences. The first one is \1, the second one \2, and so on. In the "replace-with" clause, we refer to them as "\\1" and "\\2" because the backslash is a special character used to escape others, so it needs to be escaped itself. "\\s+" means find whitespace ("\\s") that repeats at least one time ("+"). So we're essentially looking for a snippet like "word order" or "word   order", and we then replace it with "\\2 \\1". Keep in mind that "\\1" refers to "word" and "\\2" refers to "order". The result is "order word", i.e. the two words are output in reverse order.
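The same swap can be tried in other regex engines. The following is a minimal Python sketch using the standard "re" module as an analogue, not Golf code; the variable names are illustrative. A raw string keeps the backslashes literal, so no doubling is needed there.

```python
import re

# Groups 1 and 2 capture "word" and "order"; the replacement swaps them.
swapped = re.sub(r"(word)\s+(order)", r"\2 \1", "Reverse word order")
print(swapped)  # Reverse order word

# \s+ also matches several spaces between the two words.
swapped_wide = re.sub(r"(word)\s+(order)", r"\2 \1", "Reverse word    order")
print(swapped_wide)  # Reverse order word
```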
Finding patterns and counting them
A common use of regex is to find out if a pattern is showing up in a string, and how many times. Consider this:
 // Recognize a pattern, found
 match-regex "[abc]{3}" in "Recognize 'aaa' or 'aa' or 'abc' or 'cab'" status st
 print-out st new-line

Here, you're looking for a run of 3 consecutive characters, each being "a", "b" or "c" (the "{3}" requires exactly 3 of them). "aaa", "abc" and "cab" fit that bill, while "aa" does not, so the output is 3.

Conversely, in this case there are no instances of 3 consecutive characters (with each being "a", "b" or "c"), since all candidates are of length 2 (such as "aa", "bc" etc.), so the result will be 0:
 // Recognize a pattern, in this case not found
 match-regex "[abc]{3}" in "Recognize 'aa' or 'aa' or 'bc' or 'ca'" status st
 print-out st new-line
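The two counting cases can be cross-checked in another engine. This is a minimal Python sketch with the standard "re" module, not Golf code; it counts non-overlapping matches with findall, which is an assumption about how the Golf status count corresponds.

```python
import re

pattern = r"[abc]{3}"

# Three runs of exactly three characters from the set {a, b, c}.
found = re.findall(pattern, "Recognize 'aaa' or 'aa' or 'abc' or 'cab'")
print(found, len(found))  # ['aaa', 'abc', 'cab'] 3

# Every candidate run is only two characters long, so nothing matches.
not_found = re.findall(pattern, "Recognize 'aa' or 'aa' or 'bc' or 'ca'")
print(not_found, len(not_found))  # [] 0
```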

Case insensitive search
By default, the search is case sensitive. You can make it case insensitive with the "case-insensitive" clause:
 // Use case insensitive search to recognize a pattern
 match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" case-insensitive status st
 print-out st new-line

In this case, there are also 3 matches ("aAa", "aBc" and "Cab").
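To see the effect of case sensitivity side by side, here is a minimal Python sketch using the standard "re" module, not Golf code. The flag re.IGNORECASE plays the role that the "case-insensitive" clause plays above.

```python
import re

text = "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'"

# Case sensitive: the uppercase letters break every run of three.
sensitive = re.findall(r"[abc]{3}", text)
print(sensitive)  # []

# Case insensitive: "A", "B" and "C" now count as members of the set.
insensitive = re.findall(r"[abc]{3}", text, flags=re.IGNORECASE)
print(insensitive)  # ['aAa', 'aBc', 'Cab']
```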
Search and replace
In the following example, we search for a pattern and replace it with something:
 // Use case insensitive search to recognize a pattern and replace
 match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" replace-with "XXX" result res case-insensitive status st
 print-out res new-line
 print-out st new-line

Just like in the previous example, 3 matches are recognized. Each is replaced with "XXX" and the status is 3, so the result is:
Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
3
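Replacing and counting in one step can be sketched in Python as well. This uses re.subn from the standard "re" module, which returns both the new string and the number of replacements; it is an analogue for illustration, not Golf code.

```python
import re

text = "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'"

# subn returns a tuple: (string after replacement, number of replacements).
replaced, count = re.subn(r"[abc]{3}", "XXX", text, flags=re.IGNORECASE)
print(replaced)  # Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
print(count)     # 3
```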

Lookahead and lookbehind
Sometimes you'd like to search for a pattern, but only if there's another pattern before it ("lookbehind") or after it ("lookahead").
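As a concept illustration only, here is a minimal Python sketch of both ideas using the standard "re" module. The "(?<=...)" and "(?=...)" syntax shown is Python's; whether Golf's match-regex accepts the same syntax is not stated here, so treat it as an assumption. The sample text is made up.

```python
import re

text = "price: $100, quantity: 200, deposit: $35"

# Lookbehind: digits count only when a dollar sign comes right before them.
after_dollar = re.findall(r"(?<=\$)\d+", text)
print(after_dollar)  # ['100', '35']

# Lookahead: a word counts only when ": $" comes right after it.
before_dollar = re.findall(r"\w+(?=: \$)", text)
print(before_dollar)  # ['price', 'deposit']
```

In both cases the surrounding pattern is checked but is not part of the match itself.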


Copyright (c) 2019-2025 Gliim LLC. All contents on this web site is "AS IS" without warranties or guarantees of any kind.