Becoming a Regex Wizard with Komodo IDE's Rx Toolkit

Quick! Write a regular expression that matches an unbroken sequence of alphanumeric characters and underscores, but does not start with a digit.

Whoa, whoa, hold on. Wait a minute. If you are like me, you might be thinking "what on earth is a 'regular expression'?" I can think of a whole bunch of irregular expressions. What makes an expression "regular"?

It turns out those computer scientists have invented a whole new language for describing sequences of printable characters...and they call that language "regular expressions". For example, they have come up with the idea that [0-9]+ describes any run of one or more digits, such as a positive integer. They've also decided that cat|dog matches the word "cat" or the word "dog", but not "catdog", "dogcat", or "gadoct". They have even come up with the bizarre idea that [A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,} describes almost any person's e-mail address.
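If you would like to poke at the first two patterns outside of any tool, here is a minimal sketch using Python's built-in re module. The variable names and sample strings are my own, picked purely for illustration, and I use fullmatch so that each pattern has to describe the whole string rather than just a piece of it.

```python
import re

# [0-9]+ describes one or more digits in a row.
digits = re.compile(r"[0-9]+")
print(bool(digits.fullmatch("2024")))    # digits only
print(bool(digits.fullmatch("20x4")))    # a letter sneaks in

# cat|dog describes exactly "cat" or exactly "dog".
pet = re.compile(r"cat|dog")
for word in ["cat", "dog", "catdog", "dogcat", "gadoct"]:
    # fullmatch returns None unless the whole word fits the pattern
    print(word, bool(pet.fullmatch(word)))
```

Keep in mind that a plain search would still find "cat" inside "catdog"; it is only when the pattern has to cover the entire string that the mashed-up words are rejected.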

I don't know about you, but that is all Greek to me. Fortunately, Komodo IDE has just the right tool that speaks my language: Rx Toolkit. Even if you speak Greek, Rx Toolkit has exactly what you need to easily understand, write, and work with regular expressions, colloquially known as "regex".

Understanding Regexes

Let us take that last regex above, the one that supposedly matches e-mail addresses.

[Screenshot: Rx Toolkit in Komodo IDE]

Well, what do you know, it actually does work... I admit I did have to toggle the "case insensitive" flag over on the right of the window (you will see it highlighted in blue), because the pattern as written only lists uppercase letters. Mousing over each option gives you the rundown on what it actually does.
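To see why that flag matters, here is a small sketch in Python, assuming its re.IGNORECASE flag plays the same role as the "case insensitive" toggle. The address is a made-up sample, not a real one.

```python
import re

pattern = r"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}"
address = "jane.doe@example.com"  # made-up sample address

# Without the flag, [A-Z] only accepts uppercase letters.
strict = re.fullmatch(pattern, address)

# With the flag, lowercase letters are accepted too.
relaxed = re.fullmatch(pattern, address, re.IGNORECASE)

print("case sensitive:  ", strict is not None)
print("case insensitive:", relaxed is not None)
```

Only the second attempt succeeds, since every letter in the sample address is lowercase.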

Whenever you have a regex that just does not make sense, throw it into Rx Toolkit along with some input, and the tool will help you decipher that Greek, like this doozy:

[Screenshot: Rx Toolkit in Komodo IDE]

Writing Regexes

Okay, so Rx Toolkit can easily help verify that a regex works, but what if (gulp) you actually have to write one?

The easiest way to get started is to go ahead and put the text you want to match against in the "Search Text" box. You can put multiple possibilities on multiple lines, as Rx Toolkit will match each line individually if you keep the "m" option turned off.

Let us consider the original question in this blog post: writing a regex that matches an "unbroken sequence of alphanumeric characters and underscores, but does not start with a digit".

Well, I can think of a few character sequences that fit those criteria:

  • activestate
  • KomodoIDE_10_1
  • X

Let us put those in the "Search Text" box.

Now, where to begin with the regex? Well, since we have been told that [0-9]+ matches the digits of a positive integer, we will try [A-Z]+ and see where that gets us.

[Screenshot: Rx Toolkit in Komodo IDE]

Progress! Rx Toolkit shows us we got it at least partially right by highlighting each part of the matching input text and showing results in the bottom pane.
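For the curious, the same first attempt can be reproduced in a few lines of Python. This is only a sketch, and it assumes the "case insensitive" flag is still switched on from earlier, which is my guess rather than something the screenshot confirms.

```python
import re

samples = ["activestate", "KomodoIDE_10_1", "X"]

# First attempt: letters only, ignoring case (an assumption, see above).
first_try = re.compile(r"[A-Z]+", re.IGNORECASE)

# Collect every piece of each sample that the pattern covers.
partial_matches = {text: first_try.findall(text) for text in samples}

for text, found in partial_matches.items():
    print(f"{text!r}: {found}")
```

The letters are picked up, but the underscores and digits in KomodoIDE_10_1 are left out, which is why this attempt is only partially right.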

After a bit more tinkering, I think I managed to get something:

[Screenshot: Rx Toolkit in Komodo IDE]

Awesome! All of the text is highlighted, and the bottom pane shows success. Now we need to make sure we do not accidentally match character sequences that are not valid, like the number 11.

[Screenshot: Rx Toolkit in Komodo IDE]

As you probably guessed, I got it wrong. After some more fiddling around I came up with this possible solution:

[Screenshot: Rx Toolkit in Komodo IDE]

Thank you Rx Toolkit!
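In case you want something to type in and try, here is one regex that fits the description, sketched in Python. It is not necessarily the exact pattern I ended up with in Rx Toolkit, and the is_valid helper is a hypothetical name used only for this example.

```python
import re

# First character: a letter or underscore (never a digit).
# Remaining characters: any mix of letters, digits, and underscores.
identifier = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid(text: str) -> bool:
    """Return True if the whole string fits the pattern."""
    return identifier.fullmatch(text) is not None

for candidate in ["activestate", "KomodoIDE_10_1", "X", "11"]:
    print(candidate, is_valid(candidate))
```

The three good samples pass and 11 is rejected, because the first character class leaves no room for a leading digit.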

Working With Regexes

Even if you are already a regex wizard, juggling the different regex implementations across programming languages can be tricky. Fortunately, Rx Toolkit supports Perl, Python, PHP, Ruby, JavaScript, and Tcl regex syntaxes. Write one regex, test for all languages right from Komodo! Just click on the little icon to the right of the "Regular Expression" input box to switch between implementations.
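As one small illustration of how implementations differ, consider named groups. Python's re module spells them (?P<name>...), while several other flavors, JavaScript among them, write (?<name>...). The sketch below shows the Python spelling; the group names and the sample sentence are made up for this example.

```python
import re

# Python-style named groups: (?P<name>...)
pattern = re.compile(r"(?P<user>[^@\s]+)@(?P<host>[^@\s]+)")

m = pattern.search("mail jane@example.com today")
if m:
    # Groups can be read back by the names given in the pattern.
    print("user:", m.group("user"))
    print("host:", m.group("host"))
```

A pattern like this is exactly the kind of thing worth checking against each language's implementation before you copy it from one project to another.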

Conclusion

Komodo IDE's Rx Toolkit is your one-stop shop for understanding, writing, and working with regular expressions in real time. The ability to toggle between six major programming languages' regex implementations on the fly is incredibly powerful and useful. Whether you are a regex newbie or a seasoned veteran, this versatile tool is a must-have in your developer arsenal.