Advance Your Programming - Part 1 - Regex
I will use PHP for the examples, because it is so widely available (nearly every reasonable hosting company includes PHP support in even its lowest plans). This series is geared toward fairly novice programmers, since the topics I cover are surely already on the tool belt of every master programmer:
- Regular Expressions
- Templates and Includes
- XML Parsing
I will only touch lightly on these subjects, in the hope that you will continue researching them if you want to improve your skills. Once you have learned each of them, your understanding of web programming will be greatly improved, your code may become cleaner and more concise, and most importantly you will have more fun programming! Today’s article will focus on Regular Expressions, commonly referred to as regex.
Regular Expressions - Regex
This will not be an article teaching regular expressions (I may refer to them as ‘regex’). I would rather this be a stimulus for you to learn regular expressions if you are currently in the dark, and at the end I will point you to a few sources to help you learn them. You don’t really need to understand the regex that I show you in this article, nor will I really explain them to you. You may even be able to decipher what they mean on your own. Keep in mind that I am showing you uses of regular expressions, and their power and simplicity, as a means to entice you to learn them yourself. You will not regret it.
Regular expressions are a language of their own. At first glance a regex may appear complex, but regular expressions are in fact very straightforward, and they have many uses for web programmers. Regex provide pattern matching for strings. A common use is searching a string for something: if that something exists, then execute some code. Or the reverse: if the pattern is not found, then execute some code. One such example is input validation.
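As a minimal sketch of those two cases (the sample sentence, the patterns, and the variable names are made up purely for illustration):

```php
<?php
// Hypothetical input and pattern, chosen only for illustration.
$text = 'The quick brown fox';
$pattern = '/fox/';

// preg_match() returns 1 when the pattern is found and 0 when it is not.
if (preg_match($pattern, $text)) {
    $found = 'Pattern found, run the "found" code.';
} else {
    $found = 'Pattern not found, run the "not found" code.';
}
echo $found, "\n";

// The reverse case: act only when the pattern is absent.
$missing = '';
if (!preg_match('/cat/', $text)) {
    $missing = 'No cat here.';
    echo $missing, "\n";
}
```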
Let’s say you have a simple contact form which asks for a user’s phone number and email address. Those are common form elements, and they should be checked for valid input so you can catch deliberately fake entries or a simple typo in a phone number. Regex can ensure that the user’s input is not only valid, but also in the exact format that you specify, making your work easier.
Let’s make a simple PHP example, with hardcoded values, where the phone number format we want is ###-###-####.
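<?php
// Hardcoded test value; the trailing 'a' makes it invalid.
$str = '555-555-555a';
// \d{3} means exactly three digits; ^ and $ anchor the whole string.
$regex = '/^\d{3}-\d{3}-\d{4}$/';
if (preg_match($regex, $str))
    echo 'This is a valid phone number.';
else
    echo 'Bad phone number.';
?>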
The regex is located between the forward slashes: / here /. I am also using the PHP function preg_match( regex, string the regex works on ). For those interested, see the documentation on preg_match at php.net. Copy, paste, save, and run this example, and you should see that 555-555-555a is indeed a bad phone number! The reason, as you have probably determined, is that ‘a’ is not a number. The preg_match() function applies the regex to the string and finds that the string is not in the correct format (described above). Powerful, simple, and fast! To see the script in action with a working phone number, change $str to any valid phone number. Here is an example:
$str = '123-456-7890';
With one simple string we force all input to be exactly how we want it! To show how flexible regex can be, we can extend the regular expression to allow a phone number in any of three formats: ###-###-####, ##########, or (###) ###-####. All at the same time, like so: /^(\d{3}-\d{3}-\d{4}|\d{10}|\(\d{3}\) \d{3}-\d{4})$/ Note that the three alternatives sit inside a single group, so that ^ and $ apply to every one of them.
Now let’s take a look at an email address. To be valid, an email address should look something like: example@domain.com. It may even be as complex as: alpha.123@crazy.com, but let’s just have a regex that allows a simple email address. Here goes:
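To try a multi-format pattern against a handful of inputs, here is a small sketch. The helper name isValidPhone and the sample numbers are assumptions made for illustration; the pattern keeps all three alternatives inside one group so that ^ and $ anchor each of them.

```php
<?php
// Hypothetical helper: accepts ###-###-####, ##########, or (###) ###-####.
function isValidPhone(string $phone): bool
{
    // The outer group keeps ^ and $ anchored to every alternative.
    $regex = '/^(\d{3}-\d{3}-\d{4}|\d{10}|\(\d{3}\) \d{3}-\d{4})$/';
    return preg_match($regex, $phone) === 1;
}

// Illustrative inputs: three accepted formats and two bad numbers.
$samples = ['123-456-7890', '1234567890', '(123) 456-7890', '555-555-555a', '123-456-78901'];
foreach ($samples as $sample) {
    echo $sample, ': ', isValidPhone($sample) ? 'valid' : 'bad', "\n";
}
```

Without the outer group, ^ would bind only to the first alternative and $ only to the last, so a string that merely contains ten digits somewhere could slip through.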
<?php
$email = 'example@domain.net';
// \. matches a literal dot; the trailing i makes the match case-insensitive.
if (preg_match('/^[a-z0-9_-]+@([_a-z0-9-]+\.)+(com|net)$/i', $email))
    echo 'Valid Email ending in .com or .net';
else
    echo 'Bad email address!';
?>
The regular expression allows only a simple email address. It only accepts addresses ending in .com or .net, but that can easily be extended. I think you can see how powerful this is! Allowing only validly formatted email addresses will prevent problems down the line if you ever have to email this person using a script. At least the address will be correctly formatted, so your script won’t fail prematurely due to a formatting error; the worst that can happen is that the email is sent but no one ever receives it!
The simplicity of regex in form validation is unparalleled. How else would you ensure that the user put in a valid phone number? Would you individually check each character in the string? Make sure that the first character is a number, the second, the third, then a dash, number, number, … That is a pain to do manually, whereas a regular expression is just a single string which can do all of the work.
Soon I will cover another use of regular expressions: search and replace! There are countless uses, but I will stick to topics you will surely have to deal with as a web programmer, and by learning regular expressions you will see a new world of possibilities. It would be best to start with the basics of regular expressions, so here are some links.
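One possible way to extend it, sketched below, is to build the list of allowed endings from an array. The function name isSimpleEmail and the default list of endings are assumptions for illustration, and this remains a deliberately simple check rather than a complete email validator.

```php
<?php
// Hypothetical helper: the same simple pattern, with a configurable list of endings.
function isSimpleEmail(string $email, array $endings = ['com', 'net', 'org']): bool
{
    // Escape each ending so it is matched literally, then join with | for alternation.
    $escaped = array_map(function ($ending) {
        return preg_quote($ending, '/');
    }, $endings);
    $regex = '/^[a-z0-9_-]+@([a-z0-9_-]+\.)+(' . implode('|', $escaped) . ')$/i';
    return preg_match($regex, $email) === 1;
}

var_dump(isSimpleEmail('example@domain.org'));          // ending is in the default list
var_dump(isSimpleEmail('example@domain.xyz'));          // ending is not in the default list
var_dump(isSimpleEmail('example@domain.xyz', ['xyz'])); // ending supplied by the caller
```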
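To make the contrast concrete, here is a rough sketch of what a manual check for ###-###-#### could look like next to a regex version. Both function names are hypothetical, and the manual version is only one of many ways it could be written.

```php
<?php
// Hypothetical manual check for ###-###-####, one character at a time.
function isValidPhoneManual(string $phone): bool
{
    if (strlen($phone) !== 12) {
        return false;
    }
    for ($i = 0; $i < 12; $i++) {
        $char = $phone[$i];
        if ($i === 3 || $i === 7) {
            // Positions 3 and 7 must be dashes.
            if ($char !== '-') {
                return false;
            }
        } elseif (!ctype_digit($char)) {
            // Every other position must be a digit.
            return false;
        }
    }
    return true;
}

// The same format expressed as a single regular expression.
function isValidPhoneRegex(string $phone): bool
{
    return preg_match('/^\d{3}-\d{3}-\d{4}$/', $phone) === 1;
}

// Illustrative inputs: both functions should agree on each one.
foreach (['123-456-7890', '555-555-555a'] as $sample) {
    var_dump(isValidPhoneManual($sample) === isValidPhoneRegex($sample));
}
```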
Overview - A Brief but Very Accurate Overview of Regular Expressions.
Regular-Expressions.info - Regex Tutorial, Examples and Reference - Regexp Patterns
This is the website I highly recommend that you start at.
Regular expression - Wikipedia, the free encyclopedia
Wikipedia is always a great resource for programming questions.
Regular Expressions Cheat Sheet - Cheat Sheets - ILoveJackDaniels.com
If ever you need a cheat sheet, be it PHP, Ruby on Rails, or CSS, get it here!
