Web Programming Step by Step, 2nd Edition

Lecture 12: Regular Expressions

Reading: 15.1 - 15.3

Except where otherwise noted, the contents of this document are Copyright 2012 Marty Stepp, Jessica Miller, and Victoria Kirst. All rights reserved. Any redistribution, reproduction, transmission, or storage of part or all of the contents in any form is prohibited without the author's expressed written permission.

What is form validation?

A real form that uses validation

[Image: wamu]

An example form to be validated

<form action="http://foo.com/foo.php" method="post">
	<div>
		City:  <input name="city" /> <br />
		State: <input name="state" size="2" maxlength="2" /> <br />
		ZIP:   <input name="zip" size="5" maxlength="5" /> <br />
		<input type="submit" />
	</div>
</form>

Recall: Basic server-side validation

$city  = $_POST["city"];
$state = $_POST["state"];
$zip   = $_POST["zip"];
if (!$city || strlen($state) != 2 || strlen($zip) != 5) {
	print "Error, invalid city/state/zip submitted.";
}

Regular expressions

/^[a-zA-Z_\-]+@(([a-zA-Z_\-])+\.)+[a-zA-Z]{2,4}$/
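
To get a feel for what the pattern above accepts, the sketch below runs it through preg_match on a few strings. The sample addresses and the variable names are made up for illustration and are not part of the lecture.

```php
<?php
# The pattern from the slide, in single quotes so PHP leaves the backslashes alone.
$email_regex = '/^[a-zA-Z_\-]+@(([a-zA-Z_\-])+\.)+[a-zA-Z]{2,4}$/';

# Hypothetical sample inputs.
$samples = array(
	"stepp@example.com",
	"jessica_miller@cs.example.edu",
	"user123@example.com",     # digits are not in the first character set
	"no-at-sign.example.com",  # no @ at all
	"someone@localhost",       # no dot after the @
);

$email_results = array();
foreach ($samples as $s) {
	$email_results[$s] = preg_match($email_regex, $s);   # 1 on a match, 0 otherwise
	print "$s => " . $email_results[$s] . "\n";
}
```

The first two samples match and the last three do not, which also shows that this pattern is stricter than real email syntax (it rejects digits, for instance).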

Regular expressions

This picture best describes regex.

Basic regular expressions

/abc/
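
A minimal sketch of how the simplest kind of pattern behaves in PHP: /abc/ matches any string that contains the letters abc in a row, anywhere in the string, and is case-sensitive by default. The sample strings are assumptions chosen for illustration.

```php
<?php
# Hypothetical sample strings, each tested against /abc/.
$abc_samples = array("abc", "abcdef", ".=.abc.=.", "fedcba", "ab c", "ABC");

$abc_results = array();
foreach ($abc_samples as $s) {
	$abc_results[$s] = preg_match('/abc/', $s);
	print "'$s' => " . $abc_results[$s] . "\n";
}
# "abc", "abcdef", ".=.abc.=."  => 1
# "fedcba", "ab c", "ABC"       => 0
```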

Wildcards: .

Special characters: |, (), \

Quantifiers: *, +, ?
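
A short sketch of the three quantifiers, assuming the usual meanings: * is 0 or more, + is 1 or more, and ? is 0 or 1 of the item just before it. The patterns and strings are illustrative, not taken from the slides.

```php
<?php
$quant_results = array(
	# * : zero or more b's between a and c
	"ab*c / ac"    => preg_match('/ab*c/', "ac"),     # 1
	"ab*c / abbbc" => preg_match('/ab*c/', "abbbc"),  # 1
	# + : at least one b is required
	"ab+c / ac"    => preg_match('/ab+c/', "ac"),     # 0
	"ab+c / abbbc" => preg_match('/ab+c/', "abbbc"),  # 1
	# ? : at most one b is allowed
	"ab?c / ac"    => preg_match('/ab?c/', "ac"),     # 1
	"ab?c / abc"   => preg_match('/ab?c/', "abc"),    # 1
	"ab?c / abbc"  => preg_match('/ab?c/', "abbc"),   # 0
);

foreach ($quant_results as $label => $result) {
	print "$label => $result\n";
}
```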

More quantifiers: {min,max}
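
The {min,max} form can be sketched the same way. The example assumes the three common variants: {2,3} (between 2 and 3), {2,} (2 or more), and {3} (exactly 3). The "ha!" strings are made up for illustration.

```php
<?php
$range_results = array(
	# between 2 and 3 a's
	"ha{2,3}! / ha!"    => preg_match('/ha{2,3}!/', "ha!"),     # 0
	"ha{2,3}! / haa!"   => preg_match('/ha{2,3}!/', "haa!"),    # 1
	"ha{2,3}! / haaa!"  => preg_match('/ha{2,3}!/', "haaa!"),   # 1
	"ha{2,3}! / haaaa!" => preg_match('/ha{2,3}!/', "haaaa!"),  # 0
	# 2 or more a's (no upper limit)
	"ha{2,}! / haaaa!"  => preg_match('/ha{2,}!/', "haaaa!"),   # 1
	# exactly 3 a's
	"ha{3}! / haa!"     => preg_match('/ha{3}!/', "haa!"),      # 0
	"ha{3}! / haaa!"    => preg_match('/ha{3}!/', "haaa!"),     # 1
);

foreach ($range_results as $label => $result) {
	print "$label => $result\n";
}
```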

Practice exercise

Anchors: ^ and $
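
A brief sketch of the two anchors: ^ ties the match to the start of the string and $ to the end, so using both requires the whole string to fit the pattern. The sample strings are assumptions for illustration.

```php
<?php
$anchor_results = array(
	# ^ : must start with abc
	"^abc / abcdef"  => preg_match('/^abc/', "abcdef"),   # 1
	"^abc / xyzabc"  => preg_match('/^abc/', "xyzabc"),   # 0
	# $ : must end with abc
	"abc$ / abcdef"  => preg_match('/abc$/', "abcdef"),   # 0
	"abc$ / xyzabc"  => preg_match('/abc$/', "xyzabc"),   # 1
	# both : the entire string must be abc
	"^abc$ / abc"    => preg_match('/^abc$/', "abc"),     # 1
	"^abc$ / abcabc" => preg_match('/^abc$/', "abcabc"),  # 0
);

foreach ($anchor_results as $label => $result) {
	print "$label => $result\n";
}
```

This is why the validation patterns later in the lecture are wrapped in ^ and $: without them, a value that merely contains a valid piece would pass.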

Character sets: []

Character ranges: [start-end]
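
Character sets and ranges can be tried out as follows. This is only a sketch: [bcd] matches one of the listed characters, and a range such as [0-9] or [A-Z] matches any single character between its endpoints. The sample values are hypothetical.

```php
<?php
$set_results = array(
	# set: one of b, c, or d followed by "art"
	"[bcd]art / cart"      => preg_match('/[bcd]art/', "cart"),       # 1
	"[bcd]art / part"      => preg_match('/[bcd]art/', "part"),       # 0
	# range: exactly five digits, as for a ZIP code
	"^[0-9]{5}$ / 98195"   => preg_match('/^[0-9]{5}$/', "98195"),    # 1
	"^[0-9]{5}$ / 9819a"   => preg_match('/^[0-9]{5}$/', "9819a"),    # 0
	# range: exactly two uppercase letters, as for a state
	"^[A-Z]{2}$ / WA"      => preg_match('/^[A-Z]{2}$/', "WA"),       # 1
	"^[A-Z]{2}$ / wa"      => preg_match('/^[A-Z]{2}$/', "wa"),       # 0
	# several ranges and single characters combined in one set
	"^[a-zA-Z_]+$ / a_b"   => preg_match('/^[a-zA-Z_]+$/', "a_b"),    # 1
	"^[a-zA-Z_]+$ / a1b"   => preg_match('/^[a-zA-Z_]+$/', "a1b"),    # 0
);

foreach ($set_results as $label => $result) {
	print "$label => $result\n";
}
```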

Practice Exercises

Escape sequences
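
Assuming this heading refers to the shorthand character classes \d (digit), \w (word character: letter, digit, or underscore), \s (whitespace), and their uppercase negations, a small sketch looks like this. Single-quoted PHP strings are used so the backslashes reach the regex engine unchanged; the sample values are made up.

```php
<?php
$escape_results = array(
	# \d : any digit, same as [0-9]
	'^\d{5}$ / 98195'   => preg_match('/^\d{5}$/', "98195"),     # 1
	'^\d{5}$ / 9819a'   => preg_match('/^\d{5}$/', "9819a"),     # 0
	# \w : letter, digit, or underscore
	'^\w+$ / web_prog2' => preg_match('/^\w+$/', "web_prog2"),   # 1
	'^\w+$ / web-prog'  => preg_match('/^\w+$/', "web-prog"),    # 0
	# \s : any whitespace character
	'\s / quick fox'    => preg_match('/\s/', "quick fox"),      # 1
	'\s / quickfox'     => preg_match('/\s/', "quickfox"),       # 0
	# \D : any character that is NOT a digit
	'\D / 98195'        => preg_match('/\D/', "98195"),          # 0
);

foreach ($escape_results as $label => $result) {
	print "$label => $result\n";
}
```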

Regular expressions in PHP (PDF)

function                                    description
preg_match(regex, string)                   returns 1 (which PHP treats as true) if string matches regex, 0 if it does not
preg_replace(regex, replacement, string)    returns a new string with all substrings that match regex replaced by replacement
preg_split(regex, string)                   returns an array of strings from given string broken apart using given regex as delimiter (like explode but more powerful)
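
The return values of the three functions can be checked with a few calls. This is a minimal sketch; the sample strings and variable names are assumptions made for illustration.

```php
<?php
$course = "CSE 190 and CSE 154";   # hypothetical sample text

# preg_match: 1 if the regex matches somewhere in the string, 0 if not
$found   = preg_match('/[0-9]+/', $course);
$missing = preg_match('/[0-9]+/', "no digits here");

# preg_replace: every match is replaced, and the original string is left unchanged
$replaced = preg_replace('/[0-9]+/', "#", $course);

# preg_split: the regex describes the delimiter (a comma or semicolon, then optional spaces)
$pieces = preg_split('/[,;] */', "red, green;blue");

print "$found $missing\n";   # 1 0
print "$replaced\n";         # CSE # and CSE #
print_r($pieces);            # red, green, blue
```

Because preg_match returns 1 or 0, it can be used directly as the test of an if statement.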

PHP form validation w/ regexes

$state = $_POST["state"];
if (!preg_match("/^[A-Z]{2}$/", $state)) {
	print "Error, invalid state submitted.";
}
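
The same idea could be extended to all three fields of the example form. The sketch below is one possible arrangement, not the lecture's own code: find_invalid_fields is a hypothetical helper, and the city and ZIP patterns are assumptions (only the state pattern comes from the example above). It takes the parameter array as an argument so that it can be called with $_POST or with test data.

```php
<?php
# Hypothetical helper: returns the names of the fields that are missing or invalid.
function find_invalid_fields($params) {
	# Assumed rules; only the state pattern appears in the lecture.
	$rules = array(
		"city"  => '/^[A-Za-z .\-]+$/',   # letters, spaces, periods, hyphens
		"state" => '/^[A-Z]{2}$/',        # exactly two uppercase letters
		"zip"   => '/^[0-9]{5}$/',        # exactly five digits
	);

	$invalid = array();
	foreach ($rules as $name => $regex) {
		# a missing or non-string parameter counts as invalid
		if (!isset($params[$name]) || !is_string($params[$name])
				|| !preg_match($regex, $params[$name])) {
			$invalid[] = $name;
		}
	}
	return $invalid;
}

# Example call with made-up data; in a real page this would be $_POST.
$bad = find_invalid_fields(array("city" => "Seattle", "state" => "wa", "zip" => "9819"));
foreach ($bad as $name) {
	print "Error, invalid $name submitted.\n";
}
```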

Regular expression PHP example

# replace vowels with stars
$str = "the quick    brown        fox";

$str = preg_replace("/[aeiou]/", "*", $str);
                         # "th* q**ck    br*wn        f*x"

# break apart into words
$words = preg_split("/[ ]+/", $str);
                         # ("th*", "q**ck", "br*wn", "f*x")

# capitalize words that had 2+ consecutive vowels
for ($i = 0; $i < count($words); $i++) {
	if (preg_match("/\\*{2,}/", $words[$i])) {
		$words[$i] = strtoupper($words[$i]);
	}
}                        # ("th*", "Q**CK", "br*wn", "f*x")

Practice exercise

Use regular expressions to add validation to the turnin form shown in previous lectures.

Handling invalid data

function check_valid($regex, $param) {
	# a parameter that was not submitted at all is treated as invalid
	if (isset($_POST[$param]) && preg_match($regex, $_POST[$param])) {
		return $_POST[$param];
	} else {
		# code to run if the parameter is invalid
		die("Bad $param");
	}
}
...
$sid     = check_valid("/^[0-9]{7}$/", "studentid");
$section = check_valid("/^[AB][A-C]$/i", "section");

Regular expressions in HTML forms

How old are you?
<input type="text" name="age" size="2" pattern="[0-9]+" title="an integer" />
<input type="submit" />
  • HTML5 adds a new pattern attribute to input elements
  • the browser will refuse to submit the form unless the entire value matches the regex (an empty value is still accepted unless the field is also marked required)
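
Since the pattern attribute is only checked by the browser, a request sent without a browser can skip it, so the server would normally repeat the check. The sketch below roughly mirrors the age pattern in PHP; is_valid_age is a hypothetical helper name, and the ^ and $ anchors are added because the HTML pattern is applied to the whole value.

```php
<?php
# Hypothetical helper: server-side counterpart of pattern="[0-9]+".
function is_valid_age($value) {
	return is_string($value) && preg_match('/^[0-9]+$/', $value) === 1;
}

# Example calls with made-up values; a real page would pass the submitted age parameter.
$age_checks = array(
	"21" => is_valid_age("21"),   # true
	"2a" => is_valid_age("2a"),   # false
	"-5" => is_valid_age("-5"),   # false
	""   => is_valid_age(""),     # false
);
var_dump($age_checks);
```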