Ryan Moscoe

Software Engineer | AI Prompt Engineer | Ninja

Email Regex Tutorial

February 19, 2023

This tutorial provides an example of a regular expression (regex) for an email address in JavaScript and explains the meaning of each part of the regex. A regex is a pattern that can be used to search within a string, replace characters in a string, or validate input. In order to understand the email regex below, it is helpful to understand the underlying pattern, which consists of the required components of an email address:

  1. A prefix, consisting of up to 64 uppercase letters, lowercase letters, numbers, or specific special characters. The email address cannot begin with a special character.
  2. The @ symbol.
  3. A domain name, consisting of up to 253 uppercase letters, lowercase letters, numbers, hyphens (-), or periods (.).
  4. A period (.).
  5. The top-level domain, which can include two to six letters.

Summary

For this tutorial, we will use the regex /^([a-z0-9]{1})([a-z0-9_\.!#$%&'*+-/=?^`{|}~]{0,63})@([\da-z\.-]{1,253})\.([a-z\.]{2,6})$/ to describe an email address. As written, the pattern matches only lowercase letters; adding the i flag after the closing slash makes it case-insensitive, so uppercase letters are accepted as well. The remainder of this tutorial covers the components of this regex. Those components include many of the most common features of a regex, which makes this tutorial broadly applicable to other JavaScript regular expressions.
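As a quick way to try the regex, the sketch below wraps it in a small helper. The names emailRegex, caseInsensitiveEmailRegex, and isValidEmail are hypothetical and chosen only for this illustration; the pattern itself is copied from the tutorial unchanged.

```javascript
// The regex from this tutorial, copied as written (no flags).
const emailRegex = /^([a-z0-9]{1})([a-z0-9_\.!#$%&'*+-/=?^`{|}~]{0,63})@([\da-z\.-]{1,253})\.([a-z\.]{2,6})$/;

// Same pattern with the i flag, so uppercase letters are accepted too.
const caseInsensitiveEmailRegex = new RegExp(emailRegex, "i");

// Hypothetical helper: returns true when the whole string matches the pattern.
function isValidEmail(address) {
  return emailRegex.test(address);
}

console.log(isValidEmail("example@mail.com")); // true
console.log(isValidEmail("Example@mail.com")); // false (uppercase "E", no i flag)
console.log(caseInsensitiveEmailRegex.test("Example@mail.com")); // true
```

RegExp.prototype.test() returns a boolean, which is usually all that input validation needs.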

Regex Components

Anchors

In the email address regex above, the ^ at the beginning and the $ at the end are anchors. A caret (^) at the beginning of a regex requires the match to start at the beginning of the string, so the string must begin with whatever follows the caret. Similarly, a dollar sign ($) at the end of a regex requires the match to finish at the end of the string, so the string must end with whatever precedes the dollar sign. Together, the two anchors force the entire string to match the pattern. The regex above therefore requires an email address to begin with a letter or number and end with a series of two to six letters or periods.

  • Valid example: example@mail.com
  • Invalid example: !xample@mail.c%m
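To see what the anchors contribute, the sketch below compares a simplified pattern with and without them. Both patterns are deliberately reduced stand-ins for the full email regex, and the variable names are made up for this example.

```javascript
// Simplified stand-ins for the email regex, with and without anchors.
const unanchoredPattern = /[a-z]+@[a-z]+\.[a-z]{2,6}/;
const anchoredPattern = /^[a-z]+@[a-z]+\.[a-z]{2,6}$/;

const noisyInput = "!!example@mail.com!!";

// Without anchors, a match anywhere inside the string is enough.
console.log(unanchoredPattern.test(noisyInput)); // true

// With anchors, the whole string must fit the pattern.
console.log(anchoredPattern.test(noisyInput)); // false
console.log(anchoredPattern.test("example@mail.com")); // true
```

For validation, both anchors are normally needed; otherwise extra characters before or after the address go unnoticed.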

Quantifiers

Quantifiers indicate the minimum, maximum, or exact number of times the preceding character, group, or bracket expression must match. The email address regex provided above uses numbers wrapped in braces ({}) to indicate exact quantities (e.g., {1}) and ranges (e.g., {0,63}). In the latter example, the preceding pattern must match at least 0 and at most 63 times. Note that the range is written without a space after the comma. Other examples of quantifiers include the plus sign (+), which means one or more matches, the asterisk (*), which means zero or more matches, and the question mark (?), which means zero or one match. The valid and invalid examples below are based on the email address regex above.

  • Valid example: example@mail.com
  • Invalid example: example@mail.computer
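The quantifiers described above can be tried in isolation. The patterns in this sketch are small illustrations rather than pieces of a complete email validator, and the variable names are hypothetical.

```javascript
// {2,6}: between two and six lowercase letters, like the top-level domain.
const tldPattern = /^[a-z]{2,6}$/;
console.log(tldPattern.test("com"));      // true (3 letters)
console.log(tldPattern.test("computer")); // false (8 letters)
console.log(tldPattern.test("c"));        // false (1 letter)

// +: one or more; *: zero or more; ?: zero or one.
const oneOrMore = /^a+$/;
const zeroOrMore = /^a*$/;
const optionalU = /^colou?r$/;

console.log(oneOrMore.test(""));        // false
console.log(zeroOrMore.test(""));       // true
console.log(optionalU.test("color"));   // true
console.log(optionalU.test("colour"));  // true
console.log(optionalU.test("colouur")); // false
```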

Grouping Constructs

Parentheses (()) can be used to group patterns within a regex. The email address regex above uses parentheses to group each bracket expression (discussed next) with its quantifier. For example, ([\da-z\.-]{1,253}) groups the bracket expression [\da-z\.-] and its quantifier {1,253} into a single unit for that position of the pattern (the domain). Each group also captures the text it matches, so the parts of a matched email address can be retrieved separately. Grouping can also be used to indicate alternative options, such as (abc|xyz), which would mean either "abc" or "xyz."
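Because each pair of parentheses in the email regex is a capturing group, the matched pieces can be read back out. The sketch below uses String.prototype.match() for this; the variable names are assumptions made for the illustration.

```javascript
// The regex from this tutorial, copied as written.
const groupedEmailRegex = /^([a-z0-9]{1})([a-z0-9_\.!#$%&'*+-/=?^`{|}~]{0,63})@([\da-z\.-]{1,253})\.([a-z\.]{2,6})$/;

// match() returns an array: index 0 is the full match, 1..4 are the groups.
const emailParts = "example@mail.com".match(groupedEmailRegex);
const [, firstChar, restOfPrefix, domainName, topLevelDomain] = emailParts;

console.log(firstChar);      // "e"
console.log(restOfPrefix);   // "xample"
console.log(domainName);     // "mail"
console.log(topLevelDomain); // "com"

// Alternation inside a group: either "abc" or "xyz".
const eitherOr = /^(abc|xyz)$/;
console.log(eitherOr.test("xyz")); // true
console.log(eitherOr.test("abz")); // false
```

When the string does not match, match() returns null, so real code should check the result before destructuring it.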

Bracket Expressions

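Bracket expressions define a set of characters, any one of which will satisfy that position of the regex pattern. For example, [\da-z\.-] means that any digit (\d), lower-case letter (a-z), period (.), or hyphen (-) will suffice. A hyphen between two characters, as in a-z, defines a range; a hyphen at the very end of the brackets is treated as a literal hyphen.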

  • Valid example: example@mail.com
  • Invalid example: example@m@il.com
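The sketch below tests the domain bracket expression on its own, then looks at one detail of the prefix bracket expression that is easy to miss: in +-/ the hyphen sits between two characters, so it defines a range (+ , - . /) rather than a literal hyphen. The variable names are hypothetical, and the escaped variant is only one possible way to make the hyphen literal.

```javascript
// The domain bracket expression from the tutorial, applied to a whole string.
const domainCharsOnly = /^[\da-z\.-]+$/;
console.log(domainCharsOnly.test("mail-server.example1")); // true
console.log(domainCharsOnly.test("m@il"));                 // false ("@" is not in the set)

// In the prefix expression, "+-/" is a range from "+" to "/",
// which also includes the comma.
const rangePlusToSlash = /^[+-/]$/;
console.log(rangePlusToSlash.test(",")); // true

// Escaping the hyphen lists "+", "-", and "/" individually instead.
const literalPlusHyphenSlash = /^[+\-/]$/;
console.log(literalPlusHyphenSlash.test(",")); // false
console.log(literalPlusHyphenSlash.test("-")); // true
```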

Character Classes

Character classes provide a shorthand method for identifying a set of characters. For example, the range [0-9] can be shortened to \d (digit). Each shorthand class begins with a backslash (\), which tells the regex engine to treat the letter that follows as having special meaning rather than as a literal character: \d means any digit, while d alone means the letter d. The table below shows a few additional examples of character classes.

Class  Name        Description
\w     Word        Upper- and lower-case letters, digits, and underscores (_)
\W     Non-Word    Anything other than a word character
\s     Whitespace  A space, tab, newline, or other whitespace character
\n     Newline     A hidden character indicating a new line of text
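The classes in the table can be checked directly in JavaScript. The sketch below is only an illustration, with made-up variable names and sample strings.

```javascript
const digitsOnly = /^\d+$/;
const wordCharsOnly = /^\w+$/;
const hasWhitespace = /\s/;
const hasNonWord = /\W/;

console.log(digitsOnly.test("2023"));          // true
console.log(wordCharsOnly.test("user_name1")); // true
console.log(wordCharsOnly.test("user-name"));  // false (a hyphen is not a word character)
console.log(hasNonWord.test("user-name"));     // true (the hyphen)
console.log(hasWhitespace.test("two words"));  // true
console.log(hasWhitespace.test("a\tb"));       // true (a tab also counts as whitespace)

// \n matches a newline, which makes it handy for splitting text into lines.
const lines = "line1\nline2".split(/\n/);
console.log(lines.length); // 2
```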

Character Escapes

If a character has special meaning in a regex, preceding that character with a backslash (\) forces the character to be treated as literal text with no special meaning. This is called escaping the character, and the combination of a backslash and the character to be escaped is called an escape sequence. In the email address regex above, there are several instances of \., which cause the period to be treated as a literal period rather than as the dot metacharacter, which matches any single character except a line terminator. Inside a bracket expression the period is already literal, so the backslash there is optional.
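The difference an escape makes is easiest to see side by side. In this sketch the patterns and variable names are illustrative only.

```javascript
// Unescaped dot: matches any single character (except a line terminator).
const unescapedDot = /^mail.com$/;
console.log(unescapedDot.test("mail.com")); // true
console.log(unescapedDot.test("mailXcom")); // true ("X" satisfies the dot)

// Escaped dot: matches only a literal period.
const escapedDot = /^mail\.com$/;
console.log(escapedDot.test("mail.com")); // true
console.log(escapedDot.test("mailXcom")); // false

// Inside brackets, a dot is literal even without the backslash.
const dotInBrackets = /^[.]$/;
console.log(dotInBrackets.test(".")); // true
console.log(dotInBrackets.test("x")); // false
```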
