JavaScript Regular Expressions for Regular People

Published on February 7, 2019

By Aaron Arney

Regular expressions, also known as regex or regexp, are a difficult subject to tackle. Don’t feel ashamed if you’re not 100% comfortable writing your own regular expressions yet, as it does take some getting used to. My hope is that by the end of this article, you’ll be one step closer to rocking your own expressions in JavaScript without relying so much on copypasta from Stack Overflow.

The first step to writing a regular expression is to understand how to invoke it. In JavaScript, regular expressions are a standard built-in object. Because of this, we can create a new RegExp object in a few ways:

The literal way, /expression/.test('string to test against')
The new keyword with string argument, new RegExp('expression')
The new keyword with literal, new RegExp(/expression/)

I’ll use a combination of the methods just to show that they essentially perform the same job.
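As a quick check that the three forms behave alike, here is a minimal sketch. The pattern and the sample string are placeholders chosen for illustration, not part of the example that follows. It uses test, the RegExp method that returns true or false.

```javascript
// Placeholder pattern and sample text, used only for illustration.
const sample = 'string to test against an expression';

const fromLiteral = /expression/;
const fromString = new RegExp('expression');
const fromLiteralArg = new RegExp(/expression/);

// RegExp.prototype.test returns true when the pattern is found in the string.
const results = [fromLiteral, fromString, fromLiteralArg].map((re) => re.test(sample));

console.log(results);
// expected output: Array [ true, true, true ]

// All three objects hold the same pattern source.
console.log(fromLiteral.source === fromString.source && fromString.source === fromLiteralArg.source);
// expected output: true
```

One practical difference: in the string form, backslashes have to be doubled, so /\d+/ becomes new RegExp('\\d+').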

The Goals of our Regular Expression

In my example I’m going to be working with a string that contains my first name, last name, and a domain name. In the real world, the example would need much more thought. There are scores of subtleties when it comes to dealing with names, which I won’t address here.

Let’s say I’m building a dashboard and want to display the name of the logged-in user. I have no control over the data that’s returned to me so I have to make do with what I have.

I need to convert aaron.arney:alligator.io into Aaron Arney [Alligator].

Regular expressions fit a lot of logic into a single condensed object. This can and will cause confusion. A good practice is to break down your expression into a form of pseudo-code. This enables us to see what needs to happen and when.

Extract the first name
Extract the last name
Extract the domain name
Format the string into the desired templated format First Last [Domain]

Matching the First Name

To match a string with a regular expression, all you have to do is pass the literal string as the pattern. The i at the end of the expression is a flag. The i flag in particular stands for case insensitive. That means that our expression will ignore casing when it searches the string.

const unformattedName = 'aaron.arney:alligator.io';

const found = unformattedName.match(/aaron/i);

console.log(found);
// expected output: Array [ "aaron" ]

That works well, yet in our case it isn’t a good approach since the name of the user isn’t always going to be “Aaron.” This is where we explore programmatically matching strings.

Let’s focus on matching a first name for the time being. Break the word down into individual characters. What do you see?

The name “Aaron” consists of five alphabetic characters. Does every first name have only five characters? No, but it is reasonable to assume that first names range between 1 and 15 characters. To denote a single character in the range a to z, we use the character class [a-z].

Now, if we update our expression to use this character class…

const unformattedName = 'aaron.arney:alligator.io';

const found = unformattedName.match(/[a-z]/i);

console.log(found);
// expected output: Array [ "a" ]

Instead of extracting “aaron” from the string, it only returns “a.” This is expected: a character class without a quantifier matches exactly one character. To repeat the character match up to our limit of 15, we use curly brackets. The quantifier {1,15} tells the expression that we want the preceding token, our [a-z], to match between 1 and 15 times.

const unformattedName = 'aaron.arney:alligator.io';
const unformattedNameTwo = 'montgomery.bickerdicke:alligator.io';
const unformattedNameThree = 'a.lila:alligator.io';

const exp = new RegExp(/[a-z]{1,15}/, 'i');

const found = unformattedName.match(exp);
const foundTwo = unformattedNameTwo.match(exp);
const foundThree = unformattedNameThree.match(exp);

console.log(found);
// expected output: Array [ "aaron" ]

console.log(foundTwo);
// expected output: Array [ "montgomery" ]

console.log(foundThree);
// expected output: Array [ "a" ]
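To see how the bounds of the quantifier behave, the sketch below uses a made-up first name (an assumption for illustration) that is longer than our limit of 15. It also shows the lazy form {1,15}?, which takes as few characters as it can, while the default form takes as many as it is allowed.

```javascript
// Hypothetical input whose first name is longer than our 15-character limit.
const longName = 'wolfeschlegelsteinhausen.smith:alligator.io';

// Default (greedy): takes as many characters as allowed, so it stops at 15.
const greedy = longName.match(/[a-z]{1,15}/i);
console.log(greedy[0], greedy[0].length);
// expected output: wolfeschlegelst 15

// Lazy: the trailing ? makes the quantifier take as few characters as possible.
const lazy = longName.match(/[a-z]{1,15}?/i);
console.log(lazy[0]);
// expected output: w
```

The upper bound silently truncates longer names, which is one reason the limit deserves more thought in real code.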

Matching the Last Name

Extracting the last name should be as easy as copying and pasting our first expression. If you simply run the same expression again, though, you’ll notice that the match still returns the first name instead of both the first and last names.

Break down the string character by character and you’ll see that a full stop separates the names. To account for this, we add the full stop to our expression.

We have to be careful here. The . can mean one of two things in an expression.

. - Match any character except newline
\. - Match a .

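In this context, using either version generates the same result, but that won’t always be the case: the unescaped . would also accept any other separator character. Tools like ESLint will sometimes mark the escape as unnecessary (for example inside a character class such as [^\.], where . is already literal), but I say better safe than sorry!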
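The difference shows up as soon as the separator is something other than a full stop. In this small sketch, the hyphenated input is a hypothetical value chosen to make the two versions disagree.

```javascript
// Hypothetical input that uses a hyphen where we expect a full stop.
const hyphenated = 'aaron-arney';

// Unescaped: . matches any character except a newline, including the hyphen.
const unescaped = /[a-z]{1,15}.[a-z]{1,15}/i;
console.log(unescaped.test(hyphenated));
// expected output: true

// Escaped: \. matches only a literal full stop.
const escaped = /[a-z]{1,15}\.[a-z]{1,15}/i;
console.log(escaped.test(hyphenated));
// expected output: false

console.log(escaped.test('aaron.arney'));
// expected output: true
```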

const unformattedName = 'aaron.arney:alligator.io';

const exp = new RegExp(/[a-z]{1,15}\.[a-z]{1,15}/, 'i');

const found = unformattedName.match(exp);

console.log(found);
// expected output: Array [ "aaron.arney" ]

Since we want to split the string into two items and exclude the full stop from the result, we can now use capturing groups. These are denoted by parentheses () and wrap around the parts of your expression that you want returned. If we wrap them around the first and last name expressions, we’ll get new results.

The syntax for using capture groups is simple: (expression). Since I only want to return my first and last name and not the full stop, I wrap each name expression in parentheses. The match array then holds the full match at index 0, followed by one entry per group.

const unformattedName = 'aaron.arney:alligator.io';

const exp = new RegExp(/([a-z]{1,15})\.([a-z]{1,15})/, 'i');

const found = unformattedName.match(exp);

console.log(found);
// expected output: Array [ "aaron.arney", "aaron", "arney" ]
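Once the groups are in the match array, array destructuring is a convenient way to pull them out. The sketch below also shows named capture groups (available since ES2018) as an alternative. The variable names and the group names first and last are my own choices for illustration.

```javascript
const unformattedName = 'aaron.arney:alligator.io';

// Skip index 0 (the full match) and name the two captured groups.
const [, firstName, lastName] = unformattedName.match(/([a-z]{1,15})\.([a-z]{1,15})/i);
console.log(firstName, lastName);
// expected output: aaron arney

// Named capture groups: (?<name>...) exposes each result on the .groups property.
const named = unformattedName.match(/(?<first>[a-z]{1,15})\.(?<last>[a-z]{1,15})/i);
console.log(named.groups.first, named.groups.last);
// expected output: aaron arney
```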

Matching the Domain Name

To extract “alligator.io”, we will use the character classes we’ve already used thus far. With some slight modification, of course.

Validating domain names and TLDs is a difficult business. We’re going to pretend the domains that we parse are always between 3 and 25 characters long, and that the TLDs are always between 2 and 10 characters long. If we plug these in, we will get some new output:

const unformattedName = 'aaron.arney:alligator.io';

const exp = new RegExp(/([a-z]{1,15})\.([a-z]{1,15}):([a-z]{3,25}\.[a-z]{2,10})/, 'i');

const found = unformattedName.match(exp);

console.log(found);
// expected output: Array [ "aaron.arney:alligator.io", "aaron", "arney", "alligator.io" ]
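Keep in mind that match returns null when the input does not fit the expression, so reading the groups directly can throw. A minimal sketch of a guard follows. parseUser is a hypothetical helper name, and the second input is a made-up value that lacks a last name.

```javascript
const exp = /([a-z]{1,15})\.([a-z]{1,15}):([a-z]{3,25}\.[a-z]{2,10})/i;

// Hypothetical helper: returns the captured parts, or null when the input does not fit.
function parseUser(raw) {
  const found = raw.match(exp);
  if (found === null) {
    return null;
  }
  const [, first, last, domain] = found;
  return { first, last, domain };
}

console.log(parseUser('aaron.arney:alligator.io'));
// expected output: Object { first: "aaron", last: "arney", domain: "alligator.io" }

console.log(parseUser('aaron:alligator.io'));
// expected output: null
```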

A Shortcut

I showed you the “long way” of going about the expression. Now, I’ll show you how you can have a less verbose expression that captures the same text. By using the + quantifier, we can tell our expression to repeat the preceding token one or more times, as many times as it can. It will continue until it hits a dead end, in our case the full stop. This expression also introduces the g flag, which stands for global. It tells the expression that we want every match in the string, instead of stopping after the first one.

// With the global flag
'aaron.arney:alligator.io'.match(/[a-z]+/ig);
// expected output: Array(4) [ "aaron", "arney", "alligator", "io" ]

// Without the global flag
'aaron.arney:alligator.io'.match(/[a-z]+/i);
// expected output: Array(1) [ "aaron" ]
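One thing to watch for: when the g flag is set, match returns only the full matches and drops any capture groups. If you need both, matchAll (available since ES2020) is one option. The sketch below is illustrative, and the variable names are my own.

```javascript
const input = 'aaron.arney:alligator.io';

// With g, match returns only the full matches; the groups are dropped.
const fullMatches = input.match(/([a-z]+)\.([a-z]+)/ig);
console.log(fullMatches);
// expected output: Array [ "aaron.arney", "alligator.io" ]

// matchAll yields one match array per hit, capture groups included.
const groupPairs = [...input.matchAll(/([a-z]+)\.([a-z]+)/ig)].map((m) => [m[1], m[2]]);
console.log(groupPairs);
// expected output: Array [ Array [ "aaron", "arney" ], Array [ "alligator", "io" ] ]
```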

Formatting Output

To format the string, we’ll be using the replace method on the String object. The replace method takes two arguments:

RegExp | String - The pattern to search for, as a regular expression or a string
String | function - The replacement, as a string or a function that returns one

const unformattedName = 'aaron.arney:alligator.io';

// The "long" way
const exp = new RegExp(/([a-z]{1,15})\.([a-z]{1,15}):([a-z]{3,25}\.[a-z]{2,10})/, 'i');

unformattedName.replace(exp, '$1 $2 [$3]');
// expected output:  "aaron arney [alligator.io]"

// A slightly shorter way
unformattedName.replace(/([a-z]+)\.([a-z]+):([a-z]+\.[a-z]{2,10})/ig, '$1 $2 [$3]');
// expected output: "aaron arney [alligator.io]"

In the above snippet, $1, $2, and $3 are special patterns that get interpreted by the replace method.

$1 - A reference to the first parenthesized group (index 1 of the match array)
$2 - A reference to the second parenthesized group (index 2 of the match array)
$n - And so on for the nth group
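Because the references are resolved by position, they can appear in any order in the replacement string. A short sketch with illustrative inputs follows. It also uses $&, which stands for the whole match.

```javascript
// Swap the two captured groups to get "last, first".
const swapped = 'aaron.arney'.replace(/([a-z]+)\.([a-z]+)/i, '$2, $1');
console.log(swapped);
// expected output: "arney, aaron"

// $& inserts the entire matched substring.
const wrapped = 'aaron.arney'.replace(/[a-z]+/i, '<$&>');
console.log(wrapped);
// expected output: "<aaron>.arney"
```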

To capitalize the words, we can use another regex. Instead of formatting the output like we did above, we will pass a function. The function capitalizes the argument supplied and returns it.

Here, I’m introducing a few new parts: an anchor, a word boundary, alternation, and the negated character class [^].

[^abc] - Not a, b, or c
\b - Word boundary
ab|cd - Logical “OR”, matches ab or cd

// Capitalize the words
"aaron arney [alligator.io]".replace(/(^\b[a-z])|([^\.]\b[a-z])/g, (char) => char.toUpperCase());
// expected output: "Aaron Arney [Alligator.io]"

Breaking down this expression into two parts…

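(^\b[a-z]) - Capture the first character of the string. ^ says to match the beginning of the string.
|([^\.]\b[a-z]) - OR, match the first letter of a new word together with the character before it, as long as that character is not a full stop. This skips the TLD. The preceding character is upper-cased too, which is harmless for a space or a bracket.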
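The pieces above can also be combined into a single replace call that passes a function and drops the TLD, which gets us to the Aaron Arney [Alligator] format from the start of the article. This is only one possible sketch. formatUser and capitalize are hypothetical helper names, and the ^ and $ anchors are an added assumption that the whole string must fit the pattern.

```javascript
// Hypothetical helper: upper-case the first letter of a word.
const capitalize = (word) => word.charAt(0).toUpperCase() + word.slice(1);

// Hypothetical helper: the TLD is matched but left outside the third group.
function formatUser(raw) {
  const pattern = /^([a-z]+)\.([a-z]+):([a-z]+)\.[a-z]{2,10}$/i;
  // The function receives the full match first, then each captured group.
  return raw.replace(pattern, (match, first, last, domain) =>
    `${capitalize(first)} ${capitalize(last)} [${capitalize(domain)}]`);
}

console.log(formatUser('aaron.arney:alligator.io'));
// expected output: "Aaron Arney [Alligator]"

// Input that does not fit the pattern is returned unchanged.
console.log(formatUser('not a user'));
// expected output: "not a user"
```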

Continuing Your Exploration

This is but a small taste of the power of regular expressions. The example I worked through is improvable, but how?

Is the expression too verbose? Is it too simplified?
Does it cover edge cases?
Could you replace it with some clever string manipulation using native methods?
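As one possible starting point for the last question, here is a sketch that uses split instead of a regular expression. formatWithSplit and capitalize are hypothetical names, and the sketch assumes the input always has the first.last:domain.tld shape. It does no validation.

```javascript
// Hypothetical helper: upper-case the first letter of a word.
const capitalize = (word) => word.charAt(0).toUpperCase() + word.slice(1);

// Assumes well-formed input; a missing ":" or "." would throw or give odd output.
function formatWithSplit(raw) {
  const [namePart, domainPart] = raw.split(':');
  const [first, last] = namePart.split('.');
  const [domain] = domainPart.split('.');
  return `${capitalize(first)} ${capitalize(last)} [${capitalize(domain)}]`;
}

console.log(formatWithSplit('aaron.arney:alligator.io'));
// expected output: "Aaron Arney [Alligator]"
```

The trade-off is that this version is easier to read but accepts anything, whereas the regular expression also checks the shape of the input.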

This is where you take the knowledge you learned and try to answer those questions. Explore the following resources to help you in your journey and experiment!

Category:

Tutorial

Tags:

JavaScript

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
