Simple Hacks to Regular Expressions in Javascript (PART 1)

A regular expression (regex) is a sequence of characters that defines a search pattern, which you can use to match, locate and manage text. A regular expression in JavaScript can be created in two ways:

  • Using a regular expression literal.
var regex = /test_string/;        // e.g. var regex = /Dan/;
  • Using the constructor function of the RegExp object.
var regex = new RegExp('test_string');      // e.g. var regex = new RegExp('Dan');

Either approach is valid. A regex literal is compiled when the script is loaded, so if the pattern remains constant, the literal can give better performance. The RegExp constructor is advisable when the pattern is dynamic or not known in advance (for example, user specified).
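
As a rough sketch of the dynamic case, the snippet below builds a regex from a user-supplied string. The helper names escapeRegExp and containsText are made up for illustration and are not part of JavaScript; the escaping step assumes the input should be matched literally rather than treated as a pattern.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex,
// so that user input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the pattern at runtime with the RegExp constructor.
function containsText(str, userInput) {
  var regex = new RegExp(escapeRegExp(userInput));
  return regex.test(str);
}

console.log(containsText('Dan Walker', 'Walk'));   // true
console.log(containsText('Dan Walker', 'D.n'));    // false, the dot is matched literally
```

If the user is meant to type a real pattern, the escaping step would be dropped, at the cost of having to handle invalid patterns.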

// Using a regex literal
var str = 'Dan Walker';
var regex = /Dan/;
console.log(regex.test(str));     // true

// Using the RegExp constructor
var str = 'Dan Walker';
var regex = new RegExp('Dan');
console.log(regex.test(str));     // true

The snippet above uses a regex literal and the RegExp constructor to test whether a string matches the pattern; the test method returns a boolean. The exec method gives a more detailed look at the match: it returns the matched text together with its index and the input string.

console.log(regex.exec(str));    // ["Dan", index: 0, input: "Dan Walker", groups: undefined]
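
To make the shape of that result a little more concrete, here is a small sketch that reads the individual parts of the value returned by exec. The pattern /Walk/ is just an example chosen for this illustration.

```javascript
var str = 'Dan Walker';
var match = /Walk/.exec(str);

console.log(match[0]);       // 'Walk', the matched text
console.log(match.index);    // 4, where the match starts
console.log(match.input);    // 'Dan Walker', the original string

// exec returns null when there is no match
console.log(/Bob/.exec(str));   // null
```

Because a failed match gives null, it is usually worth checking the result before reading match.index.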

A regular expression object can be state aware, meaning it remembers (in its lastIndex property) where its previous match ended. This can be observed with the global flag.

var str = 'This is the oasis'
var regex = new RegExp('is');
console.log(regex.exec(str));     // ["is", index: 2, input: "This is the oasis", groups: undefined]
console.log(regex.exec(str));    //  ["is", index: 2, input: "This is the oasis", groups: undefined]

Even though "is" appears more than once in the test string, the regex only reports its first appearance, however many times exec is called. With the global flag, represented as g, the regex keeps track of its position, so each call to exec moves on to the next occurrence.

var str = 'This is the oasis';
var regex = new RegExp('is', 'g');   // i.e. var regex = /is/g
console.log(regex.exec(str));      // ["is", index: 2, input: "This is the oasis", groups: undefined]
console.log(regex.exec(str));      // ["is", index: 5, input: "This is the oasis", groups: undefined]
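
Building on this, one possible way to visit every occurrence is to keep calling exec until it returns null. This is only a sketch; the positions array is a name introduced here for illustration.

```javascript
var str = 'This is the oasis';
var regex = /is/g;
var positions = [];
var match;

// With the g flag, each exec call resumes from regex.lastIndex
// and returns null once there are no more matches.
while ((match = regex.exec(str)) !== null) {
  positions.push(match.index);
}

console.log(positions);         // [2, 5, 15]
console.log(regex.lastIndex);   // 0, reset after the final failed search
```

Without the g flag this loop would never end, because exec would keep returning the first match.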

To reduce verbosity, we will use the regex literal from here on. Case sensitivity is a crucial part of regular expressions: by default, a pattern only matches text written in the same case. The case of the test string can be ignored using the i flag.

// Without the i flag
var str = 'This is the oasis';
var regex = /Is/g;
console.log(regex.test(str));          // false

// With the i flag
var str = 'This is the oasis';
var regex = /Is/gi;
console.log(regex.test(str));          // true

Adding the i flag makes the match ignore case. The order of the flags does not matter: /Is/gi and /Is/ig behave the same.
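
A quick sketch of the i flag on its own and combined with g, using the same test string. The flags property used on the last line assumes an ES2015 or newer environment.

```javascript
var str = 'This is the oasis';

console.log(/Is/.test(str));     // false, case must match by default
console.log(/Is/i.test(str));    // true, i ignores case

// Flag order is not significant
console.log(/Is/gi.test(str));   // true
console.log(/Is/ig.test(str));   // true
console.log(/Is/ig.flags);       // 'gi'
```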

Using the replace method in the string prototype

var str = 'This is the oasis';
var regex = /is/;
console.log(str.replace(regex, 'at'));   // 'That is the oasis' (only the first match is replaced)
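
As a short sketch of how the flags change what replace does, the example below compares a pattern with and without g, and also passes a function as the replacement. The variable names first, all and shouted are only illustrative.

```javascript
var str = 'This is the oasis';

// Without g only the first match is replaced; with g every match is.
var first = str.replace(/is/, 'at');
var all = str.replace(/is/g, 'at');

// The replacement can also be a function that receives the matched text.
var shouted = str.replace(/is/g, function (match) {
  return match.toUpperCase();
});

console.log(first);     // 'That is the oasis'
console.log(all);       // 'That at the oasat'
console.log(shouted);   // 'ThIS IS the oasIS'
console.log(str);       // 'This is the oasis', the original string is unchanged
```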

Returning the index of the first match using the search method

var str = 'This is the oasis' ; 
var regex = /is/
console.log(str.search(regex));      // 2
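
It may also help to see what search gives back when nothing matches. In the sketch below, hasMatch is a hypothetical helper written for this example, not a built-in.

```javascript
var str = 'This is the oasis';

console.log(str.search(/is/));      // 2, index of the first match
console.log(str.search(/oasis/));   // 12
console.log(str.search(/xyz/));     // -1, no match

// Hypothetical helper built on search
function hasMatch(text, regex) {
  return text.search(regex) !== -1;
}

console.log(hasMatch(str, /the/));   // true
```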

FINDING PATTERNS IN PLAIN TEXT (AN ILLUSTRATIVE APPROACH)

The . meta character matches any single character except a new line.

var regex = /./g ;           // Matches any single character except a new line.
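
A small sketch of the dot in action, using a made-up string that contains a line break:

```javascript
var text = 'ab\ncd';

console.log(text.match(/./g));       // ['a', 'b', 'c', 'd'], the newline is skipped
console.log(/a.c/.test('abc'));      // true
console.log(/a.c/.test('a\nc'));     // false, . does not match a newline
```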

var regex = /[hs]is/g ;    // Matches "is" preceded by h or s, i.e. "his" or "sis"
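
Here is a brief sketch of a character class at work on an example sentence. It also shows that a comma inside the brackets is treated as one more character to match, not as a separator, so /[h,s]is/ would match ",is" as well.

```javascript
var str = 'This is his thesis';

console.log(str.match(/[hs]is/g));      // ['his', 'his', 'sis']

// Inside the brackets a comma is a literal character, not a separator
console.log(/[h,s]is/.test(',is'));     // true
console.log(/[hs]is/.test(',is'));      // false
```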

There are quite a number of metacharacters used in regular expressions (Regex meta characters). A metacharacter can be escaped, which strips its special behaviour and gives it its literal meaning. The backslash \ is used to escape metacharacters in regular expressions.
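
To illustrate escaping, the sketch below compares an unescaped dot with an escaped one on a made-up file name. It also shows that when the pattern is written as a string for the RegExp constructor, the backslash itself has to be doubled.

```javascript
var fileName = 'report.final.txt';

console.log(fileName.match(/./g).length);    // 16, unescaped . matches every character
console.log(fileName.match(/\./g).length);   // 2, escaped \. matches only literal dots

// In a string passed to RegExp, the backslash itself must be escaped
var dot = new RegExp('\\.', 'g');
console.log(fileName.match(dot).length);     // 2
```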

Finding common sets of characters using regular expressions

  • Alphanumeric

\w matches the alphanumeric characters in a test string, as well as the underscore.

var regex = /\w/g    // i.e.  /[A-Za-z0-9_]/g
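
A short sketch of \w on an example string; note that the underscore is matched while spaces and punctuation are not.

```javascript
var str = 'user_1 @ home!';

console.log(str.match(/\w/g).join(''));   // 'user_1home'
console.log(/\w/.test('_'));              // true, underscore counts as a word character
console.log(/\w/.test('-'));              // false
```
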
  • Numeric (digits)

\d matches the numeric characters (digits) in a test string.

var regex = /\d/g    // i.e.  /[0-9]/g
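
For instance, \d could be used to pick out or mask the digits in a sentence. The string below is invented for the example.

```javascript
var str = 'Order 66 ships in 3 days';

console.log(str.match(/\d/g));           // ['6', '6', '3']
console.log(str.replace(/\d/g, '#'));    // 'Order ## ships in # days'
```
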
  • White spaces

\s matches the whitespace characters (such as spaces, tabs and line breaks) in the test string.

var regex = /\s/g
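
A minimal sketch showing that \s covers more than the plain space, using an example string that contains a space, a tab and a newline:

```javascript
var str = 'one two\tthree\nfour';

console.log(str.match(/\s/g).length);    // 3: a space, a tab and a newline
console.log(str.replace(/\s/g, '-'));    // 'one-two-three-four'
```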

This article is meant to give an introduction to the fundamentals of regular expressions. The next part of this article will illustrate the following:

  • Finding repeated patterns using quantifiers.

  • Basic regular expressions used in validation.