Template literal inside of the RegEx

Javascript, Regex, Template Literals

Javascript Problem Overview


I tried to place a template literal inside of a RegEx, and it didn't work. I then made a variable regex which holds my RegEx, but it is still not giving me the desired result.

However, if I console.log(regex) individually, I do receive the desired RegEx, such as /.+?(?=location)/i, /.+?(?=date)/i and so on, but once I place regex inside .replace() it does not appear to work.

function validate (data) {
  let testArr = Object.keys(data);
  errorMessages.forEach((elem, i) => {
    const regex = `/.+?(?=${elem.value})/i`;
    const a = testArr[i].replace(regex, '');
  })
}

Javascript Solutions


Solution 1 - Javascript

Your regex variable is a String. To make it a RegExp, use a RegExp constructor:

const regex = new RegExp(String.raw`pattern_as_in_regex_literal_without_delimiters`)
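
The difference is easy to check in isolation. In this minimal sketch (the sample string `input` is made up for illustration), the string version is searched for literally, slashes included, so nothing is replaced, while the RegExp object is interpreted as a pattern:

```javascript
const input = 'event location'; // hypothetical sample value

// A string pattern is matched literally, including the slashes and the "i".
const asString = `/.+?(?=location)/i`;
const unchanged = input.replace(asString, '');

// A RegExp object is interpreted as a pattern.
const asRegExp = new RegExp(String.raw`.+?(?=location)`, 'i');
const stripped = input.replace(asRegExp, '');

console.log(typeof asString, unchanged);           // string event location
console.log(asRegExp instanceof RegExp, stripped); // true location
```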

For example, a regex literal like /<\d+>/g can be re-written as

const re = RegExp(String.raw`<\d+>`, 'g') // One \ is a literal backslash
const re = RegExp(`<\\d+>`, 'g')       // Two \ are required in a non-raw string literal

To insert a variable you may use

const digits = String.raw`\d+`;
const re = RegExp(`<${digits}>`, 'g')

To solve your issue, you may use

const regex = new RegExp(`.+?(?=${elem.value.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')})`, "i");

Also, it is a good idea to escape the variable part of the regex so that all special regex metacharacters are treated as literals.

const s = "final (location)";
const elemvalue = "(location)";
const regex = new RegExp(`.+?(?=${elemvalue.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')})`, "i");
// console.log(regex); // /.+?(?=\(location\))/i
// console.log(typeof(regex)); // object
let a = s.replace(regex, '');
console.log(a);
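
Applied to the function from the question, this could look like the sketch below. The shapes of `errorMessages` and `data` are not shown in the question, so the sample values here are assumptions, and `escapeRegExp` is just a local helper name chosen for this example.

```javascript
// Local helper (hypothetical name): escape regex metacharacters in a string.
const escapeRegExp = s => String(s).replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Assumed sample data; the real shapes are not shown in the question.
const errorMessages = [{ value: 'location' }, { value: 'date' }];
const data = { eventLocation: 'Berlin', 'start-date': '2020-12-07' };

function validate(data) {
  const keys = Object.keys(data);
  // map() instead of forEach() so the stripped keys are returned to the caller
  return errorMessages.map((elem, i) => {
    const regex = new RegExp(`.+?(?=${escapeRegExp(elem.value)})`, 'i');
    return keys[i].replace(regex, '');
  });
}

console.log(validate(data)); // [ 'Location', 'date' ]
```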

Solution 2 - Javascript

> A more advanced form of template literals are Tagged templates. - MDN

const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");

const regex = ({ // (1)
  raw: [part, ...parts]
}, ...subs) => new RegExp(
  subs.reduce( // (2)
    (result, sub, i) => `${result}${escape(sub)}${parts[i]}`,
    part
  )
);


const t1 = `d`;
const r1 = regex `^ab(c${t1}e)`;  // (3) 
// /^ab(cde)/
console.log('r1.test(`abc${t1}e`); ➜', r1.test(`abc${t1}e`)); // true

// Check for proper escaped special chars
const t2 = `(:?bar)\d{2}`;
const r2 = regex `foo${t2}`; 
// /foo\(:\?bar\)d\{2\}/  ➜ t2 is escaped!

console.log('r2.test(`foo${t2}`); ➜', r2.test(`foo${t2}`)); // true
console.log('r2.test(`foo\(:\?bar\)d\{2\}`); ➜', r2.test(`foo\(:\?bar\)d\{2\}`)); // true
console.log('r2.test(`foobar11`); ➜', r2.test(`foobar11`)); // false
console.log(r2);

How it works:

  1. Define a "tag function" (the name does not matter), e.g. regex(), that builds our regular expression (return new RegExp()) from a template literal string.

    > Hint: Tag functions don't even need to return a string!

    • The first param, parts, is a string[] (the segments between injected vars):

      a${..}bc${..}de ➜ ['a', 'bc', 'de'] = parts

      > Hint: The special raw property, available on the first argument to the tag function, allows you to access the raw strings as they were entered, without processing escape sequences.

      This is what we need! We can destructure the first param and split off the first part of parts. This is the first segment of our RegExp result.

    • subs... are the substitution values from injected vars:

      ..${'x'}..${'y'}.. ➜ ['x', 'y']

  2. Loop over ...subs and concatenate the regex string:

    1. escape() special chars in the current subs[i] (which would otherwise break the final expression).
    2. Append the escaped subs[i], followed by parts[i].
  3. Prefix a template literal string with regex `...` and a new RegExp is returned:

    const r1 = regex `^ab(c${t1}e)` // /^ab(cde)/
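
To make the arguments of a tag function visible, here is a small sketch with a throwaway tag named `inspect` (a hypothetical name, not part of the answer) that simply returns what it receives:

```javascript
// Throwaway tag function: returns its arguments instead of building a RegExp.
const inspect = (strings, ...subs) => ({
  cooked: [...strings],   // escape sequences processed
  raw: [...strings.raw],  // text exactly as typed
  subs                    // values of the ${...} substitutions
});

const result = inspect`a\d${'x'}bc${'y'}de`;

console.log(result.raw);    // [ 'a\\d', 'bc', 'de' ]
console.log(result.cooked); // [ 'ad', 'bc', 'de' ]
console.log(result.subs);   // [ 'x', 'y' ]
```

The raw strings keep the backslash of `\d`, which is why the tag in the answer reads from `raw` rather than from the processed strings.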


Conditional escape

So far, regex() handles all template vars ..${..}..${..}.. as strings (it effectively invokes .toString() on them).

regex `X${new Date()}X`          // /X…X/ with the escaped output of Date.prototype.toString()
regex `X${{}}X`                  // /X\[object Object\]X/
regex `X${[1, 'b', () => 'c']}X` // /X1,b,\(\) => 'c'X/

Even though this is the intended behavior, we cannot use it everywhere.

How to concat regular expressions in "tagged templates"?

> Use case: You want a 'RegExp factory class' to build more complex expressions from smaller pieces.
> E.g. a RegExp to parse/validate Content-Type header values for javascript-like MIME types. Syntax: media-type = type "/" subtype.
> This is what we want to find:
> - */*
> - application/*, application/javascript, application/ecmascript
> - text/*, text/javascript, text/ecmascript

const group = (name, regExp) => regex`(?<${name}>${regExp})`;

const rTypeGroup = group(`type`, `(text|application|\*)+?`);  
// /(?<type>\(text\|application\|\*\)\+\?)/ 
const rSubGroup = group(`sub`, `((java|ecma)+?script|\*)+?`); 
// /(?<sub>\(\(java\|ecma\)\+\?script\|\*\)\+\?)/
// .. and test
rTypeGroup.test(`application`);                   // false !!!
rTypeGroup.test(`\(text\|application\|\*\)\+\?`); // true !!!

Since regex() escapes all substitutions, the bodies of our capture groups match literally. We can modify regex() to skip escape() for some types, so that we can pass RegExp instances unescaped.

const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
const regex = ({
  raw: [part, ...parts]
}, ...subs) => new RegExp( //            skip escape()
  subs.reduce( //                   ┏━━━━━━━━━┻━━━━━━━━━┓ 
    (result, sub, i) => `${result}${sub instanceof RegExp ? sub.source : escape(sub)}${parts[i]}`,
    part
  )
);

const group = (name, regExp) => regex `(?<${name}>${regExp})`;

//                                         RegEx
//                               ┏━━━━━━━━━━━┻━━━━━━━━━━━┓               
const rTypeGroup = group(`type`, /(text|application|\*)+?/); //  /(?<type>(text|application|\*)+?)/
const rSubGroup = group(`sub`, /((java|ecma)+?script|\*)+?/); // /(?<sub>((java|ecma)+?script|\*)+?)/

// Type
console.log('rTypeGroup.test(`*`); ➜', rTypeGroup.test(`*`)); // true
console.log('rTypeGroup.test(`text`); ➜', rTypeGroup.test(`text`)); // true

console.log('rTypeGroup.exec(`*`).groups.type; ➜', rTypeGroup.exec(`*`).groups.type); // '*'
console.log('rTypeGroup.exec(`text`).groups.type; ➜', rTypeGroup.exec(`text`).groups.type); // 'text'

// SubType
console.log('rSubGroup.test(`*`); ➜', rSubGroup.test(`*`)); // true
console.log('rSubGroup.test(`javascript`); ➜', rSubGroup.test(`javascript`)); // true
console.log('rSubGroup.test(`ecmascript`); ➜', rSubGroup.test(`ecmascript`)); // true

console.log('rSubGroup.exec(`*`).groups.sub; ➜', rSubGroup.exec(`*`).groups.sub); // '*'
console.log('rSubGroup.exec(`javascript`).groups.sub; ➜', rSubGroup.exec(`javascript`).groups.sub); // 'javascript'

We can now decide whether a var should be escaped or not. Since the expressions are not anchored (no ^ or $) and we have ignored flags, our groups also match inside longer strings such as ABCtextDEF or 12texttext.
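
If only whole strings should be accepted, the pieces can be anchored. The following sketch repeats the regex tag and group helper from above so that it runs on its own, and adds a hypothetical `exact` helper (not part of the answer) that wraps an expression in ^ and $. The type pattern is simplified for the example.

```javascript
const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
const regex = ({ raw: [part, ...parts] }, ...subs) => new RegExp(
  subs.reduce(
    (result, sub, i) => `${result}${sub instanceof RegExp ? sub.source : escape(sub)}${parts[i]}`,
    part
  )
);
const group = (name, regExp) => regex`(?<${name}>${regExp})`;

// Hypothetical helper: match the whole input, not just a substring.
const exact = regExp => regex`^${regExp}$`;

const rType = group(`type`, /text|application|\*/);
const rTypeExact = exact(rType);

console.log(rType.test(`ABCtextDEF`));      // true, substring match
console.log(rTypeExact.test(`ABCtextDEF`)); // false
console.log(rTypeExact.test(`text`));       // true
console.log(rTypeExact.test(`*`));          // true
```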

Now we can bundle multiple RegEx:

//                                                 '/' would be escaped! RegEx needed..            
//                                                               ┏━┻━┓
const rMediaTypeGroup = group(`mediaType`, regex `${rTypeGroup}${/\//}${rSubGroup}`);
// /(?<mediaType>(?<type>(text|application|\*)+?)\/(?<sub>((java|ecma)+?script|\*)+?))/
rMediaTypeGroup.test(`text/javascript`); // true

const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
const regex = ({
  raw: [part, ...parts]
}, ...subs) => new RegExp(
  subs.reduce(
    (result, sub, i) => `${result}${sub instanceof RegExp ? sub.source : escape(sub)}${parts[i]}`,
    part
  )
);

const group = (name, regExp) => regex `(?<${name}>${regExp})`;
const rTypeGroup = group(`type`, /(text|application|\*)+?/);
const rSubGroup = group(`sub`, /((java|ecma)+?script|\*)+?/);

//                                                 '/' would be escaped! RegEx needed..            
//                                                               ┏━┻━┓
const rMediaTypeGroup = group(`mediaType`, regex `${rTypeGroup}${/\//}${rSubGroup}`);
// /(?<mediaType>(?<type>(text|application|\*)+?)\/(?<sub>((java|ecma)+?script|\*)+?))/

console.log('rMediaTypeGroup.test(`*/*`); ➜', rMediaTypeGroup.test(`*/*`)); // true
console.log('rMediaTypeGroup.test(`**/**`); ➜', rMediaTypeGroup.test(`**/**`)); // true
console.log('rMediaTypeGroup.test(`text/javascript`); ➜', rMediaTypeGroup.test(`text/javascript`)); // true

console.log('rMediaTypeGroup.test(`1text/javascriptX`); ➜', rMediaTypeGroup.test(`1text/javascriptX`)); // true
console.log('rMediaTypeGroup.test(`*/java`); ➜', rMediaTypeGroup.test(`*/java`)); // false

console.log('rMediaTypeGroup.test(`text/X`); ➜', rMediaTypeGroup.test(`text/X`)); // false
console.log('rMediaTypeGroup.test(`/*`); ➜', rMediaTypeGroup.test(`/*`)); // false


Flags

All flags must be known before initialization. You can read them (/xx/gm.flags, /xx/gm.multiline, /xx/i.ignoreCase,..) but there are no setters.
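
A quick sketch of what this means in practice: the flag properties can be read, but changing flags requires creating a new RegExp from the old one's source. The `withFlags` helper name is made up for illustration.

```javascript
const re = /xx/gm;

console.log(re.flags);      // gm
console.log(re.multiline);  // true
console.log(re.ignoreCase); // false

// Hypothetical helper: clone a RegExp with additional flags (duplicates removed).
const withFlags = (regExp, extra) =>
  new RegExp(regExp.source, [...new Set(regExp.flags + extra)].join(''));

const re2 = withFlags(re, 'gi');
console.log(re2.flags); // gim
console.log(re.flags);  // gm, the original is unchanged
```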

> A tagged template function (e.g. regex()) returning new RegExp() needs to know all flags.

This section demonstrates three alternative ways to handle flags.

  • Option A: Pass flags as ${templateVariable} ➜ Not recommended!
  • Option B: Extend class RegExp
  • Option C: Use Proxy() for flags

Option A: Pass flags as ${templateVariable} ➜ Not recommended!

Treat flags like other substitution vars. We need to check whether the last variable is flag-like (g, mi, ..) and split it off from subs...

const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");

const regex = ({
  raw: [part, ...parts],
  // super verbose...
  splitFlags = ({ // destruct subs[] Array (arrays are objects!)
    length: l,    // destruct subs[].length to l
    iEnd = l - 1, // last array index to iEnd
    [iEnd]: sub,  // subs[subs.length - 1] to sub
    ...subs       // all remaining entries to subs
  }) => [         // returns RegEx() constr. params: [flags, concat regex string]                                                    
    //             ┏━━━━━━━━ all chars of sub flag-like? ━━━━━━━━━┓   ┏━ flags  ┏━ re-add last sub and set flags: undefined
    [...sub].every(f => ['s', 'g', 'i', 'm', 'y', 'u'].includes(f)) ? sub : !(subs[iEnd] = sub) || undefined,
    Object.values(subs).reduce( // concat regex string
      (result, sub, i) => `${result}${escape(sub)}${parts[i]}`,
      part
    )
  ]
}, ...subs) => new RegExp(...splitFlags(subs).reverse());

const r1 = regex `^foo(${`bar`})${'i'}`;
console.log('r1:', r1, 'flags:', r1.flags); // /^foo(bar)/i    ['i' flag]

const r2 = regex `^foo(${`bar`})${'mgi'}`;
console.log('r2:', r2, 'flags:', r2.flags); // /^foo(bar)/gim  ['gim' flag]

//            invalid flag 'x' ━━━━┓
const r3 = regex `^foo(${`bar`})${'x'}`;
console.log('r3:', r3, 'flags:', r3.flags); // /^foo(bar)x/    [no flags]

//              invalid flag 'z' ━━━━┓
const r4 = regex `^foo(${`bar`})${'gyzu'}`;
console.log('r4:', r4, 'flags:', r4.flags); // /^foo(bar)gyzu/ [no flags]

The code is very verbose, and the mix of flag logic and substitution logic is not obvious from the outside. It will also break the final regex if the last variable is falsely determined to be flag-like.

> We look for phone types, like i-phone, a-phone,...

const rPhoneA = regex `${`a`}-phone`; // 
console.log('rPhoneA:', rPhoneA, 'flags:', rPhoneA.flags); // /a-phone/     [no flags]
const rPhoneI = regex `${`i`}-phone`; 
console.log('rPhoneI:', rPhoneI, 'flags:', rPhoneI.flags); // /(?:)/i       ['i' flag]

console.log('rPhoneA.test(`a-phone`); ➜', rPhoneA.test(`a-phone`)); // true
console.log('rPhoneA.test(`i-phone`); ➜', rPhoneA.test(`i-phone`)); // false

console.log('rPhoneI.test(`a-phone`); ➜', rPhoneI.test(`a-phone`)); // true
console.log('rPhoneI.test(`i-phone`); ➜', rPhoneI.test(`i-phone`)); // true

... all phones are i-phones! Because i is a flag-like substitution, it is removed from the subs... array, which is now empty []. reduce() returns an empty string '' and new RegExp('', 'i') yields an empty non-capturing group: (?:).

Option B: Extend class RegExp

We can extend from RegExp and add getter methods to set flags. Make them self-returning so we can chain them. We can even add a _ "clear flags" method.

This works nicely, but it also means that each flag added or removed results in a new clone of TRegExp. If you build "static" (cacheable) expressions, that is probably fine.

  • A: Add getters in the constructor inside a loop. Or..
  • B: Add class getters

const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");

class TRegExp extends RegExp {

  constructor(...args) {
    super(...args);

    // Clear all flags
    Object.defineProperty(this, '_', {
      get() {
        return this.flags.length ? new TRegExp(this.source) : this;
      },
      enumerable: false
    });

    // A: define getters for all flags  
    ['g', 'i', 'm', 'u', 'y'].reduce((my, flag) => Object.defineProperty(my, flag, {
      get() {    // clone this on flags change ━━┓
        return my.flags.includes(flag) ? my : new TRegExp(my.source, `${my.flags}${flag}`);
      }, //                 return this ━━┛
      enumerable: false
    }), this);
  }

  // B: Define getters for each flag individually
  // get g() {
  //     return this.flags.includes('g') ? this : new TRegExp(this.source, `${this.flags}g`);
  // }
}

const regex = ({raw: [part, ...parts]}, ...subs) => new TRegExp(
  subs.reduce( //                                       ┣━━ TRegExp()
    (result, sub, i) => `${result}${sub instanceof RegExp ? sub.source : escape(sub)}${parts[i]}`,
    part
  )
);

console.log('TRegExp +flags:', regex `foo(bar)`.g);       // /foo(bar)/g
console.log('TRegExp +flags:', regex `foo(bar)`.i.m);     // /foo(bar)/im
console.log('TRegExp +flags:', regex `foo(bar)`.g.i._.y); // /foo(bar)/y 
//                                              ┗━━━━━┻━━━━ ( + 'g', + 'i', - 'gi', + 'y')

const group = regex `(?<foo>(:?bar)\d{2})`.g.i;
const t = `a bar12 bar0 bar13 bar-99 xyz BaR14 bar15 abc`;
console.log([...t.matchAll(group)].map(m => m.groups.foo));
// ["bar12", "bar13", "BaR14", "bar15"]

Option C: Use Proxy() for flags

You can proxy a "tagged template function" e.g. regex() and intercept any get() on it. Proxies can solve many problems but can also lead to great confusion. You can re-write the entire function and completely change the initial behavior.

const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");

const _regex = (flags, {raw: [part, ...parts]}, ...subs) => new RegExp(
  subs.reduce(
    (result, sub, i) => `${result}${sub instanceof RegExp ? sub.source : escape(sub)}${parts[i]}`,
    part
  ),
  flags
); // ┗━━━ flags

const regex = new Proxy(_regex.bind(undefined, ''), {
  get: (target, property) => _regex.bind(undefined, property)
});


console.log('Proxy +flags:', regex.gi `foo(bar)`); // /foo(bar)/gi

const r = /(:?bar)\d{2}/; // matches: 'bar' + digit + digit ➜ 'bar12', 'abar123',..
const t = `(:?bar)\d{2}`; // template literal with regExp special chars

console.log('Proxy +flags:', regex.gi `(foo${r})`); // /(foo(:?bar)\d{2})/gi
console.log('Proxy +flags:', regex.gi `(foo${t})`); // /(foo\(:\?bar\)d\{2\})/gi
//                         flags ━┻━┛
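
One thing to keep in mind with this approach: the get trap fires for every property name, so an unknown flag only fails once the RegExp is constructed. A possible variation, sketched here under the assumption that failing early is preferred, checks the property name inside the trap. The accepted flag list and the error message are choices made for this example, not part of the answer.

```javascript
const escape = s => `${s}`.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");

const _regex = (flags, { raw: [part, ...parts] }, ...subs) => new RegExp(
  subs.reduce(
    (result, sub, i) => `${result}${sub instanceof RegExp ? sub.source : escape(sub)}${parts[i]}`,
    part
  ),
  flags
);

const regex = new Proxy(_regex.bind(undefined, ''), {
  get: (target, property) => {
    // Only accept strings made of known flag characters (assumed list).
    if (typeof property !== 'string' || !/^[dgimsuy]+$/.test(property)) {
      throw new TypeError(`Not a valid flag combination: ${String(property)}`);
    }
    return _regex.bind(undefined, property);
  }
});

console.log(regex.gi`foo(bar)`.flags); // gi
console.log(regex`foo(bar)`.flags);    // (empty string)

try {
  regex.x`foo`;
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```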


Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | vlad | View Question on Stackoverflow
Solution 1 - Javascript | Wiktor Stribiżew | View Answer on Stackoverflow
Solution 2 - Javascript | Exodus 4D | View Answer on Stackoverflow