How to replace captured groups only?

Javascript, Regex

Javascript Problem Overview


I have HTML code before and after the string:

name="some_text_0_some_text"

I would like to replace the 0 with something like: !NEW_ID!

So I made a simple regex:

.*name="\w+(\d+)\w+".*

But I don't see how to replace exclusively the captured block.

Is there a way to replace a captured result like ($1) with some other string?

The result would be:

name="some_text_!NEW_ID!_some_text"

Javascript Solutions


Solution 1 - Javascript

A solution is to add captures for the preceding and following text:

str.replace(/(.*name="\w+)(\d+)(\w+".*)/, "$1!NEW_ID!$3")
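As a minimal sketch of the same idea, the three groups can also be passed to a replacer function. The helper name replaceCapturedId and the sample markup are assumptions made for illustration, not part of the original answer. The sketch also swaps the first \w+ for the lazy \w+?, because with the greedy form an id of several digits (such as 12) would leave only its last digit in the second group.

```javascript
// Hypothetical helper: keep the text around the digits, swap only the digits.
function replaceCapturedId(input, newId) {
  // Group 1: everything up to the digits; group 2: the digits; group 3: the rest.
  // \w+? is lazy so a multi-digit id such as "12" lands entirely in group 2.
  const pattern = /(.*name="\w+?)(\d+)(\w+".*)/;
  // A replacer function avoids "$" in newId being read as a group reference.
  return input.replace(pattern, (whole, before, digits, after) => before + newId + after);
}

// Sample markup is made up for this example.
const html = '<input name="some_text_0_some_text">';
console.log(replaceCapturedId(html, '!NEW_ID!'));
// <input name="some_text_!NEW_ID!_some_text">
```

With a plain replacement string such as "$1!NEW_ID!$3" the behaviour is the same as long as the new id contains no "$" characters.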

Solution 2 - Javascript

Now that Javascript has lookbehind (as of ES2018), on newer environments, you can avoid groups entirely in situations like these. Rather, lookbehind for what comes before the group you were capturing, and lookahead for what comes after, and replace with just !NEW_ID!:

const str = 'name="some_text_0_some_text"';
console.log(
  str.replace(/(?<=name="\w+)\d+(?=\w+")/, '!NEW_ID!')
);

With this method, the full match is only the part that needs to be replaced.

  • (?<=name="\w+) - Lookbehind for name=", followed by word characters (luckily, lookbehinds do not have to be fixed width in Javascript!)
  • \d+ - Match one or more digits - the only part of the pattern not in a lookaround, the only part of the string that will be in the resulting match
  • (?=\w+") - Lookahead for word characters followed by "

Keep in mind that lookbehind is a relatively recent addition to the language. V8-based environments (including Chrome, Opera, and Node) supported it first; Firefox and Safari added it later, and older versions of those browsers throw a syntax error on the pattern. So while you can reliably use lookbehind in Node and in current browsers, check which environments you need to support before relying on it for arbitrary clients (like on a public website).
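One way to cope with engines that lack lookbehind is to detect support at runtime and fall back to a capturing group. The following is only a sketch under that assumption; the helper name replaceDigits is hypothetical, and the fallback pattern uses a lazy \w+? so that multi-digit ids are matched whole.

```javascript
// Hypothetical helper: use lookbehind when the engine supports it,
// otherwise fall back to capturing the prefix and re-emitting it.
function replaceDigits(input, newId) {
  let lookbehindPattern = null;
  try {
    // Built from a string so an unsupported engine throws here at runtime
    // instead of failing to parse the whole script.
    lookbehindPattern = new RegExp('(?<=name="\\w+)\\d+(?=\\w+")');
  } catch (err) {
    lookbehindPattern = null;
  }
  if (lookbehindPattern) {
    // The full match is just the digits, so the replacement is just the new id.
    return input.replace(lookbehindPattern, () => newId);
  }
  // Fallback: capture the prefix and put it back in front of the new id.
  return input.replace(/(name="\w+?)\d+(?=\w+")/, (whole, prefix) => prefix + newId);
}

console.log(replaceDigits('name="some_text_0_some_text"', '!NEW_ID!'));
// name="some_text_!NEW_ID!_some_text"
```

A regex literal containing lookbehind cannot be guarded this way, because an engine without support rejects the literal while parsing the file.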

Solution 3 - Javascript

A little improvement to Matthew's answer could be a lookahead instead of the last capturing group:

.replace(/(\w+)(\d+)(?=\w+)/, "$1!NEW_ID!");

Or you could split on the digits and join with your new id like this:

.split(/\d+/).join("!NEW_ID!");

Example/Benchmark here: https://codepen.io/jogai/full/oyNXBX
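The two variants agree on the question's input but behave differently once a string contains more than one run of digits. The short comparison below is a sketch; the second sample string and the helper names viaLookahead and viaSplit are made up for illustration.

```javascript
const single = 'some_text_0_some_text';
const several = 'a_1_b_2_c'; // made-up input with two digit runs

// Lookahead variant: the greedy (\w+) backtracks from the right,
// so with several digit runs it is the last one that gets replaced.
const viaLookahead = (s) => s.replace(/(\w+)(\d+)(?=\w+)/, '$1!NEW_ID!');

// split/join variant: every run of digits is replaced.
const viaSplit = (s) => s.split(/\d+/).join('!NEW_ID!');

console.log(viaLookahead(single));  // some_text_!NEW_ID!_some_text
console.log(viaSplit(single));      // some_text_!NEW_ID!_some_text
console.log(viaLookahead(several)); // a_1_b_!NEW_ID!_c
console.log(viaSplit(several));     // a_!NEW_ID!_b_!NEW_ID!_c
```

Which behaviour is wanted depends on the data, so it is worth checking against real attribute values before picking one.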

Solution 4 - Javascript

Using two capturing groups would also have been possible. I would also include the two underscores, as additional left and right boundaries, before and after the digits, so the modified expression would look like:

(.*name=".+_)\d+(_[^"]+".*)

const regex = /(.*name=".+_)\d+(_[^"]+".*)/g;
const str = `some_data_before name="some_text_0_some_text" and then some_data after`;
const subst = `$1!NEW_ID!$2`;
const result = str.replace(regex, subst);
console.log(result);
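Because "." does not match a newline, the leading and trailing .* stay within one line, and the g flag then lets a single replace() call handle several lines. The sketch below reuses the same expression; the two input lines are invented for the example.

```javascript
// Same pattern as above; the input lines are made up for illustration.
const regex = /(.*name=".+_)\d+(_[^"]+".*)/g;
const lines = [
  '<input name="first_field_0_suffix">',
  '<input name="second_field_7_suffix">',
].join('\n');

// Each line is matched separately, and the g flag makes replace()
// apply the substitution to every match instead of only the first.
const updated = lines.replace(regex, '$1!NEW_ID!$2');
console.log(updated);
// <input name="first_field_!NEW_ID!_suffix">
// <input name="second_field_!NEW_ID!_suffix">
```

If two name attributes share one line, the greedy .* means only one of them is rewritten, so this sketch assumes one attribute per line.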


> If you wish to explore/simplify/modify the expression, it's been explained on the top right panel of regex101.com. If you'd like, you can also watch in this link how it would match against some sample inputs.


RegEx Circuit

jex.im visualizes regular expressions.

Solution 5 - Javascript

A simpler option is to just capture the digits and replace them.

const name = 'preceding_text_0_following_text';
const matcher = /(\d+)/;

// Replace with whatever you would like
const newName = name.replace(matcher, 'NEW_STUFF');
console.log("Full replace", newName);

// Perform work on the match and replace using a function
// In this case, convert the matched digits to a number and increment it
const incrementedName = name.replace(matcher, (match) => Number(match) + 1);
console.log("Increment", incrementedName);
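Note that /(\d+)/ matches the first run of digits anywhere in the string, so digits in the preceding text would be picked up first. As a hedged sketch, the replacer function can also receive capture groups, which lets the surrounding underscores act as anchors while only the number changes. The variable names and the sample value here are illustrative assumptions.

```javascript
// Made-up sample value; the id to change sits between two underscores.
const fieldName = 'preceding_text_9_following_text';

// Groups 1 and 3 are the underscores around the digits; group 2 is the number.
// The replacer receives the whole match first, then each group in order.
const bumped = fieldName.replace(
  /(_)(\d+)(_)/,
  (whole, left, digits, right) => left + (Number(digits) + 1) + right
);

console.log(bumped); // preceding_text_10_following_text
```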

Solution 6 - Javascript

"some_text_0_some_text".replace(/(?=\w+)\d+(?=\w+)/, '!NEW_ID!')

Result is

> some_text_!NEW_ID!_some_text

const regExp = /(?=\w+)\d+(?=\w+)/;
const newID = '!NEW_ID!';
const str = 'some_text_0_some_text';
const result = str.replace(regExp, newID);

console.log(result);

x(?=y) in JS RegExp

Matches "x" only if "x" is followed by "y". For example, /Jack(?=Sprat)/ matches "Jack" only if it is followed by "Sprat". /Jack(?=Sprat|Frost)/ matches "Jack" only if it is followed by "Sprat" or "Frost". However, neither "Sprat" nor "Frost" is part of the match results.
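One detail worth checking: the leading (?=\w+) in the pattern above is a lookahead, not a lookbehind, so it inspects the same characters that \d+ is about to consume and says nothing about the text before the digits. The sketch below compares the pattern with and without it; the sample strings and variable names are made up for illustration.

```javascript
const withLeadingLookahead = /(?=\w+)\d+(?=\w+)/;
const withoutLeadingLookahead = /\d+(?=\w+)/;

// Made-up inputs, including one where nothing precedes the digits.
const samples = ['some_text_0_some_text', '0_some_text', 'a1b22c'];

const comparison = samples.map((s) => ({
  input: s,
  a: s.replace(withLeadingLookahead, '!NEW_ID!'),
  b: s.replace(withoutLeadingLookahead, '!NEW_ID!'),
}));

for (const row of comparison) {
  // Both patterns give the same result for every sample.
  console.log(row.input, '->', row.a, row.a === row.b);
}
```

To require specific text before the digits, a lookbehind such as (?<=...) or a capturing group for the prefix is needed instead.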

Solution 7 - Javascript

If you're using Python, the Match.expand() method accepts the same backslash substitution syntax as re.sub. This means that you don't need to capture the entire string. An example is as follows:

import re

in_str = '<h1> this is valid html</h1>name="some_text_0_some_text"'
# Raw string so that \w and \d reach the regex engine unchanged
use_reg = r'name="(\w+)(\d+)(\w+)"'
replace_str = r"\1!NEW_ID!\3"

def find_with_replace_option(use_str, use_reg, to_str):
    """Find matches of the regex use_reg in the string use_str. Return a
    list with to_str expanded for each match. to_str may contain backslash
    substitution.
    """
    result_list = []
    for match in re.finditer(use_reg, use_str):
        result = match.expand(to_str)
        result_list.append(result)
    return result_list

print(find_with_replace_option(in_str, use_reg, replace_str))

Here the regex works as follows: the first part "some_text_" is captured in the first group, the zero is captured in the second group and the last part "_some_text" becomes the third group.

replace_str specifies that the output should consist of the first group, followed by "!NEW_ID!" followed by the third group.

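The script prints ['some_text_!NEW_ID!_some_text']. Note that name=" and the closing quote are matched but not part of any group, so they do not appear in the expanded result.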

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Nicolas Guillaume | View Question on Stackoverflow
Solution 1 - Javascript | Matthew Flaschen | View Answer on Stackoverflow
Solution 2 - Javascript | CertainPerformance | View Answer on Stackoverflow
Solution 3 - Javascript | Jogai | View Answer on Stackoverflow
Solution 4 - Javascript | Emma | View Answer on Stackoverflow
Solution 5 - Javascript | CTS_AE | View Answer on Stackoverflow
Solution 6 - Javascript | Владислав Паршенцев | View Answer on Stackoverflow
Solution 7 - Javascript | Jelmer | View Answer on Stackoverflow