How to extract text from a string using sed?

Regex, Bash, Sed

Regex Problem Overview


My example string is as follows:

This is 02G05 a test string 20-Jul-2012

Now, from the above string, I want to extract 02G05. For that I tried the following regex with sed:

$ echo "This is 02G05 a test string 20-Jul-2012" | sed -n '/\d+G\d+/p'

But the above command prints nothing, and I believe the reason is that sed cannot match anything against the pattern I supplied.

So my question is: what am I doing wrong here, and how do I correct it?

When I try the above string and pattern with Python, I get my result:

>>> import re
>>> st = "This is 02G05 a test string 20-Jul-2012"
>>> re.findall(r'\d+G\d+', st)
['02G05']

Regex Solutions


Solution 1 - Regex

How about using grep -E?

echo "This is 02G05 a test string 20-Jul-2012" | grep -Eo '[0-9]+G[0-9]+'
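As a small sketch of how this might be used in a script, the match can be captured into a shell variable. The names line and code are purely illustrative, and note that -o is a GNU/BSD grep extension rather than a POSIX option.

```shell
# Sample input from the question.
line="This is 02G05 a test string 20-Jul-2012"

# -E enables extended regex (so + needs no backslash);
# -o prints only the matched text instead of the whole line.
code=$(printf '%s\n' "$line" | grep -Eo '[0-9]+G[0-9]+')

echo "$code"   # prints 02G05
```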

Solution 2 - Regex

The pattern \d might not be supported by your sed. Try [0-9] or [[:digit:]] instead.
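To illustrate, here is a sketch using the question's input. With [0-9] in place of \d the address pattern does match, but p then prints the whole line rather than just the token. The variable names are illustrative only.

```shell
# Sample input from the question.
line="This is 02G05 a test string 20-Jul-2012"

# [0-9][0-9]* is the portable basic-regex spelling of "one or more digits".
matched=$(printf '%s\n' "$line" | sed -n '/[0-9][0-9]*G[0-9][0-9]*/p')

# The address only selects lines, so the entire line is printed.
echo "$matched"   # prints: This is 02G05 a test string 20-Jul-2012
```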

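To print only the actual match (not the entire matching line), use a substitution. The [^0-9] after the leading .* stops the greedy .* from swallowing the first digits of the match (without it the result would be 2G05); it assumes the match does not start at the very beginning of the line.

sed -n 's/.*[^0-9]\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p'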

Solution 3 - Regex

sed doesn't recognize \d; use [[:digit:]] instead. You will also need to escape the + (written \+, which is a GNU extension) or use the -r switch (-E on OS X; GNU sed accepts -E as well).

Note that [0-9] works just as well as [[:digit:]] for Hindu-Arabic numerals (0 through 9).
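A minimal sketch of the extended-regex form, assuming a sed that accepts -E (GNU and BSD sed both do). The leading group is an addition of this example, not part of the answer above: it keeps the greedy .* from consuming the first digit of the token. The variable names are illustrative.

```shell
# Sample input from the question.
line="This is 02G05 a test string 20-Jul-2012"

# -E switches sed to extended regex, so + and ( ) need no backslashes.
# Group 1 is either the start of the line or any text ending in a non-digit;
# group 2 is the token we want to keep.
code=$(printf '%s\n' "$line" |
  sed -nE 's/(^|.*[^[:digit:]])([[:digit:]]+G[[:digit:]]+).*/\2/p')

echo "$code"   # prints 02G05
```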

Solution 4 - Regex

Try this instead:

echo "This is 02G05 a test string 20-Jul-2012" | sed 's/.* \([0-9]\+G[0-9]\+\) .*/\1/'

But note that if the pattern occurs twice on one line, this prints the second (last) occurrence, because the leading .* is greedy.
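The behaviour can be checked with a made-up line containing two tokens. This sketch uses [0-9][0-9]* as the portable spelling of [0-9]\+; the input string and variable names are invented for illustration.

```shell
# Hypothetical input with two matching tokens.
line="first 02G05 then 11G22 end"

# The leading .* is greedy, so it swallows everything up to the last candidate.
last=$(printf '%s\n' "$line" | sed 's/.* \([0-9][0-9]*G[0-9][0-9]*\) .*/\1/')
echo "$last"   # prints 11G22

# grep -o, by contrast, prints every match, one per line.
all=$(printf '%s\n' "$line" | grep -Eo '[0-9]+G[0-9]+')
echo "$all"
```

If every occurrence is wanted, the grep -Eo form is the simpler choice; the sed substitution yields exactly one result per line.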

Solution 5 - Regex

Try using rextract. It will let you extract text using a regular expression and reformat it.

Example:

$ echo "This is 02G05 a test string 20-Jul-2012" | ./rextract '([\d]+G[\d]+)' '${1}'

2G05

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | RanRag | View Question on Stackoverflow
Solution 1 - Regex | mVChr | View Answer on Stackoverflow
Solution 2 - Regex | tripleee | View Answer on Stackoverflow
Solution 3 - Regex | Dennis Williamson | View Answer on Stackoverflow
Solution 4 - Regex | Zsolt Botykai | View Answer on Stackoverflow
Solution 5 - Regex | Tim Savannah | View Answer on Stackoverflow