sed whole word search and replace


Shell Problem Overview


How do I search and replace whole words using sed?

Doing

sed -i 's/[oldtext]/[newtext]/g' <file> 

will also replace partial matches of [oldtext] which I don't want it to do.
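To make the problem concrete, here is a small sketch of the unwanted behaviour, using the sample text "bar embarassment" (the variable name result is only for illustration). A plain substitution rewrites every occurrence of the search string, including the one inside a longer word:

```shell
# No word boundaries: "bar" is also replaced inside "embarassment".
result=$(printf 'bar embarassment\n' | sed 's/bar/no bar/g')
printf '%s\n' "$result"
# no bar emno barassment
```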

Shell Solutions


Solution 1 - Shell

\b in regular expressions matches a word boundary (i.e. the position between a word character and a non-word character):

$ echo "bar embarassment" | sed "s/\bbar\b/no bar/g"
no bar embarassment

Solution 2 - Shell

On Mac OS X, neither of these regex syntaxes works inside sed for matching whole words:

  • \bmyWord\b
  • \<myWord\>

Hear me now and believe me later, this ugly syntax is what you need to use:

  • /[[:<:]]myWord[[:>:]]/

So, for example, to replace mint with minty for whole words only:

  • sed "s/[[:<:]]mint[[:>:]]/minty/g"

Source: re_format man page

Solution 3 - Shell

Use \b for word boundaries:

sed -i 's/\boldtext\b/newtext/g' <file>
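As a rough illustration of the in-place form, the sketch below runs the command against a throwaway file. It assumes GNU sed (where -i takes no argument and \b is supported); BSD/macOS sed expects -i '' and may not understand \b. The file contents and variable names are made up for the example.

```shell
# Assumes GNU sed; tmpfile and contents are illustrative names.
tmpfile=$(mktemp)
printf 'oldtext oldtextual\nmyoldtext oldtext\n' > "$tmpfile"

# Only the standalone word "oldtext" is rewritten in place.
sed -i 's/\boldtext\b/newtext/g' "$tmpfile"

contents=$(cat "$tmpfile")
printf '%s\n' "$contents"
# newtext oldtextual
# myoldtext newtext

rm -f "$tmpfile"
```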

Solution 4 - Shell

On one of my machines, delimiting the word with "\b" (without the quotes) did not work. The solution was to use "\<" as the starting delimiter and "\>" as the ending delimiter.

To explain with Joakim Lundborg's example:

$ echo "bar embarassment" | sed "s/\<bar\>/no bar/g"
no bar embarassment
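Because support for \b, \< \> and [[:<:]] [[:>:]] differs between sed implementations, one possible approach is to probe for a working syntax at run time. This is only a sketch under stated assumptions: whole_word_sed is a hypothetical helper name, it reads from standard input, and the search and replacement strings are assumed to contain no "/" and no regex metacharacters.

```shell
# whole_word_sed OLD NEW: hypothetical helper, filters stdin to stdout.
# Tries each boundary syntax on a tiny probe and uses the first that works.
whole_word_sed() {
  old=$1
  new=$2
  for pair in '\b \b' '\< \>' '[[:<:]] [[:>:]]'; do
    left=${pair% *}    # text before the space
    right=${pair#* }   # text after the space
    # Probe: only the standalone "x" in "x xx" should become "y".
    probe=$(printf 'x xx\n' | sed "s/${left}x${right}/y/g" 2>/dev/null)
    if [ "$probe" = 'y xx' ]; then
      sed "s/${left}${old}${right}/${new}/g"
      return
    fi
  done
  echo 'no supported word-boundary syntax found' >&2
  return 1
}

printf 'bar embarassment\n' | whole_word_sed bar 'no bar'
# no bar embarassment
```

The probe string contains no letter "b", so a sed that treats \b as a literal "b" simply fails the probe and the next syntax is tried.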

Solution 5 - Shell

For a POSIX-compliant alternative, consider replacing the word-boundary matches (\b) with an expanded equivalent ([^a-zA-Z0-9]), while also handling occurrences at the start of the line (^) and the end of the line ($).

However, this quickly becomes impractical if you want to support repeated occurrences of the word to replace (e.g. oldtext oldtext). sed --posix doesn't recognize expressions such as \(^\|[^a-zA-Z0-9]\), and you can't make use of lookarounds.

It seems we have to explicitly match all possible cases. Here's a solution to replace mint with minty:

echo 'mint 0mint mint mint0 mint__mint mint__ mint_ -mint mint mint mint_ mint -mint- mint mint mintmint mint' \
  | sed --posix '
s/^mint$/minty/g;
s/^mint\([^a-zA-Z0-9]\)/minty\1/g;
s/\([^a-zA-Z0-9]\)mint$/\1minty/g;
s/\([^a-zA-Z0-9]\)mint\([^a-zA-Z0-9]\)mint\([^a-zA-Z0-9]\)mint\([^a-zA-Z0-9]\)/\1minty\2minty\3minty\4/g;
s/\([^a-zA-Z0-9]\)mint\([^a-zA-Z0-9]\)mint\([^a-zA-Z0-9]\)/\1minty\2minty\3/g;
s/\([^a-zA-Z0-9]\)mint\([^a-zA-Z0-9]\)/\1minty\2/g;
'
# minty 0mint minty mint0 minty__minty minty__ minty_ -minty minty minty minty_ minty -minty- minty minty mintmint minty
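A different technique, not part of the answer above, is to let sed repeat a single substitution with a label and the t (branch if substituted) command, which avoids writing out one expression per repetition count. The sketch below is an assumption-laden variant: it uses the same [^a-zA-Z0-9] delimiter class, a shorter made-up input, and it only terminates because the replacement (minty) can never match the pattern again.

```shell
# Loop variant: handle line start and line end once, then repeat the
# "delimiter mint delimiter" substitution until nothing is left to replace.
result=$(printf 'mint mint mint0 -mint- mintmint mint\n' | sed \
  -e 's/^mint$/minty/' \
  -e 's/^mint\([^a-zA-Z0-9]\)/minty\1/' \
  -e 's/\([^a-zA-Z0-9]\)mint$/\1minty/' \
  -e ':a' \
  -e 's/\([^a-zA-Z0-9]\)mint\([^a-zA-Z0-9]\)/\1minty\2/' \
  -e 'ta')
printf '%s\n' "$result"
# minty minty mint0 -minty- mintmint minty
```

If the replacement text could itself be matched as a whole word by the pattern (for example replacing mint with "a mint"), this loop would never finish, so it should only be used when that cannot happen.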

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | ksuralta | View Question on Stack Overflow
Solution 1 - Shell | Joakim Lundborg | View Answer on Stack Overflow
Solution 2 - Shell | Larry Gerndt | View Answer on Stack Overflow
Solution 3 - Shell | Mitch Wheat | View Answer on Stack Overflow
Solution 4 - Shell | Arun | View Answer on Stack Overflow
Solution 5 - Shell | fzbd | View Answer on Stack Overflow