Find and replace in file and overwrite file doesn't work, it empties the file

ShellUnixSedIo Redirection

Shell Problem Overview


I would like to run a find and replace on an HTML file through the command line.

My command looks something like this:

sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html > index.html

When I run this and look at the file afterward, it is empty. It deleted the contents of my file.

When I run this after restoring the file again:

sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html

The stdout is the contents of the file, and the find and replace has been executed.

Why is this happening?

Shell Solutions


Solution 1 - Shell

When the shell sees > index.html in the command line it opens the file index.html for writing, wiping off all its previous contents.

To fix this you need to pass the -i option to sed to make the changes inline and create a backup of the original file before it does the changes in-place:

sed -i.bak s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html

Without the .bak the command will fail on some platforms, such as Mac OSX.

Solution 2 - Shell

An alternative, useful, pattern is:

sed -e 'script script' index.html > index.html.tmp && mv index.html.tmp index.html

That has much the same effect, without using the -i option, and additionally means that, if the sed script fails for some reason, the input file isn't clobbered. Further, if the edit is successful, there's no backup file left lying around. This sort of idiom can be useful in Makefiles.

Quite a lot of seds have the -i option, but not all of them; the posix sed is one which doesn't. If you're aiming for portability, therefore, it's best avoided.

Solution 3 - Shell

sed -i 's/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g' index.html

This does a global in-place substitution on the file index.html. Quoting the string prevents problems with whitespace in the query and replacement.

Solution 4 - Shell

use sed's -i option, e.g.

sed -i bak -e s/STRING_TO_REPLACE/REPLACE_WITH/g index.html

Solution 5 - Shell

To change multiple files (and saving a backup of each as *.bak):

perl -p -i -e "s/\|/x/g" *  

will take all files in directory and replace | with x this is called a “Perl pie” (easy as a pie)

Solution 6 - Shell

You should try using the option -i for in-place editing.

Solution 7 - Shell

Warning: this is a dangerous method! It abuses the i/o buffers in linux and with specific options of buffering it manages to work on small files. It is an interesting curiosity. But don't use it for a real situation!

Besides the -i option of sed you can use the tee utility.

From man:

> tee - read from standard input and write to standard output and files

So, the solution would be:

sed s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html | tee | tee index.html

-- here the tee is repeated to make sure that the pipeline is buffered. Then all commands in the pipeline are blocked until they get some input to work on. Each command in the pipeline starts when the upstream commands have written 1 buffer of bytes (the size is defined somewhere) to the input of the command. So the last command tee index.html, which opens the file for writing and therefore empties it, runs after the upstream pipeline has finished and the output is in the buffer within the pipeline.

Most likely the following won't work:

sed s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html | tee index.html

-- it will run both commands of the pipeline at the same time without any blocking. (Without blocking the pipeline should pass the bytes line by line instead of buffer by buffer. Same as when you run cat | sed s/bar/GGG/. Without blocking it's more interactive and usually pipelines of just 2 commands run without buffering and blocking. Longer pipelines are buffered.) The tee index.html will open the file for writing and it will be emptied. However, if you turn the buffering always on, the second version will work too.

Solution 8 - Shell

sed -i.bak "s#https.*\.com#$pub_url#g" MyHTMLFile.html

If you have a link to be added, try this. Search for the URL as above (starting with https and ending with.com here) and replace it with a URL string. I have used a variable $pub_url here. s here means search and g means global replacement.

It works !

Solution 9 - Shell

The problem with the command

sed 'code' file > file

is that file is truncated by the shell before sed actually gets to process it. As a result, you get an empty file.

The sed way to do this is to use -i to edit in place, as other answers suggested. However, this is not always what you want. -i will create a temporary file that will then be used to replace the original file. This is problematic if your original file was a link (the link will be replaced by a regular file). If you need to preserve links, you can use a temporary variable to store the output of sed before writing it back to the file, like this:

tmp=$(sed 'code' file); echo -n "$tmp" > file

Better yet, use printf instead of echo since echo is likely to process \\ as \ in some shells (e.g. dash):

tmp=$(sed 'code' file); printf "%s" "$tmp" > file

Solution 10 - Shell

And the ed answer:

printf "%s\n" '1,$s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g' w q | ed index.html

To reiterate what codaddict answered, the shell handles the redirection first, wiping out the "input.html" file, and then the shell invokes the "sed" command passing it a now empty file.

Solution 11 - Shell

I was searching for the option where I can define the line range and found the answer. For example I want to change host1 to host2 from line 36-57.

sed '36,57 s/host1/host2/g' myfile.txt > myfile1.txt

You can use gi option as well to ignore the character case.

sed '30,40 s/version/story/gi' myfile.txt > myfile1.txt

Solution 12 - Shell

With all due respect to the above correct answers, it's always a good idea to "dry run" scripts like that, so that you don't corrupt your file and have to start again from scratch.

Just get your script to spill the output to the command line instead of writing it to the file, for example, like that:

sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g index.html

OR

less index.html | sed -e s/STRING_TO_REPLACE/STRING_TO_REPLACE_IT/g 

This way you can see and check the output of the command without getting your file truncated.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionBBalesView Question on Stackoverflow
Solution 1 - ShellcodaddictView Answer on Stackoverflow
Solution 2 - ShellNorman GrayView Answer on Stackoverflow
Solution 3 - ShellRich ApodacaView Answer on Stackoverflow
Solution 4 - ShellKevinView Answer on Stackoverflow
Solution 5 - ShellStenemoView Answer on Stackoverflow
Solution 6 - ShelluloBasEIView Answer on Stackoverflow
Solution 7 - ShellxealitsView Answer on Stackoverflow
Solution 8 - ShellKaeyView Answer on Stackoverflow
Solution 9 - ShellAndrzej PronobisView Answer on Stackoverflow
Solution 10 - Shellglenn jackmanView Answer on Stackoverflow
Solution 11 - Shelluser8117884View Answer on Stackoverflow
Solution 12 - ShellNestor MilyaevView Answer on Stackoverflow