How to get the difference (only additions) between two files in Linux

Linux, Bash, Diff

Linux Problem Overview


I have two files, A1 and A2 (unsorted). A1 is a previous version of A2, and some lines have been added to A2. How can I get the new lines that were added to A2?

Note: I just want the newly added lines, not the lines that were in A1 but deleted in A2. When I run diff A1 A2, I get the additions as well as the deletions, but I want only the additions.

Please suggest a way to do this.

Linux Solutions


Solution 1 - Linux

Most of the below is copied directly from @TomOnTime's serverfault answer here. At the bottom is an attempt that works on unsorted files, but the command sorts the files before giving the diff so in many cases it will not be what is desired. For well-formatted diffs of unsorted files, you might find the other answers more useful (thanks to @Fritz for pointing this out):

Show lines that only exist in file a: (i.e. what was deleted from a)

comm -23 a b

Show lines that only exist in file b: (i.e. what was added to b)

comm -13 a b

Show lines that only exist in one file or the other: (but not both)

comm -3 a b | sed 's/^\t//'

(Warning: If file a has lines that start with TAB, it (the first TAB) will be removed from the output.)
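
As a quick check of the first two commands, here is a small sketch using two throwaway files that are already sorted. The file contents and the variable names are made up for illustration:

```shell
# Hypothetical sample data: both files are already in sorted order.
tmp=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmp/a"
printf '%s\n' alpha delta gamma > "$tmp/b"

# -23 hides lines unique to b and lines common to both: what was deleted from a.
removed=$(comm -23 "$tmp/a" "$tmp/b")
# -13 hides lines unique to a and lines common to both: what was added to b.
added=$(comm -13 "$tmp/a" "$tmp/b")

echo "removed: $removed"    # prints: removed: beta
echo "added: $added"        # prints: added: delta

rm -r "$tmp"
```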

NOTE: Both files need to be sorted for "comm" to work properly. If they aren't already sorted, you should sort them first. To list the additions (lines that only exist in b):

sort <a >a.sorted
sort <b >b.sorted
comm -13 a.sorted b.sorted

(Use comm -12 instead to list the lines common to both files.)

If the files are extremely long, this may be quite a burden, as it requires an extra copy of each file and therefore twice as much disk space.

Edit: note that the command can be written more concisely using process substitution in bash (thanks to @phk for the comment):

comm -13 <(sort < a) <(sort < b)

Solution 2 - Linux

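Run diff, then grep for the edit type you want. Note that the output keeps the leading + on each line and also includes the +++ header line that names the second file.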

diff -u A1 A2 | grep -E "^\+"
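
If the header line and the leading + are unwanted, one possible variation is to filter with sed instead of grep. This is only a sketch: it assumes the unified diff starts with the usual two header lines (--- and +++), and the sample files are invented for the demonstration.

```shell
# Hypothetical sample data: "two" was deleted and "four" was added.
tmp=$(mktemp -d)
printf '%s\n' one two three > "$tmp/A1"
printf '%s\n' one three four > "$tmp/A2"

# 1,2d drops the "---" and "+++" header lines;
# s/^+//p prints only the added lines, with the leading "+" stripped.
added=$(diff -u "$tmp/A1" "$tmp/A2" | sed -n -e '1,2d' -e 's/^+//p')
echo "$added"    # prints: four

rm -r "$tmp"
```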

Solution 3 - Linux

You can try this

diff --changed-group-format='%>' --unchanged-group-format='' A1 A2

The options are documented in man diff:

       --GTYPE-group-format=GFMT
              format GTYPE input groups with GFMT

and:

       LTYPE is 'old', 'new', or 'unchanged'.
              GTYPE is LTYPE or 'changed'.

and:

              GFMT (only) may contain:

       %<     lines from FILE1

       %>     lines from FILE2

       [...]
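
A small sketch of what this prints, using invented sample files. Note that a line that was modified (not just added) also shows up, because its new version comes from FILE2 in a changed group. These options are specific to GNU diff.

```shell
# Hypothetical sample data: "two" was changed to "TWO" and "four" was added.
tmp=$(mktemp -d)
printf '%s\n' one two three > "$tmp/A1"
printf '%s\n' one TWO three four > "$tmp/A2"

# diff exits with status 1 when the files differ; only a status above 1 is a real error.
added=$(diff --changed-group-format='%>' --unchanged-group-format='' \
  "$tmp/A1" "$tmp/A2") || [ $? -eq 1 ]
echo "$added"    # prints TWO and four, one per line

rm -r "$tmp"
```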

Solution 4 - Linux

A similar approach to https://stackoverflow.com/a/15385080/337172 but hopefully more understandable and easy to tweak:

diff \
  --new-line-format="%L" \
  --old-line-format="" \
  --unchanged-line-format="" \
  A1 A2
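
As an example of a tweak, the new-line format can carry extra information. The sketch below prefixes each added line with its line number in A2, using GNU diff's %dn specifier; the sample files are made up for illustration.

```shell
# Hypothetical sample data: "two" was deleted, "four" and "five" were added.
tmp=$(mktemp -d)
printf '%s\n' one two three > "$tmp/A1"
printf '%s\n' one three four five > "$tmp/A2"

# %dn is the line number within its file, %L is the line including its newline.
# diff exits with status 1 when the files differ; only a status above 1 is a real error.
added=$(diff \
  --new-line-format='%dn: %L' \
  --old-line-format='' \
  --unchanged-line-format='' \
  "$tmp/A1" "$tmp/A2") || [ $? -eq 1 ]
echo "$added"    # prints "3: four" and "4: five", one per line

rm -r "$tmp"
```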

Solution 5 - Linux

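A simple method is to use sdiff, which shows both files side by side and marks lines that exist only in A2 with a > in the gutter:

sdiff A1 A2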

Another method is to use comm, as you can see in https://stackoverflow.com/questions/11099894/comparing-2-unsorted-lists-in-linux-listing-the-unique-in-the-second-file

Solution 6 - Linux

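You can type:

grep -Fxv -f A1 A2

Here -f A1 reads the patterns from A1, -v prints the lines of A2 that match none of them, -F treats each pattern as a fixed string, and -x requires a whole-line match. Without -F and -x (plain grep -v -f A1 A2), each line of A1 is used as a regular expression and can match any part of a line in A2.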
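
The difference between pattern matching and exact whole-line matching is easy to miss, so here is a sketch with made-up data that compares the two forms. The variable names loose and strict are only for this demonstration.

```shell
# Hypothetical sample data: A2 adds "abc", "foobar" and "new" to A1.
tmp=$(mktemp -d)
printf '%s\n' 'a.c' foo > "$tmp/A1"
printf '%s\n' 'a.c' abc foobar new > "$tmp/A2"

# Regex, substring matching: "a.c" also matches "abc", and "foo" matches "foobar".
loose=$(grep -v -f "$tmp/A1" "$tmp/A2")
# Fixed strings (-F) matched against whole lines (-x).
strict=$(grep -Fxv -f "$tmp/A1" "$tmp/A2")

echo "$loose"     # prints: new
echo "$strict"    # prints abc, foobar and new, one per line

rm -r "$tmp"
```

Like comm, this compares line contents only: a line that already exists anywhere in A1 is not reported, even if another copy of it was added to A2.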

Solution 7 - Linux

git diff path/file.css | grep -E "^\+" | grep -v '^+++ b/' | cut -c 2-
  • grep -E "^\+" is from the previously accepted answer; on its own it is incomplete because it leaves non-source lines in the output
  • grep -v '^+++ b/' removes the non-source line that carries the file name of the later version
  • cut -c 2- removes the column of + signs; sed 's/^+//' can be used instead

comm or sdiff were not an option because of git.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author    Original Content on Stackoverflow
Question              user1004985        View Question on Stackoverflow
Solution 1 - Linux    scottkosty         View Answer on Stackoverflow
Solution 2 - Linux    timrau             View Answer on Stackoverflow
Solution 3 - Linux    Premjith           View Answer on Stackoverflow
Solution 4 - Linux    Francesc Rosas     View Answer on Stackoverflow
Solution 5 - Linux    Mihai8             View Answer on Stackoverflow
Solution 6 - Linux    Zabador            View Answer on Stackoverflow
Solution 7 - Linux    user1046885        View Answer on Stackoverflow