Is it better to use git grep than plain grep if we want to search in versioned source code?


Linux Problem Overview


In a git repository, is there any difference or benefit to using git grep over good old grep?
What would be an example?

Linux Solutions


Solution 1 - Linux

The two are very similar. The main difference is that git grep defaults to searching in the files that are tracked by git.

Examples

If I want to find foo within my project I can use git grep or good ol' stand-alone grep:

git grep foo
grep -R foo .

The git grep version will only search in files tracked by git, whereas the grep version will search everything in the directory. So far so similar; either one could be better depending on what you want to achieve.
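
To see the tracked/untracked difference for yourself, here is a small sketch in a throwaway repository. The file names are made up for illustration, and grep's --exclude-dir option is assumed to be available (GNU and BSD grep both have it).

```shell
# Throwaway repository: one tracked file and one untracked file, both containing "foo".
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "foo tracked" > tracked.txt
echo "foo untracked" > untracked.txt
git add tracked.txt

# git grep only looks at files git knows about, so untracked.txt is skipped.
git_hits=$(git grep -l foo)

# grep -R walks the directory tree; --exclude-dir=.git keeps git's internals out of the results.
grep_hits=$(grep -Rl --exclude-dir=.git foo .)

echo "git grep: $git_hits"
echo "grep -R:  $grep_hits"
```

Note that a file counts as tracked as soon as it has been added; it does not need to be committed yet.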

What if we want to limit the search to only .rb files?

git grep foo -- '*.rb'
grep -R --include='*.rb' foo .

The plain old grep version is getting a bit more wordy, but if you're used to using grep that may not be a problem. They're still not going to search exactly the same files, but again it depends on what you want to achieve.
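
One thing to watch with the .rb example is quoting. If the glob is left unquoted, the shell may expand it against the current directory before git ever sees it. A minimal sketch, with made-up file names:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir lib
echo "foo" > lib/a.rb
echo "bar" > b.rb
git add .

# Unquoted: the shell expands *.rb to b.rb first, so lib/a.rb is never searched.
# git grep exits non-zero when nothing matches, hence the "|| true".
unquoted=$(git grep -l foo -- *.rb || true)

# Quoted: git receives the pattern itself and matches it against every tracked path.
quoted=$(git grep -l foo -- '*.rb')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The same reasoning applies to grep's --include option, which is why quoting the pattern there is the safer habit too.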

What about searching in the previous version of the project?

git grep foo HEAD^
git checkout HEAD^; grep -R foo .; git checkout -

This is where git grep makes a real difference: You can search in another revision of the project without checking it out first. This isn't a situation that comes up too often for me though; I usually want to search in the version of the project I have checked out.
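
If you don't know which revision contained the text, one common idiom is to hand git grep every commit that git rev-list reports. The sketch below assumes a small repository; on a large history the revision list can get long and the search slow. The commit helper and file name are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper so the commits work without a configured identity.
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

echo "foo was here" > notes.txt
git add notes.txt
commit "add foo"
echo "nothing to see" > notes.txt
git add notes.txt
commit "remove foo"

# Working tree: foo is gone, so git grep finds nothing and exits non-zero.
current=$(git grep foo || true)

# Previous commit only.
previous=$(git grep foo HEAD^)

# Every commit reachable from any ref; each hit is prefixed with the commit hash.
anywhere=$(git grep foo $(git rev-list --all))

echo "$previous"
echo "$anywhere"
```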

Configuring git grep

There are some git config variables that modify the behaviour of git grep and avoid the need to pass a couple of command line arguments:

  • grep.lineNumber: Always show line numbers of matches (you can pass -n to both grep and git grep to get this behaviour)
  • grep.extendedRegexp: Always use extended regular expressions (you can pass -E to both grep and git grep to get this behaviour)
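
Setting the two variables might look like this. The sketch uses a throwaway repository and repository-local config; the sample file is made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'alpha\nfoo1\nfoo22\n' > sample.txt
git add sample.txt

# Repository-local settings; add --global to apply them to all repositories.
git config grep.lineNumber true
git config grep.extendedRegexp true

# No -n or -E on the command line: "+" is still read as an extended-regex
# quantifier, and each match is prefixed with its line number.
hits=$(git grep 'foo[0-9]+')
echo "$hits"
```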

In practice

In practice I have gg aliased to git grep -En, and this almost always does what I want.
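
As a sketch, the shortcut can be written as a shell function (which also works in scripts, unlike an alias). The name gg comes from the answer above; the sample file is made up.

```shell
# Same effect as aliasing gg to "git grep -En"; "$@" forwards all arguments.
gg() { git grep -En "$@"; }

repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'first\nfoo\nbar\n' > words.txt
git add words.txt

# -E makes "|" an alternation, -n prefixes each match with its line number.
hits=$(gg 'foo|bar')
echo "$hits"
```

In an interactive shell, a line such as alias gg='git grep -En' in your shell startup file would be the equivalent.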

Solution 2 - Linux

The main advantage of git grep is that it can search the contents of the git repository itself, i.e. also in revisions other than the currently checked-out one. Plain grep cannot do this, of course. git grep also has a lot more features, such as pattern arithmetic (things like git grep -e pattern1 --and --not \( -e pattern2 -e pattern3 \)), path filtering using globs (things like git grep pattern -- '*.[ch]' to search only in .c and .h files), and some more.
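
Here is a small sketch combining pattern arithmetic with a glob pathspec. The file names and contents are invented for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir src
printf 'alloc buffer\nalloc and free\nfree buffer\n' > src/mem.c
printf 'alloc only\n' > src/mem.h
printf 'alloc in docs\n' > README.txt
git add .

# Lines that contain "alloc" but not "free", restricted to .c and .h files.
# --and/--not combine the -e patterns per line; README.txt is excluded by the pathspec.
hits=$(git grep -n -e alloc --and --not -e free -- '*.[ch]')
echo "$hits"
```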

Here's an example session for searching in an older revision:

$ mkdir git-test                 # create fresh repository
$ cd git-test/
$ git init .
Initialized empty Git repository in /home/alfe/git-test/.git/
$ echo eins zwei drei > bla      # create example file
$ git add bla                    # add and commit it
$ git commit bla
[master (root-commit) 7494515] .
 1 file changed, 1 insertion(+)
 create mode 100644 bla
$ echo vier fuenf sechs > bla    # perform a change on that file
$ git commit -m 'increase' bla   # commit it
[master 062488e] increase
 1 file changed, 1 insertion(+), 1 deletion(-)
$ git grep eins | cat            # grep for outdated pattern in current version
                                  # (finds nothing)
$ git grep eins master^ | cat    # grep for outdated pattern on former version
                                  # finds it:
master^:bla:eins zwei drei

Solution 3 - Linux

git grep only searches in the tracked files in the repo.

With grep you have to pass the list of files to search through, and you would have to filter out any untracked files yourself.

So if you are searching for something that you know is in the repo, git grep saves you time as all you have to do is provide the pattern. It also is useful for not having to search through anything that is untracked in the repo.
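
For comparison, doing that filtering by hand with plain grep could look like the sketch below, where git ls-files supplies the list of tracked files. File names are made up; xargs -0 is assumed to be available (GNU and BSD xargs both support it).

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "foo tracked" > tracked.txt
echo "foo untracked" > scratch.log
git add tracked.txt

# Plain grep limited to tracked files: -z / -0 keep unusual file names intact.
manual=$(git ls-files -z | xargs -0 grep -l foo)

# git grep applies the same restriction without any extra plumbing.
with_git=$(git grep -l foo)

echo "manual:   $manual"
echo "git grep: $with_git"
```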

Solution 4 - Linux

If you're searching for patterns/strings within a git repository (i.e. in files that are already tracked), then yes, git grep is typically much faster than regular grep: it takes the list of files to search from git's index instead of walking the whole directory tree, so untracked and ignored files are skipped. (You can try this out yourself; git grep should be perceptibly faster.)

Solution 5 - Linux

If you are searching in a Git repo, git grep is faster.

And with Git 2.20 (Q4 2018), it is also more compatible, option-wise, with the regular grep.

As discussed in this git grep wishlist request:

> I often use "grep -r $pattern" to recursively grep a source tree.
> If that takes too long, I hit ^C and tag "git" in front of the command line and re-run it.
> Git then complains "error: unknown switch `r'" because "git grep" is naturally recursive.
>
> Could we have "git grep -r" accept the argument for compatibility? Other important grep switches like "-i" are compatible, adding `-r` would improve usability.

This is now (Git 2.20, Q4 2018) done:

See commit 0a09e5e (01 Oct 2018) by René Scharfe (rscharfe).
Suggested-by: Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit 9822b8f, 19 Oct 2018)

> ## grep: add -r/--[no-]recursive

> Recognize -r and --recursive as synonyms for --max-depth=-1 for compatibility with GNU grep; it's still the default for git grep.
>
> This also adds --no-recursive as synonym for --max-depth=0 for free, which is welcome for completeness and consistency.
>
> Fix the description for --max-depth, while we're at it -- negative values other than -1 actually disable recursion, i.e. they are equivalent to --max-depth=0.
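
A quick way to check the behaviour described in that commit, assuming Git 2.20 or later is installed (older versions reject -r). The directory layout is made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir sub
echo "foo top" > top.txt
echo "foo nested" > sub/nested.txt
git add .

# -r is accepted for grep compatibility and matches the default recursive behaviour.
recursive=$(git grep -l -r foo)

# --max-depth=0 stays in the current directory (--no-recursive is the synonym).
shallow=$(git grep -l --max-depth=0 foo)

echo "recursive: $recursive"
echo "shallow:   $shallow"
```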


Note that "git grep --untracked" is meant to be "let's ALSO find in these files on the filesystem" when looking for matches in the working tree files, and does not make any sense if the primary search is done against the index or the tree objects.
The "--cached" and "--untracked" options have been marked as mutually incompatible with Git 2.31 (Q1 2021).

See commit 0c5d83b (08 Feb 2021) by Matheus Tavares (matheustavares).
(Merged by Junio C Hamano -- gitster -- in commit f712632, 17 Feb 2021)

> ## grep: error out if --untracked is used with --cached
> Signed-off-by: Matheus Tavares
> Reviewed-by: Elijah Newren

> The options --untracked and --cached are not compatible, but if they are used together, grep just silently ignores --cached and searches the working tree.
> Error out, instead, to avoid any potential confusion.

--untracked cannot be used with --cached
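
A sketch of how the two options behave on their own, plus a check of the combination. The rejection is only expected with Git 2.31 or later; older versions accept the combination and ignore --cached, as the commit message describes. File names are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "foo tracked" > tracked.txt
echo "foo untracked" > untracked.txt
git add tracked.txt

# Default: tracked files in the working tree only.
default_hits=$(git grep -l foo)

# --untracked: additionally search untracked files on the filesystem.
with_untracked=$(git grep -l --untracked foo)

# --cached: search the staged content in the index instead of the working tree.
cached_hits=$(git grep -l --cached foo)

# The outcome of combining the two depends on the installed Git version.
if git grep --cached --untracked foo >/dev/null 2>&1; then
  echo "combination accepted (expected before Git 2.31)"
else
  echo "combination rejected"
fi
```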

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Jim | View Question on Stackoverflow
Solution 1 - Linux | georgebrock | View Answer on Stackoverflow
Solution 2 - Linux | Alfe | View Answer on Stackoverflow
Solution 3 - Linux | Schleis | View Answer on Stackoverflow
Solution 4 - Linux | Debajit | View Answer on Stackoverflow
Solution 5 - Linux | VonC | View Answer on Stackoverflow