git diff algorithm that does not rip functions apart? (language-aware diff)
Git Problem Overview
Is it possible to configure git diff to respect indentation and syntax? I am not talking about ignoring indentation and whitespace, but about using blank lines, indentation levels, and possibly brackets to help match old lines to new lines.
E.g. git diff often cuts through functions and their docblock, like this:
 class C {
   /**
+   * Goes to the bar.
+   */
+  function bar() {
+    return 'bar';
+  }
+
+  /**
    * Gets your foo up to date.
    */
   function foo() {
I would prefer this instead:
 class C {
+
+  /**
+   * Goes to the bar.
+   */
+  function bar() {
+    return 'bar';
+  }
   /**
    * Gets your foo up to date.
    */
   function foo() {
This example is still fairly harmless, but there are cases where functions and their docblocks are really ripped apart by the greedy, naive diff implementation.
Note: I already configured *.php diff=php in ~/.gitattributes.
EDIT: Another example: Here git diff mixes a property docblock with a method docblock:
   /**
-   * @var int
+   * @param string $str
    */
Git Solutions
Solution 1 - Git
I do not know how to do that with git alone, but there is at least one commercial (i.e. paid) tool that deals with this kind of problem: SemanticMerge.
It can handle quite a lot of cool cases, and supports C#, Java, and partially C. You can configure git to use it as a merge tool.
(I'm not affiliated.)
Solution 2 - Git
First of all, use a more sophisticated diff algorithm, for example:
git config --global diff.algorithm histogram
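Before changing the global default, it may help to compare the algorithms on a concrete file. The following is only a rough sketch: it builds a throwaway repository with a file modelled on the example from the question (the file name C.php and the body of foo() are assumptions made for illustration) and prints the diff once per algorithm. Whether the hunk boundaries actually differ depends on the file and on your Git version.

```shell
#!/bin/sh
# Sketch: compare git's diff algorithms in a throwaway repository.
# C.php and its contents are hypothetical, modelled on the question.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Original version: one documented method.
cat > C.php <<'EOF'
<?php
class C {

    /**
     * Gets your foo up to date.
     */
    function foo() {
        return 'foo';
    }
}
EOF
git add C.php
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# New version: a second documented method inserted above foo().
cat > C.php <<'EOF'
<?php
class C {

    /**
     * Goes to the bar.
     */
    function bar() {
        return 'bar';
    }

    /**
     * Gets your foo up to date.
     */
    function foo() {
        return 'foo';
    }
}
EOF

# Try each algorithm for a single invocation, without touching any config.
for algo in myers minimal patience histogram; do
    echo "== $algo =="
    git --no-pager diff --diff-algorithm="$algo" -- C.php
done

# Make histogram the default for this repository only (no --global).
git config diff.algorithm histogram
```

For one-off use, git diff --histogram and git diff --patience are shorthands for the corresponding --diff-algorithm values. Recent Git versions also have an --indent-heuristic option (enabled by default, as far as I know) that shifts hunk boundaries to make patches easier to read, which is close to what the question asks for.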
Then there are also semantic diff tools such as GumTree (https://github.com/GumTreeDiff/gumtree), whose algorithm has also been implemented in clang-diff: https://github.com/krobelus/clang-diff-playground