Git clone changes file modification time
Problem Overview
When I clone a Git repository using the "git clone ..." command, all cloned files in my local repository have the same modification time: the date and time at which the git clone command was issued.
Is there a way to clone a remote Git repository with the actual modification time for each file?
Git Solutions
Solution 1 - Git
Git does not record timestamps for files, since it is a distributed VCS (meaning the time on your computer can be different from mine: there is no "central" notion of time and date).
The official argument for not recording that metadata is explained in this answer.
But you can find scripts which will attempt to restore a meaningful date, like this one (or a simpler version of the same idea).
Solution 2 - Git
You can retrieve the last modification date of all files in a Git repository (last commit time). See How to retrieve the last modification date of all files in a Git repository.
Then use the touch command to change the modification date:
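# Set each tracked file's mtime to the author time of its last commit (needs GNU date for -d @epoch).
git ls-tree -r --name-only HEAD | while IFS= read -r filename; do
    unixtime=$(git log -1 --format="%at" -- "${filename}")
    touchtime=$(date -d "@${unixtime}" +'%Y%m%d%H%M.%S')
    touch -t "${touchtime}" "${filename}"
done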
Also see my gist here.
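The conversion step in the loop above (epoch seconds to the string that touch -t expects) can be sketched in Python as well. This is only an illustration: the function name touch_stamp and its tz parameter are made up for this example and are not part of Git or any library.

```python
from datetime import datetime, timezone


def touch_stamp(epoch, tz=None):
    """Format a Unix timestamp as the CCYYMMDDhhmm.SS string used by `touch -t`.

    tz=None formats in the local timezone, which is how `touch -t`
    interprets its argument; pass an explicit tz only for reproducible output.
    """
    return datetime.fromtimestamp(int(epoch), tz=tz).strftime("%Y%m%d%H%M.%S")


# Formatting in UTC so the result does not depend on the machine's timezone.
print(touch_stamp(1700000000, tz=timezone.utc))  # 202311142213.20
```

The input would be the %at value printed by git log, as in the shell loop.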
Solution 3 - Git
Another option for resetting the mtime is git-restore-mtime.
sudo apt install git-restore-mtime # Debian/Ubuntu example
git clone <myurl>
cd <mydir>
git restore-mtime
Solution 4 - Git
This Linux one-liner fixes the modification time of all files (files only, not folders), including files with spaces in their names. Note that it skips any file whose name contains a single quote:
git ls-files -z | xargs -0 -n1 -I{} -- git log -1 --format="%ai {}" {} | perl -ne 'chomp;next if(/'"'"'/);($d,$f)=(/(^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d(?: \+\d\d\d\d|)) (.*)/);print "d=$d f=$f\n"; `touch -d "$d" '"'"'$f'"'"'`;'
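The reason for -z and xargs -0 is that NUL is the only byte that cannot appear in a path, so splitting on it is safe for names with spaces, quotes, or even newlines. A minimal Python sketch of that splitting step, assuming the raw bytes come from git ls-files -z (the helper name split_nul is hypothetical):

```python
import os


def split_nul(raw):
    """Split NUL-delimited output (e.g. from `git ls-files -z`) into path strings."""
    # Empty chunks (such as the one after the trailing NUL) are dropped.
    return [os.fsdecode(chunk) for chunk in raw.split(b"\0") if chunk]


sample = b"a b.txt\0it's.txt\0"
print(split_nul(sample))
```

Unlike the one-liner, nothing has to be skipped here, because the names never pass through shell quoting.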
Solution 5 - Git
A shorter variant of user11882487's answer that I find easier to understand:
git ls-files | xargs -I{} git log -1 --date=format:%Y%m%d%H%M.%S --format='touch -t %ad "{}"' "{}" | $SHELL
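One caveat with generating touch commands as text and piping them to a shell is that file names containing quotes or $ can break the "{}" quoting. A hedged sketch of building the same kind of command line with proper quoting, using Python's shlex.quote (the function name touch_command is made up for this example):

```python
import shlex


def touch_command(stamp, path):
    """Build a `touch -t` command line that is safe to pass to a POSIX shell."""
    # shlex.quote leaves simple tokens alone and single-quotes anything risky.
    return "touch -t {} -- {}".format(shlex.quote(stamp), shlex.quote(path))


print(touch_command("202311142213.20", "a b.txt"))
# touch -t 202311142213.20 -- 'a b.txt'
```

The stamp is assumed to already be in the %Y%m%d%H%M.%S form that the git log --date=format: option produces above.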
Solution 6 - Git
Adding to the list of one-liners ...
git ls-files -z | while IFS= read -r -d '' f; do touch -d "$(git log -1 --format='%aI' -- "$f")" "$f"; done
Solution 7 - Git
Running log -1 once per file irks me, so I wrote this to do them all in one pass:
( # don't alter any modified-file stamps:
  git diff --name-status --no-find-copies --no-renames | awk '$1="D"' FS=$'\t' OFS=$'\t'
  git log --pretty=%cI --first-parent --name-status -m --no-find-copies --no-renames
) | awk ' NF==1 { date=$1 }
          NF<2 || seen[$2]++ { next }
          $1!="D" { print "touch -d",date,$2 }' FS=$'\t'
which does the Linux history in like ten seconds (piping all the touch commands through a shell takes a minute).
This is a good way to ruin e.g. bisecting, and I'm in the camp of "don't even start down the road of trying to overload filesystem timestamps". The people who insist on doing this are apparently going to have to learn the hard way, but I can see that maybe there are workflows where this really won't hurt you.
Whatever. But, for sure, do not do this blindly.
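For readers who find the awk hard to follow, here is a rough Python sketch of the same one-pass idea: walk the log newest-first, remember the current commit date, and keep only the first date seen for each path. It assumes tab-separated --name-status output with a date line per commit, as produced by the git log invocation above. The function name newest_dates and the skip parameter (standing in for the locally modified files from git diff) are hypothetical.

```python
def newest_dates(log_text, skip=()):
    """Map each path to the date of the newest commit that touched it.

    log_text: output of `git log --pretty=%cI --name-status ...` (newest first).
    skip: paths to leave alone, e.g. files modified in the working tree.
    """
    seen = set(skip)
    dates = {}
    current = None
    for line in log_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            if fields[0]:          # a lone non-empty field is a commit date
                current = fields[0]
            continue               # blank separator lines are ignored
        status, path = fields[0], fields[1]
        if path in seen:
            continue               # an older commit for an already handled path
        seen.add(path)
        if status != "D":          # deleted files have nothing to touch
            dates[path] = current
    return dates
```

The resulting dictionary can then be applied with os.utime or turned into touch commands; the same caution about not doing this blindly applies.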
Solution 8 - Git
This applies to solutions in multiple previous answers:
Use the %at format, and then touch -d @$epochdelta, to avoid date-time conversion issues.
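To illustrate why epoch seconds sidestep conversion issues: the same instant can be written with different UTC offsets, but it always maps to a single epoch value. A small sketch, assuming strict ISO 8601 strings like those printed by %aI (the helper name iso_to_epoch is made up for this example):

```python
from datetime import datetime


def iso_to_epoch(stamp):
    """Convert a strict ISO 8601 timestamp with a UTC offset to Unix epoch seconds."""
    return int(datetime.fromisoformat(stamp).timestamp())


# Two spellings of the same instant yield the same epoch value.
print(iso_to_epoch("2023-11-14T22:13:20+00:00"))  # 1700000000
print(iso_to_epoch("2023-11-14T23:13:20+01:00"))  # 1700000000
```

With %at, git already prints that epoch value, so no parsing or timezone handling is needed at all.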
Solution 9 - Git
To do this in Python is simpler than some of these other options, as os.utime accepts the Unix timestamp output by the git log command. This example uses GitPython but it'd also work with subprocess.run to call git log.
import git
from os import utime
from pathlib import Path

repo_path = "my_repo"
repo = git.Repo(repo_path)

for n in repo.tree().list_traverse():
    filepath = Path(repo.working_dir) / n.path
    unixtime = repo.git.log(
        "-1", "--format='%at'", "--", n.path
    ).strip("'")
    if not unixtime.isnumeric():
        raise ValueError(
            f"git log gave non-numeric timestamp {unixtime} for {n.path}"
        )
    utime(filepath, times=(int(unixtime), int(unixtime)))
This matches the results of the git restore-mtime command in this answer and the script in the highest rated answer.
If you're doing this immediately after cloning, then you can reuse the to_path parameter passed to git.Repo.clone_from instead of accessing the working_dir attribute on the Repo object.
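As mentioned, the same thing should work without GitPython by calling git through subprocess.run. The following is a minimal sketch under that assumption, not a tested replacement for the GitPython version; the function names last_commit_epochs and apply_mtimes are hypothetical, and it needs a git executable on the PATH.

```python
import os
import subprocess
from pathlib import Path


def last_commit_epochs(repo_dir):
    """Map each tracked path to the author time (%at) of its last commit."""
    def git(*args):
        return subprocess.run(
            ["git", "-C", str(repo_dir), *args],
            capture_output=True, text=True, check=True,
        ).stdout

    epochs = {}
    # -z gives NUL-separated paths, so unusual file names survive intact.
    for rel in filter(None, git("ls-files", "-z").split("\0")):
        out = git("log", "-1", "--format=%at", "--", rel).strip()
        if out.isdigit():          # empty for files that are staged but never committed
            epochs[rel] = int(out)
    return epochs


def apply_mtimes(repo_dir, epochs):
    """Set access and modification times of each existing path to its epoch value."""
    for rel, epoch in epochs.items():
        target = Path(repo_dir) / rel
        if target.exists():
            os.utime(target, times=(epoch, epoch))
```

Usage would be apply_mtimes("my_repo", last_commit_epochs("my_repo")), with "my_repo" being the same placeholder path as in the GitPython example.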