How to complete a git clone for a big project on an unstable connection?

Tags: Git, Git Clone

Git Problem Overview


I am trying to git clone the LibreOffice codebase, but at the moment I have an internet connection of about 300 kbps and it's anything but stable. I can get the connection back at any moment, but by then the git clone process has already stopped working, with no way to get it running again. Is there some way to have a more failure-resistant git clone download?

One option I considered myself is to download someone else's .git directory, but that is overly dependent on others and doesn't seem like the best possible solution to me.

Git Solutions


Solution 1 - Git

Two solutions (or rather workarounds) that come to mind are:

  • Use a shallow clone, i.e. git clone --depth=1, then deepen this clone using git fetch --depth=N with increasing N. You can use git fetch --unshallow (since 1.8.0.3) to download all remaining revisions.

  • Ask somebody to bundle up the repository up to some tagged release (see the git-bundle(1) manpage). The bundle itself is an ordinary file which you can download any way: via HTTP/FTP with resume support, via BitTorrent, via rsync, etc. Then you can create a clone from the bundle, fix the configuration, and do further fetches from the official LibreOffice repository (see the sketch after this list).
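A minimal sketch of both workarounds (the core.git URL and the bundle file name are only illustrative):

    # Workaround 1: shallow clone, then deepen in small, restartable steps.
    git clone --depth=1 https://anongit.freedesktop.org/git/libreoffice/core.git
    cd core
    git fetch --depth=100     # each deepening step is a much smaller download
    git fetch --depth=1000
    git fetch --unshallow     # finally, fetch all remaining revisions

    # Workaround 2: clone from a bundle somebody sent you, then point
    # origin at the official repository for further fetches.
    git clone libreoffice.bundle core
    cd core
    git remote set-url origin https://anongit.freedesktop.org/git/libreoffice/core.git
    git fetch origin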

Solution 2 - Git

I don't think this is ready yet. There's an old GSoC page that planned to implement your desired feature. My best bet is, like you suggested, to download it as a directory. I'm assuming you are able to resume downloads over other protocols.

> Restartable Clone
>
> When cloning a large repository (such as KDE, Open Office, Linux kernel) there is currently no way to restart an interrupted clone. It may take considerable time for a user on the end of a small pipe to download the data, and if the clone is interrupted in the middle the user currently needs to start over from the beginning and try again. For some users this may make it impossible to clone a large repository.
>
> Goal: Allow git-clone to automatically resume a previously failed download over the native git:// protocol.
> Language: C
> Mentor: Shawn Pearce
> Suggested by: Shawn Pearce on gmane


Update

Along with the shallow-clone suggestion (git clone --depth=1) in one of the other answers, it may be helpful if someone can make a bare repository for you, if you can communicate with the provider. You can easily convert the bare repository to a full repository. Also read the comments in that answer, as a shallow clone may not always help.

Solution 3 - Git

This method uses a third-party server.

First, do git clone --bare on the server, then copy it down with rsync -avP -e ssh user@host:repo.git . (the -P flag lets rsync resume interrupted transfers; -a is needed to copy the directory recursively). You can use msys under Windows.
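A sketch of the idea, with a hypothetical host and repository URL:

    # On the third-party server (stable, fast connection):
    ssh user@host 'git clone --bare https://anongit.freedesktop.org/git/libreoffice/core.git repo.git'

    # On your machine: -P (--partial --progress) keeps partial files, so a
    # dropped connection can be resumed simply by re-running the command.
    rsync -avP -e ssh user@host:repo.git .

    # Finally, clone from the local bare copy:
    git clone repo.git core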

Solution 4 - Git

"Never underestimate the bandwidth of a carrier pigeon and a bundle of SD cards" would be the modern form of this answer. Tar it up, plain old cp -a it, whatever, and mail the damn thing. Find someone willing to take two minutes of their time to drop a thumb drive into an SASE. Find a contact, there, they might even do it for you.

Solution 5 - Git

You can "download someone else's .git directory", but with that someone else being the official repository itself. The LibreOffice repositories are available via http, for instance their build.git is at http://anongit.freedesktop.org/git/libreoffice/build.git/ (see http://cgit.freedesktop.org/libreoffice/ for the complete list, the http URL is at the bottom of each repository's page).

What you see at these http URLs is nothing more than a .git directory (actually a "bare" repository, which has only what you would find in the .git directory). It is the same directory the server for the git:// protocol (git daemon) would read. If you make a copy of these directories with a web downloader (for instance wget -m -np), you can clone from your copy and it will work as well as if you had cloned directly from the http repository.

So, what you can do is: for each repository, get a copy of it with your favorite web downloader (which will deal with all the issues with resuming broken downloads), and clone from that copy. When you want to update, use again your favorite web downloader to update your copy, and pull from that copy. Now your clones and updates are as resistant to bad connections as your favorite web downloader is.
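As a sketch, using the build.git URL from above (this assumes, as the answer does, that the server keeps its plain-HTTP index files up to date):

    # Mirror the bare repository; re-running wget -m resumes and
    # refreshes an interrupted download.
    wget -m -np http://anongit.freedesktop.org/git/libreoffice/build.git/

    # Clone (and later pull) from the local mirror instead of the network:
    git clone anongit.freedesktop.org/git/libreoffice/build.git build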

Solution 6 - Git

I would like to put my 5 cents here. This is actually what helped me solve this issue:

  • Turn off compression
  • Increase http.postBuffer
  • Do a shallow clone (--depth 1)
  • Navigate to the cloned directory and fetch the rest of the clone
  • Pull the rest

git config --global core.compression 0
git config --global http.postBuffer 524288000
git clone <your_git_http_url_here> --depth 1
cd <your_repo_directory>
git fetch --unshallow
git pull --all

This helped me clone a ~3 GB repo over an 8 Mbps ADSL connection; of course, I had to perform the fetch and pull a few times, but still...

Solution 7 - Git

Increasing the buffer size will help you with this problem. Just follow the steps.

  1. Open a terminal or Git Bash and cd to the location where you want to clone the repo.

  2. Set compression to 0

    git config --global core.compression 0
    
  3. Set postBuffer size

    git config --global http.postBuffer 1048576000
    
  4. Set maxRequestBuffer size

    git config --global http.maxRequestBuffer 100M
    
  5. Now start clone

    git clone <repo url>
    
  6. Wait till the clone completes.

Solution 8 - Git

git clone --depth <Number> <repository> --branch <branch name> --single-branch

This command helped me (thanks to Nicola Paolucci).

For example:

git clone --depth 1 https://github.com/gokhanmoral/siyahkernel3 --branch ics  --single-branch

Solution 9 - Git

Let's break git clone down into its component parts, and use git checkout to prevent re-downloading files.

When git clone runs, the first few things it does are equivalent to

git init
git remote add origin <repo_url>
git fetch origin <branch>

If you run the above steps manually, and assuming that they completed correctly, you can now run the following as many times as necessary:

git checkout --force <branch>

Note that it will check out all files each time it's run, but you will not have to re-download files, which may save you a ton of time.
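As a concrete sketch (the repository URL and branch name are placeholders):

    # One-time setup, equivalent to the first part of git clone:
    git init libreoffice && cd libreoffice
    git remote add origin https://anongit.freedesktop.org/git/libreoffice/core.git
    git fetch origin master

    # Repeat as often as needed; already-present files are not re-downloaded:
    git checkout --force master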

Solution 10 - Git

If you have access to a third-party server, you could clone there and then copy.

Solution 11 - Git

Use a git proxy, such as ngitcached or git-proxy.

Solution 12 - Git

This problem bit me too. In my case there is a work-around. It may or may not apply in your case.

I'm using a mobile phone sometimes to initiate git operations on a remote system. If my wi-fi breaks, of course the session ends and git drops the whole clone operation without recovering. But since the internet connection from my remote system to the git master is solid, there's no need for the clone to stop. All I need is the common sense to detach the clone from the terminal session. This can be done by using screen/tmux or nohup/daemon. So it's a liveware malfunction in my case.
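A sketch of detaching the clone from the terminal session (host and URL are placeholders):

    # Start the clone inside tmux on the remote system:
    ssh user@remote
    tmux new -s clone
    git clone https://anongit.freedesktop.org/git/libreoffice/core.git
    # If the wi-fi drops, the clone keeps running; reattach later with:
    tmux attach -t clone

    # Alternatively, detach with nohup and run in the background:
    nohup git clone https://anongit.freedesktop.org/git/libreoffice/core.git > clone.log 2>&1 &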

Solution 13 - Git

Use Ctrl+Z to stop the cloning. Don't close the terminal; put the system/laptop into hibernation and then continue later with the fg command. I was facing this same problem today while trying to clone a repo from github. This came as a time saver for me.
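This relies on ordinary shell job control; a sketch (placeholder URL):

    git clone https://github.com/some/repo.git
    # Press Ctrl+Z: the clone is suspended (stopped), not killed.
    jobs    # lists the suspended job
    fg      # brings the clone back to the foreground to continue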

Solution 14 - Git

Same problem here - I have a really flaky internet connection with often not more than 10-15 kb/sec :-P

For me the wget way worked very well.

Go to the repository page where the green "Clone or download" button is, click it, and copy the link of the ZIP download option.

Then insert the link into the wget command:

wget -c -m -np https://github.com/your/repository/archive/master.zip

Works like a charm...

Solution 15 - Git

If we assume servers have good bandwidth (and you have a server), another answer is to:

  1. Create your own server using a server-side Git wrapper
  2. Clone the repository on your server
  3. Zip it using a server-side zip archiver
  4. Download it from the server, with server-side resume support

But this only requires very basic web-development experience ;) and you also need git.exe on your server.
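A sketch of the server-side steps (the host name and repository URL are placeholders):

    # On your server (good bandwidth):
    git clone --mirror https://anongit.freedesktop.org/git/libreoffice/core.git
    zip -r core.zip core.git

    # On your machine: wget -c re-uses the partially downloaded file
    # after a dropped connection.
    wget -c https://your-server.example.com/core.zip
    unzip core.zip && git clone core.git core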

Solution 16 - Git

The best workaround that worked for me:

I faced the same issue with a bad internet connection. So I came up with the following solution:

I created a small PHP file on my server to download the package as a zip file:

<?php
// Fetch the repository snapshot on the server, which has a stable
// connection; file_put_contents() accepts the stream from fopen().
$url = "https://codeload.github.com/CocoaPods/Specs/zip/master";
file_put_contents("coco.zip", fopen($url, 'r'));
?>

<a href="coco.zip">coco.zip</a>

Then download the zip file using any download manager that supports resume.

Solution 17 - Git

You can try to use Mercurial with the hg-git extension.

If that doesn't work you can use git fetch <commit-id> to fetch only parts of a remote git repository (you can fetch into an empty git repository; there is no need to create it with clone). But you might have to correct the branch configuration (i.e. create local and remote-tracking branches) when you use this approach, as sketched below.
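A sketch of fetching into an empty repository and fixing the branch configuration by hand (URL and branch are placeholders):

    git init repo && cd repo
    git remote add origin https://anongit.freedesktop.org/git/libreoffice/core.git
    # Fetch a single branch; the default refspec also updates origin/master:
    git fetch origin master
    # Create the local branch with its remote-tracking configuration:
    git checkout -b master origin/master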

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | LaPingvino | View Question on Stackoverflow
Solution 1 - Git | Jakub Narębski | View Answer on Stackoverflow
Solution 2 - Git | Jungle Hunter | View Answer on Stackoverflow
Solution 3 - Git | Rafal Rusin | View Answer on Stackoverflow
Solution 4 - Git | jthill | View Answer on Stackoverflow
Solution 5 - Git | CesarB | View Answer on Stackoverflow
Solution 6 - Git | matson kepson | View Answer on Stackoverflow
Solution 7 - Git | Swapnil Naukudkar | View Answer on Stackoverflow
Solution 8 - Git | Ahed Eid | View Answer on Stackoverflow
Solution 9 - Git | cowlinator | View Answer on Stackoverflow
Solution 10 - Git | Amber | View Answer on Stackoverflow
Solution 11 - Git | Amr Mostafa | View Answer on Stackoverflow
Solution 12 - Git | Tony Sidaway | View Answer on Stackoverflow
Solution 13 - Git | Jicksy John | View Answer on Stackoverflow
Solution 14 - Git | X-File | View Answer on Stackoverflow
Solution 15 - Git | Top-Master | View Answer on Stackoverflow
Solution 16 - Git | Zorox | View Answer on Stackoverflow
Solution 17 - Git | Rudi | View Answer on Stackoverflow