How does `scp` differ from `rsync`?

RsyncScp

Rsync Problem Overview


An article about setting up Ghost blogging says to use scp to copy from my local machine to a remote server:

scp -r ghost-0.3 root@*your-server-ip*:~/

However, Railscast 339: Chef Solo Basics uses scp to copy in the opposite direction (from the remote server to the local machine):

scp -r root@178.xxx.xxx.xxx:/var/chef .

In the same Railscast, when the author wants to copy files to the remote server (same direction as the first example), he uses rsync:

rsync -r . root@178.xxx.xxx.xxx:/var/chef

Why use the rsync command if scp will copy in both directions? How does scp differ from rsync?

Rsync Solutions


Solution 1 - Rsync

The major difference between these tools is how they copy files.

scp basically reads the source file and writes it to the destination. It performs a plain linear copy, locally, or over a network.

rsync also copies files locally or over a network. But it employs a special delta transfer algorithm and a few optimizations to make the operation a lot faster. Consider the call.

rsync A host:B
  • rsync will check files sizes and modification timestamps of both A and B, and skip any further processing if they match.

  • If the destination file B already exists, the delta transfer algorithm will make sure only differences between A and B are sent over the wire.

  • rsync will write data to a temporary file T, and then replace the destination file B with T to make the update look "atomic" to processes that might be using B.

Another difference between them concerns invocation. rsync has a plethora of command line options, allowing the user to fine tune its behavior. It supports complex filter rules, runs in batch mode, daemon mode, etc. scp has only a few switches.

In summary, use scp for your day to day tasks. Commands that you type once in a while on your interactive shell. It's simpler to use, and in those cases rsync optimizations won't help much.

For recurring tasks, like cron jobs, use rsync. As mentioned, on multiple invocations it will take advantage of data already transferred, performing very quickly and saving on resources. It is an excellent tool to keep two directories synchronized over a network.

Also, when dealing with large files, use rsync with the -P option. If the transfer is interrupted, you can resume it where it stopped by reissuing the command. See Sid Kshatriya's answer.

Finally, note that rsync:// the protocol is similar to plain HTTP: unencrypted and no integrity checks. Be sure to always use rsync via SSH (as in the examples from the question above), not via the rsync protocol, unless you really know what you're doing. scp will always use SSH as underlying transport mechanism which has both integrity and confidentiality guarantees, so that is another difference between the two utilities.

Solution 2 - Rsync

rysnc can be useful to run on slow and unreliable connections. So if your download aborts in the middle of a large file rysnc will be able to continue from where it left off when invoked again.

Use rsync -vP username@host:/path/to/file .

The -P option preserves partially downloaded files and also shows progress.

As usual check man rsync

Solution 3 - Rsync

Difference b/w scp and rsync on different parameter

1. Performance over latency

  • scp : scp is relatively less optimise and speed

  • rsync : rsync is comparatively more optimise and speed

https://www.disk91.com/2014/technology/networks/compare-performance-of-different-file-transfer-protocol-over-latency/

2. Interruption handling

  • scp : scp command line tool cannot resume aborted downloads from lost network connections

  • rsync : If the above rsync session itself gets interrupted, you can resume it as many time as you want by typing the same command. rsync will automatically restart the transfer where it left off.

http://ask.xmodulo.com/resume-large-scp-file-transfer-linux.html

3. Command Example

scp
$ scp source_file_path destination_file_path
rsync
$ cd /path/to/directory/of/partially_downloaded_file
$ rsync -P --rsh=ssh [email protected]:bigdata.tgz ./bigdata.tgz 

The -P option is the same as --partial --progress, allowing rsync to work with partially downloaded files. The --rsh=ssh option tells rsync to use ssh as a remote shell.

4. Security :

scp is more secure. You have to use rsync --rsh=ssh to make it as secure as scp.

man document to know more :

performance chart

Solution 4 - Rsync

One major feature of rsync over scp (beside the delta algorithm and encryption if used w/ ssh) is that it automatically verifies if the transferred file has been transferred correctly. Scp will not do that, which occasionally might result in corruption when transferring larger files. So in general rsync is a copy with guarantee.

Centos manpages mention this the end of the --checksum option description:

> Note that rsync always verifies that each transferred file was > correctly reconstructed on the receiving side by checking a whole-file > checksum that is generated as the file is transferred, but that > automatic after-the-transfer verification has nothing to do with this > option’s before-the-transfer “Does this file need to be updated?” > check.

Solution 5 - Rsync

There's a distinction to me that scp is always encrypted with ssh (secure shell), while rsync isn't necessarily encrypted. More specifically, rsync doesn't perform any encryption by itself; it's still capable of using other mechanisms (ssh for example) to perform encryption.

In addition to security, encryption also has a major impact on your transfer speed, as well as the CPU overhead. (My experience is that rsync can be significantly faster than scp.)

Check out this post for when rsync has encryption on.

Solution 6 - Rsync

scp is best for one file.
OR a combination of tar & compression for smaller data sets like source code trees with small resources (ie: images, sqlite etc).


Yet, when you begin dealing with larger volumes say:

  • media folders (40 GB)
  • database backups (28 GB)
  • mp3 libraries (100 GB)

It becomes impractical to build a zip/tar.gz file to transfer with scp at this point do to the physical limits of the hosted server.

As an exercise, you can do some gymnastics like piping tar into ssh and redirecting the results into a remote file. (saving the need to build a swap or temporary clone aka zip or tar.gz)

However,

rsync simplify's this process and allows you to transfer data without consuming any additional disc space.

Also,

Continuous (cron?) updates use minimal changes vs full cloned copies speed up large data migrations over time.

tl;dr
scp == small scale (with room to build compressed files on the same drive)
rsync == large scale (with the necessity to backup large data and no room left)

Solution 7 - Rsync

it's better to think in a practical context. In our team, we use rsync -aP to replace a bad cassandra host in our cluster. We can't do this with scp (slow and no progress preservation).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLeahcimView Question on Stackoverflow
Solution 1 - RsyncRafaView Answer on Stackoverflow
Solution 2 - RsyncSid KshatriyaView Answer on Stackoverflow
Solution 3 - RsyncAmitesh BhartiView Answer on Stackoverflow
Solution 4 - Rsynccount0View Answer on Stackoverflow
Solution 5 - RsyncSamuel LiView Answer on Stackoverflow
Solution 6 - RsynccopremesisView Answer on Stackoverflow
Solution 7 - Rsyncdel baoView Answer on Stackoverflow