Why when I transfer a file through SFTP, it takes longer than FTP?
PerformanceFileFtpSftpTransferPerformance Problem Overview
I manually copy a file to a server, and the same one to an SFTP server. The file is 140MB.
FTP: I have a rate arround 11MB/s
SFTP: I have a rate arround 4.5MB/s
I understand the file has to be encrypted before being sent. Is it the only impact on the file transfer? (and actually this is not exactly transfer time, but encryption time).
I am suprised of such results.
Performance Solutions
Solution 1 - Performance
I'm the author of HPN-SSH and I was asked by a commenter here to weigh in. I'd like to start with a couple of background items. First off, it's important to keep in mind that SSHv2 is a multiplexed protocol - multiple channels over a single TCP connection. As such, the SSH channels are essentially unaware of the underlying flow control algorithm used by TCP. This means that SSHv2 has to implement its own flow control algorithm. The most common implementation basically reimplements sliding windows. The means that you have the SSH sliding window riding on top of the TCP sliding window. The end results is that the effective size of the receive buffer is the minimum of the receive buffers of the two sliding windows. Stock OpenSSH has a maximum receive buffer size of 2MB but this really ends up being closer to ~1.2MB. Most modern OSes have a buffer that can grow (using auto-tuning receive buffers) up to an effective size of 4MB. Why does this matter? If the receive buffer size is less than the bandwidth delay product (BDP) then you will never be able to fully fill the pipe regardless of how fast your system is.
This is complicated by the fact that SFTP adds another layer of flow control onto of the TCP and SSH flow controls. SFTP uses a concept of outstanding messages. Each message may be a command, a result of a command, or bulk data flow. The outstanding messages may be up to a specific datagram size. So you end up with what you might as well think of as yet another receive buffer. The size of this receive buffer is datagram size * maximum outstanding messages (both of which may be set on the command line). The default is 32k * 64 (2MB). So when using SFTP you have to make sure that the TCP receive buffer, the SSH receive buffer, and the SFTP receive buffer are all of sufficient size (without being too large or you can have over buffering problems in interactive sessions).
HPN-SSH directly addresses the SSH buffer problem by having a maximum buffer size of around 16MB. More importantly, the buffer dynamically grows to the proper size by polling the proc entry for the TCP connection's buffer size (basically poking a hole between layers 3 and 4). This avoids overbuffering in almost all situations. In SFTP we raise the maximum number of outstanding requests to 256. At least we should be doing that - it looks like that change didn't propagate as expected to the 6.3 patch set (though it is in 6.2. I'll fix that soon). There isn't a 6.4 version because 6.3 patches cleanly against 6.4 (which is a 1 line security fix from 6.3). You can get the patch set from sourceforge.
I know this sounds odd but right sizing the buffers was the single most important change in terms of performance. In spite of what many people think the encryption is not the real source of poor performance in most cases. You can prove this to yourself by transferring data to sources that are increasingly far away (in terms of RTT). You'll notice that the longer the RTT the lower the throughput. That clearly indicates that this is an RTT dependent performance problem.
Anyway, with this change I started seeing improvements of up to 2 orders of magnitude. If you understand TCP you'll understand why this made such a difference. It's not about the size of the datagram or the number of packets or anything like that. It's entire because in order to make efficient use of the network path you must have a receive buffer equal to the amount of data that can be in transit between the two hosts. This also means that you may not see any improvement whatsoever if the path isn't sufficiently fast and long enough. If the BDP is less than 1.2MB HPN-SSH may be of no value to you.
The parallelized AES-CTR cipher is a performance boost on systems with multiple cores if you need to have full encryption end to end. Usually I suggest people (or have control over both the server and client) to use the NONE cipher switch (encrypted authentication, bulk data passed in clear) as most data isn't all that sensitive. However, this only works in non-interactive sessions like SCP. It doesn't work in SFTP.
There are some other performance improvements as well but nothing as important as the right sizing of the buffers and the encryption work. When I get some free time I'll probably pipeline the HMAC process (currently the biggest drag on performance) and do some more minor optimization work.
So if HPN-SSH is so awesome why hasn't OpenSSH adopted it? That's a long story and people who know the OpenBSD team probably already know the answer. I understand many of their reasons - it's a big patch which would require additional work on their end (and they are a small team), they don't care as much about performance as security (though there is no security implications to HPN-SSH), etc etc etc. However, even though OpenSSH doesn't use HPN-SSH Facebook does. So do Google, Yahoo, Apple, most ever large research data center, NASA, NOAA, the government, the military, and most financial institutions. It's pretty well vetted at this point.
If anyone has any questions feel free to ask but I may not be keeping up to date on this forum. You can always send me mail via the HPN-SSH email address (google it).
Solution 2 - Performance
UPDATE: As a commenter pointed out, the problem I outline below was fixed some time before this post. However, I knew of the HP-SSH project and I asked the author to weigh in. As they explain in the (rightfully) most upvoted answer, encryption is not the source of the problem. Yay for email and people smarter than myself!
Wow, a year-old question with nothing but incorrect answers. However, I must admit that I assumed the slowdown was due to encryption when I asked myself the same question. But ask yourself the next logical question: how quickly can your computer encrypt and decrypt data? If you think that rate is anywhere near the 4.5Mb/second reported by the OP (.5625MBs or roughly half the capacity of a 5.5" floppy disk!) smack yourself a few times, drink some coffee, and ask yourself the same question again.
It apparently has to do with what amounts to be an oversight in the packet size selection, or at least that's what the author of LIBSSH2 says,
>The nature of SFTP and its ACK for every small data chunk it sends, makes an initial naive SFTP implementation suffer badly when sending data over high latency networks. If you have to wait a few hundred milliseconds for each 32KB of data then there will never be fast SFTP transfers. This sort of naive implementation is what libssh2 has offered up until and including libssh2 1.2.7.
So the speed hit is due to tiny packet sizes x mandatory ack responses for each packet, which is clearly insane.
The High Performance SSH/SCP (HP-SSH) project provides an OpenSSH patch set which apparently improves the internal buffers as well as parallelizing encryption. Note, however, that even the non-parallelized versions ran at speeds above the 40Mb/s unencrypted speeds obtained by some commenters. The fix involves changing the way in which OpenSSH was calling the encryption libraries, NOT the cipher and there is zero difference in speed between AES128 and AES256. Encryption takes some time, but it is marginal. It might have mattered back in the 90's but (like the speed of Java vs C) it just doesn't matter anymore.
Solution 3 - Performance
Several factors affect speed of SFTP transfer:
- Encryption. Though symmetric encryption is fast, it's not that fast to be unnoticed. If you comparing speeds on fast network (100mbit or larger), encryption becomes a break for your process.
- Hash calculation and checking.
- Buffer copying. SFTP running on top of SSH causes each data block to be copied at least 6 times (3 times on each side) more comparing to plain FTP where data in best cases can be passed to network interface without being copied at all. And block copy takes a bit of time as well.
Solution 4 - Performance
Encryption has not only cpu, but also some network overhead.
Solution 5 - Performance
SFTP is not FTP over SSH, it's a different protocol and being similar to SCP, it's offers more capabilities.
Solution 6 - Performance
There are all sorts of things which can do this. One possiblity is "Traffic Shaping". This is commonly done in office environments to reserve bandwidth for business critical activities. It may also be done by the web hosting company, or by your ISP, for very similar reasons.
You can also set it up at home very simply.
For example there may be a rule reserving minimum bandwidth for FTP, while SFTP might be falling under an "everything else" rule. Or there might be a rule capping bandwidth for SFTP, but someone else is also using SFTP at the same time as you.
So: Where are you tranferring the file from and to?
Solution 7 - Performance
For comparison, I tried transfering a 299GB ntfs disk image from an i5 laptop running Raring Ringtail Ubuntu alpha 2 live cd to an i7 desktop running Ubuntu 12.04.1. Reported speeds:
over wifi + powerline: scp: 5MB/sec (40 Mbit/sec)
over gigabit ethernet + netgear G5608 v3:
scp: 44MB/sec
sftp: 47MB/sec
sftp -C: 13MB/sec
So, over a good gigabit link, sftp is slightly faster than scp, 2010-era fast CPUs seem fast enough to encrypt, but compression isn't a win in all cases.
Over a bad gigabit ethernet link, though, I've had sftp far outperform scp. Something about scp being very chatty, see "scp UNBELIEVABLY slow" on comp.security.ssh from 2008: https://groups.google.com/forum/?fromgroups=#!topic/comp.security.ssh/ldPV3msFFQw http://fixunix.com/ssh/368694-scp-unbelievably-slow.html
Solution 8 - Performance
Your results make sense. Since FTP operates over a non-encrypted channel it is faster than SFTP (which is subsystem on top of the SSH version 2 protocol). Also remember that SFTP is a packet based protocol unlike FTP which is command based.
Each packet in SFTP is encrypted before being written to the outgoing socket from the client and subsequently decrypted when received by the server. This of-course leads to slow transfer rates but very secure transfer. Using compression such as zlib with SFTP improves the transfer time but still it won't be anywhere near plain text FTP. Perhaps a better comparison is to compare SFTP with FTPS which both use encryption?
Speed for SFTP depends on the cipher used for encryption/decryption, the compression used e.g. zlib, packet sizes and buffer sizes used for the socket connection.
Solution 9 - Performance
SFTP will almost always be significantly slower than FTP or FTPS (usually by several orders of magnitude). The reason for the difference is that there is a lot of additional packet, encryption and handshaking overhead inherent in the SSH2 protocol that FTP doesn't have to worry about. FTP is a very lean and comparatively simple protocol with almost no data transfer overhead, and the protocol was specifically designed for transferring files quickly. Encryption will slow FTP down, but not nearly to the level of SFTP.
SFTP runs over SSH2 and is much more susceptible to network latency and client and server machine resource constraints. This increased susceptibility is due to the extra data handshaking involved with every packet sent between the client and server, and with the additional complexity inherent in decoding an SSH2 packet. SSH2 was designed as a replacement for Telnet and other insecure remote shells, not for very high speed communications. The flexibility that SSH2 provides for securely packaging and transferring almost any type of data also contributes a great deal of additional complexity and overhead to the protocol.
However, it is still possible to get a data transfer rate of several MB/s between client and server using SFTP if the right network conditions are present. The following are items to check when trying to maximize SFTP transfer speeds:
Is there a firewall or network device inspecting or throttling SSH2 traffic on your network? That might be slowing things down. Check you firewall settings. We've had users report solving extremely slow SFTP file transfers by modifying their firewall rules. The SFTP client you use can make a big difference. Try several different SFTP clients and see if you get different results. Network latency will drastically affect SFTP. If the link you are on has a high degree of latency then that is going to be a problem for fast transfers. How powerful is the server machine? Encryption with SFTP is very intensive. Make sure you have a sufficiently powerful machine that isn't being overtaxed during SFTP file transfers (high CPU utilization).
Solution 10 - Performance
Yes, encryption add some load to your cpu, but if your cpu is not ancient that should not affect as much as you say.
If you enable compression for SSH, SCP is actually faster than FTP despite the SSH encryption (if I remember, twice as fast as FTP for the files I tried). I haven't actually used SFTP, but I believe it uses SCP for the actual file transfer. So please try this and let us know :-)