TCP Socket no connection timeout

NetworkingTcpNetwork ProgrammingTimeout

Networking Problem Overview


I open a TCP socket and connect it to another socket somewhere else on the network. I can then successfully send and receive data. I have a timer that sends something to the socket every second.

I then rudely interrupt the connection by forcibly losing the connection (pulling out the Ethernet cable in this case). My socket is still reporting that it is successfully writing data out every second. This continues for approximately 1hour and 30 minutes, where a write error is eventually given.

What specifies this time-out where a socket finally accepts the other end has disappeared? Is it the OS (Ubuntu 11.04), is it from the TCP/IP specification, or is it a socket configuration option?

Networking Solutions


Solution 1 - Networking

Pulling the network cable will not break a TCP connection(1) though it will disrupt communications. You can plug the cable back in and once IP connectivity is established, all back-data will move. This is what makes TCP reliable, even on cellular networks.

When TCP sends data, it expects an ACK in reply. If none comes within some amount of time, it re-transmits the data and waits again. The time it waits between transmissions generally increases exponentially.

After some number of retransmissions or some amount of total time with no ACK, TCP will consider the connection "broken". How many times or how long depends on your OS and its configuration but it typically times-out on the order of many minutes.

From Linux's tcp.7 man page:

   tcp_retries2 (integer; default: 15; since Linux 2.2)
          The maximum number of times a TCP packet is retransmitted in
          established state before giving up.  The default value is 15, which
          corresponds to a duration of approximately between 13 to 30 minutes,
          depending on the retransmission timeout.  The RFC 1122 specified
          minimum limit of 100 seconds is typically deemed too short.

This is likely the value you'll want to adjust to change how long it takes to detect if your connection has vanished.

(1) There are exceptions to this. The operating system, upon noticing a cable being removed, could notify upper layers that all connections should be considered "broken".

Solution 2 - Networking

If want a quick socket error propagation to your application code, you may wanna try this socket option:

> TCP_USER_TIMEOUT (since Linux 2.6.37) This option takes an unsigned int as an argument. When the value is greater than 0, it specifies the maximum amount of time in milliseconds that transmitted data may remain unacknowledged before TCP will forcibly close the corresponding connection and return ETIMEDOUT to the application. If the option value is specified as 0, TCP will use the system default.

See full description on linux/man/tcp(7). This option is more flexible (you can set it on the fly, just right after a socket creation) than tcp_retries2 editing and exactly applies to a situation when you client's socket doesn't aware about server's one state and may get into so called half-closed state.

Solution 3 - Networking

Two excellent answers are here and here.
TCP user timeout may work for your case: The TCP user timeout controls how long transmitted data may remain unacknowledged before a connection is forcefully closed.
there are 3 OS dependent TCP timeout parameters. On Linux the defaults are: tcp_keepalive_time default 7200 seconds
tcp_keepalive_probes default 9
tcp_keepalive_intvl default 75 sec
Total timeout time is tcp_keepalive_time + (tcp_keepalive_probes * tcp_keepalive_intvl), with these defaults 7200 + (9 * 75) = 7875 secs
To set these parameters on Linux:
sysctl -w net.ipv4.tcp_keepalive_time=1800 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=20

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionoggmonsterView Question on Stackoverflow
Solution 1 - NetworkingBrian WhiteView Answer on Stackoverflow
Solution 2 - NetworkingAlex-BogdanovView Answer on Stackoverflow
Solution 3 - NetworkingedWView Answer on Stackoverflow