Saturday, April 10, 2010

Reusing TCP Connections: "address already in use"

The socket option SO_REUSEADDR lets you bind a socket to a local address and port that another socket is already using. But a successful bind() does not guarantee that you will never get the "address already in use" error from that socket later.

Consider a case where you want to connect from a fixed source port (srcport) and IP (srcip) to a fixed destination port (dstport) and IP (dstip) using TCP. A TCP connection is basically a 5-tuple (srcip, srcport, dstip, dstport, protocol).
The first attempt with connect() succeeds and the connection is established.
But, depending on your OS configuration, you may not be able to establish the same connection again for a while, even after you close the first connection. SO_REUSEADDR may let bind() succeed for the (srcip, srcport, TCP) 3-tuple, but the connect() call can still fail because the full 5-tuple is not yet available.
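To make the 3-tuple versus 5-tuple distinction concrete, here is a minimal Python sketch. The helper name try_connect_from and its return convention are my own, purely for illustration. It sets SO_REUSEADDR, binds to the fixed source address, connects, and reports which step failed, if any.

```python
import errno
import socket


def try_connect_from(src, dst):
    """Bind a TCP socket to src with SO_REUSEADDR, then connect it to dst.

    Returns (sock, None, None) on success, or (None, step, err) on failure,
    where step is "bind" or "connect" and err is the errno that was raised.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    step = "bind"
    try:
        sock.bind(src)       # claims the (srcip, srcport, TCP) 3-tuple
        step = "connect"
        sock.connect(dst)    # claims the full 5-tuple
    except OSError as exc:
        sock.close()
        return None, step, exc.errno
    return sock, None, None
```

If you call this twice with the same src and dst while the 5-tuple is still occupied (the first connection is still open, or it was closed gracefully a moment ago), the second call should come back with an error. Which step fails and which errno you see (for example errno.EADDRINUSE or errno.EADDRNOTAVAIL) depends on the OS, so treat those details as assumptions to verify on your platform.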

This is an unusual use case, because clients usually do not need to use the same srcport in a subsequent connection. But I encountered this problem when I was developing an ICE-TCP library for a client who wanted to specify a fixed port to be used in the SDP generated by the library. In this case, there is a chance that the same fixed port will be specified for a call after the previous call ends.

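This problem arises from the TIME_WAIT state of TCP. The side that closes the connection first keeps the 5-tuple in TIME_WAIT for some time after the close, and until that period expires the same 5-tuple cannot be used for a new connection.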

To get around this problem, you need to "hard close" the first connection (as opposed to a "graceful shutdown"). To do this, set the SO_LINGER option on the socket of the first connection, with lingering enabled and a linger timeout of zero. Then, when you want to close the connection, DO NOT call shutdown(). Call only closesocket(), which with this setting resets the connection (the stack sends an RST) instead of closing it gracefully. If you call shutdown(), a graceful shutdown is initiated with the [FIN, ACK, FIN, ACK] sequence, which ultimately leaves the 5-tuple stuck in the TIME_WAIT state.
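The hard close can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the helper names connect_from and hard_close are mine, Python's sock.close() plays the role of closesocket(), and the linger structure is packed as two ints on POSIX systems and two unsigned shorts on Winsock.

```python
import socket
import struct
import sys


def connect_from(src, dst):
    """Connect to dst from the fixed source address src (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(src)
        sock.connect(dst)
    except OSError:
        sock.close()
        raise
    return sock


def hard_close(sock):
    """Close with SO_LINGER enabled and a zero timeout; no shutdown() call."""
    # struct linger layout: two ints on POSIX, two unsigned shorts on Winsock.
    fmt = "HH" if sys.platform == "win32" else "ii"
    linger = struct.pack(fmt, 1, 0)  # l_onoff = 1, l_linger = 0
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
    sock.close()  # resets the connection instead of sending FIN
```

With these helpers, connecting with connect_from(src, dst), calling hard_close() on the result, and then calling connect_from(src, dst) again should be able to reuse the same 5-tuple right away. Keep in mind that a hard close discards any data that has not been sent yet, so use it only when that is acceptable.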

So, in summary, if you desperately need to reuse a TCP connection 5-tuple and want to avoid the "address already in use" problem, use SO_REUSEADDR, SO_LINGER and closesocket(), and avoid shutdown().

To know more about the TIME_WAIT state and other details, check out the following links:
http://www.developerweb.net/forum/showthread.php?t=2941
http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
