Least-Connection Scheduling


The least-connection scheduling algorithm directs network connections to the server with the fewest established connections. It is one of the dynamic scheduling algorithms, because it needs to count the connections of each server dynamically to estimate its load. The load balancer records the connection number of each server, increases a server's connection number when a new connection is dispatched to it, and decreases a server's connection number when a connection finishes or times out.

In the IPVS implementation, an extra condition was introduced into least-connection scheduling to check whether a server is available, because a server may be taken out of service, either to mask a server failure or for system maintenance. A server weight of zero indicates that the server is not available. The formal procedure of least-connection scheduling is as follows:

Supposing that there is a server set S = {S0, S1, ..., Sn-1},
W(Si) is the weight of server Si,
C(Si) is the current connection number of server Si,

for (m = 0; m < n; m++) {
    if (W(Sm) > 0) {
        for (i = m+1; i < n; i++) {
            if (W(Si) <= 0)
                continue;
            if (C(Si) < C(Sm))
                m = i;
        }
        return Sm;
    }
}
return NULL;

For an IPVS cluster whose servers have similar performance, least-connection scheduling smooths the distribution well even when the load of requests varies a lot, because the load balancer always directs a request to the real server with the fewest connections.

At first glance, it might seem that least-connection scheduling would also perform well when the servers have different processing capacities, because the faster server would naturally receive more network connections. In fact, it cannot perform very well because of TCP's TIME_WAIT state. TCP's TIME_WAIT usually lasts 2 minutes, and during those 2 minutes a busy web site often receives thousands of connections. For example, if server A is twice as powerful as server B, server A may be processing thousands of requests and keeping them in the TIME_WAIT state, while server B is crawling to get its own thousands of connections finished. Since both servers carry similar connection counts, least-connection scheduling cannot balance load well among servers with very different processing capacities.