To make troubleshooting easier, we need to understand the IP load balancing technology used in the cluster and the whole packet flow: how packets are received and sent out at the load balancer, and how they are handled at the real servers and sent back to the clients.
For more information about the IP load balancing technologies implemented in IPVS, see
- Virtual Server via Network Address Translation
- Virtual Server via IP Tunneling
- Virtual Server via Direct Routing
Then we can use packet capture tools, such as Ethereal (since renamed Wireshark) and tcpdump, to troubleshoot the cluster system. First, we can capture the load-balanced traffic at both the load balancer and the real servers to make sure that the basic load balancing system works. Second, we can capture the service monitoring packets exchanged among the load balancers and real servers to verify that cluster monitoring and high availability work as expected.
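As a minimal sketch of the first capture step, the helper below only assembles a tcpdump command line restricted to one virtual service, so the same capture can be started on the load balancer and on each real server. The function name, the interface "eth0", the VIP 192.168.0.100 and port 80 are placeholders chosen for illustration, not values from this cluster.

```python
import shlex


def tcpdump_command(interface, vip, port, count=20):
    """Build a tcpdump argument list that captures traffic for one virtual service."""
    # Standard pcap filter syntax: packets to or from the VIP on the given TCP port.
    capture_filter = f"host {vip} and tcp port {port}"
    return [
        "tcpdump",
        "-n",              # do not resolve addresses to names
        "-i", interface,   # interface to listen on
        "-c", str(count),  # stop after this many packets
        capture_filter,
    ]


if __name__ == "__main__":
    # Placeholder values: substitute your own interface, VIP and port.
    print(shlex.join(tcpdump_command("eth0", "192.168.0.100", 80)))
```

Run the printed command as root on each machine. Note that with direct routing or IP tunneling the replies go straight from the real server to the client, so they will probably not show up in a capture taken on the load balancer.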
While LVS provides basic means for load-balancing, lvs-kiss provides some more sophisticated possibilities.
Basically, lvs-kiss is just a piece of Perl that sits on top of LVS.
lvs-kiss features configurable means of measuring node "load". This can be anything, from the average load to an snmp-get: anything you can run at the command line of an LVS server that gives a numerical response.
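The idea of "any command that prints a number" can be illustrated with a short sketch. This is not how lvs-kiss itself is written (it is Perl); the function name `measure_load` and the example commands are assumptions made for illustration only.

```python
import subprocess


def measure_load(command, timeout=5.0):
    """Run a shell command and interpret the first token of its output as a number."""
    result = subprocess.run(
        command,
        shell=True,           # the command is given as a shell line, as on a terminal
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,           # raise if the command exits with a non-zero status
    )
    lines = result.stdout.strip().splitlines()
    if not lines:
        raise ValueError(f"command produced no output: {command!r}")
    return float(lines[0].split()[0])


if __name__ == "__main__":
    # A dummy measurement; on Linux, "cut -d' ' -f1 /proc/loadavg" would give
    # the one-minute load average instead.
    print(measure_load("echo 42"))
```

Running the candidate command through such a helper by hand is a quick way to confirm that it really prints a clean number before using it as a load method.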
Troubleshooting load balancing can be tricky when web caches and the like are involved. I tend to use the good old "telnet VIP PORT" to check the result of load balancing. For tests, you can even feed lvs-kiss dummy values (just use "echo NUMBER" as the method for getting the load).
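The manual "telnet VIP PORT" check can be repeated automatically. The sketch below opens several connections and counts the distinct first lines of the replies. It assumes that each real server answers with something that identifies it (for example a test page containing its hostname). The names `probe` and `tally` are made up for this example.

```python
import socket
from collections import Counter


def probe(host, port, payload=b"", timeout=3.0, max_bytes=200):
    """Open one TCP connection, optionally send a payload, return the first reply line."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        if payload:
            conn.sendall(payload)
        data = conn.recv(max_bytes)
    # Keep only the first line and drop the trailing CR/LF.
    return data.split(b"\n", 1)[0].strip().decode("latin-1")


def tally(host, port, attempts=10, payload=b""):
    """Repeat the probe and count how often each distinct first line is seen."""
    return Counter(probe(host, port, payload) for _ in range(attempts))
```

For example, `tally("192.168.0.100", 80, payload=b"GET /whoami.txt HTTP/1.0\r\n\r\n")` would probe a placeholder VIP with a hypothetical test page. For plain HTTP the first line is only the status line, so a per-server test service is more informative. Because each probe opens a new connection straight to the VIP, no web cache sits in between, but persistence settings on the virtual service may still send every probe from one client to the same real server.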
If you compile Linux without the CONFIG_IP_VS_PROTO_TCP configuration option, ipvsadm does not give any error message, but incoming connections are not grabbed.
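Since ipvsadm stays silent in this case, it can help to check the kernel configuration directly. The following is a minimal sketch that assumes the running kernel's configuration is available either as /boot/config-<release> or as /proc/config.gz (the latter exists only if the kernel was built to export it). The function names are illustrative.

```python
import gzip
import os
import platform


def option_state(config_text, option):
    """Return the value of a kernel option ('y' or 'm'), or None if it is not set."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    # Covers both "# CONFIG_X is not set" and an option that is absent altogether.
    return None


def read_kernel_config():
    """Try the usual locations for the running kernel's configuration."""
    boot_path = "/boot/config-" + platform.release()
    if os.path.exists(boot_path):
        with open(boot_path) as f:
            return f.read()
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return f.read()
    return None  # configuration not found in either location
```

If `read_kernel_config()` returns text, `option_state(text, "CONFIG_IP_VS_PROTO_TCP")` should be "y" or "m" on a kernel that can balance TCP services. A result of None points to the problem described above.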