Building Scalable DNS Cluster using LVS

Introduction

Architecture

Configuration Example

keepalived.conf:

! Balancer-Set for udp/53
virtual_server 194.97.173.124 53 {
   delay_loop 10
   lb_algo wrr
   lb_kind DR
   protocol UDP
   ! persistence_timeout 1
   ! persistence_granularity 255.255.255.255
   ! eth1.105 -> kai eth1.105
   real_server 10.1.53.2 53 {
       weight 1
       MISC_CHECK {
           misc_path "/usr/bin/dig -b 10.1.53.1 a resolve.test.roka.net @10.1.53.2 +time=1 +tries=5 +fail > /dev/null"
           misc_timeout 6
       }
   }
   ! eth1.109 -> kai eth1.109
   real_server 10.3.53.2 53 {
       weight 1
       MISC_CHECK {
           misc_path "/usr/bin/dig -b 10.3.53.1 a resolve.test.roka.net @10.3.53.2 +time=1 +tries=5 +fail > /dev/null"
           misc_timeout 6
       }
   }
}

As you can dig (;-), we use an A record with a low TTL to test the service, since this setup is a recursive DNS cluster. So far this works fine with 44 real_servers configured on an idle dual PIII 800.
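
keepalived's MISC_CHECK simply treats a zero exit status of misc_path as "real server healthy", so with 44 real_servers it can get tedious to repeat the dig line for every entry. A minimal sketch of a wrapper script (hypothetical name and parameters, not part of the original setup) that takes the probe source address and the real server IP as arguments:

#!/bin/sh
# check_dns.sh (hypothetical) - wrapper around the dig probe used above.
# keepalived marks the real server up when this exits 0 and down otherwise.
SRC_IP="$1"                      # local address to bind the probe to, e.g. 10.1.53.1
REAL_IP="$2"                     # BIND instance to probe, e.g. 10.1.53.2
RECORD="resolve.test.roka.net"   # low-TTL test record from the setup above
exec /usr/bin/dig -b "$SRC_IP" a "$RECORD" @"$REAL_IP" +time=1 +tries=5 +fail > /dev/null

With that in place, each misc_path could read something like misc_path "/etc/keepalived/check_dns.sh 10.1.53.1 10.1.53.2".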


On real_server kai we use the following netfilter setup to direct the traffic to different BIND processes on the same machine/MAC:

#DNAT 194.97.173.124->10.1.53.2 eth1.105
$ipt -t nat -A PREROUTING -i eth1.105 -s $net -d 194.97.173.124 -p tcp --dport 53 -j DNAT --to-destination 10.1.53.2:53
$ipt -t nat -A PREROUTING -i eth1.105 -s $net -d 194.97.173.124 -p udp --dport 53 -j DNAT --to-destination 10.1.53.2:53
#DNAT 194.97.173.124->10.3.53.2 eth1.109
$ipt -t nat -A PREROUTING -i eth1.109 -s $net -d 194.97.173.124 -p tcp --dport 53 -j DNAT --to-destination 10.3.53.2:53
$ipt -t nat -A PREROUTING -i eth1.109 -s $net -d 194.97.173.124 -p udp --dport 53 -j DNAT --to-destination 10.3.53.2:53
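
The rules above assume two shell variables that are set elsewhere on kai: $ipt pointing at the iptables binary and $net holding the client network the rules should match. A minimal sketch with placeholder values (my assumption, not from the original):

#!/bin/sh
# Assumed variable setup for the DNAT rules above (values are placeholders).
ipt=/sbin/iptables     # path to the iptables binary on the real server
net=10.0.0.0/8         # client source network allowed to reach the resolver

Each BIND process then has to listen on its DNAT target address (10.1.53.2 and 10.3.53.2 respectively) so the rewritten queries actually reach separate instances.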


If you have more than one load balancer at different locations and can convince your local network admins to let you speak BGP4+ to their routers, you can use quagga with a configuration like the following to fail the service IP over to the second LB when the first one goes down:

!
router bgp 5430
 no synchronization
 bgp router-id a.b.c.d
 redistribute connected route-map benice
 neighbor c.d.e.f remote-as 5430
 neighbor c.d.e.f description ffm4-j2
 neighbor c.d.e.f send-community both
 neighbor c.d.e.f soft-reconfiguration inbound
 neighbor c.d.e.f route-map nixda in
 neighbor c.d.e.f route-map benice out
 neighbor d.c.f.e remote-as 5430
 neighbor d.c.f.e description ffm4-j
 neighbor d.c.f.e send-community both
 neighbor d.c.f.e soft-reconfiguration inbound
 neighbor d.c.f.e route-map nixda in
 neighbor d.c.f.e route-map benice out
 no auto-summary
!
access-list line permit 127.0.0.1/32 exact-match
access-list line deny any
!
ip prefix-list cns-dus2 description dus2 high-metric eq low-preference
ip prefix-list cns-dus2 seq 5 permit 194.97.173.125/32
ip prefix-list cns-dus2 seq 10 deny any
ip prefix-list cns-ffm4 description ffm4 low-metric eq high-preference
ip prefix-list cns-ffm4 seq 5 permit 194.97.173.124/32
ip prefix-list cns-ffm4 seq 10 deny any
!
route-map benice permit 10
 match ip address prefix-list cns-ffm4
 set local-preference 100
 set metric 0
!
route-map benice permit 20
 match ip address prefix-list cns-dus2
 set local-preference 100
 set metric 1
!
route-map nixda deny 10
!

This is the LB at FFM4; note that the metrics at the DUS2 LB are just the other way around. Here we fancy talking to two core routers from each LB for extra redundancy. You can also run an internal anycast ServiceIP if you use the same metric at both LBs and make sure they are attached at the same level of the router topology; that way traffic is shared between the two load balancers according to your network topology, which is of course most interesting for large dial-in ISPs.
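
For redistribute connected to announce the ServiceIP at all, it has to show up as a connected route on the LB. A minimal sketch of one way to arrange that (an assumption on my part, not shown above): configure the /32 on the loopback, so zebra sees a connected route and bgpd exports it via route-map benice; deleting the address again withdraws the route.

# On the FFM4 LB (assumed setup): bring the ServiceIP up as a /32 so that
# zebra learns a connected route and bgpd can redistribute it.
ip addr add 194.97.173.124/32 dev lo
# Taking the LB out of service is then just a matter of removing it again:
# ip addr del 194.97.173.124/32 dev lo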

Conclusion

It just works.