Building Scalable Web Cluster using LVS

Latest revision as of 14:31, 16 December 2006

Introduction

A web cluster (also called a web server farm) is a group of two or more computers that together provide HTTP and HTTPS service. Clustering is an effective approach to achieving scalability, availability and reliability for web services.

A web cluster built from many inexpensive commodity servers can handle large volumes of web requests without unwanted delays. More servers can be added as the workload grows.

Architecture

The general architecture of an LVS-based web cluster is illustrated in the following figure.

[Figure: Web-cluster.jpg]

The architecture has three tiers:

  • Load balancer, which usually uses IP load balancing technologies for higher system throughput
  • Web server pool, which actually performs the HTTP and/or HTTPS services
  • Shared storage, which can be a database, a network file system, a distributed file system, or a hybrid of these

For dynamic web pages (such as PHP, JSP and ASP pages), the data that the pages access is usually stored in a database system. The database service runs on a stand-alone server and is shared by all the web servers. Whether the same data is accessed by several dynamic pages on one web server or by pages on different web servers, the database engine provides atomicity and locking facilities to serialize data access, which makes it easier to guarantee data consistency.

Static files, such as HTML pages, graphics and dynamic page script files, can be stored in a network file system (NFS or CIFS) or a distributed file system. The choice between a network file system and a distributed file system depends on the scale of the system and the load of file access. Through a shared network file system or distributed file system, the webmaster sees a single image of the file storage space, so the files are easier to maintain and any update takes effect on all the web servers.

With this shared storage architecture, system administrators can easily add new web servers to handle an increasing load of web access, without copying the contents to the local disks of the new web servers.
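
As a minimal sketch of what "no copying" can look like in practice, the script below mounts a shared document root on a newly added web server. The NFS export name (fileserver:/export/www), the mount point (/var/www/html) and the read-only option are assumptions made for illustration; none of them come from this article. The script is a dry run by default and only prints the commands.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN= (empty) or RUN=sudo in the environment to execute them for real.
RUN="${RUN-echo}"

# Hypothetical names for illustration only.
NFS_EXPORT="${NFS_EXPORT-fileserver:/export/www}"
DOCROOT="${DOCROOT-/var/www/html}"

# Mount the shared document root on a freshly installed web server,
# so that it serves the same files as the rest of the pool.
mount_shared_docroot() {
    $RUN mkdir -p "$DOCROOT"
    $RUN mount -t nfs -o ro "$NFS_EXPORT" "$DOCROOT"
}

mount_shared_docroot
```

Once the mount is in place, the new server can be added to the load balancer's table like any other member of the pool.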

Many web sites use HTTP cookies: the server stores a cookie in the client browser, and the browser sends it back with each request so that the server can track a session across different HTTP requests from the same browser. When cookies are used this way, all the requests from a client must be sent to the same web server, unless the web servers share the session state behind the cookies they generate.

Some web sites use the HTTPS protocol, which transfers HTTP over an SSL (Secure Socket Layer) connection. When an SSL connection is made to port 443 for secure web service, a key for the connection must be chosen and exchanged. Since negotiating and generating the SSL key is time consuming, the server can accept successive connections from the same client with the same key during the key's life span. Therefore, all the HTTPS requests from a client must be sent to the same server during the life span of the SSL key.

To meet these connection affinity requirements, the IPVS load balancer provides the persistent service feature, which sends all successive requests from the same client IP address to the same server within a specified time. The persistent service feature can help solve the connection affinity problem between client and server.
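
The persistent service feature is switched on with the -p option of ipvsadm when a virtual service is added. The sketch below sets up a persistent HTTPS service; it borrows the addresses from the Configuration Example section of this article, while the port-443 service itself and the 600-second timeout are assumptions chosen for illustration. The script is a dry run by default and only prints the commands.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN= (empty) or RUN=sudo in the environment to execute them for real.
RUN="${RUN-echo}"

VIP=10.23.8.80
REAL_SERVERS="172.18.1.11 172.18.1.12 172.18.1.13"
PERSIST_SECONDS=600   # assumed timeout, tune to the session or SSL key life span

add_persistent_https() {
    # -p marks the virtual service as persistent: requests from one client IP
    # keep going to the same real server until the timeout expires.
    $RUN ipvsadm -A -t "$VIP:443" -s wlc -p "$PERSIST_SECONDS"
    for rs in $REAL_SERVERS; do
        $RUN ipvsadm -a -t "$VIP:443" -r "$rs:443" -m -w 100
    done
}

add_persistent_https
```

Because persistence is keyed on the client IP address, clients that share one address (for example behind a proxy) are all directed to the same web server.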

Configuration Example

In this configuration example, we build an LVS/NAT cluster of three web servers. The virtual IP address of the load balancer is 10.23.8.80, and the gateway IP address for the internal web servers is 172.18.1.254. The IP addresses of the three web servers are 172.18.1.11, 172.18.1.12 and 172.18.1.13 respectively. The web servers can run Apache or any other web server program.

We use the following ipvsadm commands to set up the IPVS rules:
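
LVS/NAT only works if the load balancer forwards packets and the replies from the web servers travel back through it. The sketch below shows one way to prepare both sides. It assumes that 172.18.1.254 is configured on the load balancer's internal interface and that the sysctl and iproute2 tools are available; the function names are hypothetical. The script is a dry run by default and only prints the commands.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN= (empty) or RUN=sudo in the environment to execute them for real.
RUN="${RUN-echo}"

GATEWAY=172.18.1.254   # internal address of the load balancer (assumed)

# Run on the load balancer: let the kernel forward packets between
# the external network and the internal 172.18.1.0/24 network.
prepare_load_balancer() {
    $RUN sysctl -w net.ipv4.ip_forward=1
}

# Run on each web server: send replies back through the load balancer,
# so that it can rewrite the source address to the virtual IP address.
prepare_web_server() {
    $RUN ip route replace default via "$GATEWAY"
}

# Usage: call prepare_load_balancer on the load balancer
# and prepare_web_server on 172.18.1.11, .12 and .13.
```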

ipvsadm -A -t 10.23.8.80:80 -s wlc
ipvsadm -a -t 10.23.8.80:80 -r 172.18.1.11 -m -w 100
ipvsadm -a -t 10.23.8.80:80 -r 172.18.1.12 -m -w 100
ipvsadm -a -t 10.23.8.80:80 -r 172.18.1.13 -m -w 100

Then, from any computer outside the internal network (172.18.1.0/24), access http://10.23.8.80/ and check that it works. "ipvsadm -Ln" lists the IPVS table, and "ipvsadm -Lcn" lists the IPVS connections.
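
The table check can also be scripted. The following sketch counts the real servers in the output of "ipvsadm -Ln" and compares the count with the three servers of this example. It assumes the usual listing layout, in which each real server line starts with "->" and the column header names the field "RemoteAddress:Port"; the function names are hypothetical.

```shell
#!/bin/sh
EXPECTED=3   # number of web servers in this example

# Count real-server entries in `ipvsadm -Ln` output read from stdin.
# Lines starting with "->" are real servers, except the column header.
count_real_servers() {
    awk '$1 == "->" && $2 != "RemoteAddress:Port" { n++ } END { print n + 0 }'
}

# Compare the count with the expected pool size; fail on a mismatch.
check_table() {
    found=$(count_real_servers)
    if [ "$found" -eq "$EXPECTED" ]; then
        echo "ok: $found real servers"
    else
        echo "mismatch: found $found, expected $EXPECTED"
        return 1
    fi
}

# Usage on the load balancer: ipvsadm -Ln | check_table
```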

Once the basic cluster configuration works, we can consider using cluster management software to add reliability and availability to the web cluster system.
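
To give an idea of what such software does, here is a deliberately small sketch of one health-check pass. It probes each web server over HTTP and sets the weight of a failed server to 0, which keeps the server in the table but stops new connections from being scheduled to it. The probe URL, the 2-second limit and the function names are assumptions for illustration, and real cluster management software does considerably more (for example, failover of the load balancer itself). The ipvsadm calls are a dry run by default and are only printed.

```shell
#!/bin/sh
# Dry run by default: ipvsadm commands are printed, not executed.
# Set RUN= (empty) or RUN=sudo in the environment to execute them for real.
RUN="${RUN-echo}"

VIP=10.23.8.80
REAL_SERVERS="172.18.1.11 172.18.1.12 172.18.1.13"

# Succeeds if the web server answers an HTTP request within 2 seconds.
probe() {
    curl --silent --fail --max-time 2 --output /dev/null "http://$1/"
}

# One monitoring pass over the pool: weight 0 takes a failed server
# out of scheduling, weight 100 puts a healthy server back in.
health_pass() {
    for rs in $REAL_SERVERS; do
        if probe "$rs"; then
            weight=100
        else
            weight=0
        fi
        $RUN ipvsadm -e -t "$VIP:80" -r "$rs:80" -m -w "$weight"
    done
}
```

On the load balancer, health_pass could be called from a loop such as "while true; do health_pass; sleep 5; done".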

Conclusion

LVS makes it easier to build a web cluster and to add scalability and availability to a web system.
