To put it simply, load balancers are a way of distributing work across multiple computers. On the web, they are often used to distribute HTTP requests across multiple servers so that websites can handle large amounts of traffic.
Server-side load balancing is the most common method in web applications. It usually consists of a software application that listens on an external port (e.g. port 80) and forwards requests from this port to one of its backend servers, where the web application is actually hosted. This setup means the client only knows about the load balancer, and prevents them from contacting the backend servers directly.
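As a sketch of the forwarding idea, here is a minimal round-robin backend selector in Python (the backend addresses are hypothetical). A real load balancer would also proxy the request bytes, but the selection logic looks roughly like this:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Picks backends in turn; the addresses used below are hypothetical."""

    def __init__(self, backends):
        self._pool = cycle(backends)

    def next_backend(self):
        # Each incoming request on the external port would be
        # proxied to the backend returned here.
        return next(self._pool)

lb = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
```

Requests cycle through the pool, so each backend receives roughly the same share of the traffic while the client only ever talks to the load balancer's external port.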
Load balancers increase redundancy in web applications by removing a server from the backend pool if it starts misbehaving. Load balancers themselves are often set up in pairs to avoid becoming a single point of failure.
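The removal logic can be sketched as a per-backend failure counter: after a number of consecutive failed health checks, the backend stops receiving traffic. The threshold of three failures here is an illustrative assumption, not a standard:

```python
class HealthCheckedPool:
    """Tracks health-check results and drops misbehaving backends."""

    def __init__(self, backends, max_failures=3):
        self.backends = set(backends)
        self.failures = {b: 0 for b in backends}
        self.max_failures = max_failures

    def record_failure(self, backend):
        self.failures[backend] += 1
        if self.failures[backend] >= self.max_failures:
            # Stop routing traffic to a backend that keeps failing checks.
            self.backends.discard(backend)

    def record_success(self, backend):
        # A successful check resets the consecutive-failure count.
        self.failures[backend] = 0

    def healthy(self):
        return sorted(self.backends)
```

Real load balancers typically also re-add a backend once it passes health checks again; that path is omitted here for brevity.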
The majority of web applications persist data between multiple requests from a user; for example, a server might need to know whether a user is logged in. This causes a problem in load-balanced applications: session data is normally stored on the server itself, so other backend servers would not be able to access it. The first way to solve this is to make sure the user always connects to the same backend server. This works fine while the server is up, but if that server becomes unavailable, the data is lost when the user switches to another backend server.
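The "same backend server" approach (often called sticky sessions) can be approximated by hashing a session identifier onto the pool. Note that if the pool changes, users may be remapped to a different backend, which is exactly the data-loss problem described above:

```python
import hashlib

def sticky_backend(session_id, backends):
    """Map a session ID deterministically onto a backend.

    The same session always lands on the same server, as long as
    the backend list itself does not change.
    """
    digest = hashlib.sha256(session_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Production load balancers usually implement stickiness via a cookie or the client's source IP rather than an application-provided session ID; the hashing shown here is just the simplest deterministic mapping.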
Another option is to store the session data in a database. This allows any server to retrieve the session data, but increases load on the database server and so can affect performance. Sessions can also be stored in browser cookies, with the content of the session encrypted. This allows the load balancer to simply forward the session along to the application, and it no longer matters which backend server the user is connected to.
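A sketch of the cookie approach using HMAC signing from the Python standard library. The article describes encrypting the session; signing is used here for brevity and only prevents tampering, it does not hide the contents. The secret key is a placeholder:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"replace-with-a-real-secret-key"  # placeholder, never hard-code this

def encode_session(data):
    """Serialise session data and append an HMAC so clients cannot alter it."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def decode_session(cookie):
    """Verify the HMAC and return the session data, rejecting tampered cookies."""
    payload, sig = cookie.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("tampered session cookie")
    return json.loads(base64.urlsafe_b64decode(payload))
```

Because the cookie travels with every request, any backend server can reconstruct the session without touching shared storage.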
Along with the obvious benefit of spreading load across multiple servers, load balancers often have a range of features to help with the deployment of web applications.
Auto scaling during high loads
Web applications generally have times of day at which they are used far more than others. This is sometimes a predictable pattern (e.g. more usage during work hours), or it can be the result of an unexpected spike in traffic. Both can be handled by permanently deploying far more servers than normally needed, but this is very wasteful, as you pay for resources that sit idle most of the time. A better approach is to only increase the number of servers when traffic is high. When a load balancer is being used, this is very simple: servers are added to the backend pool and your application can immediately cope with the increase in traffic. The user doesn't need to know about the new servers, and they can instantly come into use to deal with the high load.
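A toy version of the scaling decision: given the pool's average utilisation, compute how many servers would bring it back toward a target level. The 60% target and the pool bounds are assumptions for illustration:

```python
import math

def desired_servers(current, avg_utilisation, target=0.6, minimum=2, maximum=20):
    """Return the pool size that moves average utilisation toward the target.

    `avg_utilisation` is the current average load per server (1.0 = fully
    busy); the target, minimum and maximum here are illustrative defaults.
    """
    needed = math.ceil(current * avg_utilisation / target)
    return max(minimum, min(maximum, needed))
```

An autoscaler would run a check like this periodically, then register the extra servers with the load balancer (or drain and remove surplus ones) so the change is invisible to users.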
HTTP caching
HTTP caching allows the load balancer to cache static and dynamic content, so that if it receives a request for an object it has cached, it can return it without contacting a backend server. This reduces load on the backend servers and improves performance, as there is no network transfer between the backend and the load balancer. The trade-off is the storage the cache requires. The cache can often be stored either in memory or on disk; memory is much faster than a disk-based cache, but is usually more limited in size.
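The cache lookup can be sketched as a TTL-based in-memory store. The 60-second TTL is an arbitrary choice, and a real HTTP cache would also honour Cache-Control headers rather than a fixed expiry:

```python
import time

class ResponseCache:
    """In-memory response cache with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}

    def put(self, url, body, now=None):
        now = time.monotonic() if now is None else now
        self._store[url] = (body, now + self.ttl)

    def get(self, url, now=None):
        """Return the cached body, or None on a miss or expired entry."""
        now = time.monotonic() if now is None else now
        entry = self._store.get(url)
        if entry is None:
            return None
        body, expires = entry
        if now >= expires:
            del self._store[url]  # expired: fall through to the backend
            return None
        return body
```

On a hit the load balancer answers directly; on a miss it forwards to a backend and can store the response for subsequent requests.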
SSL termination
SSL is computationally expensive for web servers to handle. You can reduce this cost on the web servers by performing the SSL operations on the load balancer instead. The load balancer can then send the request to the backend server as a plain HTTP request, and because it now has access to the decrypted request, it can also perform more advanced features such as caching.
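Using Python's `ssl` module, the client-facing side of such a terminating load balancer would be configured along these lines; the certificate paths in the comment are hypothetical:

```python
import ssl

def client_facing_context(min_version=ssl.TLSVersion.TLSv1_2):
    """TLS context for the public listener of a terminating load balancer.

    Traffic is decrypted here; backends then receive plain HTTP.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = min_version
    # In a real deployment you would load the site's certificate, e.g.:
    # ctx.load_cert_chain("fullchain.pem", "privkey.pem")  # hypothetical paths
    return ctx
```

The backend connection needs no TLS at all, which is where the CPU saving on the web servers comes from, though it does mean the internal network must be trusted.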
DDoS mitigation
Being DDoSed can be difficult to handle. Some load balancers include features that mitigate attacks, allowing you to better handle the attack and reduce the effect on your users.
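One common mitigation is per-client rate limiting at the load balancer. A token-bucket sketch, where the capacity and refill rate are illustrative values:

```python
class TokenBucket:
    """Per-client token bucket: sustained traffic above the refill
    rate is rejected before it ever reaches a backend."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill = refill_per_second
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        # Top the bucket up for the time elapsed since the last request.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The load balancer would keep one bucket per client IP and drop or delay requests from clients whose bucket is empty, limiting how much attack traffic the backends ever see.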
Restricting backend traffic
Having your backend servers behind a load balancer means their traffic can be heavily restricted, as the traffic they receive should only come from your own servers. This makes them more secure, as they are harder to discover and attack.
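On the backend side, the restriction can be as simple as rejecting connections from outside the load balancer's subnet. The 10.0.1.0/24 network here is a hypothetical example, and in practice this is usually enforced by a firewall rather than application code:

```python
import ipaddress

# Hypothetical internal subnet that the load balancers live on.
LOAD_BALANCER_NET = ipaddress.ip_network("10.0.1.0/24")

def accept_connection(source_ip):
    """Backends accept traffic only from the load balancer subnet."""
    return ipaddress.ip_address(source_ip) in LOAD_BALANCER_NET
```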
Safer deployments
Deploying updates to a live application can be a daunting task: an update could break your application and bring the whole service crashing down. Using a load balancer allows you to pull a server out of use while it is being updated without affecting your users. Once the update is complete, you can slowly increase the server's load until you are confident the new build is okay. This process can be automated, allowing continuous deployment without too much risk to the service.
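Draining a server for an update can be modelled with per-backend weights: weight 0 means the server receives no new requests, and the weight is raised gradually once the update is verified. The backend names are illustrative:

```python
class WeightedPool:
    """Backend pool with weights; weight 0 drains a server for updates."""

    def __init__(self, weights):
        self.weights = dict(weights)

    def drain(self, backend):
        # No new requests are routed here while it is being updated.
        self.weights[backend] = 0

    def set_weight(self, backend, weight):
        # Ramp the updated server back up (e.g. 1 -> full weight).
        self.weights[backend] = weight

    def candidates(self):
        """Backends currently eligible to receive new requests."""
        return sorted(b for b, w in self.weights.items() if w > 0)
```

An automated pipeline would drain one server, update it, restore a small weight, watch error rates, then repeat for the next server, which is roughly what a rolling deployment behind a load balancer does.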