What Is Load Balancing?
Load balancing is the process of distributing incoming network traffic across multiple servers so that no single server bears too much demand. A load balancer sits between clients and your server pool, intelligently routing each request to the server best able to handle it. The result is higher availability, better performance, and the ability to scale horizontally by adding more servers rather than upgrading a single machine.
Without load balancing, all traffic hits one server. If that server fails or becomes overloaded, your site goes down. With load balancing, traffic is spread across a group of servers — if one server fails, the others absorb its share automatically, keeping your application online.
Load Balancing Methods
Load balancers use algorithms to decide which server should receive each request. The most common methods include:
- Round robin — Requests are distributed to servers in sequential order. Server A gets the first request, server B the second, server C the third, then back to A. Simple and effective when all servers have equal capacity.
- Least connections — Each new request goes to the server currently handling the fewest active connections. This is more adaptive than round robin because it accounts for requests that take longer to process — slow requests don't cause one server to pile up while others sit idle.
- IP hash — The client's IP address is hashed to determine which server handles the request. This ensures the same client always reaches the same server, which is useful for applications that store session data locally rather than in a shared data store.
- Geographic routing — Requests are routed to the server closest to the client's physical location, minimizing latency. This is common in global deployments, and smart routing at the DNS level makes it possible before a request ever reaches a load balancer.
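The first three methods can be sketched in a few lines of Python. The server addresses and function names below are invented for illustration; a production load balancer would also handle weights, connection tracking, and pool changes:

```python
import hashlib
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Round robin: walk the pool in order, wrapping back to the start.
rr = cycle(servers)
def round_robin():
    return next(rr)

# Least connections: pick the server with the fewest active connections.
active = {s: 0 for s in servers}
def least_connections():
    return min(active, key=active.get)

# IP hash: hash the client address so the same client always maps
# to the same server (ties a client to its locally stored session).
def ip_hash(client_ip):
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest, "big") % len(servers)]
```

Note the trade-off visible even in this sketch: IP hash gives stickiness at the cost of even distribution, while least connections adapts to slow requests but needs per-server connection state.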
Hardware vs. Software vs. DNS Load Balancing
Load balancing can be implemented at different layers of your infrastructure:
- Hardware load balancers — Dedicated physical appliances (like F5 BIG-IP) that sit in your data center and route traffic at high speed. They offer excellent performance but are expensive and inflexible — scaling requires purchasing additional hardware.
- Software load balancers — Applications like HAProxy, Nginx, or Envoy that run on commodity servers and perform the same routing functions. Software load balancers are far more flexible and cost-effective, and they can be deployed in the cloud, in containers, or on bare metal.
- DNS load balancing — Traffic is distributed at the DNS layer by returning different IP addresses for the same domain name. When a visitor resolves your domain, the DNS server returns the address of the most appropriate server based on health checks, geography, or a simple rotation. DNS-based load balancing operates globally with no additional infrastructure, but it is limited by DNS caching and TTL values: resolvers may keep returning a failed server's address until the cached record expires.
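The DNS-level rotation described above can be sketched as follows. The domain and addresses are hypothetical, and a real authoritative server would also apply TTLs and health checks before answering:

```python
# Round-robin DNS sketch: each lookup returns the record list rotated
# by one, so successive clients tend to connect to different servers.
records = {"www.example.com": ["192.0.2.10", "192.0.2.11", "192.0.2.12"]}
lookups = {}  # per-name query counter standing in for server-side state

def resolve(name):
    n = lookups.get(name, 0)
    lookups[name] = n + 1
    addrs = records[name]
    offset = n % len(addrs)
    # Rotate the answer so a different address comes first each time.
    return addrs[offset:] + addrs[:offset]
```

Because resolvers cache the whole answer for the record's TTL, the rotation is coarse: many clients behind one caching resolver will all receive the same first address until the cache expires.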
Load Balancing and High Availability
Load balancing is a cornerstone of high availability architecture. By distributing traffic across multiple servers, you eliminate single points of failure at the application layer. Modern load balancers continuously monitor server health through periodic checks — if a server stops responding or returns errors, the load balancer automatically removes it from the rotation and redistributes its traffic to healthy servers.
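The remove-on-failure behavior can be sketched like this. `probe` stands in for a real HTTP or TCP health check, and all names are illustrative:

```python
# Health-checked rotation: servers that fail their probe are dropped
# from the pool, and round robin runs over the survivors only.
def healthy_pool(servers, probe):
    return [s for s in servers if probe(s)]

def pick(servers, probe, counter):
    pool = healthy_pool(servers, probe)
    if not pool:
        raise RuntimeError("no healthy servers")
    return pool[counter % len(pool)]
```

In practice the probe runs on a timer rather than per request, and a server is usually removed only after several consecutive failures (and re-added after several consecutive successes) to avoid flapping.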
For true high availability, the load balancer itself must also be redundant. This is typically achieved through active-passive or active-active load balancer pairs, or by using a CDN that inherently provides distributed load balancing across its global edge network. The NOC.org CDN distributes traffic across multiple points of presence, combining caching, load balancing, and failover into a single managed layer.