What Is Latency?
Latency is the time it takes for a data packet to travel from its source to its destination. In networking, it is typically measured in milliseconds (ms) and represents the delay between a request being sent and the first byte of the response arriving. Low latency means a fast, responsive connection; high latency means noticeable delays.
For website visitors, latency directly affects the perceived speed of a site. Even if your server can generate a response in 5 ms, a visitor located 10,000 miles away may experience 150 ms or more of latency just from the physical distance the data must travel — and that is before any server processing or DNS resolution time is added.
What Causes Latency?
Several factors contribute to network latency:
- Physical distance: Data travels through fiber optic cables at roughly two-thirds the speed of light. A round trip between New York and Tokyo covers approximately 21,000 km of cable, adding around 100-140 ms of latency that no technology can eliminate — it is limited by physics.
- Network hops: Packets rarely travel in a straight line. They pass through multiple routers, switches, and internet exchange points. Each hop adds a small amount of processing delay, and the cumulative effect can be significant.
- Processing delay: Every device that handles a packet — routers, firewalls, load balancers, and the server itself — takes time to inspect and forward it. Complex firewall rules or deep packet inspection can add measurable latency.
- Congestion: When network links approach capacity, packets are queued in buffers. This queuing delay can spike dramatically during peak traffic or DDoS attacks.
- Protocol overhead: Establishing a TCP connection requires a three-way handshake (one round trip). Adding TLS encryption requires at least one additional round trip (one for TLS 1.3, two for older versions). For an HTTPS connection, you may incur two or more round trips before any application data flows.
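Two of these factors, propagation delay and handshake round trips, lend themselves to back-of-the-envelope arithmetic. The sketch below uses the same approximations as above (fiber at roughly two-thirds the speed of light, one round trip each for TCP and TLS 1.3); it is a lower bound, not a prediction, since it ignores hops, processing, and queuing:

```python
# Rough latency model: propagation delay over fiber plus handshake
# round trips. Distances and handshake counts are illustrative.

FIBER_SPEED_KM_S = 200_000           # ~2/3 the speed of light in vacuum

def rtt_ms(cable_km_one_way: float) -> float:
    """Minimum round-trip time over a fiber path, in milliseconds."""
    return 2 * cable_km_one_way / FIBER_SPEED_KM_S * 1000

def https_setup_ms(cable_km_one_way: float, handshake_rtts: int = 2) -> float:
    """Time spent on TCP + TLS handshakes before any data flows.

    Assumes 1 RTT for TCP and 1 RTT for TLS 1.3 (older TLS needs 2).
    """
    return handshake_rtts * rtt_ms(cable_km_one_way)

print(round(rtt_ms(10_500)))          # NY-Tokyo, ~10,500 km one way: 105
print(round(https_setup_ms(10_500)))  # handshakes alone: 210
```

The 105 ms result matches the 100-140 ms range quoted above; real routes land toward the high end because cable paths are longer than great-circle distance and every hop adds processing delay.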
Latency vs. Bandwidth
Latency and bandwidth are often confused but measure entirely different things:
- Latency is how long it takes a single packet to arrive — the delay.
- Bandwidth is how much data can flow through the connection per second — the capacity.
A useful analogy: latency is how long it takes a truck to drive from warehouse A to warehouse B. Bandwidth is how big the truck is. A bigger truck (more bandwidth) does not make the drive any faster. And a faster route (lower latency) does not increase how much cargo fits in the truck.
For web applications, latency often matters more than bandwidth. A typical web page requires dozens of sequential requests (HTML, CSS, JavaScript, images), and each one incurs at least a full round trip of latency before the next can begin. Doubling your bandwidth rarely makes pages load twice as fast, but halving your latency often does.
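A toy model makes that claim concrete. The request count, asset size, RTT, and bandwidth below are illustrative assumptions, not benchmarks:

```python
# Toy page-load model: N sequential requests, each paying one round
# trip plus transfer time. Illustrative numbers, not measurements.

def load_time_ms(requests: int, rtt_ms: float,
                 bytes_per_request: int, bandwidth_bps: float) -> float:
    """Total load time for `requests` strictly sequential fetches."""
    transfer_ms = bytes_per_request * 8 / bandwidth_bps * 1000
    return requests * (rtt_ms + transfer_ms)

# 30 assets of 50 KB each, 100 ms RTT, 50 Mbps link.
base      = load_time_ms(30, 100, 50_000, 50e6)
double_bw = load_time_ms(30, 100, 50_000, 100e6)
half_rtt  = load_time_ms(30, 50, 50_000, 50e6)
print(round(base), round(double_bw), round(half_rtt))  # 3240 3120 1740
```

Doubling bandwidth shaves under 4% off the load time in this model, while halving latency cuts it nearly in half, because the round trips dominate the transfer time for small assets.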
Measuring Latency
There are several tools and methods for measuring network latency:
- Ping: The simplest tool. It sends ICMP echo requests and reports the round-trip time (RTT). A ping time of 20 ms means the packet took 20 ms to go to the destination and back.
- Traceroute: Shows every hop between you and the destination, along with the latency at each hop. This is invaluable for identifying where delays are being introduced.
- HTTP timing: Browser developer tools break down request timing into DNS lookup, TCP handshake, TLS handshake, time to first byte (TTFB), and content download. This gives a complete picture of real-world latency for web requests.
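Beyond these tools, a round trip can also be timed directly in code. This minimal sketch times a TCP connect in Python, which approximates one network round trip much as ping does, assuming the destination accepts TCP on the given port and outbound connections are not blocked:

```python
# Minimal latency probe: time the TCP three-way handshake, which
# approximates one round trip (like ping, but over TCP rather than
# ICMP, so it also works where ICMP is filtered).

import socket
import time

def tcp_rtt_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Time to establish (and immediately close) a TCP connection, in ms."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass                          # handshake done; close the socket
    return (time.perf_counter() - start) * 1000

# Example (requires network access):
# print(f"{tcp_rtt_ms('example.com'):.1f} ms")
```

Running a probe like this several times and taking the minimum gives a cleaner baseline than a single sample, since queuing delay adds noise on top of the fixed propagation cost.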
How CDNs Reduce Latency
A content delivery network (CDN) is one of the most effective tools for reducing latency. By caching content at edge locations distributed around the world, a CDN ensures that visitors connect to a server that is geographically close to them rather than to a distant origin server.
Instead of a visitor in London making a 140 ms round trip to your server in California, they connect to a CDN edge node in London with 5 ms of latency. The CDN handles the TCP and TLS handshakes locally, serves cached content instantly, and only contacts the origin server when necessary — dramatically reducing the total time to deliver the page.
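The London example works out roughly like this. The sketch counts only the handshakes plus one request/response round trip, using the illustrative RTT figures above, and assumes the edge node has the content cached:

```python
# Simplified time-to-first-byte estimate: TCP + TLS handshakes plus
# one request round trip. RTT figures are the illustrative ones from
# the text, and server processing time is ignored.

def ttfb_ms(rtt_ms: float, handshake_rtts: int = 2) -> float:
    """Handshake round trips plus one request/response round trip."""
    return (handshake_rtts + 1) * rtt_ms

print(ttfb_ms(140))   # London visitor direct to a California origin
print(ttfb_ms(5))     # same visitor hitting a London edge cache
```

Under these assumptions the edge node answers in 15 ms versus 420 ms from the origin, a factor of almost thirty, even before accounting for the cache serving content without any origin processing time.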