Latency is the time delay between the initiation of an event and its perception by some observer. In networking and telecommunications, latency is the time between a sender causing a change in a system's state and its reception by an observer. Network latency is often informally used interchangeably with lag.
Excess latency is one of the most common causes of poor voice or video communications quality over a network. Because networked communications emulate face-to-face or verbal communications, long delays between transmission and reception of a signal are easily noticeable and off-putting. High latency, especially when participants are not prepared for it, causes communication breakdowns and interrupts the flow of conversation.
How latency is perceived varies between observers, but most will notice it at around 100 to 120 milliseconds. Communications start to break down at around 250 to 300 ms. If all parties are aware of large latency in a session (e.g. if all parties know a call is over a satellite connection), this delay might still be acceptable.
The International Telecommunication Union has codified recommendations on maximum acceptable one-way latency in ITU-T G.114. Its recommendation for network planning is to keep one-way latency below 400 ms as an absolute upper limit, while noting that a target of 150 ms one-way latency is suitable for most purposes. Some applications, especially those with interactive sessions, are best experienced with latencies under 100 ms.
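As a minimal sketch of how these planning figures might be applied, the snippet below labels a measured one-way latency against the 150 ms target and the 400 ms limit quoted above. The function name, the label strings, and the treatment of values exactly on a threshold are assumptions made for illustration; they are not taken from G.114.

```python
# One-way latency figures quoted from ITU-T G.114 in the text above (milliseconds).
TARGET_MS = 150  # target suitable for most purposes
LIMIT_MS = 400   # upper limit for network planning


def classify_one_way_latency(latency_ms: float) -> str:
    """Hypothetical helper: label a one-way latency against the G.114 figures.

    Assumption: a value exactly on a threshold counts as meeting it.
    """
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms <= TARGET_MS:
        return "within target"
    if latency_ms <= LIMIT_MS:
        return "above target, within planning limit"
    return "exceeds planning limit"


for sample_ms in (80, 240, 520):
    print(sample_ms, "ms:", classify_one_way_latency(sample_ms))
```

A stricter threshold (for example, the under-100 ms figure mentioned for interactive sessions) could be added as another tier in the same way.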
For many applications of IP communications, the largest contributor to latency is the routing between the sender and receiver. This is caused by the physical routes the packets are forwarded over, as well as the individualized caches, settings, and congestion on intermediate forwarding nodes.
A partial list of other contributors to latency (but by no means an exhaustive one) follows:
Latency measurement was standardized in the IETF's RFC 2544, based on the definitions of latency in RFC 1242. RFC 1242 draws a distinction between store-and-forward devices and bit-forwarding devices; for a store-and-forward device, latency is measured from when the last bit of the input frame reaches the input port to when the first bit of the output frame appears on the output port. Generally, in IP applications, latency measures the time between when the last piece of data leaves a sender and when the recipient receives the first piece of data.
ping is a utility which sends ICMP ECHO packets to measure network performance. In its most commonly used implementation, ping will send ECHO packets and receive ECHO REPLY packets. Ping then rolls up results to calculate statistics related to round-trip times.
Although closely related to latency (especially when routing or network latency dominates), the round-trip time reported by ping is slightly different. On top of taking two trips through the network, it also includes some compute time for the recipient to receive and process the ECHO packet and to send an ECHO REPLY. However, ping often gives an excellent estimate of latency and can be used to diagnose or identify many latency issues.
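The relationship between round-trip time and latency can be sketched in a few lines. The snippet below rolls a set of round-trip samples up into summary statistics, similar in spirit to the summary ping prints, and derives a rough one-way estimate. The function names and sample values are hypothetical, and the estimate assumes the forward and return paths are symmetric, which real networks do not guarantee.

```python
import statistics


def summarize_rtts(rtts_ms):
    """Roll up round-trip samples (ms) into min, mean, max, and population std deviation."""
    if not rtts_ms:
        raise ValueError("need at least one round-trip sample")
    return {
        "min": min(rtts_ms),
        "avg": statistics.fmean(rtts_ms),
        "max": max(rtts_ms),
        "stddev": statistics.pstdev(rtts_ms),
    }


def estimate_one_way_ms(rtt_ms, processing_ms=0.0):
    """Rough one-way estimate: remove the recipient's processing time, then halve.

    Assumption: both directions of the path take the same time.
    """
    return max(rtt_ms - processing_ms, 0.0) / 2


# Hypothetical round-trip samples, in milliseconds.
samples = [20.0, 22.0, 24.0]
summary = summarize_rtts(samples)
print(summary)
print("estimated one-way latency (ms):", estimate_one_way_ms(summary["avg"], processing_ms=2.0))
```

In practice the recipient's processing time is usually unknown, so halving the raw round-trip time gives an upper-leaning estimate of one-way latency.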
In many cases, network effects cause the largest increase in latency. Although physical distance matters, the actual routes taken by packets in a session may be much longer. Take, for instance, the following example:
Two participants are both on the equator, 1,000 miles apart in a straight line.
Their communications path is through a satellite in geosynchronous (more accurately, geostationary) orbit exactly bisecting the two, 22,236 miles above Earth. Rounding off, and allowing for the curvature of the Earth between them, the path from A or B to the satellite is 22,273 miles.
It takes light about 119.6 ms to travel that distance. The signal has to cover it twice (up to the satellite and back down), so if the satellite takes 2 ms to process packets, the minimum one-way latency added by the route is 2 × 119.6 + 2 = 241.2 milliseconds!
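The arithmetic in this example can be checked with a short script. This is a sketch under stated assumptions: the Earth's equatorial radius and the speed of light are constants supplied here rather than given in the text, the 1,000 miles is treated as a straight-line chord through the Earth, and the function names are hypothetical.

```python
import math

# Assumed constants (not stated in the text above).
EARTH_EQUATORIAL_RADIUS_MI = 3963.19   # approximate equatorial radius, miles
SPEED_OF_LIGHT_MI_PER_S = 186_282.397  # speed of light in vacuum, miles per second


def slant_range_miles(chord_mi, altitude_mi):
    """Distance from either participant to a satellite above the midpoint between them."""
    half = chord_mi / 2
    # The chord's midpoint lies below the surface by this depth (the sagitta).
    sagitta = EARTH_EQUATORIAL_RADIUS_MI - math.sqrt(
        EARTH_EQUATORIAL_RADIUS_MI**2 - half**2
    )
    return math.hypot(half, altitude_mi + sagitta)


def one_way_latency_ms(chord_mi, altitude_mi, processing_ms):
    """Propagation delay over both legs (up and down) plus satellite processing time."""
    leg_ms = slant_range_miles(chord_mi, altitude_mi) / SPEED_OF_LIGHT_MI_PER_S * 1000
    return 2 * leg_ms + processing_ms


leg_miles = slant_range_miles(1000, 22236)
total_ms = one_way_latency_ms(1000, 22236, 2)
print(f"{leg_miles:.0f} miles per leg, {total_ms:.1f} ms one way")
```

With these constants the result agrees with the figures above to within about a tenth of a millisecond; the small difference comes from rounding each leg to 119.6 ms before doubling.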