Researchers discovered that congestion control algorithms designed to ensure that multiple users sending data over a network do so fairly are incapable of avoiding situations in which some users hog all of the bandwidth. Congestion can occur when users want to send data over the internet faster than the network can handle, similar to how traffic congestion snarls the morning commute into a big city.
Computers and devices that transmit data over the internet divide the data into smaller packets and use a special algorithm to determine how quickly those packets are sent. These congestion control algorithms seek to fully discover and utilize available network capacity while fairly sharing it with other users who may be sharing the same network. These algorithms try to minimize delays caused by data waiting in queues in the network.
Several algorithms have been developed over the last decade by researchers in industry and academia to achieve high rates while controlling delays. Some of these, such as Google’s BBR algorithm, are now widely used by many websites and applications.
However, a group of MIT researchers discovered that these algorithms can be extremely unfair. According to a new study, there will always be a network scenario in which at least one sender receives almost no bandwidth compared to other senders; that is, starvation cannot be avoided.
“What is really surprising about this paper and the results is that when you take into account the real-world complexity of network paths and all the things they can do to data packets, it is basically impossible for delay-controlling congestion control algorithms to avoid starvation using current methods,” says Mohammad Alizadeh, associate professor of electrical engineering and computer science (EECS).
What is really surprising about this paper and the results is that when you take into account the real-world complexity of network paths and all the things they can do to data packets, it is basically impossible for delay-controlling congestion control algorithms to avoid starvation using current methods.Mohammad Alizadeh
While Alizadeh and his co-authors were unable to find a traditional congestion control algorithm that could avoid starvation, they believe that there may be algorithms in a different class that could. Their research also suggests that modifying these algorithms to allow for larger variations in delay could help prevent starvation in some network situations.
Alizadeh collaborated on the paper with first author and EECS graduate student Venkat Arun, as well as senior author and Fujitsu Professor of Computer Science and Artificial Intelligence Hari Balakrishnan. The findings will be presented at the ACM SIGCOMM (Special Interest Group on Data Communications) conference.
Congestion control is a fundamental problem in networking that researchers have been trying to tackle since the 1980s.
A user’s computer does not know how fast to send data packets over the network because it lacks information, such as the quality of the network connection or how many other senders are using the network. Sending packets too slowly makes poor use of the available bandwidth. But sending them too quickly can overwhelm the network, and in doing so, packets will start to get dropped. These packets must be resent, which leads to longer delays. Delays can also be caused by packets waiting in queues for a long time.
Congestion control algorithms use packet losses and delays as signals to determine how fast data should be sent. However, the internet is complex, and packets can be delayed or lost for reasons other than network congestion. Data could, for example, be held up in a queue along the way before being released with a burst of other packets, or the receiver’s acknowledgment could be delayed. Delays that are not caused by congestion are referred to as “jitter” by the authors.
Even if a congestion control algorithm measures delay perfectly, it can’t tell the difference between delay caused by congestion and delay caused by jitter. Delay caused by jitter is unpredictable and confuses the sender. Because of this ambiguity, users start estimating delay differently, which causes them to send packets at unequal rates. Eventually, this leads to a situation where starvation occurs and someone gets shut out completely, Arun explains.
“We started the project because we lacked a theoretical understanding of congestion control behavior in the presence of jitter. To place it on a firmer theoretical footing, we built a mathematical model that was simple enough to think about, yet able to capture some of the complexities of the internet. It has been very rewarding to have math tell us things we didn’t know and that have practical relevance,” he says.
The researchers fed their mathematical model to a computer, gave it a series of commonly used congestion control algorithms, and asked the computer to find an algorithm that could avoid starvation, using their model.
“We couldn’t pull it off. We tried every algorithm we were aware of, as well as some new ones we devised. Nothing was working. The computer always found a way for some people to get all of the bandwidth while at least one person gets almost nothing “Arun explains.
This result surprised the researchers, especially since these algorithms are widely thought to be reasonably fair. They began to suspect that avoiding starvation, an extreme form of unfairness, might not be possible. This prompted them to develop a class of algorithms known as “delay-convergent algorithms,” which they demonstrated will always suffer from starvation in their network model. All existing congestion control algorithms (that the researchers are aware of) that control delay are delay-convergent.
The fact that such simple failure modes of these widely used algorithms remained unknown for so long illustrates how difficult it is to understand algorithms through empirical testing alone, Arun adds. It underscores the importance of a solid theoretical foundation.
However, all is not lost. While all of the algorithms they tried failed, there may be other non-delay-convergent algorithms that can avoid starvation. This suggests that one solution could be to design congestion control algorithms that vary the delay range more widely so that the range is larger than any delay caused by network jitter.
“Algorithms have attempted to limit the variations in delay about the desired equilibrium in order to control delays, but there is nothing wrong with potentially increasing delay variation in order to obtain better measurements of congestive delays. You would simply need to adopt a new design philosophy” Balakrishnan elaborates.
Now, the researchers want to see if they can find or create an algorithm that will eliminate hunger. They also want to apply this mathematical modeling and computational proof approach to other difficult, unsolved problems in networked systems.
“We are increasingly reliant on computer systems for critical tasks, and we need to establish a more solid conceptual foundation for their dependability. We’ve shown how surprising things can be discovered when you take the time to develop these formal specifications of what the problem is” Alizadeh says