A diagram of the traffic volume across Facebook’s network, and the split between M2M and user traffic. (Source: Facebook)
With the Express Backbone, Facebook engineers sought to improve upon some of the technical constraints of the “classic” backbone network. Its approach, which is explained in detail in the technical blog post, provides new levels of control over the handling of different types of traffic. That includes a hybrid model for traffic engineering (TE), using both distributed control agents and a central controller.
“This model allows us to control some aspects of traffic flow centrally, e.g., running intelligent path computations,” the Facebook team writes. “At the same time, we still handle network failures in a distributed fashion, relying on in-band signaling between Open/R agents deployed on the network nodes.
“Such a hybrid approach allows the system to be nimble when it encounters congestion or failure. In EBB, the local agents can immediately begin redirecting traffic when they spot an issue (local response). The central system then has time to evaluate the new topology and come up with an optimum path allocation, without the urgent need to react rapidly.”
The result is a “path allocation algorithm” that can be changed to address the different requirements for each class of traffic. For example, it can:
- Minimize latency for latency-sensitive traffic.
- Minimize path congestion for latency-insensitive traffic.
- Schedule latency-sensitive traffic ahead of latency-insensitive traffic.
“We’re very excited to have EBB running in production, but we are continuously looking to make further improvements to the system,” the Facebook engineers reported.
“For example, we’re extending the controller to provide a per-service bandwidth reservation system. This feature would make bandwidth allocation an explicit contract between the network and services, and would allow services to throttle their own traffic under congestion. In addition, we’re also working on a scheduler for large bulk transfers so that congestion can be avoided proactively, as opposed to managed reactively.”