SAN JOSE, Calif. – The massive data traffic coursing through Facebook’s cloud campuses has outstripped the capabilities of commercial networking hardware. So the company has built its own distributed networking system to support its growing data needs – and to lay the groundwork for even larger cloud campuses to come.
At Tuesday’s Open Compute Summit 2018, Facebook unveiled the Fabric Aggregator, a new system to manage data traffic between its data centers. The company is donating the device design to the Open Compute Project (OCP), the open source hardware project which Facebook co-founded in 2011.
The development of Fabric Aggregator highlights how the super-sizing of data centers and cloud campuses has implications for network equipment. As hyperscale companies like Facebook continue to grow, the huge volumes of user data prompt them to add data centers. This leads to larger cloud campuses, with truly massive volumes of data moving between the data centers on each campus.
The volume of this “East-West” traffic between data centers far surpasses the volume of data traveling to other campuses and the Internet – known as “North-South” traffic.
Even Bigger Campuses Ahead?
As we noted yesterday, Facebook has been increasing the scale of its data center campuses. Since the beginning of 2017, Facebook has announced five new cloud campuses, and now has 12 around the globe, including nine in the U.S.
Early campuses featured two data centers, but the company has recently been building bigger.
“We’ve been moving to six buildings, creating a large increase in East-West traffic,” said Sree Sankar, Technical Product Manager at Facebook. “That required a big change.”
The Fabric Aggregator is a distributed network system made up of a simple building block – Facebook’s Wedge 100 switch.
“Unfortunately using a large, general purpose network chassis no longer met our needs in terms of scale, power efficiency, and flexibility,” the Facebook Engineering Team said in a blog post. “Taking a disaggregated approach allows us to accommodate larger regions and varied traffic patterns, while providing the flexibility to adapt to future growth.”
What does “adapt to future growth” mean? One obvious possibility is even larger data center campuses, with more buildings. But the most important benefit of the new architecture is that Fabric Aggregator provides flexibility, allowing it to manage campuses with different numbers of facilities in a logical fashion.
Flexible Enough for Many Scenarios
The aggregator was designed so it can work within a single rack or in multi-rack configurations. Different flavors of the rack were designed to support various networking technologies that tie together the many layers of data center traffic.
“The ability to tailor different Fabric Aggregator node sizes in different regions allows us to use resources more efficiently, while having no internal dependencies keeps failures isolated, improving the overall reliability of the system,” Facebook said.
The Fabric Aggregator uses Wedge 100S switches with Facebook Open Switching System (FBOSS) as the base building blocks, running Border Gateway Protocol (BGP) between all subswitches.
“A building block approach gives us the ability to operate the solution at either the subswitch or node level,” the Facebook team writes. “For example, if we detect a misbehaving subswitch inside a particular node, we can take that specific subswitch out of service for debugging. If there is a need [to] take all downstream and upstream subswitches out of service in a node, our operational tools abstract all the underlying complexities inherent to multiple interactions across many individual subswitches. We also implement redundancy at the node level so that we can take many nodes out of service simultaneously in a single region. The Fabric Aggregator layer can suffer many simultaneous failures without compromising the overall performance of the network.”
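For readers who want a concrete picture of operating at "the subswitch or node level," here is a minimal Python sketch of the idea. It is an illustration only, not Facebook's tooling: the class names, the `drain` methods, the node sizes, and the `region_in_service_fraction` helper are all hypothetical and do not come from FBOSS or the Facebook post.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subswitch:
    """One switch inside a node (hypothetical model, not an FBOSS type)."""
    name: str
    direction: str  # "downstream" or "upstream"
    in_service: bool = True


@dataclass
class Node:
    """A group of subswitches that can be operated as a single unit."""
    name: str
    subswitches: List[Subswitch] = field(default_factory=list)

    def drain_subswitch(self, subswitch_name: str) -> None:
        # Take one misbehaving subswitch out of service for debugging.
        for sub in self.subswitches:
            if sub.name == subswitch_name:
                sub.in_service = False
                return
        raise KeyError(subswitch_name)

    def drain(self) -> None:
        # Take every downstream and upstream subswitch in the node out of service.
        for sub in self.subswitches:
            sub.in_service = False

    def active_count(self) -> int:
        return sum(1 for sub in self.subswitches if sub.in_service)


def make_node(name: str, downstream: int = 2, upstream: int = 2) -> Node:
    # Node sizes here are arbitrary; the article says sizes vary by region.
    subs = [Subswitch(f"{name}-down{i}", "downstream") for i in range(downstream)]
    subs += [Subswitch(f"{name}-up{i}", "upstream") for i in range(upstream)]
    return Node(name, subs)


def region_in_service_fraction(nodes: List[Node]) -> float:
    # Share of subswitches in the region that are still in service.
    total = sum(len(node.subswitches) for node in nodes)
    active = sum(node.active_count() for node in nodes)
    return active / total


region = [make_node(f"node{i}") for i in range(4)]
region[0].drain_subswitch("node0-down1")  # subswitch-level operation
region[1].drain()                         # node-level operation
print(region_in_service_fraction(region))
```

In this toy model the rest of the region stays in service after both operations, which is the property the Facebook team describes. A real deployment would also have to withdraw BGP routes and shift traffic, which the sketch does not attempt.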
For additional technical details, see the Facebook Engineering Post on Fabric Aggregator.