
Facebook Takes Open Compute Hardware to the Next Level

By Rich Miller - March 14, 2016


Rows of servers inside a Facebook data center. (Photo: Rich Miller)


SAN JOSE, Calif. – Facebook is creating the next generation of open hardware, building new technologies into its data center platform. The social network is leveraging an alphabet soup of powerful technologies – including SSDs, GPUs, NVM and JBOFs – to build new servers and storage gear to accelerate its infrastructure.

These upgrades are part of Facebook’s vision to create a network of powerful data centers that will push the boundaries of delivering services over the Internet.

“Over the next decade, we’re going to build experiences that rely more on technology like artificial intelligence and virtual reality,” said Facebook CEO Mark Zuckerberg. “These will require a lot more computing power, and through efforts like the Open Compute Project, we’re developing a global infrastructure to enable everyone to enjoy them.”

Facebook discussed its progress Wednesday at the Open Compute Summit, which brought together the growing community of open source hardware hackers who are building on designs that started life in Facebook’s data centers. It showed off a number of updates to its infrastructure. These include:

  • A retooled server form factor to pack more performance into the same power footprint.
  • New servers for high-performance data crunching, powered by graphic processing units (GPUs) rather than CPUs.
  • An evolved storage sled, in which the original JBOD (“just a bunch of disks”) has become a much faster JBOF (“Just a Bunch of Flash”).
  • An experiment with advances in non-volatile memory (NVM) to provide more options for storage tiering.

Jason Taylor, chairman of the Open Compute Project.

The summit marked the fifth anniversary for the Open Compute Project, which prompted reflection on how far OCP has come since 2011, when it was founded to innovate upon designs released by Facebook.

“It’s remarkable to see where we are today,” said Jason Taylor, chairman of the Open Compute Project and a VP of Infrastructure at Facebook. “OCP is where engineers can get together to build amazing things.

“I feel a tremendous sense of momentum, as we’ve moved beyond hyperscale and into finance and telecom,” he said.

Servers: Next-Generation Design

Facebook has totally retooled its server design and infrastructure, shifting from its traditional two-processor server to a system-on-chip (SoC) based on a single Intel Xeon-D processor that uses less power and solves several architectural challenges.

The Mono Lake server boards are housed in a new enclosure called Yosemite, a sled chassis that holds four of the SoC servers. Facebook engineers Vijay Rao and Edwin Smith described the new design on the Facebook Engineering Blog.

“We worked closely with (Intel) on the design of a new processor, and in parallel redesigned our server infrastructure to create a system that would meet our needs and be widely adoptable by the rest of the industry,” they wrote. “The result was a one-processor server with lower-power CPUs, which worked better than the two-processor server for our web workload and is better suited overall to data center workloads … At the same time, we redesigned our server infrastructure to accommodate double the number of CPUs per rack within the same power infrastructure.”


The Facebook Yosemite sled, which houses four of the new Mono Lake single-processor servers. (Image: Facebook)

The new design streamlines communication between processors, and between the processors and memory.

“We minimized the CPU to exactly what we required,” the Facebook engineers reported. “We took out the QPI (Quick Path Interconnect, an Intel point-to-point processor interconnect) links, which reduced costs for Intel and removed the NUMA (Non-Uniform Memory Access) problem for us, given that all servers would be one-socket-based. We designed for it to be a system-on-a-chip (SOC), which integrates the chipset, thus creating a simpler design. This single-socket CPU also has a lower thermal design power (TDP).”

This allowed Facebook to create a server infrastructure that could pack far more performance into each rack, while remaining under the designed rack power density of 11 kW per cabinet.
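
To make the rack-level arithmetic concrete, here is a back-of-envelope sketch in Python. The per-server wattages are illustrative assumptions, not Facebook’s published figures; only the 11 kW rack budget comes from the article.

```python
# Illustrative rack power budgeting. The per-server wattages are
# assumptions; only the 11 kW budget is cited in the article.

RACK_BUDGET_W = 11_000          # designed rack power density cited above

DUAL_SOCKET_SERVER_W = 400      # assumed: two ~120 W CPUs plus memory, drives, fans
SOC_SLED_W = 400                # assumed: Yosemite-style sled with four low-TDP SoCs

dual_socket_servers = RACK_BUDGET_W // DUAL_SOCKET_SERVER_W
cpus_before = dual_socket_servers * 2        # two sockets per server

sleds = RACK_BUDGET_W // SOC_SLED_W
cpus_after = sleds * 4                       # four single-socket SoCs per sled

print(f"Dual-socket design: {dual_socket_servers} servers, {cpus_before} CPUs per rack")
print(f"SoC sled design:    {sleds} sleds, {cpus_after} CPUs per rack")
# Same 11 kW envelope, twice the CPUs per rack.
```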

Beefier Servers for AI Data-Crunching

Facebook shared an update on its use of GPUs, which in recent years have played a major role in high performance computing. GPUs were initially used to accelerate graphics on desktop PCs, but are now helping accelerate workloads on some of the world’s most powerful supercomputers.

Facebook is using GPUs to bring more horsepower to bear on data-crunching for its artificial intelligence (AI) and machine learning platform. Facebook’s AI Lab trains neural networks (computers that emulate the learning process of the human brain) to solve new challenges. This requires lots of computing horsepower.
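
As an illustration of the kind of work involved, here is a minimal sketch of a GPU-accelerated training step in PyTorch. The framework, model, and synthetic data are stand-ins chosen for brevity, not Facebook’s actual AI Lab stack.

```python
# Minimal sketch of GPU-accelerated neural-network training in PyTorch.
# Illustrative only: model, data, and framework are assumptions.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tiny feed-forward network standing in for a real model.
model = nn.Sequential(
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 10),
).to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Synthetic batch: the heavy matrix math runs on the GPU once the
# tensors and model parameters live there.
x = torch.randn(64, 256, device=device)
y = torch.randint(0, 10, (64,), device=device)

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()      # gradient computation, GPU-parallel
    optimizer.step()
```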

“We’ve been investing a lot in our artificial intelligence technology,” said Jay Parikh, Global Head of Engineering and Infrastructure for Facebook. “AI is now powering things like your Newsfeed. It is helping us serve better ads. It is also helping make the site safer for people that use Facebook on a daily basis.”

Facebook’s Big Sur is Open Rack-compatible hardware designed for AI computing at a large scale. (Image: Facebook)

The Big Sur system leverages NVIDIA’s Tesla Accelerated Computing Platform, with eight high-performance GPUs of up to 300 watts each and the flexibility to configure the GPUs across multiple PCI-e topologies. Facebook has optimized these new servers for thermal and power efficiency, allowing it to operate them in the company’s data centers alongside standard CPU-powered servers.
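
A rough power check, using the figures above plus an assumed allowance for non-GPU components, shows why that thermal and power optimization matters when these systems share racks with standard servers:

```python
# Rough power arithmetic for a Big Sur-class system. GPU count and wattage
# are from the article; the non-GPU overhead is an assumed placeholder.
GPU_W = 300
GPUS_PER_SYSTEM = 8
OVERHEAD_W = 600                     # assumed: CPUs, memory, fans, drives

system_w = GPUS_PER_SYSTEM * GPU_W + OVERHEAD_W    # 3,000 W
rack_budget_w = 11_000                             # rack budget cited earlier

print(f"Per-system draw: {system_w} W")
print(f"Systems per 11 kW rack: {rack_budget_w // system_w}")   # 3
```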

The gains in performance and latency provided by Big Sur help Facebook process more data, dramatically shortening the time needed to train its neural networks.

“It is a significant improvement in performance,” said Parikh. “We’ve deployed thousands of these machines in a matter of months. It gives us the ability to drive this technology into more product use cases within the company.”

Storage: Just a Bunch of Flash

Facebook has used Flash for many years to accelerate server boot drives and caching. As its infrastructure has continued to scale, it has created a new “building block” to integrate more Flash into its operations. Facebook has adapted its initial Open Compute storage sled, known as Knox, and substituted solid state drives (SSDs) for the hard disk drives (HDDs) – transforming the “Just a Bunch of Disks” storage unit into “Just a Bunch of Flash” (JBOF).

Facebook has worked with Intel to develop the new JBOF unit, called Lightning, reflecting the speed gained through the use of NVM Express (NVMe), a high-speed PCI Express interface that’s been optimized for SSDs. Here’s a look at the specs in a slide from Parikh’s presentation at the Open Compute Summit. 


Specs for the new Lightning storage sled, which replaces hard disk drives with solid state drives. (Image: Facebook)

As a disaggregated storage appliance, Lightning can support a variety of different applications. “It brings a new building block in the form of high-performance storage for the applications we’re building,” said Parikh.
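
The order-of-magnitude arithmetic below uses typical ballpark device figures (assumptions, not Lightning’s spec sheet) to show why swapping spinning disks for NVMe flash changes the storage building block so dramatically:

```python
# Order-of-magnitude comparison of random 4K read performance.
# Typical ballpark figures for the era, not Lightning's specs.
hdd_iops = 200               # 7,200 RPM SATA disk
nvme_iops = 400_000          # NVMe SSD

hdd_latency_ms = 8.0         # seek plus rotational delay
nvme_latency_ms = 0.1        # flash read over NVMe

print(f"IOPS gain:    {nvme_iops // hdd_iops}x")                  # 2000x
print(f"Latency gain: {hdd_latency_ms / nvme_latency_ms:.0f}x")   # 80x
```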

Parikh said there will be more storage innovation ahead, particularly in using non-volatile memory (NVM) in new ways.

Jay Parikh, Global Head of Engineering and Infrastructure at Facebook. (Photo: Rich Miller)

“In the storage industry, disk drives are getting bigger, but they’re not getting more reliable, latency isn’t getting any better, and IOPS (input/output operations per second) isn’t improving,” said Parikh. “Flash is also getting slightly better, but endurance is not improving that dramatically. We’re really stuck with this paradigm where things are scaling out and getting bigger, but from a performance perspective, we’re not getting what we actually need.”

Facebook sees a potential answer in new NVM implementations, especially the 3D XPoint technology developed by Intel and Micron. Parikh called on the Open Compute community to focus on this technology as a worthwhile solution to current storage challenges.

“We can start to think about our storage problems, and spread that (storage) across many more tiers that give us more price and performance levers to scale out things for performance, or capacity, or optimizing on price,” said Parikh, who said NVM offered an attractive option between DRAM and NAND (Flash).
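
To make the tiering idea concrete, here is a small Python sketch of a two-tier store: a size-limited fast tier (standing in for DRAM or NVM) in front of a larger, slower tier (standing in for flash or disk). The class, eviction policy, and method names are hypothetical illustrations, not Facebook’s design.

```python
# Sketch of a two-tier storage abstraction: a small fast tier backed by
# a larger, slower persistent tier. Hypothetical illustration of tiering.
from collections import OrderedDict

class TieredStore:
    def __init__(self, fast_capacity, slow_store):
        self.fast = OrderedDict()          # stands in for DRAM (or NVM)
        self.fast_capacity = fast_capacity
        self.slow = slow_store             # stands in for flash/disk

    def get(self, key):
        if key in self.fast:               # hit in the fast tier
            self.fast.move_to_end(key)     # LRU bookkeeping
            return self.fast[key]
        value = self.slow.get(key)         # miss: fall back to slow tier
        if value is not None:
            self._promote(key, value)
        return value

    def put(self, key, value):
        self.slow[key] = value             # write through to the slow tier
        self._promote(key, value)

    def _promote(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        while len(self.fast) > self.fast_capacity:
            self.fast.popitem(last=False)  # evict least-recently-used

# Usage: a plain dict stands in for the slow tier.
store = TieredStore(fast_capacity=2, slow_store={})
store.put("a", 1); store.put("b", 2); store.put("c", 3)
print(store.get("a"))   # served from the slow tier, then promoted
```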

Facebook is test-driving its NVM configurations with an open source project called MyRocks, which is built atop MySQL and RocksDB database technologies.
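
For a feel of the key-value model underneath MyRocks, here is a minimal RocksDB example, assuming the third-party python-rocksdb bindings are installed; the keys and path are made up.

```python
# Minimal RocksDB usage via the third-party python-rocksdb bindings.
# RocksDB is a log-structured merge-tree (LSM) key-value store; MyRocks
# puts a MySQL storage engine on top of it.
import rocksdb

opts = rocksdb.Options(create_if_missing=True)
db = rocksdb.DB("example.db", opts)      # on-disk database directory

db.put(b"user:1001", b"alice")           # writes land in a memtable first,
db.put(b"user:1002", b"bob")             # then flush to sorted files (SSTs)

print(db.get(b"user:1001"))              # b'alice'
```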

The Road Ahead: Scaling for the Data Deluge to Come

Facebook’s relentless push to build a faster and more powerful infrastructure is driven by the growth of its audience, which now includes 1.6 billion users on Facebook, 1 billion on WhatsApp, 800 million on Facebook Messenger, and 400 million using Instagram. The company’s ambitions are also powered by Zuckerberg’s embrace of virtual reality, reflected in the $2 billion acquisition of VR pioneer Oculus.

Virtual reality can deliver immersive 3D experiences, and many analysts believe the technology is nearly ready for prime time. Zuckerberg believes Facebook can deliver its social network as a virtual reality experience.

“Pretty soon we’re going to live in a world where everyone has the power to share and experience whole scenes as if you’re just there, right there in person,” Zuckerberg said at the recent Mobile World Congress. “Imagine being able to sit in front of a campfire and hang out with friends anytime you want. Or being able to watch a movie in a private theater with your friends anytime you want. Imagine holding a group meeting or event anywhere in the world that you want. All these things are going to be possible. And that’s why Facebook is investing so much early on in virtual reality, so we can hope to deliver these types of social experiences.”

That will require a LOT of infrastructure. Full VR video files can be up to 20 times larger than the size of today’s HD video files.

“The file sizes are so large they can be an impediment to delivering 360 video or VR in a quality manner at scale,” wrote Facebook’s Evgeny Kuzakov and David Pio, who recently outlined Facebook’s progress on encoding and compression technologies for virtual reality files. Facebook is moving from equirectangular layouts to a cube format in 360 video, reducing file sizes by 25 percent.
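
Putting the article’s two figures together (VR files up to 20 times HD size, cube layout saving 25 percent), a quick back-of-envelope calculation looks like this; the HD baseline is an assumption for illustration:

```python
# Back-of-envelope VR file-size arithmetic from the figures above.
hd_gb_per_hour = 3.0                       # assumed HD video baseline
vr_gb_per_hour = 20 * hd_gb_per_hour       # "up to 20 times larger"
cube_gb_per_hour = 0.75 * vr_gb_per_hour   # cube layout: 25% smaller

print(f"VR, equirectangular: {vr_gb_per_hour:.0f} GB/hour")    # 60
print(f"VR, cube layout:     {cube_gb_per_hour:.0f} GB/hour")  # 45
```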

But Facebook realizes that real-time delivery of virtual reality will require faster networks, and it can’t build them alone. Following the Open Compute model, Facebook has created the Telecom Infra Project, teaming with Equinix, Intel, Nokia, SK Telecom and T-Mobile/Deutsche Telekom to develop new 5G technologies to accelerate global networks.

“Scaling traditional telecom infrastructure to meet the global data challenge (of video and virtual reality) is not moving as fast as people need it to,” said Parikh. “Driving a faster pace of innovation in telecom infrastructure is necessary to meet these new technology challenges and to unlock new opportunities.”


Tagged With: Artificial Intelligence, Facebook, Open Compute, Servers, SSDs, Storage


About Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
