Data Center Frontier

Charting the future of data centers and cloud computing.


With Big Basin, Facebook Beefs Up its AI Hardware

By Rich Miller - March 8, 2017

Big Basin, the new machine learning AI server Facebook introduced today at the Open Compute Summit in Santa Clara, Calif. (Photo: Facebook)

SANTA CLARA, Calif. – Facebook is beefing up its high performance computing horsepower, enhancing its use of artificial intelligence to personalize your news feed.

Facebook introduced brawny new hardware to power its AI workloads today at the 2017 Open Compute Summit at the Santa Clara Convention Center. Known as Big Basin, the unit brings more memory to its GPU-powered data crunching. It’s a beefier successor to Big Sur, the first-generation Facebook AI server unveiled last July.

“With Big Basin, we can train machine learning models that are 30 percent larger because of the availability of greater arithmetic throughput and a memory increase from 12 GB to 16 GB,” said Kevin Lee, a Technical Program Manager at Facebook. “This enables our researchers and engineers to move more quickly in developing increasingly complex AI models that aim to help Facebook further understand text, photos, and videos on our platforms and make better predictions based on this content.”

Making Your Newsfeed Smarter

Big Sur and Big Basin play important roles in Facebook’s bid to create a smarter newsfeed for its 1.9 billion users around the globe. With this hardware, Facebook can train its machine learning systems to recognize speech, understand the content of video and images, and translate content from one language to another.

As leading tech companies push the boundaries of machine learning, they are often following a do-it-yourself approach to their HPC hardware. Google, Apple and Amazon have also created research labs to pursue faster and better AI capabilities. They have used different approaches to hardware, with Google opting for custom ASICs (application specific integrated circuits) for its machine learning operations.

Facebook has chosen to use NVIDIA graphics processing units (GPUs) for its machine learning hardware. Facebook has been designing its own hardware for many years, and in preparing to upgrade Big Sur, the Facebook engineering team gathered feedback from colleagues in Applied Machine Learning (AML), Facebook AI Research (FAIR), and infrastructure teams.

The Power of Disaggregation

For Big Basin, Facebook collaborated with QCT (Quanta Cloud Technology), one of the original design manufacturers (ODMs) that works closely with the Open Compute community. Big Basin features eight NVIDIA Tesla P100 GPU accelerators, connected using NVIDIA NVLink to form an eight-GPU hybrid cube mesh — similar to the architecture used by NVIDIA’s DGX-1 “supercomputer in a box.”
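The hybrid cube mesh can be pictured as a graph: two fully connected quads of four GPUs each, with every GPU also linked to its counterpart in the other quad, so each GPU uses four NVLink connections. A minimal sketch of that link layout (the quad-plus-cross-link structure follows NVIDIA's published DGX-1 topology; the function and variable names here are illustrative, not Facebook's or NVIDIA's tooling):

```python
from itertools import combinations

def hybrid_cube_mesh(num_gpus=8):
    """Build the NVLink edge list for an 8-GPU hybrid cube mesh:
    two fully connected 4-GPU quads, plus one cross link per GPU
    to its twin in the other quad."""
    half = num_gpus // 2
    quads = [range(0, half), range(half, num_gpus)]
    edges = set()
    for quad in quads:                      # fully connect each quad (6 links per quad)
        edges.update(combinations(quad, 2))
    for g in range(half):                   # one cross link per GPU (4 links)
        edges.add((g, g + half))
    return sorted(edges)

links = hybrid_cube_mesh()
degree = {g: sum(g in edge for edge in links) for g in range(8)}
print(len(links), degree[0])   # 16 links total, 4 per GPU
```

Sixteen point-to-point links with four per GPU matches the P100's four NVLink ports, which is what lets the topology run GPU-to-GPU traffic without crossing the PCIe bus.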

Big Basin features eight NVIDIA Tesla P100 GPU accelerators. It’s the successor to Big Sur, which used an earlier version of NVIDIA’s GPU technology. (Photo: Facebook)

Big Basin offers an example of one of the key principles guiding Facebook’s hardware design – disaggregation. Key components are built using a modular design, separating the CPU compute from the GPUs, making it easier to integrate components as new technology emerges.

“For the Big Basin deployment, we are connecting our Facebook-designed, third-generation compute server as a separate building block from the Big Basin unit, enabling us to scale each component independently,” Lee writes in a blog post announcing Big Basin. “The GPU tray in the Big Basin system can be swapped out for future upgrades and changes to the accelerators and interconnects.”

Flexibility and Faster Upgrades

Big Basin is split into three main sections: the accelerator tray, the inner chassis, and the outer chassis. The disaggregated design allows the GPUs to be positioned directly in front of the cool air being drawn into the system, removing preheat from other components and improving the overall thermal efficiency of Big Basin.


There are multiple advantages to this disaggregated design, according to Eran Tal, an engineering manager at Facebook.

“The concept is breaking down and separating components, and creating the ability to select what solution you want at different levels of hardware,” said Tal. “It gives you a lot of flexibility in addressing design with a fast-changing workload. You can never know what you will need tomorrow.

“You’re maximizing efficiency and flexibility,” he added.

Two New Server Models

Facebook also introduced two new server designs, each representing the next generation of existing OCP designs.

  • Tioga Pass is the successor to Leopard, which is used for a variety of compute services at Facebook. Tioga Pass has a dual-socket motherboard, which uses the same 6.5” by 20” form factor and supports both single-sided and double-sided designs. The double-sided design, with DIMMs on both PCB sides, allows Facebook to maximize memory capacity. The flexible design also allows Tioga Pass to serve as the head node for both the Big Basin JBOG (Just a Bunch of GPUs) and the Lightning JBOF (Just a Bunch of Flash), doubling the available PCIe bandwidth when accessing either GPUs or flash.
  • Yosemite v2 is a refresh of Facebook’s Yosemite multi-node compute platform. The new server holds four server cards. Unlike Yosemite, the new power design supports hot service: servers can continue to operate and don’t need to be powered down when the sled is pulled out of the chassis for components to be serviced. With the previous design, repairing a single server cut off access to the other three servers, since all four lost power.
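The practical payoff of hot service is a smaller blast radius per repair, which this toy calculation makes concrete (a sketch only; the function name and the four-cards-per-sled figure come from the description above, not from any Facebook tooling):

```python
def servers_offline_per_repair(hot_service: bool, cards_per_sled: int = 4) -> int:
    """Count how many server cards lose power when one card is pulled
    for service. Yosemite v1 powered the whole sled as a unit; the v2
    power design keeps the untouched cards running."""
    return 1 if hot_service else cards_per_sled

print(servers_offline_per_repair(hot_service=False))  # 4: v1 takes the whole sled down
print(servers_offline_per_repair(hot_service=True))   # 1: v2 keeps the other three serving
```

Across a large fleet with frequent component swaps, cutting the per-repair impact from four servers to one directly reduces capacity lost to maintenance.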

Tagged With: Artificial Intelligence, Facebook, NVIDIA, Open Compute

About Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.

Copyright Data Center Frontier LLC © 2022