Amazon Building Custom Chips to Accelerate Cloud Networking

Nov. 30, 2016
Amazon Web Services is developing its own custom semiconductor chips to accelerate its cloud computing network. Amazon’s James Hamilton provided an overview at the AWS re:Invent conference.

Amazon Web Services is developing custom semiconductors to accelerate its cloud computing network, expanding its push into custom hardware, the company said Tuesday. AWS says its new Annapurna ASIC will enable it to move data faster across its huge data center network.

“We’re in the semiconductor business!” said James Hamilton, VP and Distinguished Engineer at Amazon Web Services, during a keynote last night at the AWS re:Invent conference in Las Vegas. “We think this is a really big deal.”

Amazon unveiled the new chip in a presentation showcasing the expansion of the fiber network and data infrastructure that power its massive AWS cloud operation, which now generates more than $10 billion in annual revenue.

Hamilton also provided an overview of Amazon’s data center deployment strategy, as well as a glimpse of recent-generation AWS custom cloud servers and storage units.

The Rise of Custom Chips for Hyperscale Workloads

The new networking chip will be a custom semiconductor known as an ASIC (Application Specific Integrated Circuit) that can be tailored for tasks like network management. This reflects a trend for hyperscale data centers to move beyond CPUs and turn to specialized chips like ASICs, FPGAs (Field Programmable Gate Arrays) and GPUs.

Microsoft is using FPGAs to accelerate its cloud servers, Google has developed a custom ASIC for artificial intelligence data crunching, and Facebook has opted for a GPU-driven machine learning server.

Amazon has historically been stingy in disclosing details about its infrastructure. But the cloud computing arena is becoming more competitive, with Google, Microsoft and Oracle aggressively adding data centers in a bid to gain ground on market leader AWS. With more customers doing comparison shopping for cloud platforms, Amazon has a vested interest in asserting the competitive advantages of AWS.

The company’s favorite venue for this is re:Invent, the customer conference that drew 32,000 attendees this year, packing event space across three hotels on the Las Vegas Strip.

Private Network Powers AWS Growth

Hamilton did not disappoint. The veteran technologist is something of a rock star among infrastructure geeks, with a long history of innovation at Microsoft and AWS. His 2010 critique of network hardware (“Datacenter Networks Are In My Way”) stoked momentum for disruption in data center networking.

On Tuesday evening Hamilton took the stage at re:Invent to share the details of the network he once dreamed about. He confirmed that Amazon has built a global private network to manage the flow of data between its data centers and availability zones, providing customers exceptional reliability as well as failover options for applications.

Building a private network is “really, really expensive,” said Hamilton. “But it’s the right thing to do. If you’ve got a packet, the more people that touch it, the less likely it is to be delivered. We always have (network) assets to survive a link failure.”

Amazon’s network spans everything from dense fiber connectivity between its data centers (which are grouped in regional clusters and availability zones) all the way up to long-haul fiber and even undersea cables.

AWS Distinguished Engineer James Hamilton shares a map of Amazon’s global private network Tuesday night at the AWS re:Invent conference in Las Vegas. (Screen shot via Amazon video)

Amazon Web Services organizes its cloud infrastructure into regions, each containing a cluster of data centers. Each region contains multiple Availability Zones, providing customers with the option to mirror or back up key IT assets to avoid downtime.

To illustrate the breadth and density of Amazon’s network, Hamilton provided an overview of the fiber that ties together an Amazon region with five availability zones. He didn’t name the region, but it’s clearly the US-East region in Northern Virginia, which is the only AWS region that has five AZs. A huge chunk of AWS infrastructure is concentrated in Loudoun and Prince William counties, where the company has at least 25 data centers and is expanding rapidly.

Each region is supported by two transit centers: data centers that connect to Amazon’s global fiber, provide interconnections with other networks, and supply 100 Gbps connections to the other facilities in the region.

“We’re running a lot of redundant fiber between these buildings,” said Hamilton.

How much fiber? A total of 3,456 fibers run through two-inch conduit spanning the region, with a total of 242,374 fiber strands run through US-East. AWS pays close attention to cable management within those conduits, enabling it to pack more capacity into each conduit and quickly ID and repair cabling problems. “It’s saved us a ton of money, because we have so much fiber,” he said.

AWS Distinguished Engineer James Hamilton speaks Tuesday night at the AWS re:Invent conference in Las Vegas. (Screen shot via Amazon video)

Amazon’s private fiber network connects its 14 regions around the globe, which include US regions in Ohio, Oregon and Northern California as well as Virginia. AWS plans to add four more regions next year as it continues its global growth. It hasn’t announced the locations for the new regions, but Reuters reported Wednesday that AWS is in talks with an Italian utility about converting several former power plants into data centers.

One clear area of investment focus is Asia. That’s why Amazon is a partner in the Hawaiki trans-Pacific submarine cable project, which will provide added connectivity between Australia and New Zealand and the United States, running through Hawaii to a landing station on the Oregon coast. The cable is 14,000 kilometers in length, and runs at a depth of 6,000 meters, or about 3.7 miles beneath the sea.

Custom Gear Boosts Control, Efficiency

Amazon has been building its own network hardware for years. “We run our own custom-built routers,” said Hamilton. “It’s built to our specs. As big as the cost gain is – and it’s pretty big – the biggest gain is in reliability.

“Our networking gear has one requirement: ours,” Hamilton continued. “As fun as it would be to add a lot of features, it would be less reliable. So we just don’t do it.”

Amazon’s network gear currently uses a Tomahawk Ethernet ASIC from network vendor Broadcom, which supports 128 ports of 25Gbps Ethernet. But there’s more innovation to come.
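
For a sense of scale, the port count and port speed quoted above imply the chip's aggregate bandwidth. The short Python sketch below does that back-of-the-envelope arithmetic; the function name is illustrative only, and the result is a simple product of the two quoted figures, not a vendor specification.

```python
# Back-of-the-envelope arithmetic using the figures quoted in the article:
# 128 ports, each running 25Gbps Ethernet. The helper name is hypothetical.

def aggregate_gbps(ports: int, gbps_per_port: int) -> int:
    """Return total bandwidth in Gbps across all ports."""
    return ports * gbps_per_port

total_gbps = aggregate_gbps(128, 25)
print(f"{total_gbps} Gbps = {total_gbps / 1000} Tbps")  # 3200 Gbps = 3.2 Tbps
```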

A slide from James Hamilton’s presentation offers an overview of specs for Amazon’s custom networking silicon. (Source: Amazon)

Amazon’s custom Annapurna ASIC will provide “second generation Enhanced Networking,” enabling AWS to boost performance and efficiency by controlling its networking silicon, hardware and software.

Hamilton said Amazon has also benefited from moving to a 25Gbps networking scheme, which emerged as an alternative to the IEEE standard 10G and 40G gear.

“We jumped on 25G early,” said Hamilton. “We love where we are right now. We’re confident that 25G is the right wave. We buy enough hardware that it doesn’t matter. Vendors are always willing to work with us.”

Sleek Servers Yield Power Savings

Hamilton also offered the re:Invent crowd a look at some recent (albeit not current) Amazon hardware. This included a 42U storage rack packed with 1,110 disks, which adds up to 8.8 petabytes of storage. The downside: the rack weighs 2,778 pounds. Be careful rolling that one around the data hall!
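
Those rack figures also hint at the size of the drives inside. The sketch below divides the quoted capacity by the quoted disk count; it assumes decimal units (1 petabyte = 1,000 terabytes) and evenly sized disks, neither of which Hamilton specified, and the function name is illustrative.

```python
# Rough per-disk capacity implied by the rack figures in the article.
# Assumptions (not from the talk): decimal units and identical disks.
TB_PER_PB = 1_000

def tb_per_disk(rack_pb: float, disks: int) -> float:
    """Return average capacity per disk in terabytes."""
    return rack_pb * TB_PER_PB / disks

per_disk = tb_per_disk(8.8, 1_110)
print(f"About {per_disk:.1f} TB per disk")
```

The result lands just under 8 TB per disk.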

The presentation also included a look at a recently retired Amazon 1U server design, which was unusually roomy inside the chassis, providing extra room for airflow and cooling:

Hamilton shows off a recent vintage AWS cloud server, which is notable for its simple design and efficient use of space for airflow and cooling. (Image via Amazon video)

“This is a winning design,” he said. “But it’s a very different design than what’s out there with most vendors.”

Hamilton said AWS likes a simple approach that trades component density for power efficiency and the ability to operate in warmer environments, which allows data center providers to save money on cooling. At the server level, tiny gains in power efficiency add up as they ripple across the huge AWS footprint to create significant savings. The company runs between 50,000 and 80,000 servers in each data center, he said, with several Availability Zones spanning more than 300,000 servers.

On the data center design front, Hamilton said AWS is building slightly larger data centers. The company has traditionally built new facilities with 25 to 30 megawatts of power capacity, but now targets new builds for 32 megawatts.

“We could easily build 250 megawatt data centers,” said Hamilton. “As you get bigger, the gains (from economies of scale) are relatively small.

“This is about the right size facility,” he added, noting the importance of limiting the size of its failure domains. “It costs us a little more, but we think it’s the right thing for our customers.”

About the Author

Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
