JD Cloud first entered the Chinese cloud computing market in 2016. By 2021, it had the fifth-largest share of the IaaS market in China. As more users transitioned complex applications to the cloud, the company wanted to reduce hardware failures and the resulting downtime. It noted that 37% of JD Cloud data center hardware failures were caused by memory failures. JD Cloud and Intel worked together to develop a failure recovery system that allows the company to predict and rapidly recover from memory errors.
A new white paper from Intel explores how Intel MCA Recovery + MFP has helped JD Cloud provide efficient and stable cloud computing services to its more than 2,500 partners. The author discusses the types of memory errors the JD Cloud host was facing, as well as the consequences of those errors. The paper then explains what Memory Failure Prediction (MFP) and Machine Check Architecture (MCA) Recovery are, and how they were deployed in the JD Cloud scenario.
The white paper includes charts and schematics that illustrate MCA memory error recovery and the architecture of MCA Recovery in JD Cloud’s Failure Recovery System.