Over the past 18 months, the Argonne Leadership Computing Facility (ALCF) team has been building the Aurora supercomputer. The system comprises 10,624 server blades, each equipped with state-of-the-art components that together deliver its computational power.
The Aurora supercomputer is a collaborative effort between Argonne and multinational semiconductor and technology companies. With a computing power of over two exaflops, equivalent to more than two billion billion calculations per second, Aurora is poised to revolutionise scientific endeavours across various domains.
The supercomputer is expected to make substantial contributions in areas such as combating climate change, advancing cancer research, exploring space, and optimising clean energy solutions.
Its computational power will enable researchers and scientists to tackle complex challenges, unlocking new insights and driving breakthroughs in these fields.
In an interview, ALCF project director Susan Coghlan emphasised the team’s commitment to the Aurora project. “From the moment the first components arrived in November 2021, we have been fully immersed in the installation process of Aurora,” she explained.
Before installation began, extensive renovations were carried out at the facility. These included expanding the data centre infrastructure to accommodate the scale of the supercomputer and constructing specialised mechanical rooms and equipment capable of handling the heightened power and cooling requirements.
The blades are Aurora’s fundamental building blocks. These rectangular units house the processors, memory modules, networking components, and advanced cooling technologies.
Integrating these components within each blade allows Aurora to operate efficiently and to tackle complex computational tasks.
Aurora’s computing power derives from advanced central processing units (CPUs) and graphics processing units (GPUs). The physical structure of the blades is also important: each blade weighs approximately 70 pounds and requires specialised machinery to be installed vertically into the racks.
A total of 166 racks are employed, each accommodating 64 blades, for 10,624 blades in all. The racks are arranged in eight rows, covering an area equivalent to two basketball courts. This layout makes efficient use of the available floor space.
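The rack and blade figures can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python. The variable names are illustrative, and the combined weight is only a rough estimate derived from the approximate 70-pound blade weight, not a figure reported by ALCF.

```python
# Figures taken from the article; names are illustrative only.
RACKS = 166
BLADES_PER_RACK = 64
APPROX_BLADE_WEIGHT_LB = 70  # approximate per-blade weight

# Total number of blades across all racks.
total_blades = RACKS * BLADES_PER_RACK

# Rough combined weight of the blades alone (excludes racks and cooling).
approx_total_blade_weight_lb = total_blades * APPROX_BLADE_WEIGHT_LB

print(f"Total blades: {total_blades:,}")
print(f"Approximate combined blade weight: {approx_total_blade_weight_lb:,} lb")
```

The blade total matches the 10,624 blades cited for the full system.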
Research teams are now transitioning their work to Aurora. This step involves scaling their applications on the complete system, an important milestone in their research.
In preparation for this migration, researchers have been using the Sunspot testbed, a testing and development environment that mirrors Aurora’s architecture on a smaller scale, with only two racks. This testing is crucial for identifying and addressing bugs or other issues before Aurora becomes fully operational.
The aim is to ensure that the system functions as intended and delivers the expected performance before it is made available to the broader scientific community. This approach underscores the team’s commitment to providing a reliable and optimised computing platform for researchers worldwide.
“Before we turn the system over to the broader scientific community, we are making sure to precisely put the Aurora through its paces, so everything will work as we intended,” Coghlan concluded.