Production deployment of bluefire supercomputer

During FY2008, NCAR took delivery of the first IBM Power 575 Hydro-Cluster supercomputer ever installed, continuing a 30-year tradition of providing a sustainable high-performance computing environment to the atmospheric sciences community. The bluefire system is based on the world's fastest microprocessor, the IBM POWER6 dual-core chip with a 4.7 GHz clock speed. In June 2008, bluefire was ranked as the 30th most powerful computer in the world by the Top500 project.

Photo: Chilled-water heat sinks cover the 16 POWER6 chips in one node of the bluefire cluster. Within 11 cabinets, each weighing 3,600 pounds, bluefire contains 128 such nodes. The unique liquid cooling system, in which chilled water is carried by copper tubes directly to the chips, is 33% more energy efficient than air cooling.

Photo: The chilled-water reservoir during installation, showing two 1,500-gallon storage tanks connected in series by copper pipes (before the pipes were insulated to prevent condensation in the computer room). Chilled water from the tanks is used to prevent bluefire from overheating in the event of a power outage that could momentarily disrupt the supply of chilled water.

On April 24, 2008, NCAR took delivery of an IBM Power 575 Hydro-Cluster supercomputer, the first in a highly energy-efficient class of machines to be shipped anywhere in the world. The system, named bluefire, has 4,064 POWER6 processors and a peak computation rate of over 76 teraflops, and it consists of 11 cabinets each weighing 3,200 pounds. Bluefire is over three times more powerful and, measured in floating-point operations per watt, three times more energy efficient than the supercomputers it replaced, bluevista and blueice. For historical reference, it is over a million times more powerful than the first recognized supercomputer, the Cray 1-A, which NCAR used from 1977 to 1986.

Impact on science

Scientists at NCAR and across the country will use bluefire to accelerate research into climate change, including future patterns of precipitation and drought around the world, changes to agriculture and growing seasons, and the complex influence of global warming on hurricanes. Researchers will also use it to improve weather forecasting models so society can better anticipate where and when dangerous storms may develop or hurricanes may strike. The system will also allow scientists to study in unprecedented detail the relationship between solar processes and weather on Earth, gain a deeper understanding of turbulence, and develop and refine models that simulate many of the processes that make up Earth's climate system.

During the first quarter of FY2009, several projects will have the opportunity to study challenging problems by participating in the Accelerated Scientific Discovery program on bluefire, in which large amounts of capability computing resources are dedicated to a few users. The projects include nested regional climate modeling, the effect of anthropogenic radiative forcing on convective storms, the role of eddies in ocean circulation variability, the effect of the corona on the Earth's magnetosphere and ionosphere, and a study of winter precipitation, snowpack, and runoff from Colorado's headwater basins using a high-resolution model.

Researchers will rely on bluefire to generate the climate simulations necessary for the next report on global warming by the Intergovernmental Panel on Climate Change (IPCC), which conducts detailed assessments under the auspices of the United Nations. The IPCC was a recipient of the 2007 Nobel Peace Prize.

The most powerful microprocessor available

Bluefire houses the new POWER6 processor, the world's fastest microprocessor with a clock speed of 4.7 GHz. The system consists of 4,064 processors, 12 terabytes of memory, and 150 terabytes of FAStT DS4800 disk storage.
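As a rough cross-check of the figures quoted in this article, the peak computation rate can be estimated from the processor count and clock speed. The sketch below is only a back-of-the-envelope calculation, not a published NCAR or IBM formula. The value of 4 floating-point operations per clock cycle per processor is an assumption that does not appear in the article, and the function names are hypothetical. Memory is treated as decimal terabytes, which is also an assumption.

```python
# Back-of-the-envelope estimate of bluefire's peak rate and memory per processor.
# Only the values marked "from the article" come from the text; the rest are assumptions.

PROCESSORS = 4_064        # from the article
CLOCK_HZ = 4.7e9          # from the article (4.7 GHz)
MEMORY_BYTES = 12e12      # from the article (12 terabytes, ASSUMED decimal units)
FLOPS_PER_CYCLE = 4       # ASSUMPTION: not stated in the article


def peak_teraflops(processors: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak = processors x clock rate x operations per cycle, in teraflops."""
    return processors * clock_hz * flops_per_cycle / 1e12


def memory_per_processor_gb(memory_bytes: float, processors: int) -> float:
    """Average memory available per processor, in decimal gigabytes."""
    return memory_bytes / processors / 1e9


peak = peak_teraflops(PROCESSORS, CLOCK_HZ, FLOPS_PER_CYCLE)
mem = memory_per_processor_gb(MEMORY_BYTES, PROCESSORS)

print(f"Estimated peak: {peak:.1f} teraflops")
print(f"Memory per processor: {mem:.2f} GB")
```

Under these assumptions the estimate comes to about 76.4 teraflops, consistent with the "over 76 teraflops" figure quoted for the system, and just under 3 GB of memory per processor.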

Within the landscape of high-performance computing technology, bluefire is on the leading edge. Bluefire is the second phase of a system called the Integrated Computing Environment for Scientific Simulation (ICESS) at NCAR. After undergoing acceptance testing during June 2008, it began full-scale operations in August 2008, and will provide supercomputing support for researchers at NCAR and other organizations through 2011.

Return to liquid-cooled hardware

Bluefire relies on a unique, water-based cooling system that is 33 percent more energy efficient than traditional air-cooled systems. Heat is removed from the electronics by water-chilled copper plates mounted in direct contact with each POWER6 microprocessor chip. As a result of this water-cooled system and POWER6 efficiencies, bluefire is three times more energy efficient per rack than its predecessor.

Preparing the Mesa Laboratory computer room for bluefire was a significant challenge for the NCAR facilities engineering staff. Because bluefire was the first IBM Hydro-Cluster installed in the field, nearly everything about its installation was innovative. The chilled water required to cool bluefire had to be tapped from the existing dual 450-ton water chillers. In addition, to mitigate the impact of a power outage in which NCAR's water chillers would fail over to the twin backup system, two 1,500-gallon chilled-water storage tanks were installed in the NCAR machine room. The first technical specifications from IBM were obtained in August 2007, and the work was completed by April 2008. The project faced numerous technical challenges, including dynamic scope changes from IBM, the discovery of insufficient facility capabilities, and new fabrication techniques. All were overcome, and the project was completed on time and within budget.

Supporting NCAR's strategic plan

The ICESS project in general and bluefire in particular advance NCAR's strategic goal to "Provide robust, accessible, and innovative information services and tools" and NCAR's strategic priority of "Enhancing capability and capacity of NCAR supercomputing." Notably, bluefire expands and extends the CISL-developed on-demand capability computing model that enables the entire cluster or portions of it to support dedicated or shared special computing campaigns, such as the Accelerated Scientific Discovery campaign or this year's High Resolution Hurricane Simulation Special Computing Campaign.

This project is made possible through NSF Core funds, including CSL funding.