This image shows the geographic locations of current TeraGrid Resource Providers and the 10Gb/s network links that interconnect them. The TeraGrid is a virtual facility for scientific research that integrates computational, storage, information, and data analysis resources at the San Diego Supercomputer Center, the Texas Advanced Computing Center, the University of Chicago/Argonne National Laboratory, the National Center for Supercomputing Applications, Purdue University, Indiana University, Oak Ridge National Laboratory, the Pittsburgh Supercomputing Center, and the National Center for Atmospheric Research. As a TeraGrid Resource Provider, NCAR is committed to offering a highly distributed network of computational, data, and knowledge resources to multidisciplinary groups of researchers, students, educators, impact and assessment communities, and policy makers around the world.

Priority 2: Developing and Providing Advanced Services and Tools

TeraGrid Integration

Background

The National Center for Atmospheric Research (NCAR) is dedicated to integration and collaboration across scientific projects, scientific organizations, and technology.

Progress

A key accomplishment toward this goal came in FY2006 when NCAR established a partnership with the NSF-supported TeraGrid facility and contributed to its success as a Resource Provider and catalyst for enhanced services to the Earth System science community. As a TeraGrid partner site, NCAR will offer increased access to specialized high-performance computing and data storage resources, climate data, and tools for data analysis and visualization. Access to these facilities will help Earth System scientists better understand complex phenomena such as global climate change, hurricanes and other severe storms, wildfires, air pollution, solar storms, and space weather. NCAR's Computational and Information Systems Laboratory (CISL) worked closely with TeraGrid and the NSF to establish the partnership. The opportunities afforded by this partnership enhance NCAR's ability to forge new collaborations for distributed modeling, access to geoscientific data, digital preservation and archiving, and the development and use of new Grid technologies.

Focused geoscience collaboration initiatives in converging modeling frameworks and metadata, knowledge and semantic systems, and analysis and visualization environments pave the way for the creation of a new generation of knowledge tools. CISL is playing a leadership role in the development of global, interoperable data systems including a strong contribution to the World Meteorological Organization (WMO) efforts. The confluence of these capabilities provides a foundation for delivering new, very large datasets to our community, including re-analyses, regional climate model results, and global weather ensembles. This effort supports NCAR's strategic priorities of "Engaging a broader and more diverse community" and "Developing and providing advanced services and tools."

In FY2006, a 10Gb/s network connection to the TeraGrid was established along with a new, dedicated datagrid server for an Earth System Grid (ESG) Science Gateway. At the end of FY2006, NCAR will begin serving as a conduit to Terascale climate model data through the use of Grid computing technologies. During FY2007, NCAR will begin offering an experimental storage cluster and provide access to select supercomputing resources.

NCAR's participation in the TeraGrid project is supported entirely through NSF Core funding.

Detailed Project Description

NCAR currently delivers a substantial computing resource to the Earth System sciences community, with a significant portion of this resource devoted to the university research community. NCAR produces, collects, and manages one of the world's premier collections of environmental and geoscience data. NCAR is funding a Cyberinfrastructure Strategic Initiative, one element of which is the Community Data Portal (CDP). The CDP effort is aimed at providing a unified gateway to datasets and GIS databases from a wide range of scientific resources that are broadly useful to the research community. These datasets span observed and simulated data for climate, ocean, atmospheric reanalysis, biogeochemistry, carbon cycle, weather, space physics, chemistry, and many others. The CDP is also developing hardware and software cyberinfrastructure and sustainable strategies for managing the breadth of scientific data being gathered and generated at NCAR and UCAR, so that these holdings can become knowledge resources for the broad community.

NCAR's initial resource contribution to the TeraGrid is providing access to the DOE-funded Earth System Grid (ESG) project configured as a TeraGrid Science Gateway. The primary goal of ESG is to address the formidable challenges associated with enabling analysis of and knowledge development from global Earth System models. Through a combination of Grid technologies and emerging community technology, distributed federations of supercomputers and large-scale data and analysis servers provide a seamless and powerful environment that enables the next generation of climate research. ESG provides access to a 150-TB collection of climate and ocean model data, including datasets for the Intergovernmental Panel on Climate Change. It also serves as a central point for community distribution of the Community Climate System Model (CCSM) itself, as well as other related datasets, analysis, and visualization tools.
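To give a sense of scale for the 150-TB ESG collection relative to the 10Gb/s TeraGrid connection, the following is a rough back-of-the-envelope sketch rather than a measured result. It assumes decimal units (1 TB = 10^12 bytes, 1 Gb/s = 10^9 bits per second); the function name and the efficiency parameter are hypothetical and serve only to illustrate that sustained throughput is normally below line rate.

```python
def transfer_hours(size_tb, link_gbps, efficiency=1.0):
    """Hours needed to move size_tb terabytes over a link_gbps link.

    efficiency is an assumed fraction of line rate actually sustained;
    it is an illustrative parameter, not a figure from the report.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_tb * 1e12 * 8                          # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9 * efficiency)    # bits / (bits per second)
    return seconds / 3600


# Figures from the report: 150-TB collection, 10Gb/s link.
ideal = transfer_hours(150, 10)
half = transfer_hours(150, 10, efficiency=0.5)
print(f"ideal: {ideal:.1f} h, at 50% efficiency: {half:.1f} h")
```

Under these assumptions, moving the entire collection would take on the order of a day or more even at full line rate, which suggests why serving data in place through a gateway is attractive compared with wholesale copying.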

The primary point of entry into this data and modeling collection is the ESG portal. This resource serves as a nexus for Data Grid services, a storage cache for other TeraGrid scientific projects, a conduit to NCAR's archival storage system, and a vehicle for providing a wealth of scientific environmental data to the TeraGrid community. Additional Data Grid services and geosciences-specific services and applications will be layered on the data holdings. ESG and other CDP datasets will be "published" on the TeraGrid, and access modalities will support models, applications, and web portals, adding up to a premier environmental data service for the TeraGrid and geosciences communities.

NCAR offers an abundant, exciting, and unique portfolio of science and technology projects that, we believe, constitute a primary resource contribution to the TeraGrid.

TeraGrid Background

TeraGrid is an NSF-funded national facility that integrates computational and data resources, along with security, accounting, documentation, and educational outreach services, from resource provider partners (RPs) to serve the nation's science and engineering community. Common services and integration processes and components are provided by, or in some cases coordinated by, the Grid Infrastructure Group (GIG). The GIG is responsible for architecting, planning, managing, and enhancing the TeraGrid facility, providing a core set of services, and coordinating RP staff through distributed service teams ranging from user support to security to education, outreach, and training.

The objective of the RPs and GIG is to enable scientific discovery by providing access to the highest performance resources available, integrated as a coordinated system that supports use cases ranging from exploiting a single TeraGrid resource to combining resources in specialized workflow or cooperative computing modes. Resource integration and enhancement efforts are ranked through user input and evaluations of TeraGrid services as measured by operational, system, or service use metrics.

The set of long-term TeraGrid objectives toward providing cyberinfrastructure to national science and engineering researchers can be expressed in three interdependent sets of activities.

TeraGrid DEEP encompasses a set of initiatives aimed at fully exploiting the integrated capabilities of the TeraGrid facility to support scientific discovery that would not otherwise be possible. The GIG coordinates user support staff to provide both traditional user consulting support and a program called Advanced Support for TeraGrid Applications (ASTA). ASTA assigns user support staff to dedicate 25% of their time for 6-12 months assisting a science group to enable them to fully harness TeraGrid services and resources as an integrated facility.

TeraGrid WIDE recognizes that, traditionally, NSF's high-performance computing infrastructure has focused primarily on only a small fraction of the national science and engineering community. Thus, in addition to supporting a current and growing user community, the aim is to provide TeraGrid services to many more scientists and engineers over the coming years. Such scaling requires a new model for interacting with the community and for provisioning cyberinfrastructure: the creation of science gateways.

TeraGrid's broad-impact goals also extend to students and educators. TeraGrid's Education, Outreach, and Training (EOT) program is a coordinated effort to raise the awareness of the benefits of TeraGrid within research and education communities across all disciplines and all learning levels. The EOT team works closely with the science gateways to engage significantly larger numbers of scientists, educators, and students, with an emphasis on reaching out to under-represented groups.

TeraGrid OPEN involves the provision of a persistent, reliable national cyberinfrastructure. The TeraGrid facility is architected as a set of integrated services based on open standards wherever possible and embracing the heterogeneity represented by nearly 20 unique major resources operated by TeraGrid RPs. OPEN also describes the approach to presenting TeraGrid to NSF and the community as a truly extensible and adaptable facility.

Timeframe

NCAR's deployment of TeraGrid cyberinfrastructure will be a strategic and ongoing activity. The initial deployment will consist of the necessary 10Gb/s network fabric, a data server for accessing Earth System Grid (ESG) data, a 50-TB, RAID10 high performance storage system running the Lustre file system, and a 5.7-TFLOPS-peak, 2,048-processor IBM Blue Gene/L and its associated I/O subsystem. Together, these components will create, by the middle of FY2007, a high performance TeraGrid node capable of providing both high performance computing and data services to TeraGrid users.
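The quoted 5.7-TFLOPS peak for the 2,048-processor Blue Gene/L can be checked with simple arithmetic. The sketch below assumes two hardware figures that are not stated in this report: a 700 MHz processor clock and up to 4 floating-point operations per cycle per processor. The function and constant names are illustrative only.

```python
# Assumed hardware figures (not stated in the report): each Blue Gene/L
# processor runs at 700 MHz and can retire up to 4 floating-point
# operations per cycle.
CLOCK_HZ = 700e6
FLOPS_PER_CYCLE = 4

# Processor count taken from the report.
PROCESSORS = 2048


def peak_tflops(processors, clock_hz, flops_per_cycle):
    """Theoretical peak in TFLOPS: processors x clock x flops per cycle."""
    return processors * clock_hz * flops_per_cycle / 1e12


peak = peak_tflops(PROCESSORS, CLOCK_HZ, FLOPS_PER_CYCLE)
print(f"{peak:.2f} TFLOPS peak")
```

Under these assumptions the result rounds to the 5.7 TFLOPS given above. This is a theoretical ceiling; sustained application performance would be lower.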

Subsequent out-year upgrades of the TeraGrid infrastructure will be accomplished with CISL's research equipment budget. While modest, this investment should enable CISL and NCAR to continue deploying resources of a scale sufficient to develop Grid expertise and learn vital lessons about providing domain-specific Grid services to NCAR's scientific community. In many ways it is an open-ended project and its precise direction is difficult to predict. As potential collaborations and services emerge, CISL will adapt the NCAR TeraGrid node and the objectives of the project appropriately.

Plans and Impact

In FY2007, NCAR intends to complete the deployment of the cyberinfrastructure described in the previous section. In particular, the high performance Lustre data storage system and the 10Gb/s E1200 switch will be deployed in October through November 2006. An allocation policy for this storage resource will have to be developed. To that end, NCAR is sending two staff to attend a data workshop at SDSC, scheduled for November 28 through December 1, 2006.

The deployment of the IBM Blue Gene/L supercomputer system will involve several steps. First, the system must be moved outside the UCAR security perimeter. Next, the CTSS software stack must be installed and tested on the system. These two steps, which will render the system ready for preproduction testing, will be complete by December 31, 2006. Finally, the Blue Gene/L system must be allocated as a resource. This will occur in the first quarter of 2007. By April 2007, the Blue Gene/L system should be in full production on the TeraGrid.

Several partnerships are expected to develop in FY2007. The ESG team expects to develop data federation capabilities with the Climate Center at Purdue University. Installation of the Storage Resource Broker (SRB) on NCAR's TeraGrid data server will allow the cross-archiving of data between SDSC and NCAR to begin as described under the MOU. NCAR will establish partnerships with ORNL, PSC, and Indiana University for conducting Lustre-WAN testing with these TeraGrid resource providers. Finally, the possibility of integrating the SDSC Blue Gene/L resource with the NCAR Blue Gene/L will be explored. Cross-mounting file systems using GPFS-WAN may facilitate this integration as well as the migration of users back and forth between the two systems.

Throughout the TeraGrid project, CISL has kept careful track of the staff resources required and consumed. Over FY2006, CISL measured charges to the TeraGrid activity that equate to an annual burn rate of $374,000, roughly the cost of two fully loaded Software Engineer-3 positions, although the effort is spread across several individuals. In FY2006, CISL hired a half-time TeraGrid security officer and added one full-time junior software engineer to administer the systems needed for this project.

It is expected that, as more equipment is deployed in FY2007, the load on NCAR staff will increase. To that end, CISL has made strategic adjustments to free salary and is prepared to hire another half-time software engineer/administrator to support the deployed system.

The impacts of this project are open-ended and difficult to predict. NCAR's access to and integration with TeraGrid resources will help ensure continuity of the NSF's cyberinfrastructure plans, particularly between the Office of Cyberinfrastructure (OCI) and the Geoscience Directorate. The connection itself is expected to increase the ability of NCAR scientists and geoscientists to collaborate using TeraGrid resources. The resulting collaborations will likely center on data exchanges at first, but are expected to expand into other aspects of scientific workflows, such as the sharing or coscheduling of HPC resources.

For Further Information & Details

CISL Annual Report