Deploy a development High Performance Storage System (HPSS) for evaluation
The National Science Foundation TeraGrid uses high-performance networks to integrate supercomputers, data archives, and data analysis facilities around the country. Its coordinated work environment enables researchers throughout the United States to collaborate on especially challenging scientific questions, and to process vast amounts of data that would not be manageable on smaller or isolated computing systems.
In FY2007 CISL deployed a portion of its IBM Blue Gene/L supercomputer, frost, on the TeraGrid. Frost has been an operational TeraGrid resource since August 1, 2007, and can provide 4.5 million CPU hours annually to the TeraGrid research community.
In FY2008 CISL began deploying an IBM High Performance Storage System (HPSS) to complement frost as a TeraGrid resource for the research community and to evaluate HPSS as a future CISL data archive solution. A number of the TeraGrid Resource Providers have already deployed HPSS. The deployment of HPSS at NCAR will enable a homogeneous storage solution for the TeraGrid, enable potential direct connectivity between the data archive and Wide Area Network (WAN) filesystems on the TeraGrid, provide a learning opportunity in data management system administration in a security environment external to the UCAR security perimeter, and give CISL staff the hands-on experience with HPSS needed to evaluate it as a future CISL data archive solution.
This effort supports NCAR's strategic priorities of "Developing and providing advanced services and tools" and "Engaging a broader and more diverse community."

The figure shows the initial HPSS deployment configuration, sized to support 1 petabyte of total storage. This configuration was developed with the IBM HPSS support team to support the HPSS evaluation system. This entry-level configuration will enable CISL to deploy HPSS as a TeraGrid resource and to evaluate it as a future CISL data archive solution.
A modestly sized HPSS configuration was developed in concert with the IBM HPSS support team. The configuration included a minimal number of HPSS servers, a minimal amount of disk cache space, and minimal tape resources. The HPSS servers and HPSS service support agreement were purchased in FY2008. The HPSS servers were installed within the confines of the UCAR security perimeter and will be moved outside the perimeter in FY2009. The HPSS software was loaded and operationally validated. The initial disk cache space will be acquired in early FY2009. The tape resources, consisting of an automated tape library, five tape drives, and 1 petabyte of tape media, are included in the Augmentation of the Mass Storage Tape Archive Resources (AMSTAR) Request for Proposals. The AMSTAR equipment is also expected to be installed in early FY2009.
An initial 30 terabytes of disk cache will be acquired and deployed in early FY2009. The disk cache will buffer active data on low-latency storage, thus reducing end-user access time to that set of data. AMSTAR deployment in early FY2009 will complete the initial HPSS system configuration with tape library, drive, and media technology. After the disk cache and tape resources are in place, a soon-to-be-released enhanced version of HPSS will be installed, providing optional Hierarchical Storage Management (HSM) integration with the IBM General Parallel File System (GPFS). GPFS is currently deployed as a WAN filesystem resource on the TeraGrid. Final deployment of the HPSS system outside the UCAR security perimeter will then commence, resulting in a fully operational HPSS system ready for TeraGrid use and CISL evaluation. End-user interfaces will be one of the first evaluation efforts. Standard HPSS user interfaces, custom interface options, and HSM capabilities will be investigated over the four-year life of the project.
The NCAR HPSS is managed by CISL under the UCAR/NSF Cooperative Agreement and is supported by NSF Core funds including CSL funding.
