
The Cyberinfrastructure Strategic Initiative: Realizing an Earth System Knowledge Environment, Building a science gateway framework, and the Community Data Portal

The NCAR Cyberinfrastructure Strategic Initiative (CSI) was originally proposed as a collection of strategic activities that spanned data and knowledge management, collaboration environments, and advancing our web presence. Having accomplished our goals and realized production capabilities in the latter two, we now focus primarily on advancing data and knowledge environments and aggressively developing our opportunity space in this and related areas. The CSI effort currently funds a collection of strategic and opportunity-development activities, along with core foundational thrusts including the development of the ESKE Science Gateway Framework and the Community Data Portal (CDP). Our overarching goals are to build the cyberinfrastructure, integrate and extend the information technology, develop critical relationships with scientific and educational projects, and foster the human resources and culture we need to develop our Earth System Knowledge Environment (ESKE) effectively.

This image is a snapshot of a prototype interface that bridges across models and data. Developed on the emerging ESKE Science Gateway Framework (SGF), it reflects collaboration across multiple projects, including the Cyberinfrastructure Strategic Initiative, the Earth System Curator (ESC), and the Earth System Grid. The SGF provides a powerful new capability: semantic organization of digital objects. Once digital collections are classified as part of an ontology, Semantic Web technologies may be employed to provide a "faceted search" capability such as the one depicted above. While this is useful in its own right, it paves the way to deliver systems with semantically mediated workflows, where data integration happens behind the scenes, allowing scientists to acquire the resources they need without necessarily having to know about all of the complexity and heterogeneity that lies underneath.
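To make the idea of faceted search concrete, here is a minimal sketch in Python. It is not the SGF implementation, which relies on ontologies and Semantic Web technologies; it uses plain dictionaries instead. The records, the facet names (model, resolution, variable), and the function names are all hypothetical and chosen only for illustration. The sketch shows the two operations a faceted interface needs: narrowing a collection by the facet values a user has selected, and counting the values that remain available within each facet.

```python
from collections import Counter

# Hypothetical catalog records; the fields and values are illustrative only
# and are not taken from any real SGF or CDP collection.
RECORDS = [
    {"id": "run-1", "model": "Model A", "resolution": "1 degree", "variable": "temperature"},
    {"id": "run-2", "model": "Model A", "resolution": "2 degree", "variable": "wind"},
    {"id": "run-3", "model": "Model B", "resolution": "1 degree", "variable": "temperature"},
    {"id": "run-4", "model": "Model B", "resolution": "1 degree", "variable": "wind"},
]

# Hypothetical facets: the classified properties a user can filter on.
FACETS = ("model", "resolution", "variable")


def filter_records(records, selections):
    """Return the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(records, selections, facets=FACETS):
    """Count the remaining values of each facet among the matching records."""
    matching = filter_records(records, selections)
    return {
        facet: dict(Counter(record[facet] for record in matching if facet in record))
        for facet in facets
    }


if __name__ == "__main__":
    chosen = {"resolution": "1 degree"}
    print([record["id"] for record in filter_records(RECORDS, chosen)])
    print(facet_counts(RECORDS, chosen))
```

In a semantically organized system the facet values would come from ontology classes and relationships rather than hard-coded dictionary keys, but the narrowing and counting steps follow the same pattern.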

The Community Data Portal (CDP) is aimed at developing and delivering innovative cyberinfrastructure that provides a shared technology base and facility for data and knowledge management for a broad set of digital holdings across UCAR, NCAR, and UOP. The basic idea is to develop and deliver the foundations for building "science gateways" and knowledge environments, including a broad spectrum of functionality spanning data search and discovery, semantic organization, catalogs and metadata browsing, support for virtual organizations, data download and upload, publishing, digital preservation, and analysis and visualization services. The CSI thus plays an important role in supporting NCAR's strategic plan including "engaging a broader and more diverse community in the atmospheric and geosciences," "developing and providing advanced tools and services," and "creating an Earth System Knowledge Environment."

Developing the ESKE Science Gateway Framework (SGF): We made significant progress developing the next-generation ESKE Science Gateway Framework (SGF) in FY2008. The new SGF will include support for virtual organization branding/skinning; federated identity, authorization, and security; search and browse functions; publishing workflows; data management, analysis, and visualization; access to MSS/HPSS holdings; data preservation; metadata, ontology, and semantic support; support for community annotation and tagging; metrics; workflows; federation with ESG, THREDDS, WMO, GCMD, CADIS, Google Earth, etc.; GIS capabilities; and support for a wide variety of digital object types. The result will be an Open Source software product, and our first production release will occur in early FY2009. We also interfaced the SGF with the LAS (Live Access Server) and TDS (THREDDS Data Server) to enable data subsetting, post-processing, and visualization. We leveraged emerging OpenID technology to enable federated authentication and attribute exchange, and we developed and implemented a model for federated group membership management.

Supported numerous data, modeling, and technology projects: In FY2008, we continued to support the spectrum of existing projects and also worked with a number of new ones. This included IPCC, the THORPEX Interactive Grand Global Ensemble (TIGGE), the NCAR GIS Strategic Initiative, HAO's TGCM project, the Whole Atmosphere Community Climate Model (WACCM), ACME-07, the Cooperative Arctic Data and Information Service (CADIS), the Earth System Modeling Framework (ESMF), the NSF Earth System Curator (ESC) project, the IHOPE/ARCHEOMEDES project, and the Google Earth Opportunity Fund effort.

The World Meteorological Organization Information System (WMO-WIS): In FY2008 the CSI also supported collaborative efforts to develop the World Meteorological Organization (WMO) Information System (WIS), with CISL staff serving on several WMO committees and expert teams, including the WIS metadata group (IPET-MI), the WIS data and codes group (ET-CTS), the WIS global federation group (ET-WISC), and the WMO Intercommission Coordination Group (ICG-WIS). The CSI also supported contributions to the organization and development of the Global Earth Observing System of Systems (GEOSS) effort.

THORPEX Interactive Grand Global Ensemble (TIGGE): CSI-supported staff continued to engage in the design, software engineering, and deployment of core TIGGE systems that are based in large part on underlying CDP technology. We completed and released a new version of the TIGGE portal that fulfilled the Phase I data access requirements of the project, including file-based access, data subsetting, geographic data selection, and regridding.

2008 NCAR DyCore workshop: As depicted in the image above and described in the highlight "Cyberinfrastructure for next-generation atmospheric models," the CSI supported a collaboration that spanned this initiative, the Earth System Curator project, and the Earth System Grid to support the 2008 DyCore workshop on model inter-comparison, including semantic-based search, data distribution, comparison of model configuration, and track-back functionality.

Opportunity development: The CSI successfully developed a funding stream for the Chronopolis digital preservation effort (funded by the U.S. Library of Congress), continued to work with EOL and other partners to develop support for the Virtual Operations Center (VOC), and contributed to NSF DataNet and TeraGrid proposals.

Overall, the CSI has impact ranging from local to global, with a solid track record of building important new collaborations.

In FY2009 we will continue to pursue opportunity development in the areas of science gateway and portal development; semantic and knowledge systems; integrated data management, analysis, and visualization environments; and digital preservation initiatives. The initiative now has a large operational responsibility, and we will keep working with a large number of projects and customers to maintain good service, learn from it, and evolve our capabilities accordingly. From a technology standpoint, we will migrate all of our existing science portals to the new Science Gateway Framework during FY2009. We will continue our contributions to the IPCC, WMO-WIS, and GEOSS efforts, working with international partners to realize our vision of globally federated data and knowledge environments. We also intend to develop a thrust on environmental/climate Science Gateways for the TeraGrid in FY2009. The overarching theme for the upcoming year is cross-project integration in the ESKE context, with an emphasis on establishing ESKE support foundations.

This project is supported through NCAR Strategic Initiative funding and NSF Core funding, augmented by specific project support as described throughout this report.