Data support for climate and weather research

by Brian Bevirt

Weather and climate research relies on accurate data describing the Earth System. NCAR’s Research Data Archive (RDA) is a key resource for the global community studying climate, weather, and ocean-atmosphere interactions. Maintained by NCAR’s Computational and Information Systems Laboratory (CISL), the 600+ terabyte RDA offers open access to more than 600 collections. Ease of access and data content are equally important to researchers, so RDA personnel focus improvement efforts on both of these areas. Maintaining this balance ensures that the RDA will sustain its high value as a user-centered data resource.

Trained in atmospheric and oceanographic sciences, the RDA support staff is uniquely prepared to offer knowledgeable consulting on broad-ranging research data questions. The team is also expert in handling data. They use efficient methods to both import new data to NCAR and export data to users. These data professionals evaluate incoming data as part of their data curation job. Since they know how the data should “look,” their evaluation process frequently identifies data content errors that can be eliminated before releasing the product to the public. RDA management by a team of data scientists is a recognized asset among data managers worldwide.

In addition to providing, curating, and quality-controlling data, the RDA staff – in collaboration with the user community – identifies useful new data sets to be added to the archive. RDA staff support the scientists and organizations that offer data to the archive by ensuring data integrity during transfer, long-term preservation, and complete data user metrics. RDA staff relieve data providers of most of the necessary data support tasks. They also act as the first point of contact for users: answering scientific data questions that arise, providing insights on effective data manipulation software tools, and creating documentation and metadata that are both meaningful to users and accurate from the provider’s perspective. This proactive work to enhance the archive keeps the RDA relevant to a large user community and accelerates the pace of its enrichment.

Reanalysis Name                                 Time Period
NCEP/NCAR Global Atmospheric Reanalysis         1948-2010*
NCEP/DOE Global Atmospheric Reanalysis          1979-2008*
ECMWF Re-Analysis 40-year (ERA-40)              1957-2002
NCEP North America Regional Reanalysis**        1979-2010*
Japanese Reanalysis (JRA-25/JCDAS)              1979-2009*
ECMWF Interim Reanalysis (ERA-I/ECDAS)          1989-2010*
NOAA-CIRES 20th Century Reanalysis              1891-2008
NCEP Climate Forecast System Reanalysis, Ocean  1979-2009*

NCAR’s RDA is known for having nearly all the low- and high-resolution products for each reanalysis, and it is the largest collection organized in one location. All reanalyses cover the entire globe except one, and they are listed here in the order they were produced.
* These reanalyses are continually being extended by the data providers.
** Geographic coverage is centered on North America.

RDA data access has made substantial gains in the past five years. This growth and change will continue in the future, just as user expectations and data service challenges continue to evolve. Plans for the RDA in 2011 include expanding the existing online data archive to 250 terabytes, six times its current size. The RDA’s online data service is supplemented by an easy-to-use request interface that stages data from tape archive storage to disk for Internet download. This system gives users access to the complete RDA. By coordinating with user-oriented workflow improvements in NCAR’s data storage architecture, the RDA now provides data users with more efficient and significantly faster online access to the major portions of the archive. These improvements will be scaled up again to complement supercomputing at the NCAR-Wyoming Supercomputing Center (NWSC).

This visualization of wind speeds for the extremely destructive Galveston hurricane of 1900 was generated using the global NOAA-CIRES 20th Century Reanalysis data. Reanalysis data products are useful in weather and climate studies because they provide a consistent, dynamically constrained mapping of point observations onto 3D atmosphere and ocean grids.