Example application of the Method for Object-based Diagnostic Evaluation (MODE) forecast verification technique to WRF (ARW and NMM) 24-h reflectivity forecasts from 0000 UTC 12 May 2005. ARW is the Advanced Research WRF and NMM is the Nonhydrostatic Mesoscale Model version of WRF. The ARW2 was run with approximately 2-km horizontal resolution; the other models were run with 4-km resolution. All results are based on 4-km output. MODE was applied using an 11-gridbox convolution radius and a threshold of 25 dBZ. The table shows the results of this application for several object attributes. General conclusions for this single case are that NMM produced many more objects than were observed, but its total object area was closer to the observed amount than that predicted by the other versions of the model; and that all of the model versions underestimated the average median and 0.90 quantile of intensity. These results should not be generalized, since they represent only one case.

(a) An example of the impacts of object scale on the strength of matches between forecast and observed objects. Scale is defined here by the combination of the convolution radius and threshold used to define objects in the Method for Object-based Diagnostic Evaluation (MODE) approach for spatial forecast evaluation. Total interest values shown here are computed as a weighted sum of interest values assigned to various attributes comparing forecast and observed objects (e.g., area difference, intensity difference). In this case the forecast is a 1-h nowcast from the National Convective Weather Forecast (NCWF-2) on a 4-km grid, and it is compared with a 4-km grid of observed convective coverage. The results suggest that objects defined at lower resolutions (e.g., using a larger radius and a moderate threshold for defining objects) generally have larger total interest values. The original NCWF-2 1-h forecast probability of convection for this case (20:01 UTC, 1 July 2005) is shown in panel (b).
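The weighted-sum idea behind the total interest value can be sketched in a few lines of Python. This is an illustrative sketch only: the attribute names, the piecewise-linear interest function, its breakpoints, and the weights are all hypothetical and are not the values used in MODE or in the figure. The sum is divided by the total weight here (an assumption) so that the result stays between 0 and 1.

```python
def linear_interest(difference, full, zero):
    """Map an attribute difference to an interest value in [0, 1].

    Hypothetical piecewise-linear form: interest is 1 at or below `full`,
    0 at or above `zero`, and falls linearly in between.
    """
    if difference <= full:
        return 1.0
    if difference >= zero:
        return 0.0
    return (zero - difference) / (zero - full)


def total_interest(interests, weights):
    """Weighted sum of attribute interests, normalized by the total weight."""
    weight_sum = sum(weights[name] for name in interests)
    return sum(weights[name] * interests[name] for name in interests) / weight_sum


# Hypothetical forecast/observed object pair (all numbers invented).
interests = {
    "centroid_distance": linear_interest(20.0, full=10.0, zero=50.0),
    "area_difference": linear_interest(0.25, full=0.0, zero=1.0),
    "intensity_difference": linear_interest(2.0, full=5.0, zero=20.0),
}
weights = {"centroid_distance": 2.0, "area_difference": 1.0, "intensity_difference": 1.0}

print(total_interest(interests, weights))
```

A pair of objects would then be treated as a match when its total interest exceeds a chosen cutoff; the cutoff is a further tunable choice not shown here.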

Priority 3: Conducting Research in Computer Science, Applied Mathematics, Statistics, and Numerical Methods

Verification Research

Background

Forecast verification and evaluation activities are typically based on relatively simple metrics such as the Probability of Detection, Root Mean Squared Error, and Equitable Threat Score. These metrics are useful for monitoring changes over time in single aspects of forecast performance. However, they generally do not provide information that can be used to improve the forecasts, or that end users (including forecasters) can use for decision making. Moreover, forecasts that are quite useful, including high-resolution forecasts, can receive very poor scores when evaluated with these standard metrics. In response to these limitations, the NCAR/RAL Verification Group develops improved verification approaches and tools that provide more meaningful and relevant information about forecast performance. The focus of this effort is to develop diagnostic, statistically valid approaches, including object-based evaluation of precipitation and convective forecasts and other approaches (e.g., distribution-based), that give forecast developers and forecast users more useful information about forecast performance.
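The three metrics named above, and the way they can penalize a useful forecast, can be illustrated with a short Python sketch. The formulas are the standard contingency-table definitions; the one-dimensional "fields", the threshold, and the function name are invented for illustration. In this toy case a forecast that places a feature of the right size and intensity two grid boxes away from the observed feature scores worse than a forecast of nothing at all.

```python
import math


def standard_scores(forecast, observed, threshold):
    """Return POD and ETS for exceedance of `threshold`, plus RMSE."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    total = hits + misses + false_alarms + correct_negatives

    # Probability of Detection: fraction of observed events that were forecast.
    pod = hits / (hits + misses) if hits + misses else float("nan")

    # Equitable Threat Score: threat score adjusted for hits expected by chance.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    ets = (hits - hits_random) / denominator if denominator else float("nan")

    rmse = math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / total)
    return {"pod": pod, "ets": ets, "rmse": rmse}


# Invented data: one observed feature, a displaced forecast, and an empty forecast.
observed = [0, 0, 5, 5, 0, 0, 0, 0]
displaced = [0, 0, 0, 0, 5, 5, 0, 0]
empty = [0, 0, 0, 0, 0, 0, 0, 0]

print("displaced:", standard_scores(displaced, observed, threshold=1))
print("empty:    ", standard_scores(empty, observed, threshold=1))
```

The displaced forecast is counted once as a miss and once as a false alarm, so its ETS is lower and its RMSE higher than those of the empty forecast, even though it carries more usable information about the event.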

Forecast verification is by nature a mathematical activity, and development of improved verification methods requires application of advanced mathematical, statistical, and computational approaches. Drawing inferences about the comparative performance of different forecasting models requires innovative statistical approaches because of the nature of atmospheric observations and forecasts (e.g., non-systematic observations, spatial and temporal dependencies, extreme values).

Project Description

Development and dissemination of new verification approaches requires research and application in several areas, including statistical methods, exploratory data analysis, statistical inference, pattern recognition, and evaluation of user needs. The concept of “user-focused” verification implies that the information provided is better able to meet the needs of particular types of users, from model developers to end users. The Method for Object-based Diagnostic Evaluation (MODE), developed by RAL and MMM scientists, provides an approach for diagnostic evaluation of spatial forecasts that directly measures the performance of the forecasts in terms of specific attributes (spatial displacement, intensity, storm size, and so on); attributes may also be designed to represent the use of the forecast for specific applications. This method represents one approach toward forecast evaluations and intercomparisons that are more user-focused (Figure, above).

The long-term goals of the verification research project are to:

  1. develop a stable version of MODE that can be applied in evaluations of a variety of weather (and climate) forecast variables, including precipitation, convection, and other variables that can be represented spatially;
  2. enhance the MODE approach to take into account the time dimension;
  3. develop user-relevant verification approaches in the context of the needs of specific end users;
  4. develop new user-relevant verification approaches for evaluation of probabilistic and ensemble forecasts;
  5. develop and disseminate new methods for making statistical inferences about verification measures (i.e., methods to take into account the uncertainty in verification measures); and
  6. continue to facilitate activities of the international verification community, and further advance the use of improved verification measures in operational settings.
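Goal 5 in the list above, accounting for the uncertainty in verification measures, can be illustrated with a percentile bootstrap confidence interval. This is a generic sketch, not the method adopted by the project: the function names and the sample data are hypothetical, and simple resampling of forecast/observation pairs assumes the pairs are independent, which spatial and temporal dependence in real verification data will often violate.

```python
import random


def mean_absolute_error(pairs):
    """Example verification measure computed from (forecast, observed) pairs."""
    return sum(abs(f - o) for f, o in pairs) / len(pairs)


def bootstrap_interval(pairs, measure, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `measure`, assuming independent pairs."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(pairs)
    values = []
    for _ in range(n_resamples):
        # Resample the pairs with replacement and recompute the measure.
        resample = [pairs[rng.randrange(n)] for _ in range(n)]
        values.append(measure(resample))
    values.sort()
    lower = values[int((alpha / 2) * n_resamples)]
    upper = values[max(int((1 - alpha / 2) * n_resamples) - 1, 0)]
    return lower, upper


# Invented (forecast, observed) pairs for illustration only.
pairs = [(2.1, 1.8), (0.0, 0.4), (5.2, 6.0), (3.3, 3.1), (1.0, 2.5),
         (4.4, 4.0), (0.7, 0.0), (2.9, 3.6), (6.1, 5.5), (1.5, 1.2)]

print(mean_absolute_error(pairs), bootstrap_interval(pairs, mean_absolute_error))
```

When two forecasting systems are compared, the same resampling can be applied to the difference between their scores on matched cases, so that an interval excluding zero suggests a real difference in performance.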

Progress

During the past year, application of the MODE approach to a variety of forecast types was investigated, including actual applications to two types of convective forecasts and to hurricane forecasts, as well as concepts for evaluating ensemble forecasts. Significant advocacy efforts were made through participation in, and leadership of, the WMO's Joint Working Group on Verification (JWGV), numerous conferences and workshops, statistical support for forecast evaluation studies undertaken by the RAL Developmental Testbed Center (DTC), and application of the MODE method in an evaluation of a forecasting system. Two papers were published in Monthly Weather Review, and presentations were given to several groups, including the NCAR Leadership Team, the AMS 2nd Community Workshop, a WAS*IS group, a Workshop on the North American THORPEX Societal and Economic Research and Applications, the International QPF Workshop, and a COMET class on hydrometeorological forecasting. A tutorial on verification methods was prepared and tested with a group of college students. This tutorial will be used at the 3rd International Workshop on Verification Methods being planned by the JWGV for January 2007 in Reading, U.K., which will include a 2½-day tutorial session. The MODE method was applied in two evaluations of a convective nowcast, with results provided to the FAA and the NWS.

Plans

To meet the needs of model and forecasting system developers, the MODE approach will be developed further and an operational version will be tested and implemented. Attributes of the approach will be more completely investigated, including extensive examination of the impacts of variations in spatial scale, as represented by the parameters used to define objects [FIGURE B]. Additional methodologies for identifying, merging, and matching objects will be tested and compared, and a basic approach for incorporating the time dimension will be included in the system. Other diagnostic spatial verification methods developed by researchers at universities and other centers will be investigated, and a community-based intercomparison of methodologies will be organized and coordinated by NCAR staff. MODE will also be applied to additional datasets and types of forecasts. An initial effort will be made to examine ensemble forecasts of precipitation from an object perspective. In this context, an ensemble of objects and their attributes will be evaluated, with the results expressed as distributions of errors in object attributes. MODE will also be applied to convective and precipitation forecasts as part of NCAR's program on Short Term Explicit Prediction.
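The two object-definition parameters discussed in the figure captions, the convolution radius and the threshold, can be made concrete with a simplified Python sketch of a convolve-then-threshold object finder. It is a sketch under stated assumptions rather than the MODE implementation: the function names, the simple circular averaging filter, the treatment of boxes outside the grid as zero, the 4-neighbor connectivity rule, and the toy field are all choices made for illustration.

```python
def smooth(field, radius):
    """Average each grid box over a circular neighborhood of `radius` boxes."""
    rows, cols = len(field), len(field[0])
    offsets = [(di, dj)
               for di in range(-radius, radius + 1)
               for dj in range(-radius, radius + 1)
               if di * di + dj * dj <= radius * radius]
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di, dj in offsets:
                ii, jj = i + di, j + dj
                if 0 <= ii < rows and 0 <= jj < cols:
                    total += field[ii][jj]
            out[i][j] = total / len(offsets)  # boxes outside the grid count as zero
    return out


def find_objects(field, radius, threshold):
    """Label 4-connected regions where the smoothed field meets `threshold`."""
    smoothed = smooth(field, radius)
    rows, cols = len(field), len(field[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for i in range(rows):
        for j in range(cols):
            if smoothed[i][j] >= threshold and labels[i][j] == 0:
                count += 1
                labels[i][j] = count
                stack = [(i, j)]
                while stack:  # flood fill from the seed box
                    a, b = stack.pop()
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        x, y = a + da, b + db
                        if (0 <= x < rows and 0 <= y < cols
                                and labels[x][y] == 0
                                and smoothed[x][y] >= threshold):
                            labels[x][y] = count
                            stack.append((x, y))
    return labels, count


# Toy field: two 3x3 cells of 40 (arbitrary units) on a 7x12 grid of zeros.
field = [[0.0] * 12 for _ in range(7)]
for r in range(2, 5):
    for c in list(range(1, 4)) + list(range(8, 11)):
        field[r][c] = 40.0

for radius, threshold in [(1, 25.0), (3, 5.0), (3, 25.0)]:
    print(radius, threshold, find_objects(field, radius, threshold)[1])
```

In this toy field, a small radius with a high threshold keeps the two cells as separate objects, a larger radius with a lower threshold merges them into one object, and a larger radius with the high threshold removes them entirely, which is the sense in which the radius and threshold together set the object scale.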

New methods for quantifying the uncertainty in verification measures (e.g., through statistical confidence intervals) will be developed and implemented in the verification system under development by the Joint Numerical Testbed (JNT) Program. A workshop focusing on developing a state-of-the-art verification system for the JNT will also be organized; it will follow the 3rd Workshop on Verification Methods being organized by the JWGV, which is led by B. Brown. The concept of user-focused verification will be further developed and presented to the forecasting and verification communities. Initial approaches leading to greater user focus will be identified and investigated, in conjunction with applications of MODE, the development of the JNT verification system, and the quality assessment activities of the FAA program.

Impact of the Program

This project will lead to improved statistical diagnostic methodologies for evaluation of forecasts. Application of these methods will aid in improving many types of weather, climate, and hydrometeorological forecasts by providing meaningful feedback to forecast developers, and will provide information to decision makers that can be used for better decision making.

For Further Information

RAL Annual Report