Scientific Working Group on Evaluation and Diagnostics

The SWG on Evaluation and Diagnostics exists to foster activities that help describe, document and understand the performance of integrated assessment models so as to improve model performance and reliability and to make recommendations in the areas of priorities for community activities and standards of best practice.

Potential users and critics have asked, and will continue to ask, integrated assessment modelers why they should have confidence in the results flowing from integrated assessment models (IAMs), or what level of confidence they should have in those results. Like all models, IAMs describe a critical set of relationships between assumed values for key inputs, which lie outside the model, and variables of interest that the model predicts, albeit as contingent predictions. For IAMs, exogenous assumptions typically include demographic characteristics, economic growth, technology characteristics and availability, and the policy environment. IAMs produce contingent predictions of such variables as energy supply, demand, transformation, trade, and prices; agricultural supply, demand, land use, land cover, and prices; carbon and other greenhouse gas prices; and the economic cost of policy interventions.

Determining the usefulness of a given model for a given purpose is both art and science. The purpose of the model is critical to determining usefulness. It is also critical to determining the right set of tests to apply in the evaluation and diagnosis of a given model.

A variety of techniques have been applied to evaluate and diagnose models. While the SWG will not undertake model evaluation, diagnostics or validation, the SWG can undertake activities that help shape and facilitate such activities.

A number of community projects have begun work in the model diagnostics and evaluation domain. Workshops on model evaluation and diagnostics have been conducted by the US DOE-funded PIAMDDI and the EU FP7-funded AMPERE projects. These workshops helped to enhance our understanding of the various approaches that contribute to model evaluation and of the analogies to approaches in other communities, such as the operations research and climate modelling communities. AMPERE developed a set of diagnostic indicators to better understand the underlying reasons for similarities and differences in model behavior. PIAMDDI continues to develop methods for model intercomparison and diagnostics. The EU FP7-funded ADVANCE project aims to develop standards for model documentation and to establish standard tests for diagnosing IAM behavior.

  • Information on updates to the SWG from the Eighth Annual Meeting of the IAMC in November 2014 can be found here (PDF download)
  • Information about PIAMDDI activities can be found here (PDF download)
  • Information about AMPERE activities can be found here
  • Information about ADVANCE activities can be found here


Planned activities

The SWG has yet to set an agenda for its work and is considering a wide range of potential activities. Not all of these can be undertaken simultaneously, so it is important to begin getting feedback from SWG members and the broader IAMC community on priorities for future work. Some potential activities are listed below:

  1. Framing: Facilitating a discussion about adequate conceptual frameworks for IAM evaluation, including experts from outside the IAM community. Part of this could be the articulation of important model outputs that IAMs produce and the questions this community is trying to answer.
  2. Information Exchange: Identify IAM evaluation and diagnostic community activities that are underway and of relevance for IAMC members. Do we want to create a reading list on model diagnostics and evaluation and post it on the SWG web site?
  3. Model Validation Against Historical Data: Model validation experiments using historical data (sometimes referred to as “hindcast” exercises) have been conducted by a number of teams to test IAM behavior against historical developments. There are a host of questions and challenges to resolve, such as which historical period to prescribe, which common input assumptions to use, and which historical developments to compare with model outputs. The SWG could facilitate the discussion of these challenging questions among interested teams and projects. Do we want to write a joint paper for the IAMC on this topic?
  4. Model Evaluation Against Historical Patterns: The SWG could facilitate the systematic exploration of robust historical socio-economic patterns of energy and land use and economic activity that may continue to hold in the future, at least under baseline conditions. (This approach is sometimes referred to as “stylized facts”.) Such patterns may be of particular relevance for model evaluation, and the SWG could support activities that try to employ historical patterns for behavior testing of models (complementary to Historical Data Validation, above).
  5. Model Diagnostics: The SWG, and the IAMC more generally, could support the use of a repository of model diagnostic runs that have been developed in model comparison exercises and elsewhere. Such a repository could serve as a source of information for researchers attempting to understand model performance and develop relevant indicators.
  6. Developing standards: Do we want to make recommendations for evaluation and diagnostic activities? Traditionally we have undertaken model intercomparison projects that have been designed to reveal the variety of model results that characterize high-level responses to partially-standardized scenarios. Do we want to recommend community standards for experiments that compare results to empirical observation as well as relative to other models (diagnostics)? Do we want to identify and recommend further developments in the area of evaluation and diagnostics, e.g. experiments that reveal the structure of the models or their behavior in extreme conditions?
  7. Model documentation: Design standards of good practice covering, for example, model documentation, model data documentation, and model code availability. Do we want to write a set of standards for model and data documentation and model availability and certify models that meet these minimum standards?
  8. Data needs: Identify data requirements, including historical and prognostic data and standards of accounting and measurement (coordinate with the SWG on Data Protocols and Management), and assess data availability. Work with an existing data center (e.g. HYDE)?




Pacific Northwest National Laboratory (PNNL), Joint Global Change Research Institute at the University of Maryland [USA]


Executive Members

  • Elmar KRIEGLER, Potsdam Institute for Climate Impact Research (PIK) [Germany]
  • John WEYANT, Stanford Energy Modeling Forum (EMF) [USA]


IAMC members who are interested in participating in the work of the SWG should contact the SWG Co-Chairs.