Best Practice for Using the CMIP REF Output

The CMIP Rapid Evaluation Framework (REF) represents a significant advancement in systematic climate model evaluation, offering standardized diagnostics, performance scores, and visualization tools that enable rapid comparison across models. As CMIP outputs have expanded beyond their original research context into decision-relevant applications, spanning IPCC reports, national risk assessments, adaptation and resilience planning, and climate services, structured guidance on how to interpret and apply community tools and outputs such as the REF has become a priority. The uptake of these tools and data has outpaced the development of the norms, cautions, and interpretive guidance needed to prevent unintentional misuse.

This guidance document emerged from a recognized gap at the interface of model evaluation tools and responsible use: REF diagnostics are powerful, but their summary scores can easily be mistaken for definitive quality rankings of models rather than relative, context-dependent indicators for model use. The document was developed jointly by the RIfS-CMIP Responsible Data Use Task Team, the Model Benchmarking Task Team, and those involved in the development of the REF. It draws on recommended best practices in model evaluation and the broader literature, with input from the CMIP Steering Committee and user communities such as IPCC authors.

The resulting document provides high-level principles for rigorous and appropriate use of REF outputs, alongside a set of reflective considerations and basic checklists to help users think carefully before drawing conclusions from them. It addresses the framework’s intended purpose as a screening and comparison tool, the proper relationship between model skill scores and the underlying diagnostics they summarize, the role of reference dataset uncertainty, and the inherent trade-offs in model behavior and skill across different phenomena and processes. The goal is to support informed judgment, i.e., helping users recognize what questions to ask and what limitations to account for, rather than to prescribe specific workflows or provide detailed application-level guidance.

The document represents a first step. Future work will expand on these foundations to include more concrete examples and guidance on use cases, including using REF outputs as a first step in assessing model fitness-for-purpose.

Please cite the document as:

Morrison, M. A., Daba, D., Abrha, H., Chung, C., Samanta, D., Hoffman, F. M., Swaminathan, R., Brands, S., and Bonou, F. (2026), Best Practice for Using the CMIP Rapid Evaluation Framework Output. https://doi.org/10.5281/zenodo.18775782.