Checking your assumptions
A workflow-based method to check the suitability of assumptions in complex statistical models gives researchers an efficient tool for selecting the right model.
King Abdullah University of Science & Technology (KAUST)
Image: A new KAUST-developed workflow helps researchers efficiently test and validate the assumptions underlying complex statistical models to ensure more reliable results. Credit: © 2025 KAUST
A three-step model-checking workflow has the potential to revolutionize how researchers evaluate the suitability of their statistical models for specific datasets. Developed by researchers at KAUST, the workflow is a computationally efficient alternative to existing model diagnostics and will give researchers more confidence in model selection[1].
“When we build statistical models to analyze complex datasets, such as climate or biomedical data, we inevitably make assumptions about the underlying statistical distributions, dependencies, and how things change over space and time,” says Rafael Cabral from the research team. “But two crucial questions are often overlooked: ‘Are my model assumptions reasonable?’ and ‘If I change these assumptions, how much would that affect the final results such as predictions or decisions?’ We have developed a principles-based way to answer those questions.”
Modern statistical models often contain multiple layers and include hidden components that cannot be directly observed. Existing model-checking tools focus only on the top, observable layer and on whether the model fits the observed data reasonably well. Fit alone, however, does not reveal whether the underlying structural assumptions, such as a ‘bell-curve’ Gaussian versus a skewed non-Gaussian distribution, are correct for the given dataset.
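To see why top-layer checks can miss a hidden assumption, consider the following minimal simulation sketch (illustrative only, not taken from the paper). It builds two toy hierarchical models that share the same Gaussian observation layer but differ in the hidden layer: one latent process is driven by Gaussian innovations, the other by heavy-tailed Student-t innovations rescaled to the same variance. All distributions, parameters, and variable names here are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hidden layer: two latent random walks, one driven by Gaussian
# innovations and one by heavy-tailed Student-t innovations rescaled
# to the same variance. This is the structural assumption that a fit
# check on the observation layer cannot see directly.
eps_gauss = rng.normal(0.0, 1.0, n)
eps_t = rng.standard_t(df=3, size=n) / np.sqrt(3.0)  # unit variance

x_gauss = np.cumsum(eps_gauss)
x_t = np.cumsum(eps_t)

# Observable top layer: both latent fields are seen through identical
# Gaussian measurement noise.
y_gauss = x_gauss + rng.normal(0.0, 2.0, n)
y_t = x_t + rng.normal(0.0, 2.0, n)

# The usual observation-level summaries of the increments match closely
# for both models; only the extreme jumps betray the hidden non-Gaussianity.
for name, y in [("Gaussian latent ", y_gauss), ("Student-t latent", y_t)]:
    d = np.diff(y)
    print(f"{name}: sd = {d.std():.2f}, max |jump| = {np.abs(d).max():.1f}")
```

Both series pass a casual check of the observation-level spread, yet the heavy-tailed model hides occasional sharp jumps in its latent layer, exactly the kind of structural difference that fit checks on the top layer alone do not expose.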
“When a model is flawed, several problems can arise,” says Cabral. “For instance, if we incorrectly assume a Gaussian distribution for spatial models when in fact the data is non-Gaussian, we might smooth away important local patterns, which could significantly affect our conclusions and produce poor predictions.”
Existing tools work well for checking simple statistical models, but they cannot check assumptions embedded in the hidden layers of complex hierarchical models. While a complex model can be compared with alternative models to gauge its suitability, fitting multiple complex alternatives is computationally expensive, often practically infeasible, and provides little insight into which assumptions need to be reconsidered.
Cabral, together with Statistics Program chair Håvard Rue and KAUST colleague David Bolin, instead took a systematic approach to the problem, setting up a three-step process designed specifically to test the validity of model assumptions.
“First, we fit the candidate model to the data as usual. Second, we identify a plausible alternative model that relaxes some assumptions, such as allowing for non-Gaussian rather than Gaussian distributions. Third, we compute how sensitive the results are to small perturbations toward this alternative model. Unlike other methods, we can perform this sensitivity analysis using the initial model alone, without the computational expense of fitting alternative models,” Cabral says.
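The saving in the third step can be sketched in code. The toy example below assumes the alternative is reached by exponentially tilting the base posterior, p_eta(theta | y) ∝ p(theta | y) exp(eta · h(theta)); under that assumption, a standard identity says the derivative of a posterior expectation with respect to eta at eta = 0 equals a covariance under the base posterior, so it can be estimated from existing samples without refitting. The perturbation direction h, the quantity of interest g, and the fake posterior samples are all hypothetical stand-ins, not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1 stand-in: pretend the base (Gaussian) model is already fitted
# and we hold posterior samples of a latent quantity theta. Here the
# samples are faked; in practice they would come from MCMC or INLA.
theta = rng.normal(1.0, 0.5, size=20_000)

# Step 2: define the direction of the alternative as a tilt of the base
# posterior, p_eta(theta | y) ~ p(theta | y) * exp(eta * h(theta)).
# h below is a hypothetical choice pushing toward a skewed,
# non-Gaussian alternative.
def h(t):
    return t ** 3

# A hypothetical quantity of interest, e.g. a prediction summary.
def g(t):
    return t ** 2

# Step 3: local sensitivity at eta = 0. For an exponential tilt,
#   d/d_eta E_eta[g(theta)] |_{eta=0} = Cov_0(g(theta), h(theta)),
# so the sensitivity is estimated from the base-model samples alone,
# with no refitting of the alternative model.
sensitivity = np.cov(g(theta), h(theta))[0, 1]
print(f"local sensitivity of E[g] to the tilt: {sensitivity:+.4f}")
```

A large absolute value flags a result that is fragile to relaxing the Gaussian assumption; a value near zero suggests the conclusion is robust to that particular perturbation.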
Building on a classic 1980s theory of statistical robustness, the researchers extended the idea to modern hierarchical models and connected it with a model-checking approach. Applied to air pressure data from the Pacific Northwest of the United States, their method showed that failing to account for non-Gaussian features led to poor predictions near areas with sharp variations in local air pressure.
“Our approach allows researchers to identify which assumptions are most problematic, and it provides visual diagnostic tools showing which results are most affected,” says Cabral. “Our work makes rigorous model criticism more accessible and computationally feasible for complex hierarchical models. It can be integrated into common statistical modeling software to help researchers more routinely check whether their modeling assumptions are reasonable and robust, or in fact fragile.”
Reference
- Cabral, R., Bolin, D. & Rue, H. Robustness, model checking, and hierarchical models. Journal of the Royal Statistical Society Series B: Statistical Methodology 87, 632–652 (2025).