Data verification is one of the cornerstones of geoscience. Without knowing whether a prediction was correct, we cannot claim that we can predict anything at all. Most verification today rests on the assumption that observations are perfect, often without acknowledging any uncertainties. Standard tools like contingency tables and correlations (the latter often used in some form for long-term predictions) make it hard to take these uncertainties into account, even where that is possible, e.g. through sampling strategies.
Another problem is that obtaining uncertainties for observations is often not easy. One example is reanalysis data, which for a long time were provided only as a single realisation. As a result, predictions were often available as ensembles while the observations to compare them against were not. There are techniques that use aggregated data and validate their statistics, but most classical variables are still verified against observations that are treated as certain. The field is currently changing: reanalyses are starting to become available as ensembles, so in the future we will need new tools that make use of these developments.
There is also a philosophical need to look into verification with uncertain observations. We know that the real world is not deterministic, we know that our instruments are imperfect, and we are sure that these uncertainties matter. Why do we train our students to estimate and measure uncertainties when we later do not use them in our analysis? And yes, there is the issue that all observations are, at their core, models. We acknowledge that models are imperfect; otherwise we wouldn’t need ensembles to create predictions. But why do we then not account for the uncertainties introduced by applying those models when we create observations? Those models are certainly not much better (they are just applied on a different temporal and spatial scale). So we have to confront this issue at every step we take. We do that in data assimilation, so we have to do it in data verification as well.
Therefore, new developments in this field are essential. We need new tools to examine uncertain observations and make use of them. This paper is a small step towards opening opportunities for future developments in this direction. It is certainly not a final solution, and certainly not the first step either. It is just another proposal for a tool to approach this challenge. In the future we will require well-understood and well-tested tools that the broader scientific community can apply. What those might look like is currently open, as is whether the tools presented here are of any wider use. In the paper I describe two metrics, the EMD and the IQD, and develop a strategy for building verification tools with them. In the next post I will take a deeper look at the two metrics and shed light on the opportunities they offer.
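To give a first impression of what comparing two ensembles can look like, here is a minimal sketch. It assumes that EMD stands for the Earth Mover's Distance and IQD for the Integrated Quadratic Distance, both taken in their common one-dimensional form as integrals over the difference between two empirical cumulative distribution functions. The function names `emd`, `iqd` and the helper `_cdf_difference` are my own illustration and not necessarily the implementation used in the paper.

```python
import numpy as np


def _cdf_difference(forecast, observation):
    """Return the ECDF difference on each interval of the pooled grid,
    together with the interval widths (illustrative helper)."""
    forecast = np.sort(np.asarray(forecast, dtype=float))
    observation = np.sort(np.asarray(observation, dtype=float))
    # Both ECDFs are step functions that only change at the pooled members.
    grid = np.sort(np.concatenate([forecast, observation]))
    widths = np.diff(grid)
    # ECDF value on [grid[i], grid[i+1]): fraction of members <= grid[i].
    cdf_f = np.searchsorted(forecast, grid[:-1], side="right") / forecast.size
    cdf_o = np.searchsorted(observation, grid[:-1], side="right") / observation.size
    return cdf_f - cdf_o, widths


def emd(forecast, observation):
    """Assumed 1-D Earth Mover's Distance: integral of |F - G|."""
    diff, widths = _cdf_difference(forecast, observation)
    return float(np.sum(np.abs(diff) * widths))


def iqd(forecast, observation):
    """Assumed Integrated Quadratic Distance: integral of (F - G)^2."""
    diff, widths = _cdf_difference(forecast, observation)
    return float(np.sum(diff ** 2 * widths))


if __name__ == "__main__":
    # Made-up numbers: a forecast ensemble and an observation ensemble,
    # e.g. members of an ensemble reanalysis.
    forecast_members = [14.2, 15.1, 15.8, 16.4]
    observation_members = [15.0, 15.3, 15.9]
    print("EMD:", emd(forecast_members, observation_members))
    print("IQD:", iqd(forecast_members, observation_members))
```

Both functions treat the observation as an ensemble rather than a single certain value. A classical single observation is simply the special case of an observation ensemble with one member.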