One main aspect of the new paper is the question of why sub-sampling works. In many review rounds for the original paper (Dobrynin et al. 2018) we were asked for a proper statistical model of the method, and we heard many claims that it should not work even though it does (aka cheating). This is where this manuscript comes into play. Instead of selecting a (probably) random number of ensemble members close to one or more predictors, everything is transferred to probability density functions (pdfs). Of course, those are not easily available without making a large number of assumptions, so I have gone the hard way. Bootstrapping of EOF fields is certainly no easy task in terms of computational cost, but it does work. It yields a pdf for every ensemble member and every predictor, as well as for the observations of the North Atlantic Oscillation (NAO).
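To make the bootstrapping idea a bit more tangible, here is a minimal sketch. It is not the procedure from the paper: the data are synthetic, and the names (`leading_eof`, `bootstrap_index`), the resampling over years, and the sign alignment are my assumptions for illustration. The idea is to resample the years with replacement, recompute the leading EOF each time, project one target field onto it, and collect the resulting index values as an empirical distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a (years x grid points) field; NOT real data.
n_years, n_grid = 40, 50
pattern = np.sin(np.linspace(0.0, np.pi, n_grid))  # hypothetical dominant pattern
amplitude = rng.normal(size=n_years)
field = np.outer(amplitude, pattern) + 0.5 * rng.normal(size=(n_years, n_grid))


def leading_eof(data):
    """Leading EOF (unit-norm spatial pattern) of a years x grid array."""
    anomalies = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    return vt[0]


def bootstrap_index(data, target, n_boot, rng):
    """Empirical distribution of the index of `target` under year resampling."""
    reference = leading_eof(data)
    samples = np.empty(n_boot)
    for b in range(n_boot):
        # Draw years with replacement and recompute climatology and EOF.
        idx = rng.integers(0, data.shape[0], size=data.shape[0])
        resampled = data[idx]
        eof = leading_eof(resampled)
        # The sign of an EOF is arbitrary, so align it with the reference.
        if eof @ reference < 0:
            eof = -eof
        samples[b] = (target - resampled.mean(axis=0)) @ eof
    return samples


samples = bootstrap_index(field, field[-1], n_boot=500, rng=rng)

# A normalised histogram serves as a simple empirical pdf estimate.
density, edges = np.histogram(samples, bins=30, density=True)
```

Even in this toy setup, every bootstrap draw needs its own decomposition, which hints at why the computational cost grows quickly for real fields, many ensemble members and several predictors.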
Based on those pdfs, it is now possible to look for the reason why the sub-sampling method has better prediction skill than the case without sub-sampling. The first step is to show that the distribution view and the sub-sampling are at least similar. In the end, making use of pdfs is not a pure selection but rather a weighting: ensemble members that are close to a predictor are weighted higher than those far away. Of course there are differences between the two approaches, but the results are remarkably similar. This gave us more confidence that, in the many tests we did in the past on the sub-sampling methodology, the way we select does not have such a huge influence (but that will be explained in detail in an upcoming paper). Consequently, if we can show how the pdf approach works, we gain insights into the sub-sampling approach itself.
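The difference between selecting and weighting can be sketched in a few lines. This is a simplified illustration, not the method of the paper: ensemble members are treated as single values rather than pdfs, the predictor is assumed to be Gaussian, and all numbers and function names (`select_closest`, `weight_by_pdf`) are hypothetical.

```python
import numpy as np


def select_closest(members, pred_mean, k):
    """Hard selection: mean of the k members closest to the predictor."""
    order = np.argsort(np.abs(members - pred_mean))
    return members[order[:k]].mean()


def weight_by_pdf(members, pred_mean, pred_std):
    """Soft weighting: every member counts, weighted by the predictor pdf."""
    # Assumed Gaussian predictor pdf; the normalisation constant cancels out.
    weights = np.exp(-0.5 * ((members - pred_mean) / pred_std) ** 2)
    return np.sum(weights * members) / np.sum(weights)


rng = np.random.default_rng(1)
members = rng.normal(loc=0.0, scale=1.0, size=30)  # hypothetical member values
pred_mean, pred_std = 0.8, 0.4                     # hypothetical predictor

plain_mean = members.mean()
selected_mean = select_closest(members, pred_mean, k=10)
weighted_mean = weight_by_pdf(members, pred_mean, pred_std)
print(plain_mean, selected_mean, weighted_mean)
```

In the selection variant a member is either in or out, while in the weighting variant its influence fades smoothly with distance from the predictor. Both pull the estimate away from the plain ensemble mean and towards the predictor.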
The new paper shows that the key to understanding the mechanism is understanding the spread. While seasonal prediction has an acceptable correlation skill for its ensemble mean, the prediction of each single ensemble member is rubbish. As a consequence, the overall ensemble has a huge spread of quite uninformed members. We have learned in the past to work with such problems, which requires us to take great care in how we evaluate predictions on the long-term timescale. Filtering this broad, high-variance distribution function with informed and sharper predictor functions sharpens the combined prediction while at the same time giving a better prediction overall. In other (simplified) words: we down-weight the influence of those ensemble members that drifted away from the correct path and concentrate on those that are consistent with the overall state of the climate system.
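The sharpening effect can be illustrated with a toy example. This is only a sketch under assumptions: both distributions are taken to be Gaussian with made-up parameters, and "filtering" is interpreted here as multiplying the two pdfs on a grid and renormalising, which may differ from the exact combination used in the paper.

```python
import numpy as np

x = np.linspace(-6.0, 6.0, 2001)  # grid of index values
dx = x[1] - x[0]


def gaussian(x, mean, std):
    """Gaussian pdf evaluated on the grid."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def moments(pdf):
    """Mean and standard deviation of a pdf given on the grid x."""
    mean = np.sum(x * pdf) * dx
    std = np.sqrt(np.sum((x - mean) ** 2 * pdf) * dx)
    return mean, std


ensemble_pdf = gaussian(x, 0.0, 1.5)   # broad, weakly informed ensemble (assumed)
predictor_pdf = gaussian(x, 1.0, 0.5)  # sharper, informed predictor (assumed)

# Multiply the two pdfs and renormalise so the result integrates to one.
combined_pdf = ensemble_pdf * predictor_pdf
combined_pdf /= np.sum(combined_pdf) * dx

combined_mean, combined_std = moments(combined_pdf)
print(combined_mean, combined_std)
```

For two Gaussians the product is again Gaussian and its precision is the sum of the two precisions, so the combined spread (about 0.47 here) is smaller than that of either input, and the mean (about 0.9) sits close to the sharper predictor.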
As a consequence, the resulting prediction is, in its properties, quite similar to a statistical prediction, but it still has many advantages of a dynamical prediction. It is probably not the best of both worlds, but an acceptable compromise. To establish that, however, we need tools to evaluate the resulting predictions, and that proved to be harder than expected. But that is the story of the next post, on why we need verification tools for uncertain observations.