In my last post I showed that observations are models as well. But if that is the case, why do we distinguish between these two kinds of data the way we do? Why is everyone so keen on observations when they are just another model output?
The reason can usually be found in their different structure. For something to still be called an observation, the amount of modelling applied to it should be very basic. Coming from the atmospheric sciences myself, I find that the border between the two worlds can often be drawn by the type of data. Generally, observations in that field are point data, often in situ, and irregular in time and space. In contrast, model data is usually very regular and sometimes high-dimensional.
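To make the structural difference concrete, here is a minimal Python sketch. All names and numbers in it (the `PointObservation` class, the station readings, the grid axes) are made up for illustration and are not taken from any real dataset; it only shows how point observations tend to be a loose list of records, while model output sits on regular axes.

```python
from dataclasses import dataclass


@dataclass
class PointObservation:
    """One in situ reading; every record carries its own time and place."""
    time_h: float         # hours since some reference time
    lat: float
    lon: float
    temperature_c: float


# Hypothetical station readings: irregular in time and space.
observations = [
    PointObservation(0.0, 50.73, 7.10, 11.2),
    PointObservation(0.7, 53.55, 9.99, 9.8),
    PointObservation(3.1, 50.73, 7.10, 12.0),
]

# Hypothetical model output: one value for every point of a regular
# time x latitude x longitude grid.
times_h = [0, 6, 12]
lats = [50.0, 52.0, 54.0]
lons = [6.0, 8.0, 10.0]
model_temperature_c = [[[10.0 for _ in lons] for _ in lats] for _ in times_h]


def is_regular(axis):
    """Return True if consecutive values on the axis are equally spaced."""
    steps = [b - a for a, b in zip(axis, axis[1:])]
    return all(abs(step - steps[0]) < 1e-9 for step in steps)


obs_times = sorted(obs.time_h for obs in observations)
print("model time axis regular:", is_regular(times_h))
print("observation times regular:", is_regular(obs_times))
```

The observations need coordinates stored with every single value, whereas the model field only needs its three axes once. That is the practical reason the two kinds of data are often stored and handled so differently.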
But of course there are also a lot of mixed types. Atmospheric scientists, for example, use global reanalysis fields as quasi-observations. This is of course problematic, since in the end these are observations in which the gaps (and more) are filled with modelling results to get a more or less homogeneous field of the physical variables. But often enough the answer to criticism of treating this combination as observations is something like “it’s the best we have”. Of course there are a lot of good reasons to take them as observations, but in the end everyone using them has to keep in mind: it is just a calibrated model.
In contrast, remote sensing observations do not necessarily fulfil the irregularity argument. They are also by definition not in situ, since they use, for example, backscattering effects of the investigated element to determine its physical properties. Nevertheless, they are still widely accepted as pure observations.
So what is the difference between them? It’s their usage and their theoretical background. Models (I prefer to call their output simulated data) are usually constructed from scratch, based on physical, chemical or biological “laws”, and calibrated afterwards. In the case of observations it is the other way round: reality is the starting point, and the aim is not the overall picture but a local result that relies as little as possible on models. In the daily work of a lot of scientists, observations are simply the data that are irregular and that are used to calibrate whatever is being done at that moment on paper or on the computer.
More discussion of this classification in the meteorological sciences, and of the problems caused by the different storage and publication procedures for the different kinds of data, can be found in a paper written by the project in which I wrote my PhD thesis:
Quadt, F.; Düsterhus, A.; Höck, H.; Lautenschlager, M.; Hense, A. V.; Hense, A. N.; Dames, M. (2012): Atarrabi – A Workflow System for the Publication of Environmental Data, Data Science Journal, 11, 89-109, http://dx.doi.org/10.2481/dsj.012-027
This post should have shown that the simple terms “observations” and “models” are not that simple at all, even though they are heavily used in the scientific literature. In the end it is often a convention, established over time, that decides which elements are classified as observations and which as models. Of course there are a lot of clear cases, but the mixed types in particular are gaining more and more importance in modern science.