Data peer review paper background: Why is quality decisive information for data?

Using information on data quality is nothing new. A typical approach is to use the uncertainty of the data, which in many analysis methods gives the individual data points something like a weighting. It is essential to know which data points or parts of a dataset can be trusted and which are questionable. To help data reusers with this, many datasets contain flags. They indicate when problems occurred during the measurement or when a quality control method raised doubts. Every scientist who analyses data has to take this information into account, and is eager to know whether it explains why, for example, some points do not fit with the rest of the dataset.
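To make the idea of weighting and flags concrete, here is a minimal sketch in Python. It assumes an inverse-variance weighting (one common choice, not the only one) and a simple string flag per data point. The function name `weighted_mean` and the flag values "good" and "bad" are hypothetical and chosen only for illustration; real datasets define their own flag conventions.

```python
from math import sqrt

def weighted_mean(values, sigmas, flags=None, bad_flags=frozenset({"bad"})):
    """Inverse-variance weighted mean of all points whose flag is not in bad_flags.

    values : measured values
    sigmas : one-sigma uncertainties, assumed to be strictly positive
    flags  : optional quality flag per point (hypothetical labels)
    Returns the weighted mean and its uncertainty.
    """
    if flags is None:
        flags = ["good"] * len(values)

    # Drop every point that a quality control step marked as questionable.
    kept = [(v, s) for v, s, f in zip(values, sigmas, flags) if f not in bad_flags]
    if not kept:
        raise ValueError("no usable data points after applying the flags")

    # Points with a small uncertainty get a large weight.
    weights = [1.0 / (s * s) for _, s in kept]
    total = sum(weights)
    mean = sum(w * v for w, (v, _) in zip(weights, kept)) / total
    return mean, sqrt(1.0 / total)


# The third point does not fit the others, and its flag explains why it is excluded.
mean, sigma = weighted_mean(
    values=[10.0, 12.0, 50.0],
    sigmas=[1.0, 2.0, 1.0],
    flags=["good", "good", "bad"],
)
print(mean, sigma)
```

Without the flag, the outlier at 50.0 would dominate the result, and a reuser from another field would have no way of knowing that it should be excluded.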

With the institutionalised publication of data, the assessment of data quality reaches a new level. The reason is that published data are used not only by scientists who are familiar with the specific field, but also by others. This interdisciplinary environment is an opportunity, but also a threat. The opportunities lie in new synergies, in new perspectives brought to a field and, even more, in the huge potential of combining datasets in new ways. The risks, in contrast, are possible misunderstandings and misinterpretations of datasets, and the belief that published datasets are ideal. These risks are best countered by proper documentation of the datasets. The aim of a data peer review is therefore to guarantee the technical quality (such as readability) of the dataset and good documentation. This is all the more important because the datasets themselves should not be changed at all.

That this interdisciplinarity problem is solvable has long been shown by paper publications. Nobody assumes that the journals Nature or Science are a threat to science (at least not because of their interdisciplinary programme). Scientists have learned to trust the quality statement expressed by the tag "peer-reviewed", which is made by people from the field in which the publication originated. Without these quality statements, designed for the interdisciplinary community, people from other fields would be quite lost. What is a good paper in a field you are new to? Few people have deep expertise in many fields.

Nevertheless, interdisciplinarity is becoming more and more important. Fields like climate science have so many components that they require scientists to accept results from outside their comfort zone. The peer review system makes this possible by allowing one to filter the results. In addition, when one needs to use the results of others, one can be sure that at least one independent scientist has examined the article closely and found no severe flaw. The same holds for data peer review. When a scientist from a different field picks up the data, they have to be sure that the documentation tells them everything they need to know to reuse the datasets.

Data quality is therefore a very important part of the future of science, and measures to increase it are needed. The aim of these measures is to make the data more usable for the reuser through better documentation. This does not mean that quality procedures during data generation are unimportant, but at publication the focus should be on reusability.
