For a few years now, Data Science has been a hot topic. It became popular under the banner of ‘Big Data’, and if you believe some media, it will solve nearly all the problems in the world. But what does it mean to be a data scientist? Is it a jack of all trades, or just someone who knows no field really well? As I would describe myself as a data scientist, I would like to write a little about how I see this field.
Traditionally, scientists within the different disciplines of earth science are divided into two groups: modellers and observationalists. In this view, the modellers are those who do theory, possibly with pen and paper alone, while the observationalists go into the field and get their hands dirty. That this view is a little outdated will not be news to anyone. In my opinion, it was the establishment of remote sensing that really started to reunite the two groups (yes, reunite, because in the old days there were a lot of scientists who did everything). As a trained meteorologist, I find it quite common that this division no longer really exists. Both types of scientists sit in front of their computers, both program, and both have to write papers with a lot of mathematical equations. In other fields (e.g. geology) the division might still be more obvious, but for many it is only the type of data someone works with that classifies them as an observationalist or a modeller.
As a scientist in earth science who works more on the theoretical side, my daily work consists in large part of programming. Nevertheless, despite the importance programming has in this field nowadays, I hear again and again from people that they received no systematic education in it during their studies. Of course, I agree that learning by doing plays a very important part in becoming a good programmer, but without further insight into the background of programming it can be quite hard to reap the benefits of a well-planned, structured program.
Doing statistics between the two worlds of observations and model results often leads to the assumption that the two are completely different things. On one side are the observations, where real people went into the field, drilled, dug and measured, and delivered the pure truth about the world we want to describe. In contrast stands the clean laboratory of a computer, which takes all our knowledge and creates a virtual world. This world need not necessarily have anything to do with its real counterpart, but at least it delivers nice information and visualisations. But this contrast between the dirty observations and the clean models usually exists only in our heads; in reality they are much more connected to each other.