Massive ensemble paper background: Massive ensembles: How to make use of simple models?

The new paper on the LIG sea-level investigation with massive ensembles analyses simple models. In this post I want to talk a bit about their importance and how they can be used in scientific research.

Simple models are models with reduced complexity. In contrast to complex models, their physics is simplified, they are tailored more closely to a specific problem, and their results are not necessarily directly comparable to the real world. They can have a smaller, easier-to-maintain code base, but a simple model can also grow quickly in lines of code. Whether a model is simple depends on the processes it includes, not on the number of lines of code.

Because the physics is represented in a minimal way, statistical methods become much more important for gaining insight from simple models. The big advantage of simple models is of course that a model run requires fewer resources. The run time is the decisive factor here, while the other advantage, the smaller amount of storage needed, is usually secondary. In our case, for example, we were able to perform several thousand model runs with our simple model suite, whereas a single complex model run or a very small ensemble would probably already have been a challenge (depending on how complex the model had been designed).
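
As a rough illustration of what running many cheap model runs and summarising them statistically can look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_model`, its parameters `sensitivity` and `forcing`, the sampling ranges and the ensemble size are hypothetical and are not taken from the model suite used in the paper.

```python
import random
import statistics


def toy_model(sensitivity, forcing, n_steps=100, dt=0.1):
    """Hypothetical simple model: a state x relaxes towards sensitivity * forcing."""
    x = 0.0
    for _ in range(n_steps):
        # Explicit Euler step of dx/dt = sensitivity * forcing - x
        x += dt * (sensitivity * forcing - x)
    return x


def run_ensemble(n_members, seed=0):
    """Run the toy model once per member with randomly sampled input parameters."""
    rng = random.Random(seed)  # fixed seed so the ensemble is reproducible
    results = []
    for _ in range(n_members):
        sensitivity = rng.uniform(0.5, 1.5)  # assumed prior range
        forcing = rng.gauss(2.0, 0.3)        # assumed mean and spread
        results.append(toy_model(sensitivity, forcing))
    return results


def summarise(results):
    """Reduce the ensemble to a median and a 5-95 % range."""
    cuts = statistics.quantiles(results, n=20)  # 19 cut points: 5 %, 10 %, ..., 95 %
    return {
        "p05": cuts[0],
        "median": statistics.median(results),
        "p95": cuts[-1],
    }


if __name__ == "__main__":
    ensemble = run_ensemble(5000)
    print(summarise(ensemble))
```

Each member here takes a fraction of a millisecond, so thousands of runs are cheap. The point of the sketch is only that the useful output is the statistical summary of the spread, not any individual run.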

I should of course mention a third category of models here, which is often used. Intermediate-complexity models usually try to combine the advantages of both categories while minimising their disadvantages. As the choice of model should always be based on the research question at hand, they can be a good idea for certain problems, but they weren’t for us.

The aim of the research also differs between the two main categories of models. While complex models try to imitate physical processes as well as possible to generate a realistic picture of the consequences, simple models are usually designed either to learn more about the basic physical processes or to produce more diverse uncertainty estimates. The latter was our aim, and so a simple model was the primary choice (availability of course also played a role, but in our case it fitted very well).

Having the opportunity to calculate so many runs of a model of course helps to understand the system. Each model run is unique, and sometimes small changes to the input parameters can have an enormous effect. Nevertheless, generating something like big data always poses a challenge for the analysis. Statistical post-processing and systematic approaches are key to solving this issue. Simply making a lot of runs without such a plan is therefore not really a good idea. We tried to make use of our runs by designing a data assimilation approach, but that will be the topic of the next post.

