Top-down or bottom-up

When you program in science, your projects usually evolve over time. Often you get an idea, create a quick and dirty solution, and test it on data you know. This works for a while, but after several amendments, future-proofing and incorporating new ideas, the code becomes unbearable. This is the point where bottom-up approaches break down and you start to think about reprogramming everything. In these cases the new program is no longer bottom-up: you have an idea in mind of what you want to achieve, and you often reuse code snippets from before. We have reached the world of top-down.

I am often involved in projects that require code bases of several thousand lines and often involve the development of completely new statistical approaches. So I usually spend quite a while testing and trying different approaches in a very bottom-up style. When something works, great, combine it with other stuff. When not, discontinue it (which is of course a very hard decision to make). But at some point this comes to an end. You plan to write a manuscript and want to be sure that nothing unforeseen interferes with your results, or you are simply annoyed by the chaos your combined code snippets create in your editor.

Choosing this point correctly is crucial, because top-down approaches require you to make assumptions. What has to be variable in the future? What should your code be able to do one day? Which parameters can be fixed? Once this is decided, usually a few weeks into a project, the structuring of the rewrite starts. The idea is that nothing happens at random anymore, that everything is reproducible by starting a single script, and that everything is ordered in such a way that you can be confident no bugs remain. This often works quite well when you have used a modularised approach, and it then allows you to write up your results.
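To make the "single script" idea a bit more concrete, here is a minimal sketch in Python. It is not code from one of my projects. The names (Config, generate_data, bootstrap_mean_spread, run) and the toy analysis are invented for illustration. The only point is the structure: the assumptions live in one place, the random seed is fixed, and one entry point runs everything.

```python
import json
import random
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Config:
    """All assumptions of the rewrite in one place (hypothetical values)."""
    seed: int = 42           # fixed so that repeated runs give identical results
    n_samples: int = 1000    # something that may have to vary in the future
    n_resamples: int = 200   # treated as fixed for this manuscript


def generate_data(rng, n_samples):
    """Toy stand-in for loading data: draws from a standard normal."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def bootstrap_mean_spread(rng, data, n_resamples):
    """Standard deviation of the mean over bootstrap resamples."""
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(data) for _ in data]
        means.append(sum(resample) / len(resample))
    centre = sum(means) / len(means)
    variance = sum((m - centre) ** 2 for m in means) / (len(means) - 1)
    return variance ** 0.5


def run(config):
    """Single entry point: every random draw comes from one seeded generator."""
    rng = random.Random(config.seed)
    data = generate_data(rng, config.n_samples)
    spread = bootstrap_mean_spread(rng, data, config.n_resamples)
    # Store the configuration next to the result so the run can be repeated.
    return {"config": asdict(config), "bootstrap_sd_of_mean": spread}


if __name__ == "__main__":
    print(json.dumps(run(Config()), indent=2))
```

Running the script twice with the same Config gives the same numbers. When an assumption changes, the change is made in one visible place and not scattered over many snippets.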

There is just one problem: when the assumptions behind the top-down approach fail, your code has to take on a different scope. If changing too much would endanger reproducibility, you start again from the beginning. New assumptions, new structure, take from the old code what you can and plug it together. Hopefully it is now much cleaner than before and you can happily write your new manuscript. When you have done everything right, you can even use the code for several manuscripts, as the structures are often quite similar. When this works, you know you have done things right and that a lot of work has paid off. When not, well, it is science; things are expected to fail at some point. Live with it and start the next project.
