vEGU2021: A “zoom” into science

Another EGU in online-only mode, and things have changed since last year. It started with technical chaos, but once that was brought under control it was a really nice and enjoyable conference experience.

A week before

EGU is traditionally one week long, but this time it was extended by another week. That extra week focused on medal lectures and some networking events, but as the latter were hard to find, for most people it was really about the household names giving interesting talks. Those talks ran on Zoom, which made them easy to follow. For myself and many others it was still the week to prepare our contributions. The displays from last year returned, but were extended by a single slide for a 2-minute presentation. The communication around this was not ideal, so there were still some questions about what actually had to be prepared.

The “real” first day

We had the honor of convening our session on seasonal-to-decadal predictions on the morning of the “real” first day.

EGU had implemented a new environment compared to last year, which in theory took care of some of the issues that plagued last year’s edition. It started alright, but soon people had trouble getting onto the server (nothing unusual at EGU, we know that from every edition). But as some speakers had audio trouble, and the promised breakout chats after the initial comments never showed up, it turned into a rather improvised discussion session. We more or less fell back on the system we had used last year in the chat-only conference, but beyond the chaos it was a pity that not everyone got the feedback on their contribution they deserved. Still, under the circumstances it worked rather well.

The sessions later that day were even more catastrophic on the technical side, so the system was switched to Zoom only. From then on it worked quite well, as most scientists are used to Zoom after a year of pandemic.

Relaxing in the middle

As a scientist with quite interdisciplinary interests, I always find something I like to watch. But this year I had the impression the programme was even more condensed. Anyway, a digital conference has its advantages. You can stream the talks onto your TV, enjoy your couch, and when the show is over you move to your desk for some chats in the breakout rooms. In the evenings some networking events took place via Gather.town. It offered some options for meetings, which is the part that suffers the most in digital conferences.

Final day

Friday was the day of my own little presentation and of a lot of other talks I wanted to see (unfortunately all at the same time).

My talk went quite alright; a short talk does not require much creativity. There is no lengthy introduction, everything is condensed into short soundbites, and it is effectively just an advert to lure people into the breakout session. I had a few discussions afterwards and otherwise enjoyed the rest of the meeting. In the evening there was a closing party on Gather.town, which was a bit better than the Zoom one a year before, but still not optimal.

Do digital conferences now work?

Many said they preferred this year over the last edition. Aside from the technical chaos on Monday it was quite enjoyable. It is still far away from the in-person experience in Vienna, which I really look forward to in the upcoming years. It was possible to find and talk to people and former colleagues, set up new projects and see what others are doing. Some liked the 2-minute presentations; I think they are not ideal.

Because the main struggle of this Zoom-like conference is that, again, Early Career Scientists pay the price. It is tough for them to squeeze their work into a 2-minute ad-talk, stripped of all introduction and motivation. When you have followed the field in recent years, you usually know what they are talking about, but for those new to science it can be a steep and demotivating learning curve. They also do not get much exposure in the breakout rooms, because established scientists use them for larger chats among themselves, drawing the attention away. The separation into talks and posters at the usual in-person conferences has its advantages, which are often forgotten. But as former editions of the EGU had already developed away from ECR-friendliness, this is not surprising. The biggest challenge for a future EGU, be it hybrid or online, will be to improve the experience of young scientists.

Final words

All in all it was a great conference. I like being at the EGU; it is still the go-to conference every year because of its interdisciplinary opportunities. After day one everything was alright, and it went rather smoothly. Let’s hope that next year will again have some form of real Vienna in it, as I miss meeting friends and other scientists in person. See you next year, wherever it will be.

Digital conferences – how might they work?

In the past week we had an interesting experiment: putting the largest European geoscientific conference onto a virtual stage. In a short time frame, EGU managed to switch from a huge gathering of people in Vienna to an exchange of scientific ideas on digital channels. And taken together, they did a fabulous job. The channels ran smoothly, the feared chaos induced by trolls didn’t happen, and ideas got exchanged quite frictionlessly. So is all well? Not quite. Under the circumstances it was close to the best that was possible, especially given the limited time to get it up and running. Nevertheless, as we are part of climate science and the calls to limit travel get louder, the question arises what a digital conference might look like when there is enough time to prepare (a year or more). So what happens, and where are the dangers, when conferences are generally put online in the future?


Post-processing paper background: Do we need new approaches in verification?

In this final post of the background series I want to write about the need for new ideas in verification. Verification is essential in geo- and climate science, as it gives validity to our work of predicting the future, whether on short or long timescales. Especially in long-term prediction we face the huge challenge of verifying our predictions on a low number of cases. We are happy when we have our 30+ events to identify skill, but we have to find ways to make quality statements on a potentially much lower number of cases. When we investigate, for example, El Niño events over the satellite period, we might have a time series of fewer than 10 time steps at hand and hit a dead end with classical verification techniques. Contingency tables require many more cases, because otherwise the potential uncertainties become so huge that they cannot be controlled. Correlation measures also depend on a large number of cases; everything below 30 is not really acceptable, as is shown by the quite high correlation values required to reach significance. Still, most long-term prediction evaluation relies on such methods.
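To illustrate how demanding correlation-based verification is for short records, here is a small sketch (my own illustration, not from the paper) of the correlation value needed to reach two-sided significance at the 5% level for different sample sizes, using the standard t-test for a Pearson correlation:

```python
import numpy as np
from scipy import stats

def critical_correlation(n, alpha=0.05):
    """Smallest |r| that is significant at level alpha (two-sided)
    under the usual t-test for a Pearson correlation with n pairs."""
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t_crit / np.sqrt(n - 2 + t_crit**2)

for n in (10, 20, 30, 50):
    print(f"n = {n:2d}: |r| must exceed {critical_correlation(n):.2f}")
# With only 10 cases the correlation has to exceed roughly 0.6,
# while 30 cases bring the threshold down to about 0.36.
```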

An alternative idea has been proposed by DelSole and Tippett, which I first saw at the S2S2D conference in 2018. In this case we do not investigate a whole time series at once, as we would for correlations, but single events. This allows us to evaluate the effect of every single time step on the verification and therefore gives new information beyond that on the whole time series.
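A minimal sketch of this counting idea (my own simplified reading of the random-walk comparison idea, with made-up data): at each time step we record which of two forecasts is closer to the observation and track the cumulative tally, which under the null hypothesis of equally good forecasts behaves like a fair coin-flip random walk:

```python
import numpy as np

def event_wise_comparison(err_a, err_b):
    """Compare two forecasts time step by time step.

    err_a, err_b: arrays of absolute errors of forecast A and B.
    Returns the cumulative tally (+1 when A is closer, -1 when B is,
    ties contribute 0) and a rough 95% envelope of a fair random walk."""
    steps = np.sign(np.abs(err_b) - np.abs(err_a))
    walk = np.cumsum(steps)
    n = np.arange(1, len(steps) + 1)
    bounds = 2.0 * np.sqrt(n)
    return walk, bounds

# Hypothetical absolute errors of two forecasts over 12 years
err_a = np.array([0.2, 0.5, 0.1, 0.4, 0.3, 0.2, 0.6, 0.1, 0.2, 0.3, 0.1, 0.4])
err_b = np.array([0.3, 0.4, 0.5, 0.6, 0.2, 0.5, 0.7, 0.3, 0.4, 0.5, 0.2, 0.6])
walk, bounds = event_wise_comparison(err_a, err_b)
print(f"net tally after {len(walk)} steps: {walk[-1]:+.0f} (positive favours A)")
```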

I have shown in the new paper that this approach also allows a paradigm shift in evaluating forecasts. In many earlier approaches the evaluation of one year depended on the evaluation of the other years; counting the successes of each single year makes a prediction evaluation much more valuable. We often do not ask how good a forecast is, but whether it is better than another forecast. And we want to know, at the time of forecasting, how likely it is that one forecast is better than the other. This information is not given by many standard verification techniques, as they take into account the magnitude of the difference between two forecasts at each time step. That is certainly important information, but it limits our view on essential questions of the evaluation. In theory, a single year can decide whether one forecast is judged better than another. Or more extreme: in a correlation, a forecast that is really bad in one year but better in all other years can still be dominated by the other forecast. These consequences have to be taken into account when we verify our models with these techniques.
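A small constructed example (not from the paper) of this outlier effect: forecast A matches the observations exactly in nine of ten years but misses one year badly, while forecast B is mediocre in every single year. The correlation ranks B far ahead, although a year-by-year count clearly favours A:

```python
import numpy as np

obs = np.arange(1.0, 11.0)             # ten "observed" years
fc_a = obs.copy()
fc_a[4] += 50.0                        # perfect except one catastrophic year
fc_b = obs + np.tile([1.0, -1.0], 5)   # off by 1 in every year

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

wins_a = np.sum(np.abs(fc_a - obs) < np.abs(fc_b - obs))
print(f"correlation A: {corr(fc_a, obs):.2f}, correlation B: {corr(fc_b, obs):.2f}")
print(f"A is closer to the observations in {wins_a} of {len(obs)} years")
# The single bad year drags A's correlation down to ~0.13 while B reaches ~0.94,
# even though A is the better forecast in 9 of 10 years.
```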

As such, it is important to collect new ideas on how we want to verify forecasts and quantify their quality, including its uncertainties, for the new challenges posed to us. The new paper applies new approaches in several of these departments, but there is certainly plenty of room for new ideas in this field, which will be important for the future.

Post-processing paper background: EMD and IQD? What is it about?

When you have two probability distributions and want to know the difference between them, you need a way to measure it. Over the years many metrics and distance measures have been developed and used, the most famous being the Kullback-Leibler divergence. In a paper in 2012 I had shown that a metric called the Earth Mover’s Distance (EMD) brings considerable improvements in detecting differences between distributions. So it was a natural idea for me to make use of this measure when we want to compare two distributions.

So we are given a distribution from the model prediction, defined by the ensemble members, and an observation with a non-parametric distribution of its uncertainties. A standard tool nowadays for evaluating ensemble predictions is the CRPS, which evaluates where in the predicted probability distribution the deterministic observation falls. The paper takes this tool and extends it to uncertain observations. So effectively, what is done is to measure the distance between two distributions, and by normalising it against a reference (e.g. the climate state) a metric distinguishing between a good and a bad prediction can be created.
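As a rough sketch of that normalisation idea (my own illustration; the exact formulation in the paper may differ), one can turn the distance between the forecast and observation distributions into a skill score by comparing it with the distance of a climatological reference to the same observation:

```python
import numpy as np
from scipy.stats import wasserstein_distance  # the 1-D EMD

def distribution_skill(forecast_members, obs_samples, climatology_samples):
    """Skill score based on a distance between distributions.

    1 means the forecast matches the observation distribution perfectly,
    0 means it is no closer than the climatological reference,
    negative values mean it is worse than the reference."""
    d_forecast = wasserstein_distance(forecast_members, obs_samples)
    d_reference = wasserstein_distance(climatology_samples, obs_samples)
    return 1.0 - d_forecast / d_reference

# Hypothetical ensemble, observation uncertainty samples and climatology
rng = np.random.default_rng(0)
forecast = rng.normal(1.0, 0.5, size=50)   # 50 ensemble members
obs = rng.normal(1.2, 0.2, size=200)       # sampled observation uncertainty
clim = rng.normal(0.0, 1.0, size=1000)     # climatological reference
print(f"skill: {distribution_skill(forecast, obs, clim):.2f}")
```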

So how does the EMD work? It effectively measures how much work would be needed to transform one distribution into the other. If you imagine a distribution as a pile of sand, it measures the minimal amount of fuel a machine would need to push the sand around until it forms the target distribution; this picture is also where the EMD got its name. As a metric it measures the distance precisely and therefore allows us to say, when we have two predictions, which one is closer to the observations.
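For one-dimensional samples the EMD has a particularly simple form: with two samples of equal size it is just the mean absolute difference between the sorted values, i.e. the optimal transport plan pairs the order statistics. A small check against SciPy’s implementation (my own illustration):

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
sand_pile_a = rng.normal(0.0, 1.0, size=100)
sand_pile_b = rng.normal(0.5, 1.5, size=100)

# For equal-sized 1-D samples the optimal "transport plan" pairs the sorted
# values, so the EMD is the mean absolute difference of the order statistics.
emd_by_sorting = np.mean(np.abs(np.sort(sand_pile_a) - np.sort(sand_pile_b)))
emd_scipy = wasserstein_distance(sand_pile_a, sand_pile_b)
print(f"sorted-pairing EMD: {emd_by_sorting:.4f}, scipy: {emd_scipy:.4f}")
# The two values should agree up to floating-point precision.
```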

But it is important to mention that there are problems with this view. Similar to the CRPS, there is literature describing that, even with its nice properties, measures like the EMD are potentially too kind to falsely sharp predictions compared to uninformed ones. In the CRPS case the difference between the distributions is squared, so that transporting probability a long way for a wrong prediction is penalised more heavily. In my paper I also show the results with this approach, as the IQD. A squared distance is much less intuitive than a linear one; it is harder for scientists to understand why they should use it over the others, which leads to hesitant use of these kinds of measures. Therefore, it will be necessary in the future to describe much better why these issues occur and to develop new pictures that explain to everyone why squaring is the way to go. We also need new approaches to verification in general, but on this I will write more in the final post of this series.
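As a rough sketch of the difference (my own illustration, with assumed helper names): both measures can be written as an integral over the difference of two CDFs, with the EMD taking the absolute value and the IQD the square of that difference, so that probability placed far from the truth is penalised quadratically rather than linearly:

```python
import numpy as np

def empirical_cdf(samples, grid):
    """Empirical CDF of the samples evaluated on a common grid."""
    return np.searchsorted(np.sort(samples), grid, side="right") / len(samples)

def cdf_distances(sample_a, sample_b, n_grid=2000):
    """Grid-based approximations of the EMD (integral of |F-G|)
    and the IQD (integral of (F-G)**2) between two samples."""
    lo = min(sample_a.min(), sample_b.min())
    hi = max(sample_a.max(), sample_b.max())
    grid = np.linspace(lo, hi, n_grid)
    dx = grid[1] - grid[0]
    diff = empirical_cdf(sample_a, grid) - empirical_cdf(sample_b, grid)
    emd = np.sum(np.abs(diff)) * dx
    iqd = np.sum(diff**2) * dx
    return emd, iqd

rng = np.random.default_rng(2)
obs = rng.normal(0.0, 1.0, size=500)
sharp_but_off = rng.normal(2.0, 0.1, size=500)     # confident but misplaced forecast
broad_climatology = rng.normal(0.0, 1.5, size=500)  # uninformed but centred forecast
print("sharp forecast (EMD, IQD):", cdf_distances(sharp_but_off, obs))
print("broad forecast (EMD, IQD):", cdf_distances(broad_climatology, obs))
```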