MLAir (v1.0) – a tool to enable fast and flexible machine learning on air data time series

Lukas Hubert Leufen, Martin G Schultz, Felix Kleinert

doi:10.5194/gmd-14-1553-2021


Open Access

https://doi.org/10.5194/gmd-14-1553-2021


Abstract

With MLAir (Machine Learning on Air data) we created a software environment that simplifies and accelerates the exploration of new machine learning (ML) models, specifically shallow and deep neural networks, for the analysis and forecasting of meteorological and air quality time series. MLAir is not developed as an abstract workflow but hand in hand with actual scientific questions. It thus addresses scientists with either a meteorological or an ML background. Due to their relative ease of use and spectacular results in other application areas, neural networks and other ML methods are also gaining enormous momentum in the weather and air quality research communities. Even though there are already many books and tutorials describing how to conduct an ML experiment, there are many stumbling blocks for a newcomer. Conversely, people familiar with ML concepts and technology often have difficulties understanding the nature of atmospheric data. With MLAir we have addressed a number of these pitfalls so that it becomes easier for scientists of both domains to rapidly start off their ML application. MLAir is easy to use and has been designed from the very beginning as a stand-alone, fully functional experiment. Thanks to its flexible, modular code base, code modifications are easy and personal experiment schedules can be quickly derived. The package also includes a set of validation tools to facilitate the evaluation of ML results using standard meteorological statistics. MLAir can easily be ported onto different computing environments, from desktop workstations to high-end supercomputers, with or without graphics processing units (GPUs).

Highlights

In times of rising awareness of air quality and climate issues, the investigation of air quality and weather phenomena is moving into focus.
We present a new framework to enable fast and flexible Machine Learning on Air data time series (MLAir).
We show how the results of an experiment conducted by MLAir are structured and which statistical analysis is applied.

Summary

Introduction

In times of rising awareness of air quality and climate issues, the investigation of air quality and weather phenomena is moving into focus. Driven in particular by computer vision and speech recognition, technologies like convolutional neural networks (CNNs; LeCun et al., 1998) or recurrent network variations such as long short-term memory (LSTM; Hochreiter and Schmidhuber, 1997) or gated recurrent units (GRUs; Cho et al., 2014), and more advanced concepts like variational autoencoders (VAEs; Kingma and Welling, 2014; Rezende et al., 2014) or generative adversarial networks (GANs; Goodfellow et al., 2014), are powerful and widely and successfully used. The application of such methods to weather and air quality data is rapidly gaining momentum. Data scientists use data to build their models on and evaluate them either with additional independent data or physical constraints. This elementary difference can lead to misinterpretation of study results so that, for example, the ability of the network to generalize is misjudged. We would like to mention that MLAir is an open-source project and contributions from all communities are welcome.

MLAir workflow and design

Coding language

Design of the MLAir workflow

Run modules

Model class

Data handler

Conducting an experiment with MLAir

Running first experiments with MLAir

Results of an experiment

Statistical analysis of results

Host system and processing units

Preprocessing

Custom data handler

Training

Validation

Evaluation

Custom run modules and workflow adaptions

How to continue an experiment?

Limitations


Journal: Geoscientific Model Development
Publication Date: Mar 17, 2021
Citations: 5
License type: CC BY 4.0
