Abstract
The construction of diverse, synthetic datasets of atmospheric situations, used as first guesses or training bases for remote‐sensing algorithms, is still a challenge. Numerical constraints require datasets with a limited number of representative situations that nevertheless keep, as much as possible, the full diversity observed in nature. This study presents an innovative sampling method that extracts a new, smaller dataset from a large database of atmospheric situations. One major issue in such sampling is the heterogeneity of the input variables: temperatures and specific humidities, for instance, have different units and ranges, and values from the lower troposphere can hardly be compared with those from the upper stratosphere. We show that sampling on only one variable type is not optimal, since erroneous features can appear in the other variables not used for the sampling. Shannon's entropy can help to develop a sampling technique able to deal with very heterogeneous variables. A dataset of 10 000 situations is built from EUMETSAT satellite atmospheric retrievals: it includes temperature and water‐vapour profiles, four integrated ozone layers and surface temperature. The sampling increases the entropy of the original dataset from 22 to 28 (about 20% increase).
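To illustrate why an entropy criterion copes with heterogeneous variables, the following is a minimal sketch and not the algorithm of the study. It assumes that dataset entropy is measured as the sum of per-variable Shannon entropies computed from histograms, and it uses a simple inverse-density weighting as a stand-in sampling rule. The function names, the bin count and the synthetic "temperature" and "humidity" columns are all hypothetical. Because entropy depends only on bin occupancy, the units and ranges of each variable do not matter.

```python
import numpy as np

def marginal_entropy(column, edges):
    """Shannon entropy (in nats) of one variable from a fixed-edge histogram."""
    counts, _ = np.histogram(column, bins=edges)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())

def summed_entropy(data, edges_per_var):
    """Assumed diversity measure: sum of the per-variable entropies."""
    return sum(marginal_entropy(data[:, j], edges_per_var[j])
               for j in range(data.shape[1]))

def rarity_sample(data, edges_per_var, n_keep, rng):
    """Draw n_keep row indices without replacement, favouring sparse bins.

    Stand-in rule (an assumption): each row is weighted by the sum over
    variables of 1 / (population of the bin the row falls in).
    """
    n_rows, n_vars = data.shape
    weights = np.zeros(n_rows)
    for j in range(n_vars):
        edges = edges_per_var[j]
        counts, _ = np.histogram(data[:, j], bins=edges)
        # Bin index of every row; the maximum value belongs to the last bin.
        idx = np.clip(np.digitize(data[:, j], edges) - 1, 0, len(counts) - 1)
        weights += 1.0 / counts[idx]
    return rng.choice(n_rows, size=n_keep, replace=False, p=weights / weights.sum())

# Synthetic stand-in data with very different units and ranges.
rng = np.random.default_rng(0)
temperature = rng.normal(250.0, 15.0, size=20_000)   # kelvin
humidity = rng.lognormal(-6.0, 1.0, size=20_000)     # kg/kg
data = np.column_stack([temperature, humidity])

n_bins = 20  # arbitrary choice for the sketch
edges_per_var = [np.linspace(col.min(), col.max(), n_bins + 1) for col in data.T]

subset = data[rarity_sample(data, edges_per_var, n_keep=2_000, rng=rng)]
h_full = summed_entropy(data, edges_per_var)
h_subset = summed_entropy(subset, edges_per_var)
print(f"entropy of full set: {h_full:.2f}, of sampled subset: {h_subset:.2f}")
```

The subset is evaluated on the same bin edges as the full dataset, so the two entropy values are directly comparable. The theoretical maximum in this setup is the number of variables times `log(n_bins)`.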
More From: Quarterly Journal of the Royal Meteorological Society