Abstract

With the development of fourth-generation high-brightness synchrotrons on the horizon, the already large volume of data collected on imaging and mapping beamlines is set to increase by orders of magnitude. An easy and accessible way of processing such large datasets as quickly as possible is therefore required, so that the core scientific problems can be addressed during the experimental data collection. Savu is an accessible and flexible big data processing framework that can deal with both the variety and the volume of multimodal and multidimensional scientific datasets, such as those output from chemical tomography experiments on the I18 microfocus scanning beamline at Diamond Light Source.

Highlights

  • Background subtraction: this step in the processing chain removes the background from all the data, again in a many-to-many mapping, because we take in both X-ray diffraction (XRD) patterns and X-ray fluorescence (XRF) spectra and return the patterns/spectra minus the background

  • One hundred iterations were used for each pattern/spectrum to remove the peaks. The mXRF-computed tomography (CT) data are both input into the plugin along with ...
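The background-subtraction highlight can be illustrated with a minimal sketch. The source states only that one hundred iterations were used per pattern/spectrum to remove the peaks and that the step is a many-to-many mapping; the iterative peak-clipping scheme below is an assumed stand-in and not necessarily the algorithm used in Savu. The function names `estimate_background` and `subtract_backgrounds`, and the synthetic test data, are hypothetical and are not part of the Savu API.

```python
import numpy as np


def estimate_background(signal, iterations=100):
    """Estimate a smooth background by iterative peak clipping.

    Assumed scheme (not taken from the source): on pass p, each point is
    replaced by the smaller of its current value and the mean of the two
    points p samples to either side, so narrow peaks are clipped away
    while a slowly varying baseline is left almost unchanged.
    """
    background = np.asarray(signal, dtype=float).copy()
    n = background.size
    for p in range(1, iterations + 1):
        if 2 * p >= n:
            break  # window no longer fits inside the signal
        neighbour_mean = 0.5 * (background[:-2 * p] + background[2 * p:])
        background[p:n - p] = np.minimum(background[p:n - p], neighbour_mean)
    return background


def subtract_backgrounds(datasets, iterations=100):
    """Many-to-many mapping: one background-subtracted output per input.

    Each dataset is an array whose last axis is the pattern/spectrum axis.
    """
    outputs = []
    for data in datasets:
        data = np.asarray(data, dtype=float)
        background = np.apply_along_axis(
            estimate_background, -1, data, iterations
        )
        outputs.append(data - background)
    return outputs


# Synthetic stand-ins for an XRD pattern and an XRF spectrum:
# a linear baseline with a single Gaussian peak on top.
x = np.arange(512)
baseline = 5.0 + 0.01 * x
peak = 50.0 * np.exp(-0.5 * ((x - 256) / 4.0) ** 2)
xrd = (baseline + peak)[np.newaxis, :]
xrf = (baseline + 0.5 * peak)[np.newaxis, :]

xrd_clean, xrf_clean = subtract_backgrounds([xrd, xrf], iterations=100)
```

Both inputs go in together and both come back with the same shapes, which is the sense in which the step is many-to-many. Because the estimated background never exceeds the input, the subtracted data are non-negative by construction.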

Summary

Big data processing: a solution for modern scientific data processing

Modern scientific investigations are producing data at an increasing rate and volume, such that the number of processing hours required soon exceeds what a single user can feasibly spend processing them. A very topic-specific framework focused on tomography is limited in its possible applications, as it often requires a sizeable effort to extend it to include other data reduction processes such as X-ray fluorescence tomography (Hong et al., 2015; Gursoy et al., 2015) or even other tomography toolboxes (Pelt et al., 2016). Moreover, when a framework is highly optimized for use on a particular cluster architecture, it is hard to roll out a performant version to different facilities.

Savu: an open-source Python-based scientific data processing pipeline
Application: multimodal chemical tomography
Catalyst investigation
Data reduction
Summary
Findings
Conclusions