Abstract

Since the advent of modern computing, geochemists have increasingly relied on computers to gain efficiencies in calculations, data analysis, and data presentation. Entirely new fields, such as Monte Carlo-based simulation and geochemical modeling, have developed under this paradigm. With continued growth in computing power, machine learning has become an increasingly popular tool in aqueous geochemistry. However, continued reliance on algorithms to perform mathematical calculations can leave practitioners without an understanding of how to properly prepare information for models or of the reasons behind apparent patterns in the output. Machine learning algorithms can be heavily affected by which variables are chosen for the model and how data are pre-processed, including the handling of missing and censored values (e.g., values above or below a detection limit). We propose an approach of parsimonious variable selection, based partially on the signal-to-noise ratio, and discuss strategies for handling missing and censored data. As an example of unsupervised machine learning, emergent self-organizing map analysis is applied to water from oil and gas wells in the northern U.S. Gulf Coast Basin, whose composition is controlled by different processes and is derived from various origins. Findings from this investigation suggest that five groups of water samples are present, two of which were not identified using conventional data analysis methods. One notable result is that brines derived from seawater evaporation, presumably waters from which the Jurassic Louann salt precipitated, have migrated upward into shallower reservoirs across the study area. This work demonstrates that understanding data quality and learning to better interpret the output of numerical models remain critical skills for taking fuller advantage of machine learning in geochemistry.
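
The pre-processing ideas named above can be illustrated with a minimal Python sketch. The abstract does not specify the substitution rule or the signal-to-noise definition, so both are assumptions here: left-censored values (marked with the hypothetical flag "<DL") are replaced by half the detection limit, and the signal-to-noise ratio is taken as the population standard deviation of a variable across samples divided by an assumed analytical uncertainty. The function names, the threshold, and the sample values are illustrative only and are not taken from the study.

```python
from statistics import pstdev

CENSORED = "<DL"  # hypothetical flag for a value below the detection limit


def substitute_censored(values, detection_limit, factor=0.5):
    """Replace censored entries with factor * detection_limit (assumed rule)."""
    return [detection_limit * factor if v == CENSORED else v for v in values]


def signal_to_noise(values, analytical_uncertainty):
    """Assumed definition: spread across samples over analytical uncertainty."""
    return pstdev(values) / analytical_uncertainty


def select_variables(data, detection_limits, uncertainties, min_snr=2.0):
    """Keep only variables whose assumed signal-to-noise ratio meets min_snr."""
    selected = []
    for name, values in data.items():
        cleaned = substitute_censored(values, detection_limits[name])
        if signal_to_noise(cleaned, uncertainties[name]) >= min_snr:
            selected.append(name)
    return selected


# Illustrative, made-up concentrations (not data from the study).
data = {
    "Cl": [100.0, 200.0, 300.0],
    "Br": [CENSORED, 1.0, CENSORED],
}
detection_limits = {"Cl": 1.0, "Br": 1.0}
uncertainties = {"Cl": 5.0, "Br": 0.5}

kept = select_variables(data, detection_limits, uncertainties)
print(kept)
```

In this toy case the mostly censored variable varies little relative to its assumed uncertainty and is dropped, which is the kind of parsimonious screening the abstract describes. Simple substitution is only one option for censored data, and other strategies may be preferable depending on the share of censored values.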
