Expectations for unconventional reservoirs are often overly optimistic when estimation criteria are transferred from one location to another while some of the many important elements of the petroleum system remain unaccounted for. Feature engineering, used during model construction to improve the quality of the machine learning process, can improve the performance of unconventional reservoir estimates.

The compositions of petroleum fluids are represented by spectra obtained from various analyses. These spectra vary greatly in shape and dimensions, contain large amounts of data, and cannot be processed as single numbers. Selecting relevant features in the spectra can simplify interpretation, reduce dimensionality, and improve data compatibility.

Features were selected in GC-MS, NMR, and MALDI-TOF spectra of bitumen from the Bayan-Erkhet tar sand deposit in Mongolia for comparison with bitumens from other regions of the world. The GC-MS spectra have a similar baseline shape, either a triangle or two humps, with peaks eluted to varying degrees. The comparison made it possible to isolate poorly eluted peaks of carbon number groups in the studied bitumen. Because NMR spectra of petroleum fluids are most often published without NMR parameters, simple indices were developed to compare the shapes of the spectra. The values of these indices are consistent with the characteristics of the samples. The sequences of hydrocarbon compounds determined in the MALDI-TOF spectrum are very similar to those of the Banik black shale, Iraq, which made it possible to clarify the sequences. Some peaks are also found in the spectra of two other crude oil samples. Feature selection in spectral analyses thus helps reveal hidden information.