Abstract

This study investigates natural hydrogen (H2) occurrences in the Paris Basin, using Optical Character Recognition (OCR) technology to analyze an extensive yet underexploited database of historic drilling records. The potential for natural hydrogen in conventional oil and gas wells has remained largely unexplored. Using the in‐house CVAGeoDB database, which is built on public well data and includes well logs, mudlogs, and End Drilling Reports (EDRs) in PDF image format, we applied the Tesseract‐OCR Engine to convert these documents into searchable formats for efficient data analysis. Our analysis revealed several H2‐bearing wells across French sedimentary basins. The hydrogen occurrences in the Aquitaine Basin may be explained by the geological context, in particular the presence of a mantle body at shallow depth. In contrast, the detection of H2 in the Paris Basin cannot be explained in a straightforward manner, as the presence of ultramafic or U‐rich rocks is poorly documented so far. In the Paris Basin, H2 has been detected in four main formations: the Lusitanian, Dogger, and Triassic aquifers, as well as the basement. The highest hydrogen concentration (52 vol%) was measured in the Dogger aquifer. These wells are primarily located along the Bray Fault, indicating at least a structural influence on H2 distribution. Finally, the presence of serpentinized dunite from the Lizard complex associated with the bedrock may have acted as a source of H2. This research demonstrates the effectiveness of OCR in reassessing historical drilling data for natural hydrogen exploration, highlighting the need for comprehensive exploration methodologies in this emerging field.
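As an illustration of the search step that becomes possible once scanned reports are converted to text, the following is a minimal sketch, not the authors' pipeline. It assumes the OCR stage (for example, Tesseract) has already produced plain text per page; the function name `find_h2_mentions`, the regular expression, and the sample pages are all hypothetical.

```python
import re

# Assumed pattern (not from the study): matches "H2", "H 2", "hydrogen",
# or French "hydrogène". Word boundaries exclude "H2S" and "CH2".
H2_PATTERN = re.compile(r"\b(?:H\s?2|hydrog[eèé]ne?)\b", re.IGNORECASE)


def find_h2_mentions(pages, context=30):
    """Scan OCR text for hydrogen mentions.

    pages: dict mapping page number -> OCR text of that page.
    Returns a list of (page number, matched text, surrounding context).
    """
    hits = []
    for page_no, text in sorted(pages.items()):
        for m in H2_PATTERN.finditer(text):
            start = max(0, m.start() - context)
            end = min(len(text), m.end() + context)
            hits.append((page_no, m.group(0), text[start:end].strip()))
    return hits


# Hypothetical OCR output for three pages of a drilling report.
sample_pages = {
    1: "Gas show at 1450 m: CH4 80%, H2 5%, H2S traces.",
    2: "No gas.",
    3: "Présence d'hydrogène dans la boue.",
}

for page_no, match, snippet in find_h2_mentions(sample_pages):
    print(page_no, match, "|", snippet)
```

Requiring word boundaries keeps hydrogen sulfide (H2S) out of the hit list, which matters in oil and gas reports where H2S is routinely logged. A real workflow would likely also need to tolerate OCR misreads and review each hit manually.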