The upstream industry’s pervasive struggle to account for and make use of its data has become almost a cliché. But it is a reality, even though the industry is now multiple years into its digitalization phase. Among the many entrepreneurs and researchers coming up with solutions, two entities have leveraged data donations from big operators in an effort to make data access as quick and easy as a Google search.
A big operator, which may have interests in basins all over the world, adds terabytes upon terabytes of data from its wells each day to the generations of data it has already accumulated over decades. These large, disparate sets of information come both structured and unstructured from a variety of sources. Interpretations of those records can vary depending on the terminology used by the faceless person or program that put them together.
Externally, a large portion of operator data is still stowed in a disorganized manner on servers owned by multiple electronic drilling recorder providers, observed Pradeep Ashok, senior research scientist in the drilling and rig automation group at the University of Texas’s Hildebrand Department of Petroleum and Geosystems Engineering, one of the entities tackling this problem. Operators can download the data, view the data through a web interface, and perform analysis locally. But the better option would be to have all the data readily accessible in a data store within the company.
Internally, in many cases, data sit in different silos within a company, spread across different physical locations. And the people who work with that data are not static entities: When the downturn hit a few years ago and layoffs occurred, gobs of data were left stranded. Operators are still trying to account for that lost information.
Unless companies implement “a system that can deal with the volume of data created every day, they will continue to be challenged,” said Frank Perez, chief executive officer of Sfile, a software firm that has set out to help operators make the most of their data. Perez noted an instance when an operator asked Sfile to train its system to identify and collect all of the company’s hydraulic fracturing pump curves and transform them into a normalized data feature set. The operator initially estimated that it had 10,000 pump curves, but, after Sfile got hold of its data, it determined that the company actually had 65,000. “We were able to facilitate a massive database of pressure information that was used to basically build a new reservoir model,” he said. “It was just something that they didn’t even expect they had.” It is a pattern Perez sees often: Companies have far more data than they realize, and the possibilities that come with leveraging that data are endless. But those companies first need to be able to harness the data they know they have.