Colin T. Barnett and Peter M. Williams discuss how new data mining concepts such as visualization and probabilistic modelling can provide the key to improved exploration success in the mining industry. The article is a slightly abridged version of a contribution to the latest special publication from the Society of Economic Geologists, which this year celebrates its 100th anniversary.

Following an analysis of recent performance of the gold industry, Schodde (2004) concludes that gold exploration is currently only a break-even proposition. In the last 20 years, the average cost of a new discovery has increased nearly fourfold, and the average size of a deposit has shrunk by 30%. The average rate of return for the industry has been 5–7%, which is of the same order as the cost of capital. Why should this be, and what can be done about it?

Paterson (2003) observes that, historically, discoveries have taken place in waves, after the introduction of new methods or advances in the understanding of ore genesis. For instance, discovery rates jumped sharply between 1950 and 1975, following the development of new methods and instruments in exploration geophysics and geochemistry. In the last quarter century, there has been a comparable surge in digital electronics and computing that has resulted in a great increase in the quality and quantity of exploration data. Yet these developments, on their own, evidently were not sufficient to reverse a downward trend in the discovery rate during this period.

So where should we look for new methods to drive the next wave of discoveries? It seems we are now collecting data faster than we can absorb it. But this is also true in bioinformatics with genome sequencing, or in making sense of other huge corpora of data available on the Internet. It is the thesis of this paper that the new methods are to be found in ways currently being developed for extracting meaningful information from data.
Specifically, we should look to recent developments in visualization and data mining. Many data mining techniques are inspired by analogy with human intelligence and suggest a new idea of computing. Conventional computing is restricted to tasks for which a human can find an algorithm. Living creatures, however, are programmed by experience, not by spelling out every step of a process. Data mining is therefore about discovering how machines, like humans, might learn from data.

Machine learning broadly distinguishes between supervised and unsupervised learning. Supervised learning, or learning from examples, requires sufficiently many labelled cases to be available. These form a set of known input-output pairs, usually called a training set, and the task is to learn the true input-output mapping from these examples. In the exploration case, the training set typically consists of a collection of known deposits and known barren regions. For unsupervised learning, we know only the inputs and not the corresponding outputs. The aim, then, is to search for ‘interesting’ features of the data, such as clusters or outliers, or for some latent structure which would account for how they were generated. In this paper only the case of supervised learning is considered, but see Williams (2002) for some further discussion of both approaches.

The paper begins with a review of recent advances in visualization and supervised learning techniques, such as neural network models. The use of these ideas is then demonstrated in a study of gold exploration in the Walker Lane in the western United States. Finally it is shown how the results can be applied to a quantitative analysis of exploration risk, and how improved targeting accuracy can reduce exploration costs and increase the probability of success.
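
To make the idea of a training set of known deposits and known barren regions concrete, the following is a minimal sketch of supervised learning in Python. It is not the authors' method: it swaps in plain logistic regression for the neural network models they mention, and every name, parameter and data value in it is an assumption made for illustration. The two input features are hypothetical standardized survey measurements, and the labelled cases are synthetic, generated at random, not taken from any real survey.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function, mapping a score to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_training_set(n_per_class, seed=0):
    """Build labelled input-output pairs (synthetic, for illustration only).

    Label 1 stands for a 'known deposit' cell and label 0 for a
    'known barren' cell. Each input is two hypothetical standardized
    measurements drawn from a Gaussian around a class-specific mean.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append(([rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)], 1))
        data.append(([rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)], 0))
    return data


def predict_probability(weights, bias, x):
    """Estimated probability that input x belongs to the 'deposit' class."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(training_set, epochs=200, learning_rate=0.1):
    """Fit logistic regression by stochastic gradient descent."""
    weights = [0.0] * len(training_set[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in training_set:
            # Gradient of the log loss with respect to the score is (p - y).
            error = predict_probability(weights, bias, x) - y
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


training_set = make_synthetic_training_set(50)
weights, bias = train_logistic(training_set)

# Score two unlabelled cells: one resembling the deposit examples,
# one resembling the barren examples.
p_anomalous = predict_probability(weights, bias, [1.0, 1.0])
p_background = predict_probability(weights, bias, [-1.0, -1.0])
print(f"deposit-like cell: {p_anomalous:.3f}")
print(f"barren-like cell:  {p_background:.3f}")
```

The output of such a model is a probability, not a yes/no answer, which is what allows the results to feed into a quantitative treatment of exploration risk of the kind the paper goes on to describe.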