This year marks the 15th anniversary of the invention of the gene expression microarray. As mRNA transcripts serve as the blueprint within cells for making proteins, measuring mRNA levels was seen as an accurate and manageable way to investigate cell and tissue processes. The earliest microarrays in 1995 could measure 48 transcripts in parallel in plants,1 but within 1 year they were scaled up to measure more than 1000 transcripts, including those in human tissues. Today, these microarrays are essentially commodity items, commonly used to study human health and disease in hospitals and academic institutions, as well as in the biotechnology and pharmaceutical industries. Tens of thousands of publications already reference microarrays, and this is just the start. Similar arrays are already used to probe genetic differences in DNA, but even these will soon be supplanted by whole genome sequencing, where we can expect all three billion human base pairs to be sequenced for a few thousand dollars. The exponential decrease in the cost of whole genome sequencing has been described as outpacing the decline we are used to from Moore's law.2 These enormous amounts of molecular data make it clear that there is a pressing need for computational methods to analyze and interpret them.

Molecular data have never been foreign to the pages of JAMIA or AMIA Symposia. The second volume of JAMIA, back in 1995, contained an article describing how the internet and the newly introduced World Wide Web could be used to facilitate genome sequencing efforts across two academic genome centers; it also introduced concepts such as yeast artificial chromosomes, sequence tagged sites, and contigs.3 The 2002 AMIA Fall Symposium suggested bioinformatics and medical informatics …

Correspondence to Dr Atul J Butte, Stanford University School of Medicine, 251 Campus Drive MS-5415, Room X-163, Stanford, CA 94305-5415, USA; abutte{at}stanford.edu