Metagenomics has been a great boon to the field of environmental microbiology. Fuelled by major advances in sequencing technology, the number of metagenome projects has exploded in recent years, with hundreds of environmental samples having been interrogated by shotgun sequencing (Markowitz et al., 2008; Meyer et al., 2008; Liolios et al., 2009). As a result, while just a few years ago it was possible for an individual investigator to be familiar with the major shotgun metagenomic data sets, today there are far too many to easily recite. We therefore argue that the time is ripe for developing and implementing a metagenome classification system.
Why classify metagenomes? The ability to extract, study and understand information from genomic data depends heavily on comparative analysis, and metagenomic data are no exception. Yet the appropriate comparisons to make are much less clear for metagenomes than for genomes, where the choice of comparison can be guided by phylogenetic classification. Moreover, even if the types of environmental studies one would want to compare are known, it remains difficult to know how many and which are available, given the lack of systematic nomenclature (i.e. standardized naming) or categorization for these projects. For example, if you were looking for metagenomes from organisms in the digestive tracts of various animals, they might be named ‘gut’ but could also be ‘rumen’, ‘forestomach’, ‘caecum’ or ‘faecal’ communities.
Currently, metagenomic projects are not systematically classified. NCBI’s metagenomic project catalogue has implemented a simple and general project type distinction between ‘environmental’ and ‘host-associated’ projects (named correspondingly as Ecological and Organismal). This shallow classification is a starting point but does not address the many other environmental features potentially of interest for comparison.
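The naming problem described above can be illustrated with a minimal sketch. The project labels below are hypothetical and are not records from any real database; the term list simply reuses the five names mentioned in the text. The sketch compares a single-keyword search with a search over all terms assumed to belong to one habitat category.

```python
# Hypothetical project labels for illustration only; not real database records.
PROJECT_LABELS = [
    "Termite gut community",
    "Cow rumen microbiome",
    "Wallaby forestomach community",
    "Mouse caecum metagenome",
    "Human faecal sample",
    "Acid mine drainage biofilm",
]

# Assumed synonym list for one category, taken from the terms named in the text.
DIGESTIVE_TERMS = ("gut", "rumen", "forestomach", "caecum", "faecal")


def search_by_keyword(labels, keyword):
    """Return labels that contain a single keyword (case-insensitive)."""
    return [label for label in labels if keyword.lower() in label.lower()]


def search_by_category(labels, terms):
    """Return labels that contain any term assigned to one category."""
    return [
        label for label in labels
        if any(term in label.lower() for term in terms)
    ]


keyword_hits = search_by_keyword(PROJECT_LABELS, "gut")
category_hits = search_by_category(PROJECT_LABELS, DIGESTIVE_TERMS)
print(len(keyword_hits), len(category_hits))
```

In this toy list, searching for ‘gut’ alone finds one of the five digestive-tract projects, whereas the category-based search finds all five. Free-text matching of this kind is only a stopgap: it depends on someone maintaining the synonym list, which is the gap a systematic classification is meant to close.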
*For correspondence. E-mail nckyrpides@lbl.gov; Tel. (+1) 925 296 5718; Fax (+1) 925 296 5720.
Environmental Microbiology (2010) 12(7), 1803–1805 doi:10.1111/j.1462-2920.2010.02270.x
To circumvent the present difficulty in identifying appropriate metagenomic projects for comparative analysis, we present here a five-tiered metagenome naming and classification scheme. The top level includes the broad NCBI categories, but we also add a third ‘engineered’ category that separates manipulated communities such as bioreactors or treatment plants from natural environmental communities (Fig. 1). Each of these is then subcategorized according to a variety of criteria, taking into account knowledge of key variables that influence community composition [e.g. salinity (Lozupone and Knight, 2007) or soil pH (Lauber et al., 2009)]. Where possible, we have taken advantage of existing classification systems such as the Environment Ontology (EnvO; http://www.environmentontology.org/). Environmental communities are separated by ecosystem category (aquatic, terrestrial, air) and ecosystem type (e.g. freshwater, marine), with more detailed categorizations based on specific features (e.g. salinity, pH). Host-associated communities are defined by host phylogeny and then by sampling site. Finally, engineered communities are classified by their function (e.g. bioremediation or food production), with further levels based on specific substrates or features. In some cases an individual ‘project’ may span multiple categories because it includes samples from different habitat types. A sampling of the higher-level categories is shown in Table 1, and the complete proposed schema is available from GOLD (Genomes OnLine Database, http://www.genomesonline.org/cgi-bin/GOLD/bin/metagenomic_classification.cgi) and IMG/M (http://img.jgi.doe.gov/m/). Although we developed this schema to address an immediate need within these databases, we hope that it will provide the basis for a broadly