Background

The ability to establish root nodule symbioses is restricted to four different plant orders. Soil actinobacteria of the genus Frankia can establish a symbiotic relationship with a diverse group of plants within eight different families from three different orders, the Cucurbitales, Fagales and Rosales. Phylogenetically, Frankia strains can be divided into four clusters, three of which (I, II, III) contain symbiotic strains. Members of Cluster II nodulate the broadest range of host plants, with species from four families from two different orders, growing on six continents. Two Cluster II genomes have been sequenced thus far, both from Asia.

Results

In this paper we present the first Frankia Cluster II genome from North America (California), Dg2, which represents a metagenome of two major strains and one minor strain. A phylogenetic analysis of the core genomes of 16 Frankia strains shows that Cluster II is the ancestral group in the genus, also ancestral to the non-symbiotic Cluster IV. Dg2 contains the canonical nod genes nodABC for the production of lipochitooligosaccharide Nod factors, but also two copies of the sulfotransferase gene nodH. In rhizobial systems, sulfation of Nod factors affects their host specificity and their stability.

Conclusions

A comparison with the nod gene region of the previously sequenced Dg1 genome from a Cluster II strain from Pakistan shows that the common ancestor of both strains should have contained nodABC and nodH. Phylogenetically, Dg2 NodH proteins are sister to rhizobial NodH proteins. A glnA-based phylogenetic analysis of all Cluster II strains sampled thus far supports the hypothesis that Cluster II Frankia strains came to North America with Datisca glomerata following the Madrean-Tethyan pattern.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-3140-1) contains supplementary material, which is available to authorized users.