Genomewide data sets of single nucleotide polymorphisms (SNPs) offer great potential to improve ex situ conservation. Two factors impede their use for producing core collections. First, because of the large number of SNPs, assembling collections that maximize diversity may be intractable with existing serial software algorithms. Second, the effect of the genome's natural partitioning into linked regions, or haplotype blocks, on the optimization of collections and the capture of diversity is unknown. To address the first problem, we report the development of a parallel computer program, M+, for identifying optimized core collections from arbitrarily large genotypic data sets on high-performance computing systems. With respect to the second problem, we use three exemplar data sets to show that, as haplotype block length increases, the number of accessions necessary to capture a predetermined proportion of genomewide haplotypic variation also increases. This relationship is asymptotic, such that the minimum haplotype block length suitable for assembling core collections can be determined empirically, and the number of accessions necessary to capture a given percentage of the haplotypic diversity present in the entire collection can be estimated even when the true haplotype structure is unknown. Additionally, we test whether simple geographic or environmental information can be used to produce core collections with elevated genomewide haplotypic diversity. We find this opportunity to be limited and dependent on natural history and improvement status.
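To make the underlying optimization problem concrete, the following is a minimal Python sketch of one common heuristic for assembling a core collection: greedily adding the accession that most increases the allelic diversity already captured, until a target fraction of the full collection's diversity is reached. All function names and data structures here are illustrative assumptions; this sketch does not reproduce M+'s actual search algorithm, which is not specified in the text above.

```python
# Hypothetical greedy core-collection selection (illustrative only; not the
# M+ algorithm). Diversity is scored as the number of distinct alleles
# captured, summed over loci.

from typing import List, Sequence, Set


def distinct_alleles(genotypes: Sequence[Sequence[str]], members: Set[int]) -> int:
    """Count distinct alleles, summed over loci, carried by `members`."""
    if not members:
        return 0
    n_loci = len(genotypes[0])
    return sum(
        len({genotypes[i][locus] for i in members}) for locus in range(n_loci)
    )


def greedy_core(genotypes: Sequence[Sequence[str]], target_fraction: float) -> List[int]:
    """Select accessions until `target_fraction` of the full collection's
    allelic diversity is captured."""
    all_ids = set(range(len(genotypes)))
    total = distinct_alleles(genotypes, all_ids)
    core: Set[int] = set()
    captured = 0
    while captured < target_fraction * total:
        # Evaluate every remaining candidate; these evaluations are mutually
        # independent, which is what makes this style of search amenable to
        # parallelization on HPC systems.
        best_id, best_score = max(
            ((i, distinct_alleles(genotypes, core | {i})) for i in all_ids - core),
            key=lambda pair: pair[1],
        )
        if best_score == captured:  # no candidate adds new diversity
            break
        core.add(best_id)
        captured = best_score
    return sorted(core)


if __name__ == "__main__":
    # Toy data: five accessions genotyped at four loci (alleles as letters).
    toy = [
        ["A", "C", "G", "T"],
        ["A", "C", "G", "A"],
        ["G", "C", "T", "T"],
        ["A", "T", "G", "T"],
        ["G", "T", "T", "A"],
    ]
    print(greedy_core(toy, target_fraction=0.9))
```

The same scheme extends from single SNPs to haplotype blocks by replacing per-locus allele sets with per-block haplotype sets, which is why longer blocks, carrying more distinct haplotypes, require more accessions to reach a fixed fraction of total diversity.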