Most methods for inferring phylogenies assume that the data consist of a series of variables measured across a series of taxa. This general scheme holds whether these variables are measurements of quantitative characters, discretely coded morphological states, gene frequencies, or amino acids or nucleotides in a sequence. Sometimes, however, the data are in the form of a table of all pairwise distances among the taxa. For some kinds of data, such as immunological distances or measurements of DNA hybridization, the data are originally collected as pairwise measures of difference between the taxa. In other cases, the investigator has computed distances from the original data table. These cases are distinguished by the possibility, in the latter case, of discarding the distances and returning to an analysis of the original data. The first papers using distance methods to infer phylogenies were of the latter type. Fitch and Margoliash (1967) started with amino acid sequences of cytochrome c, then reduced the data to a table of the percentages of sites differing between each pair of species. Cavalli-Sforza and Edwards (1967) developed a distance measure based on gene frequency data, presenting a least-squares method based on the distances. The readiness with which distance approaches were adopted by these authors was due in part to the popularity of Sokal and Sneath's (1963) clustering methods for the mathematically related, but logically distinct, task of erecting classifications. However, phylogenetic distance methods have their own rationale, independent of any analogy to clustering: they are not intrinsically phenetic, as opposed to phylogenetic, methods. Since these original papers, a number of other distance methods have been introduced (Hartigan, 1967; Farris, 1972; Moore et al., 1973; Beyer et al., 1974; Tateno et al., 1982; Chakraborty, 1977; Sattath and Tversky, 1977; Waterman et al., 1977; Fitch, 1981).
It is not my intention to review all of these methods here; I have described them briefly elsewhere (Felsenstein, 1982). What all have in common is that they try to find a rooted or unrooted tree, usually interpreted as a phylogeny, that most closely fits the observed distances. They differ in the measure of fit of the distances to the tree and in the constraints they impose on the tree. All of the methods consider a tree to fit the data perfectly if every observed distance is the sum of the lengths of the intervening branches of the tree. Such a perfect fit is shown in Figure 1, which gives a table of distances that can be fit perfectly by a particular tree, together with a rooted version of that tree. Each branch of the tree has a length indicated next to it, and the distance between each pair of tips is in this example simply the sum of the lengths of the intervening branches. Thus the distance between tips A and D is 5 + 10 + 2 + 8 = 25. For any proposed tree, we can compute for each pair of tips the sum of the lengths of the branches between them. This quantity, which we call dij', should be close to the observed distance dij if the tree is a good fit. Different methods use different formulas for assessing the goodness of fit. For example, the methods of Fitch and Margoliash (1967) and Cavalli-