This article introduces constNJ (constrained neighbor-joining), an algorithm for phylogenetic reconstruction of sets of trees with constrained pairwise rooted subtree-prune-regraft (rSPR) distance. We are motivated by the problem of constructing sets of trees that must fit into a recombination, hybridization, or similar network. Rather than first finding a set of trees that are optimal according to a phylogenetic criterion (e.g., likelihood or parsimony) and then attempting to fit them into a network, constNJ estimates the trees while enforcing specified rSPR distance constraints. The primary input for constNJ is a collection of distance matrices derived from sequence blocks that are assumed to have evolved in a tree-like manner, such as blocks of an alignment that contain no recombination breakpoints. The other input is a set of rSPR constraint inequalities on any set of pairs of trees. constNJ is consistent and is a strict generalization of the neighbor-joining algorithm; it uses the new notion of maximum agreement partitions (MAPs) to ensure that the resulting trees satisfy the given rSPR distance constraints.

}, keywords = {2010, Algorithms, Center-Authored Paper, Computational Biology, Computer Simulation, Databases, Genetic, HIV, Phylogeny, Public Health Sciences Division, Sequence Alignment}, issn = {1557-8666}, author = {Matsen, Frederick A} }