We have developed Procrustes, a free, open-source, cross-platform, and user-friendly Python library implementing a wide range of algorithmic solutions to Procrustes problems. The goal of Procrustes analysis is to find an optimal transformation that makes two matrices as close as possible to each other, where the matrices are often (but need not always be) a list of multidimensional points specifying the systems of interest. We demonstrate the functionality of the package through various examples, mostly from cheminformatics. However, Procrustes analysis has broad applicability, including image recognition, signal processing, data science, machine learning, computational biology, chemistry, and physics. Our library includes methods for one-sided Procrustes problems using orthogonal, rotational, symmetric, and permutation transformation matrices, as well as two-sided Procrustes problems using orthogonal and permutation transformation matrices. For the two-sided permutation Procrustes problem, we include heuristic algorithms along with a rigorous (but slow) method based on softassign. In addition, we include a general formulation of the Procrustes problem. The Procrustes source code and documentation are hosted on GitHub (https://github.com/theochem/procrustes).

Program summary

Program Title: Procrustes

CPC Library link to program files: https://doi.org/10.17632/57dkchhjbp.1

Developer's repository link: https://github.com/theochem/procrustes

Licensing provisions: GNU General Public License v3.0

Programming language: Python

Supplementary material: Summary of Implemented Procrustes Algorithms

Nature of problem: The generic Procrustes problem aims to find the transformation (e.g., rotation, permutation, scaling, etc.) of a matrix (often constructed as a list of data points) that minimizes its distance to another matrix. This quantifies the “true” similarity between the two entities represented by the matrices.
While this mathematical problem occurs in many contexts, it is most prevalent in the context of point-set registration, where two sets of points are aligned by rotating (and/or permuting) one set of points. Other applications include the traditional (linear) assignment problem and the quadratic assignment problem, where the transformation matrix is a permutation. The Kabsch algorithm for molecular structure alignment is also equivalent to rotational Procrustes.

Solution method: The Procrustes library implements explicit solutions for the one-sided orthogonal, rotational, and symmetric Procrustes problems and uses the Hungarian algorithm for the one-sided permutation Procrustes problem. In addition to the explicit solution for the two-sided orthogonal Procrustes problem with two transformations, approximate algorithms are provided for the two-sided orthogonal Procrustes problem with one transformation and the two-sided permutation Procrustes problem with one transformation. For the two-sided permutation Procrustes problem with one transformation, several new heuristics are implemented and an accurate (but computationally slow) method based on the softassign algorithm is provided. In addition, a brute-force combinatorial approach can be used, albeit only for small matrices. Translation, rotation, and scaling of matrices can be treated automatically with Procrustes functionality.

Additional comments including restrictions and unusual features: A very broad range of Procrustes problems is treated within a common framework, including some unconventional problems like symmetric Procrustes and two-sided Procrustes problems. Several new heuristic methods for the two-sided permutation Procrustes problem with one transformation are provided, along with a robust softassign approach. The software is intended to work for matrices of moderate size.
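
As an illustration of the brute-force combinatorial approach mentioned under the solution method, the following is a minimal sketch of the two-sided permutation Procrustes problem with one transformation, i.e. minimizing the squared Frobenius norm of P^T A P - B over all permutation matrices P. It is written in plain Python and does not use or reproduce the Procrustes library's API; the function name `two_sided_permutation_brute_force` and the example matrices are hypothetical and chosen only for this sketch.

```python
from itertools import permutations


def two_sided_permutation_brute_force(a, b):
    """Exhaustively minimize ||P^T A P - B||_F^2 over all permutations.

    `a` and `b` are square matrices given as lists of lists.
    Hypothetical helper for illustration; not the library's API.
    """
    n = len(a)
    best_perm, best_error = None, float("inf")
    for perm in permutations(range(n)):
        # With P[perm[i], i] = 1, the entry (P^T A P)[i][j] is a[perm[i]][perm[j]].
        error = sum(
            (a[perm[i]][perm[j]] - b[i][j]) ** 2
            for i in range(n)
            for j in range(n)
        )
        if error < best_error:
            best_perm, best_error = perm, error
    return best_perm, best_error


# Example: b is a with rows and columns relabeled by the permutation (2, 0, 1).
a = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]
b = [[0, 2, 3],
     [2, 0, 1],
     [3, 1, 0]]

perm, error = two_sided_permutation_brute_force(a, b)
print(perm, error)
```

Because the loop visits all n! permutations, this approach is only practical for small matrices, which is why heuristic and softassign-based methods are needed for larger ones.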