Abstract
We propose a novel approximate method for the probabilistic catalog matching problem that provides better solutions than previously used heuristics and scales well to large real-world applications. We also improve probabilistic catalog matching by including a simple but powerful prior and optimizing the posterior, rather than only the likelihood as in previous formulations. Our new approach uses constrained clustering, specifically COP-KMeans, to provide near-optimal solutions in a fraction of the time required by previous methods. We empirically demonstrate the efficacy of our constrained clustering approach through simulations and data from the Hubble Source Catalog.
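To make the constrained clustering idea concrete, the following is a minimal sketch of a COP-KMeans-style assignment loop. It rests on assumptions not stated in the abstract: that the constraints are cannot-link constraints between detections from the same catalog, that plain Euclidean distance stands in for the posterior objective, and that the number of clusters `k` is given. The function name `cop_kmeans` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def cop_kmeans(points, catalog_ids, k, n_iter=20, seed=0):
    """Illustrative COP-KMeans with cannot-link constraints (an assumption):
    two detections from the same catalog may not share a cluster."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    catalog_ids = np.asarray(catalog_ids)
    n = len(points)
    # Initialise centers from k distinct detections.
    centers = points[rng.choice(n, size=k, replace=False)]
    labels = np.full(n, -1)
    for _ in range(n_iter):
        new_labels = np.full(n, -1)
        used = [set() for _ in range(k)]  # catalogs already present per cluster
        for i in range(n):
            dists = np.linalg.norm(centers - points[i], axis=1)
            # Try clusters from nearest to farthest, skipping infeasible ones.
            for c in np.argsort(dists):
                if catalog_ids[i] not in used[c]:
                    new_labels[i] = c
                    used[c].add(catalog_ids[i])
                    break
            else:
                raise ValueError("no feasible cluster; increase k")
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Standard k-means update: move each center to the mean of its members.
        for c in range(k):
            members = points[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels, centers

# Toy example: two sky objects, each detected once by three catalogs.
detections = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
catalogs = [0, 1, 2, 0, 1, 2]
labels, centers = cop_kmeans(detections, catalogs, k=2)
print(labels)
```

In this sketch the constraint check is what distinguishes the loop from ordinary k-means: a detection falls back to the next-nearest cluster whenever the nearest one already contains a detection from its catalog.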