Probabilistic multi-catalogue positional cross-match

F.-X. Pineau, S. R. Rosen, L. Michel, A. Nebot Gómez-Morán, S. Derriere, F. J. Carrera, B. Mingo, A. Ruiz Camuñas, F. Genova, C. Motch, A. Mints

doi:10.1051/0004-6361/201629219

Abstract

Context. Catalogue cross-correlation is essential to building large sets of multi-wavelength data, whether it be to study the properties of populations of astrophysical objects or to build reference catalogues (or time series) from survey observations. Nevertheless, resorting to automated processes with limited sets of information available on large numbers of sources detected at different epochs with various filters and instruments inevitably leads to spurious associations. We need both statistical criteria to select detections to be merged as unique sources, and statistical indicators that help in reaching compromises between completeness and reliability of selected associations.

Aims. We lay the foundations of a statistical framework for multi-catalogue cross-correlation and cross-identification based on explicit simplified catalogue models. A proper identification process should rely on both astrometric and photometric data. Under some conditions, the astrometric part and the photometric part can be processed separately and merged a posteriori to provide a single global probability of identification. The present paper addresses almost exclusively the astrometric part and specifies the proper probabilities to be merged with photometric likelihoods.

Methods. To select matching candidates in n catalogues, we used the Chi (or, equivalently, the Chi-square) test with 2(n−1) degrees of freedom. We thus call this cross-match a χ-match. In order to use Bayes’ formula, we considered exhaustive sets of hypotheses based on combinatorial analysis. The volume of the χ-test domain of acceptance – a 2(n−1)-dimensional acceptance ellipsoid – is used to estimate the expected numbers of spurious associations. We derived priors for those numbers using a frequentist approach relying on simple geometrical considerations. Likelihoods are based on standard Rayleigh, χ and Poisson distributions that we normalized over the χ-test acceptance domain. We validated our theoretical results by generating and cross-matching synthetic catalogues.

Results. The results we obtain do not depend on the order used to cross-correlate the catalogues. We applied the formalism described in the present paper to build the multi-wavelength catalogues used for the science cases of the Astronomical Resource Cross-matching for High Energy Studies (ARCHES) project. Our cross-matching engine is publicly available through a multi-purpose web interface. In the longer term, we plan to integrate this tool into the CDS XMatch Service.
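The candidate selection described in the Methods can be illustrated with a minimal sketch. It is not the authors' cross-matching engine: it assumes circular Gaussian positional errors, flat tangent-plane coordinates in arcseconds, and a hypothetical `completeness` threshold of 0.9973, none of which are specified in the abstract. The function names `chi2_sf_even_dof` and `chi_match` are likewise illustrative. The sketch computes the inverse-variance weighted mean position of n detections, forms the sum of weighted squared residuals, and compares it with a Chi-square law with 2(n−1) degrees of freedom, using the closed-form survival function that exists for even degrees of freedom.

```python
import math


def chi2_sf_even_dof(x, dof):
    """Survival function P(X > x) of a Chi-square law with an even number of dof.

    For dof = 2k the closed form is exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi_match(positions, sigmas, completeness=0.9973):
    """Test whether n detections are compatible with a single source.

    positions    : list of (x, y) offsets in arcsec on a tangent plane (assumption)
    sigmas       : circular 1-sigma positional errors in arcsec (assumption)
    completeness : fraction of true associations to keep (assumed default)

    Returns (chi2, dof, accepted).
    """
    n = len(positions)
    if n < 2 or n != len(sigmas):
        raise ValueError("need at least two positions, one sigma per position")
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)
    # Inverse-variance weighted mean position of the n detections.
    mx = sum(w * x for w, (x, _) in zip(weights, positions)) / wsum
    my = sum(w * y for w, (_, y) in zip(weights, positions)) / wsum
    # Sum of weighted squared residuals around the mean position.
    chi2 = sum(w * ((x - mx) ** 2 + (y - my) ** 2)
               for w, (x, y) in zip(weights, positions))
    dof = 2 * (n - 1)
    accepted = chi2_sf_even_dof(chi2, dof) >= 1.0 - completeness
    return chi2, dof, accepted


if __name__ == "__main__":
    # Two detections 1 arcsec apart, each with a 1 arcsec error.
    print(chi_match([(0.0, 0.0), (1.0, 0.0)], [1.0, 1.0]))
```

For two detections the statistic reduces to the squared separation divided by the sum of the two variances, so the example above gives a Chi-square value of 0.5 with 2 degrees of freedom and the pair is accepted. Elliptical errors would require replacing the scalar weights with inverse covariance matrices.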

Highlights

The development of new detectors with high throughput over large areas has revolutionized observational astronomy during recent decades.
We applied the formalism described in the present paper to build the multi-wavelength catalogues used for the science cases of the Astronomical Resource Cross-matching for High Energy Studies (ARCHES) project.
Although ARCHES originally focused on the cross-matching of XMM-Newton sources, the algorithms developed in this context are clearly applicable to any combination of catalogues and energy bands.

Summary

Introduction

The development of new detectors with high throughput over large areas has revolutionized observational astronomy during recent decades. These technological advances, aided by a considerable increase of computing power, have opened the way to outstanding ground-based and space-borne all-sky or very large area imaging projects (e.g. the 2MASS, Skrutskie et al. 2006, Cutri et al. 2003; SDSS, Ahn et al. 2012, 2013; and WISE, Wright et al. 2010, Cutri et al. 2014, surveys). At the 2020 horizon, European space missions such as Gaia and Euclid together with the Large Synoptic Survey Telescope (LSST) will provide a several-fold increase in the number of catalogued optical objects while providing measurements of exquisite astrometric and photometric quality. This exponentially increasing flow of high quality multi-wavelength data has radically altered the way astronomers design observing strategies and tackle scientific issues. Although ARCHES originally focused on the cross-matching of XMM-Newton sources, the algorithms developed in this context are clearly applicable to any combination of catalogues and energy bands (see for example Mingo et al. 2016).

Going beyond the two-catalogue case

Simplifying assumptions

Notations

Classical positional errors in catalogues

Candidates selection: the χ-match

Estimation of the real position given n observations

Candidates selection criterion

Iterative form: catalogue by catalogue

Iterative form: by groups of catalogues

Summary and Interpretation

Hypotheses from combinatorial considerations

Generalities

Possible combinations and the Bell number

Two-catalogues case: two hypotheses

Three-catalogues case: five hypotheses

Frequentist estimation of spurious associations rates and priors

Case of two catalogues

Case of three catalogues

Case of n catalogues

Probability of being χ-matched under hypothesis hi

General formula

Case of four catalogues

Advantage and limits

Warning about the non-independence of positional uncertainties

Probability using the Mahalanobis distance

Putting aside the Mahalanobis distance

Tests on synthetic catalogues

Summarized recipe

Conclusions

A BC obs A BC theo

Sum of two χ functions

Estimating proper motions

Simple case: no covariance

Findings

Testing the unique source hypothesis

Journal: Astronomy & Astrophysics
Publication Date: Jan 1, 2017
Citations: 56
License type: cc-by
