Abstract

The general problem of data integration is expressed as that of combining probability distributions, each conditioned to an individual datum or data event, into a posterior probability for the unknown conditioned jointly to all data. Any such combination of information requires taking into account data interaction for the specific event being assessed. The nu expression provides an exact analytical representation of such a combination. This representation allows a clear and useful separation of the two components of any data integration algorithm: individual data information content and data interaction, the latter being different from data dependence. Any estimation workflow that fails to address data interaction is not only suboptimal, but may also result in severe bias. The nu expression reduces the possibly very complex joint data interaction to a single multiplicative correction parameter ν0, which is difficult to evaluate but whose exact analytical expression is given; the availability of such an expression provides avenues for its determination or approximation. The case ν0=1 is more comprehensive than data conditional independence; it delivers a preliminary robust approximation in the presence of actual data interaction. An experiment where the exact results are known allows the results of the ν0=1 approximation to be checked against the traditional estimators based on the assumption of data independence.
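The abstract does not state the nu expression itself. As a minimal sketch, assume the commonly cited distance-ratio form, x/x0 = ν0 ∏ (xi/x0), where x0 = (1 − P(A))/P(A) is the prior distance to the event A, xi = (1 − P(A|Di))/P(A|Di) is the distance given datum Di alone, and the combined posterior is P(A|D1,…,Dn) = 1/(1 + x). The function names and the `nu0` parameter below are hypothetical, chosen only for illustration, and the default `nu0=1.0` corresponds to the ν0=1 approximation discussed above.

```python
def distance(p):
    """Distance to the event occurring, (1 - p) / p, for 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return (1.0 - p) / p


def nu_combine(prior, conditionals, nu0=1.0):
    """Combine single-datum probabilities P(A|Di) into P(A|D1,...,Dn).

    Assumed form (not given in the abstract): x / x0 = nu0 * prod(xi / x0).
    nu0 = 1 ignores the data-interaction correction; any other value must
    be supplied by the user, since evaluating it is the hard part.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    x0 = distance(prior)
    x = x0 * nu0
    for p in conditionals:
        x *= distance(p) / x0  # each datum rescales the prior distance
    return 1.0 / (1.0 + x)


if __name__ == "__main__":
    # Two data events, each raising a prior of 0.5 to 0.8 on its own.
    print(nu_combine(0.5, [0.8, 0.8]))
```

Under this assumed form, a single datum returns its own conditional probability unchanged, and an empty data list returns the prior, which gives two quick sanity checks on any implementation.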
