Abstract
Gene prioritization aims at identifying the most promising candidate genes among a large pool of candidates-so as to maximize the yield and biological relevance of further downstream validation experiments and functional studies. During the past few years, several gene prioritization tools have been defined, and some of them have been implemented and made available through freely available web tools. In this study, we aim at comparing the predictive performance of eight publicly available prioritization tools on novel data. We have performed an analysis in which 42 recently reported disease-gene associations from literature are used to benchmark these tools before the underlying databases are updated. Cross-validation on retrospective data provides performance estimate likely to be overoptimistic because some of the data sources are contaminated with knowledge from disease-gene association. Our approach mimics a novel discovery more closely and thus provides more realistic performance estimates. There are, however, marked differences, and tools that rely on more advanced data integration schemes appear more powerful. yves.moreau@esat.kuleuven.be Supplementary data are available at Bioinformatics online.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.