Abstract
Calling genomic structural variants (SVs) in high-throughput sequencing data first requires discovering clusters of abnormally aligned (discordant) read pairs that indicate candidate SVs. Some SV discovery methods collect these candidates by heuristically searching for maximal cliques in an undirected graph, in which nodes represent discordant read pairs and an edge between two nodes indicates that the corresponding read pairs overlap. This approach works well for identifying clusters that overlap with noisy mapping artefacts, but it can miss distinct variant clusters that arise from complex structural variants or from overlapping breakpoints of distinct SVs. In this paper, we consider the minimum weight clique partition problem and its application to discordant read pair clustering. Our results demonstrate that methods which approximate or heuristically solve this problem can enhance the predictive abilities of structural variant calling algorithms.
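The graph formulation above can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the abstract does not specify the overlap criterion or the clique weights, so the sketch assumes that two read pairs overlap when both their left and right mate intervals intersect, and it uses a simple unweighted greedy heuristic to partition the pairs into cliques. The names `ReadPair`, `pairs_overlap`, and `greedy_clique_partition` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation of a discordant read pair:
# (left_start, left_end, right_start, right_end), closed intervals.
ReadPair = Tuple[int, int, int, int]


def intervals_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """Return True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def pairs_overlap(p: ReadPair, q: ReadPair) -> bool:
    """Assumed edge criterion: both mates of the two pairs must overlap."""
    return (intervals_overlap(p[0], p[1], q[0], q[1])
            and intervals_overlap(p[2], p[3], q[2], q[3]))


def greedy_clique_partition(pairs: List[ReadPair]) -> List[List[ReadPair]]:
    """Partition read pairs into cliques of the overlap graph.

    Each pair joins the first existing clique in which it overlaps every
    member; otherwise it starts a new clique. This is an unweighted greedy
    heuristic, not an exact or weighted solver.
    """
    cliques: List[List[ReadPair]] = []
    for pair in sorted(pairs):
        for clique in cliques:
            if all(pairs_overlap(pair, member) for member in clique):
                clique.append(pair)
                break
        else:
            # No existing clique can absorb this pair.
            cliques.append([pair])
    return cliques


if __name__ == "__main__":
    # Four pairs whose left mates all overlap but whose right mates
    # fall into two separate regions.
    example = [
        (100, 150, 500, 550),
        (120, 170, 520, 570),
        (110, 160, 900, 950),
        (130, 180, 910, 960),
    ]
    for cluster in greedy_clique_partition(example):
        print(cluster)
```

In this toy input the left mates of all four pairs overlap, but the right mates map to two separate regions. Because every read pair is assigned to exactly one clique, the partition keeps the two groups as separate candidate clusters.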