Abstract

Structural variations (SVs) play an essential role in the evolution of human genomes and are associated with cancer and rare diseases. High-throughput chromosome conformation capture (Hi-C) technology probes crosslinked chromatin genome-wide to study the spatial architecture of chromosomes. Because Hi-C read pairs can span megabases, the technology is useful for detecting large-scale SVs. However, the identification of SVs from Hi-C data is still in its early stages, with only a few methods available. In particular, no algorithm has been developed that can detect SVs without control samples. We therefore developed HiSV (Hi-C for Structural Variation), a control-free method for identifying large-scale SVs from a single Hi-C sample. Inspired by the single-image saliency detection model, HiSV constructs a saliency map of interaction frequencies and extracts salient segments as large-scale SVs. In evaluations on both simulated and real data, HiSV not only detected all variant types but also achieved higher accuracy and sensitivity than existing methods. Moreover, our results on cancer cell lines showed that HiSV effectively detected eight complex SV events and identified two novel SVs involving key factors associated with cancer development. Finally, we found that integrating the results of HiSV helped the WGS method identify a total of 94 novel SVs in two cancer cell lines.
