Abstract

DNase I hypersensitive sites (DHS) are genomic regions that are sensitive to cleavage by the DNase I enzyme. Knowledge of these sites helps in deciphering the functions of non-coding genomic regions, and DHS are involved in various biological processes. With the explosive growth of newly discovered genomic sequences, machine learning (ML) based predictors are strongly encouraged for the accurate and timely identification of DHS. Various ML techniques, such as support vector machines, random forests and K-nearest neighbors, have played complementary roles in developing sequence-based predictors for studying DHS for drug discovery. This review therefore conducts a comprehensive, comparative analysis of state-of-the-art ML-based predictors with respect to sequence representation methods, ML models, cross-validation and assessment parameters. Moreover, the inherent weaknesses and design schemes of the existing predictors are thoroughly discussed. Finally, we provide future perspectives and guidelines for improving prediction performance and developing robust DHS models. We anticipate that this review will help the research community develop robust DHS predictors for rapid screening and discrimination of DHS and non-DHS for drug design and clinical use.

