Human epididymis protein 4 (HE4) has been extensively investigated as a biomarker for diagnosing ovarian cancer (OC). Knowledge of the protein's tertiary structure is crucial for developing diagnostic techniques and drug delivery applications, yet the Protein Data Bank (PDB) does not currently contain a three-dimensional (3D) structure of HE4. An in silico analysis was therefore conducted to model HE4 with the AlphaFold, I-TASSER, and Robetta servers, using the sequence retrieved from UniProt (ID: Q14508). These servers rely on deep learning, template threading, and de novo methods, respectively. Each 3D model was then refined by molecular dynamics (MD) simulation with the GROMACS software package, yielding the optimized structures RF1, RF2, and RF3. Structure quality was assessed with the PROCHECK and ERRAT programs. The Ramachandran plots from PROCHECK showed 100% of residues within the allowed regions for every server model except the I-TASSER model. In the refined structures RF1 and RF3, all residues likewise fell within the allowed regions. According to ERRAT, RF1 had the highest overall quality factor (97.701), followed by RF3 (94.643) and the AlphaFold model (93.750). On the basis of these validations, RF1 emerged as the most accurately predicted 3D structure of HE4. It contains one tunnel, identified with the CAVER 3.0 tool, that facilitates the transport of small molecules to the active site, a finding supported by the FTsite and PrankWeb binding-site predictions. This model can serve as a basis for computational studies, including the development of OC diagnostic kits, and should improve understanding of the interactions between HE4 and other biomolecules.
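The Ramachandran assessment mentioned above rests on two backbone dihedral angles per residue: φ, defined by the atoms C(i-1), N(i), CA(i), C(i), and ψ, defined by N(i), CA(i), C(i), N(i+1). As a minimal sketch of that underlying geometry, and not of PROCHECK's own implementation, the Python below computes both angles from atomic coordinates. The function names `dihedral_deg` and `phi_psi` and the dictionary layout of a residue are assumptions made for illustration; classifying the resulting (φ, ψ) pairs into allowed regions is left to PROCHECK, whose empirical region map is not reproduced here.

```python
import math


def _sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)

    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("central bond has zero length")
    b1 = tuple(c / norm for c in b1)

    # Project the outer bonds onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = tuple(b0[i] - d0 * b1[i] for i in range(3))
    w = tuple(b2[i] - d2 * b1[i] for i in range(3))

    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(residues):
    """Return a list of (phi, psi) per residue.

    `residues` is an assumed layout: a list of dicts with keys "N", "CA", "C",
    each mapping to an (x, y, z) tuple. phi is None for the first residue and
    psi is None for the last, because the neighbouring atom is missing.
    """
    angles = []
    last = len(residues) - 1
    for i, res in enumerate(residues):
        phi = None
        psi = None
        if i > 0:
            phi = dihedral_deg(residues[i - 1]["C"], res["N"], res["CA"], res["C"])
        if i < last:
            psi = dihedral_deg(res["N"], res["CA"], res["C"], residues[i + 1]["N"])
        angles.append((phi, psi))
    return angles
```

In practice the coordinates would be read from the model's PDB file with a structure parser. The pure-Python form is used here only to keep the geometry explicit and free of dependencies.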