Abstract
Background: Proteins are an integral part of all living organisms and are built from chains of amino acids. To become functionally active, an amino acid chain folds in a complex way that gives each protein a unique 3D shape, and even a minor error can produce a misfolded structure. Genetic disorders such as Alzheimer's disease and Parkinson's disease arise from protein misfolding. Identifying patterns of amino acids is therefore important for inferring protein-associated genetic diseases. Recent studies that predicted amino acid patterns by association rule mining focused only on a simple protein misfolding disease, namely Chromaffin Tumor; more complex diseases have yet to be attempted. Moreover, the association rules obtained in these studies were not verified with measures of usefulness.

Results: In this work, we analyzed protein sequences associated with complex protein misfolding diseases (Sickle Cell Anemia, Breast Cancer, Cystic Fibrosis, Nephrogenic Diabetes Insipidus, and Retinitis Pigmentosa 4) using association rule mining and objective interestingness measures. Experimental results show the effectiveness of our method.

Conclusion: By adopting quantitative experimental methods, this work can form more reliable, useful, and strong association rules, i.e. the dominating amino acid patterns of complex protein misfolding diseases. In addition to their usual applications, the identified patterns can therefore be more useful in discovering medicines for protein misfolding diseases, and may open up new opportunities in medical science for handling genetic disorders.
Highlights
Frequent Patterns (FP) are small patterns that occur repeatedly in a database and are especially common in biological sequences
Protein misfolding may occur because of an unwanted mutation in a protein's amino acids or because of an error in the folding process. The relationship between these amino acids is vital in protein misfolding diseases
The aim of this paper was to analyze protein sequences associated with complex protein misfolding diseases (Sickle Cell Anemia, Breast Cancer, Cystic Fibrosis, Nephrogenic Diabetes Insipidus, and Retinitis Pigmentosa 4) and to identify frequent patterns among their amino acids
Summary
Proteins are an integral part of all living organisms and are built from chains of amino acids. An amino acid chain folds in a complex way that gives each protein a unique 3D shape, and even a minor error can produce a misfolded structure. Protein misfolding may occur because of an unwanted mutation in a protein's amino acids or because of an error in the folding process, so the relationship between these amino acids is vital in protein misfolding diseases. Frequent Patterns (FP) are small patterns that occur repeatedly in a database and are especially common in biological sequences. Identifying such patterns of amino acids is important for inferring protein-associated genetic diseases. Recent studies that predicted amino acid patterns by association rule mining focused only on a simple protein misfolding disease, namely Chromaffin Tumor, and the association rules obtained in these studies were not verified with measures of usefulness.
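
To make the idea of an association rule over amino acids concrete, the following is a minimal Python sketch, not the authors' method. It assumes each protein sequence is treated as the set of residues it contains, and it scores a rule with support, confidence, and lift, which are commonly used objective interestingness measures; the text above does not say which measures the paper used. The function name `rule_metrics` and the toy sequences are hypothetical and do not come from any real disease protein.

```python
def rule_metrics(sequences, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumption: every sequence is reduced to the set of amino acid
    letters it contains, so residue order and repetition are ignored.
    """
    transactions = [set(seq) for seq in sequences]
    n = len(transactions)
    a, c = set(antecedent), set(consequent)

    # Count sequences that contain the antecedent, the consequent, and both.
    count_a = sum(a <= t for t in transactions)
    count_c = sum(c <= t for t in transactions)
    count_both = sum((a | c) <= t for t in transactions)

    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the baseline frequency
    # of the consequent; values above 1 suggest a positive association.
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Toy, made-up sequences in one-letter amino acid codes (illustrative only).
toy_sequences = ["MVLK", "MKLA", "GAVL", "MKVA"]

support, confidence, lift = rule_metrics(toy_sequences, "M", "K")
print(f"M -> K: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

In this sketch a rule such as M -> K is considered stronger when its lift is well above 1, meaning K appears alongside M more often than its overall frequency would predict. A real analysis would need a principled way to turn sequences into transactions (for example, fixed-length windows), which is left open here.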