Abstract

Association analysis of amino acids in molecular sequences can reveal information crucial for understanding the structure, function and interaction of proteins. Traditional association rule mining methods such as Apriori and FP-Growth fail to generate appropriate patterns because of the uncertainty inherent in the data. In sequence data this uncertainty arises from variation in sequence length and from a lack of parameterization, and it leads to under-prediction and over-prediction of results. In this paper, a soft set based approach is developed for mining fuzzy association patterns in peptide sequences of dengue virus. The fuzzy set approach incorporates the degree of relationship among amino acids that results from variation in sequence length. The soft set approach incorporates the relationship of parameters with amino acid association patterns. A total of 12,581 dengue virus sequences were downloaded from NCBI and screened for redundancy, yielding 6,995 non-redundant sequences. The amino acid associations are explored and analyzed using the soft fuzzy approach, and the results are compared with those obtained individually by the ordinary, fuzzy and soft set approaches. The soft fuzzy approach overcomes the under-prediction and over-prediction observed with the other approaches. Interesting association rules are also generated to predict the structure and physicochemical properties of the peptide sequences.
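To illustrate why sequence length matters for support counting, the following is a minimal sketch and not the paper's method. It assumes a fuzzy membership equal to a residue's relative frequency in a sequence (count divided by length) and combines memberships with the minimum operator. The function names, the membership definition and the toy sequences are all hypothetical choices made for illustration; the soft set parameterization is not modelled here.

```python
from collections import Counter

def memberships(seq):
    """Assumed membership: relative frequency of each residue (count / length)."""
    n = len(seq)
    if n == 0:
        return {}
    return {aa: c / n for aa, c in Counter(seq).items()}

def crisp_support(seqs, items):
    """Ordinary support: fraction of sequences containing every item."""
    return sum(all(a in s for a in items) for s in seqs) / len(seqs)

def fuzzy_support(seqs, items):
    """Fuzzy support: mean over sequences of the minimum item membership."""
    total = 0.0
    for s in seqs:
        mu = memberships(s)
        total += min(mu.get(a, 0.0) for a in items)
    return total / len(seqs)

def fuzzy_confidence(seqs, antecedent, consequent):
    """Fuzzy confidence of antecedent -> consequent as a ratio of fuzzy supports."""
    denom = fuzzy_support(seqs, antecedent)
    if denom == 0:
        return 0.0
    return fuzzy_support(seqs, list(antecedent) + list(consequent)) / denom

# Hypothetical toy peptides of different lengths (not from the dengue data set).
toy = ["AAKL", "AK", "LLLL"]
print(crisp_support(toy, ["A", "K"]))
print(fuzzy_support(toy, ["A", "K"]))
print(fuzzy_confidence(toy, ["A"], ["K"]))
```

In this toy case the crisp count treats the four-residue and two-residue peptides identically, while the fuzzy support weights each occurrence by how much of the sequence it accounts for.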
