Integration of residue attributes for sequence diversity characterization of terpenoid enzymes.

Nelson Kibinge, Naoaki Ono, Shigehiko Kanaya, Md Altaf-Ul-Amin, Shun Ikeda

doi:10.1155/2014/753428

Open Access

https://doi.org/10.1155/2014/753428

Abstract

Progress in the “omics” fields such as genomics, transcriptomics, proteomics, and metabolomics has created a need for innovative analytical techniques to derive meaningful information from ever-increasing molecular data. KNApSAcK motorcycle DB is a popular database of enzymes related to secondary metabolic pathways in plants. One challenge in analyzing protein sequence data in such repositories is that sequences are conventionally written as strings of alphabetical characters. This notation lacks a natural underlying metric, which makes the sequences less amenable to computation. To address this, we applied a novel integration of selected biochemical and physical attributes of amino acids, derived from the amino acid index and quantified on a numerical scale, to examine the diversity of peptide sequences of terpenoid synthases accumulated in KNApSAcK motorcycle DB. We first generated a reduced amino acid index table, a set of biochemical and physical properties obtained by random forest feature selection of important indices from the amino acid index. Principal component analysis was then applied to characterize enzymes involved in the synthesis of terpenoids. Incorporating residue attributes into the analyses increased the variance explained.
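The quantity compared in the abstract, the variance explained by principal components, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, swaps in power iteration for a full eigendecomposition, and reports only the first component. The function names and the small feature matrix are hypothetical; each row stands for one sequence and each column for one numeric residue attribute averaged over that sequence.

```python
import math

def center(rows):
    """Subtract the column mean from every entry."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix (n - 1 denominator) of centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector via power iteration."""
    d = len(matrix)
    # Non-uniform start vector; an arbitrary choice for this sketch.
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero matrix or degenerate start; stop early
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v

def pc1_explained_variance(rows):
    """Fraction of total variance captured by the first principal component."""
    cov = covariance(center(rows))
    eigenvalue, direction = leading_eigenpair(cov)
    total = sum(cov[i][i] for i in range(len(cov)))
    return eigenvalue / total, direction

# Hypothetical toy features: 4 sequences x 3 attributes (made-up numbers).
toy_features = [
    [0.2, 1.1, -0.5],
    [0.4, 0.9, -0.1],
    [1.5, -0.3, 0.8],
    [1.7, -0.6, 1.0],
]
ratio, pc1 = pc1_explained_variance(toy_features)
print(f"PC1 explains {ratio:.1%} of the variance")
```

Under these assumptions, two encodings of the same sequences could be compared by building one feature matrix per encoding and comparing the returned ratios.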

Highlights

Biology and other modern sciences have become data intensive, and data-driven biology is now a full-fledged domain of specialization among the life sciences.
The KNApSAcK database describes species-metabolite relationships. Within the KNApSAcK family, we have developed an enzyme-reaction database called KNApSAcK motorcycle DB, which contains reactions and enzyme peptide sequences based on experimental evidence and focuses on secondary metabolic reactions in plants.
We examined the performance of principal component analysis (PCA) classification when encoding with rAAindex biochemical and physicochemical properties (BPPs) is implemented, relative to the commonly used 8-bit binary encoding.

Summary

Introduction

Biology and other modern sciences have become data intensive, and data-driven biology is now a full-fledged domain of specialization among the life sciences. Computational analyses of the often heterogeneous datasets require theoretical representations in forms suitable for various data processing models. This formal representation has been defined as sequence feature coding [3]. We introduce a subset of biochemical and physicochemical properties (BPPs) for encoding amino acid residue properties into protein sequences during analyses. We found that this increases the flexibility of computational analyses that focus on biochemical, physical, and evolutionary attributes of sequence data. Integration of BPPs is employed to examine diversity in enzymes related to secondary metabolite pathways, specifically those involved in terpenoid synthesis.
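As an illustration of sequence feature coding with residue properties, the sketch below replaces each one-letter residue symbol with a short numeric vector. The two properties are assumptions chosen only for the example: Kyte-Doolittle hydropathy (AAindex accession KYTJ820101) and a simplified side-chain charge. They are not the indices the authors selected for their reduced table, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (AAindex accession KYTJ820101).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified formal side-chain charge near neutral pH.
# Assumption for this sketch: histidine is treated as neutral.
CHARGE = {aa: 0.0 for aa in HYDROPATHY}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})

# Example property set only; not the reduced table described in the paper.
PROPERTY_TABLES = (HYDROPATHY, CHARGE)

def encode_residues(sequence, tables=PROPERTY_TABLES):
    """Map each residue to a list with one numeric value per property table."""
    encoded = []
    for residue in sequence.upper():
        if residue not in tables[0]:
            raise ValueError(f"unsupported residue symbol: {residue!r}")
        encoded.append([table[residue] for table in tables])
    return encoded

def sequence_profile(sequence, tables=PROPERTY_TABLES):
    """Average each property over the sequence to get a fixed-length vector."""
    encoded = encode_residues(sequence, tables)
    if not encoded:
        raise ValueError("sequence is empty")
    n = len(encoded)
    return [sum(column) / n for column in zip(*encoded)]

# Hypothetical peptide fragment, used only to show the call pattern.
print(encode_residues("MKT"))
print(sequence_profile("MKT"))
```

Averaging is one simple way to obtain vectors of equal length from sequences of different lengths; per-position encodings of aligned sequences are another option, and the source does not say which form was used.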

Journal: BioMed Research International
Publication Date: Jan 1, 2014
Citations: 10
License type: CC BY 3.0
