Hybrid MM/SVM structural sensors for stochastic sequential data

Brian Roux, Stephen Winters-Hilt

doi:10.1186/1471-2105-9-s9-s12

Open Access

https://doi.org/10.1186/1471-2105-9-s9-s12

Abstract

In this paper we present preliminary results stemming from a novel application of Markov Models and Support Vector Machines to the classification of Exon-Intron and Intron-Exon (5' and 3') splice sites. We present the use of Markov-based statistical methods, in a log-likelihood discriminator framework, to create a non-summed, fixed-length feature vector for SVM-based classification. We also explore the use of Shannon-entropy-based analysis for automated identification of minimal-size models (where smaller models have known information loss according to the specified Shannon entropy representation). We evaluate a variety of kernels and kernel parameters in the classification effort. For comparison, we present results of the algorithms on splice-site datasets consisting of sequences from a variety of species.

Highlights

We are exploring hybrid methods where Markov-based statistical profiles, in a log-likelihood discriminator framework, are used to create a fixed-length feature vector for Support Vector Machine (SVM) based classification.
The Shannon-entropy analysis was critical to identifying information-rich sequence regions around the splice-site locations, and is used in defining the positional range of the positionally defined Markov Models (pMM's) needed in the SVM classification that follows.
It is found that the positions identified in the low Entropy (lEnt) regions carry information about the splice site, which a trained SVM can classify with high accuracy.
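
The idea of locating low-entropy positions around a splice site can be illustrated with a minimal sketch. It assumes a set of equal-length DNA sequences aligned on the splice site and computes the Shannon entropy of the base distribution in each column. The function names and the threshold value are illustrative assumptions, not taken from the paper, and the paper's actual entropy criterion may differ.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, no ambiguity codes


def positional_entropy(seqs):
    """Shannon entropy (in bits) of the base distribution at each column.

    All sequences are assumed to have the same length and to be aligned
    on the splice site.
    """
    length = len(seqs[0])
    entropies = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts[b] for b in ALPHABET)
        h = 0.0
        for b in ALPHABET:
            p = counts[b] / total
            if p > 0:
                h -= p * math.log2(p)
        entropies.append(h)
    return entropies


def low_entropy_positions(seqs, threshold=1.5):
    """Indices of columns whose entropy falls below a hypothetical threshold."""
    return [i for i, h in enumerate(positional_entropy(seqs)) if h < threshold]
```

With four bases the entropy of a column ranges from 0 bits (fully conserved) to 2 bits (uniform), so a column such as the conserved GT of a donor site would score near 0 and be selected.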

Summary

Introduction

We are exploring hybrid methods where Markov-based statistical profiles, in a log-likelihood discriminator framework, are used to create a fixed-length feature vector for Support Vector Machine (SVM) based classification. The individual-observation log odds ratios are themselves constructed from positionally defined Markov Models (pMM's), so what results is a pMM/SVM sensor method. This method may have utility in a number of areas of stochastic sequential analysis that are being actively researched, including splice-site recognition and other types of gene-structure identification, file recovery in computer forensics ('file carving'), and speech recognition. We test our pMM/SVM method on an interesting discrimination problem in gene-structure identification: splice-site recognition. In this situation the pMM/SVM approach evaluates the log odds ratio of an observed stochastic sequence, for splice-site versus non-splice-site, by Chow expansion decomposition, but keeps the log odds ratios of the conditional probabilities on individual observations as a vector rather than summing them.
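
The difference between the summed log odds score and its vectorized form can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a first-order positional Markov model, add-one pseudocounts, and hypothetical function names (`train_pmm`, `log_odds_vector`). For a sequence x_1 ... x_L, the classical score is the sum over i of log[P+(x_i | x_{i-1}) / P-(x_i | x_{i-1})]; here the individual terms are returned as a length-L feature vector instead.

```python
import math

ALPHABET = "ACGT"  # assumption: plain DNA alphabet


def train_pmm(seqs, order=1, pseudocount=1.0):
    """Estimate a positional Markov model from equal-length aligned sequences.

    Returns a function prob(i, ctx, sym) giving P(sym at position i | ctx),
    where ctx is the preceding `order` symbols (shorter at the left edge).
    """
    length = len(seqs[0])
    counts = [dict() for _ in range(length)]  # counts[i][ctx][sym]
    for s in seqs:
        for i in range(length):
            ctx = s[max(0, i - order):i]
            by_sym = counts[i].setdefault(ctx, {})
            by_sym[s[i]] = by_sym.get(s[i], 0.0) + 1.0

    def prob(i, ctx, sym):
        by_sym = counts[i].get(ctx, {})
        total = sum(by_sym.values()) + pseudocount * len(ALPHABET)
        return (by_sym.get(sym, 0.0) + pseudocount) / total

    return prob


def log_odds_vector(seq, pos_model, neg_model, order=1):
    """Per-position log odds ratios, kept as a vector instead of summed."""
    vec = []
    for i, sym in enumerate(seq):
        ctx = seq[max(0, i - order):i]
        vec.append(math.log(pos_model(i, ctx, sym) / neg_model(i, ctx, sym)))
    return vec
```

Calling `sum(log_odds_vector(...))` recovers the usual scalar log-likelihood-ratio discriminator, while the unsummed vector has one component per position and can be handed to an SVM as a fixed-length feature vector, leaving it to the SVM to weight the positions.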

Journal: BMC Bioinformatics
Publication Date: Aug 12, 2008
Citations: 14
License type: CC BY 2.0
