Abstract

Epigenetic modifications are chemical modifications to DNA that leave the base sequence unchanged. While epigenetic modifications are known to have far-reaching implications for how genes are expressed, it is difficult to identify what a given modification is or where it occurs. A next-generation sequencing method called nanopore sequencing may be the solution. Nanopore sequencing applies a voltage bias across the DNA sequence and outputs a distinct electrical response for each genetic unit. Epigenetic modifications may then be identified by their distinct electrical responses. In this paper we provide preliminary results of applying dynamic time warping (DTW) Barycenter Averaging (DBA) to multiple noisy nanopore streams to generate a consensus signal that can be used to identify genetic sequences and their modifications. We evaluate DBA convergence rates and time complexity, together with qualitative and quantitative metrics for comparing the consensus signal with gold standards.

Highlights

  • Bioinformatics combines elements of biology and computer science to collect and analyze complex biological data, such as macromolecular structures, genetic sequencing data, and genomic experiment or gene expression data [1]

  • We explore dynamic time warping (DTW) Barycenter Averaging (DBA) convergence rates and time complexity, and examine several qualitative and quantitative metrics for comparing the consensus signal derived from several noisy nanostreams with the original gold standards

  • We propose the following measure for the DBA convergence of the P individual streams to a consensus signal


Summary

INTRODUCTION

Bioinformatics combines elements of biology and computer science to collect and analyze complex biological data, such as macromolecular structures, genetic sequencing data, and genomic experiment or gene expression data [1]. As DTW is not entirely dependent on synchronous timing, the time series may be warped non-linearly, with their time axes stretched or shrunk, during the alignment process [3]. Nanopore sequencing may be used in conjunction with powerful data mining algorithms in an attempt to identify and characterize epigenetic modifications. We suggest that the usefulness of this property is as important to geneticists requiring a consensus signal as it is to traders wanting to create valid alpha curves. We believe that this is the first investigation of using DBA to generate a consensus signal that preserves the key features of multiple nanopore streams while avoiding the smoothed result that is characteristic of most averaging methods. A conclusion follows to summarize the key features of this paper.
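To make the alignment and averaging steps concrete, the following is a minimal sketch of DTW alignment followed by DBA-style averaging. It is not the implementation used in the paper. The function names (`dtw_path`, `dba_update`, `dba`), the squared-difference local cost, the initialisation from the first stream, and the fixed iteration count are all assumptions made for illustration.

```python
from typing import List, Tuple


def dtw_path(a: List[float], b: List[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the DTW cost and warping path between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2  # assumed local cost: squared difference
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover which samples were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal: advance both series
            (cost[i - 1][j], i - 1, j),          # b[j-1] is matched again (stretch)
            (cost[i][j - 1], i, j - 1),          # a[i-1] is matched again (stretch)
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def dba_update(consensus: List[float], streams: List[List[float]]) -> List[float]:
    """One DBA iteration: align every stream to the consensus, then average."""
    buckets: List[List[float]] = [[] for _ in consensus]
    for s in streams:
        _, path = dtw_path(consensus, s)
        for ci, si in path:
            # Collect every stream sample that DTW matched to consensus point ci.
            buckets[ci].append(s[si])
    return [sum(b) / len(b) for b in buckets]


def dba(streams: List[List[float]], iterations: int = 10) -> List[float]:
    """Iterate the update from an initial guess (assumed: the first stream)."""
    consensus = list(streams[0])
    for _ in range(iterations):
        consensus = dba_update(consensus, streams)
    return consensus
```

Because each consensus point is the mean of only the samples DTW aligned to it, streams that are shifted or stretched in time contribute to the same feature rather than blurring it, which is the behaviour a point-by-point arithmetic mean lacks. The dynamic-programming table makes each alignment O(nm) in the two sequence lengths, so one update over P streams costs P such alignments.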

RELATIONSHIP BETWEEN DTW AND DBA
Gold Standard Simulation
Individual nanostream simulation
DBA convergence rate and time complexity
Qualitative DBA consensus metrics
CONCLUSION