Abstract

Background: Genome browsers are widely used for locating interesting genomic regions, but their interactive use is limited to inspecting short genomic portions. An ideal interaction would let the user specify patterns of regions in the browser and then extract the other genomic regions, over the whole genome, where such patterns occur, ranked by similarity.

Results: We developed SimSearch, an optimized pattern-search method and an open source plugin for the Integrated Genome Browser (IGB), to find genomic region sets that are similar to a given region pattern. It efficiently computes genome-wide visual analytics on large datasets; the plugin supports intuitive user interactions for selecting an interesting pattern on IGB tracks and visualizing the computed occurrences of similar patterns along the entire genome. SimSearch also includes functions for the annotation and enrichment of results, and is enhanced with a Quickload repository including numerous epigenomic feature datasets from ENCODE and Roadmap Epigenomics. The paper also includes use cases showing multiple genome-wide analyses of biological interest that can be easily performed with the presented approach.

Conclusions: The novel SimSearch method provides innovative support for effective genome-wide pattern search and visualization; its relevance and practical usefulness are demonstrated through a number of significant use cases of biological interest. The SimSearch IGB plugin, documentation, and code are freely available at https://deib-geco.github.io/simsearch-app/ and https://github.com/DEIB-GECO/simsearch-app/.

Highlights

  • Genome browsers are widely used for locating interesting genomic regions, but their interactive use is obviously limited to inspecting short genomic portions

  • Search, visualization, comparison, and biological interpretation: the search method implemented in our SimSearch IGB App efficiently looks for patterns of (epi)genomic regions within tracks loaded in the Integrated Genome Browser, and supports visualizing, analyzing, and biologically interpreting the obtained results

  • After loading (epi)genomic region datasets in IGB tracks, either from files, a Distributed Annotation System (DAS) server, or a Quickload server (Fig. 2a), the user can define a query pattern “model” based on loaded tracks; it can be a selection of tracks or of specific regions on the tracks (Fig. 2b), for example, peak regions related to histone marks or transcription factors visualized in IGB


Summary

Results

Search, visualization, comparison, and biological interpretation

The search method implemented in our SimSearch IGB App efficiently looks for patterns of (epi)genomic regions within tracks loaded in the Integrated Genome Browser, and supports visualizing, analyzing, and biologically interpreting the obtained results. Some studies identified several patterns of histone mark combinations as associated with specific chromatin states in some cell lines [7, 13]. Starting from such patterns, our SimSearch IGB plugin can efficiently search for them, or for similar patterns, genome-wide in different datasets loaded as IGB tracks; this makes it possible, for instance, to infer the regulation state of genomic regions in different cell types or cell lines under different conditions.

Running SimSearch (with default parameters), we obtained 9,117 results with a matching score over 0.9. To assess their relevance, we annotated them with the chromatin states calculated by ChromHMM on the same dataset, loaded in IGB as an additional track (see Table 2 and Supplementary Material section S6 for details about annotating pattern search results with an IGB track); 7,800 matches (85% of the total) covered 9,892 regions annotated as Enhancer (Enh) by ChromHMM (Fig. 2h).

We implemented a search on DNA loops that can help identify additional matches on track elements brought together by DNA-DNA interaction (see Supplementary Material section S7 and Table 1).
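The annotation step described above (checking how many search results fall on regions carrying a given chromatin-state label) can be illustrated with a minimal Python sketch. This is not the SimSearch implementation: the function names `annotate_matches` and `fraction_annotated`, the tuple layouts, and the use of half-open coordinates are all assumptions made for illustration, and the data are toy values rather than the results reported here.

```python
from collections import defaultdict


def annotate_matches(matches, annotations, label="Enh"):
    """Return the matches overlapping at least one annotation with `label`.

    Hypothetical helper, not part of SimSearch or IGB.
    matches:     iterable of (chrom, start, end), half-open intervals (assumed)
    annotations: iterable of (chrom, start, end, label), half-open intervals
    """
    # Group the annotation intervals of interest by chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end, lab in annotations:
        if lab == label:
            by_chrom[chrom].append((start, end))

    annotated = []
    for chrom, start, end in matches:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(a_start < end and start < a_end
               for a_start, a_end in by_chrom.get(chrom, ())):
            annotated.append((chrom, start, end))
    return annotated


def fraction_annotated(matches, annotations, label="Enh"):
    """Fraction of matches that overlap an annotation with `label`."""
    matches = list(matches)
    if not matches:
        return 0.0
    return len(annotate_matches(matches, annotations, label)) / len(matches)


# Toy data only; these are not the results discussed in the text.
toy_matches = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
toy_states = [
    ("chr1", 150, 300, "Enh"),
    ("chr1", 550, 700, "Tx"),
    ("chr2", 0, 60, "Enh"),
]
print(annotate_matches(toy_matches, toy_states))
print(fraction_annotated(toy_matches, toy_states))
```

The linear scan per match keeps the sketch short; for genome-scale tracks, sorting the annotation intervals and using binary search or an interval tree would be the more usual choice.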
