Abstract

BackgroundThe biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to modeling the genome-to-phenome relationship. The fundamental questions that GPMs address involve making decisions based on a complex web of interacting factors. Unfortunately, while GPMs ideally fit many questions in biology, they are not an easy solution to apply. Building a GPM is not a simple task for an end user. Moreover, applying GPMs is also impeded by the insidious fact that the “complex web of interacting factors” inherent to a problem might be easy to define and also intractable to compute upon.DiscussionWe propose that the visualization sciences can contribute to many domains of the bio-sciences, by developing tools to address archetypal representation and user interaction issues in GPMs, and in particular a variety of GPM called a Conditional Random Field(CRF). CRFs bring additional power, and additional complexity, because the CRF dependency network can be conditioned on the query data.ConclusionsIn this manuscript we examine the shared features of several biological problems that are amenable to modeling with CRFs, highlight the challenges that existing visualization and visual analytics paradigms induce for these data, and document an experimental solution called StickWRLD which, while leaving room for improvement, has been successfully applied in several biological research projects.Software and tutorials are available at http://www.stickwrld.org/

Highlights

  • The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM)

  • In this manuscript we examine the shared features of several biological problems that are amenable to modeling with Conditional random field (CRF), highlight the challenges that existing visualization and visual analytics paradigms induce for these data, and document an experimental solution called StickWRLD which, while leaving room for improvement, has been successfully applied in several biological research projects

  • Results of biological application We have applied these ideas in the development of a prototype visualization system, StickWRLD, and used this system in collaboration with several biological labs to create novel and powerful statistical models that are being used for productive work today

Read more

Summary

Discussion

CRFs bring additional power, and additional complexity, because the CRF dependency network can be conditioned on the query data

Conclusions
Background
Results and discussion
Conclusion
Methods
25. Ray WC
31. Ozer HG
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call