Abstract

BioTechniques, Vol. 40, No. 3
Technology News
Open Access

Computational Biology
Lynne Lederman
Published Online: 21 May 2018
https://doi.org/10.2144/06403TN01

Back to the Future

Broadly defined, computational biology is the use of computers to solve biologic problems. Richard W. Pastor, Laboratory of Biophysics, U.S. Food and Drug Administration (FDA) Center for Biologics Evaluation and Research, Rockville, MD, sees computational biology as encompassing several fields, including bioinformatics, systems biology, and his own area of research, molecular dynamics simulation. “How do you analyze all the data in the huge databases coming out?” he asks. “The tools needed include high-powered statistics, e.g., Bayesian statistics and high-powered computer searches.” But computational biology is not just about analysis for its own sake. “What do people want to know about? Diseases people have. Going back to systematics and reanalyzing evolutionary trees, seeing where branches split. We can not only go back with new tools and data, but go forward to medical applications. There are questions from 100 years ago we can now solve.”

Molecular Machines

Andrew Neuwald, Associate Professor, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, performs computational analyses of proteins that are well conserved across major groups of living organisms. Certain residues have been conserved for over a billion years, suggesting they have critical functions. He has developed procedures for classifying and estimating selective constraints on specific residues and analyzing corresponding molecular interactions.
Although proteins in large classes of well-conserved families (e.g., the AAA+ class of ATPases, signaling GTPases, and protein kinases) are most amenable to this analysis, he also uses it for smaller classes (e.g., DNA sliding clamps).

“We've had an industrial revolution in biology,” notes Neuwald, “with experimental factories creating massive amounts of experimental data which is not hypothesis-driven, including the genome project, microarrays, and two-hybrid experiments. A fundamental question is how to maintain scientific rigor and apply the scientific method rather than ad hoc approaches.”

Neuwald asks, “How can you apply a scientific method approach to non-hypothesis-driven science, like looking at mountains of sequence data? I think to ensure scientific rigor, you need to employ Bayesian statistics.” Rather than assuming a single hypothesis is true and trying to disprove it, this approach considers every hypothesis at once and doesn't assume what the true model is. In fields such as systems biology and signal transduction, the single-hypothesis approach breaks down because an unfeasibly large number of possible models could be set up, and a hypothesis that is consistent with the results could still be false.

For Neuwald, proteins are the simplest biologic system one can study. “All the residues impart some function, but the system is something more than the sum of its parts. The scientific reductionist approach is to isolate components and see how they work, but this doesn't work for proteins. Bayesian statistics can infer and take into account properties of the system.”

“Bayesian statistics gives a blurry probabilistic view of a system. Physicists working with quantum mechanics are used to thinking this way,” Neuwald observes. “Biologists have to start thinking this way.
The role of statisticians is underappreciated.”

Modeling Membranes

Pastor's work is based on the observation that Newton's equations of motion, used to describe the orbits of planets around the sun, can also describe all motions in fluids other than chemical reactions or motions in superfluids. To create molecular dynamics simulations in biologic systems, every atom in a membrane is treated with Newton's equations of motion, which are solved for a certain length of time. A typical simulation involving 20,000 to 50,000 atoms requires solving about 100,000 differential equations. “These simulations can take 3 months on pretty big clusters of computers. We are looking at 10 to 15 nanoseconds of the process, and the membrane probably takes a microsecond to fuse, and maybe a second to fold, so we are looking at early stages,” Pastor says.

Image 1. Packing of lipids around the hemagglutinin fusion peptide from a molecular dynamics simulation. (Purple = choline groups, green = phosphate groups for lipids; hydrophilic residues in red and hydrophobic residues in yellow for the peptides.) Courtesy of Patrick Lagüe, Departments of Biochemistry and Microbiology, Université Laval, Québec, Canada and Richard W. Pastor, Laboratory of Biophysics, Center for Biologics Evaluation and Research, FDA, Rockville, MD, USA.

Some of his group's work involves looking at the interaction of the influenza virus hemagglutinin fusion protein with the cell membrane. Because of its polar residues, the influenza virus hemagglutinin fusion peptide can only partly bury itself in the top leaflet of the lipid bilayer. To avoid an unfavorable vacuum region, the lipids adjacent to the peptide curl under it, while those on the opposing leaflet extend into the upper leaflet. They speculate that these events trigger positive membrane curvature and eventual cell fusion. A long-range plan is to simulate bigger membranes and look at how the virus fuses, eventually testing drugs to prevent fusion.
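The simulation idea described above, applying Newton's equations of motion to every atom and stepping them forward in time, can be illustrated with a minimal sketch. The article does not say which integrator or force field Pastor's group uses; the velocity Verlet scheme below is a common choice in molecular dynamics, and the harmonic "spring" force between neighboring atoms, the function names, and all parameter values are assumptions made purely for illustration. The equation count assumes one second-order equation per Cartesian coordinate per atom, which gives 60,000 to 150,000 equations for 20,000 to 50,000 atoms and so brackets the figure of about 100,000 quoted above.

```python
import math


def count_equations(n_atoms):
    """One second-order equation of motion per Cartesian coordinate per atom."""
    return 3 * n_atoms


def spring_forces(positions, k=1.0, rest_length=1.0):
    """Forces for a toy chain of atoms joined by harmonic springs.

    This stands in for a real force field and is purely illustrative.
    """
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i in range(len(positions) - 1):
        dist = math.dist(positions[i], positions[i + 1])
        magnitude = k * (dist - rest_length)  # positive when the spring is stretched
        for d in range(3):
            unit = (positions[i + 1][d] - positions[i][d]) / dist
            forces[i][d] += magnitude * unit      # atom i is pulled toward atom i + 1
            forces[i + 1][d] -= magnitude * unit  # equal and opposite force
    return forces


def total_energy(positions, velocities, mass=1.0, k=1.0, rest_length=1.0):
    """Kinetic plus potential energy; should stay nearly constant over a run."""
    kinetic = sum(0.5 * mass * sum(c * c for c in v) for v in velocities)
    potential = sum(
        0.5 * k * (math.dist(positions[i], positions[i + 1]) - rest_length) ** 2
        for i in range(len(positions) - 1)
    )
    return kinetic + potential


def velocity_verlet(positions, velocities, n_steps, dt=0.01, mass=1.0):
    """Advance Newton's equations of motion by n_steps time steps of size dt."""
    positions = [list(p) for p in positions]    # copy so the inputs are not modified
    velocities = [list(v) for v in velocities]
    forces = spring_forces(positions)
    for _ in range(n_steps):
        for i in range(len(positions)):
            for d in range(3):
                velocities[i][d] += 0.5 * dt * forces[i][d] / mass  # first half kick
                positions[i][d] += dt * velocities[i][d]            # drift
        forces = spring_forces(positions)                           # forces at new positions
        for i in range(len(positions)):
            for d in range(3):
                velocities[i][d] += 0.5 * dt * forces[i][d] / mass  # second half kick
    return positions, velocities


if __name__ == "__main__":
    print(count_equations(20_000), count_equations(50_000))
    start = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [2.4, 0.0, 0.0]]  # slightly stretched chain
    at_rest = [[0.0, 0.0, 0.0] for _ in start]
    end_positions, end_velocities = velocity_verlet(start, at_rest, n_steps=1000)
    print(total_energy(start, at_rest), total_energy(end_positions, end_velocities))
```

A production simulation differs mainly in scale and in the force calculation: the force routine dominates the cost, which is one reason runs of tens of thousands of atoms are spread across clusters.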
Other applications include simulating protein folding.

Image 2. Schematic of how hemagglutinin fusion peptide (red) might induce a positive curvature in the lipid bilayer prior to cell membrane fusion. Courtesy of Patrick Lagüe, Departments of Biochemistry and Microbiology, Université Laval, Québec, Canada and Richard W. Pastor, Laboratory of Biophysics, Center for Biologics Evaluation and Research, FDA, Rockville, MD, USA.

“What's really limiting, although computers are so much faster now, is that we still need more. It would help if more people had access to IBM's Blue Gene supercomputers with a reasonable budget. We need Blue Gene capabilities for $100,000, a little faster, and a lot cheaper,” Pastor says. He recalls that in 1993, he published a study simulating a few hundred picoseconds of movement on an IBM 3090. “Now we can model 300 nanoseconds using six systems in clusters in 2 to 3 months. Computer times get faster, but mostly they get cheaper and we run in clusters in parallel.”

Family Trees

Susan Katherine Pell, Laboratory Manager, Brooklyn Botanic Garden, Brooklyn, NY, is a molecular plant systematist. She reconstructs evolutionary relationships of plants by sequencing their DNA, focusing on the cashew family (Anacardiaceae), which includes mango, pistachio, poison ivy, poison oak, and poison sumac. Many members of this family cause contact dermatitis. Poison ivy, oak, and sumac are all closely related to each other, but molecular systematic investigation shows that other members of the family have evolved their toxic chemicals several times.

“We do a lot of computation and data gathering when we analyze sequence data,” she says. Her laboratory uses several different programs to capture sequence data, align gene fragments within a species, then align multiple samples from multiple species to create phylogenetic trees, which may number in the thousands. The final step is to create a consensus of these trees by making multiple comparisons of the branching patterns.
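The consensus step described above can be sketched as follows. The article does not name the programs or the consensus method Pell's laboratory uses; this sketch assumes a majority-rule consensus, in which a grouping of taxa (a clade) is kept if it appears in more than half of the candidate trees. The nested-tuple tree representation, the function names, and the taxon labels A through E are hypothetical toy inputs, not data from her work.

```python
from collections import Counter


def clades(tree):
    """Return (leaves, clades) for a rooted tree written as nested tuples.

    A leaf is a string; an internal node is a tuple of subtrees. Each clade is
    the frozenset of leaf names found below one internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves = leaves | child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def majority_rule_clades(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of the trees to its frequency."""
    counts = Counter()
    for tree in trees:
        all_leaves, found = clades(tree)
        # The clade containing every taxon is trivial, so it is skipped.
        counts.update(c for c in found if c != all_leaves)
    return {
        clade: n / len(trees)
        for clade, n in counts.items()
        if n / len(trees) > threshold
    }


# Three hypothetical candidate trees over the same five made-up taxa.
toy_trees = [
    ((("A", "B"), "C"), ("D", "E")),
    (("A", "B"), ("C", ("D", "E"))),
    ((("A", "C"), "B"), ("D", "E")),
]
support = majority_rule_clades(toy_trees)

if __name__ == "__main__":
    for clade, freq in sorted(support.items(), key=lambda item: sorted(item[0])):
        print(sorted(clade), round(freq, 2))
```

With thousands of candidate trees the same counting applies; the clade frequencies are what is usually reported as support on the branches of the consensus tree.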
She is looking at mitochondrial genes, a first for the cashew family, as well as a chloroplast gene, among others. One byproduct of her work is finding new genera or species, or ones that have been known but misclassified. In addition, she sometimes looks deliberately for new genera or species, such as when a colleague asks her to examine material from a herbarium collection.

“There definitely are people who get upset about reclassification, but the majority of systematists want to use all the tools that are available,” Pell observes. “It's possible to extract DNA from herbarium specimens. It's harder than with fresh specimens, which are optimal. My success rate is about 70%,” she notes. “Other people have done extractions with specimens up to 100 years old.”

Image 3. Generation of a chloroplast gene phylogeny of cashew family members via maximum likelihood analysis of the trnL intron and 3′ exon and the trnL-trnF intergenic spacer. Courtesy of Susan Pell, Brooklyn Botanic Garden.

Pell loves what she does, which combines work in the molecular biology laboratory with field work. “The best thing about being a botanist is traveling around the world, especially when I know a plant only from a herbarium specimen and can see it in its environment. When you look at herbarium specimens of closely related plants and then can see them growing, you can appreciate that they are obviously closely related, and that really brings it home.” She particularly enjoys seeing how local people use and manage plants. Much of her field work takes place in Madagascar. One cashew family member, marula (Sclerocarya birrea), which is native to sub-Saharan Africa, is an increasingly important crop plant used for many products. “It's exciting,” she says. “I get to see the tree in its native environment and climb it. The Brooklyn Botanic Garden is wonderfully supportive of both my field-based and lab-based research.
It's amazing to work here.”

Future Directions

Clearly, computational biology requires multidisciplinary skills. One program that is addressing the need to train a new generation of scientists in the field is the Computational Biology Doctoral Program at New York University. Supported by the National Science Foundation and funded by its IGERT (Integrative Graduate Education and Research Traineeship) Program, it involves the interaction of seven different divisions and departments, including biology, chemistry, mathematics, computer science, neuroscience, and medicine. Says its director, Tamar Schlick, who holds appointments in the Departments of Chemistry, Mathematics, and Computer Science, “I think it's a really great field with a lot of growth and application potential. The new generation of computational biologists will be better trained. Prior generations of biologists had to learn computer science on the side, and vice versa. The new interdisciplinary NYU program will allow scientists to tackle problems with multiple scientific tools and new perspectives.”

Pastor thinks that in 10 years, one could use computer simulation to do more ingenious things, for example, to devise methods for vaccine development. “But vaccine development has been done for the last 200 years without computers,” he concedes. “In the medical field, the available technology is still better for diagnostics than therapeutics. The most exciting thing is to be able to look at certain chemotherapies with proteomics and see changes in the metabolism of cells in 12 hours instead of taking up to a month to see changes in cell morphology.”

Published in print March 2006. © 2006 Author(s)
