Abstract

Biology is a challenging and complicated mess. Understanding this challenging complexity is the realm of the biological sciences: trying to make sense of massive, messy data by discovering patterns and revealing underlying general rules. Among the most powerful mathematical tools for organizing and structuring complex, heterogeneous and noisy data are those provided by multivariate statistical analysis (MSA) approaches. These eigenvector/eigenvalue data-compression approaches were first introduced to electron microscopy (EM) in 1980 to help sort out different views of macromolecules in a micrograph. After 35 years of continuous use and development, new MSA applications are still being proposed regularly. The speed of computing has increased dramatically in the decades since their first use in electron microscopy; however, the size and complexity of the EM data sets to be studied have grown possibly even more rapidly. MSA computations had thus become a very serious bottleneck limiting their general use. The parallelization of our programs, speeding up the process by orders of magnitude, has opened whole new avenues of research. The speed of the automatic classification in the compressed eigenvector space had also become a bottleneck that needed to be removed. In this paper we explain the basic principles of multivariate statistical eigenvector-eigenvalue data compression; we provide practical tips and application examples for those working in structural biology; and we provide the more experienced researcher in this and other fields with the formulas associated with these powerful MSA approaches.
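
To make the two central ideas of the abstract concrete, the following minimal sketch (written in Python/NumPy, and not the authors' IMAGIC implementation) illustrates eigenvector/eigenvalue data compression of simulated noisy "particle images" followed by a simple classification in the compressed eigenvector space. The simulated motifs, the number of retained factors, and the plain k-means step are illustrative assumptions; the paper's own automatic classification methods are more elaborate.

```python
# Illustrative sketch only: MSA-style eigenvector compression + classification
# of simulated noisy images. All names and parameters here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# --- Simulate a tiny data set: two distinct 16x16 "views", many noisy copies ---
n_per_view, size = 200, 16
view_a = np.zeros((size, size)); view_a[4:12, 4:12] = 1.0   # square motif
view_b = np.zeros((size, size)); view_b[2:14, 7:9]  = 1.0   # bar motif
images = np.concatenate([
    view_a + rng.normal(0, 2.0, (n_per_view, size, size)),
    view_b + rng.normal(0, 2.0, (n_per_view, size, size)),
])
X = images.reshape(len(images), -1)          # each row = one image as a vector

# --- Eigenvector/eigenvalue compression (PCA via SVD of the centered data) ---
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
n_factors = 4                                 # keep only the strongest eigenimages
coords = U[:, :n_factors] * s[:n_factors]     # images projected into factor space
eigenvalues = s**2 / (len(X) - 1)             # variance captured per eigenvector

# --- Simple k-means classification in the compressed factor space ---
k = 2
centers = coords[rng.choice(len(coords), k, replace=False)]
for _ in range(50):
    labels = np.argmin(((coords[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([coords[labels == j].mean(axis=0) for j in range(k)])

print("top eigenvalues:", np.round(eigenvalues[:n_factors], 2))
print("class sizes:", np.bincount(labels))    # should recover the two views
```

Because the clustering operates on a handful of factor coordinates rather than on the full pixel vectors, the classification step scales with the number of retained eigenvectors, which is the practical point of the compression.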

Highlights

  • The electron microscope (EM) instrument, initially developed by Ernst Ruska in the early nineteen thirties [1], became a routine scientific instrument during the nineteen fifties and sixties.

  • In the nineteen sixties and seventies, the early days of single-particle electron microscopy, the main specimen preparation approach used for investigating the structure of biological macromolecules was the negative stain technique, in which the samples were contrasted with heavy metal salts like uranyl acetate [3,4].

  • Over the past three decades, multivariate statistical analysis approaches have been very successful in helping us sort out complex EM data sets in many different ways, and examples have been discussed such as a first “manifold” separation into two functional states [54].



Introduction

The electron microscope (EM), initially developed by Ernst Ruska in the early nineteen thirties [1], became a routine scientific instrument during the nineteen fifties and sixties. In the nineteen sixties and seventies, the early days of single-particle electron microscopy, the main specimen preparation approach used for investigating the structure of biological macromolecules was the negative stain technique, in which the samples were contrasted with heavy metal salts such as uranyl acetate [3,4]. In those days, the standard way of interpreting the structures was to come up with an intuitively acceptable three-dimensional arrangement of subunits that would fit the observed (noisy) molecular images. Electron tomography of single particles had been proposed by Hoppe and his co-workers [8]; however, because of the radiation sensitivity of biological macromolecules, it is not feasible to expose a single molecule to the electron dose required to collect the one hundred or so different projection images needed to reveal its 3D structure.
