This issue features papers about mathematical genetics, data compression, and data assimilation in oceanography.

1. Computer algebra can help to predict the genetic makeup of a population. This is what we can conclude from the first paper, “Buchberger's Algorithm and the Two-Locus, Two-Allele Model.” With reproduction rates and the probability of chromosome recombination as parameters, and one variable for each chromosome type, one can set up a system of polynomial equations. The solution value of a variable indicates how often that chromosome type occurs in the population. The question is now, over all parameter choices, how many different equilibrium solutions can there be? In other words, in the long term, can a chromosome type occur with any possible frequency? Arthur Copeland's answer is no: in the special case where a population contains at most four different chromosome types, the number of solutions is almost always finite (and conjectured to be 15). He arrives at his answer by means of affine varieties, ideals, and Gröbner bases, and he also shows that the equilibrium solutions vary smoothly with the parameters. (A toy Gröbner-basis computation in this spirit is sketched after this overview.)

2. The second paper, “A Direct Formulation for Sparse PCA Using Semidefinite Programming,” by Alexandre d'Aspremont, Laurent El Ghaoui, Michael I. Jordan, and Gert R. G. Lanckriet, is concerned with principal component analysis (PCA), a technique for compressing data so that as little information as possible is lost. PCA accomplishes this by finding directions of maximum variance in the data. In the analysis of gene expression data, for instance, different elements of a direction vector correspond to different genes. However, interpreting the meaning of a direction is much easier if only a few genes contribute to it. Such a direction vector has many zero elements and is called sparse. The authors propose a semidefinite programming method that trades off the variance captured by a direction against its sparsity; the method is computationally efficient and robust. (A sketch of such a relaxation appears below.)

3. The topic of the third paper, “A Reduced-Order Kalman Filter for Data Assimilation in Physical Oceanography,” by D. Rozier, F. Birol, E. Cosme, P. Brasseur, J. M. Brankart, and J. Verron, is a technique used by oceanographers to increase the accuracy of numerical models for predicting ocean circulation. Data assimilation combines the output of a simulation step with observed data and error statistics, and then returns this “corrected” output to the simulation as the basis for the next step. Data assimilation based on Kalman filters assumes that measurement and model errors are unbiased and Gaussian. It is expensive, however, because an error covariance matrix must be constructed at every step. A reduced-order Kalman filter (called SEEK) is a cheaper version that reduces (or compresses) the size of the covariance matrix, the same key idea as in the paper described above. The authors give a very readable account of the issues associated with so-called ocean global circulation models: the choice of vertical coordinate system (depending on the ocean region: shallow coastal shelf, steeply sloping ocean floors, stratified regions); limitations of models for ocean circulation simulations; sequential data assimilation techniques; the adaptation of Kalman filters to geophysical applications; and modes of data acquisition (via satellites, ships, floats, and mooring networks). The effectiveness of the SEEK filter is illustrated with numerical simulations of the Gulf Stream and the Bay of Biscay. (A generic reduced-rank Kalman analysis step is sketched at the end.)
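To make the Gröbner-basis idea in the first paper concrete, here is a minimal sketch in Python using SymPy. The polynomial system below is a made-up toy example, not Copeland's actual two-locus, two-allele equations; it only illustrates how a lexicographic Gröbner basis (computed via Buchberger-style algorithms) triangularizes a system so that its finitely many equilibrium solutions can be counted.

```python
# A minimal sketch: counting solutions of a toy polynomial system
# with a Groebner basis. The equations are illustrative placeholders,
# not the actual two-locus, two-allele equilibrium conditions.
from sympy import symbols, groebner, solve

x, y = symbols('x y')

f1 = x**2 + y**2 - 1   # hypothetical constraint on frequencies
f2 = x - y**2          # hypothetical balance condition

# A lexicographic Groebner basis eliminates variables step by step,
# putting the system into triangular form.
G = groebner([f1, f2], x, y, order='lex')
print(G)   # here: {x - y**2, y**4 + y**2 - 1}

# The triangular form exposes the finitely many solutions.
solutions = solve(list(G), [x, y], dict=True)
print(len(solutions), "solutions")
```

The second polynomial of the basis involves only y, so the solution count (here four, over the complex numbers) can be read off one variable at a time, which is exactly the kind of finiteness argument Gröbner bases enable.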
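The semidefinite relaxation in the second paper can be prototyped in a few lines with CVXPY. The sketch below follows the general shape of the relaxation described by the authors, maximizing the explained variance Tr(ΣX) minus an elementwise l1 penalty on X, subject to Tr(X) = 1 and X positive semidefinite; the random data and the penalty weight rho are invented for illustration, and this is a sketch of the idea rather than the paper's exact implementation.

```python
# A minimal sketch of a sparse-PCA semidefinite relaxation
# (in the spirit of d'Aspremont et al.); data and rho are illustrative.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8))
Sigma = np.cov(A, rowvar=False)   # sample covariance of the data

n = Sigma.shape[0]
rho = 0.5                         # sparsity penalty weight (hypothetical)

# Relax the rank-one constraint X = v v^T to: X PSD, Tr(X) = 1, and
# trade off explained variance Tr(Sigma X) against the l1 norm of X.
X = cp.Variable((n, n), PSD=True)
objective = cp.Maximize(cp.trace(Sigma @ X) - rho * cp.sum(cp.abs(X)))
problem = cp.Problem(objective, [cp.trace(X) == 1])
problem.solve()

# Recover an approximate sparse direction from the leading eigenvector.
eigvals, eigvecs = np.linalg.eigh(X.value)
v = eigvecs[:, -1]
print(np.round(v, 3))
```

Increasing rho drives more entries of the recovered direction toward zero, which is the variance-versus-sparsity trade-off described above.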
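Finally, the computational savings behind reduced-order filters can be seen in a generic reduced-rank Kalman analysis step. The sketch below is a simplified stand-in, not the authors' SEEK algorithm: it assumes the forecast error covariance is stored as a low-rank factor P = L Lᵀ with L of size n x r (r much smaller than n), so the update never forms the full n x n covariance matrix; the function name and all sizes are hypothetical.

```python
# A minimal sketch of a reduced-rank Kalman analysis (update) step.
# A generic low-rank stand-in for the SEEK idea, not the authors' algorithm:
# the forecast covariance is P = L @ L.T with L of shape (n, r), r << n.
import numpy as np

def reduced_rank_analysis(x_f, L, H, R, y):
    """Update forecast x_f with observation y.

    x_f : (n,)   forecast state
    L   : (n, r) low-rank factor of forecast error covariance P = L L^T
    H   : (p, n) observation operator
    R   : (p, p) observation error covariance
    y   : (p,)   observations
    """
    HL = H @ L                              # (p, r); full P is never formed
    S = HL @ HL.T + R                       # innovation covariance, (p, p)
    K = L @ np.linalg.solve(S, HL).T        # gain K = L (HL)^T S^{-1}
    x_a = x_f + K @ (y - H @ x_f)           # analysis (corrected) state
    # Update the factor so that L_a L_a^T = (I - K H) P, using only an
    # r x r computation: (I - K H) P = L (I - (HL)^T S^{-1} HL) L^T.
    M = np.eye(L.shape[1]) - HL.T @ np.linalg.solve(S, HL)
    L_a = L @ np.linalg.cholesky(M)
    return x_a, L_a

# Tiny demo with made-up sizes: n = 100 state variables, r = 5 error
# modes, p = 10 observations.
rng = np.random.default_rng(1)
n, r, p = 100, 5, 10
x_f = rng.standard_normal(n)
L = 0.1 * rng.standard_normal((n, r))
H = np.zeros((p, n))
H[np.arange(p), np.arange(0, n, n // p)] = 1.0   # observe every 10th variable
R = 0.05 * np.eye(p)
y = H @ x_f + 0.1 * rng.standard_normal(p)
x_a, L_a = reduced_rank_analysis(x_f, L, H, R, y)
```

All linear algebra here is in the r- and p-dimensional spaces, which is the compression of the covariance matrix that makes reduced-order filters affordable for large ocean models.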