The enhanced graphic matrix procedure analyzes nucleic acid and amino acid sequences for features of possible biological interest and reveals the spatial patterns of such features. When a sequence is compared with itself, the technique shows regions of self-complementarity, direct repeats, and palindromic subsequences. Comparison of two different sequences with colored graphic matrices, exemplified by immunoglobulin kappa light chain genes, showed domains of similarity, regions of divergence, and features explainable by transpositions. Analysis of mouse constant domain immunoglobulin sequences revealed self-complementary regions that can be used to fold the molecule into a structure consistent with electron microscopic observations. Computer translation of nucleic acid sequences into all possible amino acid sequences, followed by graphic matrix analysis, provides a way to detect the most likely protein-encoding regions and can predict the correct reading frames in sequences in which splicing patterns are not defined. Application of this technique to regions of simian virus 40 and polyoma virus demonstrates the frames of translation and shows the agreement of sequences determined in separate laboratories with different virus isolates. The graphic matrix technique can also be used to assemble fragmentary sequences during determination, to display local variations in base composition, to detect distant evolutionary relationships, and to display intragenic variation in rates of evolution.
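As a rough illustration of the kind of comparison described above, the following is a minimal sketch of a windowed dot matrix in Python. It is not the authors' procedure: the abstract does not state the scoring or filtering rules, so the window length, the match threshold, and the function names (`dot_matrix`, `reverse_complement`, `render`) are all assumptions made for this example.

```python
def dot_matrix(seq_a, seq_b, window=4, min_matches=4):
    """Return the set of (i, j) cells where the windows starting at
    seq_a[i] and seq_b[j] agree in at least min_matches positions.
    The window/threshold filter is an assumed, commonly used scheme."""
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= min_matches:
                hits.add((i, j))
    return hits


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def render(seq_a, seq_b, hits, window=4):
    """Draw the matrix as text: '*' marks a hit, '.' marks no hit."""
    n_rows = len(seq_a) - window + 1
    n_cols = len(seq_b) - window + 1
    return "\n".join(
        "".join("*" if (i, j) in hits else "." for j in range(n_cols))
        for i in range(n_rows)
    )


if __name__ == "__main__":
    seq = "ACGTACGT"
    # Self-comparison: the main diagonal is trivial; off-diagonal hits
    # mark the direct repeat of ACGT.
    print(render(seq, seq, dot_matrix(seq, seq)))
    # Comparing a sequence with its reverse complement is one way to
    # look for self-complementary stretches.
    rc = reverse_complement(seq)
    print(render(seq, rc, dot_matrix(seq, rc)))
```

In the self-comparison, off-diagonal marks correspond to direct repeats, while comparison against the reverse complement highlights candidate self-complementary regions. For protein-coding analysis, the same function could be applied to translated sequences, though the translation step is not sketched here.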