Abstract

A notion and a measure of linguistic complexity introduced earlier (Trifonov, 1990) were originally applied to the analysis of nucleotide sequences. This measure was shown to reflect the multiplicity of codes (messages) of different natures superimposed in the sequences. Unlike human language texts, genetic texts are ‘read’ by cellular mechanisms in several different ways, each time using a different selection of the characters of the same text while skipping others (Trifonov, 1989). Human texts are read in one way only: sequentially and involving all characters (one code). The conceptual significance and essence of the idea of multiple overlapping codes in genetic sequences, as opposed to human languages, are discussed. The linguistic complexity technique allows the structural complexity of any linear sequence of characters to be calculated, irrespective of whether the text is understood or as yet undeciphered. The texts (sequences) are compared exclusively in terms of their structural complexity, with no reference to their meaning, which is beyond the scope of this article. Results of such a comparison of protein sequences with various texts written in English, Italian and Welsh are presented. The human texts are found to be structurally simpler than the genetic (protein) texts, apparently reflecting a difference in reading modes: a single code versus many codes.
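
The abstract does not spell out how the measure is computed. As a rough illustration only, the sketch below assumes the commonly cited vocabulary-usage formulation: for each word length k, the number of distinct k-letter words found in the text is divided by the maximum number that could occur in a text of that length, and the ratios are multiplied together. The function names, the `max_word_length` cutoff and the `alphabet_size` parameter are hypothetical choices made for this example, not details taken from the article.

```python
def vocabulary_usage(text, k, alphabet_size):
    """Fraction of the possible k-letter words that actually occur in text.

    Assumption: the maximum vocabulary is limited both by the alphabet
    (alphabet_size ** k) and by the number of positions (len(text) - k + 1).
    """
    positions = len(text) - k + 1
    if k < 1 or positions < 1:
        raise ValueError("k must be between 1 and len(text)")
    observed = len({text[i:i + k] for i in range(positions)})
    max_possible = min(alphabet_size ** k, positions)
    return observed / max_possible


def linguistic_complexity(text, max_word_length, alphabet_size=None):
    """Product of vocabulary-usage ratios for word lengths 1..max_word_length.

    Assumption: if alphabet_size is not given, it is taken to be the number
    of distinct characters seen in the text itself.
    """
    if alphabet_size is None:
        alphabet_size = len(set(text))
    complexity = 1.0
    for k in range(1, max_word_length + 1):
        complexity *= vocabulary_usage(text, k, alphabet_size)
    return complexity


if __name__ == "__main__":
    # A varied sequence should score higher than a highly repetitive one.
    print(linguistic_complexity("ACGTTGCAGATC", 3, alphabet_size=4))
    print(linguistic_complexity("AAAAAAAAAAAA", 3, alphabet_size=4))
```

Because the calculation only counts substrings, it applies equally to protein sequences and to English, Italian or Welsh text. Under this formulation, scores are comparable only when the alphabet size and the word-length cutoff are chosen consistently across the texts being compared.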
