Building on the atomic graph-theoretical index aEAID (atomic Extended Adjacency matrix IDentification) and the molecular topological index ATID (Adjacent Topological IDentification), both suggested by one of the authors (Zhang Q), this study proposes a highly selective atomic topological index, aATID (atomic Adjacent Topological IDentification), for identifying equivalent atoms. The aATID index of an atom takes into account the number of hydrogen atoms attached to the atom but omits bond types. As a result, the index can identify atoms that are chemically equivalent even though they may not be equivalent in the molecular graph. To test the uniqueness of the aATID index, virtual atomic data sets were derived from alkanes containing 15–20 carbon atoms and from the isomers of Octogen, and a real data set was derived from the NCI database. Only four pairs of atoms from the alkanes containing 20 carbons could not be discriminated by aATID; that is, four pairs of degenerates were found in this data set. To solve this problem, the aATID index was modified by introducing distance factors between atoms, giving the 2-aATID index. Its uniqueness was tested on 5,939,902 atoms derived from alkanes containing 20 carbons and a further 16,166,984 atoms from alkanes containing 21 carbons, and no degenerates were found. In addition, another large real data set of 16,650,688 atoms derived from the PubChem database was used to test the uniqueness of both aATID and 2-aATID; every atom was successfully discriminated by either of the two indices. Finally, the aATID index was applied to the identification of duplicate atoms as a data pretreatment step for QSPR (Quantitative Structure–Property Relationships) studies.
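The duplicate-atom pretreatment mentioned at the end of the abstract can be illustrated with a minimal sketch. It assumes that an index value has already been computed for each atom; the abstract does not give the aATID formula, so the function name `group_by_index`, the atom labels, the rounding precision, and the numeric values below are all hypothetical and serve only to show the grouping step.

```python
from collections import defaultdict


def group_by_index(index_values, decimals=8):
    """Group atom labels whose index values agree after rounding.

    `index_values` maps an atom label to a precomputed index value.
    The values are assumed inputs; this sketch does not compute aATID.
    """
    groups = defaultdict(list)
    for atom, value in index_values.items():
        # Rounding is a simple stand-in for a tolerance comparison;
        # values that straddle a rounding boundary may be split.
        groups[round(value, decimals)].append(atom)
    # Keep only groups with more than one atom: candidate duplicates.
    return [sorted(atoms) for atoms in groups.values() if len(atoms) > 1]


# Made-up values for illustration only; not computed with aATID.
example = {
    "mol1:C1": 12.3456789,
    "mol1:C2": 15.0000001,
    "mol2:C1": 12.3456789,
    "mol2:C3": 9.87654321,
}
duplicates = group_by_index(example)
print(duplicates)
```

Atoms that land in the same group are candidates for being treated as duplicates before model fitting. The same grouping, applied to atoms known to be non-equivalent, would flag degenerate pairs of the kind reported for the 20-carbon alkanes.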