The non-linear nature of deep neural networks makes it difficult to interpret the reasons behind their outputs, reducing the verifiability of the systems in which these models are applied. Understanding the patterns that relate activation vectors to predictions can give insight into erroneous classifications and how to identify them. This paper presents a systematic approach to identifying the clusters with the most misclassifications or false label annotations. We extracted the activation vectors from a deep learning model, DNABERT, and visualized them using t-SNE to interpret the model's predictions. We applied K-means clustering hierarchically to the activation vectors of a set of training instances and analyzed the cluster mean activation vectors for patterns in the errors across clusters. The analysis revealed that predictions were uniform, or nearly 100 percent identical, within clusters of similar activation vectors. Two clusters whose members mostly belong to the same true class tend to lie closer together than clusters of opposite classes. Furthermore, the means of objects with the same true label are closer when the two clusters share the same predicted label than when their predicted labels differ, showing that the activation vectors reflect both the predicted and the true classes. We performed a similar analysis for all 26 organisms in the dataset, showing that Euclidean distance between cluster means can be used to identify clusters containing many errors. Based on this inter-cluster vector analysis, we propose a heuristic for finding clusters with a high number of misclassifications or incorrect label annotations, which can aid in identifying misclassified DNA sequences or problems with sequence tagging.
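A minimal sketch of the clustering and distance analysis outlined above, under the assumption that the activation vectors have already been extracted from the model; synthetic arrays stand in for the real DNABERT activations and labels, and the cluster count is illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)

# Stand-ins for per-sequence activation vectors (n_samples x hidden_dim)
# and their predicted / true binary labels (assumed already extracted).
activations = rng.normal(size=(1000, 768))
predicted = rng.integers(0, 2, size=1000)
true_labels = rng.integers(0, 2, size=1000)

# 2-D t-SNE embedding for visual inspection of the cluster structure.
embedding = TSNE(n_components=2, random_state=0).fit_transform(activations)

# K-means on the activation vectors (one level of a hierarchical scheme).
k = 10
clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(activations)

# Per-cluster mean activation vectors and error rates
# (disagreement between predicted and true labels).
means = np.vstack([activations[clusters == c].mean(axis=0) for c in range(k)])
error_rate = np.array([
    (predicted[clusters == c] != true_labels[clusters == c]).mean()
    for c in range(k)
])

# Pairwise Euclidean distances between cluster means; clusters whose means
# lie unusually far from clusters sharing their predicted label are flagged
# as candidates for misclassification or label-noise review (heuristic).
dist = cdist(means, means)
print("cluster error rates:", np.round(error_rate, 3))
print("inter-cluster mean distances:\n", np.round(dist, 2))
```

With real data, the activation matrix would come from the model's hidden states for each input sequence, and the flagged clusters would be reviewed against the true annotations.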