Abstract

Terahertz (THz) waves, characterized by low energy, a transient nature, and suitability for spectral analysis, are promising for material identification. The dimensionality reduction methods mainly used in THz-based material identification, including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), and Locally Linear Embedding (LLE), are sensitive to the number of nearest-neighbor samples and neglect the differences among classes, which makes the subsequent clustering model difficult to design or leads to incorrect clustering. The t-distributed Stochastic Neighbor Embedding (t-SNE) models pairwise sample similarities in the high-dimensional space with a Gaussian distribution and those in the low-dimensional embedding with a Student's t-distribution; the heavy tails of the latter push well-separated clusters farther apart and thus relieve crowding. In addition, in traditional Fuzzy C-means (FCM) clustering the initial cluster centers are chosen at random, so the algorithm easily falls into a local optimum, which can cause misrecognition. To address these problems, this paper proposes an improved FCM algorithm for THz spectral recognition. First, t-SNE is used both for dimensionality reduction and for selecting the initial cluster centers, with the aim of more accurate clustering. On this basis, classical FCM clustering is then used to recognize different substances from their THz spectra. The algorithm not only relieves the crowding among classes during clustering but also reflects the distances between them, so that appropriate cluster centers can be found among the samples. To verify the reliability of the proposed method, THz time-domain spectroscopy was used to measure three genetically modified cotton seed varieties, lumianyan28, lumianyan29, and lumianyan36, and to obtain their time-domain spectral data. Applied to these spectral data, the proposed method successfully distinguished the three types of transgenic cotton seeds with an overall accuracy of 0.9668. The result indicates that the proposed clustering method is promising for identifying the THz spectra of materials.
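
As a rough illustration of the clustering stage summarized above, the following NumPy sketch runs the standard FCM updates from caller-supplied initial centers rather than random ones. It is not the authors' implementation. The abstract does not state how the initial centers are derived from the t-SNE embedding, so the farthest-point rule in `pick_spread_centers` is an assumed stand-in, and both function names are hypothetical. Synthetic 2-D blobs take the place of a real embedding, which could be produced, for example, with scikit-learn's `TSNE`.

```python
import numpy as np


def _memberships(points, centers, m):
    # Distances from every sample to every center; the floor avoids division by zero.
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    dist = np.fmax(dist, 1e-12)
    # u_ij is proportional to d_ij^(-2/(m-1)); each row is normalized to sum to 1.
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_c_means(points, init_centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard FCM iterations started from the given centers.

    points: (n_samples, n_dims) array, e.g. a low-dimensional embedding.
    init_centers: (n_clusters, n_dims) array of starting centers.
    Returns (centers, memberships); memberships is (n_samples, n_clusters).
    """
    points = np.asarray(points, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        weights = _memberships(points, centers, m) ** m
        # Each center becomes the mean of the samples weighted by u_ij^m.
        new_centers = (weights.T @ points) / weights.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, _memberships(points, centers, m)


def pick_spread_centers(points, n_clusters):
    """Hypothetical initializer (an assumption, not taken from the paper):
    greedy farthest-point selection of mutually distant samples."""
    points = np.asarray(points, dtype=float)
    # Start from the sample farthest from the overall mean.
    chosen = [int(np.argmax(np.linalg.norm(points - points.mean(axis=0), axis=1)))]
    while len(chosen) < n_clusters:
        # Distance of each sample to its nearest already-chosen sample.
        gaps = np.linalg.norm(
            points[:, None, :] - points[chosen][None, :, :], axis=2
        ).min(axis=1)
        chosen.append(int(np.argmax(gaps)))
    return points[chosen]


# Synthetic stand-in for a 2-D embedding of three classes (not real THz data).
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
embedding = np.vstack(
    [c + rng.normal(scale=0.5, size=(50, 2)) for c in true_centers]
)

init = pick_spread_centers(embedding, 3)
centers, memberships = fuzzy_c_means(embedding, init)
labels = memberships.argmax(axis=1)  # hard assignment from the fuzzy memberships
```

Passing the starting centers in explicitly keeps the FCM iterations themselves unchanged, so any other initialization rule can be swapped in without touching `fuzzy_c_means`.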

