Abstract
The rapid development of high-throughput sequencing technology has advanced research on metagenomic sequences. Although many existing sequence classification tools perform well at the genus level and above, there is still room for improvement at the species level. To address this problem, this paper proposes a metagenomic sequence classification method based on a one-dimensional convolutional neural network. First, a metagenomic sequence corpus is constructed and used to train word2vec for k-mer embedding. Then, the optimal k value is selected, each gene sequence is vectorized with the resulting k-mer embeddings, and the vectorized sequence serves as the input to a one-dimensional convolutional neural network classification model that performs species- or genus-level recognition. Finally, two datasets are used to optimize the model and improve its generalization ability. Experimental results show that the model's classification performance is almost the same as that of existing tools at the genus level, while it improves at the species level and achieves better classification efficiency.
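The pipeline described above (k-mer tokenization, embedding lookup, one-dimensional convolution) can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `kmer_tokens`, `build_embedding`, and `conv1d_maxpool` are hypothetical, the random embedding table is only a stand-in for trained word2vec vectors, and the values k = 3, embedding dimension 4, and kernel width 2 are arbitrary choices made for illustration.

```python
import random

def kmer_tokens(seq, k):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_embedding(vocab, dim, seed=0):
    """Assumption: random vectors stand in for trained word2vec embeddings."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in sorted(vocab)}

def conv1d_maxpool(matrix, kernel):
    """One filter: valid 1D convolution along the sequence axis,
    ReLU activation, then global max pooling.
    matrix: list of L embedding vectors; kernel: list of w weight vectors."""
    w = len(kernel)
    outputs = []
    for start in range(len(matrix) - w + 1):
        s = 0.0
        for j in range(w):
            s += sum(a * b for a, b in zip(matrix[start + j], kernel[j]))
        outputs.append(max(s, 0.0))  # ReLU
    return max(outputs)

# Toy reads (made up for illustration, not real metagenomic data).
reads = ["ACGTACGTGA", "TTGACGTACC"]
k = 3  # assumed value; the paper selects k experimentally

# Each read becomes a "sentence" of k-mer "words", as in a word2vec corpus.
corpus = [kmer_tokens(r, k) for r in reads]
vocab = {t for sentence in corpus for t in sentence}
emb = build_embedding(vocab, dim=4)

# Vectorize the first read: one embedding vector per k-mer.
matrix = [emb[t] for t in corpus[0]]

# A single randomly initialized filter spanning 2 consecutive k-mers.
rng = random.Random(1)
kernel = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
feature = conv1d_maxpool(matrix, kernel)
```

In a full model, many such filters would be learned during training and their pooled outputs passed to a classification layer that predicts the species or genus label.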