Abstract

To help Washington State interpret public reports of Vespa mandarinia sightings, and to help government agencies with limited resources prioritize the reports most likely to be correct for further investigation, this article establishes two targeted models. The first is an unsupervised probability prediction model. The text of the reports classified as misjudgments is extracted from the data set and preprocessed. The data set is split into a training set and a test set at a ratio of 8:2, and a Latent Dirichlet Allocation model is trained on the misjudgment text in the training set. Once training is complete, the model makes probability predictions on the test set, and its robustness is evaluated through its accuracy on the test set. The second is a text similarity matching model based on feature dimensionality reduction, in which the extracted feature keywords form a vector. The TF-IDF algorithm is used to calculate the weight of each feature keyword, producing a standard bag-of-words vector for correctly identified Vespa mandarinia reports. A report is then judged by its similarity to this vector under the text similarity matching model.
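The second model can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the smoothed IDF formula, the averaging of confirmed reports into one reference vector, and all function names and sample sentences are assumptions made for illustration. It weights keywords with TF-IDF, builds a reference bag-of-words vector from confirmed reports, and ranks new reports by cosine similarity to that vector.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (assumed preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every token in docs."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(text, idf):
    """Return a sparse TF-IDF vector (dict) for one report."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def reference_vector(confirmed, idf):
    """Average the TF-IDF vectors of confirmed reports (an assumed design choice)."""
    ref = Counter()
    for doc in confirmed:
        for t, w in tfidf_vector(doc, idf).items():
            ref[t] += w / len(confirmed)
    return dict(ref)


def rank_reports(new_reports, confirmed, corpus):
    """Sort new reports by similarity to the reference vector, highest first."""
    idf = fit_idf(corpus)
    ref = reference_vector(confirmed, idf)
    scored = [(cosine(tfidf_vector(r, idf), ref), r) for r in new_reports]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical report texts, invented for illustration only.
confirmed = [
    "large hornet with orange head and striped abdomen near beehive",
    "huge orange headed hornet attacking honey bees",
]
new_reports = [
    "saw a large hornet with an orange head near my beehive",
    "small fuzzy bumblebee on a flower",
]
ranked = rank_reports(new_reports, confirmed, confirmed + new_reports)
for score, report in ranked:
    print(f"{score:.3f}  {report}")
```

Reports at the top of the ranking would be candidates for investigation first. Any cutoff on the similarity score would have to be chosen from labeled data.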

Highlights

  • Washington State has set up a helpline and website to collect sightings of Vespa mandarinia.

  • Vespa mandarinia is native to temperate and tropical East Asia. The species first appeared in Washington in December 2019. The Washington State Department of Agriculture has said that Vespa mandarinia, as an invasive species, will have a negative impact on the environment, economy and public health of Washington State.

  • When the probability of making a mistake does not exceed 0.14, this article takes Topic[1] to be the text topic that best fits reports in which other insects are mistaken for Vespa mandarinia; that is, Topic[1] can represent the Negative ID category (Table 3).


Introduction

Washington State has set up a helpline and website to collect sightings of Vespa mandarinia. Most reported sightings turn out to be other insects, such as the European bumblebee and the cicada killer, which are similar in size, shape and color. This makes the data reported by the public more difficult to interpret. Because government agencies have limited resources, deciding which public reports to prioritize for further investigation is an urgent problem. Considering the background information and the constraints identified, we need to solve the following problems: 1. Create, analyze, and discuss models that predict the likelihood of misclassification, using data set files such as the spreadsheets of sighting reports published by the Washington State Department of Agriculture.
