Abstract

Taking advantage of web crawlers for data collection and of distributed storage technologies, we gained access to a wealth of forestry-related data. Building on mature big data technology, we selected Hadoop’s distributed system to solve the storage problem of massive forestry data and the memory-based Spark computing framework to achieve fast, real-time data processing. Forestry data contain a wealth of information, and mining this information is of great significance for guiding the development of forestry. We conduct co-word and cluster analyses on the keywords of forestry data to extract the rules hidden in the data, analyze research hotspots more accurately, and grasp the evolution of subject topics, which plays an important role in promoting research and development in the subject areas. Co-word analysis and clustering algorithms are of practical significance for revealing the topic structure, research hotspots, and development trends in forestry research. The distributed storage framework and parallel computing greatly improve the performance of the data mining algorithms. Therefore, a forestry big data mining system built on big data technology has important practical significance for promoting the development of intelligent forestry.
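The co-word step mentioned above can be illustrated with a minimal sketch. It assumes each record carries a list of keywords and counts how often two keywords appear in the same record; the function name `coword_counts` and the sample records are hypothetical and are not taken from the paper, which runs its analysis on Hadoop and Spark rather than in plain Python.

```python
from collections import Counter
from itertools import combinations


def coword_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same record."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize, deduplicate, and sort so each pair has one canonical order.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical keyword lists, invented for illustration only.
records = [
    ["forest fire", "remote sensing", "big data"],
    ["remote sensing", "big data"],
    ["forest fire", "remote sensing"],
]

pairs = coword_counts(records)
for pair, n in pairs.most_common():
    print(pair, n)
```

The resulting pair counts form a co-occurrence matrix in sparse form; a clustering algorithm can then group keywords that frequently appear together, which is one common way to surface research hotspots.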
