Abstract

With the arrival of the big data era, traditional entity recognition techniques can no longer preprocess data effectively, because power grid data are large in scale and complex in their volume and type features. Hadoop technologies, which have risen in recent years, are better suited to big data processing. This paper therefore proposes a power big data entity recognition algorithm based on Hadoop. It applies a discretization algorithm to select discrete points with higher information accuracy and puts forward a discretization evaluation indicator. Finally, we perform entity recognition on wind turbine monitoring data on the Hadoop platform. Experimental results show that the proposed algorithm performs well in the correctness and breakpoint number experiments and achieves a good speed-up ratio. The proposed algorithm can be applied to entity recognition for power big data.

Highlights

  • With the advance of information and communication technology, digitization and informatization have penetrated deeply into every aspect of our lives

  • We present a big data entity recognition algorithm based on information accuracy (ERBIA) for electric power big data

  • In view of the attributes of large-scale power data, we propose a big data entity recognition algorithm based on information accuracy (ERBIA), built on information theory


Summary

Introduction

With the advance of information and communication technology, digitization and informatization have penetrated deeply into every aspect of our lives. Power big data entity recognition accurately identifies the different entities in a given data set that share the same entity name or attributes and clusters them. This makes each entity more valuable for power grid decision-making. In [1], the author proposes a big data entity recognition algorithm based on parallel machines. Few studies address the efficiency of entity recognition for big data, and most existing methods target tuples and strings. We present a big data entity recognition algorithm based on information accuracy (ERBIA) for electric power big data. The resulting scheme processes power big data more effectively and efficiently.

Description of the entity recognition discretization scheme for power big data
Definition of information accuracy
Improved discretization of the evaluation index
Experimental analysis
Correctness
Breakpoint number analysis
Speedup ratio
Conclusion
