Large amounts of data are generated from different sources, and analyzing these data to extract useful information has become a very complex task. The difficulty of dealing with big data arises from many factors, such as the high number of features, the presence of missing data, and the variety of the data. One of the most effective ways to cope with the sheer volume of big data is feature reduction. In this paper, a set of efficient hybrid algorithms is proposed to classify datasets with a large number of features by combining genetic algorithms with artificial neural networks. The genetic algorithms are used as a preprocessing step to significantly reduce the number of features in the analyzed data before that data is handled by machine learning techniques. Reducing the number of features simplifies the classification task and enhances the performance of the machine learning algorithms used to extract valuable information from big data. The proposed algorithms use a new gene-weight mechanism that can significantly enhance performance and decrease the required search time. The proposed algorithms are applied to different datasets to select the most relevant and important features before the artificial neural network is applied, and the results show that the proposed algorithms can effectively enhance classification performance on the tested datasets.
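To make the general idea concrete, the following is a minimal sketch of genetic-algorithm feature selection used as a step before classification. It is not the proposed method: the gene-weight mechanism is not specified in this abstract and is therefore not implemented, and a simple nearest-centroid classifier stands in for the artificial neural network so that the example stays self-contained. The toy data, the function names (make_toy_data, centroid_accuracy, fitness, select_features), and all parameter values are assumptions made for illustration.

```python
import random


def make_toy_data(rng, n_samples=120, n_features=12, n_informative=3):
    """Synthetic two-class data; only the first n_informative features carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Stand-in for the ANN (assumption): nearest-centroid accuracy on selected features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        dists = {label: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
                 for label, cen in centroids.items()}
        correct += min(dists, key=dists.get) == t
    return correct / len(y_test)


def fitness(mask, data, penalty=0.01):
    """Accuracy minus a small cost per selected feature (the penalty value is an assumption)."""
    return centroid_accuracy(*data, mask) - penalty * sum(mask)


def select_features(data, n_features, rng, pop_size=20, generations=30,
                    mutation_rate=0.05):
    """Plain GA over bit masks: tournament selection, one-point crossover, bit-flip mutation."""
    # Seed the population with the full feature set so it serves as a baseline.
    population = [[1] * n_features]
    population += [[rng.randint(0, 1) for _ in range(n_features)]
                   for _ in range(pop_size - 1)]
    best = max(population, key=lambda m: fitness(m, data))

    def tournament():
        a, b = rng.sample(population, 2)
        return a if fitness(a, data) >= fitness(b, data) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=lambda m: fitness(m, data))
    return best


rng = random.Random(0)
X, y = make_toy_data(rng)
data = (X[:80], y[:80], X[80:], y[80:])  # simple train/test split
best_mask = select_features(data, n_features=12, rng=rng)
print("selected feature indices:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness with all features:", fitness([1] * 12, data))
print("fitness with selected features:", fitness(best_mask, data))
```

Each individual is a bit mask over the features, so the search space is the set of feature subsets, and the fitness trades classification accuracy against subset size. In a setting closer to the one described here, the nearest-centroid stand-in would be replaced by training and evaluating a neural network on the selected features, which makes each fitness evaluation far more expensive and is one reason a shorter search time matters.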