Abstract

Processing and analyzing high-dimensional data has become an increasingly prominent problem in recent years, particularly for sentiment analysis. The IEL-HDDSA model addresses this problem: it aims to increase accuracy and performance in sentiment analysis of complex, high-dimensional data streams. Its contribution to the field is an iterative approach to ensemble learning. The model integrates preprocessing techniques such as tokenization, stop word removal, and lemmatization with the extraction of sentiment-related features. The training corpus is then divided by label, and features with high mutual information are selected. Highly replicated data points for model training can also be identified at this stage. A Naive Bayes model is trained first and then placed in an ensemble through bagging. The major advantage of IEL-HDDSA over earlier methods is that it can iteratively train on selected subsets of data until sentiment analysis performance on high-dimensional data reaches an optimum level. The model was evaluated with 10-fold cross-validation, which showed consistently high performance with very little variation across the different measures. IEL-HDDSA's precision ranged from 0.9359 to 0.9492, and its specificity was between 0. Its accuracy ranged from 0.93 to around 0.95, and its F1-measure stayed at about 0.94 and above, indicating a good balance between precision and recall. The false alarm rate ranged from 0.056 to 0.1, a fairly low ratio of incorrect positive classifications. Moreover, MCC values ranged from 0.8668 to 0. These results testify to the IEL-HDDSA model's stable effectiveness and high reproducibility in sentiment analysis applications, especially for massive data flows.
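
The evaluation measures named above can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, which are assumed (not confirmed by the abstract) to match those used for IEL-HDDSA. The function name `binary_metrics` and the example counts are hypothetical and serve only to illustrate the arithmetic; they are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard confusion-matrix metrics for a binary classifier.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives (for example, from one CV fold).
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    # False alarm rate (false positive rate) is the complement of specificity.
    false_alarm_rate = ratio(fp, fp + tn)
    # Matthews correlation coefficient (MCC).
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)

    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "false_alarm_rate": false_alarm_rate,
        "mcc": mcc,
    }


# Hypothetical counts for a single fold (illustrative only).
metrics = binary_metrics(tp=470, fp=30, tn=470, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions the false alarm rate and specificity always sum to 1, so a low false alarm rate and a high specificity describe the same property of the classifier.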
