Abstract
AbstractThe complex syntactic structure of Turkish text makes sentiment analysis in natural language processing (NLP) a challenging task. Conventional sentiment analysis methods often fail to effectively identify attitudes in Turkish texts, creating an urgent need for more efficient approaches. To fill this need, our study investigates the effectiveness of embedding techniques including pre-trained Turkish models such as Word2Vec, GloVe, and FastText in addition to two character-level embedding methods, namely, character-integer embedding (CIE) and character one-hot encoding embedding (COE), in conjunction with deep learning models specifically long short-term memory (LSTM), convolution neural networks (CNNs), bidirectional LSTM (Bi-LSTM), and hybrid models, for Turkish short-texts sentiment analysis. DL-based models were investigated on two datasets (e.g., an original Twitter (X) dataset and an accessible hotel reviews dataset). In addition to providing an intensive performance analysis of different embedding strategies and assessing their efficacy in dealing with the linguistic intricacies of Turkish, this study proposed a previously unexplored method in Turkish text representation that relies on a character-level one-hot encoding technique. The obtained findings indicate positive progress using a novel approach utilizing a dual-pathway architecture for both character level and word level that constitutes a substantial contribution to the area of natural language processing (NLP), specifically in the context of complex morphological languages. By employing a hybrid strategy that combines character and word levels on Twitter (X) data, the LSTM model obtained anF1 score of$$0.835 \pm 0.005$$0.835±0.005concerning cross-validation while CNN-BiLSTM attained the highestF1 Score (0.8392) using holdout validation. This strategy consistently produced modest improvements across the second public dataset (hotel reviews dataset) by emerging as the runner-up embedding technique in effectiveness, surpassed only by FastText. Findings provide practical recommendations for practitioners on how to effectively use sentiment analysis to make informed decisions by introducing an extensive performance analysis of the use of embedding techniques and deep learning models for sentiment analysis in Turkish texts, which is crucial in the current age of data analysis.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.