Exploring the Cross-Lingual Similarity of Valmiki Ramayana Using Semantic and Sentiment Analysis
The Sanskrit language holds significant importance in Indian culture because it has been used extensively in religious literature, primarily in Hinduism. Numerous ancient Hindu texts originally composed in Sanskrit have since been translated into various Indian and non-Indian languages by Indian and foreign authors. These translations offer a renewed cultural perspective and broaden the reach of Indian literature to a global audience. However, manual translations of these religious texts often lack thorough validation. Recent advances in semantic and sentiment analysis, powered by deep learning, have provided better tools for understanding language and text. In this paper, we present a framework that uses semantic and sentiment analysis to validate the English translation of the Ramayana against its original Sanskrit version. The Ramayana, which narrates the journey of Rama, the king of Ayodhya, is an ancient Hindu epic written by the sage Valmiki. It has been known for centuries for its contribution to human values and has universal relevance. Given the importance of Sanskrit in Indian culture and its influence on literature, understanding the translations of key texts like the Ramayana is essential. The Multilingual Bidirectional Encoder Representations from Transformers (mBERT) model is used to analyze selected chapters of the English and Sanskrit versions of the Ramayana. Our analysis reveals that sentiment and semantic alignment between the original Sanskrit and the English translation remains consistent despite stylistic and vocabulary differences. The study also compares the results of Bidirectional Encoder Representations from Transformers (BERT) with those of its variants to examine which BERT variant is more suitable for validating Sanskrit text. The paper demonstrates the potential of deep learning techniques for cross-lingual validation of ancient texts.
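The abstract does not say which measure is used to quantify semantic alignment. One common choice for comparing sentence embeddings is cosine similarity, sketched below under that assumption. The four-dimensional vectors are hypothetical placeholders; in practice they would come from a multilingual encoder such as mBERT.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical embeddings standing in for real encoder output.
sanskrit_verse = [0.2, 0.7, 0.1, 0.4]
english_verse = [0.25, 0.65, 0.05, 0.5]
unrelated_text = [0.9, -0.3, 0.6, -0.2]

aligned = cosine_similarity(sanskrit_verse, english_verse)
unaligned = cosine_similarity(sanskrit_verse, unrelated_text)
print(f"aligned pair: {aligned:.3f}, unrelated pair: {unaligned:.3f}")
```

A translation pair that preserves meaning would be expected to score closer to 1 than an unrelated pair, which is the kind of signal a validation framework could threshold on.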
- Research Article
- 10.70102/afts.2025.1833.176
- Oct 30, 2025
- Archives for Technical Sciences
Sentiment analysis has emerged as an important task in natural language processing (NLP), where demand for data analysis is currently high. The BERT (Bidirectional Encoder Representations from Transformers) model has proved highly effective for sentiment analysis, far exceeding conventional algorithms; unlocking that potential, however, requires fine-tuning its hyperparameters. Optimising BERT's hyperparameters is difficult because of the complicated interactions between them (e.g. learning rate, batch size, dropout rate, attention heads). In this paper, the Salp Swarm Algorithm (SSA), a bio-inspired metaheuristic optimization technique, is used to optimize the fine-tuning process. Using SSA's efficient search of multidimensional spaces, BERT hyperparameters are optimized systematically for sentiment classification. A benchmark dataset for sentiment analysis (Sentiment140) is used to evaluate the proposed model. The novelty of the presented model is that it dynamically adjusts its search behaviour in response to performance signals and thus identifies better-performing parameter sets than conventional methods. Extensive evaluations against three state-of-the-art search approaches, namely manual tuning, grid search, and random search, are conducted on the Sentiment140 benchmark dataset, demonstrating the superiority of the proposed SSA-BERT optimization technique over these methods. The SSA-BERT model achieved a maximum accuracy of 96.4 percent, far better than manual tuning, grid search, and random search (65.0 percent, 69.5 percent and 72.0 percent respectively). It also performed better than other existing BERT models in the related literature, which showed accuracy levels between 46.4 and 75.7 percent on different benchmarks.
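To make the search procedure concrete, here is a minimal sketch of a basic Salp Swarm Algorithm with a single leader and midpoint-following salps. It is not the authors' implementation: the objective `toy_validation_error` is a synthetic stand-in for "validation error after fine-tuning BERT", and the two parameters (log10 learning rate and dropout), their bounds, and the swarm settings are all assumptions chosen for illustration.

```python
import math
import random

def salp_swarm_minimize(objective, lower, upper, n_salps=20, n_iters=100, seed=0):
    """Minimize `objective` over a box with a basic Salp Swarm Algorithm."""
    rng = random.Random(seed)
    dim = len(lower)

    def clip(x, j):
        return min(max(x, lower[j]), upper[j])

    swarm = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
             for _ in range(n_salps)]
    best = list(min(swarm, key=objective))
    best_score = objective(best)
    history = [best_score]  # best score after initialisation and each iteration

    for it in range(1, n_iters + 1):
        # c1 decays over time, shifting from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iters) ** 2))
        for i in range(n_salps):
            for j in range(dim):
                if i == 0:
                    # The leader moves around the best position found so far.
                    step = c1 * ((upper[j] - lower[j]) * rng.random() + lower[j])
                    x = best[j] + step if rng.random() >= 0.5 else best[j] - step
                else:
                    # Followers move to the midpoint with the salp ahead of them.
                    x = 0.5 * (swarm[i][j] + swarm[i - 1][j])
                swarm[i][j] = clip(x, j)
        for salp in swarm:
            score = objective(salp)
            if score < best_score:
                best, best_score = list(salp), score
        history.append(best_score)
    return best, best_score, history

def toy_validation_error(params):
    """Synthetic objective (assumption): lowest at log10(lr) = -4.5, dropout = 0.1."""
    log_lr, dropout = params
    return (log_lr + 4.5) ** 2 + (dropout - 0.1) ** 2

best, best_score, history = salp_swarm_minimize(
    toy_validation_error, lower=[-6.0, 0.0], upper=[-3.0, 0.5]
)
print(best, best_score)
```

With a real model, each call to the objective would be a full fine-tuning run, so the swarm size and iteration count would have to be far smaller than in this toy setting.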
- Research Article
16
- 10.47813/2782-5280-2024-3-1-0311-0320
- Mar 2, 2024
- Информатика. Экономика. Управление - Informatics. Economics. Management
First developed in 2018 by Google researchers, Bidirectional Encoder Representations from Transformers (BERT) represents a breakthrough in natural language processing (NLP). BERT achieved state-of-the-art results across a range of NLP tasks while using a single transformer-based neural network architecture. This work reviews BERT's technical approach, performance when published, and significant research impact since release. We provide background on BERT's foundations like transformer encoders and transfer learning from universal language models. Core technical innovations include deeply bidirectional conditioning and a masked language modeling objective during BERT's unsupervised pretraining phase. For evaluation, BERT was fine-tuned and tested on eleven NLP tasks ranging from question answering to sentiment analysis via the GLUE benchmark, achieving new state-of-the-art results. Additionally, this work analyzes BERT's immense research influence as an accessible technique surpassing specialized models. BERT catalyzed adoption of pretraining and transfer learning for NLP. Quantitatively, over 10,000 papers have extended BERT and it is integrated widely across industry applications. Future directions based on BERT scale towards billions of parameters and multilingual representations. In summary, this work reviews the method, performance, impact and future outlook for BERT as a foundational NLP technique.
- Research Article
- 10.15294/7h63ma50
- Sep 30, 2024
- Recursive Journal of Informatics
Abstract. The Covid-19 vaccine is an important tool to stop the Covid-19 pandemic, however, there are pros and cons from the public regarding this Covid-19 vaccine. Purpose: These responses were conveyed by the public in many ways, one of which is through social media such as Twitter. Responses given by the public regarding the Covid-19 vaccination can be analyzed and categorized into responses with positive, neutral or negative sentiments. Methods: In this study, sentiment analysis was carried out regarding Covid-19 vaccination originating from Twitter using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. The data used in this study is public tweet data regarding the Covid-19 vaccination with a total of 29,447 tweet data in English. Result: Sentiment analysis begins with data preprocessing on the dataset used for data normalization and data cleaning before classification. Then word vectorization was performed with TF-IDF and data classification was performed using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. From the classification results, an accuracy value of 73% was obtained for the Naïve Bayes Classifier (NBC) algorithm and 83% for the Bidirectional Encoder Representations from Transformers (BERT) algorithm. Novelty: A direct comparison between classical models such as NBC and modern deep learning models such as BERT offers new insights into the advantages and disadvantages of both approaches in processing Twitter data. Additionally, this study proposes temporal sentiment analysis, which allows evaluating changes in public sentiment regarding vaccination over time. Another innovation is the implementation of a hybrid approach to data cleansing that combines traditional methods with the natural language processing capabilities of BERT, which more effectively addresses typical Twitter data issues such as slang and spelling errors. 
Finally, this research also expands sentiment classification to be multi-label, identifying more specific sentiment categories such as trust, fear, or doubt, which provides a deeper understanding of public opinion.
- Research Article
1
- 10.15294/rji.v2i2.67502
- Sep 30, 2024
- Recursive Journal of Informatics
Abstract. The Covid-19 vaccine is an important tool to stop the Covid-19 pandemic, however, there are pros and cons from the public regarding this Covid-19 vaccine. Purpose: These responses were conveyed by the public in many ways, one of which is through social media such as Twitter. Responses given by the public regarding the Covid-19 vaccination can be analyzed and categorized into responses with positive, neutral or negative sentiments. Methods: In this study, sentiment analysis was carried out regarding Covid-19 vaccination originating from Twitter using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. The data used in this study is public tweet data regarding the Covid-19 vaccination with a total of 29,447 tweet data in English. Result: Sentiment analysis begins with data preprocessing on the dataset used for data normalization and data cleaning before classification. Then word vectorization was performed with TF-IDF and data classification was performed using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. From the classification results, an accuracy value of 73% was obtained for the Naïve Bayes Classifier (NBC) algorithm and 83% for the Bidirectional Encoder Representations from Transformers (BERT) algorithm. Novelty: A direct comparison between classical models such as NBC and modern deep learning models such as BERT offers new insights into the advantages and disadvantages of both approaches in processing Twitter data. Additionally, this study proposes temporal sentiment analysis, which allows evaluating changes in public sentiment regarding vaccination over time. Another innovation is the implementation of a hybrid approach to data cleansing that combines traditional methods with the natural language processing capabilities of BERT, which more effectively addresses typical Twitter data issues such as slang and spelling errors. 
Finally, this research also expands sentiment classification to be multi-label, identifying more specific sentiment categories such as trust, fear, or doubt, which provides a deeper understanding of public opinion.
- Research Article
- 10.37394/232032.2024.2.15
- Jun 24, 2024
- Financial Engineering
In the ever-changing world of financial markets, understanding investor behavior and making informed decisions relies heavily on sentiment analysis. This study delves into the integration of traditional techniques, such as the Loughran-McDonald dictionary, with advanced natural language processing (NLP) methods utilizing BERT (Bidirectional Encoder Representations from Transformers). The goal is to enhance the accuracy and depth of sentiment analysis in financial reports. To begin, we employ the specialized Loughran-McDonald dictionary designed for financial sentiment analysis. This lexicon includes domain-specific word lists for positive and negative sentiments, forming a solid foundation for sentiment scoring. Expanding on this foundation, we incorporate BERT, an advanced transformer-based NLP model. BERT’s contextual understanding of language and ability to capture intricate semantic relationships within financial texts aim to overcome the limitations of rule-based sentiment analysis. The methodology involves preprocessing financial reports, integrating Loughran-McDonald sentiment scores, and fine-tuning BERT for financial sentiment classification. This hybrid approach leverages both the domain expertise encoded in the dictionary and BERT’s contextual comprehension of financial jargon and nuances. We validate and evaluate our implementation using a diverse dataset comprising quarterly earnings releases, annual reports, and other relevant disclosures. Performance metrics such as precision, recall, and F1 score are analyzed to assess the effectiveness of our hybrid approach compared to individual methods. The findings have significant implications for financial analysts, investors, and policymakers by providing a more nuanced understanding of sentiment in financial reports. Our hybrid approach aims to offer improved accuracy in capturing sentiment polarity while facilitating more informed decision-making in today’s complex and dynamic realm of financial markets.
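The dictionary-based half of such a hybrid can be sketched as a simple word-count score. The word lists below are hypothetical placeholders, not the actual Loughran-McDonald lists, and the scoring formula (positive count minus negative count, divided by the number of tokens) is one common convention rather than the formula used in the study.

```python
import re

# Hypothetical mini word lists for illustration only; the real
# Loughran-McDonald dictionary is far larger and is not reproduced here.
POSITIVE = {"gain", "improve", "improved", "strong", "profit"}
NEGATIVE = {"loss", "decline", "impairment", "weak", "litigation"}

def lexicon_polarity(text):
    """Return (positive - negative) / total tokens, a value in [-1, 1]."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

report = "Revenue improved on strong demand despite a small impairment."
score = lexicon_polarity(report)
print(round(score, 3))
```

A score like this could be supplied to a classifier as an extra feature alongside the text, which is one plausible reading of "integrating Loughran-McDonald sentiment scores"; the abstract does not specify the mechanism.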
- Research Article
- 10.54254/2755-2721/2025.kl22289
- Apr 24, 2025
- Applied and Computational Engineering
With the steady growth of social media and online platforms, sentiment analysis has become a critical task for understanding public opinion, customer feedback, and social trends. This study investigates multi-modal sentiment analysis by exploring state-of-the-art models such as Bidirectional Encoder Representations from Transformers (BERT), Bootstrapping Language-Image Pretraining (BLIP), Generative Image-to-text Transformer (GIT), and Contrastive Language-Image Pretraining (CLIP) for sentiment classification tasks using both image and text data. The effectiveness of these models on sentiment analysis tasks is evaluated under different configurations such as text-only BERT, image-to-text augmented BERT, and CLIP-based classification. The results show that while BERT achieves 76% accuracy on text-only sentiment analysis, combining text with image-generated descriptions does not improve performance, with accuracy remaining around 74%. On the other hand, CLIP achieves a moderate 62% accuracy using image-text embeddings. While CLIP performs well on multi-modal mapping, it demonstrates challenges in deep semantic understanding compared to BERT's 0.76 F1 score on the text-only task. These findings highlight the challenges of effectively merging different modalities and point out future directions for improving sentiment analysis in multi-modal settings, enhancing the ability of models to fully understand the semantic content in both images and text.
- Research Article
13
- 10.11591/ijeecs.v29.i3.pp1817-1826
- Mar 1, 2023
- Indonesian Journal of Electrical Engineering and Computer Science
Sentiment analysis of views and opinions expressed in Indian regional languages has become a current focus of research. However, compared to a globally used language like English, there is very little research on sentiment analysis in Indian regional languages such as Malayalam. One of the major hindrances is the lack of publicly available Malayalam datasets. This work focuses on building a Malayalam dataset to facilitate sentiment analysis of Malayalam texts and on studying the efficiency of a pre-trained deep learning model in analyzing the sentiments latent in Malayalam texts. In this work, a Malayalam dataset has been created by extracting 2,000 tweets from Twitter. Bidirectional encoder representations from transformers (BERT) is a pre-trained model that has been used for various natural language processing tasks. This work employs a transformer-based BERT model for Malayalam sentiment analysis. The efficacy of BERT in analyzing the sentiments latent in Malayalam texts has been studied by comparing its performance with that of various machine learning and deep learning models. The results show a substantial increase in accuracy of 5% for BERT compared with Bi-GRU, the next best-performing model.
- Research Article
2
- 10.7759/cureus.88902
- Jul 28, 2025
- Cureus
Natural language processing (NLP) has become an essential tool in healthcare, enabling sentiment analysis to extract insights from patient reviews, clinician notes, and medical research. This study evaluates the effectiveness of three NLP models - Bidirectional Encoder Representations from Transformers (BERT), Valence Aware Dictionary and sEntiment Reasoner (VADER), and Flair - in analyzing patient sentiment from physician reviews. A total of 1,486 reviews of 30 pain management specialists in Atlanta, GA, were collected from Healthgrades, with sentiment scores derived from each model and compared to patient-provided numerical ratings. Statistical analyses, including pairwise t-tests, Pearson correlation, and logistic regression, were conducted to assess each model’s performance. Results showed significant differences among models (P < 0.05), with Flair demonstrating the highest correlation with patient ratings (r = 0.80), followed by BERT (r = 0.74) and VADER (r = 0.59). Logistic regression analysis further supported Flair's superior predictive accuracy. These findings highlight the potential of sentiment analysis in healthcare, offering an objective lens to interpret subjective patient experiences. Future research should focus on refining NLP models for medical contexts, integrating multimodal sentiment analysis, and addressing ethical considerations in patient data handling. By leveraging sentiment analysis, healthcare systems may improve patient satisfaction assessment, identify early signs of mental health concerns, and reduce documentation bias. While the results are promising, this study is limited by its retrospective design, single geographic region, and reliance on publicly available online reviews, which may not reflect the broader patient population or clinical encounters. Real-world validation in diverse settings and prospective studies is necessary to confirm the clinical applicability of these models.
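The Pearson correlation used to compare model sentiment scores with patient ratings can be computed as in the sketch below. The numbers are made up for illustration and are not taken from the study's 1,486 reviews.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up example: model sentiment scores vs. star ratings for five reviews.
model_scores = [0.9, 0.2, 0.7, -0.5, 0.4]
star_ratings = [5, 2, 4, 1, 3]
r = pearson_r(model_scores, star_ratings)
print(round(r, 3))
```

A value of r near 1 means the model's scores rise and fall with the ratings; it says nothing about whether the two scales agree in absolute terms.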
- Research Article
8
- 10.3390/electronics13224507
- Nov 17, 2024
- Electronics
This study examines how sentiment analysis of environmental, social, and governance (ESG) news affects the financial performance of companies in innovative sectors such as mobility, technology, and renewable energy. Using approximately 9828 general ESG articles from Google News and approximately 140,000 company-specific ESG articles, we performed term frequency-inverse document frequency (TF-IDF) analysis to identify key ESG-related terms and visualize their materiality across industries. We then applied models such as bidirectional encoder representations from transformers (BERT), the robustly optimized BERT pretraining approach (RoBERTa), and big bidirectional encoder representations from transformers (BigBird) for multiclass sentiment analysis, and distilled BERT (DistilBERT), a lite BERT (ALBERT), tiny BERT (TinyBERT), and efficiently learning an encoder that classifies token replacements accurately (ELECTRA) for positive and negative sentiment identification. Sentiment analysis results were correlated with profitability, cash flow, and stability indicators over a three-year period (2019–2021). ESG ratings from Morgan Stanley Capital International (MSCI), a prominent provider that evaluates companies’ sustainability practices, further enriched our analysis. The results suggest that sentiment impacts financial performance differently across industries; for example, positive sentiment correlates with financial success in mobility and renewable energy, while consumer goods often show positive sentiment even with low environmental ESG scores. The study highlights the need for industry-specific ESG strategies, especially in dynamic sectors, and suggests future research directions to improve the accuracy of ESG sentiment analysis.
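As a rough illustration of how TF-IDF surfaces distinctive terms, the sketch below uses the plain unsmoothed definition (term frequency times the log of inverse document frequency). The three tiny "articles" are invented, and library implementations typically apply smoothing and normalisation that this version omits.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Invented toy corpus standing in for tokenised ESG news articles.
docs = [
    ["emissions", "target", "report"],
    ["governance", "board", "report"],
    ["emissions", "solar", "report"],
]
weights = tf_idf(docs)
print(sorted(weights[0].items(), key=lambda kv: -kv[1]))
```

Terms that appear in every document, such as "report" here, receive a weight of zero, while terms unique to one document rank highest.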
- Research Article
25
- 10.1186/s12911-022-01946-y
- Jul 30, 2022
- BMC Medical Informatics and Decision Making
Background: Given the increasing number of people suffering from tinnitus, the accurate categorization of patients with actionable reports is attractive in assisting clinical decision making. However, this process requires experienced physicians and significant human labor. Natural language processing (NLP) has shown great potential in big data analytics of medical texts; yet, its application to domain-specific analysis of radiology reports is limited. Objective: The aim of this study is to propose a novel approach for classifying actionable radiology reports of tinnitus patients using bidirectional encoder representations from transformers (BERT)-based models and to evaluate the benefits of in-domain pre-training (IDPT) along with a sequence adaptation strategy. Methods: A total of 5864 temporal bone computed tomography (CT) reports are labeled by two experienced radiologists as follows: (1) normal findings without notable lesions; (2) notable lesions but uncorrelated to tinnitus; and (3) at least one lesion considered as potential cause of tinnitus. We then constructed a framework consisting of deep learning (DL) neural networks and self-supervised BERT models. A tinnitus domain-specific corpus is used to pre-train the BERT model to further improve its embedding weights. In addition, we conducted an experiment to evaluate multiple groups of max sequence length settings in BERT to reduce the excessive quantity of calculations. After a comprehensive comparison of all metrics, we determined the most promising approach through the performance comparison of F1-scores and AUC values. Results: In the first experiment, the BERT fine-tuned model achieved a more promising result (AUC 0.868, F1 0.760) compared with that of the Word2Vec-based models (AUC 0.767, F1 0.733) on validation data. In the second experiment, the BERT in-domain pre-training model (AUC 0.948, F1 0.841) performed significantly better than the BERT-based model (AUC 0.868, F1 0.760). Additionally, in the variants of BERT fine-tuning models, Mengzi achieved the highest AUC of 0.878 (F1 0.764). Finally, we found that the BERT max-sequence-length of 128 tokens achieved an AUC of 0.866 (F1 0.736), which is almost equal to the BERT max-sequence-length of 512 tokens (AUC 0.868, F1 0.760). Conclusion: We developed a reliable BERT-based framework for tinnitus diagnosis from Chinese radiology reports, along with a sequence adaptation strategy to reduce computational resources while maintaining accuracy. The findings could provide a reference for NLP development in Chinese radiology reports.
- Research Article
7
- 10.48084/etasr.10625
- Jun 4, 2025
- Engineering, Technology & Applied Science Research
Text classification is a fundamental task in Natural Language Processing (NLP) with a wide range of applications such as sentiment analysis, document classification and content recommendation. Traditional approaches like Naive Bayes (NB), Support Vector Machine (SVM) and Random Forest (RF) relied on feature engineering but lacked contextual understanding. Deep learning came into the picture for text classification with transformer models such as Bidirectional Encoder Representations from Transformers (BERT), which could understand contextual words bidirectionally. In this article, we utilize a pre-trained BERT model fine-tuned on the Reuters-21578 dataset to classify news articles. We aim to measure the performance of transfer learning against common machine learning models and non-fine-tuned BERT. The fine-tuned model achieves 91.77% accuracy, which significantly outperforms the non-fine-tuned BERT, and performs better than classical classifiers such as NB, SVM and RF. The results show that fine-tuning allows BERT to contextualize domain-specific intricacies, resulting in improved classification performance. We also address the computational trade-offs associated with transformer models, highlighting the need for optimal methods for deployment. Thus, this study further enables the use of fine-tuned BERT in automatic news classification and is of significant value for information retrieval and content personalization.
- Conference Article
37
- 10.18653/v1/2021.nllp-1.22
- Jan 1, 2021
Bidirectional Encoder Representations from Transformers (BERT) has achieved state-of-the-art performance on several text classification tasks, such as GLUE and sentiment analysis. Recent work in the legal domain has started to use BERT on tasks such as legal judgement prediction and violation prediction. A common practice when using BERT is to fine-tune a pre-trained model on a target task and truncate the input texts to the size of the BERT input (e.g. at most 512 tokens). However, due to the unique characteristics of legal documents, it is not clear how to adapt BERT effectively to the legal domain. In this work, we investigate how to deal with long documents and how important it is to pre-train on documents from the same domain as the target task. We conduct experiments on two recent datasets, the ECHR Violation Dataset and the Overruling Task Dataset, which are multi-label and binary classification tasks, respectively. Importantly, the average number of tokens in a document from the ECHR Violation Dataset is more than 1,600, while the documents in the Overruling Task Dataset are shorter (the maximum number of tokens is 204). We thoroughly compare several techniques for adapting BERT to long documents and compare different models pre-trained on the legal and other domains. Our experimental results show that BERT needs to be explicitly adapted to handle long documents, as truncation leads to less effective performance. We also found that pre-training on documents similar to those of the target task results in more effective performance in several scenarios.
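The difference between truncating a long document and splitting it into overlapping windows can be sketched as follows. The abstract does not list the specific long-document techniques it compares, so the sliding-window approach here is only one common option; the stride of 256 is an assumed value, and special tokens such as [CLS] and [SEP] are ignored for simplicity.

```python
def truncate(tokens, max_len=512):
    """Keep only the first `max_len` tokens and discard the rest."""
    return tokens[:max_len]

def sliding_windows(tokens, max_len=512, stride=256):
    """Split `tokens` into overlapping windows of at most `max_len` tokens."""
    if max_len <= 0 or stride <= 0 or stride > max_len:
        raise ValueError("require 0 < stride <= max_len")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window reached the end of the document
        start += stride
    return windows

# A synthetic 1,600-token document, roughly the average ECHR document length.
document = [f"tok{i}" for i in range(1600)]
head_only = truncate(document)
windows = sliding_windows(document)
print(len(head_only), len(windows))
```

Truncation keeps 512 of the 1,600 tokens, while the windows cover every token at the cost of several forward passes whose outputs then have to be aggregated (for example by averaging or max-pooling the per-window predictions).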
- Research Article
4
- 10.54254/2755-2721/92/20241711
- Oct 9, 2024
- Applied and Computational Engineering
This study examines sentiment analysis in Natural Language Processing (NLP) using Bidirectional Encoder Representations from Transformers (BERT). BERT's bidirectional Transformer architecture, pre-trained with Next Sentence Prediction (NSP) and Masked Language Modeling (MLM), has contributed substantially to recent advances in AI. This paper describes the BERT design, its pre-training methods, and fine-tuning for sentiment analysis tasks. The study then compares BERT's performance with other deep learning models, machine learning algorithms, and traditional rule-based techniques, highlighting the latter's limited ability to handle linguistic nuances and context. Additionally, studies demonstrating the consistency and accuracy of BERT's sentiment analysis are examined, along with the challenges of handling irony, sarcasm, and domain-specific data. The study also examines the ethical and privacy concerns that sentiment analysis inherently raises, makes recommendations for further research, and shows how integrating sentiment analysis with other domains can lead to multidisciplinary breakthroughs that offer more comprehensive insights and applications.
- Research Article
13
- 10.17762/turcomat.v12i7.3055
- Apr 19, 2021
- Turkish Journal of Computer and Mathematics Education (TURCOMAT)
Recent trends in sentiment analysis have created new demand for understanding contextual representations of language. Among conventional machine learning and deep learning models, those that learn context are promising candidates for the sentiment classification task. BERT is a pre-trained language model for contextual embedding that has attracted attention because of its deep analysis capability, the valuable linguistic knowledge in its intermediate layers, its training on a large corpus, and its ability to be fine-tuned for any NLP task. Many researchers have adapted the BERT model for sentiment analysis tasks by modifying the original architecture to improve classification accuracy. This article summarizes and reviews the BERT architecture and the performance observed from fine-tuning different layers and attention heads.
- Conference Article
2
- 10.5644/pi2024.215.19
- Sep 1, 2024
Sentiment analysis leverages machine learning and natural language processing techniques to classify and interpret textual data, identifying sentiments as positive, negative, or neutral. This study explores sentiment analysis in the context of mental health, utilising two models: Logistic Regression and Bidirectional Encoder Representations from Transformers (BERT). The dataset comprises 52 680 unique statements associated with seven mental health statuses, including depression, anxiety and suicidal tendencies. Logistic Regression achieved an accuracy of 72%, while BERT, with its advanced deep learning architecture, demonstrated a significant improvement with an accuracy of 84%. BERT’s superior performance is attributed to its bidirectional contextual understanding and attention mechanisms, enhancing its ability to handle complex and nuanced textual information. This study highlights the efficacy of BERT over traditional models in analysing and classifying sentiments related to mental health, underscoring its potential for improving early detection and intervention.