Abstract
This paper presents a curated dataset for textual sentiment analysis in technical education, within the broader domain of Natural Language Processing and Pattern Recognition. The dataset, created in collaboration with the All India Council for Technical Education (AICTE), comprises 14,272 records entered manually by representatives of technical institutes across India over the course of one year. The data, hosted on AICTE's in-house servers, is categorized into seven labels, including Appreciation, Complaint, Support, and Suggestion. The records were collected through an online application and are further classified into 10 modules, each covering a different aspect of technical education. To the best of our knowledge, it is the first publicly available dataset of its kind, and it provides a resource for evaluating existing models and developing new ones. The paper outlines the experimental design, materials, and methods used in data collection, along with the dataset's limitations and ethical considerations. It also acknowledges the contributions of the AICTE in facilitating the data collection process. The dataset is valuable for advancing research and applications in sentiment analysis and related fields within technical education.
Approximately 10,000 technical institutions in India operate under the jurisdiction of the AICTE. To gather comprehensive data, an online application with ten modules was developed and distributed to all of these institutions. Over the course of a year, labeled data was collected systematically from them and categorized into seven types, such as Appreciation, Complaint, Support, and Suggestion. This dataset has significant potential for deep learning applications, including sentiment analysis and classification problems. The data was entered by representatives of the technical institutions and stored through a systematic process designed for accuracy. Its precision and reliability make it a strong resource for training and testing deep learning models, particularly in applications such as sentiment analysis, where the quality and authenticity of the data are crucial for robust model development. To the best of our knowledge, this is the first dataset of its kind in the domain of technical education, with more than 14,000 samples. Very few multiclass, multi-label datasets are available, so this dataset is especially useful for Natural Language Processing applications such as sentiment analysis.