Abstract

Retrieving information through internet search is one of the most important tasks for any user. ‘Syntactic search’ relies on keyword-based matching, and its accuracy is improved by applying filters such as location, preference, and user history. However, the user's query or question and the best available answer or result on the internet may have no terms in common, or only a negligible number of common terms. In such cases syntactic search cannot give the desired output, and ‘semantic search’ becomes essential. Implementing semantic search is challenging when resources such as WordNet, ontologies, and annotations are unavailable. This work describes an end-to-end algorithm to improve the accuracy of semantic search. Four classification techniques are used: ANN, Decision Tree, SVM, and Naïve Bayes. The dataset comes from the TDIL project of the Ministry of Electronics and IT, Govt. of India. The repository contains 86 categories of text with more than a million sentences. After impressive results were obtained for the Bengali language, test runs were carried out for other Indian languages, and very good results were achieved. This research is useful for automatic question answering systems, semantic similarity analysis, e-governance, and m-governance.

Highlights

  • Nowadays we are surrounded by digital content, and we use the internet for any information

  • Four classification techniques, namely ANN, Naïve Bayes, SVM, and Decision Tree, are used

Summary

INTRODUCTION

We are surrounded by digital content, and we use the internet for any information. For comfort, or for lack of knowledge of other languages, users generally prefer to use their mother tongue; this holds for internet users as well. In one scenario, a query or question is posed in language A while the answer or related information is available in language B. The search engine is unaware of that answer, as it is not feasible to translate the query into thousands of different languages. In another scenario, the query and the answer are available in the same language but have no word in common. In this work a collection of novels and texts written in or translated into Bengali is used. This repository contains 86 different categories of text, marked with the text category and the sentence header. The following sections give the results, the challenges faced, and the future scope for improvement.
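The case where a query and its best answer share no words can be illustrated with a small sketch. It is not the algorithm of this work; it only shows, under stated assumptions, why a purely keyword-based score fails. The function names `tokenize` and `keyword_overlap`, the Jaccard overlap measure, and the English example sentences are all hypothetical choices made for illustration, and the simple regex tokenizer may not segment Bengali script correctly.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens (simplistic)."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_overlap(query, document):
    """Jaccard overlap between the term sets of a query and a document."""
    q, d = tokenize(query), tokenize(document)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


# Hypothetical sentences: the relevant answer shares no term with the query,
# while an unrelated sentence shares several.
query = "how can I reduce fever"
relevant = "paracetamol lowers high body temperature"
irrelevant = "how can I reduce my electricity bill"

score_relevant = keyword_overlap(query, relevant)
score_irrelevant = keyword_overlap(query, irrelevant)

print(score_relevant, score_irrelevant)
```

In this toy example the unrelated sentence outranks the relevant one on keyword overlap alone, which is the gap that semantic search is meant to close.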

RELATED WORK
PROBLEM STATEMENT
PROPOSED APPROACH
DECISION TREE
METHODOLOGY
RESULT SUMMARY
DETAILED RESULT
APPLICATIONS
FUTURE SCOPE FOR IMPROVEMENTS
Findings
CONCLUSION