Abstract

Cross-language multi-document summarization (CLMDS) produces a summary from multiple documents, where the summary language differs from the language of the source documents. A CLMDS model allows the user to provide a query in a particular language (e.g., Tamil) and generates a summary in that same language from source documents written in a different language. The proposed model enables the user to provide a query in Tamil, generates a summary from multiple English documents, and finally translates the summary into Tamil. The proposed model uses a naïve Bayes classifier (NBC) for CLMDS. An extensive set of experiments was performed and the results were examined from several aspects. The experimental results indicated the superior performance of the presented CLMDS model.

Highlights

  • Text summarization (TS) is the task of distilling the essential information from a source document to produce an abbreviated version for a particular purpose.

  • The cross-lingual set of documents [13] is retrieved for all the queries, and the expansion terms are identified by their term selection value.

Summary

Introduction

Text summarization (TS) is the task of distilling the essential information from a source document to produce an abbreviated version for a particular purpose. Existing systems have employed compressive as well as abstractive frameworks to improve the usefulness and grammatical quality of summaries. These models need language-specific resources [2] and the integration of diverse models, which restricts their applicability to summary generation in various languages. Some current models have improved the quality of cross-lingual summarization by using translation quality scores as well as information from the documents in all the languages involved [4,5]. The authors of [7] applied a Support Vector Machine (SVM) regression approach to predict the translation quality of English-Chinese sentence pairs from basic features such as sentence length, sub-sentence value, proportion of nouns and adjectives, and parse features. The authors of [8] trained ε-Support Vector Regression (ε-SVR) to predict the translation quality score, using the automatic NIST measure as the quality indicator. Cohesion metrics are used to produce clusters of similar sentences in the multi-sentence compression procedure.
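The abstract states that the proposed model relies on a naïve Bayes classifier (NBC), but this excerpt does not give its features or training setup. As a purely illustrative sketch, and not the authors' implementation, the following shows how a Bernoulli naïve Bayes model could score a sentence as summary-worthy from binary features. The feature names (has_query_term, is_lead), the toy training data, and the helper names train_nb and prob_summary are all assumptions introduced here for illustration.

```python
import math
from collections import defaultdict


def train_nb(samples, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of (features, label) pairs, where features maps a
    hypothetical feature name to True/False and label is 1 if the
    sentence belongs in the summary, else 0. alpha is Laplace smoothing.
    """
    class_counts = defaultdict(int)
    feat_counts = {0: defaultdict(int), 1: defaultdict(int)}
    features = set()
    for feats, label in samples:
        class_counts[label] += 1
        for name, on in feats.items():
            features.add(name)
            if on:
                feat_counts[label][name] += 1
    total = sum(class_counts.values())
    model = {"prior": {}, "p_on": {}, "features": sorted(features)}
    for c in (0, 1):
        # Smoothed class prior and per-feature P(feature is on | class).
        model["prior"][c] = (class_counts[c] + alpha) / (total + 2 * alpha)
        model["p_on"][c] = {
            f: (feat_counts[c][f] + alpha) / (class_counts[c] + 2 * alpha)
            for f in features
        }
    return model


def prob_summary(model, feats):
    """Return P(label = 1 | feats) under the fitted model."""
    log_post = {}
    for c in (0, 1):
        lp = math.log(model["prior"][c])
        for f in model["features"]:
            p = model["p_on"][c][f]
            lp += math.log(p if feats.get(f, False) else 1.0 - p)
        log_post[c] = lp
    # Normalise in log space to avoid underflow.
    m = max(log_post.values())
    exp = {c: math.exp(v - m) for c, v in log_post.items()}
    return exp[1] / (exp[0] + exp[1])


# Toy, made-up training data: sentences containing a query term are
# labelled as summary sentences.
train = [
    ({"has_query_term": True, "is_lead": True}, 1),
    ({"has_query_term": True, "is_lead": False}, 1),
    ({"has_query_term": False, "is_lead": True}, 0),
    ({"has_query_term": False, "is_lead": False}, 0),
]
model = train_nb(train)
p_hi = prob_summary(model, {"has_query_term": True, "is_lead": True})
p_lo = prob_summary(model, {"has_query_term": False, "is_lead": True})
```

In a pipeline like the one the abstract describes, sentences from the English documents would be ranked by such a probability and the top-ranked ones passed on for translation; how the paper actually selects and orders sentences is not specified in this excerpt.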

NBC-based Multi-document Summarization
Performance Validation
Conclusion
References
