Abstract
Text summarization remains a challenging task in natural language processing despite its many applications in enterprises and daily life. One common use case is the summarization of web pages, which can give devices with limited features an overview of a page's content. Although mobile devices are increasingly widespread in rural areas, most of those devices offer limited features, and these areas are often served only by limited connectivity such as the GSM network. Summarizing web pages into SMS therefore becomes an important way to deliver information to such devices. This work introduces WATS-SMS, a T5-based French Wikipedia Abstractive Text Summarizer for SMS. It is built through a transfer learning approach: the English pre-trained T5 model is retrained on 25,000 Wikipedia pages to produce a French text summarization model, which is then compared with different approaches in the literature. The objective is twofold: (1) to check the assumption made in the literature that abstractive models provide better results than extractive ones; and (2) to evaluate the performance of our model against other existing abstractive models. A score based on ROUGE metrics gives 52% for articles of up to 500 characters, against 34.2% for transformer-ED and 12.7% for seq2seq-attention, and 77% for longer articles, against 37% for transformers-DMCA. Moreover, an architecture including a software SMS gateway has been developed to allow owners of mobile devices with limited features to send requests and receive summaries through the GSM network.
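To make the ROUGE-based comparison more concrete, the sketch below computes ROUGE-1 (unigram overlap) precision, recall, and F1 between a candidate summary and a reference. The abstract does not say which ROUGE variants were used or how they were combined into a single score, so the choice of ROUGE-1, the whitespace tokenization, and the function name `rouge_1` are assumptions made purely for illustration.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """ROUGE-1 sketch: unigram overlap between candidate and reference.

    Assumption: lowercased whitespace tokenization; the paper's actual
    tokenization and ROUGE variant are not stated in the abstract.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("le chat dort", "le chat noir dort")
    print(scores)
```

Recall rewards covering the reference, precision penalizes padding, and F1 balances the two. That balance matters when the summary must stay short enough to fit in an SMS.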
Highlights
One of the most fascinating advances in the field of artificial intelligence is the ability of computers to understand natural language
This paper introduces WATS-SMS, a French Wikipedia Abstractive Text Summarizer that aims to summarize French Wikipedia pages into SMS and to provide summaries directly on the user’s device
T5 casts all natural language processing (NLP) tasks into a unified text-to-text format, in which both input and output are text strings, unlike BERT-based models, which usually generate either a class label or a span of the input [57]
Summary
One of the most fascinating advances in the field of artificial intelligence is the ability of computers to understand natural language. This paper introduces WATS-SMS, a French Wikipedia Abstractive Text Summarizer that aims to summarize French Wikipedia pages into SMS and to provide summaries directly on the user’s device. It is built by applying a transfer learning technique to fine-tune a pre-trained model on French Wikipedia pages. Existing summarization approaches [26,27,28,29,30,31] do not take into consideration the limitation in terms of the number of characters.
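The character limit mentioned above can be illustrated with a small sketch that splits a generated summary into SMS-sized segments at word boundaries. The 160-character default reflects the usual single-message limit for the GSM 7-bit alphabet. The actual limit used by WATS-SMS, and how its gateway handles longer summaries, are not stated here, so `SMS_LIMIT` and `split_into_sms` are hypothetical names introduced only for this example.

```python
import textwrap

# Assumption: 160 characters per message (GSM 7-bit single SMS).
# Accented French text may force a different encoding and a lower limit.
SMS_LIMIT = 160


def split_into_sms(summary: str, limit: int = SMS_LIMIT) -> list:
    """Split a summary into segments of at most `limit` characters.

    textwrap.wrap breaks at whitespace where possible and hard-splits
    any single word longer than `limit`.
    """
    return textwrap.wrap(summary, width=limit)


if __name__ == "__main__":
    text = "Paris est la capitale de la France. " * 10
    for i, part in enumerate(split_into_sms(text), start=1):
        print(i, len(part), part)
```

A summarizer that targets SMS delivery would ideally generate output that fits in one segment. Splitting after generation, as sketched here, is only a fallback.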