Abstract

In recent years, video contents have observed a massive growth with online lecture, news channels, and media contents and with short clips. These contents have objects, texts in them. The majority of the time, multilingual datasets are not available as per the need and have to utilize YouTube videos with multiple languages as a source . This study proposes a systematic structure for the generation of multilingual datasets and explains the purpose for their creation. Additionally, the study demonstrated the difficulties in creating the data set, like segregating data that is multilingual, etc. Lastly, discusses the effects of creating multilingual datasets because of various difficulties. As a result, the research's contribution is the multilingual dataset that was created using the suggested systematic structure, which also practically shows its limitations and benefits. Key Words:, Data Set, Multilingual, Text detection, challenge, consequences

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call