Abstract

The goal of storyboard extraction is to decompose a comic image into several storyboards (or frames). It is a fundamental step in comic image understanding and in producing digital comic documents suitable for mobile reading. Most existing approaches are based on hand-crafted low-level visual patterns such as edge segments and line segments, which do not capture high-level visual information. To overcome these shortcomings, we propose a novel architecture based on a deep convolutional neural network, namely the Shape Regression Network (SReN), to detect storyboards within comic images. First, we use Fast R-CNN to generate rectangular bounding boxes as storyboard proposals. Then we train a deep neural network to predict quadrangles for these proposals. Unlike existing object detection methods, which output only rectangular bounding boxes, SReN can produce more precise quadrangular bounding boxes. Experimental results, evaluated on 7382 comic pages, demonstrate that SReN outperforms state-of-the-art methods by more than 10% in terms of F1-score and page correction rate.
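
To illustrate the difference between a rectangular proposal and a quadrangular output, the following minimal Python sketch refines an axis-aligned box into a quadrangle by shifting each corner. The abstract does not state how SReN parameterizes quadrangles, so the per-corner offset encoding (fractions of box width and height) and the names `rect_to_corners`, `apply_corner_offsets`, and `polygon_area` are assumptions made purely for illustration, not the authors' method.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def rect_to_corners(box: Box) -> List[Point]:
    """Return the four corners of an axis-aligned box, clockwise from top-left."""
    x0, y0, x1, y1 = box
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


def apply_corner_offsets(box: Box, offsets: List[Point]) -> List[Point]:
    """Turn a rectangular proposal into a quadrangle.

    Each offset (dx, dy) moves one corner by dx * width and dy * height.
    This encoding is hypothetical; the abstract does not specify one.
    """
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    corners = rect_to_corners(box)
    return [
        (cx + dx * width, cy + dy * height)
        for (cx, cy), (dx, dy) in zip(corners, offsets)
    ]


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1]
        - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


if __name__ == "__main__":
    proposal = (10.0, 20.0, 110.0, 70.0)  # a rectangular proposal
    # Made-up offsets standing in for a regressor's output: slant the frame.
    offsets = [(0.25, 0.0), (0.0, 0.0), (-0.25, 0.0), (0.0, 0.0)]
    quad = apply_corner_offsets(proposal, offsets)
    print("quadrangle corners:", quad)
    print("rectangle area:", polygon_area(rect_to_corners(proposal)))
    print("quadrangle area:", polygon_area(quad))
```

In this toy case the slanted quadrangle covers less area than the enclosing rectangle, which is the sense in which a quadrangle can fit a non-rectangular storyboard more tightly than a rectangular box.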
