Layout-aware Single-image Document Flattening

Pu Li, Dong-Ming Yan, Weize Quan, Jianwei Guo

doi:10.1145/3627818

Abstract

Single-image rectification of document deformation is a challenging task. Although some recent deep learning-based methods have attempted to solve this problem, they cannot achieve satisfactory results on document images with complex deformations. In this article, we propose a new, efficient framework for document flattening. Our main insight is that most layout primitives in a document have rectangular outlines, so unwarping a local layout primitive is essentially the same problem as unwarping the entire document. The former task is clearly more straightforward to solve than the latter because a single primitive has more consistent texture and relatively smooth deformation. On this basis, we propose a layout-aware deep model that works in a divide-and-conquer manner. First, we employ a transformer-based segmentation module to obtain the layout information of the input document. Then, a new regression module is applied to predict the global and local UV maps. Finally, we design an effective merging algorithm to correct the global prediction with local details. Both quantitative and qualitative experimental results demonstrate that our framework performs favorably against state-of-the-art methods. In addition, the currently available public document flattening datasets cover a limited range of 3D paper shapes, provide no layout annotation, and lack a general geometric correction metric. Therefore, we build a new large-scale synthetic dataset by utilizing a fully automatic rendering method to generate deformed documents with diverse shapes and exact layout segmentation labels. We also propose a new geometric correction metric based on our paired document UV maps. Code and dataset will be released at https://github.com/BunnySoCrazy/LA-DocFlatten.
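The abstract does not spell out the merging algorithm or the UV convention, so the following is only a minimal NumPy sketch of the general idea, under stated assumptions: a UV map stores, for each pixel of the warped photo, its normalized (u, v) position in the flat document; a local UV map is normalized to its own layout primitive; and the primitive occupies a known rectangle `box` in flat coordinates. The function names `merge_uv` and `flatten_with_uv`, the `box` parameter, and the `weight` blend factor are all hypothetical and are not taken from the released code.

```python
import numpy as np


def merge_uv(global_uv, local_uv, mask, box, weight=1.0):
    """Overwrite (or blend) the global UV map with a local one inside a mask.

    global_uv, local_uv: float arrays of shape (H, W, 2) holding (u, v) in [0, 1].
    mask: bool array of shape (H, W), True where the layout primitive lies.
    box: (u0, v0, u1, v1), the primitive's rectangle in flat document
         coordinates (an assumption of this sketch).
    weight: 1.0 fully trusts the local prediction, 0.0 keeps the global one.
    """
    u0, v0, u1, v1 = box
    # Re-express the local UV values in the global (whole document) frame.
    local_in_global = np.empty_like(local_uv)
    local_in_global[..., 0] = u0 + local_uv[..., 0] * (u1 - u0)
    local_in_global[..., 1] = v0 + local_uv[..., 1] * (v1 - v0)

    merged = global_uv.copy()
    merged[mask] = (1.0 - weight) * global_uv[mask] + weight * local_in_global[mask]
    return merged


def flatten_with_uv(image, uv, mask, out_h, out_w):
    """Nearest-neighbour forward mapping of warped pixels to the flat page.

    Every foreground pixel (mask True) is written to the output location
    given by its UV value. Holes are left as zeros; a real system would
    interpolate instead.
    """
    out = np.zeros((out_h, out_w) + image.shape[2:], dtype=image.dtype)
    ys, xs = np.nonzero(mask)
    cols = np.rint(uv[ys, xs, 0] * (out_w - 1)).astype(int)
    rows = np.rint(uv[ys, xs, 1] * (out_h - 1)).astype(int)
    cols = np.clip(cols, 0, out_w - 1)
    rows = np.clip(rows, 0, out_h - 1)
    out[rows, cols] = image[ys, xs]
    return out


# Toy check: the "warp" is a horizontal flip of a 4x4 page.
flat = np.arange(16).reshape(4, 4)
warped = flat[:, ::-1]
yy, xx = np.mgrid[0:4, 0:4]
uv = np.stack([(3 - xx) / 3.0, yy / 3.0], axis=-1)  # where each warped pixel came from
recovered = flatten_with_uv(warped, uv, np.ones((4, 4), dtype=bool), 4, 4)
```

In this toy case `recovered` equals `flat`, since every warped pixel is sent back to its original column. The hard overwrite inside the mask (`weight=1.0`) is the simplest possible merge; the actual method may blend or refine differently.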

Journal: ACM Transactions on Graphics
Publication Date: Nov 2, 2023
Citations: 3

