Abstract

• An interactive framework for the reconstruction of strip-shredded documents
• The user locks or forbids pairs automatically selected by the recommender module
• Four query strategies for recommending the pairs of shreds to be annotated
• A novel methodology to assess the human impact on the quality of a reconstruction
• Annotating 25% of the shreds can yield an error reduction of more than 40%

Advances in machine learning, particularly in deep learning, have made it possible to automate the reconstruction of shredded documents with significant accuracy. However, despite recent remarkable results, the state of the art in fully automatic reconstruction still has room for improvement, mainly because of imprecision in evaluating how well the shreds fit each other (compatibility/cost evaluation). To tackle this problem, we propose a human-in-the-loop reconstruction framework that takes user input to improve the solutions (permutations of shreds). In our approach, the user verifies whether shreds that are adjacent in a solution are also adjacent in the original document. Unlike the current literature, our framework includes a recommender module that automatically selects the pairs of shreds to be analyzed by a human. Four recommendation strategies were proposed and evaluated. Results achieved by coupling deep learning reconstruction methods with our framework show that introducing the human in the loop can reduce errors by more than 40%.
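The lock/forbid interaction described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `recommend_pairs` and `apply_annotation`, the square cost matrix `cost[i][j]` (cost of placing shred `j` immediately to the right of shred `i`), and the "highest cost first" query rule, which is not claimed to be one of the four strategies evaluated in the paper. Re-solving the permutation after each annotation is left out.

```python
import math


def recommend_pairs(solution, cost, k):
    """Return up to k adjacent pairs of the current solution, most costly first.

    Hypothetical query rule: pairs the cost function trusts least are shown
    to the user first. This is an illustrative assumption only.
    """
    pairs = list(zip(solution, solution[1:]))
    pairs.sort(key=lambda p: cost[p[0]][p[1]], reverse=True)
    return pairs[:k]


def apply_annotation(cost, pair, is_adjacent):
    """Encode the user's answer for one pair directly in the cost matrix.

    Lock (is_adjacent=True): every competing right neighbour of `left` and
    every competing left neighbour of `right` becomes infeasible, and the
    confirmed pair gets zero cost.
    Forbid (is_adjacent=False): only this pair becomes infeasible.
    """
    left, right = pair
    if is_adjacent:
        for j in range(len(cost)):
            if j != right:
                cost[left][j] = math.inf
            if j != left:
                cost[j][right] = math.inf
        cost[left][right] = 0.0
    else:
        cost[left][right] = math.inf
```

In a full loop, the updated matrix would be handed back to whichever solver produced the permutation, so that locked pairs are kept and forbidden pairs cannot reappear in the next solution.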
