Abstract

We address the problem of offline handwritten diagram recognition. Recently, it has been shown that diagram symbols can be directly recognized with deep learning object detectors. However, object detectors are not able to recognize the diagram structure. We propose Arrow R-CNN, the first deep learning system for joint symbol and structure recognition in handwritten diagrams. Arrow R-CNN extends the Faster R-CNN object detector with an arrow head and tail keypoint predictor and a diagram-aware postprocessing method. We propose a network architecture and data augmentation methods targeted at small diagram datasets. Our diagram-aware postprocessing method addresses the insufficiencies of standard Faster R-CNN postprocessing. It reconstructs a diagram from a set of symbol detections and arrow keypoints. Arrow R-CNN substantially improves the state of the art: on a scanned flowchart dataset, we increase the rate of recognized diagrams from 37.7% to 78.6%.

Highlights

  • Graphical modeling languages are a long-used and intuitive device to visualize algorithms, business process models, and software systems

  • The model mostly struggles with recognizing arrows and text phrases due to their varying form and size. We agree with their motivation and propose an offline handwritten diagram recognition approach which builds upon Faster R-CNN for symbol recognition

  • We demonstrate how a Faster R-CNN object detector can be extended with a lightweight arrow keypoint predictor for diagram structure recognition

Summary

Introduction

Graphical modeling languages are a long-used and intuitive device to visualize algorithms, business process models, and software systems. The model mostly struggles with recognizing arrows and text phrases due to their varying form and size. We agree with their motivation and propose an offline handwritten diagram recognition approach which builds upon Faster R-CNN for symbol recognition. While the recognition of computer-generated arrows in the mentioned examples is important, this work focuses on handwritten diagrams, where each arrow connects two nodes, and each text phrase annotates either a node or an arrow. Although this structure is simple, it is sufficiently powerful to describe graphical modeling languages from various domains.
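
To make the diagram structure described above concrete, the following is a minimal sketch of how detected arrows could be linked to nodes once arrow head and tail keypoints are available. It is an illustration under stated assumptions, not the Arrow R-CNN postprocessing itself: the names `Node`, `ArrowDetection`, `point_to_box_distance`, and `link_arrows` are hypothetical, and the rule of attaching each keypoint to the nearest node bounding box is an assumed heuristic.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Node:
    """A detected node symbol (hypothetical container, not from the paper)."""
    label: str  # symbol class, e.g. "process" or "decision"
    box: Box


@dataclass
class ArrowDetection:
    """A detected arrow with its predicted tail and head keypoints."""
    tail: Point
    head: Point


def point_to_box_distance(p: Point, box: Box) -> float:
    """Euclidean distance from a point to a box; 0.0 if the point is inside."""
    x, y = p
    x0, y0, x1, y1 = box
    dx = max(x0 - x, 0.0, x - x1)
    dy = max(y0 - y, 0.0, y - y1)
    return hypot(dx, dy)


def nearest_node(p: Point, nodes: List[Node]) -> int:
    """Index of the node whose bounding box is closest to the point."""
    return min(range(len(nodes)),
               key=lambda i: point_to_box_distance(p, nodes[i].box))


def link_arrows(nodes: List[Node],
                arrows: List[ArrowDetection]) -> List[Tuple[int, int]]:
    """Turn each arrow into a (source, target) edge between node indices.

    Assumed heuristic: the tail attaches to the node nearest the tail
    keypoint, and the head to the node nearest the head keypoint.
    """
    return [(nearest_node(a.tail, nodes), nearest_node(a.head, nodes))
            for a in arrows]


if __name__ == "__main__":
    nodes = [Node("terminator", (0, 0, 10, 10)),
             Node("process", (0, 30, 10, 40))]
    arrows = [ArrowDetection(tail=(5, 11), head=(5, 29))]
    print(link_arrows(nodes, arrows))
```

The result is a list of directed edges over node indices, which matches the constraint that each arrow connects two nodes. Text phrases could be attached to a node or an arrow in a similar way, but that step is left out of this sketch.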

Related work
Handwritten diagram recognition
Keypoint detection
Arrow R-CNN
Network architecture
Training
Inference
Integrating diagram domain knowledge
Augmentation
Diagram-aware postprocessing
Experiments
Datasets
Evaluation metrics
Implementation
Results
Error analysis and future work
Conclusion
Compliance with ethical standards