Abstract

As large digital map archives become available, the ability to automatically extract information from historical maps is important for many domains that require long-term geographic data, such as studies of how the landscape and human activities develop. In previous work, we built a system that automatically recognizes geographic features in historical maps using Convolutional Neural Networks (CNNs). Our system uses contemporary vector data to automatically label examples of the geographic feature of interest in historical maps as training samples for the CNN model. The alignment between the vector data and the geographic features in the maps determines whether the system can generate representative training samples, which has a significant impact on the system's recognition performance. Because the CNN model needs a large amount of training data and an archive can contain tens of thousands of maps to process, manually aligning the vector data to each map is not practical. In this paper, we present an algorithm that automatically aligns vector data with geographic features in historical maps. Existing alignment approaches focus on road features and imagery and are difficult to generalize to other geographic features. Our algorithm aligns various types of geographic features in document images with the corresponding vector data. In our experiment, the alignment algorithm increased the correctness and completeness of the extracted railroad and river vector data by about 100% and 20%, respectively. For feature recognition, the aligned vector data yielded a 100% improvement in precision while maintaining similar recall.
