Abstract. Advances in remote sensing platforms and sensors, together with improved tools and methods for processing remotely sensed data, create new opportunities for automatic map updating. Aerial photographs are currently the main source for automatic map updates because of their accessibility and high information content. A core element of the image-to-map transition is accurate image segmentation, a task in which machine learning methods currently demonstrate the best results. A map represents information about an area in vector form, which not only carries visual information about the area but also reflects relations between the objects on the map. This quality makes a map more convenient for human perception than an aerial photograph (a raster image). This study addresses the problem of accurate aerial image segmentation by using a graph neural network as a more adequate model of map structure. We use the graph neural network to retrieve semantic and vector information about a captured area from its aerial image. In its first phase, the developed framework uses a visual transformer to extract deep features from the input aerial image. The graph neural network then clusters the extracted deep features to obtain a semantic segmentation of the image. To train and evaluate the framework, a dedicated dataset was collected and annotated. It contains more than 10k aerial photographs showing various types of objects, taken in different years and seasons. Evaluation on this dataset demonstrated state-of-the-art performance of the developed framework.
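
The two-phase pipeline described above (deep patch features first, graph-based clustering second) can be illustrated with a toy sketch. The abstract does not specify how the graph is built or how clusters are assigned, so everything here is an assumption made for illustration: the hand-written `patch_features` stand in for transformer patch embeddings, the graph is a k-nearest-neighbour graph under cosine similarity, the single mean-aggregation step stands in for a graph neural network layer, and the fixed `prototypes` stand in for cluster parameters that would normally be learned.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def knn_graph(features, k):
    """Adjacency lists: each patch is linked to its k most similar patches."""
    neighbours = []
    for i, f in enumerate(features):
        ranked = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: cosine(f, features[j]),
            reverse=True,
        )
        neighbours.append(ranked[:k])
    return neighbours

def aggregate(features, neighbours):
    """One message-passing step: average each node with its neighbours."""
    out = []
    for i, f in enumerate(features):
        group = [f] + [features[j] for j in neighbours[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

def assign_clusters(features, prototypes):
    """Label each node with the index of its most similar prototype."""
    return [
        max(range(len(prototypes)), key=lambda c: cosine(f, prototypes[c]))
        for f in features
    ]

# Hypothetical stand-in for the patch embeddings of a 2x3 patch grid.
patch_features = [
    [1.0, 0.1], [0.9, 0.2], [0.8, 0.0],   # patches of one surface type
    [0.1, 1.0], [0.2, 0.9], [0.0, 0.8],   # patches of another surface type
]
# Hypothetical cluster prototypes; in a trained model these would be learned.
prototypes = [[1.0, 0.0], [0.0, 1.0]]

graph = knn_graph(patch_features, k=2)
smoothed = aggregate(patch_features, graph)
labels = assign_clusters(smoothed, prototypes)
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

Each label corresponds to one image patch, so reshaping `labels` back to the patch grid would give a coarse segmentation mask. A real system would replace the toy features with transformer outputs and the averaging step with trained graph layers.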