Abstract

This paper investigates recent research on active learning for (geo) text and image classification, with an emphasis on methods that combine visual analytics and/or deep learning. Deep learning has attracted substantial attention across many domains of science and practice because it can find intricate patterns in big data, but successful application of the methods requires a large set of labeled data. Active learning, which has the potential to address the data labeling challenge, has already had success in geospatial applications such as trajectory classification from movement data and (geo) text and image classification. This review is intended to be particularly relevant for extension of these methods to GIScience, to support work in domains such as geographic information retrieval from text and image repositories, interpretation of spatial language, and related geo-semantics challenges. Specifically, to provide a structure for leveraging recent advances, we group the relevant work into five categories: active learning, visual analytics, active learning with visual analytics, active deep learning, plus GIScience and Remote Sensing (RS) using active learning and active deep learning. Each category is exemplified by recent influential work. Based on this framing and our systematic review of key research, we then discuss some of the main challenges of integrating active learning with visual analytics and deep learning, and point out research opportunities from technical and application perspectives; for application-based opportunities, we emphasize those that address big data with geospatial components.

Highlights

  • Big data are leading to dramatic changes in science and in society

  • Machine learning (ML) and deep learning (DL), where DL is a sub-domain of ML, are increasingly successful in extracting information from big data

  • We argue for taking a visual analytics approach to empowering active deep learning for text and image classification; we review a range of recent developments in the relevant fields that can be leveraged to support this approach

Introduction

Big data are leading to dramatic changes in science (with the advent of data-driven science) and in society (with potential to support economic, public health, and other advances). Recent advances in machine learning and especially in deep learning, coupled with the release of many open source tools (e.g., Google TensorFlow [1], an open-source software library for machine intelligence), create the potential to leverage big data to address GIScience and Remote Sensing (RS) research and application challenges. The two primary goals of this paper are: (1) to synthesize ideas and results from machine learning and deep learning, plus visual analytics, and (2) to provide a base from which new GIScience and RS advances can be initiated. We argue for taking a visual analytics approach to empowering active deep learning for (geo) text and image classification; we review a range of recent developments in the relevant fields that can be leveraged to support this approach.

Abbreviations: AI, artificial intelligence; ML, machine learning; DL, deep learning; AL, active learning; ADL, active deep learning; VA, visual analytics; RS, remote sensing. Figure key: 1, GIScience applications (AL / AL+VA).

Scope and Intended Audience
The State of the Art
What’s AL and Why AL?
AL Problem Scenarios
AL Core Components
Batch-Mode AL
AL Query Strategies
Recent and Novel AL Methods
AL Summary and Discussion
AL with VA
GIScience and RS Applications Using AL and ADL
Challenges and Research Opportunities
Summary and Discussion
Findings
Technical Challenges and Opportunities
