Abstract

AMI-Diagram is a tool for mining facts from diagrams and converting the graphics primitives into XML. Targets include X-Y plots, bar charts, chemical structure diagrams and phylogenetic trees. Part of the ContentMine framework for automatically extracting science from the published literature, AMI can ingest born-digital diagrams as latent vectors, pixel diagrams or scanned documents. For high-quality, high-resolution diagrams the process is automatic; command-line parameters can be used for noisy or complex diagrams. This article provides an overview of the tool, which is currently being deployed to alpha testers, especially in chemistry and phylogenetics.

Highlights

  • There are at least 10 million diagrams published in the scientific literature each year and many of them represent factual information

  • AMI-Diagram is a flexible tool which can mine facts from diagrams and convert the graphics primitives into XML

  • Over 1 million scientific articles are published yearly, along with a similar number of theses and grey-literature documents [1]. Many contain diagrams, such as graphs or domain-specific objects, that represent factual information, and these diagrams are often the primary way of communicating that information.

Summary

Introduction

There are at least 10 million diagrams published in the scientific literature each year, and many of them represent factual information. AMI-Diagram is a flexible tool which can mine facts from diagrams and convert the graphics primitives into XML. The targets include X-Y plots, bar charts, chemical structure diagrams and phylogenetic trees. AMI can ingest born-digital diagrams as latent vectors (converted from PostScript), pixel diagrams (PNGs and JPEGs) or scanned documents. For high-quality, high-resolution diagrams the process is automatic; command-line parameters can be used for noisy or complex diagrams. AMI is part of the ContentMine framework for automatically extracting science from the published literature.
