Text Localization Research Articles

The taxonomic literature is one of the largest resources of information on biodiversity, both present and past. Unlike the literature of many scientific disciplines, it remains perpetually relevant, as successive taxonomic work builds upon earlier foundations. Projects such as the Biodiversity Heritage Library (BHL) have greatly increased access to that literature, as have numerous independent digitisation efforts by museums, herbaria, and publishers. But the focus of this access has been human readers, with limited use of text mining tools, mostly focussed on extracting taxonomic names. This talk explores other kinds of data that can be extracted from text on BHL and elsewhere, focusing on taxonomic names, geographic localities and specimen codes in the context of the BioStor project (https://biostor.org, Page 2011). The problem of finding taxonomic names in text has been well studied (e.g., Akella et al. 2012), and new BHL content is continuously indexed by names. Despite this, there is only weak linkage between taxonomic name databases and BHL. Even projects that create these links (e.g., BioNames, Page 2013) do not enable links in the reverse direction. In other words, a BHL reader is not told whether the appearance of a name on a page is the first publication of that name, nor what became of the name in subsequent research. The absence of these links reduces the value of BHL to working taxonomists. In addition to taxonomic names, a typical taxonomic paper often contains specimen codes. Extracting these from text and linking them to digital representations, such as occurrence records in GBIF, makes it possible to provide detailed provenance for occurrence data, as well as citation-based metrics for the utility of natural history collections. Taxonomic papers are also often rich in geographic information.
A simple method for extracting locality information from text is to search for latitude and longitude coordinates, and BioStor currently does this. To date some 83,000 individual point localities have been extracted (Fig. 1). These are used to provide a simple geographic search interface in BioStor, and are also harvested by JournalMap (Karl et al. 2013). But these localities are not linked to their original location in the source text, nor are they linked to any associated specimens, so they cannot be interpreted as occurrences that could be harvested by GBIF. If the goal is to contribute to GBIF, then we need tools that can parse locality information and link it to associated specimens. A general framework for handling data on taxonomic names, specimens, and geographic localities in text is to treat them as annotations (Batista-Navarro et al. 2017). By modelling annotations using the Web Annotation Data Model (https://www.w3.org/TR/annotation-model/) we can incorporate these annotations into biodiversity knowledge graphs (Page 2016). We can also combine these annotations with new standards for describing digitised content, such as the International Image Interoperability Framework (IIIF, https://iiif.io). The implications of this approach for developing new interfaces to the biodiversity literature will be discussed.
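
As a rough illustration of the two ideas in the abstract above (searching text for latitude and longitude pairs, and recording each hit as an annotation tied to its position in the source text), the following Python sketch may help. It is not BioStor's implementation: the regular expression, the function names `find_coordinates` and `to_annotation`, the assumed coordinate style (degrees, optional minutes and seconds, hemisphere letter, latitude first), and the example page IRI are all assumptions made for illustration. The annotation dictionary uses the `TextPositionSelector` and `TextQuoteSelector` types defined by the Web Annotation Data Model.

```python
import re

# Assumed coordinate style: degrees, optional minutes and seconds, then a
# hemisphere letter, latitude first (e.g. 12°30'S, 45°15'E). Real papers use
# many other styles; this pattern is illustrative only.
COORD_PAIR = re.compile(
    r"(?<!\d)"
    r"(?P<lat_d>\d{1,2})°\s*"
    r"(?:(?P<lat_m>\d{1,2}(?:\.\d+)?)['′]\s*)?"
    r"(?:(?P<lat_s>\d{1,2}(?:\.\d+)?)[\"″]\s*)?"
    r"(?P<lat_h>[NS])"
    r"[,;\s]+"
    r"(?P<lon_d>\d{1,3})°\s*"
    r"(?:(?P<lon_m>\d{1,2}(?:\.\d+)?)['′]\s*)?"
    r"(?:(?P<lon_s>\d{1,2}(?:\.\d+)?)[\"″]\s*)?"
    r"(?P<lon_h>[EW])"
)


def to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds strings to signed decimal degrees."""
    value = float(degrees) + float(minutes or 0) / 60 + float(seconds or 0) / 3600
    return -value if hemisphere in "SW" else value


def find_coordinates(text):
    """Return (start, end, latitude, longitude) for each coordinate pair found."""
    hits = []
    for m in COORD_PAIR.finditer(text):
        lat = to_decimal(m["lat_d"], m["lat_m"], m["lat_s"], m["lat_h"])
        lon = to_decimal(m["lon_d"], m["lon_m"], m["lon_s"], m["lon_h"])
        hits.append((m.start(), m.end(), lat, lon))
    return hits


def to_annotation(text, source_iri, hit):
    """Wrap one hit as a Web Annotation that points back into the source text."""
    start, end, lat, lon = hit
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # Body layout is a choice made for this sketch, not a community standard.
        "body": {"type": "TextualBody", "value": f"{lat:.5f},{lon:.5f}"},
        "target": {
            "source": source_iri,  # hypothetical IRI of the page text
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": end,
                "refinedBy": {"type": "TextQuoteSelector", "exact": text[start:end]},
            },
        },
    }


if __name__ == "__main__":
    sample = "Collected at 12°30'S, 45°15'E near the river."
    for hit in find_coordinates(sample):
        print(to_annotation(sample, "https://example.org/page/1", hit))
```

Because each annotation keeps the character offsets of the match, a locality stays linked to the place in the text where it was found, which is the link the abstract says the currently extracted point localities lack.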


Purpose. The need to localize online resources today is driven by the globalizing trend in the development of internet communication. Localization strategies vary with the nature of the products and services offered through internet platforms, but in every case localized versions give potential customers access to reliable sources of information about global brands. The aim of this study is to examine the phenomenon of localization from a linguistic point of view, as applied to the verbal content of international websites. The article attempts to justify the need to differentiate English-language content for each of the English-speaking countries included in the structure of localized site versions, and to identify the verbal means by which this is achieved. Methodology. The study is based on methods of contextual, comparative and pragmatic analysis. Results. The study describes the mechanisms for creating localized content within a single website and within a single language, English. The article concludes that localizing a website, and its text in particular, into English is significant, and that it is necessary to differentiate between the two levels at which localization can take place: the text level and the intertextual level. Practical implications. The results can be applied in large-scale localization projects for various types of international websites.

Articles published on Text Localization

TEXT LOCALIZATION IN SCENE IMAGES BY BENDELET TRANSFORM

Text Extraction and Recognition in Natural Scene Images using Contourlet Transform and PNN

Deep Multi-Scale Context Aware Feature Aggregation for Curved Scene Text Detection

An algorithm to identify text marked in rock core pictures with machine learning algorithm

Real-time localization of multi-oriented text in natural scene images using a linear spatial filter

Detection, Localization of Text in Images by Mser and Enhanced Swt

Unambiguous Scene Text Segmentation with Referring Expression Comprehension.

The Effect of an Innovative Vision Simulator (OrCam) on Quality of Life in Patients with Glaucoma

Text-mining BHL: towards new interfaces to the biodiversity literature

Text extraction from natural scene images using Renyi entropy

An Enhanced MSER Pruning Algorithm for Detection and Localization of Bangla Texts from Scene Images

LOCALIZATION OF WEBSITE VERBAL CONTENT INTO ENGLISH AS A MARKETING STRATEGY

Text localization in camera captured images using fuzzy distance transform based adaptive stroke filter

An augmented reality sign-reading assistant for users with reduced vision.

Arabic Cursive Text Recognition from Natural Scene Images

Implementation of Multi-Agent based Digital Rights Management System for Distance Education (DRMSDE) using JADE

Heuristic Algorithm for Generalized Function Matching

Topological descriptors of philosophical text

Can the similarity index predict the causes of retractions in high-impact anesthesia journals? A bibliometric analysis.

A Detection and Verification Model Based on SSD and Encoder-Decoder Network for Scene Text Detection

