Representation Of Strings Research Articles

Writing is one of the most important forms of communication, and for centuries handwriting was the most reliable way to preserve knowledge. Despite the development of printing houses and electronic devices, handwriting is still broadly used for taking notes, making annotations, or sketching ideas. A huge number of handwritten documents, some of incalculable cultural value, have recently been digitized so that they can be accessed easily. This has made it necessary to develop methods able to extract information from these document images.

Transferring the ability to understand handwritten text or recognize handwritten shapes to computers has been the goal of much research because of its importance in many different fields. However, designing good representations for handwritten shapes, e.g. symbols or words, is a very challenging problem because of the large variability of these kinds of shapes. Working with handwritten shapes imposes three requirements. We need representations to be flexible, i.e., able to adapt to large intra-class variability. We need representations to be discriminative, i.e., able to learn the differences between classes. And we need representations to be efficient, i.e., able to be rapidly computed and compared. Unfortunately, current techniques of handwritten shape representation for matching and recognition do not fulfill some or all of these requirements.

In this thesis we focus on the problem of learning to represent handwritten shapes for retrieval and recognition tasks. In the first part of the thesis, we focus on the general problem of representing any kind of handwritten shape. We first present a novel shape descriptor based on a deformable grid that deals with large deformations by adapting to the shape, and where the cells of the grid can be used to extract different features. We then propose to use this descriptor to learn statistical models, based on the Active Appearance Model, that jointly learn the variability in structure and texture of a given class. In the second part, we focus on a concrete application: the problem of representing handwritten words for the tasks of recognition and of word spotting, where the goal is to find all instances of a query word in a dataset of images. First, we address the segmentation-free problem and propose an unsupervised, sliding-window-based approach that achieves state-of-the-art results on two public datasets. Second, we address the more challenging multi-writer problem, where the variability in words increases exponentially. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace, and where those that represent the same word are close together. This is achieved by a combination of label embedding and attribute learning, and a common subspace regression. This leads to a low-dimensional, unified representation of word images and strings, resulting in a method that allows one to perform both image and text searches, as well as image transcription, in a unified framework. We evaluate our methods on different public datasets of both handwritten documents and natural images, showing results comparable to or better than the state of the art on spotting and recognition tasks.

A string with many repetitions can be represented compactly by replacing h-fold contiguous repetitions of a string r with (r)^h. We present a compact representation, which we call a repetition representation of a string (RRS), by which a set of disjoint or nested tandem arrays can be compacted. In this paper, we study the problem of finding a minimum RRS (MRRS), where the size of an RRS is defined as the sum of the length of the component letters and the description length of the component repetitions (·)^h, each of which is given by w_R(h) for a repetition weight function w_R. We develop two dynamic programming-based algorithms to solve this problem: CMR, which works for any type of w_R, and CMR-C, which is faster but can be applied to a constant w_R only. CMR-C is an O(n^2 log n)-time, O(n log n)-space algorithm, which is more efficient in both time and space than CMR by a factor of (log n)/n, where n is the length of the given string. The problem of finding an MRRS for a string can be extended to that of finding a minimum repetition representation of a tree (MRRT) for a given labeled ordered tree. For this problem, we present two algorithms, CMRT and CMRT-C, which use CMR and CMR-C, respectively, as a subroutine. In addition to the theoretical analysis, we confirmed the efficiency of the proposed algorithms by experiments consisting of three parts. First, we demonstrated that CMR-C and CMRT-C are fast enough for large-scale data by using synthetic strings and trees, respectively. The size of an MRRS for a given string can serve as a measure of how compactly the string can be represented, that is, how well the string is structurally organized; the same is true of trees. Second, to check this ability of the MRRS size, we measured the size of an MRRS for chromosomes of nine different species. We found that all the chromosomes of the same species have a similar compression rate when represented by an MRRS. Run-length encoding (RLE) was also shown to have a species-specific compression rate, but species were separated more clearly by MRRS than by RLE. Third, we examined the size of an MRRT for web pages of world-leading companies by using their tag trees, showing a consistency between the compression rate achieved by an MRRT and the visual structure of the web pages.
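
To make the size definition in this abstract concrete, the following is a minimal sketch of a straightforward interval dynamic program that computes the size of a minimum repetition representation of a string. It is not the CMR or CMR-C algorithm from the paper, whose details are not given here; it runs in roughly cubic time and is meant only as an illustration. The function name `min_rrs_size` and the `weight` parameter (standing in for the repetition weight function, defaulting to a constant 1) are assumptions made for this example.

```python
def min_rrs_size(s, weight=lambda h: 1):
    """Size of a minimum repetition representation of s.

    Illustrative sketch only (NOT the paper's CMR / CMR-C algorithms).
    Size = number of component letters + sum of weight(h) over every
    repetition (r)^h used, where repetitions may be nested.
    `weight` is a hypothetical stand-in for the repetition weight function.
    """
    n = len(s)
    if n == 0:
        return 0
    # cost[i][j] holds the minimum size of a representation of s[i:j].
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            if length == 1:
                cost[i][j] = 1  # a single letter has size 1
                continue
            best = length  # fallback: write every letter out
            # Option 1: concatenate the best representations of two parts.
            for k in range(i + 1, j):
                best = min(best, cost[i][k] + cost[k][j])
            # Option 2: the whole substring is an h-fold repetition of
            # its prefix of length p, written as (r)^h with h >= 2.
            for p in range(1, length // 2 + 1):
                if length % p == 0:
                    h = length // p
                    if s[i:i + p] * h == s[i:j]:
                        best = min(best, cost[i][i + p] + weight(h))
            cost[i][j] = best
    return cost[0][n]


if __name__ == "__main__":
    for text in ["abababab", "aaaa", "abc", "aabaabaab"]:
        print(text, min_rrs_size(text))
```

With the constant weight 1, "abababab" is best written as (ab)^4, giving two letters plus one repetition. Passing a different `weight` function changes which nested repetitions are worth using, which is why the general and constant-weight cases are treated separately in the abstract.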

Articles published on Representation Of Strings

Engineering perturbative string duals for symmetric product orbifold CFTs

Case Study: Optimizing Grading Ring Design for High Voltage Polymeric Insulators in Power Transmission Systems for Enhanced Electric Field and Voltage Distribution by Using a Finite Element Method

Context-sensitive fusion grammars and fusion grammars with forbidden context are universal

Is human face recognition lateralized to the right hemisphere due to neural competition with left-lateralized visual word recognition? A critical review.

Ambitwistor strings in six and five dimensions

A new approach to regular & indeterminate strings

Supersymmetric S-matrices from the worldsheet in 10 & 11d

Transformation of Turing Machines into Context-Dependent Fusion Grammars

Using RuleBuilder to Graphically Define and Visualize BioNetGen-Language Patterns and Reaction Rules.

Statistical Relational Learning With Unconventional String Models.

Learning to Represent Handwritten Shapes and Words for Matching and Recognition

A Framework for Succinct Labeled Ordinal Trees over Large Alphabets

The role of consonant/vowel organization in perceptual discrimination.

Fast algorithms for finding a minimum repetition representation of strings and trees

NUMERICAL NEGATIVE SELECTION ALGORITHM

A Novel Approach to Dynamic Representation of Drill Strings in Test Rigs

Letter Processing and the Formation of Memory Representations in Children with Naming Speed Deficits

Semi-classical mechanics in phase space: the quantum target of minimal strings

Neural networks with chaotic recursive nodes: techniques for the design of associative memories, contrast with Hopfield architectures, and extensions for time-dependent inputs

Task demands and representation in long-term repetition priming.
