The rapid development of Natural Language Processing (NLP) has enabled the detection of bilingual and multilingual textual similarity. One of the main challenges of a Textual Similarity Detection (TSD) system lies in learning effective text representations. This research focuses on identifying similar texts between Indonesian and English across a broad spectrum of semantic similarity. The primary challenge is generating English and Indonesian dense vector representations, i.e., embeddings that share a single vector space. Based on empirical trials, the research proposes using the Universal Sentence Encoder (USE) model to construct bilingual embeddings and FAISS to index the bilingual dataset. Query vectors are compared against index vectors using two approaches: a heuristic comparison based on Euclidean distance and a clustering-based Approximate Nearest Neighbors (ANN) search. The system is tested with four semantic granularities, two text granularities, and evaluation metrics at cutoff values of k = {2, 10}. The four semantic granularities are highly similar or near duplicate, Semantic Entailment (SE), Topically Related (TR), and Out of Topic (OOT), while the text granularities are the sentence and paragraph levels. The experimental results demonstrate that the proposed system successfully ranks similar texts in different languages within the top ten, as evidenced by the highest F1@2 score of 0.96 for the near-duplicate category at the sentence level. In contrast to the near-duplicate category, the SE and TR categories reach highest F1 scores of 0.77 and 0.89, respectively. The experimental results also show a strong correlation between text granularity and semantic granularity.
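The retrieval pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the multilingual USE model URL, the exact-L2 index choice, and the sample sentences are assumptions, and the ANN variant would substitute an IVF-style FAISS index.

```python
# Minimal sketch: embed Indonesian and English sentences into one vector space
# with the multilingual Universal Sentence Encoder, index them with FAISS, and
# retrieve the top-k most similar texts by Euclidean (L2) distance.
import numpy as np
import faiss
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401  (registers SentencePiece ops required by the model)

# Multilingual USE maps sentences from several languages, including English and
# Indonesian, into a shared 512-dimensional embedding space.
encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

corpus = [
    "The government announced a new economic policy.",   # English document
    "Pemerintah mengumumkan kebijakan ekonomi baru.",     # Indonesian near duplicate
    "Cuaca hari ini sangat cerah.",                       # Indonesian, out of topic
]
corpus_vecs = np.asarray(encoder(corpus), dtype="float32")

# Exact Euclidean-distance index; an IVF index (e.g., IndexIVFFlat) would give
# the clustering-based ANN alternative.
index = faiss.IndexFlatL2(corpus_vecs.shape[1])
index.add(corpus_vecs)

query_vecs = np.asarray(encoder(["new economic policy announcement"]), dtype="float32")
distances, ids = index.search(query_vecs, 2)  # top-k retrieval with k = 2
for rank, (i, d) in enumerate(zip(ids[0], distances[0]), start=1):
    print(f"{rank}. dist={d:.3f}  {corpus[i]}")
```

In this sketch the near-duplicate Indonesian sentence is expected to rank ahead of the off-topic one for the English query, mirroring the cross-lingual ranking behaviour the abstract reports.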