Decoder Structure Research Articles

Morphological segmentation and stemming are foundational tasks in natural language processing. Because of how words are formed in agglutinative languages, both tasks are effective ways to alleviate data sparsity in these languages. For Uyghur and Kazakh, two typical agglutinative languages, significant progress has been made in morphological segmentation and stemming in recent years. However, the evaluation metrics used in previous work are character-level, which may not comprehensively reflect the performance of models in morphological segmentation or stemming. Moreover, although existing methods avoid manual feature extraction, their ability to learn features is inadequate in complex scenarios, and the correlation between different features has not been considered. Consequently, these models represent complex contexts poorly, which limits their generalization in practical scenarios. To address these issues, this paper redefines morphological-level evaluation metrics, F1-score and accuracy (ACC), for morphological segmentation and stemming tasks. In addition, two models are proposed for the morpheme segmentation and stem extraction tasks: a supervised model and an unsupervised model. The supervised model learns character and contextual features simultaneously; the feature embeddings are then input into a Transformer encoder to learn the correlation between character and context embeddings. The last layer of the model uses a CRF or softmax layer to determine morphological boundaries. In the unsupervised model, an encoder–decoder structure introduces n-gram correlation assumptions and masked attention mechanisms, strengthening the correlation between characters within n-grams and reducing the impact of characters outside n-grams on boundaries. Finally, the performance of the different models is compared comprehensively from several points of view.
Experimental results demonstrate that (1) the proposed evaluation method effectively reflects the differences in morphological segmentation and stemming for Uyghur and Kazakh, and (2) learning different features and their correlation can enhance the model's generalization ability in complex contexts. The proposed models achieve state-of-the-art performance on Uyghur and Kazakh datasets.
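
The abstract does not spell out how the morphological-level metrics are computed. As a minimal sketch, assuming that a predicted morpheme counts as correct only when both of its boundaries match the gold segmentation, and that accuracy is the share of words segmented exactly right, the idea could look like the following. The function names and the English example word are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple


def morpheme_spans(segments: List[str]) -> Set[Tuple[int, int]]:
    """Turn a segmentation into a set of (start, end) character spans."""
    spans: Set[Tuple[int, int]] = set()
    start = 0
    for seg in segments:
        spans.add((start, start + len(seg)))
        start += len(seg)
    return spans


def morpheme_f1(gold: List[str], pred: List[str]) -> float:
    """F1 over whole morphemes (assumed definition, not the paper's)."""
    gold_spans = morpheme_spans(gold)
    pred_spans = morpheme_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


def segmentation_accuracy(gold_words: List[List[str]],
                          pred_words: List[List[str]]) -> float:
    """Fraction of words whose predicted segmentation matches gold exactly."""
    matches = sum(g == p for g, p in zip(gold_words, pred_words))
    return matches / len(gold_words)


# Hypothetical example: one boundary is missed in the first word.
gold = [["un", "happi", "ness"], ["read", "er"]]
pred = [["un", "happiness"], ["read", "er"]]

f1_first_word = morpheme_f1(gold[0], pred[0])
acc = segmentation_accuracy(gold, pred)
```

In this example only one of the two predicted morphemes in the first word matches a gold morpheme, so its morpheme-level F1 comes out at roughly 0.4, even though most characters carry the right boundary label. That gap is the kind of difference a character-level metric can hide.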


Abstract. Knowledge of plant species distributions is essential for various application fields, such as nature conservation, agriculture, and forestry. Remote sensing data, especially high-resolution orthoimages from unoccupied aerial vehicles (UAVs), paired with novel pattern-recognition methods, such as convolutional neural networks (CNNs), enable accurate mapping (segmentation) of plant species. Training transferable pattern-recognition models for species segmentation across diverse landscapes and data characteristics typically requires extensive training data. Training data are usually derived from labor-intensive field surveys or visual interpretation of remote sensing images. Alternatively, pattern-recognition models could be trained more efficiently with plant photos and labels from citizen science platforms, which include millions of crowd-sourced smartphone photos and the corresponding species labels. However, these pairs of citizen-science-based photographs and simple species labels (one label for the entire image) cannot be used directly to train the state-of-the-art segmentation models used for UAV image analysis, which require per-pixel labels (also called masks) for training. Here, we overcome the limitation of simple labels of citizen science plant observations with a two-step approach. In the first step, we train CNN-based image classification models using the simple labels and apply them in a moving-window approach over UAV orthoimagery to create segmentation masks. In the second step, these segmentation masks are used to train state-of-the-art CNN-based image segmentation models with an encoder–decoder structure. We tested the approach on UAV orthoimages acquired in summer and autumn at a test site comprising 10 temperate deciduous tree species in varying mixtures. Several tree species could be mapped with surprising accuracy (mean F1 score = 0.47). In homogeneous species assemblages, the accuracy increased considerably (mean F1 score = 0.55).
The results indicate that several tree species can be mapped without generating new training data and by only using preexisting knowledge from citizen science. Moreover, our analysis revealed that the variability in citizen science photographs, with respect to acquisition data and context, facilitates the generation of models that are transferable through the vegetation season. Thus, citizen science data may greatly advance our capacity to monitor hundreds of plant species and, thus, Earth's biodiversity across space and time.
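
The first step of the approach, turning whole-image predictions into a per-pixel mask with a moving window, can be sketched as follows. This is a simplified illustration under stated assumptions: the image is a single band stored as nested lists, `toy_classifier` is a hypothetical stand-in for the trained CNN classifier, and each window's label is written to every pixel it covers. The abstract does not specify window size, stride, or how overlapping windows are combined, so those choices here are illustrative only.

```python
from typing import Callable, List

Image = List[List[float]]  # single-band image as rows of pixel values
Mask = List[List[int]]     # per-pixel class labels


def moving_window_mask(
    image: Image,
    classify_tile: Callable[[Image], int],
    window: int,
    stride: int,
    background: int = -1,
) -> Mask:
    """Label each pixel with the class predicted for the tile covering it."""
    height, width = len(image), len(image[0])
    # Pixels never covered by a window keep the background label.
    mask = [[background] * width for _ in range(height)]
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            tile = [row[left:left + window] for row in image[top:top + window]]
            label = classify_tile(tile)
            # With overlapping windows, later tiles overwrite earlier ones.
            for r in range(top, top + window):
                for c in range(left, left + window):
                    mask[r][c] = label
    return mask


def toy_classifier(tile: Image) -> int:
    """Hypothetical stand-in for a CNN: class 1 if the tile is bright."""
    values = [v for row in tile for v in row]
    return 1 if sum(values) / len(values) > 0.5 else 0


image = [
    [0.9, 0.8, 0.1, 0.2],
    [0.7, 0.9, 0.0, 0.1],
    [0.1, 0.2, 0.8, 0.9],
    [0.0, 0.1, 0.9, 0.7],
]
mask = moving_window_mask(image, toy_classifier, window=2, stride=2)
```

In the workflow the abstract describes, a mask produced this way would then serve as the training target for an encoder–decoder segmentation model in the second step.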


Articles published on Decoder Structure

Multiple Unmanned Aerial Vehicle (multi-UAV) Reconnaissance and Search with Limited Communication Range Using Semantic Episodic Memory in Reinforcement Learning

ESFPNet: Efficient Stage-Wise Feature Pyramid on Mix Transformer for Deep Learning-Based Cancer Analysis in Endoscopic Video

MAS-Net: Multi-Attention Hybrid Network for Superpixel Segmentation

DAT-Net: Filling of missing temperature values of meteorological stations by data augmentation attention neural network

Enhancing deep reinforcement learning for scale flexibility in real-time strategy games

Arbitrary style transfer via multi-feature correlation

Research on Multi-Step Fruit Color Prediction Model of Tomato in Solar Greenhouse Based on Time Series Data

CrowdUNet: Segmentation assisted U-shaped crowd counting network

A deep learning model for predicting the state of energy in lithium-ion batteries based on magnetic field effects

A Benchmark for Morphological Segmentation in Uyghur and Kazakh

Dual-branch deep cross-modal interaction network for semantic segmentation with thermal images

Representation Learning Based on Vision Transformer

Robot Grasp Detection with Loss-Guided Collaborative Attention Mechanism and Multi-Scale Feature Fusion

From simple labels to semantic image segmentation: leveraging citizen science plant photographs for tree species mapping in drone imagery

Bidirectional interaction of CNN and Transformer for image inpainting

Multi‐scale feature aggregation network for single‐image dehazing

A multimodal stepwise-coordinating framework for pedestrian trajectory prediction

Dynamic Perception-Based Vehicle Trajectory Prediction Using a Memory-Enhanced Spatio-Temporal Graph Network

Research on Efficient Asymmetric Attention Module for Real-Time Semantic Segmentation Networks in Urban Scenes

CC-DETR: DETR with Hybrid Context and Multi-Scale Coordinate Convolution for Crowd Counting
