Noisy Datasets Research Articles

We introduce a novel methodological framework for robust trend analysis (RTA) using remote sensing data to enhance the accuracy and reliability of detecting significant environmental trends. Our approach sequentially integrates the Theil–Sen (TS) slope estimator, the Contextual Mann–Kendall (CMK) test, and the false discovery rate (FDR) control. This comprehensive method addresses common challenges in trend analysis, such as handling small, noisy datasets with outliers and issues related to spatial autocorrelation, cross-correlation, and multiple testing. We applied this RTA workflow to study tree cover trends in Los Alcornocales Natural Park (Southern Spain), Europe’s largest cork oak forest, analysing interannual changes in tree cover from 2000 to 2022 using Terra MODIS MOD44B data. Our results reveal that the TS estimator provides a robust measure of trend direction and magnitude, but its effectiveness is dramatically enhanced when combined with the CMK test. This combination highlights significant trends and effectively corrects for spatial autocorrelation and cross-correlation, ensuring that genuine environmental signals are distinguished from statistical noise. Unlike previous workflows, our approach incorporates the FDR control, which successfully filtered out 29.6% of false discoveries in the case study, resulting in a more stringent assessment of true environmental trends captured by multi-temporal remotely sensed data. In the case study, we found that approximately one-third of the area exhibits significant and statistically robust declines in tree cover, with these declines being geographically clustered. Importantly, these trends correspond with relevant changes in tree cover, emphasising the ability of RTA to detect relevant environmental changes. Overall, our findings underscore the crucial importance of combining these methods, as their synergy is essential for accurately identifying and confirming robust environmental trends. 
The proposed RTA framework has significant implications for environmental monitoring, modelling, and management.
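
The three-stage sequence described in this abstract (slope estimate, per-pixel significance test, multiple-testing correction) can be illustrated with a minimal sketch. It is not the authors' implementation: the plain Mann-Kendall test stands in for the Contextual Mann-Kendall test, so spatial context is not used here; the Benjamini-Hochberg procedure is assumed as the FDR control; and the function names and the toy pixel series are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def theil_sen_slope(y):
    """Median of all pairwise slopes, assuming equally spaced time steps."""
    return median((y[j] - y[i]) / (j - i)
                  for i, j in combinations(range(len(y)), 2))


def mann_kendall_p(y):
    """Two-sided p-value of the plain (non-contextual) Mann-Kendall test.

    Uses the normal approximation with continuity correction and no tie
    correction, which is a simplification.
    """
    n = len(y)
    s = sum((y[j] > y[i]) - (y[j] < y[i])
            for i, j in combinations(range(n), 2))
    if s == 0:
        return 1.0
    var_s = n * (n - 1) * (2 * n + 5) / 18
    z = (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return True for each test rejected at false discovery rate alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    cutoff = 0
    for rank, k in enumerate(order, start=1):
        if p_values[k] <= alpha * rank / m:
            cutoff = rank  # step-up rule: keep the largest passing rank
    reject = [False] * m
    for k in order[:cutoff]:
        reject[k] = True
    return reject


# Hypothetical annual tree-cover values (%) for three pixels.
series = {
    "pixel_a": [60, 58, 57, 55, 54, 52, 51, 49, 48, 46],  # steady decline
    "pixel_b": [40, 41, 39, 40, 42, 38, 40, 41, 39, 40],  # noise, no trend
    "pixel_c": [30, 29, 55, 27, 26, 25, 24, 23, 22, 21],  # decline + outlier
}
names = list(series)
slopes = [theil_sen_slope(series[name]) for name in names]
p_values = [mann_kendall_p(series[name]) for name in names]
significant = benjamini_hochberg(p_values, alpha=0.05)

for name, slope, p, keep in zip(names, slopes, p_values, significant):
    print(f"{name}: slope={slope:.2f} per year, p={p:.4f}, significant={keep}")
```

Running the correction over all pixels at once, rather than thresholding each p-value separately, is what limits the expected share of false discoveries among the pixels reported as significant.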

In recent years, various datasets related to the phenotyping of sunflower genotypes have become increasingly accessible. However, one of the key challenges remains the efficient and accurate prediction of phenotypes based on genotypes in the context of climate change. Analyzing phenotypes at different levels of organization and detecting connections between phenotypes and genotypes require the integration and processing of large, diverse, and often noisy datasets. Machine learning offers a broad arsenal of methods and approaches for identifying predictive patterns in such data. Therefore, the research aimed to develop a methodology for the systematization of sunflower genotypes based on seed phenotypic characteristics using the data vector quantization method and neural networks. The study revealed the phenotypic characteristics of sunflower seeds from various genotypes selected by the Institute of Oilseed Crops of NAAS, grown in the southern Steppe of Ukraine, including seed length, width, thickness, seed mass, kernel mass, and seed coat cracking force. For this purpose, appropriate laboratory equipment was developed, including two modules for determining the morphological and rheological properties of seeds. The developed methodology for the systematization of sunflower genotypes based on seed phenotypic characteristics includes the following steps: measuring the characteristics of sunflower seeds from various samples (parental components); studying the mutual correlation of characteristics; conducting hierarchical cluster analysis of the data using Ward's method; determining the optimal number of groups; performing k-means clustering using the vector quantization method; determining the correspondence of ranges of characteristics to the group; training a neural network to group the data by samples and created groups; verifying the adequacy of the neural network on test data. 
The developed methodology was tested, and an MLP 30-15-3 neural network for grouping data by samples and created groups of sunflower seeds was built in the Statistica software package. The network's training efficiency was 99.4%, and its testing and validation efficiencies were 95.6% and 96.7%, respectively.
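
Two of the steps listed in this abstract, k-means clustering and mapping each group to its range of characteristics, can be sketched as follows. This is an illustration under stated assumptions, not the authors' Statistica workflow: the seed measurements are invented, only two features are used, the number of groups is fixed at two rather than derived from a Ward dendrogram, and the initial centroids are simply the first `k` points so that the run is deterministic.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels).

    Initial centroids are the first k points (an illustrative choice).
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, labels


def feature_ranges(points, labels, k):
    """Per-group (min, max) of every feature."""
    ranges = {}
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        ranges[c] = [(min(col), max(col)) for col in zip(*members)]
    return ranges


# Hypothetical seed measurements: (length in mm, width in mm).
seeds = [(10.1, 5.0), (13.9, 7.1), (10.4, 5.2),
         (14.2, 6.9), (9.8, 4.9), (14.0, 7.0)]

centroids, labels = kmeans(seeds, k=2)
ranges = feature_ranges(seeds, labels, k=2)

for group, bounds in ranges.items():
    print(f"group {group}: length {bounds[0]}, width {bounds[1]}")
```

In practice the features would normally be standardized before clustering, since measurements such as seed mass and cracking force are on different scales; that step is omitted here for brevity.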

Articles published on Noisy Datasets

A new binary classifier robust on noisy domains based on kNN algorithm

Robust Trend Analysis in Environmental Remote Sensing: A Case Study of Cork Oak Forest Decline

EEG-based emotional valence and emotion regulation classification: a data-centric and explainable approach

A deep autoencoder for electric double layer capacitance prediction in electrochemical sensors

Uncertainty in Fourier Transforms: A Fuzzy Logic Perspective

MTC-NET: A Multi-Channel Independent Anomaly Detection Method for Network Traffic.

Segmenting hotspots from medical thermal images using Density-based modified FC-Pc FS with spatial information

Pose‐to‐Motion: Cross‐Domain Motion Retargeting with Pose Prior

MD-BiMamba: An aero-engine inter-shaft bearing fault diagnosis method based on Mamba with modal decomposition and bidirectional features fusion strategy

Learning Hamiltonian dynamics with reproducing kernel Hilbert spaces and random features

NEURO DEEP FUZZY GENETIC ALGORITHM APPROACH FOR CLASSIFICATION AND DETECTION OF BRAIN TUMOR FROM LARGE DATASETS

Systematization of sunflower genotypes based on seed phenotypic characteristics using neural networks

Artificial-Intelligence-Based Condition Monitoring of Industrial Collaborative Robots: Detecting Anomalies and Adapting to Trajectory Changes

Testing protocols for smoothing datasets of hydraulic variables acquired during unsteady flows

A hierarchical heterogeneous ant colony optimization based oversampling algorithm using feature similarity for classification of imbalanced data

Ultralow Energy Consumption and Fast Neuromorphic Computing Based on La0.1Bi0.9FeO3 Ferroelectric Tunnel Junctions.

Dynamic data reconciliation for enhancing the prediction performance of long short-term memory network

BO4IO: A Bayesian optimization approach to inverse optimization with uncertainty quantification

Transformer for low concentration image denoising in magnetic particle imaging.

Research on steel surface defect detection system based on YOLOv5s-SE-CA model and BEMD image enhancement
