Amount Of Ground Truth Data Research Articles

Expert workers make non-trivial decisions with significant implications. Experts’ decision accuracy is, thus, a fundamental aspect of their judgment quality, key to both management and consumers of experts’ services. Yet, in many important settings, transparency in experts’ decision quality is rarely possible because ground truth data for evaluating the experts’ decisions is costly and available only for a limited set of decisions. Furthermore, different experts typically handle exclusive sets of decisions, and thus, prior solutions that rely on the aggregation of multiple experts’ decisions for the same instance are inapplicable. We first formulate the problem of estimating experts’ decision accuracy in this setting and then develop a machine-learning-based framework to address it. Our method effectively leverages both abundant historical data on workers’ past decisions and scarce decision instances with ground truth labels. Using both semi-synthetic data based on publicly available data sets and purposefully compiled data sets on real workers’ decisions, we conduct extensive empirical evaluations of our method’s performance relative to alternatives. The results show that our approach is superior to existing alternatives across diverse settings, including settings that involve different data domains, experts’ qualities, and amounts of ground truth data. To our knowledge, this paper is the first to posit and address the problem of estimating experts’ decision accuracies from historical data with scarce ground truth, and it is the first to offer comprehensive results for this problem setting, establishing the performances that can be achieved across settings as well as the state-of-the-art performance on which future work can build.
This paper was accepted by Anindya Ghose, information systems.
Funding: T. Geva acknowledges research grants from the Jeremy Coller Foundation and from the Henry Crown Institute for Business Research.
Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2021.03357.

This study presents a novel approach, based on high-dimensional hydro-acoustic data, for improving the performance of angular response analysis (ARA) on multibeam backscatter data in terms of acoustic class separation and spatial resolution. The approach is based on the hyper-angular cube (HAC) data structure, which makes it possible to extract one angular response from each cell of the cube. The HAC consists of a finite number of backscatter layers, each representing backscatter values corresponding to ensonification at a single incidence angle. The HAC layers can be constructed either by interpolating dense soundings from highly overlapping multibeam echo-sounder (MBES) surveys (interpolated HAC, iHAC) or by producing several backscatter mosaics, each normalized at a different incidence angle (synthetic HAC, sHAC). The latter approach can be applied to multibeam data with standard overlap, thus minimizing the cost of data acquisition. The sHAC is as efficient as the iHAC produced from actual soundings, providing distinct angular responses for each seafloor type. The HAC data structure increases acoustic class separability between different acoustic features. Moreover, the results of angular response analysis are applied on a fine spatial scale (cell dimensions), offering more detailed acoustic maps of the seafloor. Considering that angular information is expressed through high-dimensional backscatter layers, we further applied three machine learning algorithms (random forest, support vector machine, and artificial neural network) and one pattern recognition method (sum of absolute differences) for supervised classification of the HAC, using a limited amount of ground truth data (one sample per seafloor type). Results from the supervised classification were compared with results from an unsupervised method, which served as a common reference for comparing the supervised algorithms with one another. All algorithms, for both the iHAC and the sHAC, produced very similar results in good agreement (kappa > 0.5) with the unsupervised classification. Only the artificial neural network required the total amount of ground truth data to produce results comparable with those of the other algorithms.
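As a rough illustration of the data structure described above, the following sketch stores a HAC as a 3D array with one backscatter layer per incidence angle, reads one angular response from each cell, and labels each cell by the sum of absolute differences (SAD) against one reference response per seafloor type. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `classify_sad`, the array layout, the incidence angles, and the two synthetic reference curves are all hypothetical and chosen only for the example.

```python
import numpy as np


def classify_sad(hac, references):
    """Label each HAC cell with the index of the closest reference response.

    hac:        array of shape (n_angles, rows, cols), one layer per angle
                (layout assumed for this sketch).
    references: array of shape (n_classes, n_angles), one angular response
                per seafloor type (one ground truth sample per type).
    """
    # Broadcast to (n_classes, n_angles, rows, cols) and take |difference|.
    diff = np.abs(hac[None, :, :, :] - references[:, :, None, None])
    # Sum over the angle axis: one SAD value per class and per cell.
    sad = diff.sum(axis=1)
    # The class with the smallest SAD wins in each cell.
    return sad.argmin(axis=0)


# Hypothetical incidence angles (degrees) defining five HAC layers.
angles = np.arange(10, 60, 10)

# Two made-up reference angular responses (dB), purely illustrative.
references = np.stack([
    -10.0 - 0.30 * angles,  # hypothetical seafloor type 0
    -25.0 - 0.05 * angles,  # hypothetical seafloor type 1
])

# Synthetic 2 x 3 map of true seafloor types, used to build a toy HAC.
truth = np.array([[0, 0, 1],
                  [0, 1, 1]])
rng = np.random.default_rng(0)
hac = references[truth].transpose(2, 0, 1) + rng.normal(0.0, 0.5, size=(5, 2, 3))

# The angular response of a single cell is a slice along the angle axis.
cell_response = hac[:, 0, 0]

labels = classify_sad(hac, references)
print(labels)
```

Because every cell carries its own angular response, the classification is produced at cell resolution. The other classifiers named in the abstract could be applied to the same per-cell vectors by reshaping the cube to `(rows * cols, n_angles)`.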

Articles published on Amount Of Ground Truth Data

A Machine Learning Framework for Assessing Experts’ Decision Quality

Human selection bias drives the linear nature of the more ground truth effect in explainable deep learning optical coherence tomography image segmentation.

Pixel Diffuser: Practical Interactive Medical Image Segmentation without Ground Truth.

Physics-informed neural networks for modeling physiological time series for cuffless blood pressure estimation

Background-foreground segmentation for interior sensing in automotive industry

Deep-layers-assisted machine learning for accurate image segmentation of complex materials

U-Net-Based Segmentation of Microscopic Images of Colorants and Simplification of Labeling in the Learning Process.

Fast location and segmentation of high-throughput damaged soybean seeds with invertible neural networks.

Object-Based Visual Camera Pose Estimation From Ellipsoidal Model and 3D-Aware Ellipse Prediction

Geolocation Prediction in Twitter Using Social Networks: A Critical Analysis and Review of Current Practice

High-throughput soybean seeds phenotyping with convolutional neural networks and transfer learning

Learning to Calibrate Battery Models in Real-Time with Deep Reinforcement Learning

Estimation of Crop Yield From Combined Optical and SAR Imagery Using Gaussian Kernel Regression

Mapping sub-field maize yields in Nebraska, USA by combining remote sensing imagery, crop simulation models, and machine learning

Identifying malicious social media contents using multi-view Context-Aware active learning

Remotely sensed vegetation index and LAI for parameter determination of the CSM-CROPGRO-Soybean model when in situ data are not available

SynSys: A Synthetic Data Generation System for Healthcare Applications.

Development of a Classification Method for Forest Vegetation on the Stand Level, Using KOMPSAT-3A Imagery and Land Coverage Map

The Hyper-Angular Cube Concept for Improving the Spatial and Acoustic Resolution of MBES Backscatter Angular Response Analysis

Citywide Traffic Volume Estimation Using Trajectory Data
