Data Balancing Method for Training Segmentation Neural Networks

Alexey Kochkarev, Andrey Krylov, Alexander Khvostikov, Dmitry Korshunov, Mikhail Boguslavskiy

doi:10.51130/graphicon-2020-2-4-19

Abstract

Data imbalance is a common problem in machine learning and image processing. A lack of training data for the rarest classes can impair learning and degrade segmentation quality. In this paper, we focus on the problem of data balancing for the task of image segmentation. We review major trends in handling unbalanced data and propose a new method for data balancing based on the distance transform. The method is designed for use in segmentation convolutional neural networks (CNNs), but it is universal and can be used with any patch-based machine learning model for segmentation. The proposed data balancing method is evaluated on two datasets. The first is the medical dataset LiTS, which contains CT images of the liver with tumor abnormalities. The second is a geological dataset containing photographs of polished sections of different ores. The proposed algorithm improves the data balance between classes and the overall performance of the CNN model.

Highlights

Data imbalance is a common issue in image segmentation [1].
Data imbalance is especially common in medical imaging tasks, in particular liver tumor detection.
In this paper we propose a data balancing method that focuses on modifying the class distribution in the dataset.

Summary

Introduction

Data imbalance is a common issue in image segmentation [1]. If pixels of a particular “majority” class are far more numerous than pixels of one or more “minority” classes, the rarity of the “minority” classes in the training data makes training less effective and worsens the final results, because the learned model tends to classify most pixels as members of the “majority” classes. Data imbalance is especially common in medical imaging tasks, in particular liver tumor detection. One such task is the segmentation of CT images, where the volumes and areas of different organs and abnormalities differ greatly.

One common scheme for handling imbalance assigns each class a cost equal to the inverse of the proportion of that class in the dataset, so that errors on the rarest classes are penalized more heavily. A second category consists of so-called data-based methods, which use sampling techniques to rebalance the class distribution during preprocessing. The proposed method is designed specifically for segmentation problems and has a wide range of applications.
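The two general strategies mentioned above can be illustrated with a minimal Python sketch. It is not the method proposed in the paper: the function names `inverse_proportion_costs` and `oversample_to_balance`, the toy label counts, and the class meanings (0 = background, 1 = liver, 2 = tumor) are assumptions made only for illustration. The first function computes a cost per class as the inverse of its proportion; the second is a plain random oversampling of patches, one simple example of a data-based approach.

```python
import random
from collections import Counter


def inverse_proportion_costs(labels):
    """Cost of each class = 1 / (share of that class among all labels)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: total / n for cls, n in counts.items()}


def oversample_to_balance(patch_classes, seed=0):
    """Return patch indices in which minority-class patches are repeated
    at random until every class has as many patches as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, cls in enumerate(patch_classes):
        by_class.setdefault(cls, []).append(idx)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        # Draw extra copies (with replacement) to reach the target count.
        balanced.extend(rng.choice(indices) for _ in range(target - len(indices)))
    return balanced


# Toy flattened label map (hypothetical): 0 = background, 1 = liver, 2 = tumor.
pixel_labels = [0] * 90 + [1] * 8 + [2] * 2
costs = inverse_proportion_costs(pixel_labels)

# Toy patch set (hypothetical), each patch tagged with a single class.
patch_classes = [0, 0, 0, 0, 0, 0, 1, 1, 2]
balanced_indices = oversample_to_balance(patch_classes)
balanced_counts = Counter(patch_classes[i] for i in balanced_indices)
```

In this toy setting the rarest class receives the largest cost, and after oversampling every class contributes the same number of patches. In practice such costs would typically be passed to a weighted loss function, and the balanced index list would drive patch selection during training.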

Proposed method

Class choice

Image choice

Patch choice

Used datasets

LiTS dataset

Polished sections of ores dataset

Experiments and results

Background

Findings

Conclusion

Journal: Proceedings of the 30th International Conference on Computer Graphics and Machine Vision (GraphiCon 2020). Part 2
Publication Date: Dec 17, 2020
Citations: 2
License type: CC BY 4.0