Random Initialization Research Articles

Purpose: The K-means (KM) clustering algorithm is highly sensitive to the selection of initial centroids, since the initial centroids determine computational effectiveness, efficiency and susceptibility to local optima. Numerous initialization strategies attempt to overcome these problems through random or deterministic selection of initial centroids. Random initialization suffers from local optima and can yield the worst clustering performance, while deterministic initialization incurs high computational cost. Big data clustering aims to reduce computation cost and improve cluster efficiency. The objective of this study is to obtain better initial centroids for big data clustering on business management data, without using random or deterministic initialization, so as to avoid local optima and improve clustering efficiency and effectiveness in terms of cluster quality, computation cost, data comparisons and iterations on a single machine.

Design/methodology/approach: This study presents the Normal Distribution Probability Density (NDPD) algorithm for big data clustering on a single machine to solve business management-related clustering issues. The NDPDKM algorithm addresses the KM initialization problem by using the probability density of each data point. It first identifies the most probable density data points by using the mean and standard deviation of the dataset through the normal probability density. It then determines the K initial centroids by using sorting and linear systematic sampling heuristics.

Findings: The performance of the proposed algorithm is compared with the KM, KM++, Var-Part, Murat-KM, Mean-KM and Sort-KM algorithms through the Davies-Bouldin score, Silhouette coefficient, SD Validity, S_Dbw Validity, number of iterations and CPU time validation indices on eight real business datasets. The experimental evaluation demonstrates that the NDPDKM algorithm reduces iterations, local optima and computing costs, and improves cluster performance, effectiveness and efficiency with stable convergence as compared to the other algorithms. The NDPDKM algorithm reduces the average computing time by up to 34.83%, 90.28%, 71.83%, 92.67%, 69.53% and 76.03%, and reduces the average number of iterations by up to 40.32%, 44.06%, 32.02%, 62.78%, 19.07% and 36.74%, with reference to the KM, KM++, Var-Part, Murat-KM, Mean-KM and Sort-KM algorithms, respectively.

Originality/value: The KM algorithm is the most widely used partitional clustering approach in data mining techniques that extract hidden knowledge, patterns and trends for decision-making strategies in business data. Business analytics is one of the applications of big data clustering where KM clustering is useful for the various subcategories of business analytics such as customer segmentation analysis, employee salary and performance analysis, document searching, delivery optimization, discount and offer analysis, chaplain management, manufacturing analysis, productivity analysis, specialized employee and investor searching and other decision-making strategies in business.
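
The initialization steps named in the abstract (normal probability density from the mean and standard deviation, then sorting and linear systematic sampling) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' NDPDKM implementation: the function name `ndpd_style_init`, the per-feature independence used to combine densities, and the choice of sampling offset are all hypothetical, since the abstract does not give those details.

```python
import numpy as np

def ndpd_style_init(X, k):
    """Pick k initial centroids by density sorting plus systematic sampling.

    Assumptions (not taken from the abstract): features are treated as
    independent, so per-feature log densities are summed; sampling starts
    half a step into the sorted order.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of rows")

    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # assumption: avoid division by zero on constant features

    # Log of the normal probability density of every value, summed per data point.
    z = (X - mu) / sigma
    log_density = (-0.5 * z ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))).sum(axis=1)

    # Sort points by density, then take every (n // k)-th point as a centroid.
    order = np.argsort(log_density, kind="stable")
    step = n // k
    start = step // 2
    picks = order[start + step * np.arange(k)]
    return X[picks]

# Small made-up dataset with two obvious groups.
X_demo = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0],
                   [10.0, 10.0], [11.0, 11.0], [12.0, 12.0]])
centroids = ndpd_style_init(X_demo, 2)
```

Because nothing in the sketch is random, repeated calls return the same centroids, which is the property that separates this family of methods from random initialization. The returned array could be passed as the starting centroids of any K-means implementation that accepts explicit initial centers.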

Purpose: Radiochromic films are versatile 2D dosimeters with high resolution and near tissue equivalence. To assure high precision and accuracy, a time-consuming calibration process is required. To improve the time efficiency, a novel calibration method utilizing the ratio of the same dose profile measured at different monitor units (MUs) is introduced and tested in a proton and a photon beam.

Methods: The calibration procedure employs the dose ratio of film measurements of the same relative profile for different absolute dose values. Hence, the ratio of the doses is constant at any point of the profile, but the ratio of the net optical densities is not constant. The key idea of the method is to optimize the calibration function until the ratio of the calculated doses is constant. The proposed method was tested in dose ranges of 0.25–12 Gy in a proton beam and 1–6 Gy in a photon beam. A radially symmetric profile and a rectangular profile were created, both having a central plateau region of about 3 cm diameter and a dose falloff of about 1.5 cm at larger distances. The dose falloff region was used as input for the optimization method and the central plateau region served as dose reference points. Only the plateau region of the highest dose entered the optimization as an additional objective. The measured data were randomly split into differently sized training and test sets. The optimization was repeated 1000 times with random start value initialization, using the same start values for the standard and the gradient method. Finally, a proton plan with four spatially separated dose levels was created to test the possibility of a full calibration within a single measurement.

Results: Parameter estimation was possible with as few as one dose ratio used for optimization in both the photon and the proton case, yet it exhibited a high sensitivity to the dose level. The root mean squared deviation (RMSD) of the dose was less than 1% when the dose ratio was on the order of 20, whereas the median RMSD of all optimizations was 1.7%. Using four dose levels for optimization resulted in a median RMSD of 1% when randomly selecting the dose levels. Having at least one dose ratio of about 20 included in the optimization considerably improved the RMSD of the calibration function. Using six or eight dose levels reduced the sensitivity to the dose level selection, and the median RMSD was 0.8%. A full calibration was possible in a single measurement having four dose levels in one plan but spatially separated.

Conclusions: The number of measurements required to obtain an EBT3 film calibration function could be reduced using the proposed dose ratio method while maintaining the same accuracy as with the standard method.
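
The key idea in the Methods (adjust the calibration function until the ratio of the calculated doses is constant, with one plateau dose as an absolute reference) can be illustrated with a small synthetic sketch. Everything specific here is an assumption for illustration: the calibration form D = a·netOD + b·netOD^n, the function names, the parameter values and the data are made up, and a grid over n combined with linear least squares is swapped in for the randomly initialized optimization described in the abstract.

```python
import numpy as np

def dose(net_od, a, b, n):
    """Assumed calibration form (not stated in the abstract): D = a*netOD + b*netOD**n."""
    return a * net_od + b * net_od ** n

def fit_by_dose_ratio(od_low, od_high, ratio, od_ref, dose_ref, n_grid):
    """Fit (a, b, n) so that dose(od_high) / dose(od_low) equals the known ratio.

    For a fixed n, the condition dose(od_high) - ratio * dose(od_low) = 0 is
    linear in (a, b); one reference point with a known dose fixes the scale.
    """
    best = None
    for n in n_grid:
        rows_ratio = np.column_stack([od_high - ratio * od_low,
                                      od_high ** n - ratio * od_low ** n])
        row_ref = np.array([[od_ref, od_ref ** n]])
        A = np.vstack([rows_ratio, row_ref])
        y = np.concatenate([np.zeros(len(od_low)), [dose_ref]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        cost = float(np.sum((A @ coef - y) ** 2))
        if best is None or cost < best[0]:
            best = (cost, coef[0], coef[1], n)
    return best[1], best[2], best[3]

# Synthetic check with made-up "true" parameters (not measured film data).
a_true, b_true, n_true = 8.0, 30.0, 2.5
od_axis = np.linspace(0.0, 1.0, 20001)
dose_axis = dose(od_axis, a_true, b_true, n_true)  # monotonic, so invertible by interpolation

profile_low = np.linspace(0.2, 1.8, 15)  # falloff region at the lower MU setting, in Gy
ratio = 4.0                              # known MU ratio between the two exposures
od_low = np.interp(profile_low, dose_axis, od_axis)
od_high = np.interp(ratio * profile_low, dose_axis, od_axis)
od_ref = np.interp(8.0, dose_axis, od_axis)  # plateau of the highest dose, taken as 8 Gy

a_hat, b_hat, n_hat = fit_by_dose_ratio(od_low, od_high, ratio, od_ref, 8.0,
                                        np.linspace(1.5, 3.5, 21))
```

On this noise-free synthetic data the fit recovers the parameters used to generate it. With real film measurements the residuals would not vanish, and the relative weighting of the ratio rows against the reference row would become a design choice that this sketch does not address.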

Articles published on Random Initialization

Encryption Modes Identification of Block Ciphers based on Machine Learning

ECKM: An improved K-means clustering based on computational geometry

A Spatial Temporal Classification Analysis And Visualization Of Tropical Cyclone Tracks In Bay Of Bengal Using GIS

Sparse Gaussian processes for multi-step motion prediction of space tumbling objects

A convergence analysis of Nesterov’s accelerated gradient method in training deep linear neural networks

NDPD: an improved initial centroid method of partitional clustering for big data mining

Pneumonia Detection in Chest X-Ray Images Using Enhanced Restricted Boltzmann Machine.

A Dynamic Colored Traveling Salesman Problem With Varying Edge Weights

Principal coefficient encoding for subject-independent human activity analysis

Research on Human Resource Management Performance Evaluation Method Based on Chaos Optimization Algorithm

A faster dynamic convergency approach for self-organizing maps

RadImageNet: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning.

SELF-ORGANIZING RESERVOIR NETWORK FOR ACTION RECOGNITION

To Estimate Performance of Artificial Neural Network Model Based on Terahertz Spectrum: Gelatin Identification as an Example.

Accelerating and improving radiochromic film calibration by utilizing the dose ratio in photon and proton beams.

Accelerating and improving deep reinforcement learning-based active flow control: Transfer training of policy network

Activation function design for deep networks: linearity and effective initialisation

A Link Prediction Algorithm Based on GAN

Evaluation for Development Effect of Enterprise Innovation with Neural Network from Low-Carbon Economy

Efficient Robust Training via Backward Smoothing
