Abstract

The article describes the basic methods and mechanisms of cluster analysis as applied to transport. It also presents an example analysis of individual polygons of the Trans-Siberian Railway, carried out with a computer program that implements Kruskal's and Prim's algorithms.

Highlights

  • One of the most powerful toolkits for extracting previously unknown knowledge from databases, including large ones, is Data Mining Tools (DMT). Data Mining Tools, also called Knowledge Discovery in Data, significantly expand the range of practical management tasks that can be solved by computer. New knowledge is discovered through a wide range of data mining tools, among which cluster analysis occupies an important place. The task of cluster analysis is to identify natural local concentrations of objects, each of which is described by a set of variables or characteristics.

  • In cluster analysis, the set of objects under study, represented by multidimensional data, is divided into groups of objects that are similar in a certain sense, called clusters.

Summary

INTRODUCTION

One of the most powerful toolkits for extracting previously unknown knowledge from databases, including large ones, is Data Mining Tools (DMT). Data Mining Tools, also called Knowledge Discovery in Data, significantly expand the range of practical management tasks that can be solved by computer. New knowledge is discovered through a wide range of data mining tools, among which cluster analysis occupies an important place. In cluster analysis, the set of objects under study, represented by multidimensional data, is divided into groups of objects that are similar in a certain sense, called clusters. The result of cluster analysis is both the identification of the clusters themselves and the assignment of each object to one of them. The results of a cluster analysis often serve as the starting point for further data mining. To determine the "similarity" of objects, a measure of proximity, or distance, between objects must be introduced. The most commonly used measure is the Euclidean metric, which corresponds to the intuitive notion of distance.
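To make the role of the Euclidean metric concrete, the following is a minimal sketch of distance-based clustering built on Kruskal's algorithm. It is not the program described in the article: the function names `euclidean` and `kruskal_clusters`, the parameter `k`, and the sample coordinates are all assumptions chosen for illustration. The sketch adds the shortest edges first, as in Kruskal's construction of a minimal spanning tree, and stops once `k` connected groups remain.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kruskal_clusters(points, k):
    """Label points with k clusters by stopping Kruskal's algorithm early.

    Hypothetical helper: edges are added in order of increasing length,
    and merging stops as soon as only k connected components remain.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (euclidean(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    components = len(points)
    for _, i, j in edges:
        if components <= k:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two different groups
            parent[root_i] = root_j
            components -= 1

    # Convert union-find roots into consecutive cluster labels 0, 1, ...
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


# Made-up coordinates for illustration only (not railway data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
labels = kruskal_clusters(points, 2)
print(labels)  # [0, 0, 0, 1, 1]
```

Stopping Kruskal's algorithm at `k` components gives the same grouping as building the full minimal spanning tree and then removing its `k - 1` longest edges, provided the edge lengths are distinct. Prim's algorithm would yield a tree of the same total length and could be used in place of the merging loop.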

FORMALIZATION OF THE CLUSTERING PROBLEM
CLUSTERING ALGORITHM
MECHANISMS FOR CONSTRUCTING A MINIMAL SPANNING TREE
CHARACTERISTICS OF THE TRANS-SIBERIAN RAILWAY POLYGONS
CONCLUSION