Abstract

The aim of this work is to develop and implement a technology for identifying similar data series and to test it on series represented by hydrological samples. The subject of the study is the methods and approaches for identifying similar series. The object of the study is the process of identifying similar series that are represented by certain indicators. The tasks are: to propose and implement two distance measures, one that takes into account both the similarity between the values of the series and their correlation, and another based on a weighted Euclidean distance that gives greater weight to the values that are most relevant under the conditions of the task; to implement a technology for finding similar series represented by the values of certain indicators; to obtain a more robust solution by implementing a procedure that determines a set of similar series from the results obtained for each individual distance; and to analyze the results and draw conclusions about the practical application of the technology. The following methods were used: statistical analysis, and methods for calculating distances and similarity between data series. The following results were obtained: the technology for detecting similar data series was implemented; two distance measures were proposed and described as part of the implemented technology; and a procedure for determining a set of similar series, based on the calculated distances, was implemented.
The scientific novelty of the research is as follows: a weighted Euclidean distance that takes into account the relevance of the data series values was described and applied; a new distance measure that accounts for both the degree of similarity between the values of the series and their correlation was described and applied; and a technique was developed for determining similar series from a set of selected distance measures. The practical importance of the developed technology lies in its applicability to data series from different applied fields: it can be used to assess and identify similar series, in particular as an intermediate step in an analysis; in addition, the proposed distance measures improve the quality of identifying similar data series. In further research, we plan to investigate the possibility of lengthening data series and filling in gaps with values from other series identified as similar.
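
The abstract does not give the formulas, so the following Python sketch only illustrates the general shape of the three components it names: a weighted Euclidean distance, a distance that mixes value similarity with correlation, and a procedure that combines the per-distance results. Everything specific here is an assumption made for illustration and not the authors' method: the exponential recency weights (`recency_weights`, `decay`), the convex combination controlled by `alpha`, the top-`k` intersection rule, and all function names.

```python
import numpy as np


def recency_weights(n, decay=0.9):
    """Hypothetical weighting: later observations count more (assumption)."""
    w = decay ** np.arange(n - 1, -1, -1)  # last element gets weight decay**0 = 1
    return w / w.sum()                     # normalize so the weights sum to 1


def weighted_euclidean(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, weights))
    return float(np.sqrt(np.sum(w * (x - y) ** 2)))


def value_correlation_distance(x, y, alpha=0.5):
    """Assumed form: convex mix of a scaled value distance and (1 - r) / 2.

    Note: the Pearson coefficient is undefined (NaN) for a constant series.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    scale = np.linalg.norm(x) + np.linalg.norm(y)
    value_term = np.linalg.norm(x - y) / scale if scale > 0 else 0.0  # in [0, 1]
    r = np.corrcoef(x, y)[0, 1]                                       # Pearson r
    corr_term = (1.0 - r) / 2.0                                       # in [0, 1]
    return float(alpha * value_term + (1.0 - alpha) * corr_term)


def similar_by_consensus(target, candidates, distances, k=2):
    """Keep candidates ranked in the top k under every distance (assumed rule)."""
    top_sets = []
    for dist in distances:
        ranked = sorted(candidates, key=lambda name: dist(target, candidates[name]))
        top_sets.append(set(ranked[:k]))
    return set.intersection(*top_sets)


# Toy data, invented for the example (not hydrological measurements).
target = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "a": [1.1, 2.0, 3.1, 4.0, 5.1],
    "b": [5.0, 4.0, 3.0, 2.0, 1.0],
    "c": [1.0, 2.0, 3.0, 4.0, 6.0],
}
w = recency_weights(len(target))
distances = [
    lambda x, y: weighted_euclidean(x, y, w),
    value_correlation_distance,
]
similar = similar_by_consensus(target, candidates, distances, k=2)
print(sorted(similar))
```

Intersecting the top-`k` sets is only one way to make the result less sensitive to the choice of a single distance; a rank-averaging or voting rule would fit the same description.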
