Abstract

The article presents a solution to the problem of identifying duplicated ads on property sales websites. The task is formulated as a classification problem: the input parameters are identified and divided into basic and non-basic, and a class-forming feature is defined. The preliminary data processing of property objects, which is required for proper application of the classification methods, is also considered. A brief review follows of selected modern algorithms for solving classification problems, namely decision trees, artificial neural networks, and logistic regression. Experiments showed that the artificial neural network gives the most accurate result; therefore, this algorithm is suitable for solving the stated problem.

Highlights

  • Nowadays there are many sites specializing in property sales and rentals. Using these sites, users can find accommodation that matches their personal needs without leaving home. Sites of this kind have simplified the property search, but many users face the problem that some ads are duplicated.

  • Algorithm selection: the main task of the portal, identification of genuine ads, can be presented as a classification problem. Such problems can be solved with algorithms such as decision trees, artificial neural networks, and logistic regression [7-11].

  • Experiments showed that the artificial neural network gives the most accurate result.
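
To make the classification framing concrete, the following is a minimal sketch of one of the three named algorithms, logistic regression, written in plain Python. It is not the authors' implementation. The feature vectors (three similarity scores in [0, 1] per ad), the labels (1 for a duplicate, 0 for a genuine ad), the learning rate, and the epoch count are all hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    n = len(X)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log loss w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return True when the predicted duplicate probability is at least 0.5."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5

# Hypothetical training data: each row is one ad described by three
# similarity scores; the label is the class-forming feature
# (assumed here: 1 = duplicate, 0 = genuine).
X = [
    [0.9, 1.0, 0.8],
    [1.0, 0.9, 1.0],
    [0.8, 0.9, 0.9],
    [0.1, 0.2, 0.0],
    [0.2, 0.0, 0.1],
    [0.0, 0.1, 0.3],
]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)
print(predict(w, b, [0.95, 0.9, 0.9]))
print(predict(w, b, [0.1, 0.0, 0.0]))
```

A decision tree or a neural network would consume the same kind of feature vectors and labels; only the model fitted to them changes.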


Summary

Introduction

Nowadays there are many sites specializing in property sales and rentals. Using these sites, users can find accommodation that matches their personal needs without leaving home.

4. Preliminary data processing of property objects. Not every characteristic stores information that can help determine the authenticity of an ad, but all of the characteristics can serve as input data for testing the method of searching for duplicated ads. To apply the classification method properly, all the basic characteristics must be standardized. In this case, it is proposed to use not the values of the characteristics themselves but the degree of identity of the object's characteristics (OC) with those of the other objects in the group.

Algorithm selection. The main task of the portal, identification of genuine ads, can be presented as a classification problem. Such problems can be solved with algorithms such as decision trees, artificial neural networks, and logistic regression [7-11]. Match factors were calculated for all property values of all objects.
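
The excerpt does not give the formula for the match factor, so the sketch below rests on one plausible assumption: for each characteristic of an object, the match factor is the share of the other objects in the group that have exactly the same value. The function name `match_factors`, the field names, and the sample ads are hypothetical.

```python
def match_factors(group):
    """For each object, compute per-characteristic match factors.

    Assumed definition (not taken from the paper): the fraction of the
    *other* objects in the group whose value for that characteristic is
    identical to this object's value.
    """
    result = []
    for i, obj in enumerate(group):
        others = [o for j, o in enumerate(group) if j != i]
        factors = {}
        for key, value in obj.items():
            if not others:
                # A group of one has nothing to compare against.
                factors[key] = 0.0
                continue
            same = sum(1 for o in others if o.get(key) == value)
            factors[key] = same / len(others)
        result.append(factors)
    return result

# Hypothetical group of ads that might describe the same property.
ads = [
    {"address": "12 Elm St", "rooms": 2, "area_m2": 54, "price": 90000},
    {"address": "12 Elm St", "rooms": 2, "area_m2": 54, "price": 95000},
    {"address": "7 Oak Ave", "rooms": 3, "area_m2": 54, "price": 120000},
]

factors = match_factors(ads)
for ad, f in zip(ads, factors):
    print(ad["address"], f)
```

Replacing raw values with such factors puts every characteristic on the same [0, 1] scale regardless of its original type, which is one way to meet the standardization requirement described above.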

