Abstract

Street-level landmarks are an important basis for street-level IP geolocation, and web-based landmarks are one of their main sources. Because existing street-level landmark evaluation methods have low accuracy and strict constraints, this paper analyses why web-based candidate landmarks become invalid and how they can be evaluated, and proposes Evaluator, a web-based landmark evaluation approach. Evaluator adopts the idea of a decision tree to filter invalid landmarks layer by layer, and uses public data and services to estimate the quantitative reliability of candidate landmarks and so obtain reliable landmarks. This paper proposes a domain name system (DNS) distributed query algorithm that resolves all IP addresses of a domain name, which provides data support for Evaluator when filtering candidate landmarks. It also proposes a reverse verification algorithm that obtains all domain names of an IP address, which provides an important reference for calculating landmark reliability. In addition, gradient descent is used to estimate the parameters of the reliability estimating model, which improves the robustness of Evaluator. Experiments show that reliable landmarks from Evaluator reduce the geolocation error of 100 targets in Hong Kong from 7.30 km to 3.91 km, compared with the landmark verifying method (LVM), one of the latest web-based landmark evaluation methods. Moreover, at the same geolocation accuracy, Evaluator significantly improves evaluation coverage compared with street-level landmark evaluation (SLE), one of the latest landmark evaluation methods.

Highlights

  • IP geolocation [1] is a technique to locate an Internet host using its IP address

  • To verify the validity of Evaluator, we evaluated candidate landmarks collected from several cities and tested the performance of the resulting reliable landmarks

  • It contains organization information for Beijing, Zhengzhou, Hong Kong, New York, and Los Angeles and the IP addresses resolved from their domain names, as shown in Table 5, where "candidate landmarks" refers to all the landmarks obtained, "domain names" refers to the distinct domain names included in the candidate landmarks, and "IP addresses obtained" refers to the distinct IP addresses of the candidate landmarks


Summary

Introduction

One is to judge the distribution of the locations claimed by candidate landmarks that have different domain names but share one IP address: under virtual hosting, different domain names are mapped to one IP address, and the locations of their corresponding organizations are usually widely distributed. Candidate landmarks that belong to CDN networks and cloud services are excluded according to the subnet distribution of the IP addresses to which their domain names map. The reliability of a candidate landmark is calculated from the following information: the result of comparing the web page returned for a request to its IP address with the page returned for a request to its domain name, the number of domain names mapped to its IP address, the result of matching against the registration organization, and the administrative-district-level location information of its IP address.
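
The reliability calculation described above can be illustrated with a small sketch. The excerpt does not give the form of the reliability estimating model, so the code below assumes a logistic model over the four listed signals and fits its parameters by batch gradient descent. The feature encoding, the function names (`reliability`, `fit`), and the toy training rows are all hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def reliability(features, weights, bias):
    """Assumed model: logistic function of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def fit(samples, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(samples[0])
    m = len(samples)
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = reliability(x, weights, bias) - y  # d(loss)/d(score)
            for i in range(n):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

# Hypothetical feature encoding (an assumption, not the paper's):
#   [page_match, 1 / number_of_domains_on_ip, org_match, district_match]
# Toy, made-up rows: 1 = reliable landmark, 0 = unreliable landmark.
samples = [
    [1, 1.00, 1, 1],
    [1, 0.50, 1, 1],
    [0, 0.10, 0, 1],
    [0, 0.05, 0, 0],
]
labels = [1, 1, 0, 0]

weights, bias = fit(samples, labels)
scores = [reliability(x, weights, bias) for x in samples]
```

After fitting, `scores` holds one reliability value in (0, 1) per candidate landmark; a threshold on that value (for example 0.5, again an assumption) would separate reliable from unreliable candidates.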

The Evaluator Framework
Experiments and Results
Verification Dataset
Conclusions