Abstract

在深度网研究领域,通用搜索引擎(比如Google和Yahoo)具有许多不足之处:它们各自所能覆盖的数据量与整个深度网数据总量的比值小于1/3;与表层网中的情况不同,几个搜索引擎相结合所能覆盖的数据量基本没有发生变化.许多深度网站点能够提供大量高质量的信息,并且,深度网正在逐渐成为一个最重要的信息资源.提出了一个三分类器的框架,用于自动识别特定领域的深度网入口.查询接口得到以后,可以将它们进行集成,然后将一个统一的接口提交给用户以方便他们查询信息.通过8组大规模的实验,验证了所提出的方法可以准确高效地发现特定领域的深度网入口.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.