Abstract

Search engine optimization (SEO) can significantly influence what is shown on the result pages of commercial search engines. However, it is unclear what proportion of (top) results have actually been optimized. We developed a tool that uses a semi-automatic approach to detect, based on a given URL, whether SEO measures were taken. In this multi-dimensional approach, we analyze the HTML code, from which we extract information on SEO and analytics tools. Further, we extract SEO indicators at the page level and the website level (e.g., page descriptions and the loading time of a website). We complement this approach with lists of manually classified websites and apply machine learning methods to improve the classifier. An analysis based on three datasets with a total of 1,914 queries and 256,853 results shows that a large fraction of the pages found in Google is at least probably optimized. This is in line with statements from SEO experts, who say that it is difficult to gain visibility in search engines without applying SEO techniques.
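To illustrate the kind of page-level analysis described above, the following is a minimal sketch of extracting SEO indicators from raw HTML. The specific indicators, class names, and the analytics-tool signature checked here are illustrative assumptions, not the paper's actual feature set.

```python
from html.parser import HTMLParser


class SEOIndicatorParser(HTMLParser):
    """Hypothetical sketch: collect a few page-level SEO indicators
    (meta description, canonical link, analytics script) from HTML.
    The chosen indicators are assumptions for illustration only."""

    def __init__(self):
        super().__init__()
        self.indicators = {
            "has_meta_description": False,
            "has_canonical_link": False,
            "uses_analytics_tool": False,
        }

    def handle_starttag(self, tag, attrs):
        attrs = {k: (v or "") for k, v in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "description":
            self.indicators["has_meta_description"] = True
        elif tag == "link" and attrs.get("rel", "").lower() == "canonical":
            self.indicators["has_canonical_link"] = True
        elif tag == "script" and "googletagmanager" in attrs.get("src", ""):
            # Presence of a tag-manager/analytics script is treated here as a
            # (hypothetical) weak signal that the site is actively managed.
            self.indicators["uses_analytics_tool"] = True


def extract_indicators(html: str) -> dict:
    """Return a dict of boolean SEO indicators for one page."""
    parser = SEOIndicatorParser()
    parser.feed(html)
    return parser.indicators
```

Such boolean indicators could then serve as features for a downstream classifier; this sketch deliberately uses only the standard library so it runs without external dependencies.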
