A digital service, such as a web site, may contain a great deal of information, but we often do not know whether that information is used, relevant or valuable. The transaction log files generated by digital information services record the pages (topics or content) viewed by users, and this is perhaps the most interesting aspect of the logs. However, analysing these pages poses many problems for researchers, especially when comparing the content coverage of related services. It is quite normal, even within a single organization, for each digital service to adopt its own page naming conventions, and this is even more true of services run by different organizations. As a result, there is no easy way to compare topic use as revealed by access behaviour. This paper looks at the problems of describing and comparing the content usage of digital information services across three digital platforms operating in the health field. It discusses the problems posed in making health content comparisons that are based on page names listed in transaction log files and that span very large data sets. It also reviews the impact that system architecture, the length of time a service has been available online, and differences between outlets might have. The main focus of the article, however, is a comparison of five sources of health information through their log files. The analysis makes use of cluster analysis and applies procedures normally used to define species diversity to research content coverage. In all, two million page views were analysed, covering more than 5000 unique health pages.
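
To illustrate what applying a species-diversity procedure to content coverage can look like, the following is a minimal sketch in Python. It assumes the Shannon index as the diversity measure, which is one common choice in ecology; the abstract does not state which procedure the study actually used. The function names, the topic labels and the two example services are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def shannon_diversity(page_views):
    """Shannon index H = -sum(p_i * ln p_i) over topics.

    `page_views` is an iterable of topic labels, one per page view.
    Treating topics as "species" is an assumption made for this sketch.
    """
    counts = Counter(page_views)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def evenness(page_views):
    """Pielou evenness H / ln(S), where S is the number of distinct topics.

    Returns 0.0 when fewer than two topics are present, since ln(S) is 0.
    """
    richness = len(set(page_views))
    if richness < 2:
        return 0.0
    return shannon_diversity(page_views) / math.log(richness)


# Hypothetical page views for two services, already mapped to shared topics.
service_a = ["diabetes"] * 50 + ["asthma"] * 30 + ["back pain"] * 20
service_b = ["diabetes"] * 95 + ["asthma"] * 3 + ["back pain"] * 2

for name, views in [("A", service_a), ("B", service_b)]:
    print(name, round(shannon_diversity(views), 3), round(evenness(views), 3))
```

A higher index indicates that use is spread more evenly over more topics. The sketch presupposes that page names from different services have already been mapped to a common set of topic labels, which is the difficulty the paper addresses.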