Abstract
The aim of the article is to emphasize and illustrate the retrieval dimensions of online data collection and their influence on the outcome of research evaluation. The attempt is to reinforce the link between online retrieval and bibliometrics. Given that various forms of publication counts and citation analyses provide a valuable and revealing quantitative starting point for more qualitative indications and assessments of Science and Technology (S&T) performance, it is evident that their reliability and objectivity must be undisputed as far as possible. The article discusses the basic problems and limitations inherent in online bibliometric data collection and analyses, and points to possible solutions by means of illustrative case studies and examples. The need to perform local publication analyses online often arises from the increased use of external research assessments made by centralized bodies. For small institutions in small countries, such as the North European ones, such self-analyses may in addition provide valuable and inexpensive insights into novel S&T niches to explore. The major concern is the extent to which online bibliographic and domain-dependent databases, as a supplement to the Institute for Scientific Information (ISI) citation files, are suitable for quantitative analysis and mapping of R&D outcome. When these two different types of databases are merged into a single cluster, the method of duplicate removal becomes crucial. The article introduces a novel removal procedure by describing and exemplifying the principle of Reversed Duplicate Removal (RDR). RDR enables the analyst to take control of the location of the duplicates and to perform tailored analyses of the overlap of identical documents between files. It is well known that the databases themselves present obstacles directly associated with the process of performing online retrieval of the information necessary for further analysis.
Problems encountered include, for instance, poor or inconsistent subject indexing within a single database or among several databases. Inconsistent name forms for authors, institutions, and journals, and the lack or inaccessibility of vital data in the database structures, also present obstacles. On the other hand, comprehensive online bibliometric analyses are in many ways easier, faster, and less expensive to perform locally than those made using the independent CD-ROM versions of the relevant databases. In contrast to the online versions, the CD-ROM systems show a serious shortage of robust data processing and manipulation facilities. The downloading of records from a variety of CD-ROM files, the cleaning-up process, and the ensuing data processing activities become cumbersome and resource demanding. Regardless of database version, the analyst's awareness of these retrieval and set isolation factors, such as the relevant search commands, the syntax, and the analysis assumptions, plays an important role in the quality of the analysis outcome. © 1997 John Wiley & Sons, Inc.
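The abstract does not spell out the RDR procedure, so the following Python sketch is only one possible reading of "taking control of the location of the duplicates": duplicates are kept in whichever file comes first in a chosen priority order, and reversing that order moves them to the other file. The record fields, the matching key (normalized title plus year), the file names, and the sample records are all hypothetical and made up for illustration; they are not taken from the article or from any real retrieval system.

```python
from collections import Counter, OrderedDict


def dedup_key(record):
    """Hypothetical matching key: alphanumeric lowercase title plus year."""
    title = "".join(ch for ch in record["title"].lower() if ch.isalnum())
    return (title, record["year"])


def remove_duplicates(files, priority):
    """Merge all files; each duplicate is kept in the first file,
    following `priority` order, that contains it."""
    kept = OrderedDict()
    for name in priority:
        for record in files[name]:
            key = dedup_key(record)
            if key not in kept:
                kept[key] = (name, record)
    return list(kept.values())


def count_by_file(deduplicated):
    """Number of retained records attributed to each file."""
    return Counter(name for name, _ in deduplicated)


def overlap(files, a, b):
    """Records of file `b` that also occur in file `a`."""
    keys_a = {dedup_key(r) for r in files[a]}
    return [r for r in files[b] if dedup_key(r) in keys_a]


# Made-up sample records standing in for a citation file and a domain file.
files = {
    "citation_file": [
        {"title": "Online Bibliometrics", "year": 1996},
        {"title": "Citation Mapping", "year": 1995},
    ],
    "domain_file": [
        {"title": "Online bibliometrics.", "year": 1996},
        {"title": "Niche Analysis", "year": 1997},
    ],
}

forward = remove_duplicates(files, ["citation_file", "domain_file"])
reverse = remove_duplicates(files, ["domain_file", "citation_file"])

forward_counts = count_by_file(forward)
reverse_counts = count_by_file(reverse)
shared = overlap(files, "citation_file", "domain_file")
```

Under these assumptions the merged set has the same size in both directions, while the per-file counts shift by the number of overlapping records, which is what lets the overlap between files be examined separately.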
More From: Journal of the American Society for Information Science