Abstract

How software applications use open data resources of varied type and volume remains only partly understood. In this study, following the CRoss-Industry Standard Process for Data Mining (CRISP-DM), we propose a methodology for collecting and analyzing access data that describe the use of open data resources by individual software applications. The methodology includes a novel categorization of the data collected at an exposition portal that provides access to underlying open data portals and third-party services. It also enables research into the use of both individual open data resources and resource groups, such as Big Data resources, for software development. We apply the methodology to analyze the re-use of open urban data during reference software development events. The identification of open data use by individual applications is substantially improved compared to a baseline scenario, as shown by numerical indicators including the F1 measure. The analysis also yields insight into the re-use of data streams and actual development time.
