Empowering OCL research: a large-scale corpus of open-source data from GitHub

Josh G. M. Mengerink, Jeroen Noten, Alexander Serebrenik

doi:10.1007/s10664-018-9641-6

Open Access

https://doi.org/10.1007/s10664-018-9641-6

Abstract

Model-driven engineering (MDE) raises the level of abstraction in software and system design. In particular, meta-models become a central artifact in the process, and are supported by various other artifacts such as editors and transformations. In order to define constraints, invariants, and queries on model-driven artifacts, a generic language has been developed: the Object Constraint Language (OCL). In the literature, many studies of OCL have been performed on small collections of data, mostly originating from a single source (e.g., OMG standards). As such, generalization of results beyond the data studied is often mentioned as a threat to validity. Creation of a benchmark dataset has already been identified as a key enabler to address the generalization threat. To facilitate further empirical studies in the field of OCL, we present the first large-scale dataset of 103,262 OCL expressions, systematically extracted from 671 GitHub repositories. In particular, these expressions have been extracted from various types of files (among others, metamodels and model-to-text transformations). In this work we showcase a variety of studies performed using our dataset, and describe several other types of studies that could be performed. We extend previous work with data and experiments regarding OCL in model-to-text (mtl) transformations.

Highlights

Model driven engineering (MDE) is being used in industry to drive increases in productivity (Hutchinson et al 2011b).
In Noten et al (2017a) we described the data collection process, presented the dataset of Object Constraint Language (OCL) expressions derived from Ecore meta-models, replicated a study of Cadavid et al (2015), and evaluated the assumptions made in previous work by Anastasakis et al (2008).
In this work we present a publicly available dataset of OCL expressions derived from GitHub.

Summary

Introduction

Model driven engineering (MDE) is being used in industry to drive increases in productivity (Hutchinson et al 2011b). One such driver is the use of domain specific languages (DSLs), which allow engineers to specify systems in terms relevant to their domain rather than encoding them into general purpose concepts like those of UML. These DSLs are underpinned by metamodels (Cuadrado and Molina 2007), which express the concepts and structure of possible models (i.e., the abstract syntax). As DSLs grow in complexity, the expressivity of meta-models alone is often not sufficient to accurately specify the domain (Richters and Gogolla 1998). To address this problem, more complex mechanisms have been proposed, such as the Object Constraint Language (OCL) (Warmer and Kleppe 2003). A need for a more diverse and thorough dataset of OCL expressions has been recognized in the literature (Gogolla et al 2013; Gogolla and Cabot 2016).
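To make the idea of OCL constraints attached to a metamodel concrete, here is a minimal Python sketch that pulls OCL expressions out of an Ecore file. It is not the extraction pipeline used by the authors. It assumes the constraints are embedded as annotations whose source URI begins with http://www.eclipse.org/emf/2002/Ecore/OCL, which is one common convention. The function name extract_ocl, the sample metamodel, and the constraint nonNegativeAge are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: OCL is embedded in annotations whose source URI starts with
# this prefix; other embedding conventions are not covered by this sketch.
OCL_SOURCE_PREFIX = "http://www.eclipse.org/emf/2002/Ecore/OCL"

# Hypothetical metamodel: the structure says a Person has an age, but only
# the OCL invariant can state that the age must not be negative.
SAMPLE_ECORE = """<?xml version="1.0" encoding="UTF-8"?>
<ecore:EPackage xmi:version="2.0"
    xmlns:xmi="http://www.omg.org/XMI"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
    name="demo" nsURI="http://example.org/demo" nsPrefix="demo">
  <eClassifiers xsi:type="ecore:EClass" name="Person">
    <eAnnotations source="http://www.eclipse.org/emf/2002/Ecore/OCL">
      <details key="nonNegativeAge" value="self.age &gt;= 0"/>
    </eAnnotations>
    <eStructuralFeatures xsi:type="ecore:EAttribute" name="age"
        eType="ecore:EDataType http://www.eclipse.org/emf/2002/Ecore#//EInt"/>
  </eClassifiers>
</ecore:EPackage>
"""


def extract_ocl(ecore_xml):
    """Return (classifier name, constraint name, OCL expression) triples."""
    root = ET.fromstring(ecore_xml)
    found = []
    # Child elements in Ecore XMI carry no namespace, so plain tag names work.
    for classifier in root.iter("eClassifiers"):
        for annotation in classifier.findall("eAnnotations"):
            if not annotation.get("source", "").startswith(OCL_SOURCE_PREFIX):
                continue
            for detail in annotation.findall("details"):
                found.append(
                    (classifier.get("name"), detail.get("key"), detail.get("value"))
                )
    return found


if __name__ == "__main__":
    for owner, name, expression in extract_ocl(SAMPLE_ECORE):
        print(f"{owner}.{name}: {expression}")
```

The sketch only inspects annotations placed directly on classifiers. A corpus-scale extraction would also need to handle other file types, such as the model-to-text transformations mentioned in the abstract.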

Journal: Empirical Software Engineering
Publication Date: Aug 23, 2018
Citations: 7
License type: open-access

Similar Papers

A Data Set of OCL Expressions on GitHub
Jeroen Noten ... Josh G.M. Mengerink
01 May 2017

A benchmark for OCL engine accuracy, determinateness, and efficiency
Mirco Kuhlmann ... Lars Hamann
Software & Systems Modeling | VOL. 11
14 Sep 2010

An Empirical Study of the Impact of OCL Smells and Refactorings on the Understandability of OCL Specifications
Alexandre Correa ... Cláudia Werner
11 Sep 2007

On OCL-based imperative languages
Fabian Büttner ... Martin Gogolla
Science of Computer Programming | VOL. 92
26 Oct 2013
