Abstract

Formalization is becoming more common in all stages of the development of information systems, as a better understanding of its benefits emerges. Classification systems are ubiquitous, nowhere more so than in domain modeling. The classification pattern that underlies these systems provides a good case study of the move toward formalization, in part because it illustrates some of the barriers to formalization, including the formal complexity of the pattern and the ontological issues surrounding the “one and the many.” Powersets are a way of characterizing the (complex) formal structure of the classification pattern, and their formalization has been extensively studied in mathematics since Cantor’s work in the late nineteenth century. One can use this formalization to develop a useful benchmark. There are various communities within information systems engineering (ISE) that are gradually working toward a formalization of the classification pattern. However, for most of these communities, this work is incomplete, in that they have not yet arrived at a solution with the expressiveness of the powerset benchmark. This contrasts with the early, smooth adoption of powersets by other information systems communities to, for example, formalize relations. One way of understanding the varying rates of adoption is to recognize that the different communities have different historical baggage. Many conceptual modeling communities emerged from work done on database design, and this creates hurdles to the adoption of the high level of expressiveness of powersets. Another relevant factor is that these communities also often feel, particularly in the case of domain modeling, a responsibility to explain the semantics of whatever formal structures they adopt.
This paper aims to make sense of the formalization of the classification pattern in ISE and surveys its history through the literature, starting from the relevant theoretical works of the mathematical literature and gradually shifting focus to the ISE literature. The literature survey follows the evolution of ISE’s understanding of how to formalize the classification pattern. The various proposals are assessed using the classical example of classification, the Linnaean taxonomy, formalized using powersets, as a benchmark for formal expressiveness. The broad conclusion of the survey is that (1) the ISE community is currently in the early stages of the process of understanding how to formalize the classification pattern, particularly in the requirements for expressiveness exemplified by powersets, and (2) there is an opportunity to intervene and speed up the process of adoption by clarifying this expressiveness. Given the central place that the classification pattern has in domain modeling, this intervention has the potential to lead to significant improvements.

Highlights

  • Classification is ubiquitous [28, 36, 102]

  • Our interest is in a pattern that can be discerned at the core of these classification systems and that underpins their structure, which we call the “classification pattern.”

  • The comparisons with the mathematical framework have made clear that the term “powertype” in the Odell-UML strand has been subject to semantic drift


Summary

Introduction

Classification (in the everyday sense) is ubiquitous [28, 36, 102]. This should not be surprising; classifications are one of the major ways we organize things in the world. This paper aims to start to fill that gap. It surveys how the formalization of the classification pattern has emerged and evolved in information systems engineering (ISE). It briefly tracks the history of its emergence and builds a picture of the current status. The hypothesis is that two factors contribute to the slow adoption: the formal complexity of the pattern, and the requirement for an explanation of what the proposed formal structures represent (a semantic-ontological narrative about what aspect of reality they reflect). This is described in the second part of the paper.
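The formal complexity mentioned above can be made concrete with a small sketch. The following Python fragment is illustrative only and is not the paper's formalization: the individual names, the species and genus sets, and the helper functions `powerset` and `is_partition` are all hypothetical. It shows that each species is a subset of the set of individuals (a member of its powerset), that a rank such as "species" is a set of such subsets (a member of the powerset of the powerset), and that a rank can be checked as a partition of the individuals.

```python
from itertools import chain, combinations


def powerset(s):
    """Return the set of all subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}


def is_partition(blocks, s):
    """True if blocks are non-empty, pairwise disjoint, and their union is s."""
    blocks = list(blocks)
    if any(not b for b in blocks):
        return False
    union = frozenset().union(*blocks) if blocks else frozenset()
    # Equal total size and union size means no element is counted twice.
    return union == frozenset(s) and sum(len(b) for b in blocks) == len(union)


# Hypothetical ur-elements: individual animals (names are illustrative).
animals = {"leo", "shere_khan", "akela"}

# Hypothetical species: each is a subset of animals,
# i.e. a member of powerset(animals).
panthera_leo = frozenset({"leo"})
panthera_tigris = frozenset({"shere_khan"})
canis_lupus = frozenset({"akela"})

# A rank is a set of such subsets: a subset of powerset(animals),
# hence a member of the powerset of the powerset.
species_rank = {panthera_leo, panthera_tigris, canis_lupus}

# Hypothetical genera, built as unions of their species.
panthera = panthera_leo | panthera_tigris
canis = canis_lupus
genus_rank = {panthera, canis}

print(len(powerset(animals)))
print(species_rank <= powerset(animals))
print(is_partition(species_rank, animals), is_partition(genus_rank, animals))
```

Even with three individuals the powerset has eight members, and the set of candidate ranks is drawn from the powerset of that, which gives a feel for how quickly the structure grows.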

Why classification is being formalized now
What is the classification pattern?
Separating the concerns
How to assess the formalization
ISE survey: assessing the formalization
The classification benchmark
Formal structure
Mathematical background
Which mathematical theory?
Powersets and related mathematical objects
Origin and definition of powerset
Ur-elements
Subset-of
Powerset expanded
Intersection and union of sets
Partition of S
Cover of S
Powerset-of relation
Powerset-subset-of
The extensionality of set theory
A technical point
Alternative mathematical frameworks
Type theory
Category theory
Formalizing classifications using mathematical set-theoretic objects
The Linnaean taxonomy
The Linnaean classifications
The five Linnaean ranks
Rank ordering and partitioning
The underlying formal structure
A survey of powertypes in ISE
The mathematics adopting communities
Historical context
Major powertype strands
Materialization strand
Example benchmark
Mathematical framework
Ptech-UML
Ptech-Odell-powertype
Initial mathematical framework
Comparing the Odell-powertype and the materialization strand
UML-powertype
Clabject-powertype
Semantic drift
Clabject’s approach to the “one over many” question
A Procrustean approach?
Powertype in BORO and ISO 15926-2
ISO 15926 powertype
BORO powertype
Implementation strands
Classification formalization landscape
Classification blindness
Formal expressiveness in terms of mathematical objects
Formal expressiveness in terms of example requirements
Too constraining formal structures
Further work
Large implemented systems survey
Survey the possible semantics for powertypes
Higher-order types survey
Relation between meta-modeling and higher-order types
Conclusions
Aristotle
