Internal Data Model Research Articles

This article discusses the use of modeling tools at the database design stage. Database design is one of the most complex and critical tasks in creating an automated information system. Its outcome should determine the content of the database, an effective way of organizing the data for all future users, and the data management tools. The main goal of this stage is to develop a database schema that contains the necessary and sufficient data on information objects, their properties, and their relationships, in accordance with the objects of the subject area and the processes that transform them. The authors present the results of applying modeling tools to the problems of the database design stage. Model creation is considered from the point of view of the three-level architecture, whose main principle is abstraction. To represent the levels of data presentation, three related models were developed: an external data model that captures the views of each type of user in the organization (a description of the subject area); a conceptual data model that gives a logical (generalized) view of the data, independent of the database management system selected; and an internal data model that expresses the conceptual schema in a form the selected target database management system can process. The example subject area is the activity of a school of foreign languages, specifically the registration of its students. The main problem is the large amount of information processed manually: client forms are filled out on paper, and work reports are compiled and completed by hand. The task is to develop a draft database for registering students of a school of foreign languages and then implement it in the selected database management system.
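The relationship between the internal and external models described above can be illustrated with a minimal sketch. It is not the authors' schema: the table names (student, course, registration), the view name (course_roster), the columns, and the choice of SQLite as the target database management system are all assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical registration schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript("""
        -- Internal model: concrete tables for the target DBMS (SQLite here).
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            full_name  TEXT NOT NULL
        );
        CREATE TABLE course (
            course_id INTEGER PRIMARY KEY,
            language  TEXT NOT NULL,
            level     TEXT NOT NULL
        );
        CREATE TABLE registration (
            student_id    INTEGER NOT NULL REFERENCES student(student_id),
            course_id     INTEGER NOT NULL REFERENCES course(course_id),
            registered_on TEXT NOT NULL,
            PRIMARY KEY (student_id, course_id)
        );
        -- External model: a view tailored to one user type (a course roster).
        CREATE VIEW course_roster AS
            SELECT c.language, c.level, s.full_name, r.registered_on
            FROM registration r
            JOIN student s ON s.student_id = r.student_id
            JOIN course  c ON c.course_id  = r.course_id;
    """)
    return conn


def register(conn, student_id, course_id, date):
    """Record one registration; the keys must refer to existing rows."""
    conn.execute(
        "INSERT INTO registration VALUES (?, ?, ?)",
        (student_id, course_id, date),
    )


conn = build_demo_db()
conn.execute("INSERT INTO student VALUES (1, 'Ana Example')")
conn.execute("INSERT INTO course VALUES (10, 'English', 'B1')")
register(conn, 1, 10, "2024-09-01")
roster = conn.execute(
    "SELECT language, level, full_name FROM course_roster"
).fetchall()
print(roster)  # [('English', 'B1', 'Ana Example')]
```

The view lets a user work with the roster without knowing how the tables are stored, which is the kind of separation between levels that the abstract attributes to the architecture.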

Zenodo (https://zenodo.org) is an open-access repository operated by CERN (European Organization for Nuclear Research), which provides researchers with an easy and stable platform to archive and publish their data and other output, such as software tools, manuals and project reports. In the context of the ICEDIG (Innovation and Consolidation for Large scale Digitisation of Natural Heritage) project, Zenodo was investigated for its usability as a platform where digitized images of collection specimens could be archived and published. In a production digitization pipeline, we foresee the automated archiving of daily image production. If Zenodo could be used for this purpose, such a process would also immediately mean that data and images are published FAIR-ly (Findable, Accessible, Interoperable and Reusable) within hours of their creation. To evaluate performance of the system, we first used a test dataset of 1800 herbarium specimen images, which was uploaded using Zenodo's API (Application Programming Interface) (Dillen et al. 2019). This dataset includes lossless TIFF images, label-segmented overlays and JSON-LD (JavaScript Object Notation for Linked Data) metadata using DwC (Darwin Core) terminology, constituting over 208 gigabytes of data. In addition, for all individual digital specimens the data about the specimen (in DwC) as well as metadata about its deposition on Zenodo (in Zenodo's internal data model) were available in multiple machine-readable formats. All data in DwC were provided as linked data with their DwC identifiers (e.g. http://rs.tdwg.org/dwc/terms/basisOfRecord). All individual specimens received minted DOIs (Digital Object Identifiers). A second upload of 280,000 herbarium JPEG images from a single institution (ca. 1 terabyte of data) with limited metadata (but using the same approach) was launched as well. 
In this presentation, the workflow for proper usage of the API will be described as well as some performance metrics, flexibilities and functionalities of the platform. Some issues and potential developments to tackle them will be discussed. Currently, the rate of ingestion into Zenodo seems only fast enough for small scale digitization pipelines. However, a modest improvement in transfer rate would make this a realistic proposition for large volume usage.
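The dataset figures quoted above give a rough sense of the transfer rate a pipeline would need. The sketch below is back-of-the-envelope arithmetic only: it assumes decimal units (1 gigabyte = 10^9 bytes), treats "over 208 gigabytes" and "ca. 1 terabyte" as exact, and uses a hypothetical production of 5,000 images per day that does not come from the abstract.

```python
def mean_size_mb(total_bytes, n_files):
    """Average file size in megabytes (decimal, 1 MB = 1e6 bytes)."""
    return total_bytes / n_files / 1e6


def required_rate_mb_s(images_per_day, mean_mb, hours=24):
    """Sustained upload rate needed to archive one day's images within `hours`."""
    return images_per_day * mean_mb / (hours * 3600)


# Figures from the abstract: 1800 images in 208 GB, 280,000 images in about 1 TB.
tiff_mb = mean_size_mb(208e9, 1800)      # about 115.6 MB per TIFF specimen record
jpeg_mb = mean_size_mb(1e12, 280_000)    # about 3.6 MB per JPEG image

# Hypothetical daily production (an assumption, not a figure from the source).
daily_images = 5000
print(f"TIFF: {required_rate_mb_s(daily_images, tiff_mb):.2f} MB/s sustained")
print(f"JPEG: {required_rate_mb_s(daily_images, jpeg_mb):.2f} MB/s sustained")
```

Under these assumptions the lossless TIFF workflow needs a sustained rate roughly thirty times that of the JPEG workflow for the same number of images, so the image format affects feasibility as much as the ingestion rate itself.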

Articles published on Internal Data Model

Enhancing RDM in Galaxy by integrating RO-Crate

Implementation of long-term storage of large-volume scientific data in a computing center

The use of modeling tools at the stages of database design

Zenodo, an Archive and Publishing Repository: A tale of two herbarium specimen pilot projects

Data Model and Software Architecture for Business Process Model Generator

Building an archive with Saada

Integration of pathway data as prior knowledge into methods for network reconstruction

Extraction of standardized archetyped data from Electronic Health Record systems based on the Entity-Attribute-Value Model

Autoplot: a browser for scientific data on the web

An ontology editor for describing semantic relationships of web documents

A method for integrating multiple components in a decision support system

Design Choices when Architecting Visualizations

UPGRADE: A Framework for Building Graph-Based Interactive Tools

Java power tools

A multi-view VR interface for 3D GIS

Choice of conceptual and internal data model in a data base management system (DBMS) with a layered structure
