Design Towards Modern High Performance Numerical LA Library Enabling Heterogeneity and Flexible Data Formats
This work proposes a design for modernizing a high performance numerical library so that it supports several modes of parallel execution, such as offloading, many-core concurrent execution, and heterogeneous hybrid execution. The prototype implementation, Eigen-G2, employs a parallel task model supported by OpenMP together with exclusive control of offloading to a GPU device. Multiple data formats are also available, taking advantage of the metaprogramming support of the C++ language. Eigen-G2 exhibits good parallel scalability when multicore CPUs and a GPU are used at the same time.
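The combination of a task model with exclusive control of a single offload device can be illustrated with a minimal sketch. It is written in Python with threads rather than C++ with OpenMP, and it is not Eigen-G2's implementation: `gpu_lock`, `tile_sum_gpu`, `run_task`, and `hybrid_sum_of_squares` are hypothetical names, and the "GPU" path is a host-side placeholder. Each task tries to claim the device without blocking and falls back to the CPU path when the device is busy.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in: one lock models exclusive ownership of a single GPU.
gpu_lock = threading.Lock()


def tile_sum_cpu(tile):
    """CPU path: sum of squares of one tile."""
    return sum(x * x for x in tile)


def tile_sum_gpu(tile):
    """Placeholder for an offloaded kernel; it runs the same arithmetic on the host."""
    return sum(x * x for x in tile)


def run_task(tile):
    """Run one task on the device if it is free, otherwise on the CPU."""
    if gpu_lock.acquire(blocking=False):
        try:
            return "gpu", tile_sum_gpu(tile)
        finally:
            gpu_lock.release()
    return "cpu", tile_sum_cpu(tile)


def hybrid_sum_of_squares(data, tile_size=4, workers=4):
    """Split data into tiles, run them as tasks, and combine the partial sums."""
    tiles = [data[i:i + tile_size] for i in range(0, len(data), tile_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_task, tiles))
    total = sum(value for _, value in results)
    placement = [where for where, _ in results]
    return total, placement
```

The placement of each tile depends on timing, but the combined result does not, which is the property a hybrid scheduler of this kind needs.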
- Research Article
11
- 10.3389/fmolb.2023.1237129
- Sep 6, 2023
- Frontiers in Molecular Biosciences
Introduction: Co-normalization of RNA profiles obtained using different experimental platforms and protocols opens an avenue for comprehensive comparison of relevant features such as differentially expressed genes associated with disease. Currently, most bioinformatic tools normalize data into a flexible format that depends on the individual datasets under analysis, so the outputs of such normalizations are poorly compatible with each other. We recently proposed a new approach to gene expression data normalization, termed Shambhala, which returns harmonized data in a uniform shape: every expression profile is transformed into a pre-defined universal format. We previously showed that following shambhalization of human RNA profiles, overall tissue-specific clustering features are strongly retained while platform-specific clustering is dramatically reduced. Methods: Here, we tested Shambhala performance in retaining fold-change gene expression features and other functional characteristics of gene clusters, such as pathway activation levels and predicted cancer drug activity scores. Results: Using 6,793 cancer and 11,135 normal tissue gene expression profiles from the literature and experimental datasets, we applied twelve performance criteria to different versions of Shambhala and to other methods of transcriptomic harmonization with a flexible output data format. These criteria dealt with biological type classifiers, hierarchical clustering, correlation/regression properties, stability of drug efficiency scores, and data quality for machine learning classifiers. Discussion: The Shambhala-2 harmonizer demonstrated the best results, with correlation and linear regression coefficients close to 1 for the comparison of training vs validation datasets and less than half the instability in calculated drug efficiency scores compared to other methods.
- Research Article
8
- 10.1007/s11633-006-0414-0
- Oct 1, 2006
- International Journal of Automation and Computing
Efficient real-time data exchange over the Internet plays a crucial role in the successful application of web-based systems. In this paper, a data transfer mechanism over the Internet is proposed for real-time web-based applications. The mechanism incorporates the eXtensible Markup Language (XML) and Hierarchical Data Format (HDF) to provide a flexible and efficient data format. Heterogeneous transfer data are classified into light and heavy data, which are stored using XML and HDF respectively; the HDF data format is then mapped to Java Document Object Model (JDOM) objects in XML in the Java environment. These JDOM data objects are sent across computer networks with the support of the Java Remote Method Invocation (RMI) data transfer infrastructure. Client-defined data priority levels are implemented in RMI and guide a server to transfer data objects at different priorities. A remote monitoring system for an industrial reactor process simulator is used as a case study to illustrate the proposed data transfer mechanism.
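The light/heavy classification and the priority-ordered transfer described above can be sketched as follows. This is an illustration in Python using only the standard library, not the Java/JDOM/RMI mechanism of the paper. The size threshold `LIGHT_LIMIT`, the `encode` helper, and the `TransferQueue` class are assumptions introduced for the example, and heavy data is kept as a plain dictionary where the paper would use HDF.

```python
import heapq
import itertools
import xml.etree.ElementTree as ET

LIGHT_LIMIT = 64  # assumed threshold (number of values) separating light from heavy data


def encode(name, values):
    """Classify a payload and return (kind, encoded form)."""
    if len(values) <= LIGHT_LIMIT:
        elem = ET.Element("data", name=name)
        elem.text = " ".join(str(v) for v in values)
        return "light", ET.tostring(elem, encoding="unicode")
    # Heavy data would be written to HDF in the described mechanism;
    # a dictionary stands in for it here.
    return "heavy", {"name": name, "values": list(values)}


class TransferQueue:
    """Sends objects in order of client-defined priority (lower number first)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps submission order

    def submit(self, priority, name, values):
        kind, payload = encode(name, values)
        heapq.heappush(self._heap, (priority, next(self._order), name, kind, payload))

    def drain(self):
        """Yield (name, kind, payload) from highest to lowest priority."""
        while self._heap:
            _, _, name, kind, payload = heapq.heappop(self._heap)
            yield name, kind, payload
```

Submitting a large trend record at priority 2 and a short alarm record at priority 1 would send the alarm first, with the trend record classified as heavy and the alarm as light.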
- Book Chapter
1
- 10.1007/978-1-84996-359-6_5
- Jan 1, 2011
Efficient real-time data exchange over the Internet plays a crucial role in the successful application of Internet-based control. In this chapter, a data transfer mechanism over the Internet is introduced for real-time web-based applications, particularly for Internet-based control systems. The mechanism incorporates the eXtensible Markup Language (XML) and Hierarchical Data Format (HDF) to provide a flexible and efficient data format. Heterogeneous transfer data are classified into light and heavy data, which are stored using XML and HDF, respectively; the HDF data format is then mapped to Java Document Object Model (JDOM) objects in XML, in the Java environment. These JDOM data objects are sent across computer networks with the support of the Java Remote Method Invocation (RMI) data transfer infrastructure. Client-defined data priority levels are implemented in RMI and guide a server to transfer data objects at different priorities. A remote monitoring system for an industrial reactor process simulator is used as a case study to illustrate the proposed data transfer mechanism.
- Book Chapter
3
- 10.1007/3-540-49164-3_14
- Jan 1, 1999
Portable and efficient ways for calling numerical high performance software libraries from HPF programs are investigated. The methods suggested utilize HPF's EXTRINSIC mechanism and are independent of implementation details of HPF compilers. Two prototypical examples are used to illustrate these techniques. Highly optimized BLAS routines are utilized for local computations: (i) in parallel multiplication of matrices, and (ii) in parallel Cholesky factorization. Both implementations turn out to be very efficient and show significant improvements over standard HPF implementations.
- Research Article
47
- 10.1016/j.ymeth.2016.09.002
- Sep 13, 2016
- Methods
Modeling and interoperability of heterogeneous genomic big data for integrative processing and querying
- Research Article
- 10.1088/1742-6596/664/7/072016
- Dec 1, 2015
- Journal of Physics: Conference Series
Data access and availability are crucial issues in high energy physics (HEP) experiments, given the huge amount of data produced. We present a flexible and modular data format implementation for HEP applications. It has been designed to modularize data in order to update the minimum amount of event information in case of bug correction, software updates or data format extension, to simplify data distribution and upgrades to the regional data centers, and to reduce the amount of data to be transferred to the data members actually affected by reprocessing. The proposed design and implementation have been developed as the mini-DST data format for the Alpha Magnetic Spectrometer (AMS [1]) experiment on the International Space Station (ISS) and are based on the CERN ROOT [2] toolkit.
- Research Article
7
- 10.1007/s40192-017-0084-5
- Mar 1, 2017
- Integrating Materials and Manufacturing Innovation
Modern high-performing structural materials gain their excellent properties from the complex interactions of various constituent phases, grains, and subgrain structures that are present in their microstructure. To further understand and improve their properties, simulations need to take into account multiple aspects in addition to the composite nature. Crystal plasticity simulations incorporating additional physical effects such as heat generation and distribution, damage evolution, phase transformation, or changes in chemical composition enable the compilation of comprehensive structure–property relationships of such advanced materials under combined thermo-chemo-mechanical loading conditions. Capturing the corresponding thermo-chemo-mechanical response at the microstructure scale usually demands specifically adopted constitutive descriptions per phase. Furthermore, to bridge from the essential microstructure scale to the component scale, which is often of ultimate interest, a sophisticated (computational) homogenization scheme needs to be employed. A modular simulation toolbox that allows the problem-dependent use of various constitutive models and/or homogenization schemes in one concurrent simulation requires a flexible and adjustable file format to store the resulting heterogeneous data. Besides dealing with heterogeneous data, a file format suited for microstructure simulations needs to be able to deal with large (and growing) amounts of data as (i) the spatial resolution of routine simulations is ever increasing and (ii) more and more quantities are taken into account to characterize a material. To cope with such demands, a flexible and adjustable data layout based on HDF5 is proposed. The key feature of this data structure is the decoupling of spatial position and data, such that spatially variable information can be efficiently accommodated. For position-dependent operations, e.g., spatially resolved visualization, the spatial link is restored through explicit mappings between simulation results and their spatial position.
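The idea of storing results without coordinates and restoring the spatial link through an explicit mapping can be shown with a small sketch. It uses plain Python containers rather than HDF5, and the constituent names, quantities, values, and the `field_on_grid` helper are all hypothetical; the actual layout proposed in the paper is not reproduced here.

```python
# Results are stored compactly per constituent, with no coordinates attached.
# Each constituent may carry a different set of quantities.
results = {
    "ferrite": {"stress": [210.0, 215.0, 190.0]},
    "martensite": {"stress": [480.0, 510.0], "damage": [0.01, 0.03]},
}

# Explicit mapping: one (constituent, offset) entry per grid point, in spatial order.
mapping = [
    ("ferrite", 0),
    ("martensite", 0),
    ("ferrite", 1),
    ("martensite", 1),
    ("ferrite", 2),
]


def field_on_grid(results, mapping, quantity, missing=None):
    """Rebuild a spatially ordered field for one quantity via the mapping."""
    field = []
    for group, offset in mapping:
        values = results[group].get(quantity)
        field.append(values[offset] if values is not None else missing)
    return field
```

Because only the mapping knows about position, a quantity that exists for one constituent but not another (here `damage`) costs no storage where it is absent, and it appears as `missing` only when a spatial view is requested.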
- Preprint Article
- 10.5194/egusphere-egu25-16541
- Mar 15, 2025
This presentation explores the analysis of heterogeneous geospatial data from various sources through the application of artificial intelligence (AI) tools. Wastewater networks are used as a case study to address challenges such as data completion, multi-source integration, and managing diverse data formats, including Geographic Information Systems (GIS), analog maps, and pipe inspection videos, all derived from real-world data. We will review some solutions developed under the European project Starwars (STormwAteR and WastewAteR networkS heterogeneous data AI-driven management). These solutions are based on innovative models and tools that employ logical and graph-based representations of heterogeneous data. Specifically, we aim to represent different data types — such as GIS, ITV inspection videos, and maps — as annotated graphs, incorporating the uncertainty stemming from incomplete or inconsistent information.
- Conference Article
1
- 10.1109/cluster.2017.57
- Sep 1, 2017
In recent years, many computer simulation codes have been developed as open-source software. Meanwhile, major processors in high performance computing have adopted vector processing. Simulation codes therefore need to be written in a vector-friendly manner to exploit the computational potential of vector processing. Our study evaluates and analyzes the performance of various open-source simulation codes on several vector architectures. In this paper, we evaluate one package of Quantum ESPRESSO, an open-source code in materials science. Quantum ESPRESSO makes use of several numerical libraries, and its parallel parameters, called parallelization levels, are known to affect performance in parallel execution. We discuss the adjustability of the code to a vector architecture. Moreover, we clarify that the performance of PWscf, one of the major packages of Quantum ESPRESSO, depends on the numerical libraries and on the parallelization levels of PWscf. For the performance evaluation, we use a vector-parallel supercomputer system, NEC SX-ACE, and an Intel Xeon-based cluster system, NEC LX 406Re-2. From this evaluation, we confirm that the code is suitable for vector architectures. Additionally, we clarify the effectiveness of applying the optimum numerical libraries to each architecture with appropriate parallel parameters to obtain high performance.
- Conference Article
30
- 10.1145/2132876.2132885
- Nov 14, 2011
The goal of this paper is to present an efficient implementation of explicit matrix inversion for general square matrices on multicore computer architectures. The inversion procedure is split into four steps: 1) computing the LU factorization, 2) inverting the upper triangular U factor, 3) solving a linear system whose solution yields the inverse of the original matrix, and 4) applying backward column pivoting to the inverted matrix. Using a tile data layout, which represents the matrix in system memory in an optimized cache-aware format, the computation of the four steps is decomposed into computational tasks. A directed acyclic graph representing the program data flow is generated on the fly; its nodes represent tasks and its edges the data dependencies between them. Previous implementations of matrix inversion, available in state-of-the-art numerical libraries, suffer from unnecessary synchronization points, which are absent from our implementation in order to fully exploit the parallelism of the underlying hardware. Our algorithmic approach allows us to remove these bottlenecks and to execute the tasks with loose synchronization. A runtime environment system called QUARK is necessary to dynamically schedule our numerical kernels on the available processing units. The reported results from our LU-based matrix inversion implementation significantly outperform state-of-the-art numerical libraries such as LAPACK (5x), MKL (5x) and ScaLAPACK (2.5x) on a contemporary AMD platform with four sockets and a total of 48 cores for a matrix of size 24000. A power consumption analysis shows that our high performance implementation is also energy efficient and consumes substantially less power than its competitors.
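The four steps listed above can be sketched sequentially in pure Python. This is only an illustration of the arithmetic under stated assumptions: it uses dense lists, partial (row) pivoting with PA = LU, and no tiling, task graph, or QUARK scheduling, and the function names are chosen for the example. With PA = LU the inverse is U^{-1} L^{-1} P, so step 3 solves X L = U^{-1} for X and step 4 permutes the columns of X.

```python
def lu_factor(a):
    """Step 1: in-place style LU with partial pivoting; returns (lu, perm).

    Row i of the permuted matrix is row perm[i] of `a`. L (unit diagonal) is
    stored below the diagonal of `lu`, U on and above it.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def invert_upper(lu):
    """Step 2: invert the upper triangular factor U by back substitution."""
    n = len(lu)
    uinv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        uinv[j][j] = 1.0 / lu[j][j]
        for i in range(j - 1, -1, -1):
            s = sum(lu[i][k] * uinv[k][j] for k in range(i + 1, j + 1))
            uinv[i][j] = -s / lu[i][i]
    return uinv


def solve_xl(uinv, lu):
    """Step 3: solve X L = U^{-1} for X, with L unit lower triangular."""
    n = len(lu)
    x = [row[:] for row in uinv]
    for r in range(n):
        for j in range(n - 1, -1, -1):
            x[r][j] -= sum(x[r][i] * lu[i][j] for i in range(j + 1, n))
    return x


def unpivot_columns(x, perm):
    """Step 4: apply the pivoting backwards to the columns, giving X P."""
    n = len(x)
    inv = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for i in range(n):
            inv[r][perm[i]] = x[r][i]
    return inv


def invert(a):
    """Explicit inverse of a square matrix via the four steps."""
    lu, perm = lu_factor(a)
    return unpivot_columns(solve_xl(invert_upper(lu), lu), perm)
```

In a tiled implementation each of these loops would be broken into per-tile kernels whose data dependencies form the task graph; the sketch keeps them as whole-matrix loops for readability.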
- Research Article
1
- 10.1038/s41597-024-04196-x
- Feb 1, 2025
- Scientific Data
Customizing the structure and format of scientific data facilitates the publication of diverse and heterogeneous data. Many data publishing platforms empower users to create self-designed schemas, leading to schema proliferation and more intricate creation processes. To address these challenges, we present a semi-automatic method and system for constructing heterogeneous material data schemas based on structure- and context-aware recommendation. We propose a schema fragment tree structure to represent data schemas with hierarchical relationships, transforming the recommendation problem into subtree matching. Fragment index and semantic search techniques are introduced to identify candidate fragments, and a tree edit distance algorithm calculates similarity scores. Evaluated on the Data Schema Construction System, the algorithm outperforms the baselines (TF-IDF and BM25 for schema matching) in precision, recall, and F1-score. The baseline for workload reduction is the effort required to create schemas without recommendation. Our recommendation improves schema creation efficiency by 50.5% and reduces schema proliferation by 16.5%.
- Research Article
- 10.18699/vjgb-24-101
- Jan 26, 2025
- Vavilovskii zhurnal genetiki i selektsii
To systematize and effectively use the huge volume of experimental data accumulated in the field of bioinformatics and biomedicine, new approaches based on ontologies are needed, including automated methods for semantic integration of heterogeneous experimental data, methods for creating large knowledge bases and self-interpreting methods for analyzing large heterogeneous data based on deep learning. The article briefly presents the features of the subject area (bioinformatics, systems biology, biomedicine), formal definitions of the concepts of ontology and knowledge graphs, and examples of using ontologies for semantic integration of heterogeneous data, for creating large knowledge bases, and for interpreting the results of deep learning on big data. As an example of a successful project, the Gene Ontology knowledge base is described, which includes not only terminological knowledge and gene ontology annotations (GOA), but also causal influence models (GO-CAM). This makes it useful not only for genomic biology, but also for systems biology and for interpreting large-scale experimental data. An approach to building large ontologies using design patterns is discussed, using the ontology of biological attributes (OBA) as an example. Here, most of the classification is automatically computed from previously created reference ontologies using automated inference, except for a small number of high-level concepts. One of the main problems of deep learning is the lack of interpretability, since neural networks often function as "black boxes" unable to explain their decisions. This paper describes approaches to creating methods for interpreting deep learning models and presents two examples of self-explanatory ontology-based deep learning models: (1) Deep GONet, which integrates Gene Ontology into a hierarchical neural network architecture, where each neuron represents a biological function. Experiments on cancer diagnostic datasets show that Deep GONet is easily interpretable and has high performance in distinguishing cancerous and non-cancerous samples. (2) ONN4MST, which uses biome ontologies to trace microbial sources of samples whose niches were previously poorly studied or unknown, detecting microbial contaminants. ONN4MST can distinguish samples from ontologically similar biomes, thus offering a quantitative way to characterize the evolution of the human gut microbial community. Both examples demonstrate high performance and interpretability, making them valuable tools for analyzing and interpreting big data in biology.
- Conference Article
4
- 10.1109/itre.2005.1503151
- Jun 27, 2005
Biological data exists all over the world in the form of various Web services, which provide biologists with much useful information. However, heterogeneous data formats present a technical hurdle that prevents biologists from taking full advantage of this information, and powerful tools are needed to handle this issue. Grid technology can give common biology tools high performance and high throughput. Even so, the data formats produced by the various biology tools are heterogeneous, and integrating information from heterogeneous biological data is complex and difficult. This paper describes an approach to solving this problem by using XML technologies combined with the BioGrid system. We integrate BioJava with our system to translate data into XML format. Finally, an example is used to illustrate how these techniques can come together in integrating heterogeneous biological data sources.
- Research Article
37
- 10.1016/j.inffus.2019.02.008
- Feb 23, 2019
- Information Fusion
iFusion: Towards efficient intelligence fusion for deep learning from real-time and heterogeneous data
- Research Article
4
- 10.1016/j.parco.2023.103039
- Jul 22, 2023
- Parallel Computing
A flexible sparse matrix data format and parallel algorithms for the assembly of finite element matrices on shared memory systems