An enhanced MQRBF-FD method with parallel computing and multiscale modeling for efficient elastic wave propagation


Similar Papers
  • Research Article
  • Cited by: 3
  • 10.12694/scpe.v3i2.189
KPROC—An Instruction Systolic Architecture for Parallel Prefix Applications
  • Jan 1, 2000
  • Scalable Computing Practice and Experience
  • Bertil Schmidt + 1 more

The KPROC (KiloPROCessor) architecture is the first implementation of a parallel computer with 1024 floating-point processors on a single chip. It strictly follows the concept of an instruction systolic array. The modular organisation allows either for building large arrays of many KPROC chips or for speeding up small machines with a single KPROC as a coprocessor. This paper presents the concept of this parallel computer model as well as the architectural details of the processor design. It is shown that this computer model allows for efficient implementation of parallel prefix computations. A large variety of applications from different areas is presented to demonstrate how parallel prefix computations can be used as key operations for deriving efficient implementations on the KPROC.
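To make the term "parallel prefix computation" concrete, here is a minimal sketch of an inclusive prefix scan in the Hillis-Steele style. It is simulated sequentially in plain Python and is not the KPROC implementation; the function name `inclusive_scan` and the sample data are assumptions chosen for illustration.

```python
from operator import add


def inclusive_scan(values, op=add):
    """Inclusive prefix scan in the Hillis-Steele style, simulated sequentially.

    Each pass of the while loop stands for one parallel step in which every
    position i >= stride combines its value with the one at i - stride.
    `op` must be associative; it does not need to be commutative.
    """
    result = list(values)
    stride = 1
    while stride < len(result):
        previous = result[:]  # snapshot: all "processors" read the old values
        for i in range(stride, len(result)):
            result[i] = op(previous[i - stride], previous[i])
        stride *= 2  # the number of steps grows as log2(n)
    return result


if __name__ == "__main__":
    print(inclusive_scan([3, 1, 4, 1, 5, 9, 2, 6]))  # running sums
    print(inclusive_scan([2, 7, 1, 8], max))         # running maxima
```

On real parallel hardware the inner loop would run concurrently across processors, so n elements need about log2(n) steps rather than n.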

  • Dissertation
  • 10.17918/etd-3913
A novel approach to data-driven modeling of damage-induced elastic wave propagation
  • Jun 1, 2012
  • Daniel Paul Servansky + 1 more

Current research into the simulation of elastic stress wave propagation utilizes user-imposed energy functions to drive the required energy changes for the production of elastic waves that propagate through the continuum model. This thesis proposes a novel approach to theoretically investigate the creation and propagation of elastic stress waves in a computational model by linking "experimental data-driven" quasi-static crack growth simulations with dynamic simulations for transient elastic wave propagation. The quasi-static simulations are used to determine both the displacement, strain and stress fields associated with crack initiation and the rate at which the crack will grow. As these elastic fields change over time as a function of crack growth increments, the dynamic model is used to capture the changes in the stored energy that lead to new equilibrium states for a developing crack, as well as the transient path followed to achieve this structural evolution. Within this energy balance lies the production and propagation of elastic stress waves associated with energy released by the crack growth. In the computational models, specific locations are selected for monitoring in- and out-of-plane displacement, velocity and acceleration. Such data generated by the model and captured by numerical sensors are analyzed in both time and frequency domains and are compared to related experimental measurements. The computationally generated transient elastic stress waves that propagate through the model produce the equivalent of what is known as Acoustic Emissions, providing in this way an innovative approach to directly link fracture mechanics with theory related to nondestructive testing. The research findings in this thesis are expected to contribute towards the design of more efficient strategies for the fundamental understanding of the fracture process, as well as for the reliable damage monitoring in structural health monitoring applications.

  • Research Article
  • Cited by: 1
  • 10.1016/j.cpc.2023.108899
A new version of PyWolf for the propagation of partially coherent light in media other than free space
  • Aug 30, 2023
  • Computer Physics Communications
  • Tiago E.C Magalhães + 1 more


  • Research Article
  • 10.1142/s2591728524500221
Long-Range Hydroacoustic Propagation Modelling Schemes on Distributed Memory Parallel Computers
  • Feb 22, 2025
  • Journal of Theoretical and Computational Acoustics
  • Noriyuki Kushida + 1 more

The complex nature of the ocean environment requires advanced computational acoustic models to gain insights into the dominating physical factors controlling underwater sound propagation and scattering in the ocean. In this study, we explore the “ab-initio” approach to solving wave equations with numerical algorithms that can be implemented on distributed memory parallel computers. The goal is to improve the calculation speed so that long-range, global-scale hydroacoustic wave propagation can be studied more efficiently and effectively. Two major algorithms are investigated: the finite difference time domain (FDTD) method and the Parabolic Equation (PE) method. Two PE-based numerical models are considered. The first is the Split-Step-Fourier Parabolic Equation (SSFPE) model, which uses split-step Fourier schemes in three-dimensional (3D) environments. The second is a multi-frequency implementation of the Range-dependent Acoustic Model (RAM) that computes two-dimensional (2D) sound pressure fields. One of the primary challenges in global-scale hydroacoustic propagation modeling with the “ab-initio” approach is the demand for significant computational resources, in both memory and computational speed. To address this, we employed parallel computing techniques using directive-based programming languages, such as OpenACC and XcalableACC, to leverage the power of multiple Graphics Processing Unit (GPU) systems and Central Processing Unit (CPU) cluster computers. Our performance evaluation revealed substantial speedup gains. For the FDTD method, we achieved approximately 3.5-fold and 4-fold speedups with four GPUs and four CPU cluster nodes, respectively. With the 3D SSFPE model, we obtained a speedup of 3.7-fold using four CPU cluster nodes. A significant result was observed with the 2D broadband multi-frequency RAM model, where we achieved a 110-fold speedup compared with the original single-core implementation. These results demonstrate the promising potential of massively parallel computers for ocean acoustic propagation modeling and highlight the significant performance benefits of utilizing parallel computing techniques. Our findings emphasize the importance of efficient computational strategies.
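The reported speedups can be put in context by converting them to parallel efficiency, defined as speedup divided by the number of workers. The short sketch below does this arithmetic for the three figures in the abstract that state a worker count. The helper name `parallel_efficiency` is an assumption, and the 110-fold RAM result is left out because the abstract does not say how many workers produced it.

```python
def parallel_efficiency(speedup, workers):
    """Parallel efficiency E = S / p for speedup S on p workers."""
    return speedup / workers


# (label, speedup, workers) as quoted in the abstract; speedups are approximate.
reported = [
    ("FDTD, four GPUs", 3.5, 4),
    ("FDTD, four CPU cluster nodes", 4.0, 4),
    ("3D SSFPE, four CPU cluster nodes", 3.7, 4),
]

if __name__ == "__main__":
    for label, speedup, workers in reported:
        print(f"{label}: {parallel_efficiency(speedup, workers):.1%}")
```

An efficiency close to 100% means the added devices are almost fully used. Lower values point to communication or synchronization overhead.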

  • Book Chapter
  • Cited by: 1
  • 10.1002/0471732710.ch4
Bulk‐Synchronous Parallelism: An Emerging Paradigm of High‐Performance Computing
  • Sep 2, 2005
  • Alexander Tiskin

This chapter contains sections titled: The BSP Model; BSP Programming; Conclusion; Reference. Parallel computers are a powerful tool of modern science and engineering. A parallel computer may have tens, hundreds or thousands of processors, making parallel computation inherently more complex than single-processor computation. Much effort has been spent trying to tackle this complexity, both in theory and in practice. One of the most important recent advances is the model of bulk-synchronous parallel (BSP) computation, proposed in 1990 by L. Valiant. Thanks to its elegance and simplicity, the BSP model has now become one of the mainstream research areas in parallel computing, as well as a firm foundation for language and library design. In this chapter, we survey the state of the art in computation models and programming tools based on the BSP model.
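As a rough illustration of how the BSP model is used for analysis, the sketch below evaluates the commonly cited BSP cost of a superstep, w + h·g + l, where w is the largest local work on any processor, h is the largest number of messages any processor sends or receives, g is the communication cost per message, and l is the barrier synchronization cost. This formula is the standard textbook form and is not quoted from the abstract above; the function names and sample numbers are assumptions.

```python
def superstep_cost(local_work, h_relation, g, l):
    """Cost of one BSP superstep: max local work + h * g + l.

    local_work -- per-processor computation costs in this superstep
    h_relation -- per-processor message counts (max of sent and received)
    g          -- communication cost per message (assumed machine parameter)
    l          -- barrier synchronization cost (assumed machine parameter)
    """
    return max(local_work) + max(h_relation) * g + l


def program_cost(supersteps, g, l):
    """Total cost of a BSP program given as a list of (local_work, h_relation)."""
    return sum(superstep_cost(w, h, g, l) for w, h in supersteps)


if __name__ == "__main__":
    # Hypothetical three-processor program with two supersteps.
    steps = [([10, 12, 8], [2, 3, 1]), ([5, 5, 6], [0, 1, 1])]
    print(program_cost(steps, g=4, l=20))
```

Because the slowest processor and the busiest communication link determine each superstep, balancing both work and message traffic is what lowers the total.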

  • Conference Article
  • Cited by: 3
  • 10.1145/2795122.2795130
Towards teaching embedded parallel computing
  • Jun 13, 2015
  • Zain Ul-Abdin + 1 more

Embedded electronic systems are finding increased applications in our daily life. In order to meet the application demands in embedded systems, parallel computing is used. This paper emphasizes teaching of the specific issues of parallel computing that are critical to embedded systems. We propose an analytical approach to deliver declarative and functioning knowledge for learning in the field of computer science and engineering, with a special focus on Embedded Parallel Computing (EPC). We describe the teaching of a course with a focus on how parallel computing can be used to enhance performance and improve energy efficiency of embedded systems. The teaching methods include interactive lectures with web-based course literature, seminars, lab exercises, and home-assigned practical tasks. Further, the course is intended to give a general insight into current research and development in regard to parallel architectures and computation models. Since the course is an advanced-level course, the students are expected to have a basic knowledge of the fundamentals of computer architecture and their common programming methodologies. The course puts emphasis on hands-on experience with embedded parallel computing. Therefore it includes an extensive laboratory and project part, in which a state-of-the-art manycore embedded computing system is used. We believe that undertaking these methods in succession will prepare the students for both research and professional careers.

  • Book Chapter
  • 10.1017/9781316795835.003
Basic Models of Parallel Computation
  • Nov 30, 2016
  • Zbigniew J Czech

THE SHARED MEMORY MODEL. A model is a theoretical or physical object whose analysis or observation allows for exploration of another real object or process. A model represents an explored object in a simplified manner by taking into account only its basic features. Owing to these simplifications, models are easier to analyze than the corresponding real objects. The subject of our interest is models of computers that enable the study of computational processes executed within those computers. These models, called models of computation, are helpful when analyzing and designing algorithms, as well as in determining performance metrics used for evaluation of algorithms (see Section 3.1). Models of computation should not be associated with a specific computer architecture, or with a class of such architectures. In other words, their independence from hardware is essential. Another essential feature is versatility, which ensures that algorithms developed under these models can be implemented and run on computers with different architectures. This is particularly important in the field of parallel computing, where the diversity of architectures is high. As a result of this diversity, several models have been advanced. Unfortunately, because of the relatively large number of requirements that a model should satisfy, some of them in conflict with each other, none of the models developed so far has become a generally accepted model of parallel computation. The most frequently used are the shared memory model (or parallel random access machine model, PRAM) and the network model. They correspond to parallel computation conducted with the use of shared memory and by sending messages over some communication network. Before discussing these models, we will present a sequential model of computation underlying the PRAM model. 2.1.1 The RAM model. A widely accepted model of sequential computation is the machine with random access memory (RAM). The model consists of a processor and memory containing a potentially infinite number of cells M_i for i = 1, 2, 3, … (Figure 2.1). In each memory cell, identified by its address i, a finite value expressed in binary, perhaps very large, can be stored. The model assumes that the time to read a value from (or write a value to) a cell M_i is constant and equal to unit time, regardless of the cell address.

  • Research Article
  • 10.12694/scpe.v3i2.185
Unconventional Parallel Architectures
  • Jan 1, 2000
  • Scalable Computing Practice and Experience
  • Yakov I Fet + 1 more

Looking at the history of computing, one finds that designers have repeatedly turned to specialized (non-von Neumann) architectures, which provide comparatively high performance in some specific applications. Suffice it to say that one of the earliest supercomputer projects, S. Unger's system for pattern recognition, was a rectangular array of interconnected simple processing modules. At the beginning of the 1960s, an original concept of a reconfigurable cellular structure called the Homogeneous Computing Medium was proposed in Russia. It is worth noting that this promising idea has recently been continued in the MIT Raw project, which is based on the newest technology. Great consideration was given to associative memories and fine-grained SIMD architectures. Thus, in the late 1980s the massively parallel fine-grained systems DAP (S. Reddaway) and CM (D. Hillis) posed serious competition in the market to supercomputers of conventional architecture. In the 1990s, powerful microprocessors of essentially traditional architecture took the dominant position, owing to the rapid progress in microelectronics and the corresponding dramatic decrease in component cost. Nevertheless, the well-known rule remains valid: by taking into account the specific features of a given class of problems and computational methods, one can obtain a great improvement in performance and in the cost/performance factor by using application-specific unconventional architectures. After publishing the CFP for this Special Issue of SCPE, we received 10 papers submitted by 19 authors from various countries. The present Issue publishes the five papers that obtained the highest ratings from reviewers. These papers introduce some interesting trends in the development of unconventional parallel computing systems. 
The first paper, Reaction-Diffusion and Excitable Processors: a Sense of Unconventional by Andrew Adamatzky, is devoted to a very important and promising subject: parallel data processing in chemical media. So-called architecture-less processors are discussed, in which a layer of an oscillating reaction is considered a massively parallel system, each elementary processor of which is represented by a micro-volume reactor. Cellular automata are used as a computational model. The author demonstrates that excitable media can constitute universal massively parallel computing devices. In the second paper, A Scalable Multiprocessor for Real-Time Signal Processing Applications, Hans Eberle and Daniel Scherrer present a new platform for real-time processing of continuous data streams such as audio, video or graphical data. To overcome the limitations of existing DSPs, the authors designed a flexible and scalable communication infrastructure, Switcherland, which guarantees data transfer with bounded latencies. The advantages of the proposed multiprocessor open up a wide range of applications in real-time signal processing. Bernard Girau, in his paper Conciliating Connectionism and Parallel Digital Hardware, presents an original paradigm for digital hardware implementation of neural computations, Field Programmable Neural Arrays (FPNAs). Both theoretical and practical aspects are discussed, including the mapping of FPNAs onto FPGAs. It is shown that the suggested FPNA framework results in efficient fine-grained reconfigurable parallel hardware. Bertil Schmidt and Manfred Schimmler describe, in their paper KPROC—an Instruction Systolic Architecture for Parallel Prefix Applications, a highly parallel architecture with 1024 floating-point processors on a single chip (in 0.25 micron CMOS technology). These processors constitute an instruction systolic array implementing parallel prefix computation. It is shown that many important applications can take advantage of the suggested model. 
Examples are discussed from different areas such as image processing, long argument arithmetic, etc. Finally, researchers from the Computer Technology Institute (Patras, Greece), C. Konstantinopulos, A. Svolos, D. Serpanos, and D. Maritsas, in the paper The Effect of Interword Connectivity in Associative Processing, bring forward an important development of associative processors. They introduce an associative memory with a single-stage hypercube network interconnecting the memory words. Significant performance improvement of this architecture is demonstrated through the analysis of various real applications. We wish to express our deep gratitude to the reviewers (D. Sima, F. Vajda, K. Grosspietsch, P. Szolgay, E. Luque, A. Pimentel, D. Ortega, M. Valero, O. Bandman, V. Malyugin, P. Keresztes, M. Amamiya, Y. Fet, R. Moore, K. Waldschmidt, S. Theodoridis, L. Ricci, A. Ripoll, S-W. Lee, J-L. Gaudiot, V. Estigneev, S. Sedukhin) who helped us to select the best papers and to improve the quality of the accepted papers. Guest Editors: Yakov Fet and Peter Kacsuk

  • Research Article
  • Cited by: 13
  • 10.1016/j.eswa.2005.11.034
Parallelizing evolutionary computation: A mobile agent-based approach
  • Jan 11, 2006
  • Expert Systems with Applications
  • Wei-Po Lee


  • Book Chapter
  • 10.1007/978-3-540-36668-3_88
A Mobile Agent Approach to Support Parallel Evolutionary Computation
  • Jan 1, 2006
  • Wei-Po Lee

To enhance the performance of evolutionary algorithms, different parallel computation models have been proposed, and they have been implemented on parallel computers to speed up the computation. Instead of using expensive parallel computing facilities, in this paper we propose to implement parallel evolutionary computation models on easily available networked PCs, and present a multi-agent framework to support parallelism. To evaluate the proposed approach, different kinds of experiments have been conducted to assess the developed system and the preliminary results show the efficiency of our approach.

  • Conference Article
  • Cited by: 593
  • 10.5555/1873601.1873677
A model of computation for MapReduce
  • Jan 17, 2010
  • Howard Karloff + 2 more

In recent years the MapReduce framework has emerged as one of the most widely used parallel computing platforms for processing data on terabyte and petabyte scales. Used daily at companies such as Yahoo!, Google, Amazon, and Facebook, and adopted more recently by several universities, it allows for easy parallelization of data-intensive computations over many machines. One key feature of MapReduce that differentiates it from previous models of parallel computation is that it interleaves sequential and parallel computation. We propose a model of efficient computation using the MapReduce paradigm. Since MapReduce is designed for computations over massive data sets, our model limits the number of machines and the memory per machine to be substantially sublinear in the size of the input. On the other hand, we place very loose restrictions on the computational power of any individual machine: our model allows each machine to perform sequential computations in time polynomial in the size of the original input. We compare MapReduce to the PRAM model of computation. We prove a simulation lemma showing that a large class of PRAM algorithms can be efficiently simulated via MapReduce. The strength of MapReduce, however, lies in the fact that it uses both sequential and parallel computation. We demonstrate how algorithms can take advantage of this fact to compute an MST of a dense graph in only two rounds, as opposed to Ω(log(n)) rounds needed in the standard PRAM model. We show how to evaluate a wide class of functions using the MapReduce framework. We conclude by applying this result to show how to compute some basic algorithmic problems such as undirected s-t connectivity in the MapReduce framework.
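The interleaving of sequential and parallel computation described above can be pictured with a single MapReduce round: a map phase applied to each record independently, a shuffle that groups values by key, and a reduce phase that processes each key's values sequentially. The sketch below simulates one such round in a single Python process. It is not the formal model from the paper, and the names `map_reduce_round`, `word_mapper`, and `sum_reducer` are assumptions made for illustration.

```python
from collections import defaultdict


def map_reduce_round(records, mapper, reducer):
    """Simulate one MapReduce round on a single machine.

    records -- iterable of (key, value) pairs
    mapper  -- function (key, value) -> iterable of (key, value) pairs
    reducer -- function (key, list_of_values) -> iterable of (key, value) pairs
    """
    groups = defaultdict(list)
    for key, value in records:                  # map: records are independent
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)   # shuffle: group values by key
    output = []
    for key in sorted(groups):                  # reduce: sequential per key
        output.extend(reducer(key, groups[key]))
    return output


def word_mapper(_line_number, line):
    """Emit (word, 1) for every word in a line of text."""
    return [(word, 1) for word in line.split()]


def sum_reducer(word, counts):
    """Add up the counts collected for one word."""
    return [(word, sum(counts))]


if __name__ == "__main__":
    lines = [(0, "a b a"), (1, "b c")]
    print(map_reduce_round(lines, word_mapper, sum_reducer))
```

In a real deployment the map calls and the per-key reduce calls would run on different machines; here they run in one loop only to show the data flow.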

  • Research Article
  • Cited by: 11
  • 10.1007/s11227-009-0319-0
Fast and highly scalable parallel computations for fundamental matrix problems on distributed memory systems
  • Jul 29, 2009
  • The Journal of Supercomputing
  • Keqin Li

We present fast and highly scalable parallel computations for a number of important and fundamental matrix problems on distributed memory systems (DMS). These problems include matrix multiplication, matrix chain product, and computing the powers, the inverse, the characteristic polynomial, the determinant, the rank, the Krylov matrix, and an LU- and a QR-factorization of a matrix, and solving linear systems of equations. Our highly scalable parallel computations for these problems are based on a highly scalable implementation of the fastest sequential matrix multiplication algorithm on DMS. We show that compared with the best known parallel time complexities on parallel random access machines (PRAM), the most powerful but unrealistic shared memory model of parallel computing, our parallel matrix computations achieve the same speeds on distributed memory parallel computers (DMPC), and have an extra polylog factor in the time complexities on DMS with hypercubic networks. Furthermore, our parallel matrix computations are fully scalable on DMPC and highly scalable over a wide range of system size on DMS with hypercubic networks. Such fast (in terms of parallel time complexity) and highly scalable (in terms of our definition of scalability) parallel matrix computations were rarely seen before on any distributed memory systems.

  • Research Article
  • 10.12694/scpe.v5i1.262
What is the GRID
  • Jan 1, 2002
  • Scalable Computing Practice and Experience
  • Ami Marowka

In 1998, Ian Foster and Carl Kesselman, together with thirty distinguished experts in high-performance computing and networking, laid the foundations of a new computing model called GRID. Their vision was introduced in the book The GRID: Blueprint for a New Computing Infrastructure (see review, SCPE Vol. 3, No. 3). The editors wrote: … A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities… allowing new classes of applications to emerge… as yet, we have only a preliminary understanding of what these new applications will look like. Over the next five years, Grid computing became the hottest R&D field. Academia and national labs built experimental infrastructures, new startups were established to develop Grid technologies, and the HPC giant vendors joined the party. Each one pushed in a different direction based on its own beliefs about what Grid computing should look like and how. It was only a matter of time until people started to ask what the Grid is all about. A survey of ten Grid experts produced ten different answers. The question remains without a clear answer, leaving many people confused. Ian Foster picked up the gauntlet. In his article, What is the GRID? A three point checklist (GRIDtoday, July 22, 2002: Vol. 1, No. 6), he redefines the Grid model to reflect the trends and changes in the field during the past five years. According to Foster, a Grid is a system that: (1) coordinates resources that are not subject to centralized control; (2) uses standard, open, general-purpose protocols and interfaces; and (3) delivers a nontrivial quality of service. But Foster's new definition could not encompass all of the Grid's approaches. 
Moreover, Foster explained by example what the Grid is not: A cluster management system such as Sun's Sun Grid Engine, Platform's Load Sharing Facility, or Viridian's Portable Batch System can, when installed on a parallel computer or local area network, deliver quality of service guarantees and thus constitutes a powerful Grid resource. However, such a system is not a Grid itself, due to its centralized control of the hosts that it manages. Within a week the responses started to come in. The first one was from Wolfgang Gentzsch, Director of Grid computing at Sun Microsystems (GRIDtoday, August 5, 2002: Vol. 1, No. 8). Gentzsch wrote: I very much like Ian's two earlier definitions. Combined, they may provide a very generic Grid definition. The new checklist, however, reduces the potential of a wide and rich variety of Grids, with different architectures, operational models, and applications, and may miss the chance of becoming widely accepted as a standard definition of the Grid. Why do we need a definition for the Grid? Can the Grid be defined? Is a definition necessary for the Grid's development? The Grid is an open vision and thus cannot be defined. An attempt to define the Grid can only reach the foreseen technology horizon, and this horizon is only one to two years ahead. Grid computing is a paradigm in shift. New architectures and standards are changing the shape and direction of the Grid's development. In the past year one such prominent architecture was Open Grid Services Architecture (OGSA). The OGSA is the first attempt to bridge the gap between two cyberspace worlds: the Internet and the Grid. The Internet and the Grid are two different computing models. There are people who believe that the Grid is the next generation of the Internet and the World Wide Web, saying that the Great Global Grid is still many years away. Others state that the Grid will complement, not replace, the Internet as we know it. 
In 1998, when the Grid blueprint was published, a new Internet language appeared called eXtensible Markup Language (XML). XML is a development tool for a new Internet computing model known as Web Services. At that time, nobody understood that there was a relationship between Web Services and the Grid. Three years passed before Steve Tuecke from Argonne National Laboratory started to write a spec for Grid Services. Later, people from IBM joined the mission and together they created OGSA. The OGSA specification outlines interfaces to grid computing software that comply with Web Services standards. If adopted, Grid services such as job scheduling, authentication, failure-detection, staging of applications and data, and migration of applications and data, will all be accessed through standard Web Services architecture. OGSA is basically where Grid meets Web Services and may be the first step towards full integration of the Internet and the Grid. The Internet appeared a decade ago. In the beginning, the Net offered primitive applications: a mailing system, ftp services, and a web of home pages. Nobody asked, What is the Internet? The years passed and new Internet technologies were developed. Today we can talk on the phone over the Net, chat with friends, and video conference with colleagues on the other side of the world. New computing and business models were developed: e-commerce, e-learning, e-science, and e-you-name-it. And nobody asks what the Internet is. The Grid is a super-model, a bag of networking models associated with many types of architectures, software packages, applications, and standards. Today, when people talk about the Grid they mention Data Grids, Science Grids, and Campus Grids. And at the same time they mention Farm computing, Peer-to-peer computing, and Utility computing, among others. Ten years from now many of today's technologies will disappear and new ones will be developed. 
I don't know how Grid computing will look a decade from now, but I know that we will have g-commerce, g-learning, g-science, and g-chat. Nobody will ask what the Grid is. It will be obvious, just as the Internet is today. Ami Marowka The Computer Aided Design Laboratory School of Engineering and Computer Science Hebrew University of Jerusalem

  • Research Article
  • 10.48175/ijarsct-23085
A Descriptive Study on Interconnection Networks for Parallel Computing and Algorithm Models in Parallel Computing
  • Jan 25, 2025
  • International Journal of Advanced Research in Science, Communication and Technology
  • Er Rajbhan Singh + 1 more

In parallel computing, interconnection networks are crucial for efficient communication among the processors of a system. Parallel computing has become a central topic in computer science and is critical to research in high-performance computing. Over the last few decades, computer architectures have evolved towards larger numbers of nodes, making parallelism the approach of choice for speeding up an algorithm. Efficient data transfer between processors is an essential component of any large-scale parallel computation. Motivated by the growing interest in parallel computers, a significant amount of theoretical research has been devoted to interconnection networks for parallel computers, most of it to the packet routing (or store-and-forward) model of communication. We survey some of the major developments in this field and discuss several new alternative models of communication. In many large-scale applications, communication time dominates the execution time of the whole parallel computation. Thus, the performance of a large-scale parallel computer is highly correlated with the efficiency of its network and communication algorithm. Combining processing units to build a model of computation (circuits) has gained an essential place in high-performance computing (HPC) because of its configurability and its considerable processing power, whether the units are arranged in parallel, in series, etc. The aim of this paper is to study the idea of parallel computing and its programming models, and to explore some theoretical and technical concepts that are often needed to understand interconnection networks. In particular, we show how this technology is beginning to assist the field of computational physics, especially when the problem is data parallel. As a real-life example of parallel computing, consider two queues for tickets: if two cashiers serve two customers simultaneously, this saves time and reduces complexity.

  • Research Article
  • 10.1007/bf02575730
Parallel models of computation: an introductory survey
  • Jun 1, 1989
  • Calcolo
  • M Leoncini

The paper gives an overview of some models of computation which have proved successful in laying a foundation for a general theory of parallel computation. We present three models of parallel computation, namely boolean and arithmetic circuit families, and Parallel Random Access Machines. They represent different viewpoints on parallel computing: boolean circuit families are useful for in-depth theoretical studies on the power and limitations of parallel computers; Parallel Random Access Machines are the most general vehicles for designing highly parallel algorithms; arithmetic circuit families are an important tool for undertaking studies related to one of the most active areas in parallel computing, i.e. parallel algebraic complexity.
