Related Topics
Articles published on Parallel software
716 Search results
- Research Article
- 10.37394/232014.2025.21.15
- Jul 30, 2025
- WSEAS TRANSACTIONS ON SIGNAL PROCESSING
- Veska Gancheva
Image quality is of crucial importance in modern digital technologies. This paper analyzes the computational challenges in image processing, with an emphasis on improving performance through parallel computing. The focus is on the implementation of efficient parallel models and software solutions using filtering techniques. Within the study, a filter-based parallel model is created and tested through a multi-threaded parallel software implementation. This implementation applies a set of filters to a list of compressed images and generates output results for each filter, which allows an analysis of their effectiveness. The Roberts, Binary Threshold, Black and White, and UV filters are selected for their broad applicability across domains. Scalability analyses and performance evaluations show that the proposed parallel computing model is highly scalable and can be adapted to different hardware configurations. The article also presents an overview of alternative edge detection algorithms (Sobel, Canny, Prewitt), classical noise reduction approaches (Gaussian Blur, Median Filtering), a comparison between multi-threaded CPU implementations and GPU-based parallel computing, and an application of machine learning methods for automated image filtering and segmentation, which opens new possibilities for future research.
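The fan-out pattern the abstract describes — a thread pool applying each filter to each image and collecting one result per (image, filter) pair — can be sketched as below. The filter bodies and the grayscale-pixel-list image representation are simplified stand-ins, not the paper's implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def binary_threshold(pixels, cutoff=128):
    # Map each grayscale pixel to 0 or 255 depending on the cutoff.
    return [255 if p >= cutoff else 0 for p in pixels]

def black_and_white(pixels):
    # Identity on grayscale input; placeholder for an RGB-to-gray conversion.
    return list(pixels)

def apply_filters(images, filters):
    # Fan (image, filter) pairs out across a thread pool, mirroring the
    # idea of generating one output result per filter per image.
    tasks = [(name, f, img) for img in images for name, f in filters.items()]
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda t: (t[0], t[1](t[2])), tasks)
    return list(results)

filters = {"threshold": binary_threshold, "bw": black_and_white}
images = [[10, 200, 130], [0, 255, 128]]
out = apply_filters(images, filters)
```

Because `Executor.map` preserves task order, the outputs can still be grouped per image even though the filters run concurrently.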
- Research Article
- 10.1016/j.future.2024.107694
- May 1, 2025
- Future Generation Computer Systems
- Jianguo Liang + 3 more
Parallel software design of large-scale diamond-structured crystals molecular dynamics simulation
- Research Article
- 10.52783/jisem.v10i38s.6835
- Apr 22, 2025
- Journal of Information Systems Engineering and Management
- Omar Antonio Hernández Duany
The real-time capture of multiple IP video streams is of great importance for the development of heterogeneous computer vision systems, because it is the first stage in managing the many processes that rely on visual analysis for decision making in different organizational environments. This process demands a high number of operations to guarantee that several concurrent video streams are managed without losses and under restrictive timing conditions. To this end, a parallel and scalable software module has been designed to optimize the use of computational resources and capture a greater number of concurrent video streams. The parallel algorithm is based on shared- and distributed-memory programming paradigms. It improves the efficiency with which the network adapter architecture is used and integrates all of the processing cores available in the computational node, relying on cooperation between processors and nodes on a high-performance computing infrastructure to promote the scalability of the system. Notably, the video capture module is applicable across several application domains and makes it possible to increase the number of concurrently processed video streams in line with the available hardware architecture.
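The shared-memory side of such a capture module is essentially a producer-consumer pattern: one worker per stream feeding a shared, thread-safe queue. A minimal sketch follows, with hypothetical integer "frames" standing in for real decoded video data; the paper's module (network adapters, distributed nodes) is far more involved:

```python
import queue
import threading

def capture(stream_id, out_q, n_frames=3):
    # Producer: one worker thread per video stream (frames are hypothetical).
    for i in range(n_frames):
        out_q.put((stream_id, i))

def capture_streams(n_streams, n_frames=3):
    # Fan out one capture thread per stream; a shared queue collects all
    # frames without loss, echoing the module's shared-memory cooperation.
    q = queue.Queue()
    threads = [threading.Thread(target=capture, args=(s, q, n_frames))
               for s in range(n_streams)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [q.get() for _ in range(n_streams * n_frames)]

frames = capture_streams(3)
```

`queue.Queue` handles the locking, so producers never drop or duplicate frames regardless of thread interleaving.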
- Research Article
- 10.1002/acm2.70095
- Apr 16, 2025
- Journal of applied clinical medical physics
- Jiayi Liang + 13 more
The adapt-to-shape (ATS) workflow on the Unity MR-Linac (Elekta AB, Stockholm, Sweden) allows for full replanning, including recontouring and reoptimization. Additional complexity is added when the adaptation involves the use of MIM Maestro (MIM Software, Cleveland, OH) in conjunction with Monaco (Elekta AB, Stockholm, Sweden). Given the interplay of various systems and the inherent complexity of the ATS workflow, a risk analysis would be instructive. A failure modes and effects analysis (FMEA) following Task Group 100 was completed to evaluate the ATS workflow. A multi-disciplinary team was formed for this analysis. The team created a process map detailing the steps involved in ATS, treating both the standard Monaco workflow and a workflow using MIM software in parallel. From this, failure modes were identified and scored in three categories (likelihood of occurrence, severity, and detectability, whose product forms a risk priority number), and mitigations were then found for the top 20th percentile of failure modes. The risk analysis found 264 failure modes in the ATS workflow. Of those, 82 were high-ranking failure modes in the top 20th percentile for risk priority number and severity scores. Although high-ranking failure modes were identified at each step of the process, 62 of them were found in the contouring and planning steps, highlighting key differences from adapt-to-position (ATP), where the importance of these steps is minimized. Mitigations are suggested for all high-ranking failure modes. The flexibility of the ATS workflow, which enables reoptimization of the treatment plan, also introduces potential critical points where errors can occur. There are more opportunities for error in ATS that can create unintentionally negative dosimetric impact. FMEA can help mitigate these risks by identifying and addressing potential failure points in the ATS process.
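The FMEA scoring step described above reduces to simple arithmetic: multiply occurrence, severity, and detectability into a risk priority number (RPN), then keep the top 20th percentile. A sketch with entirely hypothetical failure modes and scores:

```python
def risk_priority(modes):
    # RPN = occurrence x severity x detectability, as in TG-100-style FMEA.
    return [(name, o * s * d) for name, (o, s, d) in modes.items()]

def top_percentile(scored, pct=0.20):
    # Keep the highest-RPN failure modes (top 20% here, as in the study).
    ranked = sorted(scored, key=lambda t: t[1], reverse=True)
    k = max(1, int(len(ranked) * pct))
    return ranked[:k]

modes = {  # hypothetical failure modes with (O, S, D) scores
    "wrong contour propagated": (4, 9, 5),
    "plan not synced": (3, 7, 4),
    "structure label mismatch": (2, 5, 3),
    "dose recalculated late": (5, 8, 6),
    "wrong patient selected": (1, 10, 2),
}
flagged = top_percentile(risk_priority(modes))
```

Note that ranking by RPN alone can bury low-occurrence, high-severity modes, which is why the study also ranked by severity score separately.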
- Research Article
- 10.1063/5.0252566
- Mar 25, 2025
- The Journal of chemical physics
- Emir Kocer + 6 more
Machine learning potentials allow performing large-scale molecular dynamics simulations with about the same accuracy as electronic structure calculations, provided that the selected model is able to capture the relevant physics of the system. For systems exhibiting long-range charge transfer, fourth-generation machine learning potentials need to be used, which take global information about the system and electrostatic interactions into account. This can be achieved in a charge equilibration step, but the direct solution of the set of linear equations results in an unfavorable cubic scaling with system size, making this step computationally demanding for large systems. In this work, we propose an alternative approach that is based on the iterative solution of the charge equilibration problem (iQEq) to determine the atomic partial charges. We have implemented the iQEq method, which scales quadratically with system size, in the parallel molecular dynamics software LAMMPS for the example of a fourth-generation high-dimensional neural network potential (4G-HDNNP) intended to be used in combination with the n2p2 library. The method itself is general and applicable to many different types of fourth-generation MLPs. An assessment of the accuracy and the efficiency is presented for a benchmark system of FeCl3 in water.
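The charge-equilibration step the abstract discusses amounts to solving a linear system for the atomic partial charges; iterating with matrix-vector products (one O(N²) product per iteration) is what replaces the cubic-scaling direct solve. A generic conjugate-gradient sketch on a hypothetical 2x2 "hardness" matrix — not the actual iQEq/4G-HDNNP machinery in LAMMPS:

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    # Iteratively solve A x = b for symmetric positive-definite A.
    # Each iteration costs one matrix-vector product (O(N^2)), the source
    # of the quadratic scaling mentioned in the abstract.
    n = len(b)
    x = [0.0] * n
    r = b[:]          # residual b - A x (x = 0 initially)
    p = r[:]
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        if rs_new < tol:
            break
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x

# Toy hardness matrix and right-hand side (hypothetical values).
A = [[2.0, 0.5], [0.5, 1.0]]
b = [1.0, 0.3]
q = conjugate_gradient(A, b)
```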
- Research Article
- 10.1063/5.0241560
- Feb 25, 2025
- The Journal of chemical physics
- Jiří Janek
Molecular dynamics (MD) simulations applied to various special tasks such as thermostats, barostats, external volume control, or external magnetic fields often require tailored integration methods, complicating software development. Here, we propose a unified integration scheme for solving the common sets of equations of motion in MD. To achieve this, we adapted the traditional SHAKE method for treating rigid bonds to a predictor-corrector integration scheme and combined it with the existing time-reversible velocity predictor and box predictor. This new approach enables using both the time-honored Verlet method and the recently improved Gear methods. The resulting unified integration scheme was tested on two simple models and two MD systems: SPC/E water and ionic liquid (1-ethyl-3-methylimidazolium tetrafluoridoborate). Trajectories in microcanonical, Nosé-Hoover canonical, and MTK isobaric ensembles were generated. The results for Verlet-based methods were in excellent agreement with those obtained using conventional integration methods. For particular tasks, the use of higher-order methods can be beneficial. Overall, in comparison with standard approaches, our universal scheme provides a significantly simpler route to devising new integrators and maintaining existing simulation software. Furthermore, our family of new integrators can be efficiently deployed in massively parallel MD software.
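The time-reversible Verlet method the unified scheme must reproduce can be sketched in a few lines; this is a textbook velocity-Verlet integrator on a 1-D harmonic oscillator, not the paper's predictor-corrector scheme with SHAKE constraints:

```python
def velocity_verlet(x, v, force, dt, n_steps, m=1.0):
    # Standard velocity-Verlet step: time-reversible and symplectic,
    # so total energy is conserved to O(dt^2) over long trajectories.
    a = force(x) / m
    traj = [x]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / m                 # force at new position
        v = v + 0.5 * (a + a_new) * dt       # velocity half-step average
        a = a_new
        traj.append(x)
    return traj, v

# Harmonic oscillator with k = 1: initial energy is 0.5*x0^2 = 0.5.
traj, v = velocity_verlet(1.0, 0.0, lambda x: -1.0 * x, 0.01, 1000)
energy = 0.5 * v * v + 0.5 * traj[-1] ** 2
```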
- Research Article
- 10.3390/math13050734
- Feb 24, 2025
- Mathematics
- Abdullah Sevin
The Internet of Things is used in many application areas of our daily lives. Ensuring the security of valuable data transmitted over the Internet is a crucial challenge. Hash functions are used in cryptographic applications such as integrity, authentication, and digital signatures. Existing lightweight hash functions leverage task parallelism but provide limited scalability. There is a need for lightweight algorithms that can efficiently utilize multi-core platforms or distributed computing environments with high degrees of parallelization. For this purpose, a data-parallel approach is applied to a lightweight hash function to achieve massively parallel software. A novel structure suitable for data-parallel architectures, inspired by basic tree construction, is designed. Furthermore, the proposed hash function is based on a lightweight block cipher and seamlessly integrated into the designed framework. The proposed hash function satisfies security requirements, exhibits high efficiency and achieves significant parallelism. Experimental results indicate that the proposed hash function performs comparably to the BLAKE implementation, with slightly slower execution for large message sizes but marginally better performance for smaller ones. Notably, it surpasses all other evaluated algorithms by at least 20%, maintaining a consistent 20% advantage over Grøstl across all data sizes. Regarding parallelism, the proposed PLWHF achieves a speedup of approximately 40% when scaling from one to two threads and 55% when increasing to three threads. Raspberry Pi 4-based tests for IoT applications have also been conducted, demonstrating the hash function's effectiveness in memory-constrained IoT environments. Statistical tests demonstrate a precision of ±0.004, validate the hypothesis in distribution tests and indicate a deviation of ±0.05 in collision tests, confirming the robustness of the proposed design.
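The tree construction that makes such a design data-parallel can be illustrated with a Merkle-style hash: leaves are compressed independently (in parallel), then combined pairwise. Here SHA-256 with domain-separation prefixes stands in for the paper's lightweight block-cipher-based compression function, which is not public in this listing:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def leaf_hash(chunk):
    # Leaf compression; the 0x00 prefix domain-separates leaves from nodes.
    return hashlib.sha256(b"\x00" + chunk).digest()

def node_hash(left, right):
    # Internal-node compression over two child digests.
    return hashlib.sha256(b"\x01" + left + right).digest()

def tree_hash(message, chunk_size=4):
    # Split the message into fixed-size chunks and hash leaves in parallel:
    # the data parallelism comes from the independent leaf compressions.
    chunks = [message[i:i + chunk_size]
              for i in range(0, len(message), chunk_size)] or [b""]
    with ThreadPoolExecutor() as pool:
        level = list(pool.map(leaf_hash, chunks))
    # Combine pairwise up the tree (kept sequential here for clarity),
    # duplicating the last digest when a level has an odd count.
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

The result is independent of the thread count, since `map` preserves leaf order.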
- Research Article
- 10.1109/jestpe.2024.3477743
- Feb 1, 2025
- IEEE Journal of Emerging and Selected Topics in Power Electronics
- Manh Tuan Tran + 6 more
A High-Performance GaN Power Module With Parallel Packaging for High-Current and Low-Voltage Traction Inverter Applications
- Research Article
- 10.1103/physrevresearch.7.013067
- Jan 17, 2025
- Physical Review Research
- Omer Rathore + 3 more
With the advent of exascale computing, effective load balancing in massively parallel software applications is critically important for leveraging the full potential of high-performance computing systems. Load balancing is the distribution of computational work between available processors. Here, we investigate the application of quantum annealing to load balance two paradigmatic algorithms in high-performance computing. Namely, adaptive mesh refinement and smoothed particle hydrodynamics are chosen as representative grid and off-grid target applications. While the methodology for obtaining real simulation data to partition is application specific, the proposed balancing protocol itself remains completely general. In a grid based context, quantum annealing is found to outperform classical methods such as the round robin protocol but lacks a decisive advantage over more advanced methods such as steepest descent or simulated annealing despite remaining competitive. The primary obstacle to scalability is found to be limited coupling on current quantum annealing hardware. However, for the more complex particle formulation, approached as a multiobjective optimization, quantum annealing solutions are demonstrably Pareto dominant to state of the art classical methods across both objectives. This signals a noteworthy advancement in solution quality which can have a large impact on effective CPU usage. Published by the American Physical Society 2025
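The classical baselines mentioned above are easy to make concrete: round-robin assigns work items cyclically regardless of cost, while a greedy heuristic (longest-processing-time, in the spirit of the stronger classical methods) tracks loads. A minimal sketch with hypothetical per-item costs:

```python
def round_robin(costs, n_procs):
    # Assign work items cyclically, ignoring their cost.
    bins = [[] for _ in range(n_procs)]
    for i, c in enumerate(costs):
        bins[i % n_procs].append(c)
    return bins

def greedy_lpt(costs, n_procs):
    # Longest-processing-time heuristic: always give the next-largest
    # item to the currently lightest processor.
    loads = [0.0] * n_procs
    bins = [[] for _ in range(n_procs)]
    for c in sorted(costs, reverse=True):
        k = loads.index(min(loads))
        bins[k].append(c)
        loads[k] += c
    return bins

def imbalance(bins):
    # Spread between the heaviest and lightest processor load.
    loads = [sum(b) for b in bins]
    return max(loads) - min(loads)

costs = [8, 7, 6, 5, 4, 3, 2, 1]   # hypothetical work-item costs
rr = round_robin(costs, 2)
gl = greedy_lpt(costs, 2)
```

The quantum-annealing formulation in the paper casts the same assignment as a (multi-objective) optimization rather than a heuristic pass.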
- Research Article
- 10.1051/rdne/2025002
- Jan 1, 2025
- Research & Design of Nuclear Engineering
- Shengtao Cao + 5 more
The safety of nuclear power structures is very important. A refined finite element model benefits the dynamic analysis of nuclear power structures but challenges computational efficiency. In this paper, a self-developed finite element software package based on a multi-GPU parallel explicit algorithm (GFE) is first introduced. Using GFE, the dynamic response of nuclear power structures under seismic load, considering soil-structure interaction (SSI), is analyzed and compared with commercial software. The results obtained from GFE are consistent with those obtained from the commercial software, while GFE has higher computational efficiency: its calculation time for the seismic analysis is about 1/7 that of the general commercial software.
- Research Article
- 10.24215/16666038.24.e11
- Oct 18, 2024
- Journal of Computer Science and Technology
- Adrián Pousa + 3 more
High Performance Computing (HPC) applies different techniques to complex or large-volume applications, relying on both parallel software and hardware, to reduce their execution time compared to running them on a single computer. On the other hand, Quantum Computing (QC) emerges as a new paradigm that leverages the properties of quantum mechanics for computation. QC has an inherently parallel nature and is expected to solve some problems faster than classical computing. This paper carries out a bibliographic review to examine the points of view of different authors regarding the relationship between HPC and QC. The objective is to determine the trend of this relationship: will QC replace classical HPC, or will they complement each other? If they are complementary tools, the aim is also to answer: how could they be integrated, and how will users access these resources?
- Research Article
- 10.2514/1.a35945
- Sep 24, 2024
- Journal of Spacecraft and Rockets
- Destiny M Fawley + 4 more
An electrical conductivity database for continuum flow in a CO2 atmosphere over a 70° sphere-cone was created using the Data Parallel Line Relaxation computational fluid dynamics software to inform development of future magnetohydrodynamic subsystems at Venus and Mars. Sixteen freestream conditions were considered at Mars with atmospheric relative velocities from 5 to 8 km/s and altitudes between 20 and 80 km. Sixteen freestream conditions were considered at Venus with atmospheric relative velocities from 9 to 12 km/s and altitudes between 85 and 115 km. Results indicate that the total electrical conductivity in the flow volume always increases as velocity increases. At low velocities, the electrical conductivity is higher at high altitudes, whereas, at high velocities, the electrical conductivity is higher at low altitudes. Three of the 80 km altitude computational fluid dynamics solutions show good agreement with direct simulation Monte Carlo results. In general, computational fluid dynamics predicts thinner shocks, higher electron number density, and similar vibrational temperatures to direct simulation Monte Carlo. The magnetohydrodynamic force was calculated at both Mars and Venus. Results indicate there may not be sufficient control authority to use magnetohydrodynamics as a trajectory control mechanism at Mars without artificially increasing the electrical conductivity of the flow, but there may be appreciable control authority for a drag-modulated aerocapture at Venus.
- Research Article
- 10.1088/1402-4896/ad6e29
- Sep 9, 2024
- Physica Scripta
- N Liu + 5 more
An efficient large-scale parallel computing program for blast effects on structures is developed based on a software platform mode, which adopts a component-based three-layer software architecture to achieve a disciplinary hierarchy. The parallel layer encapsulates high-performance data structures and shields large-scale parallel computing technology, enabling efficient mesh adaptivity. The common numerical algorithm layer encapsulates fluid, structural, and coupling algorithm processes, providing standardized interfaces for extending numerical schemes. The physical model layer provides extension interfaces for equations of state, source terms, initial and boundary values, etc., to facilitate quick customization of large-scale parallel software for blast effects on structures. Typical examples verify the correctness, effectiveness, and high precision of the proposed program for fluid-structure coupling problems. The parallel efficiency on 300,000 processor cores with hundreds of millions of grid cells reaches 60.84% for shock wave transmission. Finally, the dynamic behavior of multi-storey buildings under blast waves is simulated to reveal the damage mode of the overall collapse. These results suggest that the software holds significant potential for application in the field of explosion, particularly in intricate and large-scale scenarios.
- Research Article
- 10.1021/acs.jpca.4c04146
- Aug 23, 2024
- The journal of physical chemistry. A
- Jason N Byrd + 2 more
The task of developing high-performing parallel software must be made easier and more cost-effective in order to fully exploit existing and emerging large-scale computer systems for the advancement of science. The Super Instruction Architecture (SIA) is a parallel programming platform geared toward applications that need to manage large amounts of data stored in potentially sparse multidimensional arrays during calculations. The SIA platform was originally designed for the quantum chemistry software package ACESIII. More recently, the SIA was reimplemented to overcome the limitations in the original ACESIII program. It has now been successfully employed in the new ACES4 quantum chemistry software package. This paper describes the SIA and ACES4 and illustrates their capabilities with some difficult quantum chemistry open-shell coupled-cluster benchmark calculations.
- Research Article
- 10.1016/j.advengsoft.2024.103739
- Aug 3, 2024
- Advances in Engineering Software
- Xiangyu Liu + 5 more
Gridder-HO: Rapid and efficient parallel software for high-order curvilinear mesh generation
- Research Article
- 10.1016/j.cageo.2024.105640
- Jun 6, 2024
- Computers and Geosciences
- Marino Vetuschi Zuccolini + 2 more
PHREESQL: A toolkit to efficiently compute and store geochemical speciation calculation
- Research Article
- 10.1680/jgele.23.00085
- Jun 1, 2024
- Géotechnique Letters
- S Bandera + 3 more
This paper outlines a new approach that uses coarse-grained molecular dynamics (CGMD) and the Gay–Berne (GB) potential to simulate the compression of kaolinite saturated with water at an acidic pH (pH = 4) in a low (1 mM) ion concentration solution. To overcome the limitations of the standard GB potential and capture the charge heterogeneity on the surface of kaolinite particles under acidic pH conditions, each clay platelet is modelled using a two-ellipsoid composite particle. The molecular dynamics software LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) was employed to generate virtual monodisperse samples containing 1000 composite particles and to simulate isotropic compression at 100 kPa. The observed macro-scale response in void ratio–effective stress space lay above the response obtained in a simulation that used an equivalent CGMD model developed to simulate alkaline (pH = 8) pore water conditions. This is in qualitative agreement with available experimental data for one-dimensional compression. A post-compression qualitative observation of two virtual samples revealed a book-house-type fabric in the sample with acidic pore fluid, whereas a turbostratic-type fabric was observed when an alkaline pore fluid was simulated. These observations are also in qualitative agreement with scanning electron microscopy data reported in the literature.
- Research Article
- 10.1016/j.cpc.2024.109246
- May 16, 2024
- Computer Physics Communications
- Marcin Rogowski + 6 more
Unlocking massively parallel spectral proper orthogonal decompositions in the PySPOD package
- Research Article
- 10.47526/2024-1/2524-0080.09
- Mar 30, 2024
- Q A Iasaýı atyndaǵy Halyqaralyq qazaq-túrіk ýnıversıtetіnіń habarlary (fızıka matematıka ınformatıka serııasy)
- N.M Zhunissov + 2 more
This article explores the development of parallel software applications in the Python programming language. Parallel programming is becoming increasingly important in information technology as multi-core processors and distributed computing become more common. Python provides developers with a variety of tools and libraries for creating parallel applications, including threads, processes, and asynchronous programming. The article covers the basics of parallel programming in Python, including the principles of thread and process management, error handling, synchronization mechanisms, and resource management. It also considers asynchronous programming using the asyncio library, which allows asynchronous tasks to be handled efficiently. In addition, it raises issues of optimization and profiling of parallel applications, explores distributed parallel programming using third-party libraries and frameworks, and emphasizes the importance of testing and debugging in the context of parallel programming. Research and experiments in parallel programming with Python help developers create high-performance, efficient applications that make effective use of multi-core systems and distributed computing. The article offers an in-depth study of how suitable Python is for teaching parallel programming to inexperienced students. The results show that there are obstacles that prevent Python from maintaining its advantages in the transition from sequential to parallel programming.
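The two approaches named in the abstract — thread pools and asyncio — can be contrasted in a few lines. This is a generic stdlib illustration, not code from the article; `square` is a trivial stand-in for a blocking task:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def square(n):
    # Trivial stand-in for a blocking or I/O-bound task.
    return n * n

def run_threaded(values):
    # Thread pool: suits blocking I/O; the GIL limits CPU-bound speedup,
    # which is one of the Python obstacles the article alludes to.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(square, values))

async def run_async(values):
    # asyncio: cooperative concurrency in a single thread.
    async def task(n):
        await asyncio.sleep(0)   # yield control, simulating awaitable I/O
        return n * n
    return await asyncio.gather(*(task(n) for n in values))

threaded = run_threaded(range(5))
asynced = asyncio.run(run_async(range(5)))
```

For CPU-bound work, `concurrent.futures.ProcessPoolExecutor` sidesteps the GIL at the cost of inter-process serialization.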
- Research Article
- 10.62517/jhet.202415225
- Mar 1, 2024
- Journal of Higher Education Teaching
- Juanyan Xu
Starting from a re-interpretation of the word "composition", this paper sets the teaching goal of the "Three Compositions" course as "feeling, abstraction, creation and expression". The main contemporary problems encountered in this course are the demands to integrate the three major modules, innovate teaching methods, improve teaching output, extend teaching content to professional design, and resolve the contradiction between "high requirements" and "reduced class hours". Based on these problems, this article redesigns the overall teaching framework of the "Three Compositions" and, on that basis, expounds the implementation path for the course's ability-training goals: online and offline integration, hand-painting and software in parallel operation, internal integration and external extension, and gradual ability training. It also shares some specific implementation methods and teaching cases.