An Analysis of the Computational Complexity and Efficiency of Various Algorithms for Solving a Nonlinear Model of Radon Volumetric Activity with a Fractional Derivative of a Variable Order

Abstract

The article presents a study of the computational complexity and efficiency of various parallel algorithms that implement the numerical solution of the equation in the hereditary α(t)-model of radon volumetric activity (RVA) in a storage chamber. As a test example, a problem based on such a model is solved, which is a Cauchy problem for a nonlinear fractional differential equation with a Gerasimov–Caputo derivative of a variable order and variable coefficients. Such equations arise in problems of modeling anomalous RVA variations. Anomalous RVA can be considered one of the short-term precursors to earthquakes as an indicator of geological processes. However, the mechanisms of such anomalies are still poorly understood, and direct observations are impossible. This determines the importance of such mathematical modeling tasks and, therefore, of effective algorithms for their solution. This subsequently allows us to move on to inverse problems based on RVA data, where it is important to choose the most suitable algorithm for solving the direct problem in terms of computational resource costs. An analysis and an evaluation of various algorithms are based on data on the average time taken to solve a test problem in a series of computational experiments. To analyze effectiveness, the acceleration, efficiency, and cost of algorithms are determined, and the efficiency of CPU thread loading is evaluated. The results show that parallel algorithms demonstrate a significant increase in calculation speed compared to sequential analogs; hybrid parallel CPU–GPU algorithms provide a significant performance advantage when solving computationally complex problems, and it is possible to determine the optimal number of CPU threads for calculations. 
For the sequential and parallel algorithms implementing the numerical solution, asymptotic complexity estimates are given; they show that, for most of the proposed implementations, the complexity tends to n^2 in both computation time and memory consumption.
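
The acceleration, efficiency, and cost mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions (acceleration T_1 / T_p, efficiency equal to acceleration divided by the thread count p, cost p * T_p); the article may define them differently. The function names and the timing values are hypothetical and are not taken from the article.

```python
def parallel_metrics(t_seq: float, t_par: float, threads: int) -> dict:
    """Return acceleration (speedup), efficiency and cost for one parallel run.

    t_seq   : average run time of the sequential algorithm, in seconds
    t_par   : average run time of the parallel algorithm on `threads` threads
    threads : number of CPU threads used by the parallel run
    """
    if t_seq <= 0 or t_par <= 0 or threads < 1:
        raise ValueError("times must be positive and threads must be >= 1")
    speedup = t_seq / t_par          # how many times faster than sequential
    efficiency = speedup / threads   # fraction of ideal linear speedup
    cost = threads * t_par           # total thread-seconds consumed
    return {"speedup": speedup, "efficiency": efficiency, "cost": cost}


def best_thread_count(avg_times: dict) -> int:
    """Pick the thread count with the smallest average run time."""
    return min(avg_times, key=avg_times.get)


if __name__ == "__main__":
    # Hypothetical average timings (seconds) keyed by thread count.
    avg_times = {1: 12.0, 2: 6.5, 4: 3.6, 8: 2.4}
    for p, t in avg_times.items():
        print(p, parallel_metrics(avg_times[1], t, p))
    print("fastest thread count:", best_thread_count(avg_times))
```

Choosing the fastest configuration is only one possible criterion; a thread count could equally be chosen by maximizing efficiency or minimizing cost, depending on which resource matters most.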

Similar Papers
  • Research Article
  • Citations: 17
  • 10.3390/fractalfract6030163
Application of the Fractional Riccati Equation for Mathematical Modeling of Dynamic Processes with Saturation and Memory Effect
  • Mar 16, 2022
  • Fractal and Fractional
  • Dmitriy Tverdyi + 1 more

In this study, the model Riccati equation with variable coefficients as functions, as well as a derivative of a fractional variable order (VO) of the Gerasimov-Caputo type, is used to approximate the data for some physical processes with saturation. In particular, the proposed model is applied to the description of solar activity (SA), namely the number of sunspots observed over the past 25 years. It is also used to describe data from Johns Hopkins University on coronavirus infection COVID-19, in particular data on the Russian Federation and the Republic of Uzbekistan. Finally, it is used to study issues related to seismic activity, in particular, the description of data on the volumetric activity of Radon (RVA). The Riccati equation used in the mathematical model was numerically solved by constructing an implicit finite difference scheme (IFDS) and its implementation by the modified Newton method (MNM). The calculated curves obtained in the study are compared with known experimental data. It is shown that if the model parameters are chosen appropriately, the model curves will give results that correlate well with real experimental data. Moreover, with other parameters of the model, it is possible to make some prediction about the possible course of the considered processes.

  • Research Article
  • Citations: 63
  • 10.1109/21.3463
Efficient parallel algorithms for robot forward dynamics computation
  • Jan 1, 1988
  • IEEE Transactions on Systems, Man, and Cybernetics
  • C.S.G Lee + 1 more

Two efficient parallel algorithms for computing the forward dynamics for real-time simulation were developed for implementation on a single-instruction multiple-data-stream (SIMD) computer with n processors, where n is the number of degrees of freedom of the manipulator. The first parallel algorithm, based on the composite rigid-body method, generates the inertia matrix using the parallel Newton-Euler algorithm, the parallel linear recurrence algorithm, and the modified row-sweep algorithm, and then inverts the inertia matrix to obtain the joint acceleration vector at time t. The time complexity of this parallel algorithm is of the order O(n^2) with O(n) processors. The second parallel algorithm, based on the conjugate gradient method, computes the joint acceleration with a time complexity of O(n) for multiplication operation and O(n log_2 n) for addition operation. The interprocessor communication problem for the implementation of the proposed parallel algorithms on SIMD machines is also discussed and analyzed.

  • Conference Article
  • Citations: 12
  • 10.1109/sc.2016.31
An Efficient and Scalable Algorithmic Method for Generating Large-Scale Random Graphs
  • Nov 1, 2016
  • Maksudul Alam + 3 more

Many real-world systems and networks are modeled and analyzed using various random graph models. These models must incorporate relevant properties such as degree distribution and clustering coefficient. Many models, such as the Chung-Lu (CL), stochastic Kronecker, stochastic block model (SBM), and block two-level Erdős-Rényi (BTER) models, have been devised to capture those properties. However, the generative algorithms for these models are mostly sequential and take a prohibitively long time to generate large-scale graphs. In this paper, we present a novel time- and space-efficient algorithmic method to generate random graphs using the CL, BTER, and SBM models. First, we present an efficient sequential algorithm and an efficient distributed-memory parallel algorithm for the CL model. Our sequential algorithm takes O(m) time and O(Λ) space, where m and Λ are the number of edges and distinct degrees, and our parallel algorithm takes O(m/P + Λ + P) time w.h.p. and O(Λ) space using P processors. These algorithms are almost time optimal, since any sequential and parallel algorithms need at least Ω(m) and Ω(m/P) time, respectively. Our algorithms outperform the best known previous algorithms by a significant margin in terms of both time and space. Experimental results on various large-scale networks show that our sequential and parallel algorithms require 400–15000 times less memory than the existing sequential and parallel algorithms, respectively, making our algorithms suitable for generating very large-scale networks. Moreover, both of our algorithms are about 3–4 times faster than the existing sequential and parallel algorithms. Finally, we show how our algorithmic method also leads to efficient parallel and sequential algorithms for the SBM and BTER models.

  • Conference Article
  • Citations: 5
  • 10.5555/3014904.3014947
An efficient and scalable algorithmic method for generating large-scale random graphs
  • Nov 13, 2016
  • Mohammad Monzurul Alam + 3 more

  • Research Article
  • Citations: 42
  • 10.1007/s00025-012-0269-3
Initial Value Problems of Fractional Order with Fractional Impulsive Conditions
  • Jul 11, 2012
  • Results in Mathematics
  • Nickolai Kosmatov

In this paper we intend to accomplish two tasks: first, we address some basic errors in several recent results involving impulsive fractional equations with the Caputo derivative, and, second, we study initial value problems for nonlinear differential equations with the Riemann–Liouville derivative of order 0 < α ≤ 1 and the Caputo derivatives of order 1 < δ < 2. In both cases, the corresponding fractional derivative of lower order is involved in the formulation of impulsive conditions.

  • Research Article
  • Citations: 7
  • 10.3390/math11153358
Hybrid GPU–CPU Efficient Implementation of a Parallel Numerical Algorithm for Solving the Cauchy Problem for a Nonlinear Differential Riccati Equation of Fractional Variable Order
  • Jul 31, 2023
  • Mathematics
  • Dmitrii Tverdyi + 1 more

The numerical solution of fractional dynamics problems can create a high computational load, which makes efficient algorithms for their solution necessary. The main contribution to this computational load comes from heredity (memory), that is, the dependence of the current value of the solution function on its previous values in the time interval. Mathematically, the heredity is described here by a fractional differentiation operator of variable order in the Gerasimov–Caputo sense. As an example, we consider the Cauchy problem for the nonlinear fractional Riccati equation with non-constant coefficients. An efficient parallel implementation is proposed for the known sequential non-local explicit finite-difference numerical scheme. This implementation is a hybrid one, since it uses both GPU and CPU computational nodes. The program code of the parallel implementation is written in C and CUDA C and is developed using the OpenMP and CUDA software and hardware architectures. This paper presents a study of the computational efficiency of the proposed parallel algorithm based on data from a series of computational experiments performed on an NVIDIA DGX Station computing server. The average computation time is analyzed in terms of running time, acceleration, efficiency, and the cost of the algorithm. As a result, it is shown on test examples that the hybrid version of the numerical algorithm can give a significant performance increase of 3–5 times in comparison with both the sequential version of the algorithm and the OpenMP implementation.
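
The cost of heredity described in this abstract can be illustrated with a minimal sketch. It uses the standard L1 finite-difference approximation of a Gerasimov–Caputo derivative of order 0 < α < 1 on a uniform grid, with the variable order frozen at the current time t_k; this is an assumption made for illustration and is not necessarily the scheme used in the paper. The function name `caputo_l1` and the operation counter are hypothetical.

```python
import math


def caputo_l1(u, tau, alpha):
    """L1 approximation of a Gerasimov-Caputo derivative on a uniform grid.

    u     : list of samples u[0..N] taken at t_k = k * tau
    tau   : grid step
    alpha : callable returning the order alpha(t), assumed to lie in (0, 1)

    Returns (values, ops), where values[k] approximates the derivative at
    t_k (values[0] is left at 0.0) and ops counts the memory-sum terms.
    """
    n_total = len(u) - 1
    values = [0.0] * (n_total + 1)
    ops = 0
    for k in range(1, n_total + 1):
        a = alpha(k * tau)                      # order frozen at the current time
        scale = tau ** (-a) / math.gamma(2.0 - a)
        acc = 0.0
        # Memory sum: step k revisits all k previous increments of u.
        for j in range(k):
            weight = (j + 1) ** (1.0 - a) - j ** (1.0 - a)
            acc += weight * (u[k - j] - u[k - j - 1])
            ops += 1
        values[k] = scale * acc
    return values, ops


if __name__ == "__main__":
    tau, n = 0.01, 100
    u = [k * tau for k in range(n + 1)]         # test function u(t) = t
    values, ops = caputo_l1(u, tau, lambda t: 0.5)
    print(values[n], ops)
```

Because step k touches k earlier values, N steps require N(N + 1)/2 memory-sum terms, which is the quadratic growth that motivates parallelizing the inner loop. With a variable order the weights depend on the current step, so in this sketch they are recomputed at every step rather than precomputed once.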

  • Book Chapter
  • Citations: 6
  • 10.1090/dimacs/015/13
Implementation of parallel graph algorithms on the MasPar
  • Jun 7, 1994
  • Tsan-Sheng Hsu + 2 more

Graphs play an important role in modeling the underlying structure of many real world problems. Over the past couple of decades, efficient sequential algorithms have been developed for several graph problems and have been implemented on sequential machines. The NETPAD system at Bellcore is a general tool for graph manipulations and algorithm design that facilitates such implementations. More recently, several research results on efficient parallel algorithms have been developed, but not much implementation has been done. We have implemented some of the parallel algorithms for basic graph problems on the massively parallel machine MasPar, and we have interfaced these algorithms with NETPAD. In this paper, we give a description of our implementation together with some performance data. We also describe the interface that we have built between our library of parallel graph algorithms and the NETPAD system.

  • Research Article
  • Citations: 6
  • 10.3390/fractalfract7020183
On the 1st-Level General Fractional Derivatives of Arbitrary Order
  • Feb 12, 2023
  • Fractal and Fractional
  • Yuri Luchko

In this paper, the 1st-level general fractional derivatives of arbitrary order are defined and investigated for the first time. We start with a generalization of the Sonin condition for the kernels of the general fractional integrals and derivatives and then specify a set of the kernels that satisfy this condition and possess an integrable singularity of the power law type at the origin. The 1st-level general fractional derivatives of arbitrary order are integro-differential operators of convolution type with the kernels from this set. They contain both the general fractional derivatives of arbitrary order of the Riemann–Liouville type and the regularized general fractional derivatives of arbitrary order considered in the literature so far. For the 1st-level general fractional derivatives of arbitrary order, some important properties, including the 1st and the 2nd fundamental theorems of fractional calculus, are formulated and proved.

  • Research Article
  • Citations: 15
  • 10.1007/s10958-021-05254-0
Hereditary Riccati Equation with Fractional Derivative of Variable Order
  • Feb 12, 2021
  • Journal of Mathematical Sciences
  • D A Tvyordyj

The Riccati differential equation with a fractional derivative of variable order is considered. A derivative of variable fractional order in the original equation implies the hereditary property of the medium, i.e., the dependence of the current state of a dynamic system on its previous states. A software package called Numerical Solution of a Fractional-Differential Riccati Equation (NSFDRE for short) was created; it allows one to compute a numerical solution of the Cauchy problem for the Riccati differential equation with a derivative of variable fractional order. The numerical algorithm implemented in the software is based on the approximation of the variable-order derivative by finite differences and the subsequent solution of the corresponding nonlinear algebraic system. New distribution modes depending on the specific type of variable order of the fractional derivative were obtained. We also show that some distribution curves are specific for other hereditary dynamic systems.

  • Research Article
  • Citations: 2
  • 10.1080/00102202.2018.1549039
Performance of Parallel Chemistry Acceleration Algorithm in Simulations of Gaseous Detonation: Effects of Fuel Type and Numerical Scheme Resolution
  • Dec 3, 2018
  • Combustion Science and Technology
  • Jintao Wu + 2 more

Gaseous detonation is regarded as a promising combustion mode for novel detonation engines. High-fidelity numerical simulations of gaseous detonation require a detailed chemical mechanism and a high-resolution numerical scheme, and therefore incur high computational costs. In the present work, a parallel algorithm based on the storage/deletion method is selected to accelerate the chemistry computations in numerical simulations of two-dimensional gaseous detonation wave propagation. The effects of fuel type (chemical reaction mechanism) and numerical scheme resolution on the acceleration performance of the selected parallel algorithm are studied. Two gaseous fuels are chosen for the simulations: hydrogen, with a smaller chemical reaction mechanism, and ethylene, with a larger one. For the numerical scheme, the weighted essentially non-oscillatory (WENO) scheme is employed with fifth-order and ninth-order resolutions. The parallel algorithm was found to provide satisfactory computational accuracy and efficiency for all simulations in the study. The fuel type (chemical reaction mechanism) has a more pronounced influence on computational efficiency, while the numerical scheme resolution has a relatively minor effect throughout the simulations. The hydrogen chemistry has a higher speedup ratio than the ethylene chemistry at the early stage of the simulations, but the speedup ratios of both fuels finally converge to a similar level at the later stage. The optimal speedup ratio of 4.67 is obtained for the case with the ethylene reaction mechanism (larger size) and the ninth-order WENO scheme (higher resolution) at the end of the computations. Furthermore, the balance and synchronization of table operations among different data tables in the parallel algorithm are analyzed. Neither balance nor synchronization alone determines the acceleration performance; the two jointly play important roles in accelerating the simulations of gaseous detonation.

  • Research Article
  • Citations: 14
  • 10.1016/j.chaos.2021.111040
Unique existence of solution to initial value problem for fractional differential equation involving with fractional derivative of variable order
  • Jul 1, 2021
  • Chaos, Solitons & Fractals
  • Shuqin Zhang + 1 more

  • Research Article
  • Citations: 25
  • 10.3390/rs12203308
Fast Reconstruction of 3D Point Cloud Model Using Visual SLAM on Embedded UAV Development Platform
  • Oct 12, 2020
  • Remote Sensing
  • Fang Huang + 5 more

In recent years, the rapid development of unmanned aerial vehicle (UAV) technologies has made data acquisition increasingly convenient, and three-dimensional (3D) reconstruction has emerged as a popular subject of research in this context. These 3D models have many advantages, such as the ability to represent realistic scenes and a large amount of information. However, traditional 3D reconstruction methods are expensive and require long and complex processing. As a result, they cannot respond rapidly in time-sensitive applications, e.g., responses to natural disasters such as earthquakes and debris flows. Computer vision-based simultaneous localization and mapping (SLAM), along with hardware development based on embedded systems, can provide a solution to this problem. Based on an analysis of the principle and implementation of the visual SLAM algorithm, this study proposes a method to quickly reconstruct a dense 3D point cloud model on a UAV platform combined with an embedded graphics processing unit (GPU). The main contributions are as follows: (1) To resolve the contradiction between the resource limitations and the computational complexity of visual SLAM on UAV platforms, the technologies needed to compute resource allocation, communication between nodes, and data transmission and visualization in an embedded environment were investigated to achieve real-time data acquisition and processing. Visual monitoring to this end is also designed and implemented. (2) To solve the problem of time-consuming algorithmic processing, a corresponding parallel algorithm was designed and implemented based on the parallel programming framework of the compute unified device architecture (CUDA). (3) The visual odometer and methods of 3D “map” reconstruction were designed using a monocular vision sensor to implement the prototype of the fast 3D reconstruction system.
Based on preliminary results of the 3D modeling, the following was noted: (1) The proposed method was feasible. By combining UAV, SLAM, and parallel computing, a simple and efficient 3D reconstruction model of an unknown area was obtained for specific applications. (2) The parallel SLAM algorithm used in this method improved the efficiency of the SLAM algorithm. On the one hand, the SLAM algorithm required 1/6 of the time taken by the structure-from-motion algorithm. On the other hand, the speedup obtained using the parallel SLAM algorithm based on the embedded GPU on our test platform was 7.55 times that of the serial algorithm. (3) The depth map results show that the proportion of effective pixels with an error of less than 15 cm is close to 60%.

  • Book Chapter
  • Citations: 4
  • 10.1007/978-3-030-29614-8_3
Cauchy Type Problems
  • Jan 1, 2019
  • Trifce Sandev + 1 more

We now analyze Cauchy type problems for differential equations of fractional order with Hilfer and Hilfer-Prabhakar derivative operators. The existence and uniqueness theorems for n-term nonlinear fractional differential equations with Hilfer fractional derivatives of arbitrary orders and types will be proved. Cauchy type problems for integro-differential equations of Volterra type with a generalized Mittag-Leffler function in the kernel will be considered as well. Using the operational method of Mikusinski, the solution of a Cauchy type problem for a linear n-term fractional differential equation with Hilfer fractional derivatives will be obtained. We will show the utility of the operational method for solving Cauchy type problems for a wide class of integro-differential equations with variable coefficients, involving the Prabhakar integral operator and Laguerre derivatives. For this purpose, following some recent works, we choose examples which, by means of fractional derivatives, generalize well-known ordinary differential equations and partial differential equations related to time fractional heat equations, the free electron laser equation, some evolution and boundary value problems, and finally some Cauchy type problems for the generalized fractional Poisson process.

  • Conference Article
  • Citations: 1
  • 10.1145/3357384.3358062
Efficient Sequential and Parallel Algorithms for Estimating Higher Order Spectra
  • Nov 3, 2019
  • Zigeng Wang + 4 more

Higher order spectra (HOS) are a powerful tool in nonlinear time series analysis and they have been extensively used as feature representations in data mining, communications and cosmology domains. However, HOS estimation suffers from high computational cost and memory consumption. Any algorithm for computing the kth order spectra on a dataset of size n needs O(n^(k-1)) time since the output size will be O(n^(k-1)) as well, which makes the direct HOS analysis difficult for long time series, and further prohibits its direct deployment to resource-limited and time-sensitive applications. Existing algorithms for computing HOS are either inefficient or have been implemented on obsolete architectures. Thus it is essential to develop efficient generic algorithms for HOS estimations. In this paper, we present a package of generic sequential and parallel algorithms for computationally and memory efficient HOS estimations which can be employed on any parallel machine or platform. Our proposed algorithms largely reduce the HOS' computational cost and memory usage in spectrum multiplication and smoothing steps through carefully designed prefix sum operations. Moreover, we employ a matrix partitioning technique and design algorithms with optimal memory usage and present the parallel approaches on the PRAM and the mesh models. Furthermore, we implement our algorithms for both bispectrum and trispectrum estimations. We conduct extensive experiments and cross-compare the proposed algorithms' performance. Results show that our algorithms achieve state-of-the-art computational and memory efficiency, and our parallel algorithms achieve close to linear speedups. The code is available at https://github.com/ZigengWang/HOS.

  • Research Article
  • 10.1002/cpe.3910
Embedded multicore computing and applications
  • Aug 2, 2016
  • Concurrency and Computation: Practice and Experience
  • Frédéric Magoulès + 3 more
