Articles published on Matrix Vector
3494 Search results
- Research Article
- 10.1103/kzyj-j3g8
- May 3, 2026
- Physical Review Research
- Anonymous
Decoherence-free quantum error mitigation by density matrix vectorization
- Research Article
- 10.3390/fractalfract10050290
- Apr 24, 2026
- Fractal and Fractional
- Yiyin Liang + 1 more
In this paper, an efficient and accurate framework for nonlinear spacetime fractional diffusion equations is proposed. The methods are based on the spectral deferred correction technique, which employs a compact difference scheme as the preconditioner via the Picard integral collocation formulation. The nonlinear term is incorporated into the preconditioner in a way similar to linear systems without using Newtonian methods. The preconditioner is proven to be a stable operator, and the resulting spectral deferred correction method maintains an arbitrary order of accuracy and excellent stability. Because the central finite difference approximation of the fractional Laplacian (−Δ)^s yields a dense matrix, a dual accelerated algorithm for the exact computation of the matrix–vector product is presented by introducing the discrete sine transform. The numerical results demonstrate that the proposed new methods are highly efficient and precise.
- Research Article
- 10.3390/informatics13040060
- Apr 14, 2026
- Informatics
- Armando Arce + 5 more
This study presents a deep learning-based framework for beam pattern synthesis in optimized uniform linear antenna arrays, combining Differential Evolution–based pre-optimization with recurrent neural network (RNN) modeling. Radiation patterns are first generated to satisfy sidelobe suppression and directivity constraints and are then used to train recurrent models that learn the mapping between radiation patterns and complex excitation parameters. A formal mathematical formulation of the Simple RNN, Gated Recurrent Unit (GRU), and Long Short-Term Memory (LSTM) architectures is provided, together with a per–time-step computational cost analysis based on dominant matrix–vector multiplications. A comparative evaluation under identical training conditions shows that gated architectures significantly outperform the standard RNN. Although the LSTM achieves the lowest prediction errors, the GRU attains comparable performance with reduced structural complexity. Beam pattern synthesis experiments for unseen steering directions demonstrate accurate reconstruction of main lobe alignment, sidelobe levels (approximately −12 to −13 dB), and directivity values close to 8 dB. The floating-point operations (FLOPs) analysis indicates that the GRU requires fewer dominant operations per time step than the LSTM, potentially reducing computational cost and energy consumption in resource-constrained beamforming applications.
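The per-time-step cost comparison described above can be illustrated with a minimal sketch. It assumes the textbook formulations, in which a Simple RNN has one input-to-hidden and one hidden-to-hidden weight matrix, a GRU has three such pairs, and an LSTM has four. The function names and the choice to count only matrix–vector multiply-accumulates (ignoring biases and element-wise operations) are assumptions of this sketch, not the paper's exact accounting.

```python
# Number of (input-to-hidden, hidden-to-hidden) weight-matrix pairs per cell
# in the standard formulations: RNN has 1, GRU has 3, LSTM has 4.
GATE_BLOCKS = {"rnn": 1, "gru": 3, "lstm": 4}


def dominant_macs_per_step(cell: str, input_size: int, hidden_size: int) -> int:
    """Multiply-accumulates spent in matrix-vector products for one time step."""
    blocks = GATE_BLOCKS[cell]
    # Each block multiplies a (hidden x input) matrix by the input vector and
    # a (hidden x hidden) matrix by the previous hidden state.
    return blocks * hidden_size * (input_size + hidden_size)


def dominant_flops_per_step(cell: str, input_size: int, hidden_size: int) -> int:
    """Count one multiply and one add per multiply-accumulate."""
    return 2 * dominant_macs_per_step(cell, input_size, hidden_size)


if __name__ == "__main__":
    for cell in ("rnn", "gru", "lstm"):
        print(cell, dominant_flops_per_step(cell, input_size=8, hidden_size=16))
```

Under these assumptions the GRU needs three quarters of the LSTM's dominant operations per step, which is consistent with the qualitative conclusion reported in the abstract.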
- Research Article
- 10.1007/s00607-026-01658-5
- Apr 13, 2026
- Computing
- Andrés E Tomás + 6 more
The sparse matrix–vector multiplication (SpMV) kernel is a key building block in scientific and engineering applications, forming the core of many iterative solvers for linear systems and eigenvalue problems. Due to its low arithmetic intensity and irregular memory access patterns, SpMV remains memory-bound on modern architectures, making its efficient implementation particularly challenging. This paper presents vectorized SpMV routines for RISC-V processors with SIMD support, exploiting the RISC-V Vector Extension (RVV 1.0). We implement and evaluate three storage formats—CSR (Compressed Sparse Row), SELL-p (a vector-friendly variant of ELLPACK), and JDS (Jagged Diagonal Storage)—providing low-level implementations that leverage RVV intrinsics. Performance is assessed on two commercial RISC-V platforms (CanMV-K230 and BananaPi F3) with 128-bit and 256-bit vector registers, and on the EPAC research system featuring 16,384-bit vectors. Results show that the vectorized routines significantly outperform scalar baselines, achieving a variety of speed-ups depending on the format and architecture. These findings highlight the potential of open RISC-V architectures for high-performance sparse linear algebra and provide a foundation for future vector-aware sparse kernel optimizations.
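For readers unfamiliar with the CSR format named in this abstract, the following is a minimal scalar sketch of SpMV in plain Python. It is a reference illustration only, not the paper's RVV implementation; the function name `csr_spmv` and the array names are assumptions of this sketch.

```python
from typing import List, Sequence


def csr_spmv(values: Sequence[float], col_idx: Sequence[int],
             row_ptr: Sequence[int], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in Compressed Sparse Row form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The indirect read x[col_idx[k]] is the irregular memory access
        # that makes SpMV memory-bound.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


if __name__ == "__main__":
    # A = [[4, 0, 1],
    #      [0, 3, 0],
    #      [2, 0, 5]]
    values = [4.0, 1.0, 3.0, 2.0, 5.0]
    col_idx = [0, 2, 1, 0, 2]
    row_ptr = [0, 2, 3, 5]
    print(csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))  # [7.0, 6.0, 17.0]
```

A vectorized version would process several entries of the inner loop at once and gather the needed elements of `x`, which is where the choice of storage format starts to matter.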
- Research Article
- 10.1109/tte.2025.3642057
- Apr 1, 2026
- IEEE Transactions on Transportation Electrification
- Ehsan Majma + 4 more
This paper introduces a novel approach for diagnosing Inter-Turn Short Circuit (ITSC) faults in Permanent Magnet Synchronous Motors (PMSMs) using a two-layer Interactive Multiple Model (IMM) strategy tailored for real-time applications. The first layer employs an IMM-Constrained Extended Kalman Filter (CEKF) framework for fault detection, while the second layer uses an Extended Kalman Filter (EKF) framework for fault diagnosis. In the first layer, four models are employed: the healthy motor model and three models with ITSC faults in each of the PMSM phases. The short-circuit resistance is estimated as an augmented state alongside system states. Upon detecting a fault in one of the phases, the second layer estimates the number of shorted turns and refines the estimated short-circuit resistance value. This study addresses three key challenges in applying the IMM framework to this problem. First, distinguishing between the healthy motor and faulty models with large short-circuit resistances is difficult due to their similar dynamic responses. This issue is resolved by imposing constraints on the estimated resistance, enabling better model separation. Second, the diversity of state variables across models is managed by zero-padding the state vector and estimation error covariance matrix, with matrix-based mode probabilities mitigating errors from augmented zeros. Third, uncertainty in the number of shorted turns is addressed in the second layer by using a secondary model set to determine the exact number of shorted turns and refine the short circuit resistance estimate. Experimental validation, conducted with a custom relay box for controlled fault injection, demonstrates the method's effectiveness in detecting and diagnosing ITSC faults using the first and second model banks, respectively.
- Research Article
- 10.1016/j.future.2025.108231
- Apr 1, 2026
- Future Generation Computer Systems
- Héctor Martínez + 3 more
Recent advances in deep learning (DL) have prompted a shift from traditional 64-bit floating point (FP64) arithmetic for scientific computing toward reduced-precision formats, such as FP16, BF16, or even 8-bit integers, combined with mixed-precision arithmetic. This transition enhances computational throughput, reduces memory and bandwidth usage, and improves energy efficiency, offering significant advantages for resource-constrained edge devices. To support this shift, hardware architectures have evolved accordingly, now including adapted ISAs (Instruction Set Architectures) that expose mixed-precision vector units and matrix engines tailored for DL workloads. At the heart of many DL and scientific computing tasks is the general matrix-matrix multiplication (GEMM), a fundamental kernel historically optimized using fused multiply-add (FMA) vector instructions on SIMD (single instruction, multiple data) units. However, as hardware moves toward mixed-precision dot (or inner)-product-centric operations optimized for quantized inference, these legacy approaches are being phased out. In response, our paper revisits the conventional, high-performance implementation of GEMM and describes strategies for adapting it to mixed integer precision (MIP) arithmetic across modern ISAs, including x86_64, Arm, and RISC-V. Concretely, we illustrate novel micro-kernel designs and data layouts that better exploit today’s specialized hardware and demonstrate significant performance gains from MIP arithmetic over floating-point implementations across three representative CPUs. These contributions highlight a new era of GEMM optimization driven by the demands of DL inference on heterogeneous architectures, marking what we term the “Cambrian period” for matrix multiplication.
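The idea of mixed integer precision arithmetic can be sketched in plain Python as a GEMM with 8-bit integer inputs and 32-bit integer accumulators, where the inner loop consumes the shared dimension in small groups to mimic a dot-product-style instruction. This is an illustrative assumption, not the paper's micro-kernel: the function name `int8_gemm`, the group width of 4, and the transposed layout of B are choices made for this sketch.

```python
from typing import List

INT8_MIN, INT8_MAX = -128, 127
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def int8_gemm(a: List[List[int]], b_t: List[List[int]], group: int = 4) -> List[List[int]]:
    """C = A @ B with int8 inputs and int32 accumulation.

    a   : m x k matrix, row-major
    b_t : B stored transposed (n x k), so each dot product reads contiguous data
    """
    for row in a + b_t:
        if any(not INT8_MIN <= v <= INT8_MAX for v in row):
            raise ValueError("inputs must fit in int8")
    k = len(a[0]) if a else 0
    c = [[0] * len(b_t) for _ in a]
    for i, a_row in enumerate(a):
        for j, b_col in enumerate(b_t):
            acc = 0  # stands in for a 32-bit accumulator register
            for p in range(0, k, group):
                # One "dot-product step": multiply up to `group` int8 pairs
                # and add their sum to the wide accumulator.
                acc += sum(x * y for x, y in zip(a_row[p:p + group], b_col[p:p + group]))
            if not INT32_MIN <= acc <= INT32_MAX:
                raise OverflowError("accumulator exceeds int32 range")
            c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, -2, 3, 4, 5]]
    b_t = [[1, 1, 1, 1, 1], [2, 0, -1, 0, 1]]
    print(int8_gemm(a, b_t))  # [[11, 4]]
```

Storing B transposed is one simple way to make both operands of each dot product contiguous; real micro-kernels use more elaborate packed layouts.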
- Research Article
- 10.1016/j.optlastec.2026.114687
- Apr 1, 2026
- Optics & Laser Technology
- Yanfeng Bi + 6 more
Reconfigurable photonic complex-valued matrix–vector multiplication processor based on time-division multiplexing
- Research Article
- 10.1088/1361-6501/ae5278
- Mar 27, 2026
- Measurement Science and Technology
- Chenghao Shan + 3 more
This work presents a novel multivariate Laplace distribution (MLD)-based Gaussian approximate filter (GAF) developed to address the estimation problem in a nonlinear system subject to colored heavy-tailed measurement noise (CHTMN). Initially, through the application of measurement difference and state extension methods, the estimation challenge featuring CHTMN is converted into a filtering issue with white heavy-tailed measurement noise (WHTMN); subsequently, the MLD is introduced to characterize the WHTMN, and the formulation of a novel state space model is thereby established. Moreover, the system state vector and MLD covariance matrix are jointly inferred through the application of variational Bayesian inference methodology and a novel MLD-based GAF is designed. Ultimately, the superiority of the proposed MLD-based GAF in the scenario of CHTMN is demonstrated by two simulation models.
- Research Article
- 10.55592/cilamce2025.v5i.14102
- Mar 18, 2026
- Ibero-Latin American Congress on Computational Methods in Engineering (CILAMCE)
- Magno Mota + 5 more
This paper reports the employment of parallelization techniques in the implementation of a 3D probabilistic explicit cracking model for concrete. The mentioned model is based on the use of interface elements to explicitly represent cracks and naturally has a high computational cost, which justifies the importance of developing parallelization strategies to accelerate the analyses. Parallelization was considered in the most time-consuming tasks in the finite element code: assembly of the stiffness matrix and residual force vector, and solution of the linear equation system. The results show that significant reductions in simulation time can be obtained with the parallelized code.
- Research Article
- 10.1142/s021812662650194x
- Mar 18, 2026
- Journal of Circuits, Systems and Computers
- Xinjian Zhao + 5 more
Resistive random-access memory (RRAM)–based computing systems have emerged as a promising platform for deploying deep neural networks (DNNs) due to their high energy efficiency and massive parallelism enabled by in-memory vector–matrix multiplication. However, unauthorized access to RRAM-based platforms exposes the deployed models to black-box query–based model theft and reverse-engineering attacks. To address this threat, this paper proposes a neuron-level activation–triggered perturbation protection mechanism tailored for RRAM-based neural network deployment. The proposed protection method is composed of an activation module and a perturbation module. First, a lightweight conditional activation module is embedded into the model’s semantic bottleneck layer, encoding the authorization state of an access request as neuron activation signals. Then, the resulting activation signal controls a perturbation module that injects direction- and magnitude-adjustable distortions into the output logits if an unauthorized behavior is detected. Under authorized operation, on the other hand, the activation value remains close to zero, ensuring that the original inference functionality is preserved. This protection mechanism does not modify backbone network weights nor introduce additional inference branches. The authorization decision and output intervention can be completed within a single forward pass. Evaluation results on a behavior-level RRAM simulator demonstrate that the proposed method can effectively protect the deployed models against model theft attacks under unauthorized queries while preserving prediction accuracy under authorized access, with an insignificant hardware overhead.
- Research Article
- 10.63775/19dnft89
- Mar 17, 2026
- Transformations and Sustainability
- Mladen Krstić + 3 more
The European Union dominates global wine exports and production. The EU’s wine sector is primarily supported by the Common Agricultural Policy (CAP) and Geographical Indications systems (GIs). From an international perspective, the sector is experiencing remarkable economic repercussions due to US tariffs. To overcome the identified challenges, it is crucial for wineries, particularly small ones, to implement a tailored sales distribution strategy. The distribution landscape for small wineries is characterized by limited resources, diverse channel options and rapidly changing market conditions, making the selection of an optimal mix both complex and critical for profitability and resilience. This study formulates the choice of distribution strategy as a multi-criteria decision-making (MCDM) problem and introduces a hybrid framework that combines the Best–Worst Method (BWM) for deriving consistent criterion weights with the novel Axial Distance based Aggregated Measurement (ADAM) technique for robust alternative ranking. Seven evaluation criteria (economic profitability, resource availability, implementation feasibility, strategic alignment, market opportunity, competitive advantage, and flexibility) are applied to five distribution strategies: direct sales; online and social media channels; local partnerships; distributor partnerships; and participation in festivals and events. Expert assessments generate the decision matrix and weight vectors, yielding a final ranking that places local partnerships highest, followed by direct sales, online channels, distributor partnerships, and festivals. The results demonstrate the value of community-based collaborations and experiential marketing, while the hybrid MCDM approach offers a transparent, adaptable tool for strategic decision-making. Limitations linked to expert subjectivity and criterion scope are discussed, and avenues for incorporating sustainability and dynamic updates are outlined.
- Research Article
- 10.3390/s26051588
- Mar 3, 2026
- Sensors (Basel, Switzerland)
- Roney Duarte Da Silva + 1 more
This work presents the design, calibration and detailed performance characterization of a triaxial accelerometer based on fiber Bragg gratings (FBG), intended for space navigation applications. The sensor employs a single seismic mass architecture, whose acceleration-induced displacement deforms six optical fibers (OFs), forming twelve fiber segments (FSs) that act as elastic elements, with the strain measured by FBGs inscribed in each fiber. The methodology ranges from the manufacturing and spectral characterization of the FBGs to the design of a differential optical interrogation system and a low-noise signal conditioning circuit. A cornerstone of this work is the proposal of an extended calibration model that, in addition to the conventional sensitivity matrix and bias vector parameters, incorporates polynomial terms to actively compensate for the effects of temperature variation. This model was validated through tests in a climatic chamber, subjecting the sensor to different orientations and controlled temperatures. The experimental results validate the design's effectiveness, demonstrating that the accelerometer achieves tactical-grade performance with a bias instability below 1.9 mgE for all axes. The analysis confirmed that the sensor's effective full-scale range is approximately ±20gE, and sensitivity of 112 pm/gE, limited by the nature of the optical interrogation system. Furthermore, a third-order polynomial thermal compensation model was shown to provide the most efficient balance between model complexity and error reduction, reducing errors to a level dominated by the system's intrinsic noise and ensuring the sensor's accuracy over a wide operational temperature range.
- Research Article
- 10.1190/geo-2025-0276
- Mar 1, 2026
- Geophysics
- Hanming Chen + 4 more
Bayesian full-waveform inversion (FWI) addresses the problem of the nonuniqueness of solutions in traditional deterministic FWI by quantifying the model uncertainties, which can be realized by a variational inference (VI) approach. As an efficient VI algorithm, the Stein variational gradient descent (SVGD) has been used to develop a VI-based FWI method, which approximates the posterior probability density function using the distribution of a particle set. However, the SVGD-based FWI method reported in the existing literature usually uses some weak priors, such as a uniform distribution, to generate prior particles (or models, a term commonly used in the geophysics community). The particles generated from such priors exhibit random structures. Although this maximizes the retention of all possible solutions, it usually requires a large number of iterations to ensure convergence to the results with clear geologic implications. To address this, a geostatistical method was introduced to extract geologic structure information from seismic images and this information was used to generate prior particles. Specifically, the particles were generated by perturbing a smooth model with products of a pattern-feature correlation (PFC) matrix and random vectors. The elements of the PFC matrix, quantitatively determined as correlation coefficients of the pattern score vectors at each point, represent similarities of the geologic patterns at different positions. To reduce the storage amount of the PFC matrix and eliminate the spurious spatial correlations, which typically occur between two spatially distant points, the variogram function in geostatistics was adopted to determine the maximum correlation radius and the PFC matrix was sparsified according to this radius. The sparsified PFC matrix was then used to generate informed prior particles for SVGD-based FWI.
Numerical examples demonstrate clearly that using the geostatistical prior particles as initial particles enhances the convergence of SVGD-based FWI visibly and yields an accurate characterization of the posterior distribution of the velocity model.
- Research Article
- 10.1007/s44443-026-00575-z
- Feb 27, 2026
- Journal of King Saud University Computer and Information Sciences
- Bo Liu + 7 more
Latent space-based facial attribute editing methods have gained popularity in applications such as digital entertainment, virtual avatar creation, and human-computer interaction systems due to their potential for efficient and flexible attribute manipulation, particularly for continuous edits. Among these, unsupervised latent space-based methods, which discover effective semantic vectors without relying on labeled data, have attracted considerable attention in the research community. However, existing methods still encounter difficulties in disentanglement, as manipulating a specific facial attribute may unintentionally affect other attributes, complicating fine-grained controllability. To address these challenges, we propose a novel framework designed to offer an effective and adaptable solution for unsupervised facial attribute editing, called Unsupervised Facial Attribute Controllable Editing (U-Face). The proposed method frames semantic vector learning as a subspace learning problem, where latent vectors are approximated within a lower-dimensional semantic subspace spanned by a semantic vector matrix. This formulation can also be equivalently interpreted from a projection-reconstruction perspective and further generalized into an autoencoder framework, providing a foundation that can support disentangled representation learning in a flexible manner. To improve disentanglement and controllability, we impose orthogonal non-negative constraints on the semantic vectors and incorporate attribute boundary vectors to reduce entanglement in the learned directions. Although these constraints make the optimization problem challenging, we design an alternating iterative algorithm, called Alternating Iterative Disentanglement and Controllability (AIDC), with closed-form updates and provable convergence under specific conditions.
Extensive experiments on multiple pre-trained GAN generators demonstrate that U-Face shows improved disentanglement, controllability, and visual fidelity compared to existing state-of-the-art supervised and unsupervised baselines. Specifically, U-Face reduces inter-attribute correlation by an average of 5%-15%, and improves FID by 10%-20% and LPIPS by 5%-10%. Compared to recent diffusion-based methods (DDS, CDS, FPE), U-Face achieves comparable editing quality and reduces inference times, making it suitable for real-time applications across different GAN backbones and enabling large-scale interactive facial editing.
- Research Article
- 10.1002/advs.202516478
- Feb 23, 2026
- Advanced Science
- Jiwon You + 8 more
Strategic optimization of ferroelectric tunnel junctions (FTJs) is critical for advancing nonvolatile memory and neuromorphic computing technologies. In this work, we present a comprehensive study on materials and structural engineering to enable scalable hybrid‐switching FTJ arrays. We systematically manipulated oxygen vacancy (V_O) concentrations in HfZrO2 (HZO) films through strategic choices of bottom electrodes and interfacial layers, achieving three distinct operational modes: pure ferroelectric switching, defect‐modulated switching, and combined hybrid switching. Our optimized devices demonstrate exceptional tunneling electroresistance (TER) performance: Mo bottom electrodes achieve a TER ratio of around 10^2, while Mo/Ti bottom electrodes attain a TER ratio of over 10^4. Lower‐leakage ferroelectric switching and enhanced polarization stability are observed with Mo bottom electrodes and ZrO2 interlayers, while V_O‐driven resistive contributions from Ti electrodes amplify TER in hybrid devices. Utilizing these optimized parameters, we fabricated a 42 × 42 FTJ array demonstrating uniform multi‐level conductance modulation. The fabricated FTJ array was integrated into an in‐memory Vision Transformer (ViT) architecture, successfully performing stable and energy‐efficient parallel vector–matrix multiplication (VMM) operations despite device variability. This work shows that precisely engineered, large‐area hybrid‐switching FTJ arrays can provide a scalable and energy‐efficient hardware platform for next‐generation memory and neuromorphic systems.
- Research Article
- 10.1016/j.jfa.2025.111266
- Feb 1, 2026
- Journal of Functional Analysis
- Zhigang Bao + 2 more
Phase transition for the bottom singular vector of rectangular random matrices
- Research Article
- 10.1002/sam.70060
- Feb 1, 2026
- Statistical Analysis and Data Mining: An ASA Data Science Journal
- Seungyeon Oh + 2 more
This paper addresses classification problems with matrix‐valued data, which commonly arise in applications such as neuroimaging and signal processing. Building on the assumption that the data from each class follows a matrix normal distribution, we propose a novel extension of Fisher's Linear Discriminant Analysis (LDA) tailored for matrix‐valued observations. To effectively capture structural information while maintaining estimation flexibility, we adopt a nonparametric empirical Bayes framework based on Nonparametric Maximum Likelihood Estimation (NPMLE), applied to vectorized and scaled matrices. The NPMLE method has been shown to provide robust, flexible, and accurate estimates for vector‐valued data with various structures in the mean vector or covariance matrix. By leveraging its strengths, our method is effectively generalized to the matrix setting, thereby improving classification performance. Through extensive simulation studies and real data applications, including electroencephalography (EEG) and magnetic resonance imaging (MRI) analysis, we demonstrate that the proposed method tends to outperform existing approaches across a variety of data structures.
- Research Article
- 10.1038/s41598-025-34924-1
- Jan 30, 2026
- Scientific Reports
- Seyed Parsa Hemmasi + 3 more
Compute-in-Memory (CIM) offers an efficient approach for accelerating DNNs by performing matrix–vector multiplications directly within memory. However, its adoption in edge devices is limited by unstable power supplies and the performance overhead of conventional row- or column-wise computing. This paper presents a two-directional CIM-based nvSRAM cell that performs both row- and column-wise operations, enabling faster and more efficient matrix–vector multiplication. The proposed design stores the CIM outputs within the same computation cycle, referred to as Simultaneous Compute and Write (SCW), thereby reducing latency during complex neural network inference. By integrating a single I-MTJ into each SRAM cell, it also provides reliable data retention and restoration during power failures, making it well-suited for low-power, energy-constrained edge applications. Post-layout simulations were conducted to evaluate the proposed architecture. The detailed post-layout simulation results demonstrate a 31% improvement in write margin, a 40% reduction in PDP in memory mode, and an 85% reduction in energy in backup scenarios, compared to state-of-the-art designs. Furthermore, the proposed design achieves a 39.2% EDP reduction during neural network inference operation under power instability, highlighting its suitability for low-power edge computing.
- Research Article
- 10.1007/s10915-025-03157-9
- Jan 26, 2026
- Journal of Scientific Computing
- Anna Yesypenko + 1 more
Randomized Strong Recursive Skeletonization: Simultaneous Compression and LU Factorization of Hierarchical Matrices using Matrix–Vector Products
- Research Article
- 10.5120/ijca2026926300
- Jan 20, 2026
- International Journal of Computer Applications
- Ahmad Farhan Alshammari
The goal of this research is to implement network analysis using Markov chains in Python. Networks exist almost everywhere in life. There are networks of computers, people, articles, posts, etc. Network analysis is used to understand the structure, function, and performance of the network. The Markov chains method is used to predict the future state based on the present state and not on the previous states. The basic steps of network analysis using Markov chains are explained: defining the network (states, transition matrix, and distribution vector), performing matrix multiplication (computing the stationary distribution vector and the stationary transition vector), performing a random walk (computing the stationary distribution vector), comparing results, and plotting charts. The developed program was tested on experimental data. The program successfully performed the basic steps of network analysis using Markov chains and provided the required results.
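Two of the steps listed in this abstract, computing the stationary distribution by repeated multiplication and estimating it by a random walk, can be sketched with the Python standard library alone. This is a minimal illustration, not the author's program: the two-state transition matrix, the function names, the tolerance, and the step count are all assumptions of this sketch.

```python
import random
from typing import List

Matrix = List[List[float]]


def step(dist: List[float], P: Matrix) -> List[float]:
    """One update of the distribution vector: dist_next = dist @ P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary_by_multiplication(P: Matrix, dist: List[float],
                                 tol: float = 1e-12, max_iter: int = 10_000) -> List[float]:
    """Repeat dist @ P until the vector stops changing (power iteration)."""
    for _ in range(max_iter):
        nxt = step(dist, P)
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist


def stationary_by_random_walk(P: Matrix, start: int, n_steps: int, seed: int = 0) -> List[float]:
    """Estimate the stationary distribution from state visit frequencies."""
    rng = random.Random(seed)
    counts = [0] * len(P)
    state = start
    for _ in range(n_steps):
        # Move to the next state using row `state` of P as probabilities.
        state = rng.choices(range(len(P)), weights=P[state])[0]
        counts[state] += 1
    return [c / n_steps for c in counts]


if __name__ == "__main__":
    P = [[0.9, 0.1],
         [0.5, 0.5]]  # hypothetical two-state network; each row sums to 1
    print(stationary_by_multiplication(P, [1.0, 0.0]))  # close to [5/6, 1/6]
    print(stationary_by_random_walk(P, start=0, n_steps=50_000))
```

For this matrix the balance condition 0.1·π₀ = 0.5·π₁ gives π = (5/6, 1/6), so the two estimates can be compared against a known answer.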