Articles published on Variational Models
7694 Search results
- New
- Research Article
- 10.1109/jiot.2025.3625928
- May 1, 2026
- IEEE Internet of Things Journal
- Chengliang Yang + 8 more
In recent years, the rapid advancement and application of deep learning in medical imaging have demonstrated its effectiveness in reducing physicians’ workload and lowering the risk of misdiagnosis in pathological spine diagnosis. Nevertheless, deep learning–based models for pathological spine diagnosis have not yet matured to the level required for clinical deployment. Several challenges contribute to this limitation. First, the availability of spinal X-ray images for training is limited, and the class distribution of samples is often imbalanced. Second, conventional deep learning models rely on convolutional kernels that primarily capture local features in X-ray images, while overlooking the global morphological characteristics of the spine. To address these issues, we propose ViTST, a Vision Transformer (ViT)–based model with a self-supervised learning task for scoliosis classification. ViTST incorporates a masked strategy–based self-supervised pretext task to mitigate the challenges posed by limited training data and leverages the ViT architecture to capture global structural features of spinal X-ray images. This design enables more effective modeling of inter-regional relationships and variations within the spine. Moreover, by jointly optimizing reconstruction loss and cross-entropy loss, ViTST learns robust image representations even from relatively small datasets. In addition, we introduce a healthcare Internet of Medical Things (IoMT) architecture to enable the practical deployment of ViTST in clinical environments. Through this IoMT platform, clinicians can monitor patients’ conditions in real time and adapt treatment plans dynamically, thereby enhancing clinical decision-making and accelerating patient recovery. Finally, we conducted extensive experiments on a real-world pathological spine image dataset to validate the effectiveness of the proposed model. 
Experimental results demonstrate that ViTST achieved a Precision of 0.975, an Accuracy of 0.979, and an F1-score of 0.975, confirming its strong potential for application in clinical practice.
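The joint optimization of reconstruction loss and cross-entropy loss described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the array shapes, the mean-squared-error form of the reconstruction term, and the `recon_weight` balancing factor are all assumptions introduced here.

```python
import numpy as np

def masked_reconstruction_loss(original, reconstructed, mask):
    """Mean squared error over masked patches only.

    original, reconstructed: arrays of shape (batch, patches, dim).
    mask: array of shape (batch, patches); 1 marks a masked patch.
    """
    mask = mask.astype(float)
    per_patch = ((original - reconstructed) ** 2).mean(axis=-1)
    return float((per_patch * mask).sum() / max(mask.sum(), 1.0))

def cross_entropy_loss(logits, labels):
    """Mean cross-entropy from raw logits of shape (batch, classes)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def joint_loss(original, reconstructed, mask, logits, labels, recon_weight=1.0):
    """Classification loss plus a weighted reconstruction loss (weight is hypothetical)."""
    recon = masked_reconstruction_loss(original, reconstructed, mask)
    return cross_entropy_loss(logits, labels) + recon_weight * recon
```

In a real training loop both terms would be computed from the same encoder output, so that the reconstruction task regularizes the features used for classification.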
- New
- Research Article
- 10.1016/j.cja.2025.103986
- May 1, 2026
- Chinese Journal of Aeronautics
- Kunyu Wei + 1 more
Severe load spectrum development for transport aircraft from measured load data: A representative flight method
- New
- Research Article
- 10.1016/j.knosys.2026.115659
- May 1, 2026
- Knowledge-Based Systems
- Rodica Ioana Lung + 1 more
• The existence of barren plateaux is one of the challenges in the practical use of variational quantum classifiers.
• A noise-based mechanism that shifts training data during optimization, helping escape barren plateaux, is proposed.
• The approach is tested with a variational quantum classifier modeling BET index changes using other indices from Europe and the United States.
• Simulations use the PennyLane framework.
The barren plateaux phenomenon has been identified as a significant challenge for variational quantum algorithms, particularly for classification tasks. In this article, we propose a novel approach to mitigating this problem for variational quantum classifiers during the optimization phase. The noisy optimization mechanism shifts the training data by adding a small amount of uniform noise, thereby inducing changes in the parameters being searched. The effectiveness of the method is evaluated using real financial data, modeling the evolution of the BET index in relation to well-known indices from neighboring Central and Eastern European countries, as well as from Western Europe and the United States. The results demonstrate that this approach significantly improves upon the corresponding baseline quantum classifier and provides results comparable to those of established classical methods.
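The data-shifting idea can be illustrated with a short sketch. It assumes that fresh uniform noise is drawn at every optimization step and that the optimizer is supplied as a generic `step_fn` callback; neither detail is stated in the abstract, and all names and default values below are hypothetical.

```python
import numpy as np

def shift_training_data(features, noise_scale, rng):
    """Return a copy of the features shifted by uniform noise in [-noise_scale, noise_scale)."""
    noise = rng.uniform(-noise_scale, noise_scale, size=features.shape)
    return features + noise

def noisy_training_loop(features, labels, params, step_fn, n_steps,
                        noise_scale=0.01, seed=0):
    """Run n_steps optimizer updates, each on freshly shifted training data.

    step_fn(params, features, labels) -> new params is a placeholder for one
    update of whatever optimizer trains the classifier.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        shifted = shift_training_data(features, noise_scale, rng)
        params = step_fn(params, shifted, labels)
    return params
```

Because the loss landscape seen by the optimizer changes slightly at each step, a gradient that is nearly zero on the clean data may be non-zero on the shifted data, which is the intuition behind the mechanism.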
- New
- Research Article
- 10.1016/j.bspc.2026.109642
- May 1, 2026
- Biomedical Signal Processing and Control
- Zuoping Tan + 11 more
Innovative Algorithm for Keratoconus Intelligent Grading Using Variational Encoding Bayesian Gaussian Mixture Model
- New
- Research Article
- 10.20935/acadquant8243
- Apr 27, 2026
- Academia Quantum
- Vinit Singh + 4 more
Quantum machine learning (QML) is rapidly transitioning from theoretical promise to practical relevance across data-intensive scientific domains. In this review, we provide a structured overview of recent advances that bridge foundational quantum learning principles with real-world applications. We survey foundational QML paradigms, including variational quantum algorithms, quantum kernel methods, and neural-network quantum states, with emphasis on their applicability to complex quantum systems. We examine neural-network quantum states as expressive variational models for correlated matter, non-equilibrium dynamics, and open quantum systems, and discuss fundamental challenges associated with training and sampling. Recent advances in quantum-enhanced sampling and diagnostics of learning dynamics, including information-theoretic tools, are reviewed as mechanisms for improving scalability and trainability. The review further highlights application-driven QML frameworks in drug discovery, cancer biology, and agro-climate modeling, where data complexity and constraints motivate hybrid quantum–classical approaches. We conclude with a discussion of federated quantum machine learning as a route to distributed, privacy-preserving quantum learning. Overall, this review presents a unified perspective on the opportunities and limitations of QML for complex systems.
- New
- Research Article
- 10.1038/s41598-026-49521-z
- Apr 24, 2026
- Scientific Reports
- Yubin Zhang + 7 more
To reveal the regional differentiation characteristics of carbon emissions during the construction phase of expressways and to improve prediction accuracy, six typical expressway projects located in the plain, hilly, and mountainous regions of Anhui Province were selected as case studies. A carbon emission accounting model for the construction phase was established based on the life cycle assessment method, and the effects of the bridge-tunnel ratio, subproject structure, and material and energy consumption on carbon emission intensity were systematically analyzed. On this basis, a regional carbon emission prediction model was developed and optimized using data from 21 completed expressways across the province. The results indicate that carbon emission intensity exhibits a significant topographic gradient, with mountainous regions showing higher values than hilly regions, and hilly regions higher than plain regions. The maximum carbon emission intensity in mountainous projects reaches 5.27 × 10⁷ kg CO₂/km, which is 2.86 times that of plain regions. As terrain complexity increases, the carbon emission structure shifts from being dominated by subgrade engineering and interchange engineering to being dominated by structural engineering, such as bridges and tunnels. In mountainous regions, emissions from structural engineering account for more than 50% of the total emissions. At the material level, cement and steel are identified as the primary emission sources, jointly accounting for 78% of total emissions in mountainous projects, and demonstrating the highest sensitivity to variations in total emissions. The prediction results show that the baseline model using the bridge-tunnel ratio as a single variable achieves a coefficient of determination (R²) of 0.69. 
After incorporating material and energy consumption variables, the optimized XGBoost model improves the coefficient of determination to 0.9517, achieving high-accuracy prediction using only eight categories of material and energy consumption indicators. Based on the analytical results, differentiated emission reduction pathways are proposed. In mountainous regions, priority should be given to optimizing the design of tunnels and interchange engineering and controlling the intensity of high-carbon structural materials. In plain and hilly regions, emphasis should be placed on low-carbon design and construction optimization of bridge and culvert engineering and subgrade engineering. This study provides a data-driven basis for regional carbon emission prediction and emission reduction decision-making during the construction phase of expressways.
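For readers comparing the two reported scores (0.69 and 0.9517), the coefficient of determination is computed as one minus the ratio of residual to total sum of squares. The helper below is a generic sketch of that standard definition; the function name is an assumption and no data from the study is used.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the mean of the observations scores 0, and a perfect model scores 1, so a move from 0.69 to 0.95 means the share of unexplained variance falls from about 31% to about 5%.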
- New
- Research Article
- 10.3765/plsa.v11i1.6057
- Apr 24, 2026
- Proceedings of the Linguistic Society of America
- Yunchuan Chen + 1 more
This study investigates whether L1 Japanese L2 English learners can acquire the knowledge that English bare plurals prohibit specific readings, despite the absence of explicit evidence for this constraint in the input. In Japanese, bare plurals permit both generic and specific readings, whereas in English they allow only generic readings. This cross-linguistic difference creates a potential poverty-of-the-stimulus issue for Japanese-speaking learners of English. To examine whether learners can acquire this constraint, we conducted a sentence–picture matching truth-value judgment task with 30 L1 Japanese L2 English learners and 11 native English speakers. The L2 English participants also completed LexTALE to measure English proficiency. The results show that native English speakers consistently rejected the specific reading of English bare plurals, while native Japanese speakers consistently accepted both specific and generic readings in Japanese. Among the L2 learners, 43% consistently rejected the specific reading while accepting the generic reading in English, indicating successful acquisition of the target constraint. A generalized linear mixed-effects analysis further reveals that learners’ sensitivity to the restriction increases significantly with English proficiency. We argue that learners gradually eliminate the transferred specific reading as a result of distributional evidence in the input, which aligns with Yang’s (2003) variational learning model.
- New
- Research Article
- 10.1088/2058-9565/ae636a
- Apr 22, 2026
- Quantum Science and Technology
- Connor Van Rossum + 2 more
Variational quantum algorithms (VQAs) have dominated literature as tools for demonstrating quantum utility on near-term quantum hardware, with applications in optimisation, quantum simulation, and machine learning. While researchers have studied how easy VQAs are to train, the effect of quantum noise on the classical optimisation process is still not well understood. Contrary to expectations, we find that twirling, which is commonly used in standard error-mitigation strategies to symmetrise noise, actually degrades performance in the variational setting, whereas preserving biased or non-unital noise can help classical optimisers find better solutions. Analytically, we study a universal quantum regression model and demonstrate that relatively uniform Pauli channels suppress gradient magnitudes and reduce expressivity, making optimisation more difficult. Conversely, asymmetric noise such as amplitude damping or biased Pauli channels introduces directional bias that can be exploited during optimisation. Numerical experiments on a variational eigensolver for the transverse-field Ising model confirm that non-unital noise yields lower-energy states compared to twirled noise. Finally, we show that coherent errors are fully mitigated by re-parameterisation. These findings challenge conventional noise-mitigation strategies and suggest that preserving noise biases may enhance VQA performance.
- New
- Research Article
- 10.3390/quantum8020038
- Apr 22, 2026
- Quantum Reports
- Robert Castro
Variational models describe deformation and stability through the first and second variations in an underlying functional, but the relationship between these responses is seldom expressed as an intrinsic equilibrium quantity of the model itself. A canonical curvature–strain representation for equilibrium ratios arising in variational field settings is developed. For a twice Fréchet differentiable functional and an admissible perturbation generator, strain is defined as normalized first-order response and curvature as normalized second-order response along the generator direction. Their quotient defines a curvature–strain ratio that measures proportional balance between deformation and curvature within the model. The main result shows that this curvature–strain ratio is a canonical representative of a response ratio already implicit in the variational data. Under canonical normalization, the curvature–strain ratio coincides with the quotient of second- and first-order response, and stationarity of the curvature–strain ratio is equivalent to proportional stationarity of that response quotient along the admissible flow. A further theorem establishes transfer of local isolation: when the second-variation operator satisfies standard hypotheses such as compact resolvent and non-degeneracy of the constrained extremum, isolated equilibrium ratios persist in the curvature–strain representation for the same operator-theoretic reasons. Quadratic scalar and Maxwell-type models illustrate the construction. The paper establishes a mathematically controlled curvature–strain representation of equilibrium ratios within ordinary variational theory, with emphasis on the analysis of variational response and equilibrium balance.
- New
- Research Article
- 10.59973/ipil.354
- Apr 22, 2026
- IPI Letters
- Raoul Bianchetti + 1 more
We formulate and analyze a regularized variational model in which finitely many discrete anchors constrain a scalar field on a bounded Lipschitz domain and, through second-order response, determine a tensorial object of metric type. The analytic point of departure is that singular pointwise anchoring is incompatible with the natural Sobolev setting of the Dirichlet energy. To overcome this difficulty, each anchor is represented by a mollified averaging functional, so that the full anchor mechanism becomes continuous on the admissible class and remains compatible with weak convergence. The resulting action consists of a Dirichlet term, a weighted anchor-fidelity term, and an auxiliary regularization term. Within this framework we derive the first variation, the weak Euler--Lagrange equation, and the second variation in complete form. We then prove existence of minimizers under standard coercivity and weak lower-semicontinuity hypotheses, establish uniqueness under strict convexity, and show that the second-order response is symmetric and positive semidefinite when the regularization is convex. In the quadratic regularization case we obtain a linear elliptic field equation with smooth localized forcing and record the corresponding interior regularity consequences. The geometric conclusion of the paper is stated at its natural level of generality: whenever the second-order response admits a local tensor representation and satisfies an explicit nondegeneracy condition relative to a positive definite reference tensor, it induces a continuous metric-type tensor on the region under consideration. A finite-difference discretization of the quadratic model is also constructed, validated by a manufactured-solution experiment, and used to study the dependence of the reconstructed response tensor on the anchor-fidelity and anchor-width parameters. 
The paper therefore provides a mathematically controlled Euclidean variational framework in which localized discrete data determine a field and, under explicit hypotheses, a geometric response of metric type.
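A manufactured-solution check of a finite-difference discretization, as mentioned above, can be sketched in one dimension. The example is a deliberately simplified stand-in and not the paper's model: it solves -u'' = f on (0, 1) with zero boundary values, picks the exact solution u(x) = sin(πx), derives the forcing f = π² sin(πx) from it, and measures how the error shrinks under grid refinement. All names are hypothetical.

```python
import numpy as np

def solve_poisson_1d(n, forcing):
    """Solve -u'' = f on (0, 1) with u(0) = u(1) = 0 using n interior grid points."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    # Standard second-order central-difference matrix for -d^2/dx^2.
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    u = np.linalg.solve(A, forcing(x))
    return x, u

def manufactured_error(n):
    """Maximum nodal error against the manufactured solution u(x) = sin(pi x)."""
    exact = lambda x: np.sin(np.pi * x)
    forcing = lambda x: np.pi**2 * np.sin(np.pi * x)  # equals -u'' for the exact u
    x, u = solve_poisson_1d(n, forcing)
    return float(np.max(np.abs(u - exact(x))))
```

Halving the grid spacing (for example going from `manufactured_error(31)` to `manufactured_error(63)`) should reduce the error by a factor close to four, which is the signature of a second-order scheme and the kind of evidence a manufactured-solution experiment is meant to provide.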
- New
- Research Article
- 10.1145/3799429
- Apr 21, 2026
- ACM Transactions on Multimedia Computing, Communications, and Applications
- Xingming Yang + 5 more
Gaze target detection aims to localize a person’s gaze target. During gaze transition in video, the absence of accurate temporal variation modeling (TVM) may lead to errors in gaze target localization. In this work, we propose a Transition-aware Gaze Model (TGM), which focuses on analyzing temporal differences to achieve accurate location variation modeling. The TGM contains four key components: a frame gaze model, and three transition-aware modules (path variation, direction variation, and fusion). First, the frame Transformer extracts gaze location and direction features. Second, to analyze the feature difference among transition frames, we introduce TVM guided by transition-aware loss. TVM analyzes the location features to capture the moving trajectory of targets (defined as path variation), which facilitates the search for target locations near the path. Third, TVM also analyzes the direction features to capture the transition-aware direction area (defined as direction variation), which facilitates the search for target locations within this area. Fourth, since gaze directions dynamically adjust to track gaze targets, path variation and direction variation are inherently aligned with the natural movement of a person’s gaze. Thus, these two variations are fused into a unified transition-aware feature, which helps cover all potential target locations. To search for accurate target locations, we embed this transition-aware feature into frame features with cross-attention, which can enhance gaze target detection in transition frames. Extensive experiments demonstrate that our method achieves state-of-the-art performance on two datasets, namely VideoAttentionTarget and VideoCoAtt.
- New
- Research Article
- 10.1145/3789508
- Apr 20, 2026
- ACM Transactions on Multimedia Computing, Communications, and Applications
- Jiawei Huang + 4 more
Generating natural and realistic human motion sequences under the constraints of 3D scenes is a highly challenging task, requiring not only the precise modeling of dynamic variations in human joints but also the rigorous consideration of intricate interactions between the human body and the surrounding environment. While recent advances in deep generative models show great potential in tackling these challenges, existing methods often result in unnatural human motions and human–environment penetration during generation. To cope with these issues, we propose a novel approach that divides human motion generation into two stages. The first stage employs a bidirectional long short-term memory network combined with fully connected layers to generate a motion trajectory from input conditions that include the starting and ending positions and orientations of the human model and scene feature point clouds extracted from the surrounding environment. In the second stage, we design a conditional diffusion model, guided by the trajectory generated in the first stage and the embedding of 3D scene information, to generate human motion sequences within 3D scenes. We evaluate our framework through extensive experiments on the PROX datasets, which validate its effectiveness. The results show that our method significantly outperforms existing ones in enhancing human motion naturalness and reasonableness, and reducing human penetration.
- New
- Research Article
- 10.1140/epja/s10050-026-01837-0
- Apr 20, 2026
- The European Physical Journal A
- Maria Lugaro + 8 more
Bulk meteoritic data show isotopic variability of slow-neutron-capture (s-process) origin in several elements heavier than Fe. One peculiar feature is that the lighter s-process elements (e.g., Zr and Mo) present larger anomalies than the heavier s-process elements (e.g., Nd and W). To address this observation, we compared Zr and Nd data to model predictions of the s-process abundances at the surface of low-mass asymptotic giant branch (AGB) stars of initial metallicity from solar to twice solar. We found that the relative magnitude of the isotopic variability between these two elements can be matched by models of AGB stars of super-solar metallicity. The match is favoured by stronger convective overshoot, leading to a deeper dredge-up of the H-rich envelope into the He-rich region, and/or a smaller (∼half the standard) mass of the region rich in the ¹³C nuclei that produce free neutrons via the ¹³C(α,n)¹⁶O reaction. We conclude that nucleosynthesis in AGB stars can match the difference in the magnitude of the bulk meteoritic variations in Zr and Nd, provided that super-solar metallicity stars are the original site of these signatures. The AGB stars that produced such variations could have belonged to the current population of old, super-solar metallicity stars seen in the galactic solar neighbourhood.
- Research Article
- 10.1038/s41598-026-43884-z
- Apr 18, 2026
- Scientific Reports
- Qiangqiang Li + 1 more
This paper systematically investigates the influence of the thickness and power-law exponent of functionally graded rings (FGR) on the bandgap characteristics of two-dimensional (2D) phononic metamaterials (PMs). Based on the plane wave expansion method (PWEM), a spatial variation model for the material parameters of FGR is established, the dispersion relations of elastic waves in 2D PMs are derived, and the evolution of the first bandgap is analyzed. The results indicate that an increase in the thickness of FGR shifts both the upper and lower boundaries of the bandgap upward, while the bandgap width initially increases and then decreases, reaching a maximum at a specific thickness. Variations in the power-law exponent significantly affect the width and center frequency of the bandgap, demonstrating a notable nonlinear regulatory effect. On this basis, single-objective optimization of the bandgap width was achieved by a genetic algorithm combined with PWEM, while multi-objective optimization of both bandgap width and center frequency was realized using the non-dominated sorting genetic algorithm (NSGA-II) in conjunction with PWEM. The Pareto optimal front indicates that there is a favorable competitive relationship between the bandgap width and the center frequency, and the optimal parameters are not limited to the boundary values.
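To make the role of the power-law exponent concrete, the sketch below evaluates a material parameter that varies across the thickness of a ring according to a simple power law. The abstract does not give the grading law used in the paper, so the formula, the function name, and every parameter here are assumptions chosen only to illustrate a commonly used form.

```python
def graded_property(r, r_inner, r_outer, p_inner, p_outer, exponent):
    """Material parameter at radius r inside a functionally graded ring.

    Assumed law: P(r) = P_in + (P_out - P_in) * ((r - r_in) / (r_out - r_in)) ** n
    """
    if not r_inner <= r <= r_outer:
        raise ValueError("r must lie between r_inner and r_outer")
    fraction = (r - r_inner) / (r_outer - r_inner)  # 0 at inner face, 1 at outer face
    return p_inner + (p_outer - p_inner) * fraction ** exponent
```

With an exponent of 1 the parameter changes linearly through the ring; exponents above 1 keep most of the ring close to the inner value, and exponents below 1 keep it close to the outer value, which is one way such an exponent can act as a nonlinear tuning knob.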
- Research Article
- 10.4208/nmtma.oa-2025-0089
- Apr 14, 2026
- Numerical Mathematics: Theory, Methods and Applications
- Wei Wang + 1 more
In this paper, we propose and develop a novel variational model based on hue-saturation similarity and fuzzy membership function for color image segmentation. The main contribution of the proposed model is that we determine different segments by using the similarity of hue and saturation information in hue, saturation, and value color space. We first provide specific definitions of the hue/saturation distance to describe hue-saturation similarity, then formulate a novel data fitting term with an adaptive weight coefficient by using hue-saturation similarity in the proposed energy functional. Two efficient iterative algorithms based on coordinate descent method and alternating direction method of multipliers have been proposed to solve the proposed optimization problem. Theoretically we study the existence of the solution of the proposed model and the convergence of the proposed coordinate descent algorithm. Numerical experimental results demonstrate that the segmentation performance of the proposed model is much better than that of other existing color image segmentation methods.
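The abstract does not reproduce the paper's definitions of hue and saturation distance. As a hedged illustration of why hue needs special care, the sketch below uses a common convention: hue is an angle, so its distance wraps around at 360 degrees, while saturation is compared directly. The function names and the fixed `hue_weight` are assumptions and do not represent the adaptive weight coefficient described in the paper.

```python
def hue_distance(h1_deg, h2_deg):
    """Shortest angular distance between two hues, in degrees (0 to 180)."""
    diff = abs(h1_deg - h2_deg) % 360.0
    return min(diff, 360.0 - diff)

def saturation_distance(s1, s2):
    """Absolute difference between two saturations in [0, 1]."""
    return abs(s1 - s2)

def hue_saturation_distance(p, q, hue_weight=0.5):
    """Weighted combination for two (hue_deg, saturation) pairs; result lies in [0, 1]."""
    h = hue_distance(p[0], q[0]) / 180.0  # normalize hue distance to [0, 1]
    s = saturation_distance(p[1], q[1])
    return hue_weight * h + (1.0 - hue_weight) * s
```

For example, hues of 350 and 10 degrees are both reds and are only 20 degrees apart under this measure, whereas a plain subtraction would report 340.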
- Research Article
- 10.3390/info17040356
- Apr 8, 2026
- Information
- Chao Duan + 6 more
The rapid development of intelligent learning guidance systems has created a favorable environment for personalized learning. By accurately predicting students’ future performance, education can be tailored and teaching strategies optimized. However, traditional prediction algorithms seldom account for highly imbalanced datasets in basic education, overlook temporal factors, and lack further interpretability of the prediction results. To address these shortcomings, we propose the Temporal Variational Autoencoder-Generative Adversarial Network (TVAE-GAN), a model aimed at providing early warnings for high-risk students in basic education, with in-depth interpretability analysis of the prediction results to suit the unique context of basic education. TVAE-GAN extracts features from real samples and introduces a Long Short-Term Memory (LSTM) network to capture dynamic features in time series, helping the model better understand temporal dependencies in the data, remember the sequential causal information of students’ online learning, and achieve better data generation performance. Using these features, the generative model generates new samples, and the discriminator model evaluates their quality, producing outputs that closely resemble real samples through training. The effectiveness of the TVAE-GAN model is validated on a collected online basic education dataset, where it also allows interventions to be triggered earlier. The performance differences between the proposed method and classic resampling methods, as well as their impact in the educational field, are analyzed, highlighting that misclassification increases teacher workload and affects students’ emotions. Key influencing factors are identified using a decision-tree surrogate model, providing teachers with multidimensional references for academic assessment.
- Research Article
- 10.7554/elife.105822
- Apr 7, 2026
- eLife
- Lucas Inchausti + 20 more
Trypanosoma cruzi, the causative agent of Chagas disease, presents a major public health challenge in Central and South America, affecting approximately 8 million people and placing millions more at risk. The T. cruzi life cycle includes transitions between epimastigote, metacyclic trypomastigote, amastigote, and blood trypomastigote stages, each marked by distinct morphological and molecular adaptations to different hosts and environments. Unlike other trypanosomatids such as Trypanosoma brucei, T. cruzi does not employ a monoallelic model of antigenic variation; instead, it relies on a diverse repertoire of cell-surface associated proteins encoded by large multigene families, which are essential for infectivity and immune evasion. This study analyzes cell-specific transcriptomes using single-cell RNA sequencing of amastigote and trypomastigote cells to characterize stage-specific surface protein expression during mammalian infection. Through clustering and identification of cell-specific markers, we assigned cells to distinct parasite developmental forms. Analysis of individual cells revealed that surface protein-coding genes, especially members of the trans-sialidase-like superfamily (TcS), are expressed with greater heterogeneity than single-copy genes. Moreover, no recurrent combinations of TcS genes were observed between individual cells in the population. Remarkably, a small subset of TcS mRNAs, encoded by genes preferentially located in the core genomic compartment, are frequently detected across the cell population, whereas the vast majority of TcS mRNAs show low detection frequencies and are mainly encoded in the disruptive compartment. Our findings thus reveal transcriptomic heterogeneity within trypomastigote populations where each cell displays unique TcS expression profiles. Focusing on the diversity of surface protein expression, this research aims to deepen our understanding of T. cruzi cellular biology and infection strategies.
- Research Article
- 10.64898/2026.04.05.26350202
- Apr 6, 2026
- medRxiv: the preprint server for health sciences
- Peijie Qiu + 5 more
Multimodal medical image analysis exploits complementary information from multiple data sources (e.g., multi-contrast Magnetic Resonance Imaging (MRI), Diffusion Tensor Imaging (DTI), and Positron Emission Tomography (PET)) to enhance diagnostic accuracy and support clinical decision-making. Central to this process is the learning of robust representations that capture both modality-invariant and modality-specific features, which can then be leveraged for downstream tasks such as MRI segmentation and normative modeling of population-level variation and individual deviations. However, learning robust and generalizable representations becomes particularly challenging in the presence of missing modalities and heterogeneous data distributions. Most existing methods address this challenge primarily from a statistical perspective, yet they lack a theoretical understanding of the underlying geometric behavior, such as how probability mass is allocated across modalities. In this paper, we introduce a generalized geometric perspective for multimodal representation learning grounded in the concept of barycenters, which unifies a broad class of existing methods under a common theoretical perspective. Building on this barycentric formulation, we propose a novel approach that leverages generalized Wasserstein barycenters with hierarchical modality-specific priors to better preserve the geometry of unimodal distributions and enhance representation quality. We evaluated our framework on two key multimodal tasks, brain tumor MRI segmentation and normative modeling, demonstrating consistent improvements over a variety of multimodal approaches. Our results highlight the potential of scalable, theoretically grounded approaches to advance robust and generalizable representation learning in medical imaging applications.
- Research Article
- 10.1007/s10278-026-01934-y
- Apr 2, 2026
- Journal of Imaging Informatics in Medicine
- Bingzhen Wang + 9 more
This study proposes a Residual Conditional Variational Autoencoder model (ResCVAE-Harmonizer) that integrates batch information and clinical covariates for multi-center feature harmonization and systematically and comprehensively evaluates its harmonization performance. This study collected 806 cases from 9 different centers. After preprocessing, three types of features were extracted from PET and CT images: low-dimensional radiomic features, high-dimensional radiomic features, and deep learning features based on 3D-DenseNet-121. Each feature type was harmonized using ComBat, CovBat, and the proposed ResCVAE-Harmonizer. Both harmonized and original features were included in a comprehensive evaluation framework comprising variance homogeneity analysis, multi-center classification test, and downstream task effectiveness evaluation. The ResCVAE-Harmonizer significantly improved cross-center feature consistency. Levene's test results showed a general reduction in - log10(p) values after harmonization, with more pronounced improvements observed in low- and high-dimensional radiomic features. In center classification tasks, ResCVAE-harmonized features demonstrated greater stability across four classifiers and outperformed the original features. For the downstream survival prediction task, PET deep learning features processed by ResCVAE achieved the highest C-index (0.8920, 95% CI 0.8514-0.9325), surpassing those of the original features (0.8765), ComBat (0.8909), and CovBat (0.8455). Similarly, the C-index for CT deep features improved to 0.8296 (95% CI 0.7715-0.8877). Kaplan-Meier survival stratification based on ResCVAE features showed clearer separation between high- and low-risk groups, with statistically significant log-rank test results. 
While slightly inferior to ComBat in linear variance consistency, ResCVAE-Harmonizer effectively eliminated both linear and nonlinear batch effects and significantly enhanced survival prediction performance, demonstrating strong research potential.
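The C-index values quoted above measure how often a model ranks patients correctly by risk. The sketch below implements the usual pairwise definition (Harrell's concordance index) in plain Python. It is a generic illustration with hypothetical names, not the evaluation code used in the study, and it uses a simple quadratic loop rather than an optimized routine.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: observed follow-up times.
    events: 1 (or True) if the event was observed, 0 if censored.
    risk_scores: higher score means higher predicted risk (shorter survival).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event correctly ranked as higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported differences between harmonization methods should be read.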
- Research Article
- 10.1109/tnnls.2026.3677762
- Apr 2, 2026
- IEEE Transactions on Neural Networks and Learning Systems
- Ratun Rahman + 1 more
Quantum convolutional neural networks (QCNNs) are a highly appealing architecture that combines quantum computing and deep learning. Inspired by classical convolutional neural network (CNN) hierarchical feature extraction, QCNNs use quantum operations such as entanglement, superposition, and measurement to capture complex correlations in high-dimensional datasets. Since their introduction, QCNNs have evolved into several architectural variants, including fully quantum, variational, hybrid, and graph-based models, and have been investigated in applications ranging from quantum many-body physics to classical machine learning tasks such as image classification, speech recognition, and time-series forecasting. Software ecosystems such as Qiskit Machine Learning, PennyLane, and TensorFlow Quantum (TFQ) have facilitated rapid development and experimentation, speeding up research. Despite these developments, existing research remains fragmented, with many studies focused on individual implementations and lacking a unifying taxonomy or full review. This article addresses that gap by providing a systematic and holistic survey of QCNNs, offering comparative insights across key architectures, applications, and toolboxes, and outlining open challenges. We also identify potential study areas, such as scalable architectures, domain-knowledge integration, fault tolerance, and security. This survey seeks to serve as a basic reference for furthering QCNN research in the current near-term quantum era and beyond, providing both architectural insights and application-driven viewpoints.