Abstract
Quantum neural networks (QNNs) have generated excitement around the possibility of efficiently analyzing quantum data. But this excitement has been tempered by the existence of exponentially vanishing gradients, known as barren plateau landscapes, for many QNN architectures. Recently, quantum convolutional neural networks (QCNNs) have been proposed, involving a sequence of convolutional and pooling layers that reduce the number of qubits while preserving information about relevant data features. In this work, we rigorously analyze the gradient scaling for the parameters in the QCNN architecture. We find that the variance of the gradient vanishes no faster than polynomially, implying that QCNNs do not exhibit barren plateaus. This result provides an analytical guarantee for the trainability of randomly initialized QCNNs, which highlights QCNNs as being trainable under random initialization, unlike many other QNN architectures. To derive our results, we introduce a novel graph-based method to analyze expectation values over Haar-distributed unitaries, which will likely be useful in other contexts. Finally, we perform numerical simulations to verify our analytical results.
Received 12 March 2021; revised 13 July 2021; accepted 2 August 2021. DOI: https://doi.org/10.1103/PhysRevX.11.041011
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.
Physics Subject Headings (PhySH), Research Areas: Machine learning; Quantum algorithms; Quantum computation; Quantum information
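Schematically, the scaling claim of the abstract can be contrasted with the barren-plateau case as follows (an illustrative restatement rather than the paper's exact bound; b and F(n) are placeholder symbols):
\[
\text{barren plateau:}\quad \operatorname{Var}\!\big[\partial_\mu C\big] \in O\!\left(b^{-n}\right),\ b>1,
\qquad
\text{QCNN (this work):}\quad \operatorname{Var}\!\big[\partial_\mu C\big] \,\ge\, F(n) \in \Omega\!\left(1/\operatorname{poly}(n)\right),
\]
where n is the number of qubits, C is the cost function, and \(\partial_\mu C\) its partial derivative with respect to a trainable parameter \(\theta_\mu\).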
Highlights
The field of classical machine learning has been revolutionized by the advent of neural networks (NNs).
Our second main result is a lower bound on Var[∂_μ C] for the quantum convolutional neural network (QCNN) architecture, obtained with the graph recursion integration method (GRIM); a minimal numerical sketch of estimating this variance follows these highlights.
We present our results, which guarantee the trainability of a pooling-based QCNN.
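The following is not the paper's numerics and not its QCNN ansatz: it is a minimal NumPy sketch of how one can empirically estimate Var[∂_μ C] under random initialization for a toy two-qubit circuit, using the parameter-shift rule. The ansatz, cost observable, sample count, and all function names are illustrative assumptions.

# Minimal sketch (not the paper's code): Monte Carlo estimate of the gradient
# variance Var[dC/dtheta_0] for a toy two-qubit parameterized circuit under
# random initialization, using the parameter-shift rule.
import numpy as np

I2 = np.eye(2)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def ry(theta):
    """Single-qubit Y rotation RY(theta)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * Y

def circuit_state(thetas):
    """Apply RY(theta_0) x RY(theta_1), then a CNOT, to |00>."""
    psi = np.zeros(4, dtype=complex)
    psi[0] = 1.0
    psi = np.kron(ry(thetas[0]), ry(thetas[1])) @ psi
    return CNOT @ psi

def cost(thetas):
    """Cost C = expectation value of Z on qubit 0."""
    psi = circuit_state(thetas)
    obs = np.kron(Z, I2)
    return np.real(np.vdot(psi, obs @ psi))

def grad_param_shift(thetas, mu):
    """Exact derivative dC/dtheta_mu via the parameter-shift rule."""
    shift = np.zeros_like(thetas)
    shift[mu] = np.pi / 2
    return 0.5 * (cost(thetas + shift) - cost(thetas - shift))

rng = np.random.default_rng(0)
samples = [grad_param_shift(rng.uniform(0, 2 * np.pi, size=2), mu=0)
           for _ in range(2000)]
print("estimated Var[dC/dtheta_0] over random initializations:", np.var(samples))

For this toy circuit C = cos(theta_0), so the estimate should be close to the exact value 1/2; the point is only the Monte Carlo procedure for sampling gradients under random initialization, not the scaling result itself.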
Summary
The field of classical machine learning has been revolutionized by the advent of neural networks (NNs). Quantum neural networks (QNNs) are their quantum analog: they employ noisy quantum hardware to evaluate a cost (or loss) function, while leveraging the power of classical optimizers to train the parameters of a quantum circuit. We show that, for the quantum convolutional neural network (QCNN) architecture, the variance of the cost-function partial derivatives vanishes no faster than polynomially in the system size. This implies that the cost-function landscape does not exhibit a barren plateau, and that the QCNN architecture is trainable under random initialization of its parameters. The QCNN takes as input an n-qubit state ρ_in in a Hilbert space H_in, which is sent through a circuit composed of a sequence of convolutional and pooling layers. After the final pooling layer, one applies a fully connected unitary F to the remaining qubits and obtains an output state ρ_out whose dimension is much smaller than that of ρ_in. Note that the nonlinearities in a QCNN arise from the pooling operators (measurement and conditioned unitary) in the pooling layers, which effectively reduce the degrees of freedom in each layer.
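As a rough illustration of the layer structure just described (not the authors' implementation; the pool-by-half convention, the final register size, and all names are assumptions of this sketch), the following tracks how the number of active qubits shrinks through the convolutional and pooling layers before the fully connected unitary F is applied:

# Schematic of the QCNN layer structure (illustrative only: the exact
# convolution/pooling pattern of the paper is not reproduced here).
# Each pooling layer measures roughly half of the remaining qubits and applies
# unitaries conditioned on the outcomes, so the qubit count halves per layer
# until a small register is left for the fully connected unitary F.

def qcnn_layer_schedule(n_qubits: int, n_final: int = 2):
    """Return the number of active qubits after each conv+pool layer."""
    schedule = [n_qubits]
    while schedule[-1] > n_final:
        # convolutional layer: parameterized two-qubit unitaries on
        # neighboring pairs (does not change the number of qubits);
        # pooling layer: measure half the qubits, keep the other half
        schedule.append(max(n_final, schedule[-1] // 2))
    return schedule

n = 16
layers = qcnn_layer_schedule(n)
print("active qubits per layer:", layers)              # [16, 8, 4, 2]
print("number of conv+pool layers:", len(layers) - 1)  # grows as log2(n); here 3

Because each pooling layer roughly halves the qubit count, the number of convolutional-plus-pooling layers grows only logarithmically with n, one structural feature separating QCNNs from the deep, unstructured ansätze associated with barren plateaus.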