Abstract

Model bias is an inherent limitation of the current dominant approach to optimal quantum control, which relies on a system simulation for optimization of control policies. To overcome this limitation, we propose a circuit-based approach for training a reinforcement learning agent on quantum control tasks in a model-free way. Given a continuously parameterized control circuit, the agent learns its parameters through trial-and-error interaction with the quantum system, using measurement outcomes as the only source of information about the quantum state. Focusing on control of a harmonic oscillator coupled to an ancilla qubit, we show how to reward the learning agent using measurements of experimentally available observables. We train the agent to prepare various non-classical states using both unitary control and control with adaptive measurement-based quantum feedback, and to execute logical gates on encoded qubits. This approach significantly outperforms widely used model-free methods in terms of sample efficiency. Our numerical work is of immediate relevance to superconducting circuits and trapped ions platforms where such training can be implemented in experiment, allowing complete elimination of model bias and the adaptation of quantum control policies to the specific system in which they are deployed.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call