Abstract
Deep reinforcement learning is an emerging machine-learning approach that can teach a computer to learn from its actions and rewards, much as humans learn from experience. It offers many advantages in automating decision processes to navigate large parameter spaces. This paper proposes an approach to the efficient measurement of quantum devices based on deep reinforcement learning. We focus on double quantum dot devices, demonstrating the fully automatic identification of specific transport features called bias triangles. Measurements targeting these features are difficult to automate, since bias triangles are found in otherwise featureless regions of the parameter space. Our algorithm identifies bias triangles in a mean time of less than 30 min, and sometimes in as little as 1 min. This approach, based on dueling deep Q-networks, can be adapted to a broad range of devices and target transport features. This is a crucial demonstration of the utility of deep reinforcement learning for decision making in the measurement and operation of quantum devices.
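The abstract names dueling deep Q-networks without describing them. As a rough illustration only, the sketch below shows the mean-subtracted aggregation commonly used in dueling architectures, Q(s, a) = V(s) + A(s, a) − mean over a′ of A(s, a′), which combines a state-value estimate with per-action advantages. The function names, the four-action example and all numbers are hypothetical; the network and action set used in the paper are not given in this summary.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Mean-subtracted aggregation commonly used in dueling networks:
    Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values):
    """Index of the action with the largest Q-value (first one on ties)."""
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical numbers standing in for the outputs of the two network streams.
q = dueling_q_values(state_value=1.0, advantages=[0.5, -0.5, 0.0, 0.0])
best = greedy_action(q)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable: adding a constant to every advantage leaves the resulting Q-values unchanged.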
Highlights
Reinforcement learning (RL) is a neurobiologically inspired machine-learning paradigm where an RL agent will learn policies to successfully navigate or influence the environment
The gate voltage regions we explore are delimited by a 640 mV × 640 mV window centred on the gate voltage coordinates proposed by a super coarse tuning algorithm, as mentioned in the Introduction, and the current traces in this stage have a resolution of 6.4 mV (npj Quantum Information (2021) 100)
The random agent was initialised in the same random positions as the deep reinforcement learning (DRL) agent so that a fair comparison could be made between their performances
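The window size and resolution quoted above imply roughly 640 / 6.4 = 100 sample points per gate axis. The sketch below works through that arithmetic and shows one way to give two agents identical random starting positions for a fair comparison. It rests on several assumptions not stated in this summary: that the window is sampled on exactly 100 points starting at its lower edge, that start positions are individual pixels rather than larger blocks, and that a seeded generator is used. The centre voltage, seed and function names are hypothetical.

```python
import random

WINDOW_MV = 640.0      # window side length, from the text
RESOLUTION_MV = 6.4    # spacing between measured points, from the text


def window_axis(centre_mv):
    """Voltages (in mV) sampled along one gate axis of the window."""
    n_points = round(WINDOW_MV / RESOLUTION_MV)  # 100 points per axis
    start = centre_mv - WINDOW_MV / 2
    return [start + i * RESOLUTION_MV for i in range(n_points)]


def shared_start_positions(n_runs, n_points, seed):
    """Random (row, column) start pixels, reproducible from a seed."""
    rng = random.Random(seed)
    return [(rng.randrange(n_points), rng.randrange(n_points))
            for _ in range(n_runs)]


# Hypothetical centre proposed by an earlier tuning stage.
axis = window_axis(centre_mv=-500.0)

# Both agents receive the same list, so they start from identical positions.
starts = shared_start_positions(n_runs=5, n_points=len(axis), seed=0)
drl_agent_starts = list(starts)
random_agent_starts = list(starts)
```

Drawing the start positions once and handing the same list to both agents means any difference in performance reflects the policy rather than a luckier starting point.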
Summary
Reinforcement learning (RL) is a neurobiologically inspired machine-learning paradigm where an RL agent will learn policies to successfully navigate or influence the environment. The potential of deep reinforcement learning for the efficient measurement of quantum devices is still unexplored. Singlet–triplet qubits encoded in double quantum dots[22] have demonstrably long coherence times[23,24], as well as high one-[25] and two-qubit[26,27,28] gate fidelities. Quantum dot devices are subject to variability, and many measurements are required to characterise each device and find the conditions for qubit operation. Machine learning has been used to automate the tuning of devices from scratch, known as super coarse tuning[34,35,36], the identification of single or double quantum dot regimes, known as coarse tuning[37,38], and the tuning of the inter-dot tunnel couplings and other device parameters, referred to as fine tuning[39,40,41].