Implementation of a Hamming distance–like genomic quantum classifier using inner products on ibmqx2 and ibmq_16_melbourne

Kunal Kathuria, Aakrosh Ratan, Stefan Bekiranov, Michael McConnell

doi:10.1007/s42484-020-00017-7


Open Access

https://doi.org/10.1007/s42484-020-00017-7


Journal: Quantum Machine Intelligence
Publication Date: Jun 1, 2020
Citations: 20
License type: open-access

Affiliation: University of Virginia

Abstract

Motivated by the problem of classifying individuals with a disease versus controls using a functional genomic attribute as input, we present relatively efficient, general-purpose inner product–based kernel classifiers that classify a test sample as normal or disease. We encode each training sample as a string of 1s (presence) and 0s (absence) representing the attribute’s existence across ordered physical blocks of the subdivided genome. Binary-valued features allow highly efficient data encoding in the computational basis for classifiers relying on binary operations. Because a natural distance between binary strings is the Hamming distance, which shares properties with bit-string inner products, our two classifiers apply different inner product measures for classification. The active inner product (AIP) is a direct dot product–based classifier, whereas the symmetric inner product (SIP) classifies by scoring correspondingly matching genomic attributes. SIP is a strongly Hamming distance–based classifier generally applicable to binary attribute-matching problems, whereas AIP has general applications as a simple dot product–based classifier. The classifiers implement an inner product between N = 2^n dimensional test and train vectors using n Fredkin gates, while the training sets are respectively entangled with the class-label qubit, without use of an ancilla. Moreover, each training class can be composed of an arbitrary number m of samples that can be classically summed into one input string to effectively execute all test–train inner products simultaneously. Thus, our circuits require the same number of qubits for any number of training samples and are O(log N) in gate complexity after the states are prepared. Our classifiers were implemented on ibmqx2 (IBM-Q-team 2019b) and ibmq_16_melbourne (IBM-Q-team 2019a). The latter allowed encoding of 64 training features across the genome.
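
As a purely classical illustration of the two inner product measures described in the abstract, the following minimal sketch computes AIP-style and SIP-style scores with NumPy. It is not the quantum circuit from the paper: the function names (`aip_score`, `sip_score`, `classify`), the toy bit strings, and the "highest score wins" decision rule are assumptions made for illustration, and the sketch assumes each class holds the same number of training samples.

```python
import numpy as np


def aip_score(test, train_samples):
    """AIP-style score: dot product of the 0/1 test string with the
    classically summed 0/1 training strings of one class."""
    summed = np.asarray(train_samples).sum(axis=0)  # sum the m samples into one vector
    return int(np.dot(np.asarray(test), summed))


def sip_score(test, train_samples):
    """SIP-style score: map 1 -> +1 and 0 -> -1, then take the dot product.
    For one sample this equals (bit matches) - (bit mismatches)."""
    signed_test = 2 * np.asarray(test) - 1
    signed_train = 2 * np.asarray(train_samples) - 1
    summed = signed_train.sum(axis=0)  # sum the m signed samples into one vector
    return int(np.dot(signed_test, summed))


def classify(test, classes, score=sip_score):
    """Hypothetical decision rule: return the label with the largest score.
    Assumes every class has the same number of training samples."""
    scores = {label: score(test, samples) for label, samples in classes.items()}
    return max(scores, key=scores.get)


# Toy data (illustrative only): 4 genomic blocks, 2 samples per class.
test = [1, 0, 1, 1]
classes = {
    "disease": [[1, 0, 1, 0], [1, 1, 1, 1]],
    "control": [[0, 1, 0, 0], [0, 0, 0, 1]],
}

print(aip_score(test, classes["disease"]), aip_score(test, classes["control"]))
print(sip_score(test, classes["disease"]), sip_score(test, classes["control"]))
print(classify(test, classes))
```

Summing the training strings before taking the dot product gives the same result as summing the individual test–train inner products, which is why the number of samples per class does not change the size of the input vector.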

Highlights

Quantum computing algorithms have been developed that show great promise of significantly improving upon existing classical equivalents in the area of machine learning.
Motivated by the problem of classifying individuals with a disease (e.g., Alzheimer’s disease) given single-cell neuronal genomic copy number variation (CNV) data, we developed a set of quantum classifier circuits that exploit a biologically relevant encoding of the training and test vectors, allowing us to take full advantage of the computational basis of the quantum computer.
In the symmetric inner product (SIP) formulation, the presence and absence of a CNV in a genomic region are encoded with state coefficients of +1 and −1, respectively, in the computational basis, so that the state overlap immediately yields the total number of bit matches minus mismatches.
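
The link between the ±1 encoding and Hamming distance can be made explicit with a short worked identity. The symbols below ($x$, $y$, $s_i$, $t_i$, $d_H$) are notation introduced here for illustration, not taken from the paper, and the $1/\sqrt{N}$ normalization of the encoded states is an assumption.

```latex
% x, y \in \{0,1\}^N are two bit strings; s_i = 2x_i - 1 and t_i = 2y_i - 1
% are their signed (+1 / -1) encodings.
\begin{align}
  \sum_{i=1}^{N} s_i t_i
    &= \#\{\text{matches}\} - \#\{\text{mismatches}\} \\
    &= \bigl(N - d_H(x, y)\bigr) - d_H(x, y) \\
    &= N - 2\, d_H(x, y).
\end{align}
% With normalized states |\psi_x\rangle = \tfrac{1}{\sqrt{N}} \sum_i s_i |i\rangle
% (assumed normalization), the overlap is
\begin{equation}
  \langle \psi_x | \psi_y \rangle = 1 - \frac{2\, d_H(x, y)}{N}.
\end{equation}
```

Under this reading, a larger overlap corresponds to a smaller Hamming distance, which is the sense in which the SIP score is Hamming distance–like.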

Summary

Introduction

Quantum computing algorithms have been developed that show great promise of significantly improving upon existing classical equivalents in the area of machine learning. Quadratic speedups have been theoretically demonstrated for Bayesian inference (Low et al 2014; Wiebe and Granade 2015), online perceptron (Kapoor et al 2016), classical Boltzmann machines (Wiebe et al 2014b), and quantum reinforcement learning (Dunjko et al 2016; Biamonte et al 2017). These speedups presume a low-error-rate, universal quantum computer with hundreds to thousands of qubits. Recent progress has been made in exploiting a relatively natural connection between kernel-based classification and quantum computing.

Methods

Results

Conclusion