The human genome is the ultimate identifier of an individual, even though most of it is identical across human beings. Biological differences between two individuals are encoded in a set of base-pair variations called Single Nucleotide Polymorphisms (SNPs), which may reveal personal information such as skin color and susceptibility to diseases. The large scale of the human genome necessitates outsourcing genomic computations to public clouds, which raises serious privacy concerns. The fact that the human reference template is public poses additional challenges. In this paper, we propose a two-cloud private read alignment algorithm using the Burrows-Wheeler Transform and the FM-Index. Our algorithm runs in the same order of complexity as the core FM-Index alignment algorithm without privacy. Our proposed scheme achieves accuracy comparable to modern alignment algorithms such as Bowtie while preserving complete privacy.
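
For orientation, the non-private core that the privacy-preserving scheme builds on, exact-match backward search over an FM-Index, can be sketched as follows. This is a minimal illustration only and not the two-cloud protocol proposed in the paper. The function names `build_fm_index` and `backward_search`, the naive suffix-array construction, and the toy reference string are assumptions chosen for brevity; a practical aligner would use compressed, sampled tables.

```python
from collections import Counter


def build_fm_index(text):
    """Build the suffix array, C table, and Occ table for `text`.

    Hypothetical helper for illustration; uses a naive O(n^2 log n) sort.
    """
    text += "$"  # sentinel, lexicographically smaller than A, C, G, T
    # Suffix array: start positions of all suffixes in sorted order.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (index -1 wraps to '$').
    bwt = "".join(text[i - 1] for i in sa)

    # c_table[ch] = number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Return sorted start positions of exact matches of `pattern`."""
    lo, hi = 0, len(sa)  # half-open interval of matching suffix-array rows
    for ch in reversed(pattern):  # extend the match one character leftward
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


reference = "GATTACAGATTACA"  # toy reference, not real genomic data
index = build_fm_index(reference)
print(backward_search("ATTA", *index))
```

The search loop performs one table lookup pair per read character, so its cost is linear in the read length and independent of the reference length. This is the complexity baseline the abstract refers to as the core FM-Index alignment algorithm without privacy.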