Summary

The HeLa cell line was established in 1951 from cervical cancer cells taken from a patient, Henrietta Lacks, marking the first successful attempt to continually culture human-derived cells in vitro [1]. HeLa’s robust growth and unrestricted distribution resulted in its broad adoption – both intentionally and through widespread cross-contamination [2] – and for the past sixty years it has served a role analogous to that of a model organism [3]. Its cumulative impact is illustrated by the fact that HeLa is named in >74,000 (~0.3%) of PubMed abstracts. The genomic architecture of HeLa remains largely unexplored beyond its karyotype [4], in part because, as with many cancers, its extensive aneuploidy renders such analyses challenging. We performed haplotype-resolved whole genome sequencing [5] of the HeLa CCL-2 strain, discovering point and indel variation, mapping copy-number and loss of heterozygosity (LOH), and phasing variants across full chromosome arms. We further investigated variation and copy-number profiles for HeLa S3 and eight additional strains. Surprisingly, HeLa is relatively stable with respect to point variation, accumulating few new mutations since early passaging. Haplotype resolution facilitated reconstruction of an amplified, highly rearranged region at chromosome 8q24.21 at which the HPV-18 viral genome integrated as the likely initial event underlying tumorigenesis. We combined these maps with RNA-Seq [6] and ENCODE Project [7] datasets to phase the HeLa epigenome, revealing strong, haplotype-specific activation of the proto-oncogene MYC by the integrated HPV-18 genome ~500 kilobases upstream, and permitting global analyses of the relationship between gene dosage and expression. These data provide an extensively phased, high-quality reference genome for past and future experiments relying on HeLa, and demonstrate the value of haplotype resolution for characterizing cancer genomes and epigenomes.