Abstract

Background: The cell line BT-474 is a popular cell line for studying the biology of cancer and developing novel drugs. However, there is no complete, published genome sequence for this highly utilized scientific resource. In this study we sought to provide a comprehensive and useful data set for the scientific community by generating a whole genome sequence for BT-474.

Findings: Five μg of genomic DNA, isolated from an early passage of the BT-474 cell line, was used to generate a whole genome sequence (114X coverage) using Complete Genomics’ standard sequencing process. To provide additional variant phasing and structural variation data we also processed and analyzed two separate libraries of 5 and 6 individual cells to depths of 99X and 87X, respectively, using Complete Genomics’ Long Fragment Read (LFR) technology.

Conclusions: BT-474 is a highly aneuploid cell line with an extremely complex genome sequence. This ~300X total coverage genome sequence provides a more complete understanding of this highly utilized cell line at the genomic level.

Electronic supplementary material: The online version of this article (doi:10.1186/s13742-016-0113-x) contains supplementary material, which is available to authorized users.

Highlights

  • The cell line BT-474 is a popular cell line for studying the biology of cancer and developing novel drugs

  • The cell line BT-474 was isolated by Lasfargues et al. [1] in 1978 from a biopsy of invasive ductal carcinoma from a 60-year-old Caucasian female

  • At the time of writing, the PubMed search term “BT-474 OR BT474” returned 973 unique articles


Summary

Introduction

The cell line BT-474 is a popular cell line for studying the biology of cancer and developing novel drugs. Read data of 343, 298, and 261 Gb from the standard (STD), LFR1, and LFR2 libraries, respectively, were mapped to the NCBI human reference genome (build 37) using Complete Genomics’ pipeline [3, 5, 6] (Table 1), resulting in roughly 100X coverage in each library.
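The coverage figures quoted in this article can be checked with simple arithmetic. The Python sketch below sums the reported per-library depths (114X, 99X, and 87X) and, separately, estimates depth as read data divided by genome size. The genome size of 3.0 Gb is a round value assumed here purely for illustration, and the names `ASSUMED_GENOME_SIZE_GB` and `estimate_depth` are hypothetical; this is not a description of how Complete Genomics' pipeline computes coverage.

```python
# Per-library coverage as reported in the abstract.
reported_depth = {"STD": 114, "LFR1": 99, "LFR2": 87}

# Read data per library in gigabases, as reported in the text.
read_data_gb = {"STD": 343, "LFR1": 298, "LFR2": 261}

# ASSUMPTION: a round haploid genome size, chosen only for illustration.
ASSUMED_GENOME_SIZE_GB = 3.0


def estimate_depth(gigabases, genome_size_gb=ASSUMED_GENOME_SIZE_GB):
    """Rough average depth: total sequenced bases divided by genome size."""
    return gigabases / genome_size_gb


# Total coverage across the three libraries.
total_depth = sum(reported_depth.values())

# Back-of-the-envelope depth per library from the raw read data.
estimated = {lib: estimate_depth(gb) for lib, gb in read_data_gb.items()}

print(f"Total reported coverage: {total_depth}X")
for lib, depth in estimated.items():
    print(f"{lib}: estimated {depth:.1f}X, reported {reported_depth[lib]}X")
```

Under this assumed genome size the estimates fall within about 1X of the reported depths, and the three reported depths add up to the ~300X total given in the abstract.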

