Abstract
We have developed a novel analysis method that can interrogate the authenticity of biological samples used for generation of transcriptome profiles in public data repositories. The method uses RNA sequencing information to reveal mutations in expressed transcripts and subsequently confirms the identity of the analysed cells by comparison with publicly available cell-specific mutational profiles. Cell lines constitute key model systems widely used within cancer research, but their identity needs to be confirmed in order to minimise the influence of cell contamination and genetic drift on the analysis. Using both public and novel data, we demonstrate the use of RNA sequencing data analysis for cell line authentication by examining the validity of the COLO205, DLD1, HCT15, HCT116, HKE3, HT29 and RKO colorectal cancer cell lines. We successfully authenticate the studied cell lines and validate previous reports indicating that DLD1 and HCT15 are synonymous. We also show that the analysed HKE3 cells harbour an unexpected KRAS-G13D mutation and confirm that this cell line is a genuine KRAS dosage mutant, rather than a true isogenic derivative of HCT116 expressing only wild-type KRAS. This authentication method could be used to revisit the numerous cell-line-based RNA sequencing experiments available in public data repositories, to analyse new experiments where whole genome sequencing is not available, and to facilitate comparisons of data from different experiments, platforms and laboratories.
Highlights
Human cell lines are widely used as in vitro model systems in cancer research because they can replace scarce and valuable human samples.
We show that comparing RNA sequencing (RNA-seq) data from several colorectal cancer cell lines (COLO205, DLD1, HCT15, HCT116, HKE3, HT29 and RKO) to databases such as the Catalogue of Somatic Mutations in Cancer (COSMIC) [15] can authenticate cell lines to a high degree of certainty, give in-depth information about errors in known variants, and point to possible HeLa contamination.
Cirulli et al. [4] have previously shown that variants called from RNA-seq data cover 40% of those found by whole genome sequencing, and up to 81% when filtering for expressed genes.
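The comparison described in these highlights can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `profile_match` and `rank_cell_lines`, the variant representation, and all toy data below are assumptions made for illustration only. The idea is to score each reference mutational profile by the fraction of its variants that are recovered in the RNA-seq calls, counting only reference variants at sites with read coverage, since variants in unexpressed genes cannot be observed in RNA-seq data.

```python
from typing import Dict, List, Optional, Set, Tuple

# A variant is (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]
# A site is (chromosome, position).
Site = Tuple[str, int]


def profile_match(observed: Set[Variant],
                  reference: Set[Variant],
                  covered: Set[Site]) -> Optional[float]:
    """Fraction of assessable reference variants that were also observed.

    A reference variant is assessable only if its site has RNA-seq
    coverage. Returns None when no reference variant is assessable.
    """
    assessable = {v for v in reference if (v[0], v[1]) in covered}
    if not assessable:
        return None
    return len(assessable & observed) / len(assessable)


def rank_cell_lines(observed: Set[Variant],
                    profiles: Dict[str, Set[Variant]],
                    covered: Set[Site]) -> List[Tuple[str, float]]:
    """Rank reference profiles by match fraction, best match first."""
    scored = []
    for name, reference in profiles.items():
        score = profile_match(observed, reference, covered)
        if score is not None:
            scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical coordinates, not real variants or cell lines).
profiles = {
    "line_A": {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A"),
               ("chr3", 300, "A", "G")},
    "line_B": {("chr1", 100, "C", "T"), ("chr4", 400, "T", "C")},
}
observed = {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A")}
covered = {("chr1", 100), ("chr2", 200), ("chr4", 400)}

ranking = rank_cell_lines(observed, profiles, covered)
print(ranking)
```

In this toy case the variant at chr3:300 is excluded for `line_A` because the site has no coverage, so `line_A` is scored on two assessable variants rather than three. A real analysis would additionally need variant calling, quality filtering, and a decision threshold, none of which are shown here.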
Summary
Human cell lines are widely used as in vitro model systems in cancer research because they can replace scarce and valuable human samples. Cell lines offer an unlimited source of biological material and represent homogeneous cell type populations, which facilitates both experimental procedures and interpretation of results in comparison to the analysis of tissues and organs. They are easy to use, since well-developed protocols are available for culturing, genetic manipulation, molecular analysis and other assay-based experiments. Using cell lines to model human biology, test the efficacy of therapies and produce therapeutic proteins is common practice in research, yet it is widely acknowledged that contamination of these cell lines is a prevalent problem. Genetic drift and other subculturing effects can also affect a cell line's suitability as an experimental model system, so prolonged culturing should be avoided [5].