Animal hibernation is a hypometabolic state that may inform translational research aimed at improving clinical outcomes for hypoxic patients whose oxygen supply does not match their demand. Bears are an excellent animal model: during hibernation they decrease their body temperature to only 30-35°C and suppress their metabolism to 25% of normal resting levels, while having a physiology more comparable to humans than deep hibernators such as small rodents. Because little is known about their sleep patterns during hibernation, we used biotelemetry to continuously monitor a variety of physiological parameters from 16 captive American black bears, in and out of hibernation, over more than 3500 recording days. We recorded the EEG, EOG and EMG signals that are commonly used to determine wake, REM sleep and NREM sleep vigilance states in conventional animal models (polysomnography). Such a data set is too large to annotate fully by hand, so we compared two automated approaches. We first manually annotated two one-day recordings at body temperature extremes from each of 6 bears during hibernation, when body temperatures were oscillating widely in multiday cycles with intermittent bouts of shivering (Tøien et al. 2011). We also manually scored a one-day, non-hibernating recording at normal (summer) body temperature from each of these bears. Based on this reference data set, we evaluated two automated scoring applications built on different machine learning classifiers: the open-source Somnotate (author Paul Brodersen), which is trained on multiple files, and the proprietary Somnivore (author Giancarlo Allocca), which is trained on a subset of epochs from each individual recording. Somnotate gave the best results during hold-one-out testing when separate models were used for hibernating and non-hibernating data, and then performed comparably in both conditions. Somnivore was by design trained on a small subset (100 epochs of each kind) of each file to be tested.
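The hold-one-out testing mentioned above can be sketched as follows; this is an illustrative outline of the splitting scheme only, with hypothetical recording names, and does not reproduce either application's actual interface.

```python
def hold_one_out_splits(recordings):
    """Hold-one-out evaluation: for each recording, train a model on all
    other recordings and test it on the held-out one.

    Yields (training_set, held_out_recording) pairs, one per recording.
    """
    for i, held_out in enumerate(recordings):
        training_set = recordings[:i] + recordings[i + 1:]
        yield training_set, held_out


# Hypothetical file names for illustration: one split per recording.
splits = list(hold_one_out_splits(["bear1.edf", "bear2.edf", "bear3.edf"]))
```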
Both applications typically achieved F-measures in the 0.90-0.97 range against the manual reference scores. Outliers in the lower 0.72-0.88 range were highly correlated between the two applications, indicating that specific files are more challenging to annotate, whether manually, automatically, or both. We conclude that both applications have accuracies on par with manual scorers when trained on high-quality data. Supported by NIH COBRE under grant number P20GM130443. This is the full abstract presented at the American Physiology Summit 2024 meeting. Physiology was not involved in the peer review process.
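The per-state F-measure used to compare automated against manual scores can be computed epoch by epoch as follows; the state labels and example sequences are illustrative assumptions, not data from the study.

```python
def per_state_f1(reference, predicted, states=("Wake", "NREM", "REM")):
    """Per-state F-measure (F1) between manual reference scores and
    automated scores, computed one-vs-rest over epoch labels."""
    scores = {}
    for s in states:
        tp = sum(r == s and p == s for r, p in zip(reference, predicted))
        fp = sum(r != s and p == s for r, p in zip(reference, predicted))
        fn = sum(r == s and p != s for r, p in zip(reference, predicted))
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the state is absent.
        scores[s] = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    return scores


# Hypothetical epoch-by-epoch labels for illustration only.
ref = ["Wake", "Wake", "NREM", "NREM", "REM", "NREM"]
pred = ["Wake", "NREM", "NREM", "NREM", "REM", "NREM"]
f1 = per_state_f1(ref, pred)
```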