Sex-limited experimental evolution drives transcriptomic divergence in a hermaphrodite.

The evolution of gonochorism from hermaphroditism is linked with the formation of sex chromosomes, as well as the evolution of sex-biased and sex-specific gene expression to allow both sexes to reach their fitness optimum. There is evidence that sexual selection drives the evolution of male-biased gene expression in particular. However, previous research in this area in animals comes from either theoretical models or comparative studies of already old sex chromosomes. We therefore investigated changes in gene expression under 3 different selection regimes for the simultaneous hermaphrodite Macrostomum lignano subjected to sex-limited experimental evolution (i.e. selection for fitness via eggs, sperm, or a control regime allowing both). After 21 and 22 generations of selection for male-specific or female-specific fitness, we characterized changes in whole-organism gene expression. We found that female-selected lines had changed the most in their gene expression. Although annotation for this species is limited, gene ontology term and Kyoto Encyclopedia of Genes and Genomes pathway analyses suggest that metabolic changes (e.g. biosynthesis of amino acids and carbon metabolism) are an important adaptive component. As predicted, we found that the expression of genes previously identified as testis-biased candidates tended to be downregulated in the female-selected lines. We did not find any significant expression differences for previously identified candidates of other sex-specific organs, but this may simply reflect that few transcripts have been characterized in this way. In conclusion, our experiment suggests that changes in testis-biased gene expression are important in the early evolution of sex chromosomes and gonochorism.

Open Access
Improved detection of clinically relevant fusion transcripts in cancer by machine learning classification

Background: Genomic rearrangements in cancer cells can create fusion genes that encode chimeric proteins or alter the expression of coding and non-coding RNAs. In some cancer types, fusions involving specific kinases are used as targets for therapy. Fusion genes can be detected by whole genome sequencing (WGS) and targeted fusion panels, but RNA sequencing (RNA-Seq) has the advantageous capability of broadly detecting expressed fusion transcripts. Results: We developed a pipeline for validation of fusion transcripts identified in RNA-Seq data using matched WGS data from The Cancer Genome Atlas (TCGA) and applied it to 910 tumors from 11 different cancer types. This resulted in 4237 validated gene fusions, 3049 of them with at least one identified genomic breakpoint. Utilizing validated fusions as true positive events, we trained a machine learning classifier to predict true and false positive fusion transcripts from RNA-Seq data. The final precision and recall metrics of the classifier were 0.74 and 0.71, respectively, in an independent dataset of 249 breast tumors. Application of this classifier to all samples with RNA-Seq data from these cancer types vastly extended the number of likely true positive fusion transcripts and identified many potentially targetable kinase fusions. Further analysis of the validated gene fusions suggested that many are created by intrachromosomal amplification events with microhomology-mediated non-homologous end-joining. Conclusions: A classifier trained on validated fusion events increased the accuracy of fusion transcript identification in samples without WGS data. This allowed the analysis to be extended to all samples with RNA-Seq data, facilitating studies of tumor biology and increasing the number of detected kinase fusions. Machine learning could thus be used in identification of clinically relevant fusion events for targeted therapy. The large dataset of validated gene fusions generated here presents a useful resource for development and evaluation of fusion transcript detection algorithms.

Open Access
Multimodal Single-Cell Sequencing of B Cells in Primary Sjögren's Syndrome.

B cells are important in the pathogenesis of primary Sjögren's syndrome (pSS). Patients positive for Sjögren's syndrome antigen A/Sjögren's syndrome antigen B (SSA/SSB) autoantibodies are more prone to systemic disease manifestations and adverse outcomes. We aimed to determine the role of B cell composition, gene expression, and B cell receptor usage in pSS subgroups stratified for SSA/SSB antibodies. Over 230,000 B cells were isolated from peripheral blood of patients with pSS (n=6 SSA-, n=8 SSA+ single positive and n=10 SSA/SSB+ double positive) and four healthy controls and processed for single-cell RNA sequencing (scRNA-seq) and single-cell variable, diversity, and joining (VDJ) gene sequencing (scVDJ-seq). We show that SSA/SSB+ patients present the highest and lowest proportion of naïve and memory B cells, respectively, and the highest up-regulation of interferon-induced genes across all B cell subtypes. Differential usage of IGHV showed that IGHV1-69 and IGHV4-30-4 were more often used in all pSS subgroups compared with controls. Memory B cells from SSA/SSB+ patients displayed a higher proportion of cells with unmutated VDJ transcripts compared with other pSS patient groups and controls, indicating altered somatic hypermutation processes. Comparison with previous studies revealed heterogeneous clonotype pools, with little overlap in CDR3 sequences. Joint analysis using scRNA-seq and scVDJ-seq data allowed unsupervised stratification of patients with pSS and identified novel parameters that correlated with disease manifestations and antibody status. We describe heterogeneity and molecular characteristics in B cells from patients with pSS, providing clues to intrinsic differences in B cells that affect the phenotype and outcome and allowing stratification of patients with pSS at improved resolution.

Open Access
Screening for Circulating Inflammatory Proteins Does Not Reveal Plasma Biomarkers of Constant Tinnitus

Background and Objective: Tinnitus would benefit from an objective biomarker. The goal of this study is to identify plasma biomarkers of constant and chronic tinnitus among selected circulating inflammatory proteins. Methods: This retrospective case–control study included 548 cases with constant tinnitus and 548 matched controls from the Swedish Tinnitus Outreach Project (STOP), whose plasma samples were examined using Olink’s Inflammatory panel. Replication and meta-analysis were performed using the same method on samples from the TwinsUK cohort. Participants from LifeGene, whose blood was collected in Stockholm and Umeå, were recruited to STOP for a tinnitus subtyping study. Age and sex matching was performed at the individual level. TwinsUK participants (n = 928) were selected based on self-reported tinnitus status over 2 to 10 years. Primary outcomes include normalized levels for 96 circulating proteins, which were used as an index test. No reference standard was available in this study. Results: After adjustment for age, sex, BMI, smoking, hearing loss, and laboratory site, the top proteins identified were FGF-21, MCP4, GDNF, CXCL9, and MCP-1; however, these were no longer statistically significant after correction for multiple testing. Stratification by sex did not yield any significant associations. Similarly, analyses of hearing loss and other tinnitus-related comorbidities such as stress, anxiety, depression, hyperacusis, temporomandibular joint disorders, and headache did not yield any significant associations. Analysis in the TwinsUK cohort failed to replicate the top candidates. Meta-analysis of STOP and TwinsUK did not reveal any significant association. Using elastic net regularization, models exhibited poor capacity to predict tinnitus based on inflammatory markers [sensitivity = 0.52 (95% CI 0.47–0.57), specificity = 0.53 (0.48–0.58), positive predictive value = 0.52 (0.47–0.56), negative predictive value = 0.53 (0.49–0.58), and AUC = 0.53 (0.49–0.56)]. Discussion: Our results did not identify significant associations of the selected inflammatory proteins with constant tinnitus. Future studies examining longitudinal relations among those with more severe tinnitus and using more recent expanded proteomics platforms and sampling of cerebrospinal fluid could increase the likelihood of identifying relevant molecular biomarkers.
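
For readers less familiar with the performance measures quoted in this abstract, the following is a minimal sketch of how sensitivity, specificity, and the positive and negative predictive values are derived from confusion-matrix counts. The function name `diagnostic_metrics` and the counts are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not describe how its own values were computed.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic metrics from confusion-matrix counts.

    tp/fn: cases predicted as case / as control.
    tn/fp: controls predicted as control / as case.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of true controls detected
        "ppv": tp / (tp + fp),          # predicted cases that are real cases
        "npv": tn / (tn + fn),          # predicted controls that are real controls
    }


# Hypothetical counts for illustration only (assumption, not from the study).
metrics = diagnostic_metrics(tp=52, fp=47, tn=53, fn=48)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Values close to 0.5 on all four measures in a balanced case–control design, as reported above, are what a classifier performing near chance would produce.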

Open Access
WOMBAT-P: Benchmarking Label-Free Proteomics Data Analysis Workflows.

The inherent diversity of approaches in proteomics research has led to a wide range of software solutions for data analysis. These software solutions encompass multiple tools, each employing different algorithms for various tasks such as peptide-spectrum matching, protein inference, quantification, statistical analysis, and visualization. To enable an unbiased comparison of commonly used bottom-up label-free proteomics workflows, we introduce WOMBAT-P, a versatile platform designed for automated benchmarking and comparison. WOMBAT-P simplifies the processing of public data by utilizing the sample and data relationship format for proteomics (SDRF-Proteomics) as input. This feature streamlines the analysis of annotated local or public ProteomeXchange data sets, promoting efficient comparisons among diverse outputs. Through an evaluation using experimental ground truth data and a realistic biological data set, we uncover significant disparities and a limited overlap in the quantified proteins. WOMBAT-P not only enables rapid execution and seamless comparison of workflows but also provides valuable insights into the capabilities of different software solutions. These benchmarking metrics are a valuable resource for researchers in selecting the most suitable workflow for their specific data sets. The modular architecture of WOMBAT-P promotes extensibility and customization. The software is available at https://github.com/wombat-p/WOMBAT-Pipelines.

Open Access
Whole-genome resequencing facilitates the development of a 50K single nucleotide polymorphism genotyping array for Scots pine (Pinus sylvestris L.) and its transferability to other pine species.

Scots pine (Pinus sylvestris L.) is one of the most widespread and economically important conifer species in the world. Applications like genomic selection and association studies, which could help accelerate breeding cycles, are challenging in Scots pine because of its large and repetitive genome. For this reason, genotyping tools for conifer species, and in particular for Scots pine, are commonly based on transcribed regions of the genome. In this article, we present the Axiom Psyl50K array, the first single nucleotide polymorphism (SNP) genotyping array for Scots pine based on whole-genome resequencing, that represents both genic and intergenic regions. This array was designed following a two-step procedure: first, 192 trees were sequenced, and a 430K SNP screening array was constructed. Then, 480 samples, including haploid megagametophytes, full-sib family trios, breeding population, and range-wide individuals from across Eurasia were genotyped with the screening array. The best 50K SNPs were selected based on quality, replicability, distribution across the draft genome assembly, balance between genic and intergenic regions, and genotype-environment and genotype-phenotype associations. Of the final 49 877 probes tiled in the array, 20 372 (40.84%) occur inside gene models, while the rest lie in intergenic regions. We also show that the Psyl50K array can yield enough high-confidence SNPs for genetic studies in pine species from North America and Eurasia. This new genotyping tool will be a valuable resource for high-throughput fundamental and applied research of Scots pine and other pine species.
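
The genic and intergenic split quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two probe counts given in the text; the variable names are illustrative.

```python
# Probe counts as stated in the abstract.
total_probes = 49_877
genic_probes = 20_372

# Everything not inside a gene model is treated as intergenic.
intergenic_probes = total_probes - genic_probes

genic_pct = 100 * genic_probes / total_probes
intergenic_pct = 100 * intergenic_probes / total_probes

print(f"genic: {genic_probes} ({genic_pct:.2f}%)")
print(f"intergenic: {intergenic_probes} ({intergenic_pct:.2f}%)")
```

This reproduces the stated 40.84% genic share and implies that the remaining 29 505 probes (59.16%) lie in intergenic regions.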

Open Access