Abstract

If you have a tenure-track job in science, then you’ve worked very hard. You’ve worked very hard. Therefore, you have a tenure-track job. This is what’s known in formal logic as a “converse error.” I’ve constructed a blatant example to make the logical fallacy obvious, but more reasonable arguments that follow this structure can be very seductive. Systems biology is rife with them, especially in papers that face the difficult job of marrying models and experiment—if my model is right, then it will match my experimental observations; model and experiment match, therefore my model is right.

Even though this logic can be useful when it’s constrained with great care, and many seminal papers are built on it, formally speaking it’s wrong. It mistakes necessity for sufficiency. In biology, converse errors also represent what is essentially a failure of imagination. Just because “if A, then B,” it doesn’t logically follow that A is the only route to B; myriad other possibilities might also give B. The fact that we’ve framed the question to exclude the possibilities that X, Y, and Z also give B doesn’t mean that they don’t in vivo.

In this issue of Cell Systems, Andreas Hilfinger, Thomas Norman, and Johan Paulsson (pp. 251–259) provide a mathematically derived and logically correct way to be free of converse errors. By focusing on the relations between key quantities that describe the data within a large dataset (mRNA and protein abundances across the genome, for example), they can rigorously disprove large classes of hypotheses. To my mind, Hilfinger et al. is a milestone for all of the reasons that Johan Elf thoughtfully articulates in his Preview (pp. 219–220), but also for two more reasons.

First, their approach dramatically expands the range of hidden variables that can be taken into account when testing hypotheses. For example, Hilfinger et al. consider whether some relationships seen within large datasets don’t reflect biology at all, but rather systematic biases in measurement. Importantly, they can do this whether or not they explicitly know the sources of bias. This feature of their approach may be especially important when using state-of-the-art technologies whose limits and caveats aren’t yet well known.

Second, and even more important, is what I’ve already highlighted: Hilfinger et al. use formally correct logic. Science’s leading edge extends when we can clearly demarcate what we know. Here, we rigorously know what cannot be: it is formally impossible for X, Y, and Z to give B. That sort of certainty can be game changing. By putting logical constraints on our imagination, it lets us focus on fruitful possibilities.

An analogy: what Hilfinger et al. is to logic, another paper in this month’s issue, Heimberg et al., is to reason.

When datasets are comprehensive and quantitative, one way to gain confidence in conclusions is to vastly oversample, but this approach isn’t practical for many studies. Nor is it necessarily reasonable. Spurred by the idea that a principled approach could eliminate the need to oversample or guess, Graham Heimberg, Rajat Bhatnagar, Hana El-Samad, and Matt Thomson (pp. 239–250) dive into mRNA-seq data using tools borrowed from signal processing (think digitizing music).

Using transparent reasoning at every step, Heimberg et al. investigate the tradeoff between read depth and error in the extraction of biological signals. The result is a clear line between what we know with reasonable confidence and what we don’t know. For example, we know that given our read depth, the behaviors X, Y, and Z exhibit in our dataset are essentially noise. Excellent! Score one for reproducibility and move on. The flip side of the coin is even better: we know that we can make accurate claims about many biological behaviors at read depths that are a tiny fraction of what’s standard (about 1%).

To me, it’s especially pleasing that in approaching a practical problem with rigor and purpose, Heimberg et al. make a conceptual advance. Their work demonstrates that the example I’ve given above—the data are too fuzzy to say anything rigorous about X, Y, or Z—is actually rarer than one might naively expect. Essentially, this is because hard-to-see parts of biology don’t exist in isolation; instead, they tend to interact with easier-to-see parts. If we look at these easier-to-see parts through the right lens, we can see a surprisingly refined picture of the hard-to-see parts as well. Put another way: because correlations are always present in biology, we actually know more than we think we do. Biology helps us out.

Although they’re vastly different pieces of work, the commonalities between Hilfinger et al. and Heimberg et al. are striking to me. Both take approaches that stand outside the mainstream. Both allow the study of biology to become fundamentally more rigorous. Both develop reasonable criteria for analyzing molecules and mechanisms, criteria that don’t themselves involve molecular mechanisms. But then again, that’s just good logic.
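As a purely illustrative aside (my own toy sketch, not Heimberg et al.’s signal-processing method, with every number invented for the example: cells, genes, module size, read depths), the snippet below shows the flavor of the intuition. At roughly 1% of a hypothetical standard read depth, counts for any single gene are dominated by noise, yet a score averaged over a module of correlated genes still tracks its deep-sequencing counterpart.

```python
# Toy sketch, not the authors' method: a hidden per-cell signal drives a
# module of correlated genes; we compare deep sequencing with ~1% depth.
import numpy as np

rng = np.random.default_rng(0)

n_cells, n_genes, module_size = 500, 2000, 100
full_depth = 50_000          # hypothetical "standard" reads per cell
subsample_fraction = 0.01    # roughly 1% of that depth

# Relative expression: module genes scale with the hidden signal.
signal = rng.normal(size=n_cells)
rates = np.full((n_cells, n_genes), 1.0)
rates[:, :module_size] *= np.exp(0.5 * signal)[:, None]
rates /= rates.sum(axis=1, keepdims=True)

# Deep counts, then binomial thinning down to ~1% of the reads.
deep = rng.poisson(rates * full_depth)
shallow = rng.binomial(deep, subsample_fraction)

def module_score(counts):
    """Per-cell average of normalized counts over the module's genes."""
    normalized = counts / counts.sum(axis=1, keepdims=True)
    return normalized[:, :module_size].mean(axis=1)

gene_r = np.corrcoef(deep[:, 0], shallow[:, 0])[0, 1]
module_r = np.corrcoef(module_score(deep), module_score(shallow))[0, 1]
print(f"single gene, deep vs. shallow:  r = {gene_r:.2f}")
print(f"module score, deep vs. shallow: r = {module_r:.2f}")
```

In this contrived setting the module-level correlation stays high while the single-gene correlation collapses; it is only a cartoon of the read-depth-versus-error tradeoff that Heimberg et al. quantify rigorously.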
