Studies often report estimates of the average treatment effect (ATE). While the ATE summarizes the effect of a treatment on average, it provides no information about the effect of treatment for any given individual. A treatment strategy that uses an individual's information to tailor treatment to maximize benefit is known as an optimal dynamic treatment rule (ODTR). Treatment, however, is typically not limited to a single point in time; consequently, learning an optimal rule for a time-varying treatment may involve learning not only how the comparative benefits of the treatments vary across individuals' characteristics, but also how those benefits vary as relevant circumstances evolve within an individual. The goal of this paper is to provide applied researchers with a tutorial for estimating ODTRs from longitudinal observational and clinical trial data. We describe an approach that uses a doubly robust unbiased transformation of the conditional average treatment effect. We then learn a time-varying ODTR for when to increase buprenorphine-naloxone (BUP-NX) dose to minimize return to regular opioid use among patients with opioid use disorder. Our analysis highlights the utility of ODTRs in the context of sequential decision making: the learned ODTR outperforms a clinically defined strategy.