Through repeated interactions, firms today adaptively refine their understanding of individual users’ preferences in order to personalize their offerings. In this paper, we use a continuous-time bandit model to analyze firms that recommend content to multihoming consumers, a representative setting in which firms strategically learn consumer preferences to maximize lifetime value. In both monopoly and duopoly settings, we compare a forward-looking recommendation algorithm that balances exploration and exploitation to a myopic algorithm that only maximizes the quality of the next recommendation. Our analysis shows that, compared with a monopoly, firms competing for users’ attention focus more on exploitation than on exploration. When users are impatient, competition decreases the return from developing a forward-looking algorithm. In contrast, development of a forward-looking algorithm may hurt users under monopoly but always benefits users under competition. Competing firms’ decisions to invest in a forward-looking algorithm can create a prisoner’s dilemma. Our results have implications for artificial intelligence adoption and for policy makers on the effect of market power on innovation and consumer welfare. This paper was accepted by Dmitri Kuksov, marketing. Supplemental Material: The online appendix is available at https://doi.org/10.1287/mnsc.2023.4722.
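The myopic/forward-looking distinction can be illustrated with a rough discrete-time analogy (the paper's model is continuous-time, so this is only an illustration, not the paper's algorithm; the two-armed Bernoulli setup and the match probabilities below are hypothetical). The myopic policy greedily recommends the content type with the best empirical match rate so far, while the forward-looking stand-in (here UCB1) adds an exploration bonus that values what a recommendation reveals about preferences, not just its immediate quality.

```python
import math
import random

def run(policy, probs, horizon=5000, seed=0):
    """Recommend one of two content arms per round; return total reward."""
    rng = random.Random(seed)
    pulls = [0, 0]   # times each arm was recommended
    wins = [0, 0]    # observed good matches per arm
    total = 0
    for t in range(1, horizon + 1):
        arm = policy(t, pulls, wins)
        r = 1 if rng.random() < probs[arm] else 0
        pulls[arm] += 1
        wins[arm] += r
        total += r
    return total

def myopic(t, pulls, wins):
    # Pure exploitation: recommend the arm with the best empirical mean so far.
    means = [wins[a] / pulls[a] if pulls[a] else 0.5 for a in range(2)]
    return max(range(2), key=lambda a: means[a])

def forward_looking(t, pulls, wins):
    # UCB1 stand-in: empirical mean plus an exploration bonus, so the policy
    # trades off the next recommendation's quality against learning value.
    for a in range(2):
        if pulls[a] == 0:
            return a
    ucb = [wins[a] / pulls[a] + math.sqrt(2 * math.log(t) / pulls[a])
           for a in range(2)]
    return max(range(2), key=lambda a: ucb[a])

probs = [0.45, 0.55]  # hypothetical true match rates of the two content types
print(run(myopic, probs), run(forward_looking, probs))
```

Over a long horizon the exploration bonus lets the forward-looking policy reliably identify the better content type, whereas the greedy policy can lock in on whichever arm happened to succeed first, mirroring the exploitation bias the paper attributes to competition.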