There is currently no standard practice for monitoring patients receiving treatment for osteoporosis. Repeated dual-energy X-ray absorptiometry (DXA) is commonly used to monitor treatment response, but it has limitations. Bone turnover markers have advantages over DXA: they are non-invasive, relatively cheap and can detect changes in bone turnover rates earlier. However, they also have disadvantages, particularly high within- and between-patient variability. The ability of bone turnover markers to identify treatment non-responders and predict future fracture risk has yet to be established.
We aimed to determine the clinical effectiveness, test accuracy, reliability, reproducibility and cost-effectiveness of bone turnover markers for monitoring the response to osteoporosis treatment.
We searched 12 electronic databases (including MEDLINE, EMBASE, The Cochrane Library and trials registries) without language restrictions from inception to March 2012. We hand-searched three relevant journals for the 12 months prior to May 2012, and searched the websites of five test manufacturers and the US Food and Drug Administration (FDA). Reference lists of included studies and relevant reviews were also searched.
A systematic review was undertaken according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. It covered the test accuracy, clinical utility, reliability and reproducibility, and cost-effectiveness of two formation and two resorption bone turnover markers in patients being treated for osteoporosis with any of the following: a bisphosphonate [alendronate (Fosamax, MSD), risedronate (Actonel, Warner Chilcott Company), zoledronate (Zometa, Novartis)], raloxifene (Evista, Eli Lilly and Company Ltd), strontium ranelate (Protelos, Servier Laboratories Ltd), denosumab (Prolia, Amgen Ltd) or teriparatide (Forsteo, Eli Lilly and Company Ltd). Given the breadth of the review question, a range of study designs and outcome measures were eligible.
If clinical effectiveness could be established, a decision model was to be developed to determine the cost-effectiveness of bone turnover markers for informing changes in patient management.
Forty-two studies (70 publications) met the inclusion criteria; none evaluated cost-effectiveness. Only five were randomised controlled trials (RCTs), and these assessed only the impact of bone marker monitoring on aspects of adherence. No RCTs evaluated the effect of bone turnover marker monitoring on treatment management. One trial suggested that feedback of a good response decreased non-persistence [hazard ratio (HR) 0.71, 95% confidence interval (CI) 0.53 to 0.95], and that feedback of a poor response increased non-persistence (HR 2.22, 95% CI 1.27 to 3.89); it is not clear whether the trial recruited a population representative of that seen in clinical practice. Thirty-three studies reported results of some assessment of test accuracy, mostly correlations between changes in bone turnover and bone mineral density. Only four studies reported on intra- or interpatient reliability and reproducibility in treated patients. Overall, the results were inconsistent and inconclusive, owing to considerable clinical heterogeneity across the studies and the generally small sample sizes. As the clinical effectiveness of bone turnover monitoring could not be established, a decision-analytic model was not developed.
There was insufficient evidence to inform the choice of which bone turnover marker to use in routine clinical practice to monitor osteoporosis treatment response. The research priority is to identify the most promising treatment-test combinations for evaluation in subsequent, methodologically sound RCTs. RCTs are required to determine whether bone turnover marker monitoring improves treatment management decisions and ultimately affects patient outcomes in terms of reduced incidence of fracture.
Given the large number of potential patient population-treatment-test combinations, the most promising combinations would first need to be identified so that any RCTs focus on evaluating those strategies. The research priority is therefore to identify these promising combinations, either by conducting small variability studies or by initiating a patient registry to collect standardised data.
This work was funded by the National Institute for Health Research Health Technology Assessment programme.