Functional Family Therapy (FFT) is a short-term family-based intervention for youth with behaviour problems. FFT has been widely implemented in the USA and other high-income countries. It is often described as an evidence-based program with consistent, positive effects. We aimed to synthesise the best available data to assess the effectiveness of FFT for families of youth with behaviour problems. Searches were performed in 2013-2014 and August 2020. We searched 22 bibliographic databases (including PsycINFO, ERIC, MEDLINE, ScienceDirect, Sociological Abstracts, Social Services Abstracts, WorldCat dissertations and theses, and the Web of Science Core Collection), as well as government policy databanks and professional websites. Reference lists of articles were examined, and experts were contacted to search for missing information. We included randomised controlled trials (RCTs) and quasi-experimental designs (QEDs) with parallel cohorts and statistical controls for between-group differences at baseline. Participants were families of young people aged 11-18 with behaviour problems. FFT programmes were compared with usual services, alternative treatment, and no treatment. There were no publication, geographic, or language restrictions. Two reviewers independently screened 1039 titles and abstracts, read all available study reports, assessed study eligibility, and extracted data onto structured electronic forms. We assessed risks of bias (ROB) using modified versions of the Cochrane ROB tool and the What Works Clearinghouse standards. Where possible, we used random effects models with inverse variance weights to pool results across studies. We used odds ratios for dichotomous outcomes and standardised mean differences for continuous outcomes. We used Hedges' g to adjust for small sample sizes. We assessed the heterogeneity of effects with χ² and I².
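The pooling steps described above can be illustrated with a minimal Python sketch. It is not the review's analysis code. It assumes the DerSimonian-Laird estimator for the between-study variance (the text does not name an estimator) and the common approximation to the Hedges correction factor. All function names and the example inputs are hypothetical.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance): a standardised mean difference with Hedges' correction."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # approximate small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def log_odds_ratio(a, b, c, d):
    """Return (log OR, variance) from a 2x2 table.

    a, b = events and non-events in the treatment group;
    c, d = events and non-events in the comparison group.
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_random_effects(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, an assumption)."""
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies to estimate heterogeneity")
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q, referred to a chi-square distribution with k - 1 df
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # I^2: percentage of total variation attributed to heterogeneity
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {"pooled": pooled, "se": math.sqrt(1 / sum(w_re)),
            "q": q, "tau2": tau2, "i2": i2}


if __name__ == "__main__":
    # Made-up inputs for illustration only; not data from any included study.
    g, v = hedges_g(10.0, 2.0, 20, 8.0, 2.0, 20)
    print(g, v)
    print(pool_random_effects([0.0, 1.0], [0.01, 0.01]))
```

Odds ratios would be pooled on the log scale with the same routine and exponentiated afterwards.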
We produced separate forest plots for conceptually distinct outcomes and for different endpoints (<9, 9-14, 15-23, and 24-42 months after referral). We grouped studies by study design (RCT or QED), and then assessed differences between these two subgroups of studies with χ² tests. We generated robust variance estimates, using correlated effects (CE) models with small sample corrections to synthesise all available outcome data. Exploratory CE analyses assessed potential moderators of effects within these domains. We used GRADE guidelines to assess the certainty of evidence on six primary outcomes at 1 year after referral. Twenty studies (14 RCTs and 6 QEDs) met our inclusion criteria. Fifteen of these studies provided some valid data for meta-analysis; these studies included 10,980 families in relevant FFT and comparison groups. All included studies had high risks of bias on at least one indicator. Half of the studies had high risks of bias on baseline equivalence, support for intent-to-treat analysis, selective reporting, and conflicts of interest. Fifteen studies had incomplete reporting of outcomes and endpoints. Using the GRADE rubric, we found that the certainty of evidence for FFT was very low for all of our primary outcomes. Using pairwise meta-analysis, we found no evidence of effects of FFT compared with other active treatments on any primary or secondary outcomes. Primary outcomes were: recidivism, out-of-home placement, internalising behaviour problems, externalising behaviour problems, self-reported delinquency, and drug or alcohol use. Secondary outcomes were: peer relations and prosocial behaviour, youth self-esteem, parent symptoms and behaviour, family functioning, school attendance, and school performance. There were few studies in the pairwise meta-analyses (k < 7) and little heterogeneity of effects across studies in most of these analyses. There were few differences between effect estimates obtained in RCTs versus QEDs.
More comprehensive CE models showed positive results of FFT in some domains and negative results in others, but these effects were small (standardised mean difference [SMD], |SMD| < 0.20) and not significantly different from no effect, with one exception: two studies found positive effects of FFT on youth substance abuse and two studies found null results in this domain, and the overall effect estimate for this outcome was statistically different from zero. Across all outcomes (15 studies and 293 effect sizes), small positive effects were detected (SMD = 0.19, SE = 0.09), but these were not significantly different from zero effect. Prediction intervals showed that future FFT evaluations are likely to produce a wide range of results, including moderate negative effects and strong positive results (-0.37 to 0.75). Results of 10 RCTs and five QEDs show that FFT does not produce consistent benefits or harms for youth with behavioural problems and their families. The positive or negative direction of results is inconsistent within and across studies. Most outcomes are not fully reported, the quality of available evidence is suboptimal, and the certainty of this evidence is very low. Overall estimates of effects of FFT may be inflated, due to selective reporting and publication biases.
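As an illustration of how a prediction interval such as the one reported above relates to the pooled estimate, the following Python sketch uses the conventional form, mean ± t × sqrt(τ² + SE²). This form is an assumption: the text does not state how the interval was computed. The critical value 2.16 (the 97.5th percentile of a t distribution with 13 degrees of freedom, that is, 15 studies minus 2) is also an assumption, and the function names are hypothetical. The back-calculated between-study standard deviation is therefore only indicative, not a reported result.

```python
import math


def prediction_interval(mean, se, tau2, t_crit):
    """Interval expected to contain the true effect in a new study."""
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return mean - half_width, mean + half_width


def implied_tau(lower, upper, se, t_crit):
    """Back out the between-study SD implied by a reported interval."""
    half_width = (upper - lower) / 2
    total_var = (half_width / t_crit) ** 2
    return math.sqrt(max(0.0, total_var - se ** 2))


if __name__ == "__main__":
    T_CRIT = 2.16  # assumed critical value, not taken from the review
    # Reported values: SMD = 0.19, SE = 0.09, interval -0.37 to 0.75
    tau = implied_tau(-0.37, 0.75, 0.09, T_CRIT)
    print(tau)
    print(prediction_interval(0.19, 0.09, tau ** 2, T_CRIT))
```

The reported interval is symmetric about the pooled SMD of 0.19, with a half-width of 0.56. Under the stated assumptions, most of that width comes from between-study variation rather than from the standard error of the pooled estimate.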