Treatment fidelity data are critical for evaluating intervention effectiveness, yet only general guidelines exist for measuring treatment fidelity. Initial investigations have found treatment fidelity data collected via direct observation to be more reliable than data collected via permanent product or self-report. However, comparing assessment methods is complicated by decisions about which intervention steps are assessed, when observations occur, and which intervention sessions are included, all of which may affect treatment fidelity estimates. In this study, we compared direct observation and permanent product data to evaluate how these assessment and data collection decisions affected treatment fidelity estimates in three classrooms implementing a group contingency intervention. Findings revealed that treatment fidelity estimates differed not only across assessment methods but also depending on the intervention steps assessed, the intervention sessions included, and the timing of observation. Implications for treatment fidelity assessment research, for reporting in intervention research more broadly, and for implementation assessment in practice are described.