Space heating accounts for 54% of annual residential electricity consumption in Quebec, so demand response (DR) programs target load shifting through automated control of thermostat setpoints during peak hours. At the district scale, varied thermostat preferences and setpoint override behaviors can affect the success of a demand response program. This study examines two distinct occupant types (Average, Tolerant), distinguished by their thermostat setpoint preferences, and three occupant Levels-of-Detail (LoDs), and analyzes their effects on the energy flexibility provided during demand response periods. The baseline scenario, LoD 1, uses a static setpoint schedule with no control of the heat pump, while LoD 2 and LoD 3 incorporate thermostat setbacks during demand response events. LoD 2 assumes the occupant is comfortable within 2°C of the setpoint, while LoD 3 allows the occupant to override the DR setbacks. We estimate the flexibility services provided by a ten-house residential community through automated control of heat pumps over a three-month winter period, implementing and simulating the study in CityLearn with reinforcement learning-based control for district-level energy management. Compared with LoD 1, LoD 3 reduced electricity cost by approximately 12% and net electricity consumption by approximately 17% during demand response periods. We also find that LoD 2 could overestimate savings in net electricity consumption, cost, and peak demand by 5% compared to LoD 3. For LoD 3, the indoor temperature deviated by more than ▪ from the setpoint in fewer than 5% of timesteps on average across the 10 buildings while still achieving significant net electricity consumption reductions, highlighting that energy flexibility services can be provided to the grid while balancing occupant thermal comfort.
Finally, the agents learned control policies that reduced the number of overrides across the training episodes for both Average and Tolerant occupants. We thus present a multi-agent framework for addressing varied occupant setpoint preferences and override behaviors in the control of heat pump systems at the neighborhood level.
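
The distinction between the three occupant LoDs can be illustrated with a minimal sketch. It is not the study's implementation and does not use the CityLearn API. The `Occupant` class, the `effective_setpoint` function, the setback depth, the override tolerances, and the override rule itself (the occupant restores the preferred setpoint once the indoor temperature falls more than a tolerance below it) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occupant:
    """Hypothetical occupant model; all values are illustrative only."""
    preferred_setpoint_c: float
    override_tolerance_c: float  # assumed: drop below preference that triggers an override


def effective_setpoint(lod, occupant, indoor_temp_c, dr_event, setback_c=2.0):
    """Return (heating setpoint in deg C, whether the occupant overrode the setback).

    LoD 1: static setpoint, DR events are ignored.
    LoD 2: the setback is always accepted during a DR event.
    LoD 3: the setback applies unless the occupant overrides it (assumed rule).
    """
    if lod not in (1, 2, 3):
        raise ValueError("lod must be 1, 2, or 3")
    preferred = occupant.preferred_setpoint_c
    if lod == 1 or not dr_event:
        return preferred, False
    if lod == 3 and indoor_temp_c < preferred - occupant.override_tolerance_c:
        # Assumed override behavior: restore the preferred setpoint.
        return preferred, True
    return preferred - setback_c, False


# Hypothetical parameters for the two occupant types.
average = Occupant(preferred_setpoint_c=21.0, override_tolerance_c=1.0)
tolerant = Occupant(preferred_setpoint_c=21.0, override_tolerance_c=2.0)

for lod in (1, 2, 3):
    print(lod, effective_setpoint(lod, average, indoor_temp_c=19.5, dr_event=True))
```

Under these assumed parameters, an Average occupant at 19.5°C overrides the setback in LoD 3 while a Tolerant occupant accepts it, which is the kind of behavioral difference that LoD 2 cannot represent.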