Driver distraction has been widely studied in the field of naturalistic driving research. However, it is difficult to use traditional variables, such as speed, acceleration, and yaw rate, to detect driver distraction in real time. Emerging technologies extract features from the human face, such as eye gaze, to detect drivers’ visual distraction. However, eye gaze is hard to capture in naturalistic driving situations because of low-resolution cameras, drivers wearing sunglasses, and similar factors. By contrast, head pose is easier to detect and correlates with eye gaze direction. In this study, city-wide videos are collected using onboard cameras from over 289 drivers, representing 423 events. Head pose rates (pitch, yaw, and roll) are derived and fed into a convolutional neural network to detect driver distraction. The experimental results show that the proposed model achieves a recall of 0.938 and an area under the receiver operating characteristic curve (AUC) of 0.931, with variables from five time slices (1.25 s) used as input. The study demonstrates that head pose can be used to detect driver distraction, offers insights for distraction detection, and can support the development of advanced driver assistance systems.
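The abstract does not specify the network architecture, so the following is only a minimal sketch of the input/output contract it describes: three head-pose rate channels (pitch, yaw, roll) over five time slices, classified as distracted or attentive. The layer sizes, the `HeadPoseCNN` name, and the 4 Hz sampling assumption (five slices spanning 1.25 s) are illustrative assumptions, not the authors' model.

```python
import torch
import torch.nn as nn

class HeadPoseCNN(nn.Module):
    """Minimal 1D CNN over a short window of head-pose rates.

    Input shape: (batch, 3 channels = pitch/yaw/roll rates, 5 time slices).
    Output: one logit per window for binary distracted/attentive classification.
    All layer sizes are illustrative; the paper does not report its architecture.
    """
    def __init__(self, n_channels: int = 3, n_slices: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=3, padding=1),  # temporal convolution
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(32 * n_slices, 1)  # single distraction logit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)                  # (batch, 32, n_slices)
        return self.classifier(h.flatten(1))  # (batch, 1)

model = HeadPoseCNN()
window = torch.randn(8, 3, 5)   # batch of 8 windows: 3 rates x 5 slices (1.25 s at assumed 4 Hz)
probs = torch.sigmoid(model(window))  # per-window probability of distraction
```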