This study investigated whether video-otoscopic images taken by a telehealth clinic facilitator are sufficient for accurate asynchronous diagnosis by an otolaryngologist within a heterogeneous population. A within-subject comparative design was used with 61 adults recruited from the patients of a primary healthcare clinic. The telehealth clinic facilitator had no formal healthcare training. On-site otoscopic examination by the otolaryngologist was taken as the gold standard diagnosis. The otolaryngologist and the facilitator each recorded a single video-otoscopic image of each ear, and the images were uploaded to a secure server. Another investigator assigned random numbers to the images, and 6 weeks later the otolaryngologist accessed the server, rated each image, and made a diagnosis without access to participant demographics or medical history. A greater percentage of images acquired by the otolaryngologist (83.6%) than of images recorded by the facilitator (75.4%) were graded as acceptable or excellent. A diagnosis could not be made from 10.0% of the video-otoscopic images recorded by the facilitator, compared with 4.2% of those recorded by the otolaryngologist. Concordance between asynchronous diagnoses made from video-otoscopic images acquired by the otolaryngologist and by the facilitator was moderate (κ=0.596). Sensitivity for video-otoscopic images acquired by the otolaryngologist and the facilitator was 0.80 and 0.91, respectively. Specificity for images acquired by the otolaryngologist and the facilitator was 0.85 and 0.89, respectively, with a diagnostic odds ratio of 41.0 using images acquired by the otolaryngologist and 46.0 using images acquired by the facilitator. A trained telehealth facilitator can provide a platform for asynchronous diagnosis of otological status using video-otoscopy in underserved primary healthcare settings.
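
For readers less familiar with the accuracy measures reported above, the following is a minimal sketch of how sensitivity, specificity, the diagnostic odds ratio, and Cohen's κ are conventionally computed from a 2×2 table. The class name `TwoByTwo` and the counts in the example are hypothetical, chosen only for illustration; they are not the study's data, and the study's own computation may have differed in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a 2x2 table (hypothetical helper, not from the study)."""
    tp: int  # index test abnormal, reference abnormal
    fp: int  # index test abnormal, reference normal
    fn: int  # index test normal, reference abnormal
    tn: int  # index test normal, reference normal

    def sensitivity(self) -> float:
        # Proportion of reference-abnormal ears the index test flags.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-normal ears the index test clears.
        return self.tn / (self.tn + self.fp)

    def diagnostic_odds_ratio(self) -> float:
        # (TP/FN) / (FP/TN); undefined when FP or FN is zero.
        return (self.tp * self.tn) / (self.fp * self.fn)

    def cohen_kappa(self) -> float:
        # Agreement between the two ratings, corrected for chance.
        n = self.tp + self.fp + self.fn + self.tn
        observed = (self.tp + self.tn) / n
        expected = (
            (self.tp + self.fp) * (self.tp + self.fn)
            + (self.fn + self.tn) * (self.fp + self.tn)
        ) / (n * n)
        return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only.
table = TwoByTwo(tp=40, fp=5, fn=10, tn=45)
print(f"sensitivity = {table.sensitivity():.2f}")
print(f"specificity = {table.specificity():.2f}")
print(f"diagnostic odds ratio = {table.diagnostic_odds_ratio():.1f}")
print(f"kappa = {table.cohen_kappa():.2f}")
```

With these illustrative counts the sketch gives a sensitivity of 0.80, a specificity of 0.90, a diagnostic odds ratio of 36.0, and κ of 0.70. The same κ formula applies when the two ratings are diagnoses from two image sets rather than an index test and a reference standard.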