Abstract. The term speech sound disorder describes a range of speech difficulties in children that affect speech intelligibility. Differential diagnosis is difficult and relies on access to validated and reliable measures. Technological advances aim to give clinicians access to measurements that have been identified as beneficial in diagnosing speech disorders. Generating objective measurements and, consequently, automatic scores requires high-quality output from multi-camera networks. The quality of photogrammetric results is usually expressed in terms of the precision and reliability of the network. Precision is determined at the design stage as a function of the geometry of the network. In this manuscript, we focus on the design of a photogrammetric camera network using three cameras. We adopted a workflow similar to that of Alsadika et al. (2012) and tested several network configurations. Because the distance from the camera stations to the object points was fixed at 3500 mm, only the horizontal and vertical placements of the cameras were varied. Horizontal angles were changed in increments of 10°, and vertical angles in increments of 5°. The object space coordinates of ground control points (GCPs) for each camera configuration were assessed in terms of horizontal error ellipses and vertical precision. The best design used the maximum horizontal and vertical convergence angles of 90° and 30°, respectively. The existing camera network used to capture videos for speech assessment performed approximately as well as the top third of the tested designs. From a validation perspective, it can be concluded that the existing design is viable for continued use.
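The enumeration of candidate camera configurations described above can be illustrated with a minimal sketch. Only the 3500 mm range, the 10° and 5° increments, and the 90° and 30° maxima come from the abstract. The starting angles, the symmetric placement of the three cameras, and all function names (`station`, `network`, `candidate_networks`) are assumptions made for illustration, not the layout or software used in the study.

```python
import math
from itertools import product

RANGE_MM = 3500.0  # fixed camera-to-object distance stated in the abstract


def station(azimuth_deg, elevation_deg, distance=RANGE_MM):
    """Return (x, y, z) in mm of a camera on a sphere centred on the object."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)
    y = distance * math.cos(el) * math.cos(az)
    z = distance * math.sin(el)
    return (x, y, z)


def network(h_conv_deg, v_conv_deg):
    """Hypothetical three-camera layout (an assumption, not the paper's):
    two outer cameras at +/- half the horizontal convergence angle and
    zero elevation, and a centre camera raised by the vertical angle."""
    half = h_conv_deg / 2.0
    return [
        station(-half, 0.0),
        station(0.0, v_conv_deg),
        station(half, 0.0),
    ]


def candidate_networks(h_max=90, v_max=30, h_step=10, v_step=5):
    """Enumerate configurations in 10 deg / 5 deg steps up to the maxima.
    Starting each sweep at one step is an assumption; the abstract does
    not state the minimum angles."""
    h_values = range(h_step, h_max + 1, h_step)
    v_values = range(v_step, v_max + 1, v_step)
    return {(h, v): network(h, v) for h, v in product(h_values, v_values)}


if __name__ == "__main__":
    nets = candidate_networks()
    print(len(nets), "candidate configurations")
    for cam in nets[(90, 30)]:
        print(tuple(round(c, 1) for c in cam))
```

Each candidate returned by `candidate_networks` would then be passed to a precision analysis (for example, a bundle adjustment simulation yielding error ellipses for the GCPs), which is outside the scope of this sketch.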