Naming latency (NL) is the time between the presentation of an image and the onset of speech. We recently developed an extended threshold-based algorithm for automatic NL (aNL) detection that operates on the envelope of the speech waveform. The present study explores how different manners (e.g., "m" and "p") and positions (e.g., "t" and "p") of articulation influence the differences between manual NL (mNL) and aNL detection.

Speech samples were collected from 123 healthy participants, who named 118 pictures in German whose names covered different initial phonemes. NLs were determined manually (in Praat, using waveform and spectrogram) and automatically (using the developed algorithm). To investigate the accuracy of the automatic detection, correlations between mNLs and aNLs were analyzed for different initial phonemes.

The aNLs and mNLs showed a strong positive correlation and similar tendencies across initial phoneme groups. Mean aNLs were shorter than mean mNLs. The difference was largest for nasal sounds (e.g., /m/) and smallest for fricatives (e.g., /s/). However, for fricatives, 39% of NLs were detected later by the automatic than by the manual method, which reduced the mean difference from the mNLs. The signal energy of the initial phoneme, i.e., whether it is voiced or voiceless, influences the shape of the speech envelope: high initial signal energy is often responsible for early detection by the algorithm.

Our study provides evidence of a similar tendency in mNL and aNL across different positions of articulation within each initial phoneme group. Automatic detection is highly sensitive to speech onsets across different initial phonemes. The dependence of the NL differences on the initial phoneme becomes less important in progress evaluations of patients with aphasia if the relative changes for each picture are considered separately.
Nevertheless, the algorithm will be further optimized by adapting its parameters individually for each initial phoneme group.

Clinical Relevance- These findings underline the feasibility of using automatic naming latency detection during picture naming, both for evaluating patients with aphasia in a clinical setting and for practice at home.
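
For readers unfamiliar with envelope-based onset detection, the following is a minimal sketch of a generic threshold detector, not the extended algorithm evaluated in this study. The function names, the rectify-and-smooth envelope, the relative threshold of 10% of the envelope peak, and the 10 ms and 20 ms window settings are all illustrative assumptions.

```python
def moving_average_envelope(samples, window):
    """Rectify the signal and smooth it with a trailing moving average."""
    envelope = []
    running = 0.0
    for i, s in enumerate(samples):
        running += abs(s)
        if i >= window:
            running -= abs(samples[i - window])
        envelope.append(running / min(i + 1, window))
    return envelope


def detect_onset(samples, sample_rate, rel_threshold=0.1,
                 window_ms=10.0, min_duration_ms=20.0):
    """Return the onset time in seconds, or None if no onset is found.

    All parameter defaults are hypothetical values chosen for illustration.
    The onset is the start of the first run in which the envelope stays at
    or above rel_threshold * peak for at least min_duration_ms.
    """
    window = max(1, int(sample_rate * window_ms / 1000))
    min_run = max(1, int(sample_rate * min_duration_ms / 1000))
    envelope = moving_average_envelope(samples, window)
    peak = max(envelope, default=0.0)
    if peak == 0.0:
        return None  # pure silence or empty input
    threshold = rel_threshold * peak
    run_start = None
    for i, value in enumerate(envelope):
        if value >= threshold:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                return run_start / sample_rate
        else:
            run_start = None  # run broken; short bursts are ignored
    return None
```

In a detector of this kind, the moment the envelope crosses the threshold depends on how quickly the signal energy rises, which is one plausible way in which the initial phoneme could shift the detected onset relative to a manual annotation.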