In Japanese continuous speech, a content word is frequently followed by a particle to form an utterance unit with one accent component, called an accentual phrase or a prosodic word. As opposed to accent type identification for isolated word utterances, automatic accent type identification for accentual phrases in continuous speech is quite difficult, and no reliable method has yet been developed. To realize accurate identification, a method was proposed that represents the fundamental frequency (F0) movement of an utterance as a sequence of F0 values in mora units (F0 morae). Although a consonant–vowel (CV) cluster is usually said to correspond to a mora, the vowel–consonant (VC) cluster was also tested in this study. As for F0 values, two definitions were selected and compared: one averages the F0 values of the voiced frames within the mora unit, and the other sets the target value as the F0 value at the end of a linear regression line fit through the mora. Combining the two definitions of mora unit with the two definitions of F0 value yields four candidates (CV-average, CV-target, VC-average, VC-target) for the F0 mora definition. A variable, the F0 ratio, was then defined as the F0 mora difference between two successive morae to quantitatively represent their pitch change, and its distribution for each accent type was analyzed. After constructing a multi-dimensional Gaussian model for each accent type with the F0 ratio as the feature parameter, an accent type identification experiment was conducted. The experiment included identification using accent type HMMs of frame-based F0s and delta-F0s as the baseline method. On average, the proposed method outperformed the baseline method for all four F0 mora candidates, with CV-target and VC-average showing better performance. The candidates were further analyzed with regard to how well they corresponded to human-perceived mora pitch values.
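The two F0 value definitions and the F0 ratio feature described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, frame-level F0 values and mora segmentation are assumed to be given, and unvoiced frames are assumed to carry F0 = 0.

```python
def f0_mora_average(f0_frames):
    """Definition 1: average F0 over the voiced frames of one mora.

    f0_frames: per-frame F0 values for the mora; 0 marks an unvoiced frame.
    """
    voiced = [f for f in f0_frames if f > 0]
    return sum(voiced) / len(voiced)


def f0_mora_target(f0_frames):
    """Definition 2: target value, i.e. the endpoint of a least-squares
    line fit through the mora's voiced frames."""
    pts = [(i, f) for i, f in enumerate(f0_frames) if f > 0]
    n = len(pts)
    mean_x = sum(i for i, _ in pts) / n
    mean_y = sum(f for _, f in pts) / n
    sxx = sum((i - mean_x) ** 2 for i, _ in pts)
    sxy = sum((i - mean_x) * (f - mean_y) for i, f in pts)
    slope = sxy / sxx if sxx else 0.0
    # Evaluate the regression line at the last frame of the mora.
    return mean_y + slope * (len(f0_frames) - 1 - mean_x)


def f0_ratios(mora_f0_values):
    """F0 ratio feature: difference of F0 mora values between successive
    morae. On a log-F0 scale such differences correspond to Hz ratios."""
    return [b - a for a, b in zip(mora_f0_values, mora_f0_values[1:])]
```

For a mora whose F0 rises linearly, the average sits at the midpoint of the rise while the target follows the rise to its end, which is the distinction the four CV/VC candidates exploit.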
For this purpose, after developing a tool enabling us to control musical instrument digital interface (MIDI) sound pitch in steps of a quarter of a semitone, we asked subjects to adjust the MIDI sound pitch to the perceived mora pitch. The MIDI pitch values obtained after adjustment by the subjects were used to quantify the human-perceived mora pitch (F0 human). The F0 human values were then used to evaluate each F0 mora candidate. Although VC-average and CV-target had shown a better match with F0 human, they showed large mismatches when large F0 changes occurred within the mora. Analysis of the mismatches between F0 mora and F0 human showed that mora pitch is related to the direction of the F0 change. It also showed that the target value obtained from the linear regression approximation of the observed F0 curve overestimated the effect of the F0 change on the perceived pitch. An optimal definition of F0 mora will be found between averaging and linear targeting.
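The quarter-semitone pitch grid used in the perception experiment can be sketched as below. This is an assumed reconstruction, not the authors' tool: it only shows the standard MIDI note-to-frequency mapping (A4 = MIDI note 69 = 440 Hz) extended to fractional steps of 0.25 semitone, so that a measured F0 in Hz can be placed on the same grid as the subjects' adjusted MIDI pitches.

```python
import math


def hz_to_midi(f_hz):
    """Continuous MIDI note number for a frequency in Hz
    (standard equal-tempered mapping, A4 = note 69 = 440 Hz)."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)


def quantize_quarter_semitone(f_hz):
    """Snap a frequency onto the quarter-semitone grid.

    Returns (quantized MIDI value in 0.25-semitone steps,
             the corresponding frequency in Hz).
    """
    q = round(hz_to_midi(f_hz) * 4) / 4  # grid spacing: 0.25 semitone
    return q, 440.0 * 2.0 ** ((q - 69.0) / 12.0)
```

Comparing an F0 mora candidate with F0 human then amounts to comparing two values on this grid, where one step is a quarter of a semitone (about 1.5% in frequency).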