Longitudinal fMRI studies of language production are of interest for evaluating recovery from post-stroke aphasia, but numerous methodological issues remain unresolved, particularly regarding strategies for evaluating single subjects at multiple timepoints. To address these issues, we studied overt picture naming in eleven healthy subjects, each scanned four times at one-month intervals. To evaluate the natural variability across repeated sessions, the scans were directly contrasted in a unified statistical framework on a per-voxel basis. The effect of stimulus familiarity was evaluated using explicitly overtrained pictures, novel pictures, and untrained pictures that were repeated across sessions. For untrained pictures, activation declined across sessions, and did so equally for novel and repeated stimuli. Thus, we found no repetition priming for individual stimuli at one-month intervals, but rather a general effect of task habituation. For a set of overtrained pictures that was identical in each session, no decline was found, but activation was reduced and its patterns were less consistent across participants, as measured by intra-class correlation coefficients. Subtraction of a baseline task, in which subjects produced a stereotyped utterance in response to scrambled pictures, yielded specific activations in the left inferior frontal gyrus and other language areas for untrained items, whereas overtrained pictures relative to scrambled pictures activated only the fusiform gyrus and supplementary motor area. These findings indicate that longitudinal fMRI is an effective means of detecting changes in neural activation magnitude over time, as long as the effect of task habituation is taken into account.
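
To illustrate how consistency of this kind can be quantified with an intra-class correlation coefficient, the following is a minimal Python sketch. The abstract does not state which ICC form was used, so the sketch assumes the two-way, consistency, single-measurement form, often written ICC(3,1). The function name `icc_consistency` and the toy matrix are hypothetical and are not data from this study.

```python
def icc_consistency(data):
    """Two-way, consistency, single-measurement ICC, often written ICC(3,1).

    `data` is a list of rows. Each row is one target (for example a voxel
    or region) and each column is one repeated measurement (for example a
    session or a participant). This layout is an assumption of the sketch.
    """
    n = len(data)      # number of targets (rows)
    k = len(data[0])   # number of repeated measurements (columns)

    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]

    # Partition the total sum of squares as in a two-way ANOVA
    # without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Between-target variance relative to between-target plus residual variance.
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


# Toy matrix (hypothetical values): 6 targets measured 4 times each.
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_toy = icc_consistency(toy)
```

Under this form, a constant offset between columns (for example a uniform drop in activation from one session to the next) does not lower the coefficient, because column effects are removed before the residual is computed. Values near 1 indicate that targets keep their relative ordering across measurements.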