Epilepsy patients rank memory problems as their most significant cognitive comorbidity. Current clinical assessments are laborious to administer and score and may not detect subtle memory decline. The Famous Faces (FF) task has robustly demonstrated that left temporal lobe epilepsy (LTLE) patients remember fewer names and biographical details than right TLE (RTLE) patients and healthy controls (HCs). We adapted the FF task to capture subjects' entire spontaneous spoken recall, then scored responses using manual and natural language processing (NLP) methods. We expected to replicate previous group-level differences using spontaneous speech and semi-automated analysis. Seventy-three adults (28 LTLE, 18 RTLE, and 27 HCs) were included in a prospective case-control study. Subjects were shown 20 famous faces from politics, sports, and entertainment (active 2008-2017) and were asked whether they recognized each person and to spontaneously recall as much biographical detail as possible. We created human-generated and automatically generated keyword dictionaries for each celebrity, based on a randomly selected training set of half of the HC transcripts. To control for speech output, we measured speech duration, total word count, and content word count for the FF task and for a Cookie Theft Control Task (CTT), in which subjects were asked only to describe a visual scene. Subjects' responses to the FF and CTT tasks were recorded, transcribed, and analyzed in a blinded manner with a combination of manual and automated NLP approaches. Famous face recognition accuracy was similar between groups. LTLE patients recalled fewer biographical details than HCs and RTLE patients using both the gold-standard human-generated dictionary (24%±12% vs. 31%±12% and 30%±12%, p=0.007) and the automated dictionary (24%±12% vs. 31%±12% and 32%±13%, p=0.007).
There were no group-level differences in speech duration, total word count, or content word count for either the FF or the CTT task that could explain the difference in recall performance. There was a positive, statistically significant relationship between Montreal Cognitive Assessment (MoCA) score and FF recall performance as scored by the human-generated (ρ = .327, p = .029) and automatically generated (ρ = .422, p = .004) dictionaries for TLE subjects, but not for HCs; this effect was driven by LTLE subjects. LTLE patients remember fewer details of famous people than HCs or RTLE patients, as shown by NLP analysis of spontaneous recall. Decreased biographical memory was not due to decreased speech output and correlated with lower MoCA scores. NLP analysis of spontaneous recall can detect memory dysfunction in clinical populations in a semi-automated, objective, and sensitive manner.
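
To illustrate the kind of dictionary-based scoring described above, the following is a minimal sketch in Python, not the study's actual pipeline. It assumes that a recall score is the fraction of a celebrity's dictionary keywords that appear in a subject's transcript; the function names, the tokenization rule, and the example keywords and transcript are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def recall_score(transcript, keywords):
    """Return the fraction of dictionary keywords found in the transcript.

    Assumption: each keyword is a single word and is matched exactly
    after lowercasing; a real pipeline might also handle stemming,
    synonyms, or multi-word phrases.
    """
    if not keywords:
        return 0.0
    tokens = set(tokenize(transcript))
    hits = {k for k in keywords if k.lower() in tokens}
    return len(hits) / len(keywords)


# Hypothetical keyword dictionary and transcript for one famous face.
example_keywords = {"president", "senator", "illinois", "chicago"}
example_transcript = (
    "He was the president, and before that a senator from Illinois."
)

score = recall_score(example_transcript, example_keywords)
print(f"Recall score: {score:.2f}")
```

In this sketch, the per-subject score would be averaged over all famous faces shown, which yields a percentage comparable in form to the group means reported above.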