The dopamine reward prediction error signal is known to be subjective but has so far only been assessed in aggregate choices. However, personal choices fluctuate across trials and thus reflect the instantaneous subjective reward value. In the well-established Becker-DeGroot-Marschak (BDM) auction-like mechanism, participants are encouraged to place bids that accurately reveal their instantaneous subjective reward value; inaccurate bidding results in suboptimal reward (“incentive compatibility”). In our experiment, male rhesus monkeys became experienced over several years to place accurate BDM bids for juice rewards without specific external constraints. Their bids for physically identical rewards varied trial by trial and increased overall for larger rewards. In these highly experienced animals, responses of midbrain dopamine neurons followed the trial-by-trial variations of bids despite constant, explicitly predicted reward amounts. Inversely, dopamine responses were similar with similar bids for different physical reward amounts. Support Vector Regression demonstrated accurate prediction of the animals’ bids by as few as twenty dopamine neurons. Thus, the phasic dopamine reward signal reflects instantaneous subjective reward value.
Read full abstract