Abstract

Some theories of human cultural evolution posit that humans have social-specific learning mechanisms that are adaptive specialisations moulded by natural selection to cope with the pressures of group living. However, the existence of neurochemical pathways that are specialised for learning from social information and individual experience is widely debated. Cognitive neuroscientific studies present mixed evidence for social-specific learning mechanisms: some studies find dissociable neural correlates for social and individual learning, whereas others find the same brain areas, and the same dopamine-mediated computations, involved in both. Here, we demonstrate that, like individual learning, social learning is modulated by the dopamine D2 receptor antagonist haloperidol when social information is the primary learning source, but not when it comprises a secondary, additional element. Two groups (total N = 43) completed a decision-making task which required primary learning, from own experience, and secondary learning from an additional source. For one group, the primary source was social, and secondary was individual; for the other group this was reversed. Haloperidol affected primary learning irrespective of social/individual nature, with no effect on learning from the secondary source. Thus, we illustrate that dopaminergic mechanisms underpinning learning can be dissociated along a primary-secondary but not a social-individual axis.
These results resolve conflict in the literature and support an expanding field showing that, rather than being specialised for particular inputs, neurochemical pathways in the human brain can process both social and non-social cues and arbitrate between the two depending upon which cue is primarily relevant for the task at hand.

Editor's evaluation

This work has important implications on how we view and understand social and individual learning with respect to dopamine processing in the human brain. This study, supported by a well-controlled experimental design, clear hypothesis testing, and rigorous model-based analyses, revealed that the dopamine system is involved in learning from a primary source as opposed to a secondary source, irrespective of social or non-social individual learning. This work encourages new investigations into testing when and how different neuromodulator systems may converge or diverge in guiding social versus non-social learning. https://doi.org/10.7554/eLife.74893.sa0

Introduction

The complexity and sophistication of human learning are increasingly appreciated. Enduring theoretical models illustrate that learners utilise ‘prediction errors’ to refine their predictions of future states (e.g., Rescorla–Wagner [RW] and temporal difference models; O’Doherty et al., 2003; Rescorla and Wagner, 1972; Schultz et al., 1997; Sutton and Barto, 2018). An explosion of studies, however, illustrates that this simple mechanism lies at the heart of more complex and sophisticated systems that enable humans (and other species) to learn from, keep track of the utility of, and integrate information from, multiple learning sources (Behrens et al., 2009; Biele et al., 2009; Li et al., 2011), meaning that one can learn from many sources of information simultaneously (Daw et al., 2006).
Such complexity enables individuals to, for example, rank colleagues according to the utility of their advice and learn primarily from the top-ranked individual (Kendal et al., 2018; Laland, 2004; Morgan et al., 2012; Rendell et al., 2011) whilst also tracking the evolving utility of advice from others (Behrens et al., 2008; Biele et al., 2011). Recent studies have further revealed that learning need not rely solely on directly experienced associations since one can also learn via inference (Bromberg-Martin et al., 2010; Dolan and Dayan, 2013; Jones et al., 2012; Langdon et al., 2018; Moran et al., 2021; Sadacca et al., 2016; Sharpe and Schoenbaum, 2018). This growing appreciation of the complexity and sophistication of human learning may help to explain contradictory findings in various fields. Here, we focus on the field of social learning. The existence in the human brain of neural and/or neurochemical pathways that are specialised for learning from social information and individual experience respectively is the topic of much debate (Heyes, 2012; Heyes and Pearce, 2015). Indeed, the claim that humans have social-specific learning mechanisms that are adaptive specialisations moulded by natural selection to cope with the pressures of group living lies at the heart of some theories of cultural evolution (Kendal et al., 2018; Morgan et al., 2012; Templeton et al., 1999). Since cultural evolution is argued to be specific to humans (Richerson and Boyd, 2005), establishing whether humans do indeed possess social-specific learning mechanisms has attracted many scholars with its promise of elucidating the key ingredient that ‘makes us human’. Cognitive neuroscience offers tools that are ideally suited to investigating whether the mechanisms underpinning social learning (learning from others) do indeed differ from those that govern learning from one’s individual experience (individual learning). 
Cognitive neuroscientific studies, however, present mixed evidence for social-specific learning mechanisms. Some studies find dissociable neural correlates for social and individual learning (Apps et al., 2016; Behrens et al., 2008; Hill et al., 2016; Zhang and Gläscher, 2020). For example, a study by Behrens et al., 2008 reported that whilst individual learning was associated with activity in dopamine-rich regions such as the striatum that are classically associated with reinforcement learning, social learning was associated with activity in a dissociable network that instead included the anterior cingulate cortex gyrus (ACCg) and temporoparietal junction. Further supporting this dissociation, studies have revealed correlations between personality traits, such as social dominance (Cook et al., 2014) and dimensions of psychopathy (Brazil et al., 2013) and social, but not individual, learning, as well as atypical social, but not individual, prediction error-related signals in the ACCg in autistic individuals (Balsters et al., 2017). Together, these studies support the existence of social-specific learning mechanisms. In contrast, other studies have reported that the same computations, based on the calculation of prediction error, are involved in both social and individual learning (Diaconescu et al., 2014), and that social learning is associated with activity in dopamine-rich brain regions typically linked to individual learning (Biele et al., 2009; Braams et al., 2014; Campbell-Meiklejohn et al., 2010; Delgado et al., 2005; Diaconescu et al., 2017; Klucharev et al., 2009). Diaconescu et al., 2017, for example, observed that social learning-related prediction errors covaried with naturally occurring genetic variation that affected the function of the dopamine system. Further supporting this overlap between social and individual learning, behavioural studies have observed that social and individual learning are subject to the same contextual influences. 
For example, Tarantola et al., 2017 observed that prior preferences bias social learning, just as they do individual learning. Such findings promote the view that ‘domain-general’ learning mechanisms underpin social learning: we learn from other people in the same way that we learn from any other stimulus in our environment (Heyes, 2012; Heyes and Pearce, 2015). That is, there are no social-specific learning mechanisms. One potential resolution to this conflict in the literature hinges on (1) an appreciation of the complexity and sophistication of human learning systems and (2) a difference in study design between tasks that have, and have not, found evidence of social-specific mechanisms. In studies that have linked social learning with the dopamine-rich circuitry typically associated with individual learning (and which are therefore consistent with the domain-general view), participants have been encouraged to learn primarily from social information. Indeed, in many cases the social source has been the sole information source (Campbell-Meiklejohn et al., 2010; Diaconescu et al., 2017; Klucharev et al., 2009). For example, in the paradigm employed by Diaconescu and colleagues (2014, 2017), participants were required to choose between a blue and green stimulus and were provided with social advice which was sometimes valid and sometimes misleading; on each trial, participants received information about the time-varying probability of reward associated with the blue and green stimuli, thus participants did not have to rely on their own individual experience of blue/green reward associations and could fully dedicate themselves to social learning. That is, participants did not learn from multiple sources (i.e., social information and individual experience); participants only engaged in social learning. 
In contrast, in studies where social learning has been associated with neural correlates outside of the dopamine-rich regions classically linked to individual learning (and which are therefore consistent with the domain-specific view), social information has typically comprised a secondary, additional source (Behrens et al., 2008; Cook et al., 2014). Typically, the non-social (individual) information is presented first to participants, represented in a highly salient form, and is directly related to the feedback information. The social information, in contrast, is presented second, is typically less salient in form, and is not directly related to the feedback information. For example, in the Behrens et al. study (2008) (and in our own work employing this paradigm; Cook et al., 2014; Cook et al., 2019), participants were required to choose between two, highly salient, blue and green boxes to accumulate points. The boxes were the first stimuli that participants saw on each trial. Outcome information came in the form of a blue or green indicator, thus primarily informing participants about whether they had made the correct choice on the current trial (i.e., if the outcome indicator was blue, then the blue box was correct). In addition, each trial also featured a thin red frame, which represented social information, surrounding one of the two boxes. The red frame was the second stimulus that participants saw on each trial and indirectly informed participants about the veracity of the frame: if the outcome was blue and the frame surrounded the blue box, then the frame was correct. In such paradigms, participants must learn from multiple sources of information with one source taking primary status over the other. Consequently, in studies that have successfully dissociated social and individual learning the two forms of learning differ both in terms of social nature (social or non-social) and rank (primary versus secondary status). 
Thus, it is unclear which of these two factors accounts for the dissociation. This study tests whether social and individual learning share common neurochemical mechanisms when they are matched in terms of (primary versus secondary) status. Given its acclaimed role in learning (Glimcher and Bayer, 2005; Schultz, 2007), we focus specifically on the role of the neuromodulator dopamine. Drawing upon recent studies illustrating the complexity and sophistication of human learning (Daw et al., 2005; Gläscher et al., 2011; Moran et al., 2021), we hypothesise that pharmacological modulation of the human dopamine system will dissociate learning from two sources of information along a primary versus secondary, but not along a social versus individual axis. In other words, we hypothesise that social learning relies upon the dopamine-rich mechanisms that also underpin individual learning when social information is the primary source, but not when it comprises a secondary, additional element. Such a finding would offer a potential resolution to the aforementioned debate concerning the existence of social-specific learning mechanisms. Preliminary support for our hypothesis comes from three lines of work. First, studies have convincingly argued for flexibility within learning systems. For example, in a study by Daw et al., 2006, participants tracked the utility of four uncorrelated bandits, with particular brain regions – such as the ventromedial prefrontal cortex – consistently representing the value of the top-ranked bandit, even though the identity of this bandit changed over time. Second, studies are increasingly illustrating the flexibility of social brain networks (Ereira et al., 2020; Garvert et al., 2015). 
The medial prefrontal cortex (mPFC), for example, is not – as was once thought – specialised for representing the self; if the concept of ‘other’ is primarily relevant for the task at hand, then the mPFC will prioritise representation of other over self (Cook, 2014; Nicolle et al., 2012). Finally, in a recent study (Cook et al., 2019), we provided preliminary evidence of a catecholaminergic (i.e., dopaminergic and noradrenergic) dissociation between learning from primary and secondary, but not social and individual, sources of information. In this work (Cook et al., 2019), we employed a between-groups design, wherein both groups completed a version of the social learning task adapted from Behrens et al., 2008 described above. For one group, the secondary source was social in nature (social group). For the non-social group, the secondary source comprised a system of rigged roulette wheels and was thus non-social in nature. We observed that, in comparison to placebo (PLA), the catecholaminergic transporter blocker methylphenidate only affected learning from the primary source, which, in this paradigm, always comprised participants’ own individual experience. Methylphenidate did not affect learning from the secondary source, irrespective of its social or non-social nature. That is, we found positive evidence supporting a dissociation between primary and secondary learning but no evidence to support a distinction between learning from social and non-social sources. Nevertheless, since we did not observe an effect of methylphenidate on learning from the (social or non-social) secondary source of information, this study was unable to provide positive evidence of shared mechanisms for learning from social and non-social sources.
If it is truly the case that domain-general (neurochemical) mechanisms underpin social learning, it should follow that pharmacological manipulations that affect individual learning when individual information is the primary source also affect social learning when social information is the primary source. The current (pre-registered) experiment tested this hypothesis by orthogonalising social versus individual and primary versus secondary learning. We perturbed learning using the dopamine D2 receptor antagonist haloperidol (HAL), in a double-blind, counter-balanced, PLA-controlled design. To test whether pharmacological manipulation of dopamine dissociates learning along a primary-secondary and/or a social-individual axis, we developed a novel between-groups manipulation wherein one group of participants learned primarily from social information and could supplement this learning with their own individual experience, and a second group learned primarily from individual experience and could supplement this learning with socially learned information. To foreshadow our results, we demonstrate that HAL specifically affects learning from the primary (not secondary) source of information. Bayesian statistics confirmed that the effects of haloperidol were comparable between the groups; thus, HAL affected individual learning when individual information was the primary source and, to the same extent, social learning when social information was the primary source. Our data support an expanding field showing that, rather than being fixedly specialised for particular inputs, neurochemical pathways in the human brain can process both social and non-social cues and arbitrate between the two depending upon which cue is primarily relevant for the task at hand (Cook, 2014; Garvert et al., 2015; Nicolle et al., 2012).
Results

Participants (n = 43; aged 19–38, mean [standard error] x̄ [σx̄] = 25.950 [0.970]; 24 males, 19 females; see Materials and methods) completed an adapted version of the behavioural task originally developed by Behrens et al., 2008. Participants were randomly allocated to one of two groups. Participants in the individual-primary group (n = 21) completed the classic version of this task (Figure 1A; Behrens et al., 2008) in which they were required to make a choice between a blue and green box in order to win points. A red frame (the social information), which represented the most popular choice made by a group of four participants who had completed the task previously, surrounded either the blue or green box on each trial, and participants could use this to help guide their choice. The actual probability of reward associated with the blue and green boxes and the probability that the red frame surrounded the correct box varied according to uncorrelated pseudo-randomised schedules (Appendix 2—figure 1). For the individual-primary group, the individual information (blue and green stimuli) was primary, and the social information (red stimulus) was secondary on the basis that the blue/green stimuli appeared first on the screen, were highly salient (large boxes versus a thin frame) and were directly related to the feedback information. That is, after making their selection, participants saw a small blue or green box which primarily informed them whether a blue or green choice had been rewarded on the current trial. From this information, the participant could, secondarily, infer whether the social information (red frame) was correct or incorrect.

Figure 1. Behavioural task. (A) Individual-primary group. Participants selected between a blue and a green box to gain points. On each trial, the blue and green boxes were presented first. After 1–4 seconds (s), one of the boxes was highlighted with a red frame, representing the social information. After 0.5–2 s, a question mark appeared, indicating that participants were able to make their response. Response was indicated by a silver frame surrounding their choice. After a 1–3 s interval, participants received feedback in the form of a green or blue box in the middle of the screen. (B) Social-primary group. Participants selected between going with, or against, a red box, which represented the social information. On each trial, the red box was displayed. After 1–4 s, blue and green frames appeared. After 0.5–2 s, a question mark appeared, indicating that participants were able to make their response. Response was indicated by a silver frame surrounding their choice. After a 1–3 s interval, participants received feedback in the form of a tick or a cross. This feedback informed participants if going with the group was correct or incorrect; from this feedback, participants could infer whether the blue or green frame was correct. (C) Example of pseudo-randomised probabilistic schedule. The probability of reward varied according to probabilistic schedules, including stable and volatile blocks for both the probability of the blue box/frame being correct (top) and the probability of the red (social) box/frame being correct (bottom).

Our social-primary group (n = 22; groups matched on age, gender, body mass index [BMI], and verbal working memory [VWM] span; Table 1) completed an adapted version of this task (Figure 1B) wherein the social information (red stimulus) was primary and the individual information (blue/green stimuli) was secondary. Participants first saw two placeholders; one empty and one containing a red box which indicated the social information. Subsequently, a thin green and a thin blue frame appeared around each placeholder. Participants were told that the red box represented the group’s choice. They were then required to choose whether to go with the social group (red box) or not.
After making their choice, a tick or cross appeared which primarily informed participants whether going with the social information was the correct option. From this they could, secondarily, infer whether the blue or green frame was correct. Consequently, for the social-primary group the social information was primary on the basis that it appeared first on the screen, was highly salient (a large red box versus thin green/blue frames), and was directly related to the feedback information.

Table 1. Participant information.

| | Individual-primary group (n = 15), mean (SD) | Social-primary group (n = 16), mean (SD) | t (1,29) | χ² (1, N = 31) | p-Value |
|---|---|---|---|---|---|
| Gender (n males: n females) | 7:8 | 8:8 | | 0.034 | 0.853 |
| Age | 25.600 (5.448) | 25.625 (4.745) | 0.014 | | 0.989 |
| VWM | 80.333 (6.016) | 76.354 (7.823) | 1.580 | | 0.125 |
| BMI | 24.016 (2.807) | 22.625 (2.606) | 1.431 | | 0.114 |

Age, gender, BMI, and VWM did not significantly differ between the groups. SD: standard deviation; VWM: verbal working memory span; BMI: body mass index.

Participants in both the individual-primary and social-primary groups performed 120 trials of the task on each of two separate study days. To perturb learning, on one day participants took 2.5 mg of HAL, previously shown to affect learning (Pessiglione et al., 2006) via multiple routes including perturbation of phasic dopamine signalling (Schultz, 2007; Schultz et al., 1997) facilitated by action at mesolimbic D2 receptors (Camps et al., 1989; Grace, 2002; Lidow et al., 1991). On the other day, they took a PLA under double-blind conditions, with the order of the days counterbalanced. Forty-three participants took part in at least one study day, and 33 completed both study days. Two participants performed at below-chance-level accuracy and were excluded from further analysis. In this article, we present an analysis of data from the 31 participants who completed both study days with above-chance accuracy (Table 1), which we complement with a full analysis of all 41 datasets in Appendix 4i.

We used the following strategy to analyse our data.
First, we sought to validate our manipulation by testing (under PLA) whether participants in both the individual-primary and social-primary groups learned in a more optimal fashion from the primary, versus secondary, source of information. Next, we tested our primary hypothesis that both social and individual learning would be modulated by HAL when they are the primary source of learning, but not when they comprise the secondary source. To do so, we estimated learning rates for primary and secondary sources of information, for each group (social-primary, individual-primary), under HAL and PLA, by fitting an adapted RW learning model to choice data. To ascertain that our model accurately described choices, we used simulations and parameter recovery. We used random-effects Bayesian model selection (BMS) to compare our model with alternative models. These analyses provided confidence that our model accurately described participants’ behaviour. After testing our primary hypothesis, we explored the relationship between parameters from our computational model and performance. To accomplish this, we first used an optimal learner model, with the same architecture and priors as our adapted RW model, to assess the extent to which HAL made participants’ learning rates more (or less) optimal. Finally, we regressed estimated model parameters against accuracy to gain insight into the extent to which variation in these parameters (and the effect of the drug thereupon) contributed to correct responses on the task.

Social information is the primary source of learning for participants in the social-primary group

Our novel manipulation orthogonalised primary versus secondary and social versus individual learning. To validate our manipulation, we tested whether participants in both the individual-primary and social-primary group learned in a more optimal fashion from the primary versus secondary source of information in our PLA condition.
For this validation analysis, we used a Bayesian learner model to create two optimal models: (1) an optimal primary learner and (2) an optimal secondary learner (Materials and methods). Subsequently, we regressed both models against participants’ choice data, resulting in two β_optimal values capturing the extent to which a participant made choices according to the optimal primary, and optimal secondary learner models, respectively. β_optimal values were submitted to a repeated-measures analysis of variance (RM-ANOVA) with factors information source (primary, secondary) and group (social-primary, individual-primary), revealing main effects of information source (F(1,29) = 6.594, p = 0.016) and group (F(1,29) = 10.423, p = 0.003). β_optimal values (averaged across individual-primary and social-primary groups) were significantly higher for the primary information (x̄ (σx̄) = 0.872 (0.101)) compared with secondary information source (x̄ (σx̄) = 0.438 (0.101); t(29) = 2.568, p_holm = 0.016). β_optimal values (averaged across primary and secondary conditions) were significantly higher for the social-primary group (x̄ (σx̄) = 0.833 (0.078)) compared with the individual-primary group (x̄ (σx̄) = 0.477 (0.078); t(29) = 3.228, p_holm = 0.003) (Figure 2). Crucially, we did not observe a significant interaction between information and group (F(1,29) = 0.067, p = 0.797), meaning that participants’ choices were more influenced by the primary information source, regardless of whether it was social or individual in nature. Furthermore, β_optimal values for primary information alone did not significantly differ between groups (t(29) = –1.982, p_holm = 0.257). Note that β_optimal weights for both information sources were significantly greater than zero (primary: t(30) = 7.534, p < 0.001; secondary: t(30) = 4.789, p < 0.001), thus our optimal models of information use explained a significant amount of variance in the use of both primary and secondary learning sources.
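The regression step described above can be sketched as a logistic regression of binary choices on the trial-wise predictions of two learner models. Everything below is a hypothetical illustration: the predictors are random stand-ins for the outputs of the optimal primary and secondary learners, the simulated agent's weights (2.0 and 0.5) are assumed, and the plain gradient-ascent fit is not the estimation procedure used in the study.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_logistic(predictors, choices, steps=500, step_size=1.0):
    """Fit P(choice = 1) = sigmoid(b0 + b1*x1 + b2*x2) by gradient ascent
    on the mean log-likelihood. Returns [b0, b1, b2]."""
    n = len(choices)
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), y in zip(predictors, choices):
            error = y - sigmoid(weights[0] + weights[1] * x1 + weights[2] * x2)
            grad[0] += error
            grad[1] += error * x1
            grad[2] += error * x2
        weights = [w + step_size * g / n for w, g in zip(weights, grad)]
    return weights


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n_trials = 600

# Stand-ins for centred model predictions in [-1, 1] (assumed, not real model output)
predictors = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_trials)]

# Simulated agent that relies more on the primary than on the secondary source
choices = [
    1 if rng.random() < sigmoid(2.0 * x1 + 0.5 * x2) else 0
    for x1, x2 in predictors
]

b0, beta_primary, beta_secondary = fit_logistic(predictors, choices)
print(f"beta_primary={beta_primary:.2f}, beta_secondary={beta_secondary:.2f}")
```

Because the simulated agent weights the primary predictor more heavily, the fitted weight for that predictor should come out larger than the weight for the secondary predictor, which is the kind of contrast the β_optimal comparison relies on.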
These data show that, irrespective of social (or individual) nature, participants learned in a more optimal fashion from the primary (relative to secondary) learning source, which was first in the temporal order of events, highly salient and directly related to the reward feedback.

Figure 2. Beta weights (β_optimal) for primary and secondary information. β_optimal values were significantly higher for the primary, compared to secondary, information source and for the social-primary, compared with the individual-primary, group. Data points indicate estimated β_optimal weights for individual participants (n = 31, placebo data only), bold point indicates the mean, and bold line indicates standard error of the mean (1 SEM).

Haloperidol reduces the rate of learning from primary sources

We hypothesised that both social and individual learning would be modulated by administration of the dopamine D2 receptor antagonist HAL when they were the primary source of learning, but not when they comprised the secondary source. To test this hypothesis, we fitted an adapted RW learning model (Rescorla and Wagner, 1972) to participants’ choice data, enabling us to estimate various parameters that index learning from primary and secondary sources of information, for HAL and PLA conditions, for participants in the social-primary and individual-primary groups. Our adapted RW model provided estimates, for each participant, of α, β, and ζ. The learning rate (α) controls the weighting of prediction errors on each trial. A high α favours recent over (outdated) historical outcomes, while a low α suggests a more equal weighting of recent and more distant trials.
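The role of the learning rate can be illustrated with a minimal sketch of the standard Rescorla–Wagner update. This is the textbook rule, not the adapted model fitted in this study; the function names, the initial value of 0.5, and the outcome sequence are illustrative assumptions.

```python
def rw_update(value, outcome, alpha):
    """One Rescorla-Wagner step: shift the estimate towards the outcome
    by alpha times the prediction error."""
    prediction_error = outcome - value
    return value + alpha * prediction_error


def track(outcomes, alpha, initial_value=0.5):
    """Return the estimate after each trial of an outcome sequence."""
    value = initial_value
    history = []
    for outcome in outcomes:
        value = rw_update(value, outcome, alpha)
        history.append(value)
    return history


# Hypothetical sequence: an option is rewarded (1) on 10 trials,
# then the contingency reverses and it is unrewarded (0) on 3 trials.
outcomes = [1] * 10 + [0] * 3

high_alpha = track(outcomes, alpha=0.8)  # weights recent outcomes heavily
low_alpha = track(outcomes, alpha=0.1)   # integrates over a longer history

print(f"high alpha, final estimate: {high_alpha[-1]:.3f}")
print(f"low alpha, final estimate:  {low_alpha[-1]:.3f}")
```

In this toy sequence, the high-α learner's estimate collapses almost immediately after the reversal, whereas the low-α learner's estimate still largely reflects the earlier run of rewards.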
Since our pseudo-random schedules included stable phases (where the reward probability associated with a particular option was constant for >30 trials), and volatile phases (where reward probabilities changed every 10–20 trials), α was estimated separately for volatile and stable phases (for both primary and secondary learning) to accord with previous research (Behrens et al., 2007; Cook et al., 2019; Manning et al., 2017). β captures the extent to which learned probabilities determine choice, with a larger β meaning that choices are more deterministic with regard to the learned probabilities. ζ represents the relative weighting of primary and secondary sources of information, with higher values indicating a bias towards the over-weighting of secondary relative to primary (see Materials and methods and Appendix 3 for further details of the model, model fitting, and model comparison). We hypothesised an interaction between drug and (primary versus secondary) information source such that HAL would affect learning from the primary information source only, regardless of its social/individual nature. To test this hypothesis, we employed a linear mixed effects model with fixed factors information source (primary, secondary), drug (HAL, PLA), environmental volatility (volatile, stable), and group (social-primary, individual-primary) and dependent variable α (square-root transformed to meet assumptions of normality). We controlled for inter-individual differences by including random intercepts for subject. Including pseudo-randomisation schedule as a factor in all analyses did not change the pattern of results. The mixed model revealed a drug by information interaction (F(1, 203) = 6.852, p=0.009, beta
