Research Article - Neuropsychiatry (2016) Volume 6, Issue 5

Impact of working memory training on hot executive functions (decision-making and theory of mind) in children with ADHD: a randomized controlled trial

Corresponding Author:
Aitana Bigorra
Child and Adolescent Mental Health Unit, Hospital Universitari Mutua Terrassa, Barcelona, Spain
+34 93 736 59 03
+34 937887641



Children with attention deficit/hyperactivity disorder (ADHD) have deficits in working memory (WM) and in hot executive functions (EFs) that may be related. The main aim of this study was to analyze the efficacy of computerized Cogmed Working Memory Training (CWMT) on hot EF decision-making and theory of mind (ToM). Correlational analyses between WM and hot EFs at baseline were also performed to better clarify the nature of this interrelationship.


66 children with combined-type ADHD, aged 7 to 12 years, were included. Participants were randomized (1:1) to an experimental group (CWMT) (n=36) or a control group (non-adaptive training). At baseline, 1-2 weeks, and 6 months after the intervention, participants were assessed using performance-based measures of WM (backward digit span, letter-number sequencing of WISC-IV, and backward spatial span of WMS-III), decision-making (Iowa Gambling Task), and ToM (Happé’s Strange Stories and Folk Psychology Test).


Statistically significant correlations were found between WM and ToM measures at baseline, but not between WM and decision-making. On adjusted multiple linear regression analysis, there were no significant improvements in any of the outcome measures at either time point.


There was no relationship between WM and decision-making in ADHD. A relationship was found between WM and ToM, but CWMT did not show far-transfer effects on ToM deficits in ADHD. Other implications of these results are discussed.


Attention-deficit/hyperactivity disorder (ADHD), Working memory, Computerized cognitive training, Hot executive functions, Decision making, Theory of mind


Introduction

Attention deficit/hyperactivity disorder (ADHD) is the most common neurodevelopmental disorder of childhood [1]. Children with ADHD may have considerable difficulty in academic, psychosocial, and community functioning [1].

Deficits in executive functions (EFs), although not universal, are very common in ADHD individuals [2-5]. These functions are the mental capacities needed to formulate, plan, and perform the actions required to reach an objective [6]. Among the EFs, working memory (WM) has repeatedly been shown to be deficient in ADHD, as described in several meta-analyses [7-10]. WM facilitates active maintenance and manipulation of information without external stimuli for enough time to enable use of this information for some purpose [11]. WM has assumed a prominent role as a primary neurocognitive deficit or endophenotype in extant models of ADHD [2,8]. An intervention aimed at improving this cognitive ability would, therefore, be of considerable value in the treatment of ADHD.

Developmental theorists have proposed a neuropsychological subdivision of EFs. Zelazo et al. [12] differentiate between “cool” more abstract-cognitive EFs, such as WM, response inhibition, and cognitive flexibility, and “hot” affective EFs that involve incentives and motivation. Hot EFs include [13,14]: 1) delayed gratification and affective decision-making [15]; and 2) identification of the desires, thoughts, feelings, and intentions of others, and one’s own, also known as theory of mind (ToM) [16,17]. Although there is also evidence to the contrary [18-22], recent and increasingly robust evidence shows that ADHD individuals have deficits in ToM [19,23-32] and in decision-making [33-35]. Deficits in hot EFs in ADHD could be due to deficits in general regulatory processes. For example, Barkley’s ADHD model [36] is particularly relevant in the context of a potential contribution of cool EFs to social cognition deficits in ADHD, and some authors have specifically noted that WM contributes to hot EF processes [37].

Several studies have described a relationship between cool EFs and ToM in children and adolescents with normal development [38-43] and in neurodevelopmental disorders such as ADHD [29,44-47]. One specific cool EF domain more strongly associated with ToM is WM [40,41,48-51], probably because social cognition tasks require an individual to keep relevant social information in mind and to flexibly evaluate and process this information [52]. There are also other reasons to suspect that cool EFs and ToM might be related: 1) evidence from brain-imaging studies has identified the frontal lobes as the seat of ToM abilities and cool EFs [53-55]; 2) ToM acquisition emerges with improvements in cool executive tasks in preschool age [56]; and 3) individual differences in cool EFs and ToM correlate in individuals with normal development, even after adjusting for the effects of age and intellectual ability [38,39]. Furthermore, there may be directionality in this relationship, such that cool EFs predict ToM performance over time [57,58]. In view of the scarcity of studies that have examined this possibility, additional research including intervention research and longitudinal data is certainly needed.

There is some controversy regarding the relationship between decision-making and cool EFs. Some authors argue that cool EFs and decision-making are related and specifically cite WM, as WM provides the mechanism to hold on-line representations of various options and scenarios over a period of time [59-61]. Several studies have reported a role for WM in performing decision-making tasks [62,63]. This relationship may be asymmetrically dependent because decision-making seems to be influenced by the intactness or impairment of WM, but WM is not dependent on the intactness of decision-making [64]. On the other hand, some studies have found no relationship between WM and decision-making [65,66].

Klingberg et al. developed Robomemo® Cogmed Working Memory Training™ (CWMT), a computerized WM training program with several auditory and visuospatial WM tasks that are presented as attractive games designed for children [67]. This training has been used in various populations and has been effective for improving certain cognitive functions and psychiatric symptoms [67-71]. In healthy adults and in ADHD, CWMT is reported to produce changes in brain activity in areas involved in WM [72-75] and to facilitate dopaminergic transmission [76], which plays an important role in this cognitive function.

The effect of training on non-trained task performance can be differentiated into near-transfer effects (post-training improvement of performance in tasks similar to the training tasks) and far-transfer effects (post-training improvement in tasks that are different in nature or appearance from the training tasks) [77]. Far-transfer effects occur when two different tasks share an underlying processing component and neuroanatomical areas or neural circuits [78].

In summary, despite growing evidence of the presence of ToM and decision-making deficits in ADHD, it remains unclear whether there really is a relationship between these cognitive skills and cool EFs such as WM. This also raises questions about whether improving cool EFs in ADHD could improve hot EFs in this population. To our knowledge, there are no studies evaluating the effectiveness of cognitive training in hot EFs in ADHD.

The main aim of this study was to analyze the far-transfer effect of an intervention using the Robomemo® CWMT on decision-making and ToM in a sample of children with ADHD with or without comorbid disruptive behavior disorders, by conducting a randomized, double-blind, placebo-controlled, parallel-group clinical trial with an active control group and a 6-month post-intervention follow-up. An additional aim of this study was to analyze the relationship (correlation) of WM with decision-making and ToM at baseline in this sample of patients. Our hypotheses were that WM would be related to both ToM and decision-making and that CWMT would produce far-transfer improvements in these cognitive skills.

Materials and Methods

▪ Study design

For the main objective, we conducted a randomized, double-blind, placebo-controlled, parallel-group clinical trial in which participants were randomized (1:1) to an experimental group (CWMT) or a control group (non-adaptive training). For the secondary objective, we performed a correlational study.

▪ Participants

A power analysis was performed assuming a criterion of a 1 SD group difference in visuospatial and verbal WM performance-based tasks because, in the absence of increased WM capacity, it is theoretically unclear why WM training should lead to improvements in far-transfer tasks [79]. We assumed a 1 SD group difference, a risk of α=5%, a statistical power (1-β) of 95%, and a dropout rate of 20%. The resulting sample size was 66 subjects.
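The reported sample size can be approximated with the usual normal-approximation formula for comparing two means. The following is a minimal sketch, not the authors' actual calculation: the choice of formula, the function name, and the way attrition is inflated are assumptions. Only the effect size (1 SD), α, power, and dropout rate come from the text.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_groups(effect_size, alpha=0.05, power=0.95, dropout=0.20):
    """Per-group completers and total enrolment for a two-sided comparison of two means.

    Uses the normal approximation n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    This formula is an assumption; the article does not state which one was used.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n_per_group = ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)
    # Inflate each group so that n_per_group remains after the expected dropout.
    n_enrolled_per_group = ceil(n_per_group / (1 - dropout))
    return n_per_group, 2 * n_enrolled_per_group


per_group, total = sample_size_two_groups(effect_size=1.0)
print(per_group, total)
```

Under these assumptions the sketch gives 26 completers per group and 66 enrolled participants in total, which agrees with the sample size reported.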

In total, 66 outpatients from the Child and Adolescent Psychiatric Unit of the Mutua de Terrassa University Hospital participated in the study. All had been diagnosed with combined-type ADHD according to the DSM-IV-TR criteria. Comorbidity with other DSM-IV-TR disruptive behavior disorders (oppositional defiant disorder or conduct disorder) or elimination disorders was accepted. All diagnoses were confirmed using the Kiddie-Schedule for Affective Disorders and Schizophrenia, Present and Lifetime version (K-SADS-PL) [80] semistructured interview, which was administered to the participants’ parents. Other inclusion criteria included: 1) age between 7 and 12 years; 2) T score on the Conners ADHD index for parents and teachers > 70 at diagnosis; 3) no previous psychological or pharmacological treatment for ADHD; 4) access to a personal computer with an Internet connection. Exclusion criteria included: 1) IQ < 80; 2) comorbidity with autism spectrum disorder, psychosis, affective or anxiety disorder, consumption of toxic substances, learning disorder; 3) history of traumatic brain injury in the last two years; 4) perceptual-motor abnormalities that would preclude the use of a computer. Participants whose educational or socioeconomic context would make it unlikely for families to comply with the study requirements or follow the treatment procedure (families who did not speak Spanish or were monitored by social services due to suspected abuse/neglect) were also excluded from the study. Participants who participated in fewer than 20 training sessions or who initiated other pharmacological or psychological treatment during study participation were excluded from the subsequent data analysis.

A professional from the research team enrolled the participants and assigned them to either study group by random allocation using a computer-generated sequence. Study group allocation was blinded to the children, their families, their teachers, and the professionals who performed the cognitive assessments. Participants, their families, and their teachers were not aware of the differences between the experimental and control training (i.e., automatic adjustment of difficulty). The double-blind condition was maintained in all evaluations conducted throughout the study.

Following a thorough description of the study, verbal assent was obtained from the children and written informed consent from the parents. Upon completion of the study, participants in the control group were offered CWMT.

This study adhered to the principles outlined in the current legislation regarding clinical investigation (Declaration of Helsinki: Ethical Principles for Medical Research Involving Human Subjects, World Medical Association, 2004, Spanish Organic Law 15/1999 on Personal Data Protection, and Spanish Law 41/2002 on Patient Autonomy) and was approved by the Clinical Research Ethics Committee of the Mutua de Terrassa University Hospital. This study is registered as ISRCTN00767728 (www.

This study forms part of a broader line of research on the effects of CWMT in ADHD children. A previous publication [81] analyzed the effects of CWMT on cool EFs, clinical symptoms, functional impairment, and academic achievement. Results from the same participant population are being published separately because each article describes results related to different objectives and theoretical aspects, which allows further analysis of the results without overextending a single publication.

▪ Intervention

The experimental group underwent the CWMT RoboMemo® (2005, Cogmed Cognitive Medical Systems AB, Stockholm, Sweden), involving the following WM tasks: visuospatial, auditory, and location memory, plus tracking of moving visual objects. The level of difficulty was automatically adjusted to the performance of each participant, thus generating a prolonged cognitive demand that exceeded existing capacity limits to keep the task challenging throughout the training phase and thereby maximize WM performance gains [82]. Each training session included 90 trials and lasted 30 to 45 minutes. Participants attended 5 sessions per week over a 5-week period for a total of 25 sessions. The control group (nonadaptive training) engaged in the MegaMemo (2005, Cogmed Cognitive Medical Systems AB, Stockholm, Sweden), which consists of the same WM tasks as CWMT but without adjustment for difficulty. The remaining characteristics were the same for both groups, and both training programs were the translated Spanish versions.

After randomization, participants were given the respective training programme (CWMT or nonadaptive training) on a CD that contained 25 training sessions. Training was conducted in the patient’s home, under the supervision of a family member. The training included performance feedback on each task and a reinforcement game at the end of each session. The family was advised to add an additional reward at the end of each session. The response to each session, training time and number of sessions completed were recorded in an Internet database. A member of the research team (coach) examined this information on a weekly basis and contacted each family via telephone to ensure adherence to the rules and to answer questions. Participants in the analysis received no other pharmacological or psychological treatment until the end of their participation in the study, as was verified by asking the families and checking the records of participants’ visits to the Unit.

▪ Measures

An improvement index score was calculated for participants in the experimental group by subtracting the start index (results for training days 2 and 3) from the max index (results from the two best training days).
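The improvement index described above amounts to a simple subtraction. The sketch below is illustrative only: it assumes that the "results" for days 2 and 3, and for the two best days, are each summarized by their mean (the text does not say how the two days are combined), and the daily values are invented.

```python
from statistics import mean


def improvement_index(daily_index):
    """daily_index: per-session index values in training order (day 1 first)."""
    if len(daily_index) < 3:
        raise ValueError("need at least three training days")
    start_index = mean(daily_index[1:3])        # training days 2 and 3
    max_index = mean(sorted(daily_index)[-2:])  # the two best training days
    return max_index - start_index


# Hypothetical daily index values for one child (not study data).
example = improvement_index([60, 62, 66, 70, 75, 90, 88])
print(example)
```

For these invented values the start index is 64, the max index is 89, and the improvement index is 25.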

The Wechsler Intelligence Scale for Children (WISC-IV) [83] was administered to the entire sample to check that all participants met the inclusion criteria (IQ ≥ 80).

Outcome measures: Assessments of the outcome measures were conducted at baseline (T0), at 1 to 2 weeks post-training (T1), and at 6 months post-training (T2). Participants, their parents and teachers, and the professionals who performed the cognitive assessments were blinded to each child’s group assignment. Cognitive assessments were administered by appropriately trained psychology graduates in two sessions no more than one week apart and always in the same order.

For the evaluation of WM, we used: 1) backward digit span of the Wechsler Intelligence Scale for Children-IV (WISC-IV) [83] to measure verbal WM, 2) letter-number sequencing of the WISC-IV [83] to measure verbal WM, and 3) backward spatial span of the Wechsler Memory Scale-III (WMS-III) [84] to measure visuospatial WM.

To evaluate decision-making, we used the Iowa Gambling Task (IGT) [85], an experimental paradigm designed to mimic real-life decision-making situations in the way it factors uncertainty, reward and punishment.
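The IGT outcome analyzed later is the total net score for the last 40 cards. The article does not spell out the scoring in this section, so the sketch below relies on the conventional IGT definition as an assumption: decks C and D count as advantageous, decks A and B as disadvantageous, and the net score is the number of advantageous minus disadvantageous selections. The function name and the card sequence are hypothetical.

```python
def igt_net_score(choices, last_n=40):
    """choices: deck labels 'A'-'D' in the order the cards were selected."""
    tail = list(choices)[-last_n:]
    advantageous = sum(1 for deck in tail if deck in ("C", "D"))
    disadvantageous = sum(1 for deck in tail if deck in ("A", "B"))
    return advantageous - disadvantageous


# Hypothetical 100-card session: the last 40 cards are 25 from C and 15 from B.
session = ["A"] * 60 + ["C"] * 25 + ["B"] * 15
net_last_40 = igt_net_score(session)
print(net_last_40)
```

With 40 cards the score ranges from -40 to 40, and positive values indicate a preference for the advantageous decks.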

To evaluate ToM, we used: 1) Happé’s Strange Stories [86], a measure of advanced cognitive ToM [52], which is the ability to understand “cold” mental states, i.e., infer others’ thoughts and beliefs [87]; and 2) The Folk Psychology Test [88], the children’s version of the Reading the Mind in the Eyes Test, adapted from the adult version [89] that aims to assess “mind-reading” ability by understanding emotional states through the expression of the eye region. It is a measure of advanced affective ToM [52], namely the ability to understand “hot” mental states, i.e., infer others’ emotions [87] (Supplementary Information includes a more detailed description of outcome measures).

▪ Statistical analysis

A descriptive statistical analysis was performed using the variables of age, sex, years of schooling, and comorbid disorders. The chi-square test or Fisher exact test was used, when appropriate, to compare baseline categorical variables between the groups and the Student t test or Mann-Whitney U test was used for quantitative variables.

We computed composite scores for WM and ToM because the measure of a cognitive ability is more robust when obtained by combining several tasks that measure the same processes. This reflects their shared performance or ability [79]. The arithmetic mean of the corresponding standardized scores was calculated as the final composite score. The WM composite score included backward digit span and letter-number sequencing of the WISC-IV and backward spatial span of WMS-III. The ToM composite score included Happé’s Strange Stories and Folk Psychology Test.
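A composite of this kind can be sketched in a few lines. The example assumes that each task is standardized against the sample's own mean and sample SD, which the text does not specify, and the raw scores are invented for illustration.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize raw scores against the sample mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_scores(*task_scores):
    """Each argument holds one task's raw scores, in the same participant order."""
    standardized = [zscores(task) for task in task_scores]
    # Average the z-scores across tasks, participant by participant.
    return [mean(per_child) for per_child in zip(*standardized)]


# Hypothetical raw scores for five children on the three WM tasks.
digit_span = [3, 4, 5, 6, 7]
letter_number = [8, 10, 12, 14, 16]
spatial_span = [2, 4, 6, 8, 10]
wm_composite = composite_scores(digit_span, letter_number, spatial_span)
print(wm_composite)
```

Because every task is put on the same z-score scale before averaging, tasks with larger raw ranges do not dominate the composite.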

To evaluate the association between WM and ToM and decision-making, the Pearson correlation coefficient or Spearman’s rho was calculated at baseline, when appropriate.
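The two coefficients differ only in what they are computed on: Pearson's r uses the raw scores, while Spearman's rho is Pearson's r applied to ranks. The sketch below implements both with the standard library. The helper names and data are illustrative, and the normality check that decides between the two coefficients is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    # Spearman's rho is Pearson's r computed on the ranks.
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical scores: monotonic but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson_r(x, y), spearman_rho(x, y))
```

For a monotonic but non-linear relationship such as this one, rho reaches 1 while r stays below 1, which is why the rank-based coefficient is the safer choice for a skewed variable.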

To study the far-transfer effect of CWMT on ToM and decision-making, score changes between evaluations at study time points T0, T1, and T2 (T1-T0, T2-T1, T2-T0) were used as variables and analyzed using a general linear model, adjusted for age, sex and presence of a disruptive behavior disorder. In the evaluation of ToM (Happé’s Strange Stories and Folk Psychology Test), the WISC-IV (Wechsler Intelligence Scale for Children) Vocabulary subtest [83] was added as a predictor variable in the adjusted analysis because some studies have suggested a relationship between verbal ability and ToM [90,91]. The analyses were conducted as complete case analyses, i.e., did not include missing values. The effect size (d’), i.e., the difference between the scores obtained (T1-T0, T2-T1, T2-T0) for each group divided by the pooled standard deviations of both groups at T0 [92], and the 95% CI were calculated and classified as small (0.2), moderate (0.5), or large (0.8). Statistical tests were conducted assuming two-tailed contrasts with an alpha significance level of 5%. The Statistical Package for the Social Sciences (SPSS®, version 17.0) was used for the statistical analyses.
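The effect size d’ defined above can be illustrated with the working memory composite at T1-T0, using the rounded group means and baseline SDs reported in Table 2. This is a sketch under assumptions: the pooled SD is taken as the usual degrees-of-freedom-weighted formula (the article cites [92] without giving it), and changes are computed from group means rather than from individual change scores.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups, weighted by degrees of freedom."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def effect_size(change_exp, change_ctrl, sd_exp_t0, n_exp_t0, sd_ctrl_t0, n_ctrl_t0):
    """Between-group difference in change, scaled by the pooled baseline SD."""
    return (change_exp - change_ctrl) / pooled_sd(
        sd_exp_t0, n_exp_t0, sd_ctrl_t0, n_ctrl_t0
    )


# Working memory composite, T1-T0, from the rounded values in Table 2.
d = effect_size(
    change_exp=0.29 - 0.00,    # experimental group: T1 mean minus T0 mean
    change_ctrl=-0.30 - 0.01,  # control group: T1 mean minus T0 mean
    sd_exp_t0=0.79, n_exp_t0=35,
    sd_ctrl_t0=0.69, n_ctrl_t0=30,
)
print(round(d, 2))
```

The result is roughly 0.80, close to the 0.81 reported in Table 2. The small gap is plausibly due to rounding of the published means and to differences in which participants contribute to each time point.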

Results

The flow chart showing the participants’ progress through the study is presented in Figure 1. Of the 65 participants analyzed at T0, a total of 6.15% (n=4) completed fewer than 20 training sessions (2 due to technical problems, 2 who dropped out) and were not included in the subsequent data analysis. All other participants (93.85%) completed the 25 training sessions within a mean of 35.15 calendar days (SD: 3.15), with no statistically significant differences between the groups in this respect (Z=-0.54, df=59, p=0.59). Furthermore, 9.2% (n=6) started pharmacological treatment between T1 and T2 and were excluded from the study. There were no significant differences in the percentage of dropouts between the experimental and control groups in any study period (Fisher exact test: from T0 to T1: χ2=3.65, df=1, p=0.08; from T1 to T2: χ2=0.18, df=1, p=0.51; from T0 to T2: χ2=2.41, df=1, p=0.12). Another participant was excluded from the final data analysis due to a diagnosis of pervasive developmental disorder not otherwise specified. Missing values refer to measures not administered for organizational or technical reasons (T0: 1 IGT; T1: 1 IGT). The study was conducted between June 2010 and December 2012.


Figure 1: Flow chart of participant progress through the study


▪ Sociodemographic results

The demographic and clinical characteristics of the participants at T0 (baseline) are shown in Table 1. No significant differences between the groups were found for any of these variables or for the performance-based measures.

| | Experimental group | Control | Statistical value | p-value |
|---|---|---|---|---|
| Girls, % | 60 (n=21) | 50 (n=15) | 0.65 (χ2) | 0.46 |
| Age, years, mean (SD) | 8.79 (1.75) | 9.04 (1.68) | | 0.44 |
| Years of schooling, mean (SD) | 2.40 (1.80) | 2.57 (1.59) | -0.767 (Z) | 0.47 |
| Elimination disorder, % | 2.86 (n=1) | 6.67 (n=2) | 0.533 (χ2) | 0.59 |
| Oppositional defiant disorder, % | 31.43 (n=11) | 23.33 (n=7) | 0.529 (χ2) | 0.58 |
| Conduct disorder, % | 0 | 0 | --- | |
| IQ, mean (SD) | 100.63 (12.66) | 96.57 (11.26) | 1.91 (Z) | 0.18 |
| Ethnicity, % | | | 3.03 (χ2) | 1.00 |
| Spanish | 47.69 (n=31) | 43.08 (n=28) | | |
| Latin American | 3.08 (n=2) | 1.54 (n=1) | | |
| Other | 3.08 (n=2) | 1.54 (n=1) | | |
| Race, % | | | 0.43 (χ2) | 0.51 |
| White | 45 (n=29) | 48 (n=27) | | |
| Arabic | 3.08 (n=2) | 0 (n=0) | | |
| African | 1.54 (n=1) | 3.08 (n=2) | | |
| American Indian | 4.61 (n=3) | 1.54 (n=1) | | |
| Marital status of parents, % | | | 3.91 (χ2) | 0.14 |
| Married | 36.92 (n=24) | 35.38 (n=23) | | |
| Separated/divorced | 16.92 (n=11) | 7.69 (n=5) | | |
| Never married/single | 0 (n=0) | 3.08 (n=2) | | |
| Years of schooling of parents*, mean (SD) | 11.63 (3.20) | 10.87 (2.94) | 0.437 (χ2) | 0.36 |

Table 1: Baseline sociodemographic and clinical characteristics of participants, and p-value of the difference between the groups.

▪ Relationship between WM and ToM/decision-making

Statistically significant Pearson correlations were found between WM composite score and ToM composite score at baseline (r=0.47, p<0.001). Correlations were also significant between WM and each separate ToM measure: Happé’s Strange Stories (r=0.36, p=0.003) and Folk Psychology test total score (r=0.43, p<0.001). To calculate the correlations between WM and IGT, we used Spearman’s rho because the IGT total net score last 40 cards variable did not follow a normal distribution at baseline (p<0.05 in Kolmogorov-Smirnov test of normality). No significant correlations were found at baseline between IGT total net score last 40 cards and WM composite score (rho=-0.01, p=0.96).

▪ Efficacy of CWMT on hot EF

The mean improvement index in the experimental group was 30 (SD: 13.04). The mean and SD cognitive measurements at T0, T1, and T2 for the two groups are shown in Table 2.

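| Cognitive measurements | Group | T0 Mean | T0 SD | T0 n | T1 Mean | T1 SD | T1 n | d’* (95% CI) | T2 Mean | T2 SD | T2 n | d’** (95% CI) | d’*** (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Working memory composite score | E | -0.00 | 0.79 | 35 | 0.29 | 0.66 | 31 | 0.81 (0.30 to 1.32) | 0.04 | 0.83 | 28 | -0.69 (-1.19 to -0.19) | (-0.67 to 0.61) |
| | C | 0.01 | 0.69 | 30 | -0.30 | 0.83 | 30 | | -0.04 | 0.80 | 27 | | |
| Iowa Gambling Task: total net score last 40 cards | E | 0.47 | 7.60 | 34 | 0.19 | 10.04 | 31 | 0.14 (-0.35 to 0.63) | -0.64 | 13.80 | 28 | 0.17 (-0.32 to 0.66) | 0.30 (-0.19 to 0.79) |
| | C | 0.13 | 9.41 | 30 | -1.31 | 6.74 | 29 | | -3.56 | 7.26 | 27 | | |
| Theory of Mind composite score | E | 0.06 | 0.83 | 35 | -0.01 | 0.80 | 31 | -0.18 (-0.67 to 0.31) | -0.11 | 0.84 | 28 | -0.23 (-0.72 to 0.26) | -0.41 (-0.90 to 0.08) |
| | C | -0.07 | 0.89 | 30 | 0.01 | 0.83 | 30 | | 0.11 | 0.78 | 27 | | |

E: experimental group; C: control group.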

Table 2: Mean values for cognitive measurements at baseline (T0), post-intervention (T1), and 6-month follow-up (T2) in the experimental and control groups.

The results of the general linear model analysis are shown in Table 3. There were no statistically significant differences between the groups for the last two IGT blocks of 20 choices (second half of the task) at any point in time (T1-T0: t=-1.44, df=4, p=0.89; T2-T1: t=1.20, df=4, p=0.24; T2-T0: t=0.78, df=4, p=0.44), and effect sizes were small (T1-T0: d’=0.14, 95% CI: -0.35 to 0.63; T2-T1: d’=0.17, 95% CI: -0.32 to 0.66; T2-T0: d’=0.30, 95% CI: -0.19 to 0.79). The single significant predictive variable was age, seen from T1 to T2 (t=2.06, df=4, p=0.04), with a positive beta (0.29), indicating that older children showed better performance.

| Outcome | Interval | Predictor variable* | R2 ** | p of Beta*** | 95% CI for B *** |
|---|---|---|---|---|---|
| Iowa Gambling Task (IGT): total net score, last 40 cards | T1-T0 | - | -0.03 | 0.89 | -6.07 to 5.26 |
| | T2-T1 | age | 0.02 | 0.24 | -2.48 to 9.81 |
| | T2-T0 | - | -0.02 | 0.44 | -4.89 to 11.04 |
| Theory of Mind composite score | T1-T0 | - | -0.08 | 0.45 | -0.60 to 0.27 |
| | T2-T1 | - | -0.08 | 0.93 | -0.43 to 0.40 |
| | T2-T0 | - | -0.10 | 0.57 | -0.64 to 0.35 |

Table 3: Regression analysis for differences in hot EFs at T1-T0, T2-T1 and T2-T0.

No statistically significant differences between the groups for ToM composite score were recorded at any point in time (T1-T0: t=-0.76, df=4, p=0.45; T2-T1: t=-0.09, df=4, p=0.93; T2-T0: t=-0.58, df=4, p=0.57), and effect sizes were small (T1-T0: d’=-0.18, 95% CI: -0.67 to 0.31; T2-T1: d’=-0.23, 95% CI: -0.72 to 0.26) or small to moderate (T2-T0: d’=-0.41, 95% CI: -0.90 to 0.08) (Table 3). The same analysis produced similar results when ToM variables were considered separately (Folk Psychology Test: T1-T0: p=0.77; T2-T1: p=0.98; T2-T0: p=0.77; Happé’s Strange Stories: T1-T0: p=0.30; T2-T1: p=0.66; T2-T0: p=0.37). The complete results of this analysis are not included, but are available upon request.


Discussion

The results obtained in this study indicate, firstly, that WM relates differently with the two hot EFs evaluated, as WM and ToM show a correlation, but WM and decision-making do not. Secondly, an intervention using the Robomemo® CWMT in a sample of children with ADHD yielded no far-transfer effects post-training or at 6-month follow-up on hot EFs, decision-making, or advanced affective and cognitive ToM. To our knowledge, this is the first study analyzing the effectiveness of cognitive training on hot EF decision-making and ToM deficits in ADHD.

The correlational analyses performed between WM and both ToM and decision-making at baseline make it possible to distinguish different reasons why no such far-transfer effects were found. On the one hand, the results suggest that the lack of post-training improvement in decision-making was due to the absence of a relationship between WM and decision-making in ADHD. On the other hand, our results indicate that ToM and WM are related in the sample of children with ADHD analyzed; the lack of improvement in ToM therefore suggests that CWMT itself is not effective in improving this cognitive skill and does not show far-transfer effects on ToM in ADHD.

The relationship between WM and ToM found is consistent with other results reported in the literature in subjects with normal development [40,41,48-51], and in subjects with attention and conduct problems [93], but to our knowledge, this is the first study to demonstrate a relationship between WM and ToM in ADHD. Although other studies have found a relationship between other EFs and ToM in ADHD [29,44-47], the relationship between WM and ToM seen in this study highlights the primary role of WM deficits in this neurodevelopmental disorder.

Despite the relationship between WM and ToM, CWMT does not improve ToM post-training or at 6-months’ follow-up in a sample of children with ADHD, which suggests that CWMT does not produce far-transfer effects in this cognitive skill. This absence of far-transfer effects was not explained by a lack of near-transfer WM improvements. In a previous publication with the same sample [81] in a randomized, double-blind, placebo-controlled, parallel-group clinical trial with an active control group and a 6-month post-intervention follow-up, CWMT was seen to improve post-training WM with a large effect size, and the improvement remained significant over the long term [81]. Further, other far-transfer effects after CWMT were found, such as short and long term improvements in cool EFs, ADHD symptoms, and functional impairment related to school learning [81]. In contrast, other studies with ADHD samples have not found far-transfer effects with CWMT [94-97]. This has seriously questioned the effectiveness of such training, because finding evidence of far-transfer effects in cognitive training is by far the aspect considered most relevant to demonstrate its effectiveness [70]. Furthermore, some authors have noted methodological limitations in research on Cogmed [79,98,99], thus generating much controversy in the literature. To our knowledge, this is the first study analyzing the effectiveness of CWMT on ToM deficits in ADHD.

The absence of far-transfer effects in ToM in this study may be related to the specificity of the stimuli used in the CWMT. In one study [100], WM training with neutral material was compared to training with emotional material; only WM training including emotional material produced transfer to an affective executive control task (emotional Stroop task). The authors concluded that studies relying solely on neutral material may fail to target processes specific to the manipulation and processing of affective information; hence, affective effects would be selective to affective executive training [100].

More speculatively, the results of our study could indicate that deficits in WM do not have a causal relationship with deficits in ToM in ADHD. The correlation between ToM and WM found in this study and in previous ones, including longitudinal correlations, is an insufficient demonstration of causality, since any observed relationship may be mediated by some unknown third factor. The design used here (randomized controlled trial) allowed us to explore the possible existence of a causal relationship between WM and both ToM and decision-making [101]. It is possible that the reason for the lack of improvement in ToM after cognitive WM training was that WM and ToM do not have a causal relationship. ToM deficits in ADHD may be causally related to other cognitive deficits [38,52,57,66,102] or may reflect primary difficulties, but not secondary consequences of more general cognitive dysfunctions [27]. Additional randomized trials with cognitive WM training that evaluate the effects on ToM are needed to confirm this possibility.

The absence of a relationship between WM and decision-making is consistent with results from other correlational studies [65,66,103] and also with results from other studies using an experimental design. For example, in a study with methadone maintenance patients, CWMT did not improve decision-making [104]. All these results are consistent with the absence of a relationship between WM and decision-making and with separable pathway models in ADHD [105-107], which include the dissociable contributions of (cool) executive dysfunction and motivational dysfunction. For example, in the dual-pathway model proposed by Sonuga-Barke [108], two dissociable neurodevelopmental pathways can lead to ADHD. The first is the executive dysfunction pathway, a top-down dysregulation characterized by poor inhibitory control, set-shifting, and reorienting of attentional resources. This pathway is subserved by cortical and subcortical networks (dorsolateral prefrontal cortex, dorsal anterior cingulate cortex, supramarginal gyrus, dorsal caudate nucleus, frontal eye fields, and supplementary motor cortex) [109]. The second is the motivational dysfunction pathway, a bottom-up dysregulation characterized by delay aversion associated with fundamental alterations in reward mechanisms [108]. It is subserved by frontolimbic circuits (subgenual and orbitofrontal cortices, amygdala, hippocampus, and ventral striatum) [110]. Additionally, the results obtained in this study, together with those from a previous report [81], indicate that CWMT can improve cool EFs but not decision-making in ADHD and, therefore, these two pathways show different responses to treatment.

Another explanation for the present results is that WM resources are necessary but not sufficient for the development of decision-making; that is, these cognitive functions develop independently from WM, but WM is relevant to the expression or application of these skills [64]. This hypothesis is based on a series of studies showing that an overload in WM worsens performance in tasks such as decision-making [62,63]. The design used in this study does not rule out this possibility.

Another possible explanation for the absence of effects of training on hot EFs could be related to the characteristics of the training used. In the experimental group, the level of difficulty was automatically adjusted to each participant’s performance, leading to prolonged cognitive demand that can be frustrating [111]. This could have minimized the effects of training on hot EFs, since children with ADHD have higher cognitive difficulties in situations that generate anger, frustration or negative emotions [112,113].

This study has some limitations. We cannot ensure that there are ToM and decision-making deficits in the sample used in this study because we did not have a sample of healthy comparison subjects. Although there is abundant evidence of the existence of hot EF deficits in ADHD, consensus on this issue is incomplete. Some studies have found no deficits in decision-making [20,22] or ToM [18,19,21] in ADHD. Limitations related to the sensitivity and ecological validity of some measures may explain the inconsistencies described in the literature, especially for the ToM measures, which probably fail to capture the complexities of social interaction in the real world [114].

Several investigators in a line of research focusing on differences between ADHD and conduct disorder based on underlying brain substrates have reported that hot EF deficits are specific to conduct disorder, but not to ADHD [115], although others, for example Groen [33], report evidence to the contrary. The absence of deficits in these areas could explain why improvements were not observed with training.

Related to this argument, the inclusion of ADHD children with comorbid disruptive behavior disorders in the sample may have hampered detection of changes in the outcome measures, since the skills related to hot EFs may differ in those disorders [25,115]. Nonetheless, the statistical analyses controlled for comorbidity with oppositional defiant disorder.

It may be difficult to draw conclusions about the relationship between WM and ToM and decision-making because these skills may continue to develop until late adolescence [116,117]. This notion is partially supported by the results of our study, because older children showed less difficulty in a decision-making task from post-training to 6 months’ follow-up. Further, some authors have argued that the IGT is too difficult for children [20], although it has been used in other studies in this population. Perhaps the inclusion of older subjects would have provided clearer results.

Because of the comprehensive evaluation used, we risked committing a Type I error if we analyzed the measures separately. On the other hand, we risked committing a Type II error if we corrected for multiple comparisons using strict criteria. Instead, we chose to compute robust composite measures when possible. The analyses were not conducted on an intent-to-treat basis, but rather as complete case analyses.
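The trade-off between Type I and Type II error described above can be illustrated with a minimal sketch. It assumes independent tests, a nominal significance level of 0.05, and hypothetical numbers of outcome measures; none of these values are taken from the analyses in this study, and the function names are illustrative only.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m independent tests,
    each run at significance level alpha (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold under a Bonferroni correction for m tests."""
    return alpha / m


if __name__ == "__main__":
    alpha = 0.05  # hypothetical nominal level
    for m in (1, 5, 10):  # hypothetical numbers of separately analyzed measures
        fwer = familywise_error_rate(alpha, m)
        per_test = bonferroni_threshold(alpha, m)
        print(f"m={m}: familywise error rate={fwer:.3f}, Bonferroni threshold={per_test:.4f}")
```

As the number of separately tested measures grows, the uncorrected familywise error rate rises (Type I risk), while the corrected per-test threshold shrinks and makes true effects harder to detect (Type II risk). Combining related measures into a composite reduces the number of tests, which is one way to ease both pressures.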

The results cannot be generalized to ADHD children with IQ < 80, to children with comorbidities other than disruptive behavior disorders or elimination disorders, to children whose educational or socioeconomic context would make it unlikely for families to comply with the treatment procedure, to children younger than 7 or older than 12 years of age, or to children who have already received psychological or pharmacological treatment for ADHD.


Conclusions

Robomemo® CWMT did not improve the hot EFs of decision-making and ToM in a sample of ADHD children at post-training or at 6 months of follow-up. This is explained primarily by the absence of a relationship between decision-making and WM in this sample of children with ADHD, which in turn supports the view that different pathways exist in ADHD, with dissociable contributions of decision-making and cool EF deficits that respond differently to WM training. Secondly, because the results indicate the presence of a relationship between WM and ToM, the lack of improvement in this cognitive ability post-training and at the 6-month follow-up seems to indicate that CWMT does not produce far-transfer effects on ToM. It should be noted, however, that there are other possible explanations (such as lack of a causal relationship between WM and ToM) and, consequently, the results require replication.


Maribel Ahuir, Llanos Artigao, Clara Barba, Andrea Bracho, Bernat Carreras, Noemi Carrillo, Marta Doñate, Cristina Enero, Alejandra Escura, Adrian Gaitan, Javi Sanchez, Pablo Vidal-Ribas, Maria Teresa Ordeig, Sylva-Astrik Torossian, Celine Cavallo, Helen Casas.

This study received financial support through the award 22è PREMI FERRAN SALSAS I ROIG – Salut Mental i Comunitat granted by the City Council of Rubi (Spain) in 2010.