
Pooling autism brain imaging data can distort results

31 August 2015
Scanner scramble: Three scanners tell their own stories (top) about how the thickness of the brain’s outer rind changes with age. The results from the machines vary for multiple brain regions (bottom, orange).

Guillaume Auzias

Combining data from different brain scanners can lead to false findings if variation between the machines is not taken into account, suggests a new study. The work helps explain why results detailing the neuroanatomy of autism often contradict one another or are not replicated [1].

Merging imaging data from different labs offers a possible solution to a perennial problem: recruiting enough participants to get a statistically significant result. Yet scanners vary along dimensions such as the strength of the magnetic field they generate. What’s more, research teams may use different programs to extract anatomical information from the images.

Although investigators who use magnetic resonance imaging (MRI) to examine brain structure knew about this variability, until now, no one had quantified it in studies of autism or in children.

Christine Deruelle and her colleagues found that discrepancies in the measurements from scanners at different imaging centers can introduce huge variation in the data about brain structure in autism. The results were published 22 July in the IEEE Journal of Biomedical and Health Informatics.

“For the first time, we show that these recording parameters have a major effect,” says Deruelle, research director at the National Center for Scientific Research in Marseille, France. “This result explains the numerous inconsistencies found in the literature.”

Other researchers say this sort of analysis is long overdue.

“This study is to be applauded for highlighting the critical importance of accounting for the effects that different scanners have on the fidelity of brain imaging signals,” says Nicholas Lange, associate professor of psychiatry at Harvard University, who was not involved in the research. “It should give brain scientists pause when making claims based on multi-scanner data.”

Dubious datasets:

Deruelle and her colleagues combed through more than 1,000 images from the Autism Brain Imaging Data Exchange (ABIDE). This database includes MRI data from people with and without autism, compiled from 17 different research centers.

They narrowed their sample to 159 people, about half of whom have autism, from three MRI centers. They restricted the study to right-handed boys and men, aged 8 to 23 years, to avoid differences based on gender and handedness, says Guillaume Auzias, postdoctoral fellow in Deruelle’s lab at Aix-Marseille University in Marseille.

The team used a standard statistical method to parse data on the thickness of the cerebral cortex, the brain’s outer rind, at various locations. This step eliminates any variability from the use of different algorithms to measure the depth. What remains, then, are differences related to scanner quirks or to autism itself.

The finding “should give brain scientists pause when making claims based on multi-scanner data.”

Crunching the numbers to isolate the effects of autism, the researchers found that the brains of people with autism consistently differ from those of their typically developing peers only in the thickness of the motor cortex, a region that controls movement. This finding jibes with multiple studies that link an abnormal thickness in this area with the disorder [2].

By contrast, they found that variations among scanners at least partially explain the inconsistencies in reports regarding the thickness of the frontal cortex. Abnormalities in this region, particularly in sections governing decision-making and planning, have been thought to underlie some of the core features of autism.

The evidence so far conflicts on what the abnormality might be. Some studies indicate that the frontal cortex is unusually thick in people with autism, whereas others suggest the reverse. The new analysis shows, however, that the thickness in the frontal cortex does not differ between controls and affected individuals.

The researchers also confirmed that age is a factor. In general, the cerebral cortex thins as the brain matures. One 2014 study reported that in people with autism, the cortex shrinks unusually rapidly in late childhood and then more slowly in adulthood than in controls. But the new analysis indicates no difference in thinning over time between the groups. Again, scanner-related variations can at least partially account for the lack of consensus, Auzias says.

For certain brain regions, such as the insular cortex, which is involved in emotion processing, the researchers found that some scanners register a larger age-related change than others. Depending on the scanner, “the way the cortical thickness evolves with age is not the same,” Auzias says.

The findings indicate that the variations between scanners and the way they are used play an important but complex role in the data they produce.

“We all imagined that this was the case, and ABIDE allows us now to measure the extent of these differences,” says Roberto Toro, neuroscience researcher at the Institut Pasteur in Paris, who was not involved in the research.

Deruelle and her colleagues plan to extend their study to anatomical features such as the depth of cortical folds and the total surface area of the cortex. They expect some of these measures, such as the shape of the cortical surface, to be more consistent than cortical thickness. “These surface-related landmarks appear to be much more robust and reliable across MRI centers than the typically used indices such as cortical thickness or volume,” says Deruelle.

Still, any researcher using pooled MRI data needs to control for variability between scanners. One potential strategy would be to test each scanner using a reference substance — say, an inert material of constant volume, geometry and magnetic properties — for calibration purposes, Deruelle says. Ideally, she adds, investigators would also collect anatomical data at various time points from the same people at each site to control for differences.
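One simple way to see why pooled data need a scanner adjustment is to compare a naive pooled group difference with one computed within each site and then averaged. The sketch below uses made-up thickness values (in millimeters) and hypothetical site labels; it illustrates stratification by site, which is an assumption chosen for clarity and not the statistical method used in the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (site, group, cortical thickness in mm).
# Site "B" reads about 0.3 mm thicker than site "A" and contributes
# more autism scans, so the scanner offset is confounded with diagnosis.
RECORDS = [
    ("A", "control", 2.4), ("A", "control", 2.5), ("A", "control", 2.6),
    ("A", "autism", 2.5),
    ("B", "control", 2.8),
    ("B", "autism", 2.7), ("B", "autism", 2.8), ("B", "autism", 2.9),
]


def naive_difference(records):
    """Autism minus control mean, ignoring which scanner produced each value."""
    autism = [t for _, g, t in records if g == "autism"]
    control = [t for _, g, t in records if g == "control"]
    return mean(autism) - mean(control)


def site_adjusted_difference(records):
    """Average of within-site group differences, weighted by site sample size.

    Assumes every site has at least one scan in each group.
    """
    by_site = defaultdict(lambda: {"autism": [], "control": []})
    for site, group, thickness in records:
        by_site[site][group].append(thickness)
    weighted_sum, total_n = 0.0, 0
    for groups in by_site.values():
        n = len(groups["autism"]) + len(groups["control"])
        diff = mean(groups["autism"]) - mean(groups["control"])
        weighted_sum += n * diff
        total_n += n
    return weighted_sum / total_n


if __name__ == "__main__":
    print(f"naive pooled difference:  {naive_difference(RECORDS):+.3f} mm")
    print(f"site-adjusted difference: {site_adjusted_difference(RECORDS):+.3f} mm")
```

With these toy numbers, the naive comparison suggests the autism group has a cortex about 0.15 mm thicker, even though the two groups have the same mean within each site. Real analyses would more likely include the scanner as a covariate in a regression model alongside age, but the confounding mechanism is the same.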

  1. Auzias G. et al. IEEE J. Biomed. Health Inform. Epub ahead of print (2014)
  2. Nickl-Jockschat T. et al. Hum. Brain Mapp. 33, 1470-1489 (2012)
  • Pierre Bellec

    Very interesting article. I am surprised about the highlight “The finding should give brain scientists pause when making claims based on multi-scanner data.” One can always look for consistent effects despite systematic variations across scanners. Which is what the authors of the publication did. Those consistent effects are more likely to reflect true biological differences. From my perspective, the finding reported here should really give brain scientists pause when making claims based on single-scanner data. Their result may not generalize at all.

    • Roberto Toro

      Agree. There’s no reason why the datasets in ABIDE should be more or less “dubious” than those in the previous literature.

  • Planet Autism

    “They narrowed their sample to 159 people, about half of whom have autism, from three MRI centers. They restricted the study to right-handed boys and men, aged 8 to 23 years,”

    So yet ANOTHER study which has excluded females!!!!

    • Guillaume Auzias

      Yes, this is a problem in the research on autism, but there are some reasons to not mix males and females in such a work.
      In the particular case of ‘morphometry’ (analysis of the structure of the brain using MRI), it is known that important differences exist between males and females in healthy population. Because these differences are complex, mixing males and females would have impeded the interpretation of the effect of scanner, which was the main point of interest.
      This also holds for right-handed versus left-handed subjects.

  • Roberto Toro

    It is also very likely that the differences between centres in the paper by Auzias are due to the small sample sizes…

  • Roberto Toro

    A more detailed comment:

    It is often the case that contradictory findings are reported about the neuroanatomy of autism, which makes it difficult to make sense of the scientific literature. Auzias et al. suggest that the two main reasons for these discrepancies may be (1) the differences in the cohorts recruited and (2) the differences in the methodologies used. I think that we need to add an important third one: small sample sizes (which we analysed and discussed in our paper in Biol. Psychiatry, attached).

    If the trait that we want to measure is very variable (and neuroanatomical traits are), we will require large cohorts to obtain a reliable measure. This is also the case when we want to measure a difference between two groups, for example, in cortical thickness or brain volume between persons with autism and controls. If the difference is large, a small group of subjects should be enough; but if it is small, we may require very large sample sizes. For the last 30 years, researchers studying the neuroanatomical correlates of ASD have often relied on extremely small cohorts, often of the order of 20 patients compared to 20 controls, or fewer. For reference, if we wanted to reliably detect the difference in weight between males and females, a strong and easily detectable difference, we would require at least 90 subjects!

    With small cohorts, estimates are very variable and unreliable. For example, if you wanted to measure the difference in weight between males and females, but you only measured 3 males and 3 females, it would be very easy to find a sample where there is no significant difference between groups, or even a sample where females are heavier than males. This would be extremely unlikely if you had 1000 males and 1000 females (statistical power analysis shows that in the case of weight, with ~100 subjects you would obtain the correct answer about 80% of the time).

    Our field has realised today that we require cohorts much larger than those that have been traditionally recruited. In genetics, AGP, AGRE, and the Simons Simplex Cohort are examples of such cohorts, with several thousands of subjects. No such large cohorts exist currently to study the neuroanatomy of autism. This motivated Adriana Di Martino from NYU and collaborators to create ABIDE, an international effort to pool MRI data from 17 different research centres, providing open access to > 1000 patients and controls.

    In their article, Auzias et al. show that the differences among centres were very important. We all imagined that this was the case, and ABIDE allows us now to measure the extent of these differences. Whereas in the past the only way of comparing data from different centres was by reading (or better, meta-analysing) the results published in the literature, ABIDE now makes it possible to access the raw data, and to analyse more than 1000 subjects using the same methodology (which in particular helps alleviate the issue of publication bias: the fact that papers reporting significant results are much more likely to be published than those reporting non-significant results).

    Ideally, we would require a large, open access, cohort of several thousands of patients and controls, with data acquired using the same protocols. An alternative would be to agree on a “harmonisation” protocol, so that data acquired by different research groups would be more comparable. Concretely, this does not exist today, and ABIDE provides a viable, open access alternative to study a large cohort of patients and controls using homogeneous analysis methods, and without suffering from publication bias. The other alternative is to try to make sense of the published literature…
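The power argument in the comment above can be made concrete with a rough calculation. The sketch below uses a normal approximation to a two-sided, two-sample comparison, with group size and standardized effect size as inputs. The effect size of 0.6 is a hypothetical value chosen for illustration; it does not come from the comment or from the study, and an exact t-test calculation would give slightly different numbers at very small sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_two_groups(effect_size, n_per_group, alpha=0.05):
    """Approximate power to detect a standardized mean difference.

    Normal approximation to a two-sided, two-sample test with equal
    group sizes and equal variances.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic lands in either rejection tail.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(effect_size, power=0.80, alpha=0.05):
    """Smallest group size reaching the target power under the same approximation."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    d = 0.6  # hypothetical standardized effect size, for illustration only
    for n in (3, 20, 1000):
        print(f"n per group = {n:4d}: power = {power_two_groups(d, n):.2f}")
    print("n per group for 80% power:", n_per_group_for_power(d))
```

Under this assumed effect size, three subjects per group give very low power, 1,000 per group give power close to 1, and 80% power needs on the order of 45 subjects per group, which is in line with the roughly 90 to 100 subjects mentioned in the comment.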

  • Paul Thompson

    Excellent article! One way to address this directly is meta-analysis, as the ENIGMA consortium is doing – see the first 2 papers here, on schizophrenia and major depression:
    We don’t assume that the effect sizes will be the same across scanners and cohorts, and it’s interesting to look at what is driving the differences.
    Paul Thompson,

  • Sue Gerrard

    Not sure it’s the differences between scanners so much as the differences between participants. Why is it so often assumed that everyone with autism has the same cause for their autism?

    • Roberto Toro

      ABIDE includes many results from different tests: ADOS (+ subscales), SRS (+ subscales), AQ, full IQ, verbal IQ, performance IQ, etc. These scores are not perfect, but they give researchers an idea of the differences among subjects within centres and across centres. The differences exist, but there’s still a large source of differences that may be due to something else. That something else may be the scanner, but also many other methodological parameters not directly related to the cognitive profile of the subjects.

  • Shree Vaidya

    Excellent article. It is an eye-opener for brain researchers. Thank you.

  • Guillaume Auzias

    This discussion is very interesting, thank you all!
    I entirely agree with the remarks of R.Toro on sample sizes.
    It seems important to point out that it was not possible to include more subjects while keeping the match in age between subsamples across the three centers and, at the same time, between patients and controls within each center, despite the more than 1,000 subjects available.
    For me, initiatives like ENIGMA can therefore afford to go further, but attention must be paid to standardization of treatments and quality control, which are also important confounding factors that we controlled for in this work.

  • katiebelardi

    Brain imaging studies are fascinating, but there are many methodological limitations, above and beyond sampling (a very real challenge in the field), that readers should be aware of. Different scanners will produce different results, making it messy to combine data sets. It’s similar to combining audio from two different audio recorder models. Differences are inevitable, especially when it comes to quality. These studies also do not include participants who are claustrophobic, thus excluding a fair number of individuals with ASD who may be more representative of the population.

