Brain decoding and networks studies: well beyond standard analysis in fMRI with Python and scikits.learn

Gael Varoquaux (Unicog INSERM, Neurospin, France), Alexandre Gramfort (Martinos Center, Harvard Medical School, MGH, Boston), Vincent Michel (Parietal INRIA, Neurospin, France), Fabian Pedregosa (Parietal INRIA, Neurospin, France), Bertrand Thirion (Parietal INRIA, Neurospin, France)

Brain function arises from the interaction of functionally specialized regions. However, the well-established standard analysis framework performs statistical analysis separately on each voxel of the functional brain images. This restriction arises from the difficulty of performing multivariate analysis in the small-sample limit, also known as the curse of dimensionality. When considering full-brain images comprising many features, the number of possible statistical parameters or hypotheses to investigate explodes. Addressing this limitation is a topic of very active research that applies modern multivariate analysis to fMRI data.

The development of advanced fMRI data analysis often draws from machine learning. Indeed, many problems that arise, such as pattern analysis, cluster analysis, penalized regression, and graphical modeling, are common and well-studied learning problems that have applications in a variety of data-intensive fields. However, these methods often rely on challenging mathematical derivations or nontrivial optimization algorithms. As a result, investigators in neuroscience tend to limit themselves to building blocks available off the shelf, even if these do not answer their specific needs. Implementing and comparing the wide variety of algorithms published in the machine learning literature is a daunting task.

To this end, we present the scikits.learn, a Python toolbox for machine learning, and its use combined with nipy (neuroimaging in Python) and Mayavi (3D scientific visualization) for advanced fMRI analysis. The scikits.learn does not provide end-user tools for neuroimaging data analysis, but rather a large number of elementary components, namely machine learning algorithms, that can be assembled to build a model or answer a neuroimaging problem. The scikits.learn is driven by a large community of experts working in different data-intensive fields, such as text mining, image processing, and speech recognition. It strives to present high-quality implementations of state-of-the-art methods and models used in these different fields under a common programming interface to facilitate comparisons. It is distributed under the very liberal BSD license and comes with lengthy documentation that can be browsed on
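
The benefit of a common programming interface can be illustrated with a minimal sketch. It assumes the present-day `sklearn` package (the successor of the scikits.learn, whose import paths differ from the release described here) and uses synthetic data as a stand-in for voxel signals. The data sizes and the two estimators are illustrative choices, not the ones used by the authors.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

# Synthetic stand-in for fMRI data: 80 "scans" x 500 "voxels" (hypothetical sizes)
rng = np.random.RandomState(0)
n_samples, n_voxels = 80, 500
y = np.repeat([0, 1], n_samples // 2)   # two experimental conditions
X = rng.randn(n_samples, n_voxels)
X[y == 1, :20] += 1.0                   # only the first 20 voxels carry signal

# All estimators share the same fit/predict interface, so comparing
# models only requires swapping the object passed to cross_val_score.
estimators = {
    "linear SVM": LinearSVC(C=1.0, max_iter=10000),
    "logistic regression": LogisticRegression(C=1.0, max_iter=1000),
}
scores = {}
for name, estimator in estimators.items():
    # Mean accuracy over 5 cross-validation folds
    scores[name] = cross_val_score(estimator, X, y, cv=5).mean()
    print(f"{name}: mean accuracy = {scores[name]:.2f}")
```

Because both models are scored by the same call, adding a third candidate amounts to one more dictionary entry.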

We have used the scikits.learn to address various aspects of neuroimaging data processing. In supervised learning settings, we have developed algorithms for decoding, i.e. prediction of behavior from brain images [1,2]. In unsupervised settings, we have developed models and algorithms for data-driven extraction of brain networks using ICA [3], and for segmenting brain regions with sparse dictionary learning [4]. Finally, we have implemented population-level graphical models of functional connectivity with sparse inverse covariance estimators [5]. We believe that, by exposing new algorithms to the neuroscience data processing community, a domain-agnostic package such as the scikits.learn can speed up the development of new methods and models.
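
As a rough illustration of sparse inverse covariance estimation, the sketch below fits a single-subject graphical model to synthetic "region" time series. It assumes the present-day `sklearn` package and its `GraphicalLasso` estimator. It is not the population-level model of [5], and the number of regions, the chain-shaped ground truth, and the penalty `alpha` are all hypothetical choices.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Synthetic stand-in for region time series: 200 time points x 5 regions
rng = np.random.RandomState(0)
n_timepoints, n_regions = 200, 5

# Ground-truth sparse precision (inverse covariance) matrix:
# a chain in which each region is directly coupled to its neighbours only
true_precision = np.eye(n_regions)
for i in range(n_regions - 1):
    true_precision[i, i + 1] = true_precision[i + 1, i] = 0.4
true_covariance = np.linalg.inv(true_precision)
signals = rng.multivariate_normal(
    np.zeros(n_regions), true_covariance, size=n_timepoints
)

# The l1 penalty (alpha) drives weak conditional dependencies to zero
model = GraphicalLasso(alpha=0.05).fit(signals)
estimated_precision = model.precision_

# Non-zero off-diagonal entries are read as direct connections between regions
print(np.round(estimated_precision, 2))
```

Larger values of `alpha` yield sparser graphs. In practice the penalty would be chosen by cross-validation, for instance with `GraphicalLassoCV`.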


  1. V. Michel, A. Gramfort, G. Varoquaux, E. Eger and B. Thirion, Total variation regularization for fMRI-based prediction of behaviour, IEEE Transactions on Medical Imaging, in press.
  2. V. Michel, A. Gramfort, G. Varoquaux, E. Eger, C. Keribin and B. Thirion, A supervised clustering approach for fMRI-based inference of brain states, Pattern Recognition, in press.
  3. G. Varoquaux, S. Sadaghiani, P. Pinel, A. Kleinschmidt, J.B. Poline and B. Thirion, A group model for stable multi-subject ICA on fMRI datasets, NeuroImage 51, pp. 288-99, 2011.
  4. G. Varoquaux, A. Gramfort, F. Pedregosa, V. Michel and B. Thirion, Multi-subject dictionary learning to segment an atlas of brain spontaneous activity, Information Processing in Medical Imaging, 2011.
  5. G. Varoquaux, A. Gramfort, J.B. Poline and B. Thirion, Brain covariance selection: better individual functional connectivity models using population prior, Advances in Neural Information Processing Systems, 2010.
Preferred presentation format: Poster
Topic: Neuroimaging
