Medical data management with XNAT: from study organisation to distributed processing with OpenMOLE

21st October 2016

Held in conjunction with the 2016 MICCAI international conference


Outline of the event topic and scope

This tutorial will introduce attendees to XNAT, the widely used medical imaging informatics platform. XNAT is designed to help manage medical imaging projects by serving as a central data store and workflow system. It natively handles different user-level permissions for data access, and provides a simple interface to run processing pipelines on the stored data.

We will introduce XNAT’s programming interfaces and a high-level Python API (Application Programming Interface) that interacts with XNAT through its REST API and integrates seamlessly into Python applications. This Python wrapper enables XNAT to be integrated into a fully automated pipeline.
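To make the idea of querying XNAT from Python concrete, here is a minimal sketch that uses only the standard library rather than the wrapper presented in the tutorial. It builds URLs for the XNAT REST listing resources (projects, the subjects of a project, the experiments of a subject) and parses the JSON listing. The server address, project and subject identifiers, and the helper names `xnat_url`, `parse_result_set` and `list_rows` are illustrative assumptions, and the `ResultSet`/`Result` layout should be checked against your own XNAT instance.

```python
import base64
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def xnat_url(server, project=None, subject=None, fmt="json"):
    """Build the URL of an XNAT REST listing (hypothetical helper).

    No arguments -> all projects; project -> its subjects;
    project + subject -> the experiments of that subject.
    """
    if subject is not None and project is None:
        raise ValueError("a subject can only be addressed within a project")
    parts = ["data", "projects"]
    if project is not None:
        parts += [quote(project, safe=""), "subjects"]
    if subject is not None:
        parts += [quote(subject, safe=""), "experiments"]
    query = urlencode({"format": fmt})
    return server.rstrip("/") + "/" + "/".join(parts) + "?" + query


def parse_result_set(payload):
    """Return the rows of a JSON listing (assumed ResultSet/Result layout)."""
    return json.loads(payload)["ResultSet"]["Result"]


def list_rows(url, user, password):
    """Fetch a listing over HTTP basic authentication and return its rows."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = Request(url, headers={"Authorization": "Basic " + token})
    with urlopen(request) as response:
        return parse_result_set(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder server and project: replace with a real instance before
    # passing the URL to list_rows().
    print(xnat_url("https://xnat.example.org", project="demo"))
```

With a reachable server and valid credentials, `list_rows(xnat_url(server, project="demo"), user, password)` would return one dictionary per subject. A dedicated wrapper hides this URL handling behind Python objects, which is what makes it convenient inside an automated pipeline.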

Part of the tutorial will demonstrate how the OpenMOLE workflow management system can be used to implement XNAT pipelines. OpenMOLE is a scientific workflow management system with a strong emphasis on workload distribution. The OpenMOLE platform combines three elements: 1) reusable, cutting-edge methods and exploratory algorithms, 2) a high-level workflow formalism in which they are expressed, and 3) distributed computing (clusters, grids, clouds) to scale up to the needs of real-world scientific experiments.

Objectives and relevance to MICCAI

Medical research is increasingly dependent on very large data sets. These data are shared at least among the partners of a project, and more widely when Open Data policies apply. The MICCAI community will benefit from a tutorial demonstrating the management of medical images and projects using one of the most widely adopted platforms: XNAT.

Acquired data are then usually exploited in two ways: through targeted queries on a particular subject or a global overview of the dataset, and through automated queries issued as part of a processing pipeline. XNAT addresses both of these use cases through its web interface and its programming APIs and wrappers.

The size of today's datasets makes it impractical to study them on a single machine because of memory and processing time constraints. A pipeline engine such as OpenMOLE shields researchers from the complexity of distributed computing environments, and enables them to reuse the tools from their prototypes at a large scale. The workflow formalism introduced by OpenMOLE is an efficient way to bring reproducibility to experiments, as workflows can be shared between teams and re-executed on different computing environments.

By the end of the tutorial, attendees will have the basic skills to navigate the XNAT web interface, to query the storage platform from their Python scripts, and to delegate the workload of their custom processing pipelines to a wide range of distributed computing environments using OpenMOLE.

Brain connectivity network analysis has the potential to improve understanding of neural processes and neurological diseases. Large-scale imaging projects such as the (developing) Human Connectome are collecting vast imaging databases of brain connectivity data for young adults, neonates and fetuses. Using these studies to build a common brain connectome within a population would allow us to identify abnormal connectivity patterns and link them with environmental, cognitive or genetic knowledge.

In the era of Big Data and High Performance Computing, large scientific datasets made available to the public must be accompanied by original and efficient methods to process them, store them reliably and make them available to as many scientists as possible. This requires the development of tools and methods that not only allow sensible analysis at the single-subject level, but also allow robust group-wise analysis.
