"False"
Published: 2010-08-09

Umeå researchers outline “best practice” in analysis of gene expression data

NEWS Gene expression profiling is among the most commonly used analytical tools in biomedical research and is applied to predict preclinical and clinical endpoints, e.g. diagnosis of disease, risk assessment and response to treatment. However, the reliability of these predictions has not yet been established.

Johan Trygg and Max Bylesjö, researchers at Umeå University, have participated in a large international project (MAQC-II) that aimed to examine and establish “best practice” protocols in data analysis for predicting clinical endpoints from gene expression data. The project was coordinated by the United States Food and Drug Administration (FDA) and is part of its recently launched “Critical Path Initiative” for medical product development. The Umeå University researchers contributed their expertise in chemometrics, a field of multivariate data analysis. The results have been published in the latest issue of the journal Nature Biotechnology.

Gene expression data can be used for diagnosis, early detection (screening) and prediction of response to treatment. However, how reliably a clinical endpoint can be predicted profoundly influences the value of the results. In this project, gene expression profiles from more than 3,100 samples, covering 13 different endpoints, including breast and lung cancer, were analyzed by 36 independent analysis teams, which together generated more than 30,000 prediction models for these 13 endpoints. This provides a unique resource for regulatory agencies and scientists.

“Even though the primary goal was not to evaluate individual contributions, I was very happy to see that our OPLS prediction models did so well, and ranked highest for one of the 13 endpoints,” says Johan Trygg, associate professor, Computational Life Science Cluster (CLiC) at Umeå University, coordinator of the Swedish effort.

A large effort was put into structuring and reviewing the data analysis protocol, generating 36 candidate models, and statistical validation, including blinded validation sets. Three observations were particularly highlighted: (1) the performance of the prediction models depends largely on the quality and relevance of the data; (2) the experience and proficiency of the data analysis team are crucial factors for success; and (3) different prediction methods yield similar prediction results.

Understanding the limitations of using gene expression data to predict clinical endpoints is critical to formulating general guidelines and procedures for safe and effective use, e.g. in the development of diagnostic tests. The “best practice” guidelines produced by this unprecedented collaboration provide a solid foundation for applying other types of high-dimensional biological data, such as protein and metabolite data, to personalized medicine.

Reference: Leming Shi et al., The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models, Nature Biotechnology, Aug 09 2010. doi:10.1038/nbt.1665
Published online 30 July 2010

The MicroArray Quality Control (MAQC) consortium involves 36 different analysis teams drawn from almost 100 regulatory agencies, academic institutions and industry partners. Umeå University/CLiC is represented by Associate Professor Johan Trygg, CLiC, Department of Chemistry, and Dr. Max Bylesjö, currently with Almac Diagnostics Ltd, UK. They collaborated with Dr. Andreas Scherer, Spheromics, Finland.

The Computational Life Science Cluster (CLiC) is a joint effort to stimulate, organize and advance bioinformatics and computational life sciences at Umeå University. More than 30 researchers are represented, with expertise in a wide range of areas, including the design and analysis of metabolomics, gene expression and proteomics studies, array- and deep sequencing-based technologies, protein sequence/structure analysis, chemometrics and systems biology.

For more information, please contact: Johan Trygg, Associate Professor, Computational Life Science Cluster (CLiC), Department of Chemistry, Umeå University
Phone: +46 90 786 69 17 or mobile +46 730 647 137
Email: johan.trygg@chem.umu.se

Max Bylesjö, Ph.D., Almac Diagnostics Ltd, Craigavon, UK
Phone: +44 28 3839 7575