Clinical biomarkers of response to neoadjuvant endocrine therapy in breast cancer: exploring the potential of gene expression data integration
Turnbull, Arran Kristian
Introduction Aromatase inhibitors (AIs) have an established role in the treatment of estrogen receptor alpha-positive (ER+) postmenopausal breast cancer. However, response rates are only 50-70% in the neoadjuvant setting and lower in advanced disease. There is a need to identify pre-treatment or early on-treatment biomarkers of sensitivity that outperform those currently used, as a step towards stratified treatment and improved patient care. Given the heterogeneity known to exist in the breast cancer population, and the limited availability of matched pre- and on-treatment clinical material, this study also sought to develop novel data integration approaches that allow similar previously published datasets to be included, thus maximising the statistical power of the study.
Experimental Design Pre-treatment and on-treatment (at 14 days and 3 months) biopsies were obtained from 34 postmenopausal women with ER+ breast cancer receiving 3 months of neoadjuvant letrozole. Illumina BeadArray gene expression data from these samples were combined with Affymetrix GeneChip data from a similar published study (n=55), and cross-platform integration approaches were evaluated. Dynamic clinical response was assessed for each patient from periodic 3D ultrasound measurements during treatment.
Results Despite intrinsic differences between microarray technologies, suitably similar studies can be directly integrated for robust and meaningful meta-analysis with improved statistical power. After mapping probe sequences to Ensembl genes, ComBat and cross-platform normalisation (XPN) significantly outperformed mean-centering and distance-weighted discrimination (DWD) in minimising inter-platform variance. In particular, DWD, a popular method used in a number of previous studies, removed systematic bias at the expense of genuine biological variability, potentially removing legitimate biological differences from integrated datasets.
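As an illustration of the simplest integration method compared above, per-gene mean-centering within each platform can be sketched as follows. This is a minimal sketch of the baseline approach only, not the ComBat or XPN procedures and not the pipeline developed in this study; the function name `mean_center_per_batch`, the platform labels, and the toy expression values are all hypothetical and randomly generated for illustration.

```python
import numpy as np


def mean_center_per_batch(expr, batches):
    """Subtract each gene's within-batch mean (hypothetical helper).

    expr    : array of shape (genes, samples), e.g. log2 expression values
    batches : array of length samples giving each sample's platform label
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    centered = np.empty_like(expr)
    for batch in np.unique(batches):
        cols = batches == batch
        # Per-gene mean over the samples belonging to this platform only
        gene_means = expr[:, cols].mean(axis=1, keepdims=True)
        centered[:, cols] = expr[:, cols] - gene_means
    return centered


# Synthetic toy data (assumption): two platforms with different baselines
rng = np.random.default_rng(0)
n_genes = 5
illumina = rng.normal(loc=8.0, scale=1.0, size=(n_genes, 4))
affymetrix = rng.normal(loc=10.0, scale=1.0, size=(n_genes, 6))

expr = np.hstack([illumina, affymetrix])
batches = np.array(["illumina"] * 4 + ["affymetrix"] * 6)

centered = mean_center_per_batch(expr, batches)
```

After centering, every gene has a mean of zero within each platform, so the platform-level offset is removed. Any genuine difference in average expression between the two cohorts is removed along with it, which is one reason more elaborate methods are evaluated against this baseline.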
A pipeline for the successful integration of microarray datasets from different platforms was developed. Using this approach, a classifier of clinical response to neoadjuvant endocrine therapy based on the expression of 4 genes was developed, which predicted response with 96% and 91% accuracy in the training (n=73) and independent validation (n=44) datasets, respectively. Adding an early on-treatment biopsy improved predictive power over the pre-treatment biopsy alone.
Conclusions Using a novel data integration approach developed as part of this study, a model comprising 4 novel biomarkers for accurate and robust prediction of clinical response to AIs by two weeks of treatment has been generated and validated. Ongoing work will investigate its applicability to other anti-estrogens and to the adjuvant setting, and will assess its potential as a new therapy response test.