Development of a statistical method for the identification of gene-environment interactions
Abstract
In order to understand common, complex disease it is necessary to consider not just
genetic risks and environmental risks, but also the interplay between them. This thesis
aims to develop methodology specifically for the detection of gene-environment
interactions, both by examining the strengths and weaknesses of traditional approaches
and through the development and testing of a novel statistical method. Developments
in genotyping technology enable researchers to collect data on large numbers of
polymorphisms in human genes, yet very few statistical methods are able to handle
the volume, variation and complexity of these data, especially in combination with
environmental risk factors. Interactions between genes and the environment are often
subject to the curse of dimensionality, with each new variable increasing the potential
number of interactions exponentially, leading to low power and a high false positive
rate.
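The growth described above can be made concrete with a small count. The following is a minimal sketch and is not taken from the thesis: the function name count_interactions and its parameters are hypothetical, and it simply counts how many distinct sets of two or more variables could interact among a given number of variables.

```python
from math import comb
from typing import Optional


def count_interactions(n_variables: int, max_order: Optional[int] = None) -> int:
    """Count the candidate interaction terms among n_variables.

    An interaction of order k is any subset of k distinct variables (k >= 2).
    If max_order is None, interactions of every order up to n_variables
    are counted, which equals 2**n - n - 1.
    """
    if max_order is None:
        max_order = n_variables
    return sum(comb(n_variables, k) for k in range(2, max_order + 1))


if __name__ == "__main__":
    # Illustrative variable counts only; these are not figures from the thesis.
    for p in (5, 10, 20):
        pairwise = count_interactions(p, max_order=2)
        all_orders = count_interactions(p)
        print(f"{p} variables: {pairwise} pairwise, {all_orders} of any order")
```

Under these assumptions, the pairwise count grows quadratically while the count over all orders roughly doubles with each added variable, which is why the number of tests, and with it the multiple-testing burden, rises so quickly.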
The Mixed Tree Method (MTM) exploits the differences between environmental and
genetic variables by selecting the most appropriate features from conventional
methods (including recursive partitioning, random forests and logistic regression) and
combining them with new comparison algorithms which rank the genetic variables by
the likelihood that they interact with the environmental variable under study.
Results show that the MTM is as effective as the most successful current method at
identifying interactions, while maintaining a much lower false positive rate and
computational burden. As the number of single nucleotide polymorphisms (SNPs) in the
dataset increases, the advantage of the MTM over other methods grows, while the
comparator approaches exhibit computational problems and rapidly increasing
processing times. The MTM is
also applied to a colorectal cancer dataset to show its use in a practical setting. The
results together suggest that MTM could be a useful strategy for identifying gene-environment
interactions in future studies into complex disease.