Development of a statistical method for the identification of gene-environment interactions
dc.contributor.advisor
Anderson, Niall
en
dc.contributor.advisor
Campbell, Harry
en
dc.contributor.advisor
Wild, Sarah
en
dc.contributor.author
Golding, Pauline Lindsay
en
dc.contributor.sponsor
Chief Scientist Office (CSO)
en
dc.date.accessioned
2012-11-15T14:18:18Z
dc.date.available
2012-11-15T14:18:18Z
dc.date.issued
2012-06-30
dc.description.abstract
In order to understand common, complex disease it is necessary to consider not just
genetic risks and environmental risks, but also the interplay between them. This thesis
aims to develop methodology specifically for the detection of gene-environment
interactions, both by examining the strengths and weaknesses of traditional approaches
and through the development and testing of a novel statistical method. Developments
in genotyping technology enable researchers to collect large volumes of
polymorphisms in human genes, yet very few statistical methods are able to handle
the volume, variation and complexity of this data, especially in combination with
environmental risk factors. Interactions between genes and the environment are often
subject to the curse of dimensionality, with each new variable increasing the potential
number of interactions exponentially, leading to low power and a high false positive
rate.
The Mixed Tree Method (MTM) exploits the differences between environmental and
genetic variables by selecting the most appropriate features from conventional
methods (including recursive partitioning, random forests and logistic regression) and
combining them with new comparison algorithms that rank the genetic variables by
the likelihood that they interact with the environmental variable under study.
Results show the MTM to be as effective as the most successful current method for
identifying interactions, while maintaining a much lower false positive rate and
computational burden. As the number of SNPs in the dataset increases, the advantage of
the MTM over other methods grows, while the comparator approaches
exhibit computational problems and rapidly increasing processing times. The MTM is
also applied to a colorectal cancer dataset to show its use in a practical setting. The
results together suggest that the MTM could be a useful strategy for identifying
gene-environment interactions in future studies of complex disease.
en
dc.identifier.uri
http://hdl.handle.net/1842/6520
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.subject
statistics
en
dc.subject
genetics
en
dc.title
Development of a statistical method for the identification of gene-environment interactions
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en