On the genetics of intermediate phenotypes and their utility
Abstract
The vast majority of common disease-associated genetic variation is non-coding. However,
the route by which non-coding genetic variation influences disease susceptibility is largely
unknown. The dissection of the genetic control of variation in intermediate phenotypes,
such as protein abundance or DNA methylation status, represents an important method to
interrogate the pathway between genotype and phenotype.
Using array-based technologies, I assessed the genetic associations of 573,027 CpG sites in
5,101 individuals, and 249 plasma proteins in two cohorts, one of 909 and the other of 998
individuals. In addition, using mass-spectrometry, I assessed the genetic association for
4,433 proteins in the peripheral blood mononuclear cells of 251 individuals. These analyses
generated a wealth of genetic associations that were further exploited in a number of ways,
including Mendelian Randomisation, co-localisation with expression (RNA) quantitative trait
loci (eQTLs), and enrichment analyses.
Using the genetic associations of the 249 plasma proteins, I performed
proteome-by-phenome Mendelian Randomisation and identified 509 putative causal links
between proteins and diseases or other traits, including links to cardiovascular
disease and schizophrenia. However, total plasma protein abundance derives from multiple
sources and is unlikely to be representative of any single cell-type or tissue. Therefore, in an
exploratory analysis, I demonstrated the feasibility of studying the cellular proteome of
peripheral blood mononuclear cells using mass-spectrometry. Mass-spectrometry
proteomics provides a depth of coverage of the proteome not currently possible with other
technologies, and enables complementary future analyses, for example, of protein
post-translational modifications.
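As a minimal illustration of the idea behind Mendelian Randomisation with a single genetic instrument, the sketch below computes a Wald ratio: the variant's effect on the outcome divided by its effect on the exposure (here, a protein). The abstract does not state which estimator was used, so the Wald ratio, the first-order standard error, and all names and input values here are assumptions chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class WaldEstimate:
    beta: float  # estimated causal effect of exposure on outcome
    se: float    # first-order (delta-method) standard error
    z: float     # test statistic
    p: float     # two-sided p-value under a normal approximation


def wald_ratio(beta_exposure: float, beta_outcome: float, se_outcome: float) -> WaldEstimate:
    """Single-instrument Wald ratio (hypothetical helper, not the thesis's code).

    beta_exposure: variant effect on the exposure (e.g. a pQTL effect size)
    beta_outcome:  effect of the same variant on the outcome trait
    se_outcome:    standard error of beta_outcome
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    beta = beta_outcome / beta_exposure
    # First-order approximation: ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return WaldEstimate(beta, se, z, p)


# Illustrative numbers only, not results from the thesis.
estimate = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(estimate)
```

The estimate is only interpretable as causal if the variant affects the outcome solely through the exposure, which is why the links reported above are described as putative.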
I identified potential molecular intermediates mediating inter-chromosomal methylation
quantitative trait loci (meQTLs) by assessing their co-localisation with locally-acting eQTLs. I
found strong enrichment for genes encoding C2H2-ZF transcription factors, especially those
containing a Krüppel associated box (KRAB) domain. In addition, I identified DNA
methylation affected by dominance inter-chromosomal meQTL in the binding sites of many
transcription factors and associated proteins.
Collectively, these analyses represent an assessment of the genetic control of plasma
proteins and DNA methylation, with projection onto disease and other human traits. In
addition, I lay the foundation for a much larger population-scale analysis of the cellular
proteome of peripheral blood mononuclear cells to unprecedented depth, for which data
acquisition is ongoing.