Investigating the genetic control of complex traits
Oppong, Richard Fordjour
One aspect of the effort to disentangle the genetic component of complex traits is the mapping of genetic loci associated with variation in those traits. The model used to assess these associations is vital, because inaccurate or biased estimates undermine that effort. This project therefore examines in detail the performance of existing analytical methods and develops novel statistical methods to show how violations of model assumptions affect model estimates. Using a linear mixed-effects model, a genome-wide association analysis of eight urine phenotypes was performed on 2,934 individuals; it showed that violating the assumption of normally distributed residuals affects heritability estimates and the subsequent genetic associations. The issue of normality was explored further in a simulation study using a Bayesian mixture model in which the SNP effects on complex traits are assumed to be drawn from a mixture of four normal distributions instead of one. Applied to the urine phenotypes, the Bayesian mixture model showed that the SNP effects come from more than one normal distribution and that over 99% of genotyped SNPs were estimated to have no effect on the urine traits. The urine traits were also shown to have a polygenic architecture, with much of the additive genetic variance explained by SNPs with small to moderate effects. Departing from SNP-based approaches, I present a novel analytical approach that uses a relationship matrix based on natural haplotype blocks defined by recombination boundaries in the genome.
This method was developed on the premise that, in the presence of rare variants, haplotypes capture the true genomic relationships among individuals better than SNPs do, and thus offer real benefit in recovering much of the hidden heritability of complex traits and in identifying novel gene variants. The method was applied to simulated data and explored in detail. The simulation results showed that the haplotype approach complements existing GWAS analytical approaches by capturing genomic regions that contribute to phenotypic variation but that existing GWAS methods fail to detect. The real benefits of this approach were further demonstrated by applying it to data from circa 20,000 individuals in the Generation Scotland: Scottish Family Health Study. Height and major depressive disorder were analysed, and novel genomic regions were identified for both traits. In conclusion, this thesis shows that inappropriate use of analytical models can distort results, which may in turn affect the conclusions drawn from genetic association studies. It also shows the benefits of implementing models that capture important features of the underlying architecture driving variation in complex traits. Lastly, it demonstrates that haplotype methods can complement conventional SNP-based methods in the effort to understand the genetic control of complex traits.