Application of gene-set analysis to identify the molecular genetic correlates of human cognitive abilities
Hill, William David
Individual differences across seemingly disparate cognitive tests are not independent; they share a general factor of cognitive ability. This general factor allows around half of the variation in a diverse battery of cognitive tests to be explained in terms of individual differences along a single dimension. An individual's position on this dimension, as ascertained using standardised tests of cognitive ability (intelligence quotient (IQ) tests), has been shown to be predictive of important life outcomes ranging from educational and occupational success to good health and longevity. Genetic differences have been shown to be associated with differences in cognitive ability, and recent molecular genetic research has demonstrated that variants in linkage disequilibrium with common single nucleotide polymorphisms (SNPs) can explain around 50% of the variation in general cognitive ability. The goal of this thesis was to build on these findings by applying gene-set analysis methods to genome-wide association data sets in order to test guided hypotheses regarding the mechanisms and genetic architecture of human cognitive differences. Gene-set analysis is a method that can increase statistical power and help derive functional meaning from the results of genome-wide association studies (GWAS). Existing GWAS data sets provided by the Cognitive Ageing Genetics in England and Scotland (CAGES) consortium, the Brisbane Adolescent Twin Study (BATS) and the Norwegian Cognitive NeuroGenetics (NCNG) cohort were used. The individuals in each of these groups had also completed a battery of cognitive tests, enabling the extraction of a general factor of fluid cognitive ability and a measure of crystallised ability. In Chapter 3, the role of synaptic plasticity was examined using data derived from proteomic experiments on human and animal brain tissue, which detail the molecular constituents of the postsynaptic density and the associated components of the glutamatergic synapse.
These components include the α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor complex (AMPA-RC), the N-methyl-D-aspartate receptor complex (NMDA-RC), and the metabotropic glutamate 5 receptor complex (mGlu5-RC). Using a competitive test of enrichment, it was shown that the genes responsible for the proteins of the NMDA-RC were associated with fluid cognitive ability. This study (published as Hill et al., 2014) indicates that individual differences in synaptic plasticity may underlie some of the differences in fluid cognitive ability. In Chapter 4, rather than using traditionally defined linear pathways, the focus was on a gene set created by grouping genes according to their cellular function. Linear pathways, such as the glutamatergic system, share proteins, a property which can be exploited by utilising horizontal pathway analysis, also termed functional gene group analysis. In a functional gene group analysis, genes are grouped according to their cellular function, such as ligand-gated ion channels, neurotransmitter metabolism, and G protein relays. This chapter (published as Hill et al., 2014) examined the role that heterotrimeric G proteins play in cognitive abilities, as previous work has indicated a role for them in individual differences in human cognitive ability. The analyses carried out in this chapter indicate that whilst heterotrimeric G proteins may be required to engage in cognitive tasks, genetic variation in the genes that code for these proteins is not associated with normal variation in cognitive ability. Chapter 5 examined the role of functional SNPs, defined as those that have been shown to be able to alter protein expression. Previous research has shown an association between genotype and methylation status and between genotype and gene expression in human cortical tissue.
Using the results of previous research, gene sets were assembled which detailed SNPs known to alter methylation status and gene expression in the frontal cortex, the temporal cortex, the pons, and the cerebellum. In addition, the bioinformatics database dbQSNP was mined to assemble a SNP set detailing SNPs in known promoter regions. Finally, a gene set was made using published literature to capture SNPs affecting microRNA. Two complementary statistical methods were used to examine these sets for an association with general cognitive ability. The results of these analyses indicate that these gene sets are no more strongly associated with cognitive ability than would be expected by chance. Chapter 6 exploits the current knowledge of the molecular genetics of non-syndromic autosomal recessive intellectual disability (NS-ARID). The 40 genes associated with NS-ARID have a large deleterious effect on cognitive ability and appear to do so without the cognitive deficit being the product of obvious pathology. These 40 NS-ARID genes were examined as a gene set for an enriched association with cognitive abilities. Additionally, the biological systems that these genes are involved in were identified using an automated literature mining tool. These systems were then examined for an enriched association with general cognitive ability. When examining the 40 NS-ARID genes as a set, there was no evidence that they were associated with cognitive abilities. The results of the literature search provided 180 additional gene sets based on the relationships between the 40 NS-ARID genes. These gene sets were examined for an enriched association with cognitive ability; the sodium ion transporter gene set (GO:0006814) was shown to be significantly enriched for fluid ability in the CAGES data set, but not in the BATS data set.
This could indicate that whilst the same genes are not involved in both intellectual disabilities and in cognitive abilities, the genes that can contain mutations resulting in intellectual disabilities are found in pathways that govern the normal range of cognitive ability. The results of this thesis indicate that common SNPs which tag causal variants are not randomly distributed across the genome but are clustered in genes that work together as part of a larger mechanism. In addition, this work provides working examples of how multiple data sources can be utilised to construct gene sets designed to explore the known relationship between genotype and cognitive ability, and of how GWAS data sets can be used to prioritise groups of genes.
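
As an illustration of the general idea behind a competitive test of enrichment, the following is a minimal sketch and not the procedure used in the thesis. It assumes each gene already has a single association score (for example, a -log10 gene-based p-value) and asks whether the genes in a set score higher on average than randomly drawn sets of the same size. The function name, its parameters, and the toy gene names are hypothetical; the sketch also ignores gene size and linkage disequilibrium, which dedicated gene-set tools correct for.

```python
import random
from statistics import mean


def competitive_enrichment_p(gene_scores, gene_set, n_perm=2000, seed=0):
    """One-sided permutation p-value for a competitive gene-set test.

    gene_scores: dict mapping gene name -> association score
                 (assumed here: higher means stronger association).
    gene_set:    iterable of gene names forming the set of interest.
    Returns the proportion of random same-size gene sets whose mean score
    is at least as high as the observed set mean (with a +1 correction).
    """
    rng = random.Random(seed)
    # Keep only set members that actually have a score.
    members = sorted(set(gene_set) & set(gene_scores))
    if not members:
        raise ValueError("no gene in the set has a score")
    observed = mean(gene_scores[g] for g in members)

    background = sorted(gene_scores)  # all scored genes, in a fixed order
    k = len(members)
    hits = 0
    for _ in range(n_perm):
        # Draw a random gene set of the same size from the background.
        sample = rng.sample(background, k)
        if mean(gene_scores[g] for g in sample) >= observed:
            hits += 1
    # The +1 correction prevents a reported p-value of exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): 100 genes whose score equals their index.
toy_scores = {f"gene{i}": i for i in range(100)}
top_set = [f"gene{i}" for i in range(95, 100)]
bottom_set = [f"gene{i}" for i in range(5)]

p_top = competitive_enrichment_p(toy_scores, top_set)
p_bottom = competitive_enrichment_p(toy_scores, bottom_set)
```

In this toy example the top-scoring set yields a small p-value and the bottom-scoring set does not. The test is "competitive" because the set is compared against other genes in the genome rather than against a null hypothesis of no association at all.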