Application of gene-set analysis to identify the molecular genetic correlates of human cognitive abilities
Date
30/06/2015
Author
Hill, William David
Abstract
Individual differences across seemingly disparate cognitive tests are not independent: scores
on such tests correlate positively, and this shared variance is captured by a general factor
of cognitive ability. This general factor allows around half of the variation in a diverse
battery of cognitive tests to be explained in terms of individual differences along a single
dimension.
An individual's position on this dimension, as ascertained using standardised tests of cognitive
ability (intelligence quotient (IQ) tests), has been shown to be predictive of important life
events ranging from educational and occupational success, to enjoying good health and
longevity. Genetic differences have been shown to be associated with differences in cognitive
ability and recent molecular genetic research has demonstrated that variants in linkage
disequilibrium with common single nucleotide polymorphisms (SNPs) can explain around
50% of the variation in general cognitive ability.
The goal of this thesis was to build on these findings by applying gene-set analysis
methods to examine genome-wide association data sets to test guided hypotheses regarding
the mechanisms and genetic architecture of human cognitive differences. Gene-set analysis
is a method that can lead to an increase in statistical power and help derive functional
meaning from the results of genome-wide association studies (GWAS). Existing GWAS
data sets provided by the Cognitive Ageing Genetics in England and Scotland (CAGES)
consortium, the Brisbane Adolescent Twin Study (BATS) and the Norwegian Cognitive
NeuroGenetics (NCNG) cohort were used. The individuals in each of these groups have
also completed a battery of cognitive tests enabling the extraction of a general factor of
fluid cognitive ability and a measure of crystallised ability.
In Chapter 3, the role of synaptic plasticity was examined using data derived from
proteomic experiments on human and animal brain tissue that detail the molecular
constituents of the postsynaptic density and the associated components of the glutamatergic
synapse. These components include: the α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic
acid receptor complex (AMPA-RC), the N-methyl-D-aspartate receptor complex (NMDA-RC),
and the metabotropic glutamate 5 receptor complex (mGlu5-RC). Using a competitive
test of enrichment it was shown that the genes responsible for the proteins of the NMDA-RC
were associated with fluid cognitive ability. This study (published as Hill et al., 2014)
indicates that individual differences in synaptic plasticity may underlie some of the
differences in fluid cognitive ability.
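The idea behind a competitive test of enrichment can be illustrated with a minimal Python
sketch. It is not the procedure used in the thesis: it assumes each gene already has a single
association score (for example, -log10 of a gene-based p-value), and it ignores gene size and
linkage disequilibrium, which real analyses must account for. The function name and its
parameters are hypothetical. The test asks whether the genes in a set score higher, on average,
than randomly drawn sets of the same size taken from all scored genes.

```python
import random


def competitive_enrichment_p(gene_scores, gene_set, n_perm=10000, seed=0):
    """Permutation p-value for a simple competitive gene-set test.

    gene_scores: dict mapping gene name -> association score (higher = stronger).
    gene_set:    collection of gene names forming the set of interest.
    Illustrative only: no correction for gene size or linkage disequilibrium.
    """
    rng = random.Random(seed)
    genes = list(gene_scores)
    # Only set members that actually have a score can be tested.
    members = [g for g in genes if g in gene_set]
    if not members:
        raise ValueError("no gene-set members have a score")
    k = len(members)
    observed = sum(gene_scores[g] for g in members) / k

    # Null distribution: mean score of random gene sets of the same size.
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)
        if sum(gene_scores[g] for g in sample) / k >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value from this sketch would suggest that the set is more strongly associated with the
trait than typical gene sets of the same size, which is the competitive null hypothesis; it says
nothing about whether any individual gene in the set is associated.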
In Chapter 4, rather than using traditionally defined linear pathways, the focus was
on a gene set created by grouping genes according to their cellular function. Linear pathways,
such as the glutamatergic system, share proteins, a property which can be exploited by
utilising horizontal pathway analysis, also termed functional gene group analysis. In a
functional gene group analysis, genes are grouped according to their cellular function, such as
ligand gated ion channels, neurotransmitter metabolism, and G protein relays. This chapter
(published as Hill et al., 2014) examined the role that heterotrimeric G proteins play in
cognitive abilities as previous work has indicated a role for them in individual differences in
human cognitive ability. The analyses carried out in this chapter indicate that whilst
heterotrimeric G proteins may be required to engage in cognitive tasks, genetic variation in
the genes that code for these proteins is not associated with normal variation in cognitive
ability.
Chapter 5 examined the role of functional SNPs, defined as those that have been
shown to be able to alter protein expression. Previous research has shown an association
between genotype and methylation status and between genotype and gene expression in
human cortical tissue. Using the results of previous research, gene sets were assembled which
detailed SNPs known to alter methylation status and gene expression in the frontal cortex, the
temporal cortex, the pons, and the cerebellum. In addition, the bioinformatics database
dbQSNP was mined to assemble a SNP set detailing SNPs in known promoter regions.
Finally, a gene set was made using published literature to capture SNPs affecting microRNA.
Two complementary statistical methods were used to examine these sets for an association
with general cognitive ability. The results of these analyses indicate that these gene sets are
no more strongly associated with cognitive ability than would be expected by chance.
Chapter 6 exploits the current knowledge of the molecular genetics of non-syndromic
autosomal recessive intellectual disability (NS-ARID). The 40 genes associated with NS-ARID
have a large deleterious effect on cognitive ability and appear to do so without the
cognitive deficit being the product of obvious pathology. These 40 NS-ARID genes were
examined as a gene set for an enriched association with cognitive abilities. Additionally, the
biological systems that these genes are involved in were examined using an automated
literature mining tool. These systems were then examined for an enriched association with
general cognitive ability. When examining the 40 NS-ARID genes as a set there was no
evidence that they were associated with cognitive abilities. The results of the literature search
provided 180 additional gene sets based on the relationship between the 40 NS-ARID genes.
These gene sets were examined for an enriched association with cognitive ability, and the
sodium ion transporter gene set (GO:0006814) was shown to be significantly enriched in the
CAGES data set, but not in the BATS data set, for fluid ability. This could indicate that whilst the
same genes are not involved in both intellectual disabilities and in cognitive abilities, the
genes that can contain mutations resulting in intellectual disabilities are found in pathways
that govern the normal range of cognitive ability.
The results of this thesis indicate that common SNPs which tag causal variants are
not randomly distributed across the genome but are clustered in genes that work together as
part of a larger mechanism. In addition, this work provides working examples of how
multiple data sources can be utilised to construct gene sets designed to explore the
known relationship between genotype and cognitive ability and to utilise GWAS data sets to
prioritise groups of genes.