Genetic Analyses of Age at Onset Traits
Abstract
The identification of factors underlying complex trait variation is a major goal in
the field of genetics. For normally distributed, fully observed trait data there are
many well established statistical methods for partitioning phenotypic variation
and for mapping quantitative trait loci (QTL). Survival or time-to-event traits
often follow non-normal distributions and frequently contain partially-known
(or censored) trait data. If standard statistical methods are used to analyse
age-at-onset data, bias can be introduced through a failure to account for the
non-normal distribution of the data and the presence of censoring. Complex
statistical methods have been developed to partition trait variation and map
QTL for age-at-onset or survival traits. In this thesis, the use of these survival
analysis methods is compared to more established statistical methods for the
analysis of age-at-onset data.
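The bias that arises when censoring is ignored can be illustrated with a small simulation. The sketch below is not taken from the thesis: the function name, the normal onset distribution, and all parameter values (mean, standard deviation, censoring age) are assumptions chosen only for illustration. It compares the true mean onset age with two naive estimates, one that treats each censoring age as if it were an onset age and one that simply discards censored individuals.

```python
import numpy as np


def simulate_censoring_bias(n=10_000, mean_onset=13.0, sd_onset=1.5,
                            censor_age=13.5, seed=0):
    """Compare naive mean onset-age estimates with the true mean.

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    # True (partly unobserved) onset ages.
    onset = rng.normal(mean_onset, sd_onset, size=n)
    # Individuals are followed only up to censor_age (right censoring).
    observed = np.minimum(onset, censor_age)
    event = onset <= censor_age  # True where onset was actually observed
    return {
        "true_mean": float(onset.mean()),
        # Naive estimate 1: treat censoring ages as onset ages.
        "naive_mean": float(observed.mean()),
        # Naive estimate 2: drop censored individuals altogether.
        "events_only_mean": float(onset[event].mean()),
        "prop_censored": float(1.0 - event.mean()),
    }


result = simulate_censoring_bias()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Both naive estimates fall below the true mean, because a censored individual's onset age is by definition later than the age at which observation stopped. Survival methods are designed to use the partial information that censored observations carry rather than discarding or misreading it.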
A brief introduction to the analysis of human variation and the issues associated
with the analysis of age-at-onset data is given. The methods currently used to
partition trait variation and map QTL for survival traits are discussed (Chapter
1). Age-specific penetrances can be used to model the age-at-onset of disease
in unaffected individuals. This parametric method is used to identify loci
underlying susceptibility to a novel co-morbid psychiatric phenotype (depression
and unexplained swelling). The method is compared to a non-parametric
variance component (VC) QTL mapping method that does not account for the
age at onset of the disease. Parametric linkage analysis identified two
suggestive loci, neither of which was supported by the standard variance component
analysis. VC analysis identified a suggestive linkage region on chromosome 14,
but the evidence for linkage decreased upon fine mapping (Chapter 2).
Many of the current methods used to analyse survival data in human genetics
are based on methods originally derived by animal geneticists. The analysis of
survival traits in some experimental populations is simplified by the presence
of fully inbred lines. However, for complex traits the methods are both
computationally intensive and not widely available. A grouped linear regression
method is proposed for the analysis of continuous survival data in fully inbred
lines. Using simulation, the method is compared to both the Cox and Weibull
proportional hazards models and a standard linear regression method that
ignores censoring. The grouped linear regression method is of equivalent power
to both the Cox and Weibull proportional hazards methods, is significantly
better than the standard linear regression method when censored observations
are present, and is computationally simple (Chapter 3).
A sample of 446 monozygotic (MZ) twins, 633 dizygotic (DZ) twins and 223
siblings was used to partition the inter-individual variance in age at menarche.
The analysis was carried out using both a standard method that does not
account for the censored nature of the data and a mixed-effects Cox model
that fits a frailty model to the random effects. The standard methodology
suggested that an additive genetic model best described the data. The
most parsimonious model when using the frailty method included additive
genetic and common environmental effects (ACE). The difference between
the two models was caused by the different ascertainment of the siblings. The
frailty model estimated the heritability of age at menarche to be 0.57 (Chapter 4).
In Chapter 5, a sample of 2,685 pseudo-independent sib-pairs is used in a genome-wide
linkage scan for QTL underlying variation in age at menarche. The sample
comprises the adolescent sample discussed in Chapter 4 and three adult
cohorts. The proportion of censoring in the sample is only 1.20%, so a standard QTL
mapping method is used. Two QTL of suggestive significance are identified on
chromosomes 11p and 3p. The candidate genes WT1 and FSHB are located
within the linkage peak on 11p. After the removal of bivariate outliers, a locus
on chromosome 12q is identified. No significant QTL are detected, which
suggests that age at menarche is influenced by multiple genes of small effect. The
thesis concludes with a general discussion (Chapter 6).