Exploiting large-scale exome sequence data to determine the genetic control of healthy aging

Supervisors: Pau Navarro, Sara Knott, Jim Wilson, Chris Haley

Project Description:

Genome-wide association studies (GWAS) have been effective at beginning to unravel the genetics of traits underlying healthy aging, such as bone strength, cognition, lung and kidney function and obesity, by locating common variants of small effect. However, there has been limited success in detecting rare variants affecting these and other phenotypes. Detection of rare variants of large effect would improve our ability to dissect causal pathways underlying these traits as well as to predict individual risk, potentially contributing to lengthening ‘healthspan’ (the proportion of lifespan during which we are in good health).

We previously developed regional heritability mapping approaches that have been effective at detecting clusters of rare and common variants that have escaped detection in standard GWAS. By taking advantage of the particular characteristics of rare variants and newly available sequence information in large numbers of individuals, both from local cohorts and the UK Biobank, this project will explore the effectiveness of regional heritability mapping in simulated data and apply this promising approach to real population high-density exome sequence data. Our enhanced approach takes advantage of the insight that rare variants are associated with the relatively large haplotype in which they first arose and are restricted to pedigrees descended in the relatively recent past from a common ancestor.

We will apply our methods first in regions that are known to harbour common variation for our traits of interest when these regions are available, to test the hypothesis that the same regions will also harbour rarer associated variants. We will continue by extending our survey to the whole exome, to uncover further novel associations.

We will follow up the results obtained from the regional analysis of real data by integrating our findings with publicly available resources including gene expression data as well as local resources including methylation and proteomic data, to better understand variation related to healthy aging. We will integrate these data to explore causality in Mendelian randomisation frameworks and, where appropriate, use multi-trait analyses to understand the correlation between phenotypes.

With 4000 participants, the local data we have access to are very rich both in terms of relevant phenotypes (including DEXA scans, metabolomics data and cognitive tests) and high-density sequence data. UK Biobank will release its sequence data in 2019, well within the lifespan of this project, and we will be able to follow up the research in our dataset in this much larger resource for a selection of overlapping relevant traits (for example DEXA scan phenotypes and cognition).

By integrating expertise in genetics and population informatics, the project will provide training and experience in computational genomics. The successful student will have access to large datasets in which multiple phenotypes have been recorded and participants have been genotyped with high-density SNP arrays and exome sequenced. Training will be provided where necessary in complex trait genetics and in the use of high-performance computational resources and of the computational languages and software packages used in the field. The skills developed will provide the foundation for a career in industry or research in the life sciences, pharmaceutical or bioindustries.

If you wish to apply for this project, please check this link and send your application to this email.
