Investigating the genetic architecture of complex traits in Soay sheep

Supervisors: Sara Knott, Josephine Pemberton, Pau Navarro

Project description:

Almost all traits of importance in medicine, evolution and agriculture are complex, influenced by the actions and interactions of many genes and environmental factors. One of the major challenges in biology is to understand the genetic control of these complex traits. Advances in genomic technologies, especially the development of high-density genotyping arrays, have made it possible to begin to dissect the genetic variation and characterise the genetic architecture of traits in many species, including non-model species.

This project aims to investigate the genetic architecture of morphometric, fitness and health-related traits in the Soay sheep (Ovis aries), using new analytical methods and data collected as part of a long-term study on the St. Kilda archipelago (Scotland). This study is one of the largest projects combining genomic data and high-quality phenotypes in a wild population. The data consist of over 6,500 individuals genotyped for around 39K polymorphic loci (SNPs) and, in addition, a subset of 188 individuals genotyped with a high-density SNP chip containing 450K SNPs that are polymorphic in Soays. By applying cutting-edge quantitative genomic methodologies to this population, we will be able to explore strategies that could be used in other populations as data become available.

The project will have three main stages. The first will involve imputing genotypes at all 450K SNPs for all 6,500 individuals, taking advantage of the well-resolved pedigree available for the population. Alternative phasing and imputation strategies, including long-range phasing algorithms, will be investigated and their performance assessed. Once complete genotype data have been imputed, the next stages will investigate different approaches to dissecting the genetic variation underpinning a range of traits. The second stage will work at the whole-genome level, partitioning the genetic variance into a component associated with common variants and another associated with variants not efficiently captured by the SNP array, including rare variants, while accounting for environmental confounders. The third stage will map the genetic variation by estimating the contribution from regions across the genome ('regional heritability') using, amongst other approaches, a new method based on haplotype relationships. Combining the results will give new insights into the genetic control of a range of traits in this wild population.
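
As a rough illustration of the second and third stages, whole-genome variance partitioning and regional heritability analyses typically rest on genomic relationship matrices built from SNP genotypes. The sketch below is a minimal pure-Python example, assuming genotypes coded as 0/1/2 allele counts and a commonly used allele-frequency-centred estimator. The function names and toy data are hypothetical and not taken from the project, and this is not the haplotype-based method mentioned above.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele at each SNP (genotypes coded 0/1/2)."""
    n_individuals = len(genotypes)
    n_snps = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2.0 * n_individuals)
        for j in range(n_snps)
    ]


def genomic_relationship_matrix(genotypes, snp_indices=None):
    """Relationship matrix from all SNPs, or from one region if snp_indices is given.

    Assumed estimator: G = Z Z' / sum(2 p (1 - p)), where Z holds genotypes
    centred by twice the allele frequency p.
    """
    freqs = allele_frequencies(genotypes)
    if snp_indices is None:
        snp_indices = range(len(freqs))
    # Monomorphic SNPs carry no information, so they are skipped.
    used = [j for j in snp_indices if 0.0 < freqs[j] < 1.0]
    scale = sum(2.0 * freqs[j] * (1.0 - freqs[j]) for j in used)
    if scale == 0.0:
        raise ValueError("no polymorphic SNPs in the requested set")
    centred = [[row[j] - 2.0 * freqs[j] for j in used] for row in genotypes]
    n = len(genotypes)
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Toy data: 4 individuals (rows) by 4 SNPs (columns); purely illustrative.
genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 0],
]

whole_genome = genomic_relationship_matrix(genotypes)
# A "region" here is simply the first two SNPs.
regional = genomic_relationship_matrix(genotypes, snp_indices=[0, 1])
```

In a real analysis, whole-genome and regional matrices of this kind would be fitted together in a mixed model using dedicated software to estimate the variance explained by each; that step is not shown here.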

This project will provide training and experience in key areas of genomics, combined with statistics and computation relevant to all species, as well as in field data collection. In particular, experience will be gained in state-of-the-art analytical methods and software being used in the analysis of human data. Training in genetics and genomics is available through our MSc programme in Quantitative Genetics and Genome Analysis. Additionally, training is offered in generic transferable and professional skills.

The project is relevant to students with a background in statistics or computational sciences, as well as those with training in quantitative or population genetics and related subjects. It would suit a student with strong mathematical or computational abilities and a keen interest in genetics.


References:

Bérénos C, Ellis PA, Pilkington JG, Lee SH, Gratten J and Pemberton JM (2015) Heterogeneity of genetic architecture of body size traits in a free-living population. Molecular Ecology, 24, 1810-1830.
Xia C, Amador C, Huffman J, Trochet H, Campbell A, Porteous D, Generation Scotland, Hastie N, Hayward C, Vitart V, Navarro P and Haley C (2016) Pedigree- and SNP-Associated Genetics and Recent Environment are the Major Contributors to Anthropometric and Cardiometabolic Trait Variation. PLoS Genetics, 12 (2), e1005804.
Nagamine Y, Pong-Wong R, Navarro P, Vitart V, Hayward C, Rudan I, Campbell H, Wilson J, Wild S, Hicks AA, Pramstaller PP, Hastie N, Wright AF and Haley CS (2012) Localising Loci underlying Complex Trait Variation Using Regional Genomic Relationship Mapping. PLoS One, 7 (10), e46501.