Statistical and integrative methods for the identification of causal variation in highly structured populations

Supervisors: Albert Tenesa, Andreas Kranis

Project Description:

Identifying the genetic variation that determines differences among individuals is important in fields as diverse as animal breeding, human genetics and evolutionary biology. However, identifying the causal nucleotides (the DNA bits that influence a trait) is challenging because each of them explains very little of the differences among individuals. Hence, scientists need to use very large datasets (sometimes known as big data) to identify them. Furthermore, to increase the chances of success (what scientists call statistical power), it is necessary to increase the signal-to-noise ratio by reducing the amount of noise, but these analyses are computationally very demanding. In some instances, they may take years to complete. However, we can use supercomputers to speed up the calculations.
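To give a feel for why small effects call for very large datasets, the short Python sketch below approximates the statistical power of a single-variant association test. It is only an illustration under simplifying assumptions that are not part of the project description: unrelated individuals (so no population structure), a one-degree-of-freedom test, and a significance threshold of 5e-8, which is a common convention in human genome-wide studies. The function name and the example values are hypothetical.

```python
from statistics import NormalDist


def power_single_variant(n, variance_explained, alpha=5e-8):
    """Approximate power to detect one variant in n unrelated individuals.

    variance_explained is the fraction of trait variance due to the variant
    (assumed value, chosen for illustration). alpha is the two-sided
    significance threshold (assumed; 5e-8 is a common genome-wide convention).
    """
    std_normal = NormalDist()
    # Critical value of the two-sided z-test at level alpha.
    z_crit = -std_normal.inv_cdf(alpha / 2)
    # Expected z-score: square root of the non-centrality parameter
    # n * q^2 / (1 - q^2), where q^2 is the variance explained.
    expected_z = (n * variance_explained / (1 - variance_explained)) ** 0.5
    # Probability that the observed z-score falls beyond either critical value.
    return std_normal.cdf(expected_z - z_crit) + std_normal.cdf(-expected_z - z_crit)


if __name__ == "__main__":
    # A variant explaining 0.01% of the variance, at increasing sample sizes.
    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(f"n = {n:>9,}  power = {power_single_variant(n, 0.0001):.4f}")
```

Under these assumptions, power rises with both the sample size and the fraction of variance explained, which is why reducing noise and increasing the number of individuals both help. Real analyses in highly structured populations must also account for relatedness, which this sketch ignores.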

During this project we will therefore need to:

(1) develop new methodology and tools that can exploit the power of supercomputers [Nature Communications volume 6, Article number: 10162 (2015)]. This requires advanced programming techniques that were developed for other fields, such as physics, and that we are now applying in biology.

(2) apply these tools to economically important traits (including growth, reproduction and welfare) measured on hundreds of thousands of birds from Aviagen Ltd. [doi: https://doi.org/10.1101/176834]

(3) use integrative approaches to pinpoint causal variation by identifying the putative target tissue for relevant traits and potential off-target tissues that may be affected by correlated responses to selection [https://doi.org/10.1534/genetics.115.185967].

Our tools and approaches will also be very useful for identifying genetic variants in humans, companion animals and wild animals, for understanding how to reduce the disease burden in animal populations, and for understanding the genes that underpin traits under natural selection in wild animals.

Training outcomes

The proposed project aims to exploit new, data-driven ways of working to deliver impactful research in collaboration with an industrial partner with a global reach.

The current explosion of big and diverse data in biology requires a new breed of researchers with excellent data-handling and digital skills. This project aims to equip the student with the much sought-after quantitative and data science skills that future research leaders in the biosciences will need, including advanced statistical analysis, data management, programming, bioinformatics and streamlining complex computational approaches on large genomic data.

If you wish to apply for this project, please check this link and send your application to this email.

Other:

2019

Projects:

Edinburgh

Agriculture and Food Security