Research

We use computational approaches and large genomic datasets to uncover novel genetic variants that cause rare disease and to understand the mechanisms through which they do so.

Clinical genetic testing has become commonplace for rare disease patients. Identifying the genetic cause of disease is of huge benefit to both patients and their families: it allows us to screen additional family members to identify those also at risk, return an accurate diagnosis to the patient, and dictate personalised treatment approaches. With current approaches, however, we only find a genetic diagnosis in around half of all rare disease patients. These approaches almost exclusively focus on the regions of the genome that code directly for proteins.

We believe that a subset of the undiagnosed patients will have genetic variants in regions outside this protein-coding sequence that have crucial roles in regulating the amount of protein produced. Our aim is to identify these disease-causing regulatory variants and determine how they lead to disease. By identifying these variants, we hope to influence clinical genetic testing guidelines and allow a valuable genetic diagnosis to be returned to more rare disease patients.

We are fascinated by untranslated regions and aim to understand how and when genetic variants within them cause disease.

Untranslated regions (UTRs) are the regions of a gene directly upstream and downstream of the protein-coding region. They form part of the mRNA molecule but are not translated into protein. These regions have very important regulatory roles: they control the stability of the RNA, its location within the cell, and the rate at which it is translated into protein. Variants within these regions that affect these regulatory processes can therefore have a large impact. However, we are currently limited in our ability to predict the likely effect of any single variant. We aim to identify subsets of UTR variants that are deleterious and to create tools and resources that help find and interpret these variants.

Data

We mainly analyse publicly available large-scale genomic datasets including:

In addition, we collaborate with others to access large patient cohorts including the Centre for Mendelian Genomics at the Broad Institute (through collaboration with Anne O’Donnell Luria and Heidi Rehm) and the Deciphering Developmental Disorders (DDD) cohort at the Wellcome Sanger Institute (through collaboration with Matt Hurles, Caroline Wright and Hillary Martin).

Approaches

  • Using population cohorts to identify specific variants in functional non-coding elements that show signals of being under strong negative selection, indicating that they are likely deleterious

  • Identifying disease-causing non-coding region variants in rare disease cases

  • Assessing the contribution of non-coding variants in enhancer regions linked to known cardiomyopathy (heart muscle disease) genes

  • Creating tools and resources to improve annotation of functional non-coding variants

  • Developing clinical guidelines to support interpretation of non-coding region variants

Other interests

  • Assessing whether genetic variants identified in patients are too common to cause disease

  • Improving methods and tools for clinical variant interpretation
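
To make the "too common to cause disease" idea concrete, here is a minimal sketch of one widely used way to frame it for a dominant disorder: a variant cannot plausibly be more frequent in the population than the disease itself allows. The function names, the simplified formula, and the example numbers are illustrative assumptions, not a description of our own tools. A real analysis would also account for sampling uncertainty in the observed frequency.

```python
def max_credible_allele_frequency(prevalence, max_allelic_contribution, penetrance):
    """Upper bound on the population allele frequency of a single causal variant.

    Simplified model for a dominant disorder (an assumption of this sketch):
      prevalence               - fraction of people with the disease
      max_allelic_contribution - largest fraction of cases one variant may explain
      penetrance               - fraction of carriers who develop the disease
    The factor of 2 converts from individuals to chromosomes.
    """
    for name, value in [("prevalence", prevalence),
                        ("max_allelic_contribution", max_allelic_contribution),
                        ("penetrance", penetrance)]:
        if not 0 < value <= 1:
            raise ValueError(f"{name} must be in the interval (0, 1]")
    return prevalence * max_allelic_contribution / (2 * penetrance)


def is_too_common(observed_allele_frequency, threshold):
    """True if the observed frequency exceeds the maximum credible frequency."""
    return observed_allele_frequency > threshold


# Hypothetical inputs: disease affects 1 in 500 people, no single variant
# explains more than 2% of cases, and half of carriers develop the disease.
threshold = max_credible_allele_frequency(
    prevalence=1 / 500, max_allelic_contribution=0.02, penetrance=0.5
)
print(f"maximum credible allele frequency: {threshold:.1e}")
print(is_too_common(1e-3, threshold))  # a variant seen in 1 in 1,000 chromosomes
print(is_too_common(1e-6, threshold))  # a variant seen in 1 in 1,000,000 chromosomes
```

Under these assumed inputs, a variant observed at a frequency well above the threshold would be hard to reconcile with a causal role, whereas a much rarer variant would remain a candidate.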