AstraZeneca PheWAS Portal

Model Definitions

This page summarises the variant filtering criteria that define the models used in the gene-level and variant-level analyses. For full technical details of the filtering criteria, including quality metrics, please refer to the following publications:

Rare variant contribution to human disease in 281,104 UK Biobank exomes, Quanli Wang, Ryan S. Dhindsa, Keren Carss, Andrew R. Harper, Abhishek Nag, Ioanna Tachmazidou, Dimitrios Vitsios, Sri V. V. Deevi, Alex Mackay, Daniel Muthas, Michael Hühn, Sue Monkley, Henric Olsson, AstraZeneca Genomics Initiative, Sebastian Wasilewski, Katherine R. Smith, Ruth March, Adam Platt, Carolina Haefliger, Slavé Petrovski, Nature (2021): https://www.nature.com/articles/s41586-021-03855-y

Whole-genome sequencing of half-a-million UK Biobank participants, Shuwei Li, Keren J. Carss, Bjarni V. Halldorsson, Adrian Cortes, UK Biobank Whole-Genome Sequencing Consortium (preprint): https://doi.org/10.1101/2023.12.06.23299426

Model type | Model | Definition

Model for gene-level results (collapsing analysis)

flexdmg
  • Non-synonymous (including protein-truncating variants, missense variants, in-frame insertions and deletions, splice acceptor variants and splice donor variants)
  • Predicted to be damaging (REVEL score ≥ 0.25)
  • Moderately rare (minor allele frequency ≤ 0.001 within the cohort and within gnomAD)
flexnonsyn
  • Non-synonymous
  • Moderately rare (minor allele frequency ≤ 0.001 within the cohort and within gnomAD)
flexnonsynmtr
  • Non-synonymous
  • Moderately rare (minor allele frequency ≤ 0.001 within the cohort and within gnomAD)
  • Missense variants must fall within a constrained region (MTR < 0.78 or MTR_centile < 0.5)
ptv
  • Protein-truncating variant
  • Moderately rare (minor allele frequency ≤ 0.001 within the cohort and within gnomAD)
ptv5pcnt

Care should be taken in the interpretation of this model because it has a higher allele frequency threshold than other models and may therefore be more susceptible to confounding factors.

  • Protein-truncating variant
  • Less common (minor allele frequency ≤ 0.05 within the cohort and within gnomAD)
ptvraredmg
  • Non-synonymous
  • Predicted to be damaging (REVEL score ≥ 0.25)
  • PTVs must be moderately rare (minor allele frequency ≤ 0.001 within the cohort and within gnomAD)
  • Non-PTVs must be rare (minor allele frequency ≤ 0.0005 within the cohort and within gnomAD)
raredmg
  • Missense variants
  • Predicted to be damaging (REVEL score ≥ 0.25)
  • Rare (minor allele frequency ≤ 0.0005 within the cohort and within gnomAD)
raredmgmtr
  • Missense variants
  • Predicted to be damaging (REVEL score ≥ 0.25)
  • Rare (minor allele frequency ≤ 0.0005 within the cohort and within gnomAD)
  • Missense variants must fall within a constrained region (MTR < 0.78 or MTR_centile < 0.5)
rec

Care should be taken in the interpretation of this model because it has a higher allele frequency threshold than most other models and may therefore be more susceptible to confounding factors. Additionally, be aware that the two qualifying variants in a gene in an individual may be on the same chromosome (in cis), and thus not in fact recessive. Includes variants that fulfil the following criteria:

  • Non-synonymous
  • Slightly rare (minor allele frequency ≤ 0.005 within the cohort and within gnomAD)
  • Variant must either be homozygous, or there must be two or more such qualifying variants in the gene in the individual
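The carrier rule for the rec model can be sketched as follows. This is a minimal illustration, not the portal's implementation: the function name `is_rec_carrier` and the input representation (one alternate-allele count per qualifying variant observed in a single gene in a single individual) are assumptions made for the example.

```python
from typing import Sequence


def is_rec_carrier(qualifying_genotypes: Sequence[int]) -> bool:
    """Return True if one individual counts as a carrier for one gene under rec.

    `qualifying_genotypes` is a hypothetical input: the alternate-allele count
    (0, 1 or 2) for each variant in the gene that already passed the
    non-synonymous and frequency filters.
    """
    # Case 1: at least one qualifying variant is homozygous.
    has_homozygous = any(g == 2 for g in qualifying_genotypes)
    # Case 2: two or more distinct qualifying variants are carried.
    # Phase is not checked, so both variants may sit on the same chromosome.
    n_carried = sum(1 for g in qualifying_genotypes if g >= 1)
    return has_homozygous or n_carried >= 2
```

Because the second case does not use phase information, a pair of heterozygous variants in cis is counted in the same way as a true compound heterozygote, which is the caveat noted for this model.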
syn

This model is a negative control and should not be investigated for functional phenotypic associations.

  • Synonymous
  • Rare (minor allele frequency ≤ 0.0005 within the cohort and within gnomAD)
UR
  • Non-synonymous
  • Predicted to be damaging (REVEL score ≥ 0.25)
  • Ultra-rare (minor allele frequency ≤ 0.00025 within the cohort and absent from gnomAD)
URmtr
  • Non-synonymous
  • Predicted to be damaging (REVEL score ≥ 0.25)
  • Ultra-rare (minor allele frequency ≤ 0.00025 within the cohort and absent from gnomAD)
  • Missense variants must fall within a constrained region (MTR < 0.78 or MTR_centile < 0.5)
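As a rough illustration of how the criteria above combine, the sketch below encodes a few of the gene-level models as predicates on a single variant. It is not the portal's pipeline. The `Variant` class, its field names and the consequence labels ("ptv", "missense", "synonymous") are hypothetical, and treating a missing REVEL or MTR annotation as failing the filter is an assumption. Only models whose definitions are unambiguous on this page are included.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Variant:
    consequence: str                    # hypothetical labels: "ptv", "missense", "synonymous"
    cohort_maf: float                   # minor allele frequency within the cohort
    gnomad_maf: float                   # minor allele frequency within gnomAD
    revel: Optional[float] = None       # REVEL score, if annotated
    mtr: Optional[float] = None         # MTR score, if annotated
    mtr_centile: Optional[float] = None


def maf_at_most(v: Variant, threshold: float) -> bool:
    # The frequency threshold must hold in the cohort and in gnomAD.
    return v.cohort_maf <= threshold and v.gnomad_maf <= threshold


def is_damaging(v: Variant) -> bool:
    # Assumption: a variant without a REVEL score does not pass.
    return v.revel is not None and v.revel >= 0.25


def in_constrained_region(v: Variant) -> bool:
    # MTR < 0.78 or MTR_centile < 0.5, as stated in the definitions.
    low_mtr = v.mtr is not None and v.mtr < 0.78
    low_centile = v.mtr_centile is not None and v.mtr_centile < 0.5
    return low_mtr or low_centile


MODELS: Dict[str, Callable[[Variant], bool]] = {
    "ptv": lambda v: v.consequence == "ptv" and maf_at_most(v, 0.001),
    "ptv5pcnt": lambda v: v.consequence == "ptv" and maf_at_most(v, 0.05),
    "raredmg": lambda v: (v.consequence == "missense" and is_damaging(v)
                          and maf_at_most(v, 0.0005)),
    "raredmgmtr": lambda v: (v.consequence == "missense" and is_damaging(v)
                             and maf_at_most(v, 0.0005)
                             and in_constrained_region(v)),
    "syn": lambda v: v.consequence == "synonymous" and maf_at_most(v, 0.0005),
}


def qualifying_models(v: Variant) -> List[str]:
    """Names of the sketched models under which the variant qualifies."""
    return sorted(name for name, passes in MODELS.items() if passes(v))
```

For example, a missense variant with REVEL 0.6, frequencies below 0.0005 and MTR 0.5 would qualify under both raredmg and raredmgmtr in this sketch, while the same variant with MTR 0.9 and MTR_centile 0.8 would qualify under raredmg only.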
Model for variant-level results

genotypic / additive

Estimates the effect of having an additional copy of the alternate allele (assumed to be the same whether comparing zero copies to one or one copy to two). Autosomal genotypes are coded as the number of alternate alleles (0, 1 or 2), and this is modelled as a continuous covariate with one degree of freedom.

allelic

The units of the test are alleles rather than participants. Compares minor to major alleles, i.e. there are two categories of exposure.

dominant

Compares participants with one or two minor alleles to those with none. There are two categories of exposure and one degree of freedom.

recessive

Compares participants with two minor alleles to those with zero or one. There are two categories of exposure and one degree of freedom.
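The four variant-level models differ only in how a genotype is turned into an exposure. The sketch below shows one way to express those codings. It is illustrative only: the function names are hypothetical, genotypes are given as alternate-allele counts (0, 1 or 2), and the alternate allele is assumed to be the minor allele so that the same counts can be used for every model.

```python
from typing import List, Sequence, Tuple


def additive_coding(genotypes: Sequence[int]) -> List[int]:
    # Number of alternate alleles, used as a single continuous covariate.
    return list(genotypes)


def dominant_coding(genotypes: Sequence[int]) -> List[int]:
    # 1 if the participant carries one or two minor alleles, else 0.
    return [1 if g >= 1 else 0 for g in genotypes]


def recessive_coding(genotypes: Sequence[int]) -> List[int]:
    # 1 if the participant carries two minor alleles, else 0.
    return [1 if g == 2 else 0 for g in genotypes]


def allele_counts(genotypes: Sequence[int]) -> Tuple[int, int]:
    # Allelic model: the unit is the allele, so each autosomal genotype
    # contributes two alleles. Returns (minor, major) allele counts.
    minor = sum(genotypes)
    major = 2 * len(genotypes) - minor
    return minor, major
```

In an allelic test, `allele_counts` would typically be computed separately for cases and controls and the two pairs of counts compared, whereas the other three codings yield one exposure value per participant.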