Exploring a SNP in the human genome

The missense variation rs1801133 in the human MTHFR gene has been linked to elevated levels of homocysteine, an amino acid whose plasma concentration seems to be associated with the risk of cardiovascular diseases, neural tube defects, and loss of cognitive function.

  1. Find the page with information for rs1801133.

  2. Is rs1801133 a missense variant in all transcripts of the MTHFR gene? What is the amino acid change?

  3. Why are the alleles for this variation in Ensembl given as G/A and not as C/T, as in the literature?

  4. What is the major allele of rs1801133 in different populations?

  5. In which paper(s) is the association between rs1801133 and homocysteine levels described?

  6. According to the data imported from dbSNP, the ancestral allele for rs1801133 is G. Ancestral alleles in dbSNP are based on a comparison between human and chimp. Does the sequence at this same position in other primates confirm that the ancestral allele is G?

  1. Go to the Ensembl homepage. Type rs1801133 in the search box, then click Go. Click on rs1801133.

  2. Click on Genes and Regulation in the left-hand panel, or click on the Genes and Regulation icon at the top of the page.

    No, rs1801133 is a missense variant in eight MTHFR transcripts. Please note that this variant is multiallelic, with two alternative alleles. Because the table displays one consequence per row, each transcript is listed twice.

    The amino acid change is A/V for allele A, and A/G for allele C.
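The per-transcript consequences shown in the Genes and Regulation table can also be pulled programmatically. The following is a minimal Python sketch, assuming the public Ensembl REST server at https://rest.ensembl.org and its VEP endpoint `/vep/human/id/<rsID>`. The function names `fetch_vep` and `missense_by_allele` are illustrative, and the JSON field names (`transcript_consequences`, `consequence_terms`, `variant_allele`, `amino_acids`) are assumptions about the VEP output that should be checked against the current REST documentation.

```python
import json
import urllib.request
from collections import defaultdict


def fetch_vep(rsid, server="https://rest.ensembl.org"):
    """Fetch VEP annotations for a variant ID (needs network access).

    The endpoint path and the header are assumptions based on the
    public Ensembl REST service.
    """
    url = f"{server}/vep/human/id/{rsid}"
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def missense_by_allele(vep_records):
    """Group missense transcript consequences by variant allele.

    Returns {variant_allele: {transcript_id: amino_acids}}.
    """
    grouped = defaultdict(dict)
    for record in vep_records:
        for tc in record.get("transcript_consequences", []):
            if "missense_variant" in tc.get("consequence_terms", []):
                allele = tc["variant_allele"]
                grouped[allele][tc["transcript_id"]] = tc.get("amino_acids")
    return dict(grouped)


# Example usage (not executed here because it needs network access):
#   summary = missense_by_allele(fetch_vep("rs1801133"))
#   for allele, transcripts in summary.items():
#       print(allele, len(transcripts), sorted(set(transcripts.values())))
```

Grouping by allele first mirrors the table in the browser, where each transcript appears once per alternative allele.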

  3. In Ensembl, the alleles of rs1801133 are given as G/A/C because these are the alleles on the forward strand of the genome. In the literature, the alleles are given as C/T/G because the MTHFR gene is located on the reverse strand, so the alleles in the actual gene and transcript sequences are C/T/G. In Ensembl, the allele that is present in the reference genome assembly is always listed first.

  4. Click on Population genetics in the side menu.

    In all populations but one, the allele G is the major one. The exception is CLM (Colombian in Medellin; 1000 Genomes).

  5. Click on Phenotype Data in the left-hand side menu.

    The specific studies in which the association was originally described are given in the Phenotype Data table. Links between rs1801133 and homocysteine levels were described in four papers. Click on the PubMed IDs PMID:34707639, PMID:23696881, PMID:20031578 and PMID:23824729 for more details.

  6. Click on Phylogenetic Context in the side menu. Select Alignment: 10 primates EPO and click Apply.

    Yes. Gorilla, bonobo, Sumatran orangutan, chimp, macaque, gibbon, vervet, crab-eating macaque and mouse lemur all have a G in this position, which is consistent with G being the ancestral allele.
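Two of the answers above lend themselves to a quick check outside the browser: converting forward-strand alleles to the notation used for a reverse-strand gene, and picking the major allele per population. The Python sketch below is illustrative only. The helper names `to_opposite_strand` and `major_alleles` are hypothetical, and the record layout (`population`, `allele`, `frequency` keys) is an assumption modelled on the population list that the Ensembl REST variation endpoint is understood to return when population data is requested.

```python
# Complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_opposite_strand(alleles):
    """Reverse-complement each allele in a slash-separated allele string.

    For single-base alleles this is just the complement of each base.
    """
    return "/".join(a.translate(COMPLEMENT)[::-1] for a in alleles.split("/"))


def major_alleles(population_records):
    """Return {population: allele with the highest frequency}.

    Each record is assumed to be a dict with 'population', 'allele'
    and 'frequency' keys.
    """
    best = {}
    for rec in population_records:
        pop = rec["population"]
        if pop not in best or rec["frequency"] > best[pop][1]:
            best[pop] = (rec["allele"], rec["frequency"])
    return {pop: allele for pop, (allele, _freq) in best.items()}


# Forward-strand alleles as shown in Ensembl, converted to the
# reverse-strand notation used in the literature.
print(to_opposite_strand("G/A/C"))  # C/T/G
```

Passing real frequency records to `major_alleles` and listing the populations whose major allele is not G would be one way to reproduce the population answer above.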