Exploring sequence variant annotation in human


The NF1 gene encodes the neurofibromin 1 protein. Variants in the gene have been associated with neurofibromatosis, a disease characterized by patches of skin pigmentation (also known as café-au-lait spots).

(a) Find the NF1 gene for human. How many variants are listed for this gene? How many structural variants are listed for this gene?

(b) Let’s focus on the Matched Annotation between NCBI and EBI (MANE) Select transcript of the NF1 gene. The MANE Select transcript flag describes the confidence of the transcript annotation. You can read more about this here. Find the rs876658658 variant. What Sequence Ontology (SO) term, or variant consequence type, has been assigned to this variant?

(c) What is the clinical significance for this variant?

(d) Why does Ensembl put the C allele first (C/T)?

(e) What is the ancestral allele predicted for this locus?

(f) Which allele is associated with type 1 neurofibromatosis (with supporting evidence) and what is the significance of the association?

(g) How many publications mention this variant?

(h) Besides neurofibromatosis, what other MIM morbid entries are available for this gene?

(a) Go to the Ensembl homepage, search for NF1 in human and click on the first hit “NF1 (Human Gene)” in the results page. Click on Variant table in the left-hand menu.

A notice in a yellow box will show you that there are 76,692 variants for this gene.

Now click on Structural variants in the left-hand menu and scroll to the bottom of the table.

There are 966 entries for structural variants.
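The same counts can be approximated programmatically. The following is a minimal sketch that assumes the public Ensembl REST server at rest.ensembl.org and its overlap/id endpoint; the helper names (overlap_url, fetch_json, count_features) are made up for this illustration. The numbers returned may not match the browser tables exactly, since both change between releases and the endpoint reports features overlapping the gene's region.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server (assumed reachable)


def overlap_url(gene_id: str, feature: str) -> str:
    """Build the overlap/id URL for one feature type, e.g. 'variation'."""
    return f"{REST_SERVER}/overlap/id/{gene_id}?feature={feature}"


def fetch_json(url: str):
    """Download a URL and parse the JSON body (needs network access)."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def count_features(gene_id: str, feature: str) -> int:
    """Count the features of one type that overlap the given gene."""
    return len(fetch_json(overlap_url(gene_id, feature)))


# Example calls (not run here because they need network access):
#   count_features("ENSG00000196712", "variation")
#   count_features("ENSG00000196712", "structural_variation")
```

ENSG00000196712 is the NF1 gene ID shown later in this exercise. For a gene with tens of thousands of variants, the request can take a while to return.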

(b) Under Show transcript table, click on ENST00000358273.9. This opens the Transcript tab. Now go to Variant table on the left-hand menu and search for rs876658658 in the table.

The rs876658658 variant has a ‘stop gained’ SO term assigned to it.
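The consequence lookup can also be scripted. The sketch below assumes the Ensembl REST VEP endpoint (/vep/human/id/rs876658658) returns a list of results, each with a transcript_consequences list holding transcript_id and consequence_terms. The function name and the sample record are hypothetical: the sample is hand-written to mirror the answer above and is not real API output.

```python
def consequences_for_transcript(vep_response, transcript_id):
    """Collect the SO consequence terms reported for one transcript.

    Version suffixes such as '.9' are ignored on both sides of the comparison.
    """
    stable_id = transcript_id.split(".")[0]
    terms = []
    for result in vep_response:
        for consequence in result.get("transcript_consequences", []):
            reported_id = consequence.get("transcript_id", "").split(".")[0]
            if reported_id == stable_id:
                terms.extend(consequence.get("consequence_terms", []))
    return terms


# Hand-written sample shaped like a VEP response (illustrative, not fetched).
sample = [
    {
        "id": "rs876658658",
        "most_severe_consequence": "stop_gained",
        "transcript_consequences": [
            {"transcript_id": "ENST00000358273", "consequence_terms": ["stop_gained"]},
        ],
    }
]

print(consequences_for_transcript(sample, "ENST00000358273.9"))
```

Filtering by transcript matters because a variant can have different consequences in different transcripts of the same gene.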

(c) In the Variant table, look for the Clin. Sig. column.

The rs876658658 variant is described as both ‘likely pathogenic’ and ‘pathogenic’.

(d) In Ensembl, the allele that is present in the reference genome assembly is put first, i.e. C.

In the literature, the major allele (in the population of interest) is usually put first. In the case of rs876658658, the allele in the reference genome is the major allele, but because the reference genome is a mosaic of the genomes of just a few individuals, this is by no means the case for all variants.

(e) Click on the variant rs876658658. This opens the Variant tab. Focus on the summary information at the top of the page.

The ancestral allele is reported as C, which you can find in the Alleles line.
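The allele ordering and summary fields discussed in (c) to (e) can be read from a variant record in code. This is a minimal sketch that assumes a record shaped like the response of the Ensembl REST endpoint /variation/human/rs876658658, with an allele_string such as "C/T" under mappings and a clinical_significance list. Where the ancestral allele sits in the record is an assumption, so the function checks both the mapping and the top level. The function name and the sample record are hypothetical and hand-written from the answers above.

```python
def summarise_variant(record):
    """Extract the reference-first allele order and summary fields from a record."""
    mapping = record["mappings"][0]
    alleles = mapping["allele_string"].split("/")
    # Assumption: the ancestral allele may be stored in the mapping or at the top level.
    ancestral = mapping.get("ancestral_allele", record.get("ancestral_allele"))
    return {
        "reference_allele": alleles[0],      # Ensembl lists the reference allele first
        "alternate_alleles": alleles[1:],
        "ancestral_allele": ancestral,
        "clinical_significance": record.get("clinical_significance", []),
    }


# Hand-written sample mirroring the answers in this exercise (not fetched).
sample_record = {
    "name": "rs876658658",
    "mappings": [{"allele_string": "C/T", "ancestral_allele": "C"}],
    "clinical_significance": ["likely pathogenic", "pathogenic"],
}

summary = summarise_variant(sample_record)
print(summary)
```

Only the first mapping is inspected here; a variant that maps to several locations would need a loop over all mappings.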

(f) Click on Phenotype Data in the left-hand menu.

The T allele was reported to be associated with type 1 neurofibromatosis with supporting evidence.

(g) Click on Citations in the left-hand menu.

Three publications mention this variant.

(h) Switch to the Gene: NF1 tab at the top of the page. Click on Phenotypes in the left-hand menu. You can filter the table “Phenotypes, diseases and traits associated with this gene ENSG00000196712” by entering “MIM morbid” in the text field at the top right corner of the table.

Other MIM morbid entries for the NF1 gene include Neurofibromatosis-Noonan syndrome and Watson syndrome.
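The table filter used in (h) can be mimicked in code. The sketch below assumes phenotype records shaped like those returned by the Ensembl REST endpoint /phenotype/gene/homo_sapiens/NF1, each carrying a description and a source field. The function name and the sample records are hypothetical; the sample is hand-written from the answer above, and the non-MIM source label is a placeholder.

```python
def descriptions_from_source(records, source):
    """Return the unique phenotype descriptions reported by one source, in order."""
    wanted = source.lower()
    descriptions = []
    for record in records:
        if record.get("source", "").lower() != wanted:
            continue
        description = record.get("description")
        if description and description not in descriptions:
            descriptions.append(description)
    return descriptions


# Hand-written sample records (illustrative, not fetched from the API).
sample_records = [
    {"description": "Neurofibromatosis-Noonan syndrome", "source": "MIM morbid"},
    {"description": "Watson syndrome", "source": "MIM morbid"},
    {"description": "Watson syndrome", "source": "MIM morbid"},
    {"description": "Example phenotype", "source": "Other source"},  # placeholder entry
]

print(descriptions_from_source(sample_records, "MIM morbid"))
```

The comparison is case-insensitive so that a label such as "MIM Morbid" would still match; duplicates are dropped because the same phenotype can appear in several records.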