
Genes and transcripts in Ensembl, Demo

Demo: The gene tab

If you click on any one of the transcripts in the Region in detail image, a pop-up menu will appear, allowing you to jump directly to that gene or transcript.

Another way to go to a gene of interest is to search directly for it.

We’re going to look at the pig NSDHL gene.

From ensembl.org, type NSDHL into the search bar and click the Go button. You will get a list of hits with the human gene at the top.

When you search without specifying a species, or when the ID is not restricted to a single species, the most popular species appear first; in this case, human, mouse and zebrafish. To find the pig gene, use the Restrict species to: option on the left and select Pig from … 211 more species ….

You will see links to the NSDHL gene in a number of pig breeds. We want the gene in the reference pig.
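The same search can also be scripted rather than clicked through. The sketch below only builds a lookup URL for the Ensembl REST service and does not send a request. It assumes the `lookup/symbol` endpoint at rest.ensembl.org and assumes that `sus_scrofa` is the species name for the reference pig; neither appears in this demo, so check both against the REST documentation before relying on them. The function name `lookup_symbol_url` is made up for illustration.

```python
from urllib.parse import quote, urlencode

# Assumed base address of the Ensembl REST service (not stated in this demo).
REST_SERVER = "https://rest.ensembl.org"


def lookup_symbol_url(species, symbol):
    """Build (but do not fetch) a REST URL that looks up a gene by symbol."""
    path = f"/lookup/symbol/{quote(species)}/{quote(symbol)}"
    # Keep "/" unescaped so the content type stays readable in the URL.
    query = urlencode({"content-type": "application/json"}, safe="/")
    return f"{REST_SERVER}{path}?{query}"


# "sus_scrofa" is an assumption for the reference pig species name.
url = lookup_symbol_url("sus_scrofa", "NSDHL")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return a JSON record for the gene if the assumptions above hold.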

Click on the gene name or Ensembl ID for the reference pig. The Gene tab should open:

Let’s walk through some of the links in the left-hand navigation column. How can we view the genomic sequence? Click Sequence at the left of the page.

The sequence is shown in FASTA format. Take a look at the FASTA header:

Exons are highlighted within the genomic sequence. Variants can be added with the Configure this page link found at the left. Click on it now.

Once you have selected changes (in this example, Show variants and Line numbering) click at the top right.

You can download this sequence by clicking the Download sequence button above the sequence:

This will open a dialogue box that allows you to pick between plain FASTA sequence and sequence in RTF, which includes all the coloured annotations and can be opened in a word processor. This button is available for all sequence views.
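Once a plain FASTA file has been downloaded, it can be read with a few lines of Python. This is a minimal sketch that assumes the usual FASTA layout: a header line starting with `>`, followed by one or more sequence lines. The record used here is a made-up placeholder, not the real NSDHL header or sequence, and `read_fasta` is a hypothetical helper name.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record for illustration only.
example = """>example_region placeholder description
ACGTACGT
TTGA
"""

records = read_fasta(example)
for name, seq in records:
    print(name, len(seq))
```

For real work, an established parser such as the one in Biopython is usually a safer choice than a hand-written one.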

To find out what the protein does, have a look at GO terms from the Gene Ontology consortium (www.geneontology.org). There are three pages of GO terms, representing the three divisions in GO: Biological process (what the protein does), Cellular component (where the protein is) and Molecular function (how it does it). Click on GO: Biological process to see an example of the GO pages.

Can our gene be found in other databases? Go up the left-hand menu to External references:

This contains links to the gene in other projects, such as EntrezGene, and papers where this sequence is published.

Demo: The transcript tab

Let’s now explore one splice isoform. Click on Show transcript table at the top.

Have a look at the largest one, NSDHL-202.

If we were to choose only one transcript to analyse, we would choose this one because it carries the APPRIS P5 flag, which marks it as a principal isoform.

Click on the ID, ENSSSCT00000033571.3.
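The ID itself carries some information. Ensembl stable IDs generally follow the pattern ENS, an optional species prefix (SSC for pig, none for human), a feature letter (G for gene, T for transcript, P for protein, E for exon), an 11-digit number, and an optional version after the dot. The sketch below splits an ID into those parts. It assumes a three-letter species prefix and covers only these four feature types, so it will reject IDs that follow other conventions; `parse_stable_id` is a hypothetical helper name.

```python
import re

# Assumes a three-letter species prefix and only G/T/P/E feature letters.
STABLE_ID = re.compile(
    r"^ENS(?P<species>[A-Z]{3})?(?P<feature>[GTPE])"
    r"(?P<number>\d{11})(?:\.(?P<version>\d+))?$"
)

FEATURES = {"G": "gene", "T": "transcript", "P": "protein", "E": "exon"}


def parse_stable_id(stable_id):
    """Split an Ensembl-style stable ID into its component parts."""
    match = STABLE_ID.match(stable_id)
    if match is None:
        raise ValueError(f"not a recognised stable ID: {stable_id!r}")
    version = match.group("version")
    return {
        "species_prefix": match.group("species"),  # None for human IDs
        "feature": FEATURES[match.group("feature")],
        "number": int(match.group("number")),
        "version": int(version) if version is not None else None,
    }


parsed = parse_stable_id("ENSSSCT00000033571.3")
print(parsed)
```

Here the trailing `.3` is the version of the transcript record; the part before the dot is the stable identifier itself.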

You are now in the Transcript tab for NSDHL-202. The left-hand navigation column provides several options for this transcript.

For detailed information on the support for this transcript, click on Supporting evidence.

Click on the identifiers of the evidence to get a pop-up. The pop-up links out to the original records of these data in, for example, RefSeq, UniProt or ENA.

Click on the Exons link.

You may want to change the display (for example, to show more flanking sequence, or to show full introns). To do so, click on Configure this page and change the display options accordingly.

Now click on the cDNA link to see the spliced transcript sequence.

Untranslated regions (UTRs) are highlighted in dark yellow, codons are highlighted in light yellow, and exon sequence is shown in alternating black and blue letters to mark exon boundaries. Sequence variants are represented by highlighted nucleotides, and clickable IUPAC codes are shown above the sequence.
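The IUPAC codes above the sequence are single letters that stand for a set of possible nucleotides at a variant position; for example, R means A or G. The sketch below encodes the standard IUPAC nucleotide ambiguity table and converts in both directions. The helper names `expand_code` and `code_for` are made up for illustration, and gap or indel symbols are not handled.

```python
# Standard IUPAC nucleotide codes; each value lists the bases in sorted order.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Reverse table: sorted base string -> single-letter code.
CODE_FOR_BASES = {bases: code for code, bases in IUPAC.items()}


def expand_code(code):
    """Return the set of bases that an IUPAC code stands for."""
    return set(IUPAC[code.upper()])


def code_for(alleles):
    """Return the IUPAC code that represents a collection of bases."""
    key = "".join(sorted({allele.upper() for allele in alleles}))
    return CODE_FOR_BASES[key]


print(sorted(expand_code("R")))
print(code_for(["C", "T"]))
```

A variant with alleles C and T would therefore appear as Y above the sequence.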

Next, follow the General identifiers link at the left.

This page shows information from other databases, such as RefSeq, UniProtKB and CCDS, that match the Ensembl transcript and protein.

Now click on Protein summary to view domains from Pfam, PROSITE, Superfamily, InterPro, and more.

Clicking on Domains & features shows a table of this information.