Ensembl Browser Workshop - Instituto Conmemorativo Gorgas de Estudio de la Salud
Course Details
- Lead Trainer
- Aleena Mushtaq
- Event Date
- 2025-11-18
- Location
- Panama City, Panama
- Description
- Work with the Ensembl Outreach team to get to grips with the Ensembl browser, accessing gene and comparative genomics data.
Demos and exercises
Ensembl species
The front page of Ensembl Metazoa is found at www.metazoa.ensembl.org/. It contains lots of information and links to help you navigate Ensembl Metazoa.

Mosquito species
- Go to Ensembl Metazoa. How many genomes relating to the genus Anopheles are there in Ensembl Metazoa?
- When was the current Anopheles gambiae genome assembly last revised?
- Go to metazoa.ensembl.org. Open the drop-down list or click on View full list of all Ensembl Metazoa species. In a Latin binomial species name, the first word represents the genus. Type Anopheles into the filter box in the top left to find all genomes with this word in the binomial.
There are 22 Anopheles genomes (some species are represented by more than one genome).
- Click on Anopheles gambiae (African malaria mosquito, PEST), and then on More information and statistics.
The assembly hosted is AgamP4 (INSDC Assembly GCA_000005575.1) which was revised in Feb 2006.
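The filter step relies on the rule stated above: the genus is the first word of the binomial. As a rough illustration only (this is not how the Ensembl site implements its filter box), the same rule can be applied to a plain list of species names in Python. The function name and the short example list are assumptions made for this sketch.

```python
def filter_by_genus(species_names, genus):
    """Return the names whose first word (the genus) matches `genus`, ignoring case."""
    wanted = genus.strip().lower()
    matches = []
    for name in species_names:
        words = name.split()
        if words and words[0].lower() == wanted:
            matches.append(name)
    return matches


# Illustrative list only; the full species list is on the Ensembl Metazoa site.
example_species = [
    "Anopheles gambiae",
    "Aedes aegypti",
    "Atta cephalotes",
    "Bemisia tabaci",
]
anopheles = filter_by_genus(example_species, "Anopheles")
```

Matching on the first word only, rather than anywhere in the name, avoids picking up species whose second word happens to contain the search term.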
Exploring genomic regions
Demo: Exploring genomic regions in Ensembl Metazoa
We’re going to look at a region of the Anopheles gambiae (African malaria mosquito, PEST) genome, 3L:17000000-17100000, and manipulate the view to see the data we are interested in.

Exploring a genomic region in Anopheles gambiae
(a) Go to the region from 7,300,000 to 7,450,000 bp on Anopheles gambiae chromosome 2L. On which cytogenetic band is this region located?
(b) How many genes are found in this region? Zoom in on the second exon of AGAP004970-RA. Turn on the track Start/Stop codons. Can you see the start codon of AGAP004970-RA?
(c) Highlight the start codon of AGAP004970-RA. Zoom out to view the whole gene. Can you see where you highlighted?
(a) Go to the Ensembl Metazoa homepage.
Select Search: Anopheles gambiae (African malaria mosquito, PEST), type 2L:7300000-7450000 and click Go.
The region is located on the cytogenetic band 21B.
(b) There are nine genes within this region and one that overlaps the end.
Drag out a box around the second exon of the gene (the second red box from the left) and click on Jump to region in the pop-up window to zoom in. If you have not zoomed in far enough, drag out another box and click on Jump to region.
The nucleotide sequence will appear on either side of the blue contig as pale blue (C), yellow (G), green (A) and pink (T) boxes. As you zoom in further, you will see the letters on the bases.
Click on Configure this page and click on Sequence and assembly. Turn on the track for Start/stop codons.
Alternatively, you can find the tracks by typing the name into the yellow Find a track box at the top right. Close the menu.
Start codons are shown in green and stop codons in red. You should see a green start codon that coincides with the start of the filled-in red box.
(c) Drag out a box around the green start codon and select Mark region. The highlighted region should be visible as a grey dotted line. Scroll up to the overview, drag out a box around the gene and select Jump to region. You will still see the highlight in this view.
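Region strings such as 2L:7300000-7450000 follow the pattern name:start-end, with both coordinates inclusive. The sketch below shows one way to split such a string and work out its length in Python. The function names are assumptions made for illustration and are not part of any Ensembl tool.

```python
import re

# name:start-end, where the numbers may contain thousands separators
REGION_PATTERN = re.compile(r"^([^:\s]+):([\d,]+)-([\d,]+)$")


def parse_region(region):
    """Split 'name:start-end' into (name, start, end); commas in numbers are allowed."""
    match = REGION_PATTERN.match(region.strip())
    if match is None:
        raise ValueError(f"Not a valid region string: {region!r}")
    name = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("Region start must not be greater than its end")
    return name, start, end


def region_length(region):
    """Length in bp; both coordinates are inclusive, hence the + 1."""
    _, start, end = parse_region(region)
    return end - start + 1


exercise_region = parse_region("2L:7300000-7450000")
exercise_length = region_length("2L:7,300,000-7,450,000")
```

Because the end coordinate is included, the exercise region spans 150,001 bp rather than 150,000 bp.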
Genes and Transcripts
Demo: Exploring genes and transcripts in Ensembl Metazoa
We’re going to look at the AAEL026647 gene in Aedes aegypti (Yellow fever mosquito, LVP_AGWG) to find out information about it and its transcript.

Exploring a leaf-cutter ant gene
- Find the Atta cephalotes LOC105618535 gene on Ensembl Metazoa.
- How long is its transcript? How long is the protein it encodes? How many exons does it have? Are any of the exons completely or partially untranslated?
- Export the sequence of the gene, cDNA and protein in FASTA format.
- Go to the Ensembl Metazoa homepage. Select Atta cephalotes (Leaf-cutter ant) from the species list and type LOC105618535 in the search box. Click Go. Click on LOC105618535.
- Click on Show transcript table.
The transcript is 3451 base pairs and the length of the encoded protein is 810 amino acids.
Click on the Ensembl Transcript ID XM_012200067.1 in the transcript table.
It has nine exons.
Click on Sequence - Exons in the side menu.
The last exon is partially untranslated (sequence shown in orange). This can also be seen in the transcript diagram on the Gene summary and Transcript summary pages, where the box representing the last exon is partially unfilled.
- Click on the blue Export data button. Under Options for FASTA sequence, select Genomic: Unmasked, cDNA and Peptide sequence. Click Next>. Click on Text.
This returns three sequences (one gene, one transcript and one protein sequence).
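Once the exported text is saved, the records can be separated with a few lines of Python. This is a minimal sketch of a FASTA reader, assuming the usual layout of a header line starting with > followed by one or more sequence lines. The example records are made-up placeholders, not real Ensembl output.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up placeholder records, not real Ensembl output.
example_text = """>example_gene
ACGTACGT
ACGT
>example_peptide
MKV
"""
records = parse_fasta(example_text)
```

Applied to the export from this exercise, a reader like this would be expected to return three records: the gene, the cDNA and the peptide.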
Exploring an Anopheles gambiae gene
Start at metazoa.ensembl.org/index.html and select the Anopheles gambiae (African malaria mosquito, PEST) genome.
(a) What GO: biological process terms are associated with the para gene?
(b) How many protein coding transcripts does this gene have? View all of these in the transcript comparison view.
(c) Go to the transcript tab for the transcript, AGAP004707-RH. How many exons does it have? Which one is the longest?
(a) Go to metazoa.ensembl.org/index.html. Click on Anopheles gambiae (African malaria mosquito, PEST) from the popular species list.
Search for para and click on the AGAP004707 link in the results.
Click on GO: biological process in the side menu.
There are nine GO terms listed. GO:0006811 ion transport, GO:0006814 sodium ion transport and GO:0019228 neuronal action potential are some of the terms listed.
(b) If the transcript table is hidden, click on Show transcript table to see it.
There are 13 protein coding transcripts.
Click on Transcript comparison in the left hand menu. Click on Select transcripts. Either select all the transcripts labelled protein coding one-by-one, or click on the drop down and select Protein coding. Close the menu.
(c) Click on the transcript named AGAP004707-RH. Click on Exons in the left hand menu.
There are 32 exons, of which exon 32 is longest with 1,017 bp.
Exploring a gene in Plasmodium falciparum
- Find the Plasmodium falciparum 3D7 PF3D7_1145400 gene on Ensembl Protists. On which strand is this gene located? What are the coordinates of the gene?
- How long is its transcript (in bp)? How long is the protein it encodes? How many exons does it have?
- What is the UniProt ID that maps to the translation of this transcript?
- What are the GO:Biological process(es) associated with PF3D7_1145400?
- Go to the Ensembl Protists homepage. Select Plasmodium falciparum from the species list and type PF3D7_1145400 in the search box. Click Go. Click on PF3D7_1145400 in the search results. You can find the strand orientation and coordinates in the gene Summary page.
PF3D7_1145400 is located on the reverse strand of chromosome 11 between 1,800,544 and 1,803,550.
- Click on Show transcript table.
The transcript is 2,514 base pairs and the length of the encoded protein is 837 amino acids.
Click on the transcript ID CZT99117 in the transcript table.
It has four exons.
- You can find this information in a number of places: the transcript table, External references on the Gene tab or General identifiers on the Transcript tab.
The UniProt ID that maps to the protein encoded by the PF3D7_1145400 transcript is Q8IHR4.
- Click on GO: Biological process in the side menu of the Gene tab.
The PF3D7_1145400 gene is involved in mitochondrial fission.
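The transcript and protein lengths reported above can be cross-checked with simple codon arithmetic: each amino acid takes three bases, and a complete coding sequence ends with one stop codon. The sketch below assumes a complete coding sequence that includes its stop codon. The function names are illustrative only.

```python
def coding_length_bp(protein_length_aa, include_stop=True):
    """Bases needed to encode a protein: three per amino acid, plus one stop codon."""
    codons = protein_length_aa + (1 if include_stop else 0)
    return 3 * codons


def untranslated_bp(transcript_length_bp, protein_length_aa):
    """Transcript bases left over once the coding sequence (with stop codon) is accounted for."""
    return transcript_length_bp - coding_length_bp(protein_length_aa)


# Figures from the PF3D7_1145400 answer above: 2,514 bp transcript, 837 aa protein.
pf_untranslated = untranslated_bp(2514, 837)
```

Under that assumption, 837 amino acids plus a stop codon account for all 2,514 bp, so no untranslated bases are left over in this transcript model. By the same arithmetic, the leaf-cutter ant transcript from the earlier exercise (3451 bp, 810 amino acids) would leave 1018 bp untranslated, which fits the partially untranslated last exon seen there.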
Comparative Genomics
Demo: Exploring comparative genomics data in Ensembl Metazoa
Navigate to “www.metazoa.ensembl.org”. Select “Anopheles gambiae (African malaria mosquito, PEST)” from the drop-down menu. Enter the gene ID: “AGAP004707”.

Synteny
Go to metazoa.ensembl.org.
Find the AGAP009734 (wingless-type MMTV integration site family, member 1) gene in Anopheles gambiae (African malaria mosquito, PEST). Go to the Location tab.
(a) Click Synteny on the left. Are there any syntenic regions in Aedes aegypti (Yellow fever mosquito, LVP_AGWG)? If so, which chromosomes are shown in this view?
(b) Stay in the Synteny view. Is there a homologue in Aedes aegypti (Yellow fever mosquito, LVP_AGWG) for Anopheles gambiae AGAP009734?
(a) Yes, there is one region in Aedes aegypti that is syntenic to Anopheles gambiae chromosome 3R, which is shown in the centre of this view. Aedes aegypti chromosome 2 has a syntenic region to A. gambiae chromosome 3R.
(b) Scroll down to the bottom of the page.
Aedes aegypti has a homologue, AAEL000599, of Anopheles gambiae AGAP009734.
Ensembl Protists: Comparative genomics data in Ensembl Protists
The protist species Leishmania panamensis is the main causative agent of tegumentary leishmaniasis in Panama and Colombia. In Panama, there are about 3,000 new cases per year, 5% of which progress to the mucocutaneous presentation. This presentation arises in the occasional cases where parasites migrate to nasopharyngeal tissues, leading to highly disfiguring lesions (Llanes et al., 2015; https://doi.org/10.1038/srep08550).
We will use the LPMP_140880 gene, which encodes a glutathione synthetase in Leishmania panamensis str. MHOM/PA/94/PSC-1 (GCA_000755165.1), as a reference to find the following information:
- Find the Ensembl gene tree ID. How many speciation and duplication nodes does it have?
- How many orthologues does the Leishmania panamensis gene have? What type of orthologues are they?
- Does this gene have an orthologue in Leishmania major? If so, what is the gene ID and coordinates in Leishmania major?
- View the protein alignment of the Leishmania major orthologue.
- What are the peptide IDs for the translations in Leishmania panamensis str. MHOM/PA/94/PSC-1 (GCA_000755165.1) and Leishmania major?
- Go to Ensembl Protists and search for the LPMP_140880 gene in Leishmania panamensis str. MHOM/PA/94/PSC-1 (GCA_000755165.1).
- Click on Orthologues on the left-hand side of the Gene tab.
- Filter for Leishmania major in the second table.
- Click on the Ensembl stable ID for Leishmania major and select View protein alignment.
- The peptide IDs are given in the summary table above the protein alignments.
BioMart
Follow these instructions to guide you through BioMart to answer the following query:
You have three questions about a set of Anopheles gambiae genes: Rps19, APG5, CYP6AK1, CPR113, CPF1 and HPX10
What are the NCBI gene IDs for these genes?
Are there associated functions from the GO (gene ontology) project that might help describe their function?
What are their cDNA sequences?

Finding protein coding genes with AlphaFold DB import data in Bemisia tabaci
The whitefly Bemisia tabaci Uganda 1 has been reported from a range of vegetable and weed hosts. This species is known to transmit several groups of plant viruses that constrain sweetpotato production in Uganda (Fiallo-Olivé et al. 2020), and a comprehensive understanding of this species is crucial to food security.
- Use BioMart to export a list of protein coding genes in Bemisia tabaci Uganda 1 with AlphaFold DB data
- Retrieve their protein IDs
- Retrieve their sequence in the FASTA format
Go to Ensembl Metazoa. Click on BioMart on the navigation bar at the top of the page. Click the New button on the toolbar on the top left-hand corner, choose the Ensembl Metazoa Genes database and Bemisia tabaci Uganda 1 dataset. Now, filter for the genes with Gene type: Protein coding and Limit to genes: With AlphaFold DB import only.
Make sure the box next to the filter is ticked, otherwise the filter won’t work. Click the Count button on the toolbar.
> This will give you 20 / 13802 Genes.
Go to Attributes on the left-hand panel. Select Gene stable ID, Protein stable ID and AlphaFold DB import. Click on Results on the toolbar and the table will display the options you have selected as attributes.
Go to Attributes on the left-hand panel. Expand the SEQUENCES section by clicking on the + box and select Peptide. Select the appropriate header information from the HEADER INFORMATION section.
Click on Results on the toolbar and the sequence will be shown in FASTA format. You can export the sequence by downloading it directly to your local machine or sending it to your email.
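The same dataset, filter and attribute choices can be written down as a BioMart XML query, which is what the XML button on the BioMart toolbar displays. The sketch below only assembles such a document with Python's standard library and does not contact any server. Every name passed in the example call is a placeholder invented for this sketch: copy the real internal names for the virtual schema, dataset, filters and attributes from the XML button for your own query.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(virtual_schema, dataset, value_filters, boolean_filters, attributes):
    """Assemble a BioMart XML query string from filter and attribute names."""
    query = ET.Element("Query", {
        "virtualSchemaName": virtual_schema,
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    dataset_el = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters that take a value, such as a gene type.
    for name, value in value_filters.items():
        ET.SubElement(dataset_el, "Filter", {"name": name, "value": value})
    # "Only" style filters are written with excluded="0" instead of a value.
    for name in boolean_filters:
        ET.SubElement(dataset_el, "Filter", {"name": name, "excluded": "0"})
    for name in attributes:
        ET.SubElement(dataset_el, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


# All names below are placeholders, not real BioMart identifiers.
xml_query = build_biomart_query(
    virtual_schema="VIRTUAL_SCHEMA_PLACEHOLDER",
    dataset="DATASET_PLACEHOLDER",
    value_filters={"GENE_TYPE_FILTER_PLACEHOLDER": "protein_coding"},
    boolean_filters=["ALPHAFOLD_FILTER_PLACEHOLDER"],
    attributes=["GENE_ID_PLACEHOLDER", "PROTEIN_ID_PLACEHOLDER"],
)
```

Keeping the query as text makes it easy to record exactly which filters and attributes were used for an export.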
Ensembl Protists: exporting homologues with BioMart
Go to Ensembl Protists. For the following list of Trypanosoma brucei genes, export the Leishmania major orthologues: Tb927.3.3470, Tb927.3.3520, Tb11.01.7770, Tb927.8.5110, Tb927.1.1420
- Go to BioMart, select the Ensembl Protists Genes database and choose the Trypanosoma brucei genes dataset.
- Click on Filters in the left panel. Expand the GENE section. Enter the gene list in the Input external references ID list box. Gene stable ID(s) should be preselected.
- Click on Attributes in the left panel. Select the Homologues attributes at the top of the page. Expand the GENE section. Deselect Gene stable ID version, Transcript stable ID and Transcript stable ID version. Expand the ORTHOLOGUES [K-O] section and select Leishmania major Gene ID.
- Click Results. Select View: All rows as HTML to open the entire table in a new tab. If you prefer, you can also export as a CSV, TSV or XLS file by using the Export all results to option.
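If you export the table as TSV, a short script can group the orthologues by query gene. The sketch below is one possible approach, assuming a tab-separated file with a header row. The column names and all IDs in the example are placeholders made up for this sketch, so substitute the headers that appear in your own export.

```python
import csv
import io


def group_orthologues(tsv_text, gene_column, orthologue_column):
    """Map each query gene ID to the list of orthologue IDs found in a TSV export."""
    grouped = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # setdefault keeps genes without any orthologue in the result, with an empty list.
        hits = grouped.setdefault(row[gene_column], [])
        orthologue = (row[orthologue_column] or "").strip()
        if orthologue and orthologue not in hits:
            hits.append(orthologue)
    return grouped


# Placeholder header names and IDs, not real BioMart output.
example_rows = [
    ["QUERY_GENE_COLUMN", "ORTHOLOGUE_COLUMN"],
    ["QUERY_GENE_1", "ORTHOLOGUE_A"],
    ["QUERY_GENE_1", "ORTHOLOGUE_B"],
    ["QUERY_GENE_2", ""],
]
example_tsv = "\n".join("\t".join(row) for row in example_rows)
orthologues = group_orthologues(example_tsv, "QUERY_GENE_COLUMN", "ORTHOLOGUE_COLUMN")
```

Genes with an empty orthologue column stay in the result with an empty list, so it is easy to spot which query genes returned no Leishmania major orthologue.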
