Finding protein coding genes with AlphaFold DB import data in Bemisia tabaci

The whitefly Bemisia tabaci Uganda 1 has been reported from a range of vegetable and weed hosts. This species is known to transmit several groups of plant viruses that constrain sweetpotato production in Uganda (Fiallo-Olivé et al. 2020), so a comprehensive understanding of this species is crucial to food security.

  1. Use BioMart to export a list of protein coding genes in Bemisia tabaci Uganda 1 with AlphaFold DB data
  2. Retrieve their protein IDs
  3. Retrieve their sequence in the FASTA format

Go to Ensembl Metazoa. Click on BioMart on the navigation bar at the top of the page. Click the New button on the toolbar in the top left-hand corner, then choose the Ensembl Metazoa Genes database and the Bemisia tabaci Uganda 1 dataset. Next, go to Filters on the left-hand panel and filter for genes with Gene type: Protein coding and Limit to genes: With AlphaFold DB import: Only.

Make sure the box next to each filter is ticked; otherwise that filter will not be applied. Click the Count button on the toolbar.
> This will give you 20 / 13802 Genes.
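The same filter selection can also be written as a BioMart XML query, which is the format BioMart itself produces when you click the XML button on the toolbar. The following is a minimal sketch of how such a query could be assembled in Python. The dataset name and the AlphaFold filter name are placeholders, not the real internal names, and the `metazoa_mart` virtual schema is an assumption; check the XML generated by the BioMart interface for the actual values.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes,
                        virtual_schema="metazoa_mart", count=False):
    """Return a BioMart XML query as a string.

    All names passed in are BioMart internal names. The values used in the
    example call below are placeholders (assumptions), not verified names.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": virtual_schema,  # assumed schema for Ensembl Metazoa
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "count": "1" if count else "",        # "1" asks for a count only
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        if isinstance(value, bool):
            # "Limit to genes: With ..." style filters: Only -> excluded="0"
            ET.SubElement(ds, "Filter",
                          {"name": name, "excluded": "0" if value else "1"})
        else:
            ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


xml_query = build_biomart_query(
    dataset="PLACEHOLDER_dataset_name",            # hypothetical
    filters={
        "biotype": "protein_coding",
        "PLACEHOLDER_with_alphafold_import": True,  # hypothetical filter name
    },
    attributes=["ensembl_gene_id", "ensembl_peptide_id"],
    count=True,
)
print(xml_query)
```

With `count=True` the query corresponds to the Count button; without it, the same query would return the result table.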

Go to Attributes on the left-hand panel. Select Gene stable ID, Protein stable ID and AlphaFold DB import. Click on Results on the toolbar and the table will display the options you have selected as attributes.
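If you export this table as a TSV file, the protein IDs can be grouped by gene with a few lines of Python. This is a sketch only: it assumes the exported column headers match the attribute names selected above, and the rows in the example are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def proteins_by_gene(tsv_text):
    """Map each gene stable ID to the list of its protein stable IDs."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = defaultdict(list)
    for row in reader:
        gene = row["Gene stable ID"]
        protein = row["Protein stable ID"]
        proteins = mapping[gene]  # registers the gene even if protein is empty
        if protein and protein not in proteins:
            proteins.append(protein)
    return dict(mapping)


# Made-up rows for illustration only; real stable IDs will differ.
example_tsv = (
    "Gene stable ID\tProtein stable ID\tAlphaFold DB import\n"
    "GENE_A\tPROT_A1\tAF_X\n"
    "GENE_A\tPROT_A2\tAF_X\n"
    "GENE_B\tPROT_B1\tAF_Y\n"
)

genes = proteins_by_gene(example_tsv)
for gene_id, protein_ids in genes.items():
    print(gene_id, ",".join(protein_ids))
```

For a real export, read the downloaded file and pass its contents to `proteins_by_gene`.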

Go back to Attributes on the left-hand panel. Expand the SEQUENCES section by clicking on the + box and select Peptide. Then select the header information you want from the HEADER INFORMATION section (for example, Gene stable ID and Protein stable ID).

Click on Results on the toolbar and the sequences will be shown in FASTA format. You can export the sequences by downloading them directly to your local machine or by having them sent to you by email.
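Once the FASTA file is downloaded, it can be read with a short parser such as the sketch below. The headers and peptide sequences in the example are invented, and the `|` separator in the header is an assumption about how the selected header attributes are joined; adjust it to match your file.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one first
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped over several lines
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Invented example; real headers and sequences will differ.
example_fasta = """>GENE_A|PROT_A1
MKTLLVA
GHW
>GENE_B|PROT_B1
MSSP
"""

peptides = parse_fasta(example_fasta)
for header, seq in peptides.items():
    gene_id, protein_id = header.split("|")
    print(gene_id, protein_id, len(seq))
```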