Export sequences in FASTA format

Retrieve the sequences of all chicken (Gallus gallus) genes that are located on chromosome 20, are protein coding, and encode proteins containing transmembrane domains. Run a count after applying each filter to check how many genes remain in your dataset. Export the protein sequences (FASTA) as a Compressed web file and have a notification sent to you by email when the results are ready.

On the Ensembl homepage, click on the BioMart link on the toolbar.

Start with all genes in chicken by choosing the Ensembl Genes database, then Gallus gallus genes dataset.

Now filter for genes on chromosome 20 only:

Click on Filters in the left panel, expand the REGION section by clicking on the + box. Select Chromosome – 20.

Now click the Count button on the toolbar.

This will give you 473 / 24356 Genes.

Now filter further for protein-coding genes by expanding the GENE section (simply click on the + box). Then select Gene type – protein_coding and click Count again.

This now gives you 332 / 24356 Genes.

Finally, filter for genes that encode proteins containing transmembrane domains. Expand the PROTEIN DOMAINS section by clicking on the + box. Select Transmembrane helices – Only, then click Count once more.

There are 69 genes on chromosome 20 in chicken that are protein coding and contain transmembrane domains.

Now you can specify the attributes to be included in the output (note that a number of attributes will already be selected by default). Click on Attributes in the left panel, select Sequences, then Protein. The sequences will be exported in FASTA format.
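The same selection of dataset, filters and attributes can also be written down as a BioMart XML query, which is handy for keeping a record of the exercise. The Python sketch below only builds the query text; it does not contact any server. All internal names in it are assumptions and should be checked against the XML that BioMart itself shows for your query: the dataset name ggallus_gene_ensembl, the filter names chromosome_name, biotype and with_tmhmm, the attribute names ensembl_gene_id and peptide, and the Query settings. The helper build_biomart_query is hypothetical and not part of any Ensembl library.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Build a BioMart-style XML query string (hypothetical helper).

    filters maps a filter name to either a string value or a bool.
    A bool is written as excluded="0" (Only) or excluded="1" (Excluded).
    """
    # Query-level settings are assumptions; compare with BioMart's own XML.
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "FASTA",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        if isinstance(value, bool):
            # Boolean filter such as "Transmembrane helices - Only"
            ET.SubElement(ds, "Filter", {"name": name, "excluded": "0" if value else "1"})
        else:
            ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    prolog = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return prolog + ET.tostring(query, encoding="unicode")


# Assumed internal names for the choices made in this exercise.
query_xml = build_biomart_query(
    dataset="ggallus_gene_ensembl",
    filters={
        "chromosome_name": "20",      # REGION: Chromosome 20
        "biotype": "protein_coding",  # GENE: Gene type
        "with_tmhmm": True,           # PROTEIN DOMAINS: Transmembrane helices, Only
    },
    attributes=["ensembl_gene_id", "peptide"],
)
print(query_xml)
```

Each Filter element corresponds to one of the three filters applied above, and the peptide attribute corresponds to choosing Sequences, then Protein.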

Have a look at a preview of the results (only 10 rows of the results will be shown):

Click the Results button on the toolbar.

If you are happy with how the results look in the preview, output all the results: under Export all results to, choose Compressed web file (notify by email), tick Unique results only, enter your email address in the appropriate box and click Go.
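Once the compressed file has been downloaded, a quick sanity check is to count the sequences it contains. The Python sketch below assumes the file is gzip-compressed FASTA and that each header line begins with the gene ID followed by a | separator; the actual header layout depends on the attributes you selected. The example records are made up for illustration and are not real BioMart output, and fasta_headers is a hypothetical helper.

```python
import gzip
import io


def fasta_headers(handle):
    """Return the header text (without '>') of every FASTA record in a text handle."""
    return [line[1:].strip() for line in handle if line.startswith(">")]


# Made-up records standing in for the downloaded file (not real data).
example = (
    ">GENE_A|TRANSCRIPT_A1\nMKT\n"
    ">GENE_A|TRANSCRIPT_A2\nMKV\n"
    ">GENE_B|TRANSCRIPT_B1\nMAL\n"
)
compressed = gzip.compress(example.encode("ascii"))

# For a real download, replace io.BytesIO(compressed) with the file path.
with gzip.open(io.BytesIO(compressed), "rt") as handle:
    headers = fasta_headers(handle)

# Assumes the gene ID is the first '|'-separated field of each header.
gene_ids = {header.split("|")[0] for header in headers}
print(len(headers), "sequences from", len(gene_ids), "genes")
```

The number of sequences can be larger than the gene count reported by BioMart, because a gene with several protein-coding transcripts contributes one protein sequence per transcript.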