Finding genes by protein domain

Find the Tetranychus urticae (two-spotted spider mite) genes on chromosome/scaffold HE587301 that encode proteins with transmembrane helices.

As with all BioMart queries, you must select the dataset, set your filters (input) and define your attributes (desired output). For this exercise:

Dataset: Ensembl Metazoa genes in Tetranychus urticae
Filters: Transmembrane helix domain containing proteins on chromosome/scaffold HE587301
Attributes: Ensembl gene and transcript IDs and gene names

Go to the Ensembl Metazoa homepage and click on BioMart at the top of the page. Select Ensembl Metazoa genes as your database and Tetranychus urticae genes as the dataset. Click on Filters on the left of the screen and expand REGION. Change the chromosome/scaffold to HE587301. Now expand PROTEIN DOMAINS, also under Filters, and select Limit to genes, choosing with Transmembrane helices from the drop-down menu and then Only. Clicking on Count should reveal that you have filtered the dataset down to 352 genes.

Click on Attributes and expand GENE. Select Gene name; the Ensembl gene and transcript IDs are already selected by default. Now click on Results. The first 10 results are displayed by default; to display all results, select ALL from the drop-down menu.
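The same dataset, filters and attributes can also be written as a BioMart XML query, which is useful if you want to repeat the exercise from a script. The following is a minimal sketch in Python, not the official way to run this query. The internal names used here (the virtual schema metazoa_mart, the dataset turticae_eg_gene, the filters chromosome_name and with_tmhmm, and the three attribute names) are assumptions; the XML button on the BioMart results page shows the exact names for your query, so check against that before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed internal BioMart names; confirm them with the XML button in BioMart.
VIRTUAL_SCHEMA = "metazoa_mart"
DATASET = "turticae_eg_gene"
SCAFFOLD_FILTER = "chromosome_name"
TM_HELIX_FILTER = "with_tmhmm"
ATTRIBUTES = ["ensembl_gene_id", "ensembl_transcript_id", "external_gene_name"]


def build_query(scaffold="HE587301"):
    """Return a BioMart XML query string for the exercise."""
    query = ET.Element("Query", {
        "virtualSchemaName": VIRTUAL_SCHEMA,
        "formatter": "TSV",       # tab-separated output
        "header": "1",            # include a header row
        "uniqueRows": "1",        # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    dataset = ET.SubElement(query, "Dataset",
                            {"name": DATASET, "interface": "default"})
    # REGION filter: restrict to one chromosome/scaffold.
    ET.SubElement(dataset, "Filter",
                  {"name": SCAFFOLD_FILTER, "value": scaffold})
    # PROTEIN DOMAINS filter: excluded="0" corresponds to "Only".
    ET.SubElement(dataset, "Filter",
                  {"name": TM_HELIX_FILTER, "excluded": "0"})
    for name in ATTRIBUTES:
        ET.SubElement(dataset, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


if __name__ == "__main__":
    print(build_query())
```

The resulting string is normally sent as the query parameter to the BioMart martservice endpoint of the site hosting the mart. The sketch only builds the XML, so it can be inspected and compared with the query shown by the web interface before anything is submitted.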

The output will display the Ensembl gene ID, Ensembl transcript ID and gene name of all Tetranychus urticae genes on chromosome/scaffold HE587301 that encode proteins with transmembrane helices. If you prefer, you can also export the results as an Excel sheet by using the Export all results to XLS option.
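Because the results list transcript IDs alongside gene IDs, a gene with more than one transcript will likely appear on several rows, so the number of rows can be larger than the gene count reported by Count. The sketch below shows one way to count unique genes, assuming the results were exported as tab-separated text (TSV) with a header row and the gene ID in the first column. The function name unique_genes and the IDs in the sample are hypothetical placeholders, not real Tetranychus urticae identifiers.

```python
import csv
import io


def unique_genes(tsv_text):
    """Return the set of gene IDs found in the first column of a TSV export."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row
    return {row[0] for row in reader if row}


if __name__ == "__main__":
    # Placeholder rows for illustration only; GENE_A has two transcripts.
    sample = (
        "Gene stable ID\tTranscript stable ID\tGene name\n"
        "GENE_A\tGENE_A.1\t\n"
        "GENE_A\tGENE_A.2\t\n"
        "GENE_B\tGENE_B.1\tname_b\n"
    )
    print(len(unique_genes(sample)))  # prints 2
```

For a real export, read the file contents into a string and pass it to unique_genes; the size of the returned set can then be compared with the figure shown by Count.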