Ensembl Plants: finding genes by protein domain

One class of disease resistance (R) genes in plants comprises the TIR-NBS-LRR genes, which encode proteins containing an N-terminal Toll/interleukin receptor homology region (TIR), a nucleotide-binding site (NBS) and a C-terminal leucine-rich repeat (LRR). TIR-NBS-LRR genes are common in dicots but appear to be rare in monocots (Tarr and Alexander. TIR-NBS-LRR genes are rare in monocots: evidence from diverse monocot orders. BMC Res Notes 2009 Sep 8;2:197).

The ID for the TIR domain in the Pfam (protein family) database is PF01582.

Use BioMart in Ensembl Plants to generate a list of all Solanum tuberosum (potato; a dicot) genes that are annotated to contain a TIR domain. Include the Ensembl stable ID and gene description. Do the same for Zea mays (maize; a monocot).

Do your results confirm the findings of Tarr and Alexander?

  1. Go to Ensembl Plants and click on the link Tools at the top of the page. Click on BioMart. Choose the Ensembl Plants Genes database. Choose the Solanum tuberosum genes dataset.

  2. Now, filter for the genes containing a TIR domain: Click on Filters in the left panel. Expand the PROTEIN DOMAINS AND FAMILIES section. Select Limit to genes with these family or domain IDs and enter PF01582 in the box. Select Pfam ID(s) (e.g. PF00004) from the drop-down menu. Click on Count in the toolbar.

    The count should read 78 / 40336 genes, meaning that 78 of the 40336 potato genes in the dataset are annotated with a TIR domain.

  3. Specify the attributes to be included in the output (note that a number of attributes will already be selected by default). Click on Attributes in the left panel. Expand the GENE section. Deselect Transcript stable ID. Select Gene description.

  4. Now click on Results. The first 10 results are displayed by default. You can display all results by selecting View: All rows from the drop-down menu. If you prefer, you can also export as a CSV, TSV or XLS file by using the Export all results to option.

    Repeat the above for the Z. mays genes (B73 RefGen_v4) dataset.

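    For maize, your results should show 3 / 44303 genes containing a TIR domain, compared with 78 / 40336 for potato. That is roughly 0.007% of maize genes versus 0.19% of potato genes, which is consistent with the findings of Tarr and Alexander.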
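
The same query can also be scripted rather than clicked through. The following is a minimal Python sketch, not an official Ensembl recipe. It assumes the BioMart XML service is reachable at https://plants.ensembl.org/biomart/martservice, that the virtual schema is called plants_mart, that the datasets are named stuberosum_eg_gene and zmays_eg_gene, and that the Pfam filter and the two attributes are named pfam, ensembl_gene_id and description. None of these names come from the exercise text, so check them against the BioMart interface before relying on them. The helper functions are hypothetical names chosen for illustration.

```python
import urllib.parse
import urllib.request
from xml.etree import ElementTree as ET

# ASSUMPTION: service URL, schema, filter and attribute names are not given
# in the exercise; verify them in the BioMart web interface.
MART_URL = "https://plants.ensembl.org/biomart/martservice"
VIRTUAL_SCHEMA = "plants_mart"
PFAM_FILTER = "pfam"
ATTRIBUTES = ("ensembl_gene_id", "description")


def build_query(dataset, pfam_id="PF01582"):
    """Return a BioMart XML query for genes annotated with one Pfam domain."""
    query = ET.Element("Query", {
        "virtualSchemaName": VIRTUAL_SCHEMA,
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    ET.SubElement(ds, "Filter", {"name": PFAM_FILTER, "value": pfam_id})
    for attr in ATTRIBUTES:
        ET.SubElement(ds, "Attribute", {"name": attr})
    return ET.tostring(query, encoding="unicode")


def fetch(dataset):
    """Send the query to the (assumed) mart service and return the TSV text."""
    url = MART_URL + "?" + urllib.parse.urlencode({"query": build_query(dataset)})
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def count_genes(tsv_text):
    """Count distinct gene IDs in the first column of the TSV output."""
    ids = {line.split("\t")[0] for line in tsv_text.splitlines() if line.strip()}
    return len(ids)


def percent(hits, total):
    """Express hits as a percentage of total."""
    return 100.0 * hits / total
```

If the assumed names are correct, count_genes(fetch("stuberosum_eg_gene")) and count_genes(fetch("zmays_eg_gene")) should reproduce the two gene counts from the Count button, and percent(hits, total) puts them on a comparable scale given each dataset's total gene number.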