Finding genes linked to a GO term, Demo
We’re interested in finding all of the genes in sheep associated with the gene ontology term for milk secretion (GO:0007595), and their homology with other ungulates in Ensembl.
We will use BioMart to:
- Find their gene IDs and names
- Find their orthologues, and their location in goat and cow
- Export the cDNA sequence for the sheep genes
Click on BioMart in the top header of any www.ensembl.org page to go to www.ensembl.org/biomart/martview.
We need to build a query in BioMart using the four-step process.
Step 1: Select database and dataset.
We’re interested in finding sheep genes, so choose the Ensembl Genes database and then the sheep genes dataset.
Step 2: Select appropriate filters
We’re interested in finding information on a set of genes associated with the gene ontology term for milk secretion GO:0007595.
Using the Count button, we can see that there are 19 genes that have this GO term annotation.
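Steps 1 and 2 can also be expressed as a BioMart XML query, which is what the web interface builds behind the scenes. The following is a minimal sketch, not the official client: the dataset name `oaries_gene_ensembl` and the filter name `go` are assumptions about BioMart's internal names and may differ between Ensembl releases, so check them against the XML button on the results page. Setting `count="1"` asks the service for a count instead of rows, mirroring the Count button. The code only builds the request URL and does not contact the server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Public BioMart web service endpoint on the Ensembl site.
MARTSERVICE = "https://www.ensembl.org/biomart/martservice"


def build_count_query(dataset: str, go_term: str) -> str:
    """Return BioMart query XML asking how many genes carry a GO term."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "count": "1",  # request a count rather than result rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Assumed internal filter name for "GO Term Accession".
    ET.SubElement(ds, "Filter", {"name": "go", "value": go_term})
    # One attribute is listed; whether a count query needs it is not verified here.
    ET.SubElement(ds, "Attribute", {"name": "ensembl_gene_id"})
    return ET.tostring(query, encoding="unicode")


# Assumed dataset name for sheep; GO term taken from the exercise.
xml_query = build_count_query("oaries_gene_ensembl", "GO:0007595")
url = MARTSERVICE + "?" + urlencode({"query": xml_query})
print(url)
```

Opening the printed URL in a browser should return the same kind of count shown by the Count button, provided the assumed names match the current release.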
Step 3: Select appropriate attributes
We want to find out this information about these genes:
- Find their gene IDs and names
- Find their orthologues, and their location in goat and cow
- Export the cDNA sequence for the sheep genes
We can answer points 1 and 2 in a single query.
In the GENE section: Ensembl Gene stable ID and Transcript stable ID are selected by default. Unselect Transcript stable ID, and select Gene name. This will answer question 1.
In the ORTHOLOGUES section: Scroll down to find the Cow Orthologues and select the checkboxes for Cow gene stable ID, Cow gene name and Cow homology type. Scroll down to find the Goat Orthologues, and choose the same options. This will answer question 2. We will return to question 3 later.
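The attribute choices for questions 1 and 2 can be sketched as a scripted query too. This is an illustration under assumptions rather than a definitive recipe: the dataset name, the `go` filter and every attribute name in `ATTRIBUTES` are guesses at BioMart's internal identifiers for the checkboxes described above, and `build_query` and `run_query` are hypothetical helper names. `run_query` is defined but not called, so nothing is sent to the server when the block runs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query(dataset, filters, attributes):
    """Build BioMart query XML from a filter dict and an attribute list."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",       # include column names in the output
        "uniqueRows": "1",   # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


def run_query(xml_query, url="https://www.ensembl.org/biomart/martservice"):
    """POST the query to the BioMart service and return the text response."""
    data = urlencode({"query": xml_query}).encode()
    with urlopen(url, data=data, timeout=60) as resp:
        return resp.read().decode("utf-8")


# Assumed internal names for the attributes ticked in Step 3.
ATTRIBUTES = [
    "ensembl_gene_id",                        # Gene stable ID
    "external_gene_name",                     # Gene name
    "btaurus_homolog_ensembl_gene",           # Cow gene stable ID
    "btaurus_homolog_associated_gene_name",   # Cow gene name
    "btaurus_homolog_orthology_type",         # Cow homology type
    "chircus_homolog_ensembl_gene",           # Goat gene stable ID
    "chircus_homolog_associated_gene_name",   # Goat gene name
    "chircus_homolog_orthology_type",         # Goat homology type
]

xml_query = build_query("oaries_gene_ensembl", {"go": "GO:0007595"}, ATTRIBUTES)
print(xml_query)
# To fetch results: table = run_query(xml_query)
```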
Step 4: Get results
The results table shows a sub-sample of ten rows so that you can check you have chosen the correct attributes. You can also download the full data set.
Looking at the whole list, we can see that all of the sheep genes have an orthologue in cow, but one gene, PRL, is absent in goat.
What about the third point, exporting their cDNA sequences?
In the Attributes section there are some radio buttons. You can only choose attributes from one of these at a time. If we want Sequence data we have to do a separate query.
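With a longer list, the check for missing orthologues is easier to do on the downloaded TSV file than by eye. The sketch below assumes the export has a header row whose column names match the attribute labels; `genes_missing_orthologue` is a hypothetical helper, and the rows in `SAMPLE` are made-up placeholders, not real BioMart output. A gene can appear on several rows (one per orthologue), so it is reported only if none of its rows has a value in the chosen column.

```python
import csv
import io


def genes_missing_orthologue(tsv_text, gene_col, orthologue_col):
    """Return genes for which every row has an empty orthologue column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    has_orthologue = {}
    for row in reader:
        gene = row[gene_col]
        found = bool((row.get(orthologue_col) or "").strip())
        # Remember whether any row for this gene had an orthologue.
        has_orthologue[gene] = has_orthologue.get(gene, False) or found
    return sorted(g for g, ok in has_orthologue.items() if not ok)


# Placeholder rows for illustration only; identifiers are not real.
SAMPLE = (
    "Gene name\tCow gene stable ID\tGoat gene stable ID\n"
    "GENE_A\tCOW_1\tGOAT_1\n"
    "GENE_B\tCOW_2\t\n"
    "GENE_C\tCOW_3\tGOAT_3\n"
    "GENE_C\tCOW_4\tGOAT_3\n"
)

print(genes_missing_orthologue(SAMPLE, "Gene name", "Goat gene stable ID"))
# ['GENE_B']
print(genes_missing_orthologue(SAMPLE, "Gene name", "Cow gene stable ID"))
# []
```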
Step 3.2: Select attributes to answer Question 3
From the results page, click on Attributes in the left-hand navigation panel.
Step 4.2: Get results to answer Question 3
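Sequence results are returned in FASTA format, with the selected header attributes on the `>` line. As a rough sketch of how the export could be inspected afterwards, the code below reads FASTA text into a dictionary and reports the length of each cDNA. It assumes the header fields are separated by `|`; `parse_fasta` is a hypothetical helper, and `SAMPLE_FASTA` uses placeholder identifiers and sequences rather than real sheep data.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line to sequence."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Placeholder export for illustration only; not real sequences.
SAMPLE_FASTA = (
    ">GENE_A|TRANSCRIPT_A1\n"
    "ATGGCC\n"
    "TTAA\n"
    ">GENE_B|TRANSCRIPT_B1\n"
    "ATGAAATAG\n"
)

records = parse_fasta(SAMPLE_FASTA)
for header, sequence in records.items():
    gene_id, transcript_id = header.split("|")
    print(gene_id, transcript_id, len(sequence))
```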
What did you learn about the sheep genes in this exercise?
Could you learn these things from the Ensembl browser? Would it take longer?
For more details on BioMart, have a look at these publications:
Smedley, D. et al. (2009) BioMart – biological queries made easy. BMC Genomics 10:22.
Kinsella, R.J. et al. (2011) Ensembl BioMarts: a hub for data retrieval across taxonomic space. Database (Oxford) 2011:bar030.