Ensembl Training

Find genes associated with array probes

Forrest et al. performed a microarray analysis of peripheral blood mononuclear cell gene expression in benzene-exposed workers (Environ Health Perspect. 2005 Jun; 113(6): 801–807). The microarray used was the human Affymetrix U133A/B (also called U133 plus 2) GeneChip. The top 25 up-regulated probe-sets were:

207630_s_at 221840_at 219228_at 204924_at 227613_at 223454_at 228962_at 214696_at 210732_s_at 212370_at 225390_s_at 227645_at 226652_at 221641_s_at 202055_at 226743_at 228393_s_at 225120_at 218515_at 202224_at 200614_at 212014_x_at 223461_at 209835_x_at 213315_x_at

(a) For the genes corresponding to these probe-sets, retrieve the Ensembl Gene and Transcript IDs as well as the HGNC symbols and descriptions.

(b) In order to analyse these genes for possible promoter/enhancer elements, retrieve the 2000 bp upstream of the transcripts of these genes.

(c) To study these human genes in mouse, identify their mouse orthologues and retrieve the genomic coordinates of these orthologues.

(a) Click New. Choose the ENSEMBL Genes database. Choose the Human genes (GRCh38) dataset.

Click on Filters in the left panel. Expand the GENE section by clicking on the + box. Select Input microarray probes/probesets ID list - Affy hg u133 plus 2 probeset ID(s) and enter the list of probeset IDs in the text box (either comma-separated or one ID per line).
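The probeset list in the exercise is space-separated, so it may need tidying before it is pasted into the text box. The following is a minimal sketch of one way to do that; the helper name `parse_probeset_ids` is hypothetical and not part of BioMart.

```python
import re

# The 25 probe-set IDs exactly as listed in the exercise.
RAW_IDS = """
207630_s_at 221840_at 219228_at 204924_at 227613_at 223454_at 228962_at
214696_at 210732_s_at 212370_at 225390_s_at 227645_at 226652_at 221641_s_at
202055_at 226743_at 228393_s_at 225120_at 218515_at 202224_at 200614_at
212014_x_at 223461_at 209835_x_at 213315_x_at
"""


def parse_probeset_ids(text):
    """Split on commas and/or whitespace, drop empty tokens,
    and remove duplicates while keeping the original order."""
    unique = {}
    for token in re.split(r"[,\s]+", text.strip()):
        if token:
            unique.setdefault(token, None)
    return list(unique)


ids = parse_probeset_ids(RAW_IDS)
print(len(ids))          # number of distinct probe-set IDs
print(",".join(ids))     # comma-separated form
print("\n".join(ids))    # one-ID-per-line form
```

Either printed form can be pasted into the filter text box.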

Count shows 26 genes match this list of probesets.

Click on Attributes in the left panel. Select the Features attributes page. Expand the GENE section by clicking on the + box. In addition to the default selected attributes, select Description. Expand the EXTERNAL section by clicking on the + box. Select HGNC symbol from the External References section and AFFY HG U133-PLUS-2 from the Microarray Attributes section.

Click the Results button on the toolbar. Tick the box Unique results only, then select View All rows as HTML or export all results to a file.

Your results should show that the 25 probes map to 27 Ensembl genes.
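The same query can, in principle, be submitted to BioMart as an XML document instead of through the web interface. The sketch below only builds such a document. The endpoint URL, the dataset name `hsapiens_gene_ensembl` and the internal filter and attribute names are assumptions that do not come from this exercise, so check them against the XML that BioMart itself generates for your query. The helper names `build_query` and `run_query` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
from xml.etree import ElementTree as ET

# Assumed service endpoint; verify before use.
MART_URL = "https://www.ensembl.org/biomart/martservice"


def build_query(dataset, filters, attributes, unique=True):
    """Return a BioMart-style XML query string (no network access)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        # Corresponds to the "Unique results only" tick box.
        "uniqueRows": "1" if unique else "0",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    header = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return header + ET.tostring(query, encoding="unicode")


def run_query(xml):
    """Send the query by POST and return the TSV text.
    Needs network access; not called in this sketch."""
    data = urlencode({"query": xml}).encode("utf-8")
    with urlopen(MART_URL, data=data) as response:
        return response.read().decode("utf-8")


# Two of the 25 probe-sets, for brevity; all names below are assumed.
xml = build_query(
    dataset="hsapiens_gene_ensembl",
    filters={"affy_hg_u133_plus_2": "207630_s_at,221840_at"},
    attributes=[
        "ensembl_gene_id",
        "ensembl_transcript_id",
        "description",
        "hgnc_symbol",
        "affy_hg_u133_plus_2",
    ],
)
print(xml)
```

Building the query separately from sending it makes it easy to inspect the XML before anything is submitted.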

(b) Don’t change Dataset and Filters – simply click on Attributes.

Select the Sequences attributes page. Expand the SEQUENCES section by clicking on the + box. Select Flank (Transcript) and enter 2000 in the Upstream flank text box. Expand the Header information section by clicking on the + box. Select, in addition to the default selected attributes, Description and Gene name.

Note: Flank (Transcript) will give the flanks for all transcripts of a gene with multiple transcripts. Flank (Gene) will give the flanks for one possible transcript in a gene (the most 5’ coordinates for upstream flanking).
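To make the difference between the two flank options concrete, here is a minimal sketch of how upstream flank coordinates could be derived from transcript coordinates. It assumes 1-based inclusive coordinates and a strand of 1 or -1. The transcript coordinates are made-up placeholders, and the functions are illustrative only, not BioMart's own implementation.

```python
def upstream_flank(start, end, strand, length=2000):
    """Return (flank_start, flank_end) for the region upstream of a feature.

    Upstream means before `start` on the forward strand (1) and
    after `end` on the reverse strand (-1).
    """
    if strand == 1:
        # Clip at the start of the chromosome.
        return max(1, start - length), start - 1
    if strand == -1:
        # Not clipped at the chromosome end, whose length is unknown here.
        return end + 1, end + length
    raise ValueError("strand must be 1 or -1")


def gene_upstream_flank(transcripts, strand, length=2000):
    """Flank measured from the most 5' transcript coordinate of a gene."""
    start = min(t[0] for t in transcripts)
    end = max(t[1] for t in transcripts)
    return upstream_flank(start, end, strand, length)


# Hypothetical gene on the forward strand with two transcripts.
transcripts = [(10_000, 15_000), (10_500, 15_000)]

per_transcript = [upstream_flank(s, e, 1) for s, e in transcripts]
per_gene = gene_upstream_flank(transcripts, 1)

print(per_transcript)  # one flank per transcript
print(per_gene)        # a single flank for the gene
```

With per-transcript flanks, alternative transcription start sites each get their own upstream region, which is usually what is wanted when looking for promoter elements.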

Click the Results button on the toolbar.

(c) You can leave the Dataset and Filters the same, and go directly to the Attributes section:

Click on Attributes in the left panel. Select the Homologues attributes page. Expand the GENE section by clicking on the + box. Select Gene name. Deselect Ensembl Transcript ID. Expand the ORTHOLOGUES [K-O] section by clicking on the + box. Select Mouse Ensembl Gene ID, Mouse Chromosome Name, Mouse Chr Start (bp) and Mouse Chr End (bp).

Click the Results button on the toolbar. Select View All rows as HTML or export all results to a file.

Your results should show that for most of the human genes at least one mouse orthologue has been identified.
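If the results are exported to a tab-separated file, the final check can be done programmatically. The sketch below counts how many human genes have at least one mouse orthologue. The column order and the rows are invented placeholders, not real BioMart output, and the function name is hypothetical.

```python
import csv
import io

# Placeholder export: human gene ID, gene name, mouse gene ID,
# mouse chromosome, mouse start, mouse end. All values are invented.
ROWS = [
    ["Human gene ID", "Gene name", "Mouse gene ID", "Chr", "Start", "End"],
    ["HUMAN_GENE_1", "NAME_1", "MOUSE_GENE_A", "1", "100", "200"],
    ["HUMAN_GENE_2", "NAME_2", "MOUSE_GENE_B", "2", "300", "400"],
    ["HUMAN_GENE_2", "NAME_2", "MOUSE_GENE_C", "2", "500", "600"],
    ["HUMAN_GENE_3", "NAME_3", "", "", "", ""],  # no orthologue found
]
SAMPLE_TSV = "\n".join("\t".join(row) for row in ROWS) + "\n"


def orthologues_by_gene(tsv_text, gene_col=0, ortho_col=2):
    """Map each human gene ID to the set of its mouse orthologue IDs."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader)  # skip the header row
    result = {}
    for row in reader:
        if not row:
            continue
        orthologues = result.setdefault(row[gene_col], set())
        if len(row) > ortho_col and row[ortho_col]:
            orthologues.add(row[ortho_col])
    return result


mapping = orthologues_by_gene(SAMPLE_TSV)
with_orthologue = [gene for gene, hits in mapping.items() if hits]
without_orthologue = [gene for gene, hits in mapping.items() if not hits]

print(f"{len(with_orthologue)} of {len(mapping)} genes have an orthologue")
print("No orthologue:", without_orthologue)
```

A gene can appear on several rows when it has more than one orthologue, which is why the rows are grouped by gene ID before counting.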