Convert IDs using BioMart

BioMart is a very handy tool when you want to convert IDs between different databases. The following is a list of 29 IDs of human proteins from the NCBI RefSeq database:
NP_001218, NP_203125, NP_203124, NP_203126, NP_001007233, NP_150636, NP_150635, NP_001214, NP_150637, NP_150634, NP_150649, NP_001216, NP_116787, NP_001217, NP_127463, NP_001220, NP_004338, NP_004337, NP_116786, NP_036246, NP_116756, NP_116759, NP_001221, NP_203519, NP_001073594, NP_001219, NP_001073593, NP_203520, NP_203522

Use BioMart in Ensembl to generate a list that shows which Ensembl gene IDs and gene names these RefSeq IDs correspond to. Do these 29 proteins correspond to 29 genes?

  1. Go to BioMart. You can find a shortcut to the tool on any Ensembl page in the navigation bar at the top of the page. Click New in the top left-hand menu if you need to start a new query. Choose the Ensembl Genes database. Choose the Human genes dataset.

  2. Click on Filters in the left panel. Expand the GENE section. Select Input external references ID list - RefSeq peptide ID(s) and enter the list of IDs in the text box (either comma separated or as a list).

HINT: You may have to scroll down the menu to see these.

The Count shows 10 genes. Remember that one gene may have multiple splice variants coding for different proteins, which is why these 29 proteins do not correspond to 29 genes.

  3. Click on Attributes in the left panel. Select the Features attributes page. Expand the EXTERNAL section. Select HGNC symbol and RefSeq peptide ID from the External References section.

  4. Click the Results button on the toolbar. Select View: All rows as HTML or export all results to a file.
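
The same conversion can also be scripted. The following is a minimal sketch, not an official Ensembl client. It assumes that the BioMart web service is reachable at the URL shown, that the Human genes dataset is called `hsapiens_gene_ensembl`, and that the filter and attributes chosen above have the internal names `refseq_peptide`, `ensembl_gene_id` and `hgnc_symbol`. None of these names appear in this exercise, so check them against the current BioMart documentation before relying on them. The helper names (`build_query`, `parse_tsv`, `fetch`, `summarize`) are made up for illustration.

```python
import urllib.parse
import urllib.request
from collections import defaultdict
from xml.sax.saxutils import quoteattr

# Assumption: public BioMart web service endpoint (not given in this exercise).
MARTSERVICE_URL = "https://www.ensembl.org/biomart/martservice"

# The 29 RefSeq peptide IDs listed in the exercise.
REFSEQ_IDS = [s.strip() for s in (
    "NP_001218, NP_203125, NP_203124, NP_203126, NP_001007233, NP_150636, "
    "NP_150635, NP_001214, NP_150637, NP_150634, NP_150649, NP_001216, "
    "NP_116787, NP_001217, NP_127463, NP_001220, NP_004338, NP_004337, "
    "NP_116786, NP_036246, NP_116756, NP_116759, NP_001221, NP_203519, "
    "NP_001073594, NP_001219, NP_001073593, NP_203520, NP_203522"
).split(",")]


def build_query(refseq_ids):
    """Build BioMart query XML: one filter (step 2) and three attributes (step 3)."""
    id_list = ",".join(i.strip() for i in refseq_ids if i.strip())
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1">'
        # Assumed internal names for dataset, filter and attributes.
        '<Dataset name="hsapiens_gene_ensembl" interface="default">'
        f'<Filter name="refseq_peptide" value={quoteattr(id_list)}/>'
        '<Attribute name="ensembl_gene_id"/>'
        '<Attribute name="hgnc_symbol"/>'
        '<Attribute name="refseq_peptide"/>'
        "</Dataset>"
        "</Query>"
    )


def parse_tsv(text):
    """Group RefSeq peptide IDs by (gene ID, gene name) from three-column TSV."""
    genes = defaultdict(set)
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 3 or not fields[0]:
            continue  # skip blank lines and anything that is not a result row
        gene_id, symbol, peptide = fields
        genes[(gene_id, symbol)].add(peptide)
    return genes


def fetch(refseq_ids):
    """Send the query to the (assumed) web service and return the raw TSV text."""
    params = urllib.parse.urlencode({"query": build_query(refseq_ids)})
    with urllib.request.urlopen(MARTSERVICE_URL + "?" + params, timeout=60) as resp:
        return resp.read().decode("utf-8")


def summarize(refseq_ids):
    """Print one line per gene with the proteins that map to it."""
    genes = parse_tsv(fetch(refseq_ids))
    for (gene_id, symbol), peptides in sorted(genes.items()):
        print(gene_id, symbol, ",".join(sorted(peptides)), sep="\t")
    print(len(genes), "genes for", len(refseq_ids), "RefSeq IDs")
```

Calling `summarize(REFSEQ_IDS)` needs network access. The number of genes it prints can be compared with the Count reported in the BioMart interface; several proteins grouped under one gene reflect the splice variants mentioned above.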