BioMart Convert IDs

BioMart is a very handy tool when you want to convert IDs between different databases. The following is a list of 29 IDs of human proteins from the NCBI RefSeq database:

NP_001218 NP_203125 NP_203124 NP_203126 NP_001007233 NP_150636 NP_150635 NP_001214 NP_150637 NP_150634 NP_150649 NP_001216 NP_116787 NP_001217 NP_127463 NP_001220 NP_004338 NP_004337 NP_116786 NP_036246 NP_116756 NP_116759 NP_001221 NP_203519 NP_001073594 NP_001219 NP_001073593 NP_203520 NP_203522

Generate a list that shows which Ensembl Gene IDs and which gene names these RefSeq IDs correspond to. Do these 29 proteins correspond to 29 genes?

Click New. Choose the ENSEMBL Genes database. Choose the Human genes (GRCh38) dataset.

Click on Filters in the left panel. Expand the GENE section by clicking on the + box. Select Input external references ID list - RefSeq peptide ID(s) and enter the list of IDs in the text box (either comma separated or one per line). HINT: You may have to scroll down the menu to see these. Count shows 11 genes. Remember that one gene may have multiple splice variants coding for different proteins, which is why these 29 proteins do not correspond to 29 genes.
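If you would rather prepare the ID list with a script before pasting it into the text box, the following is a minimal Python sketch. It takes the whitespace-separated list from this exercise, removes any accidental duplicates, and produces both a comma-separated and a one-per-line version. The helper name normalise_ids is made up for this illustration and is not part of BioMart.

```python
# RefSeq peptide IDs copied from the exercise text (whitespace separated).
RAW_IDS = """
NP_001218 NP_203125 NP_203124 NP_203126 NP_001007233 NP_150636 NP_150635
NP_001214 NP_150637 NP_150634 NP_150649 NP_001216 NP_116787 NP_001217
NP_127463 NP_001220 NP_004338 NP_004337 NP_116786 NP_036246 NP_116756
NP_116759 NP_001221 NP_203519 NP_001073594 NP_001219 NP_001073593
NP_203520 NP_203522
"""


def normalise_ids(raw):
    """Split on whitespace or commas, drop duplicates, keep first-seen order.

    Hypothetical helper for this exercise only.
    """
    seen = {}
    for token in raw.replace(",", " ").split():
        seen.setdefault(token, None)
    return list(seen)


ids = normalise_ids(RAW_IDS)
comma_separated = ",".join(ids)   # paste this form as a single line
one_per_line = "\n".join(ids)     # or paste this form as a list

print(len(ids), "unique IDs")
```

Either string can be pasted into the RefSeq peptide ID(s) text box.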

Click on Attributes in the left panel. Select the Features attributes page. Expand the External section by clicking on the + box. Select HGNC symbol and RefSeq Peptide ID from the External References section.

Click the Results button on the toolbar. Select View All rows as HTML or export all results to a file.
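The same query can also be sent to BioMart from a script instead of the web interface. The sketch below rests on several assumptions that you should check against the live BioMart before relying on it: that the service is reachable at https://www.ensembl.org/biomart/martservice and accepts the XML query as a POSTed query parameter, that the human dataset is called hsapiens_gene_ensembl, and that the internal names for the filter and attributes used above are refseq_peptide, ensembl_gene_id and hgnc_symbol. The XML button on the BioMart toolbar shows the exact query for your current selection, which is the safest source for these names. The function names are made up for this illustration.

```python
import csv
import io
import urllib.parse
import urllib.request
from xml.sax.saxutils import quoteattr

# Assumed endpoint and internal names; verify them with the XML button in BioMart.
MARTSERVICE_URL = "https://www.ensembl.org/biomart/martservice"
DATASET = "hsapiens_gene_ensembl"
FILTER_NAME = "refseq_peptide"
ATTRIBUTES = ["ensembl_gene_id", "hgnc_symbol", "refseq_peptide"]


def build_query(ids):
    """Build a BioMart XML query filtering on a list of RefSeq peptide IDs."""
    attrs = "".join(f"<Attribute name={quoteattr(a)} />" for a in ATTRIBUTES)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="TSV" header="0" '
        'uniqueRows="1" count="" datasetConfigVersion="0.6">'
        f'<Dataset name={quoteattr(DATASET)} interface="default">'
        f'<Filter name={quoteattr(FILTER_NAME)} value={quoteattr(",".join(ids))} />'
        f"{attrs}"
        "</Dataset></Query>"
    )


def fetch(ids):
    """Send the query to the (assumed) martservice endpoint and return TSV text."""
    data = urllib.parse.urlencode({"query": build_query(ids)}).encode("ascii")
    with urllib.request.urlopen(MARTSERVICE_URL, data=data, timeout=60) as resp:
        return resp.read().decode("utf-8")


def parse_rows(tsv_text):
    """Parse tab-separated results into (gene ID, gene name, RefSeq ID) tuples."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [tuple(row) for row in reader if row]


def count_genes(rows):
    """Count distinct Ensembl gene IDs (first column) in the parsed rows."""
    return len({row[0] for row in rows})
```

With network access, calling parse_rows(fetch(ids)) on the 29 RefSeq IDs and passing the result to count_genes should give the same gene count as the Count button, provided the assumed names match the current BioMart release.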