Displaying BAM files in Ensembl

BAM files (.bam) are the binary representation of the Sequence Alignment/Map (SAM) format. They contain sequence alignment data, and BAM is the recommended format for visualisation in several genome browsers.

The URL below points to chromosome 20 alignment data for one individual from the MXL population (sample NA19737, Mexican ancestry from Los Angeles) in the 1000 Genomes Project:

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/data/NA19737/alignment/NA19737.chrom20.SOLID.bfast.MXL.low_coverage.20101123.bam

Note: To display BAM files in Ensembl, you will also need the index file (.bam.bai):

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/data/NA19737/alignment/NA19737.chrom20.SOLID.bfast.MXL.low_coverage.20101123.bam.bai
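The two URLs above differ only by the trailing .bai, and both sit in the same directory. The short Python sketch below derives the index URL from the BAM URL and checks that the two share a directory. It assumes this naming pattern holds for the file you are attaching; the function names `index_url_for` and `same_directory` are made up for illustration and are not part of any Ensembl tool.

```python
from urllib.parse import urlsplit
import posixpath

BAM_URL = (
    "http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/data/NA19737/alignment/"
    "NA19737.chrom20.SOLID.bfast.MXL.low_coverage.20101123.bam"
)


def index_url_for(bam_url: str) -> str:
    """Return the BAM URL with '.bai' appended (assumed naming pattern)."""
    if not urlsplit(bam_url).path.endswith(".bam"):
        raise ValueError("expected a URL whose path ends in .bam")
    return bam_url + ".bai"


def same_directory(url_a: str, url_b: str) -> bool:
    """True if both URLs point to the same host and the same directory."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.netloc, posixpath.dirname(a.path)) == (
        b.netloc,
        posixpath.dirname(b.path),
    )


if __name__ == "__main__":
    bai_url = index_url_for(BAM_URL)
    print(bai_url)
    print(same_directory(BAM_URL, bai_url))
```

The sketch only manipulates strings; it does not download anything or confirm that the files exist on the server.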

Both files are available on the 1000 Genomes FTP site:

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/data/NA19737/alignment/

These data are aligned to the GRCh37 genome assembly.

(a) Can you display this BAM file in Ensembl to view the reads and their coverage?

(b) Zoom in around any region of human chromosome 20 to view the underlying nucleotide sequence of the reads and compare them, for example to see whether they all match or whether some have mismatches.

(c) Scroll along the genome (both upstream and downstream) to view different coverage levels in a given region of chromosome 20.

(a) This BAM file is aligned to GRCh37, so you will need to go to grch37.ensembl.org. Go to the human species page and then click on Display your data in Ensembl. Paste the URL of the BAM file (not the folder or the .bam.bai file) into the form; the BAM format will be selected automatically.

(b) This file contains alignments for human chromosome 20. If you zoom in around any region of this chromosome, you can view the underlying nucleotide sequence of the reads.

To change the track style, click on Configure this page on the left-hand side of the Location tab and choose between Normal, Unlimited and Coverage only.
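To make the comparison in (b) concrete, the sketch below shows what "matching" and "mismatching" reads mean in the simplest case. The reference string and the reads are invented toy data, not taken from the NA19737 file, and the reads are assumed to be ungapped (no insertions or deletions), which real alignments do not guarantee.

```python
def mismatch_positions(reference: str, read: str, offset: int) -> list[int]:
    """Return 0-based reference positions where an ungapped read differs."""
    segment = reference[offset:offset + len(read)]
    return [
        offset + i
        for i, (ref_base, read_base) in enumerate(zip(segment, read))
        if ref_base != read_base
    ]


# Toy data (hypothetical): a short reference and three reads with start offsets.
reference = "ACGTACGTAC"
reads = [("ACGTA", 0), ("GTACC", 2), ("CGTAC", 5)]

if __name__ == "__main__":
    for sequence, offset in reads:
        positions = mismatch_positions(reference, sequence, offset)
        status = "matches" if not positions else f"mismatches at {positions}"
        print(f"{sequence} @ {offset}: {status}")
```

Here only the second read differs from the reference, at position 6. In the browser, such a base would be one that disagrees with the reference sequence shown beneath the reads.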

(c) You can scroll along the genome (both upstream and downstream) to view different coverage levels in any given region of chromosome 20 based on the BAM file you have just added.
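Coverage at a position is the number of reads overlapping it. The sketch below computes per-base coverage for a small region from a list of read intervals. The intervals are invented toy data using 0-based, half-open coordinates, and the function name `coverage` is illustrative only; this is not how Ensembl draws its coverage track.

```python
def coverage(reads: list[tuple[int, int]], region_start: int, region_end: int) -> list[int]:
    """Per-base read depth over the half-open interval [region_start, region_end)."""
    depth = [0] * (region_end - region_start)
    for start, end in reads:
        # Clip each read to the region before counting it.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    return depth


# Toy data (hypothetical): three overlapping reads as (start, end) intervals.
toy_reads = [(0, 5), (2, 7), (5, 10)]

if __name__ == "__main__":
    print(coverage(toy_reads, 0, 10))
```

For these toy reads the depth is 1 at the edges of the region and 2 where reads overlap, which is the kind of variation you see when scrolling along chromosome 20.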