
Custom data, Demo

Demo: Upload small files

We have some patients who present with microcephaly and developmental delay. They all have large-scale deletions on chromosome 5.

We can turn them into a BED file and view them in the genome browser:

chr5 36821632 37091234 P1
chr5 36731476 36978306 P2
chr5 36908552 37108671 P3

You can add data from a Region in Detail page by clicking on the Custom tracks button at the left. Alternatively, go to a species homepage and click on Display your data in Ensembl.

A menu will appear:

The interface detects the file type automatically when you upload or attach a file. It can’t do this for pasted data, so we have to tell it what our file type is. It will give you an option where you can select BED.

Click Add data.

You should get to a dialogue box telling you your upload has been successful. Close the menu to go back to your region of interest.

To have a look at the file, click on Custom tracks.

If you’ve got an Ensembl account, you can save this data to your account. Accounts are free to set up and allow you to save configurations and data, and share with groups.

Demo: Attach URLs of large files

Larger files, such as BAM files generated by NGS, need to be attached by URL. I’ve put a BAM file of human chromosome 20 RNA-seq data online at: http://ftp.ebi.ac.uk/pub/databases/ensembl/training/emily_BAM/

Let’s take a look at the folder.

Here you can see a number of BAM files (.bam) with corresponding index files (.bam.bai). We’re interested in the BAM file GRCh38.20.illumina.merged.1.bam and its index file GRCh38.20.illumina.merged.1.bam.bai. When attaching a BAM file to Ensembl, there must be an index file in the same folder.
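The pairing rule can be checked before attaching anything. The sketch below assumes the naming convention seen in this folder, where the index has the BAM file's name with `.bai` appended. The helper names `index_url` and `bams_missing_index` and the file `unindexed.bam` are hypothetical, used only for illustration.

```python
def index_url(bam_url):
    """Return the URL where the index is expected: same folder, '.bai' appended."""
    if not bam_url.endswith(".bam"):
        raise ValueError(f"not a BAM URL: {bam_url}")
    return bam_url + ".bai"


def bams_missing_index(filenames):
    """Return the BAM files in a folder listing that have no matching .bam.bai file."""
    names = set(filenames)
    return sorted(
        n for n in names if n.endswith(".bam") and n + ".bai" not in names
    )


# Folder listing: the two files named above plus one made-up file without an index.
listing = [
    "GRCh38.20.illumina.merged.1.bam",
    "GRCh38.20.illumina.merged.1.bam.bai",
    "unindexed.bam",  # hypothetical, not in the real folder
]
print(bams_missing_index(listing))

base = "http://ftp.ebi.ac.uk/pub/databases/ensembl/training/emily_BAM/"
print(index_url(base + "GRCh38.20.illumina.merged.1.bam"))
```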

To attach the file, click on Custom tracks, then click on Add more data to add a new track.

We get to the same dialogue box as before. This time we’ll name our data Illumina reads.

Paste in the URL of the BAM file itself (http://ftp.ebi.ac.uk/pub/databases/ensembl/training/emily_BAM/GRCh38.20.illumina.merged.1.bam).

Since this is a URL pointing to a file, the interface is able to detect the “.bam” file extension, so it automatically labels the format as BAM. Click on Add data and close the menu.
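To illustrate why a file name can be recognised while pasted text cannot, here is a rough sketch of extension-based format detection. It is not Ensembl's actual code: the function `guess_format` and the extension table are assumptions made for this example, and the real interface may support a different set of formats.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative mapping only; assumed for this sketch, not taken from Ensembl.
FORMAT_BY_EXTENSION = {
    ".bam": "BAM",
    ".bed": "BED",
    ".vcf": "VCF",
    ".bw": "BigWig",
    ".bb": "BigBed",
}


def guess_format(url_or_name):
    """Guess a format from the file extension; return None when nothing matches."""
    path = urlparse(url_or_name).path
    extension = posixpath.splitext(path)[1].lower()
    return FORMAT_BY_EXTENSION.get(extension)


bam_url = (
    "http://ftp.ebi.ac.uk/pub/databases/ensembl/training/emily_BAM/"
    "GRCh38.20.illumina.merged.1.bam"
)
print(guess_format(bam_url))
# Pasted data has no file name, so there is no extension to inspect.
print(guess_format("chr5 36821632 37091234 P1"))
```

This is why the format had to be chosen by hand for the pasted BED lines in the first demo.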

To see this data, jump to a region on chromosome 20. Let’s go to the region of the CDH22 gene. Search for the gene and click on the location.

We can zoom in to see the sequence itself. Drag out boxes in the view to zoom in, until you see a view like this.

Demo: Track hub registry

Our regulatory data incorporates data from sources such as ENCODE, Blueprint, and Roadmap Epigenomics. To see the data directly from these sources, you can add track hubs.

You can search for track hubs to add in different ways:

- Search for track hubs in the Track Hub Registry and choose to add them to your genome browser of choice.
- Search the Track Hub Registry using the Track Hub Registry interface in Ensembl (there is a link from the homepage).

We will now add the track hub containing data from the Blueprint project.

You can add track hubs to view in Ensembl directly via the Track Hub Registry. Go to the Track Hub Registry homepage and search for blueprint.

There are two results for the Blueprint Hub, one for adding the track hub to GRCh37 and one for adding it to GRCh38, plus one RNA-seq alignment hub.

Alternatively, you can add track hubs by searching the Track Hub Registry through Ensembl. Click Custom tracks -> Track Hub Registry Search in any region view within Ensembl.

The search only returns track hubs for the species and assembly shown in the search box.

Search for blueprint.

Click Attach this hub in the search results page.

Track Hubs often contain vast amounts of data, which can slow Ensembl down, so only add them if you need them, and trash them when you are finished with them.

Go to Configure this Page to see that a new category has been added to your menu.

You can add tracks to the Region in Detail view using the matrix.