Exploring genes in Bread wheat and its cultivars
- In Ensembl Beta, search for the TraesCS2D02G248400 gene in the Triticum aestivum (Bread wheat) IWGSC assembly.
- How many transcripts are there?
- What is the definition of ‘Ensembl canonical’? What does this gene do in Bread wheat?
- Align the protein sequence of the Ensembl canonical transcript to cultivars Cadenza, Julius and Paragon.
- How many hits are found in each of the cultivars?
- Download your BLAST results. What information can you find in the files?
- In the BLAST alignment file, what sequence does Query refer to? What sequence does Sbjct refer to?
- Open the Species selector and enter bread wheat in the search box, or click on the Bread wheat icon in the species list at the bottom of the page. Select the IWGSC assembly and click on the green Add button. You should now see your selected species at the top of the page. Click on Find gene next to the species name, enter TraesCS2D02G248400 in the search bar and click Go. In the results, click on TraesCS2D02G248400 and select View in Genome Browser. You can find the number of transcripts in the genome browser view on the left, or in the track panel on the right.
TraesCS2D02G248400 has 2 transcripts.
Click on the three dots (…) next to the first transcript (TraesCS2D02G248400.2) in the track panel on the right. You can click on the question mark (?) next to Ensembl canonical to find a description.
- The Ensembl canonical transcript is a single, representative transcript identified at every locus. The gene codes for an oxygen evolving enhancer protein.
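The transcript count and the canonical flag can also be checked programmatically. The following is a minimal sketch, assuming the gene is available through the public Ensembl REST API at rest.ensembl.org (which may reflect a different release than Ensembl Beta) and that the `lookup/id` response lists transcripts under a `Transcript` key with an `is_canonical` flag. The function names `fetch_gene` and `summarise_transcripts` are made up for this example.

```python
import json
import urllib.request

# Assumed endpoint: Ensembl REST "lookup/id" with expand=1 to include transcripts.
REST_URL = "https://rest.ensembl.org/lookup/id/{gene_id}?expand=1"


def fetch_gene(gene_id):
    """Download the gene record as a dictionary (requires network access)."""
    request = urllib.request.Request(
        REST_URL.format(gene_id=gene_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarise_transcripts(gene):
    """Return (number of transcripts, list of canonical transcript IDs)."""
    transcripts = gene.get("Transcript", [])
    # A truthy "is_canonical" value marks the Ensembl canonical transcript.
    canonical = [t["id"] for t in transcripts if t.get("is_canonical")]
    return len(transcripts), canonical
```

Calling `summarise_transcripts(fetch_gene("TraesCS2D02G248400"))` should give a count and a canonical ID that you can compare with what the genome browser shows.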
- Stay in the transcript panel. Expand the Sequences option, select Protein sequence on the right and click on Blast whole sequence. This will take you to the BLAST app. Click on the blue Select species button and select the cultivars Cadenza, Julius and Paragon in Add species. In the BLAST app, click on the green Run button. Click on the blue Results button in the upper right-hand corner to view your results.
Cadenza has 3, Julius 10 and Paragon 3 hits.
Click on the Download icon next to the blue Results button in the upper right-hand corner. This will download your results in a compressed folder. Uncompress the folder and open the files using a plain text editor.
The folder contains 2 files: the output.txt file contains the BLAST alignments, including details about the sequence, alignment scores and statistics, and the sequence alignment. The table.tsv file contains the metadata of your BLAST query, including any BLAST parameters and the results table you saw in the browser.
Open the output.txt file. The Query (top sequence) refers to our sequence of interest (i.e. the protein in the IWGSC genome). The Sbjct (bottom sequence) refers to the homologue (the protein in the cultivar genome).
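To see how the Query and Sbjct rows fit together, the alignment text can be read with a short script. The following is a minimal sketch, assuming output.txt follows the usual BLAST pairwise text layout, where each alignment starts with a "Score =" line and the sequence rows look like "Query  1  MAAS...  60". The function names `parse_hsps` and `percent_identity` are made up for this example.

```python
import re

# Matches rows such as "Query  1   MAASLQ  6" or "Sbjct  12  MA-SLQ  16".
ALIGN_LINE = re.compile(r"^(Query|Sbjct)\s+(\d+)\s+(\S+)\s+(\d+)\s*$")


def parse_hsps(text):
    """Collect the aligned Query and Sbjct strings of every alignment block."""
    hsps = []
    current = None
    for line in text.splitlines():
        # Assumption: a "Score =" line opens each new alignment block.
        if line.lstrip().startswith("Score ="):
            current = {"Query": "", "Sbjct": ""}
            hsps.append(current)
            continue
        match = ALIGN_LINE.match(line)
        if match and current is not None:
            label, _start, segment, _end = match.groups()
            current[label] += segment  # join the wrapped rows
    return hsps


def percent_identity(hsp):
    """Percentage of alignment columns where Query and Sbjct are identical."""
    pairs = list(zip(hsp["Query"], hsp["Sbjct"]))
    if not pairs:
        return 0.0
    same = sum(1 for q, s in pairs if q == s and q != "-")
    return 100.0 * same / len(pairs)
```

For example, `parse_hsps(open("output.txt").read())` returns one dictionary per alignment, with the IWGSC protein under "Query" and the cultivar protein under "Sbjct".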