Exploring the MYH9 gene in Human
- In Ensembl Beta, find the MYH9 gene in the human GRCh38 (hg38) genome assembly.
- On which chromosome and which strand of the genome is this gene located?
- How many transcripts (splice variants) are there and how many of these are protein-coding?
- What’s the definition of MANE Select? How many coding exons does the MYH9 MANE Select transcript have?
- What sequences are available to download?
- Let’s explore the protein information that is available for the MANE Select transcript. Open the Entity Viewer.
- How long is the protein sequence (in aa) and what is the Ensembl protein ID?
- Open the Protein information panel under the ‘Gene function’ tab. What protein domain annotations are available for this transcript?
- What is the protein function according to UniProtKB?
- We’re going to explore homologues of the MYH9 gene. Open the ‘Gene relationships’ tab.
- Which species has the most similar sequence in terms of protein similarity? What is the corresponding gene ID?
- How many transcripts does the homologue have?
Open the Species selector and enter human in the search box, or click on the Human icon in the species list at the bottom of the page. Select the reference (GRCh38.p14) assembly and click on the green Add button. You should now see your selected species at the top of the page.
Click on Find gene next to the species name, enter MYH9 in the search bar and click Go. In the results, click on MYH9 ENSG00000100345.23 and select Genome Browser. Click on the gene in the genome browser or click on the three dots (…) next to MYH9 protein_coding in the track panel on the right.
- The gene is located on chromosome 22 on the reverse strand.
In the track panel, click on +22 transcripts to expand the list of transcripts.
- MYH9 has 23 transcripts. 6 of these are protein_coding (this includes the MANE Select transcript) and 3 are defined as protein_coding_CDS_not_defined (a transcript that belongs to a protein_coding gene but does not contain an open reading frame).
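The same tally can be cross-checked outside the browser. The sketch below assumes the public Ensembl REST API (`https://rest.ensembl.org`) and its `lookup/id` endpoint with `expand=1`, which returns a gene record containing a `Transcript` list where each entry carries a `biotype`. The REST service reflects the current Ensembl release rather than Ensembl Beta, so its counts may differ from those above. The function names and the small `sample` record are illustrative only and are not real MYH9 data.

```python
import json
import urllib.request
from collections import Counter

REST_SERVER = "https://rest.ensembl.org"  # assumption: public Ensembl REST service


def fetch_gene(stable_id):
    """Fetch a gene record with its transcripts as a dict (needs network access)."""
    url = f"{REST_SERVER}/lookup/id/{stable_id}?expand=1"
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_biotypes(gene_record):
    """Count the transcripts of a gene record by biotype."""
    return Counter(t["biotype"] for t in gene_record.get("Transcript", []))


# Hypothetical record in the assumed response shape, used so the example runs offline.
sample = {
    "id": "GENE_X",
    "Transcript": [
        {"id": "TX1", "biotype": "protein_coding"},
        {"id": "TX2", "biotype": "protein_coding"},
        {"id": "TX3", "biotype": "retained_intron"},
        {"id": "TX4", "biotype": "protein_coding_CDS_not_defined"},
    ],
}

counts = count_biotypes(sample)
print(sum(counts.values()), dict(counts))

# With network access, the real gene could be tallied like this:
# counts = count_biotypes(fetch_gene("ENSG00000100345"))
```

Keeping the counting step separate from the download makes it easy to test on a saved response and to re-run when a new release changes the transcript set.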
In the track panel, click on the three dots (…) next to the MANE Select transcript to open the transcript panel and find out more details.
- MANE Select is a single default transcript per human gene that is representative of biology, well supported, expressed and highly conserved. The MYH9 MANE Select transcript has 40 coding exons.
In the transcript panel, expand the Download option.
- You can download the genomic sequence and exons of the gene, and the genomic sequence, cDNA (the sequence of the spliced exons of a transcript, expressed in DNA notation with T rather than U, representing the coding or sense strand), exons, protein sequence and coding sequence (CDS) of the transcript.
- To open the Entity Viewer, scroll to the bottom of the page and click on the Entity Viewer icon. Alternatively, if you are in the genome browser view, click on the first MYH9 transcript and click on View in Entity Viewer in the pop-up menu. Once in the Entity Viewer, click on the MANE Select transcript (this will open a grey information panel underneath).
- The protein is 1,960 aa long. The Ensembl protein ID is ENSP00000216181.6.
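Two of the download types described above can be related to the protein length with simple arithmetic. The sketch below assumes a complete CDS that begins at the start codon and ends with a stop codon, so a protein of n residues corresponds to 3 × (n + 1) nucleotides. It also illustrates the T-for-U convention used for cDNA. The function names and the short example sequence are made up for illustration.

```python
def cds_length_for_protein(n_residues, include_stop=True):
    """Expected CDS length in nucleotides for a protein of n_residues amino acids.

    Assumes a complete CDS; the stop codon adds one codon that is not translated.
    """
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


def cdna_to_mrna(cdna):
    """Rewrite a cDNA sequence (DNA notation, T) in RNA notation (U)."""
    return cdna.upper().replace("T", "U")


print(cds_length_for_protein(1960))  # 5883
print(cdna_to_mrna("ATGGCTTAA"))     # AUGGCUUAA
```

If the downloaded CDS has a different length from this estimate, the transcript may have an incomplete CDS, which is worth checking before drawing conclusions.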
In the Entity Viewer, switch to the Gene function tab, then click on the three dots (…) next to the MANE Select transcript to open the transcript panel and find out more details.
- Protein domains are distinct functional and/or structural units in a protein, which are usually responsible for a particular function or interaction, contributing to the overall role of a protein. There are several methods and algorithms that can be used to classify protein domains into families and functional sites. For the ENSP00000216181.6 protein, PANTHER and Pfam annotations are available.
In the Gene function tab, click on the UniProtKB/Swiss-Prot P35579 link to open the corresponding entry in UniProt.
- According to UniProtKB, the function of the protein is as follows: _Cellular myosin that appears to play a role in cytokinesis, cell shape, and specialized functions such as secretion and capping. Required for cortical actin clearance prior to oocyte exocytosis; promotes cell motility in conjunction with S100A4; and during cell spreading, plays an important role in cytoskeleton reorganization, focal contact formation (in the margins but not the central part of spreading cells), and lamellipodial retraction; this function is mechanically antagonized by MYH10._
- In the Entity Viewer, switch to the Gene relationships tab to view any homologues of the MYH9 gene. You can sort the homology table by clicking on a column header. The % Protein similarity is the percentage of identical amino acid residues aligned against each other. Click on the % Protein similarity header once to sort the table in descending order.
- The Pan troglodytes (Chimpanzee) homologue is the most similar in terms of protein similarity. The gene ID of the homologue is ENSPTRG00000014309.6.
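The percentage described above can be illustrated with a small calculation on a pairwise alignment. This is a minimal sketch, not Ensembl's own implementation: it assumes both sequences are already aligned to the same length with `-` as the gap character, and it uses the number of residues in the query sequence as the denominator, which is one of several possible conventions. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_target, gap="-"):
    """Percentage of query residues that are aligned to an identical residue."""
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    # Denominator: residues in the query, ignoring gap columns (an assumption).
    query_length = sum(1 for residue in aligned_query if residue != gap)
    if query_length == 0:
        return 0.0
    matches = sum(
        1
        for q, t in zip(aligned_query, aligned_target)
        if q == t and q != gap
    )
    return 100.0 * matches / query_length


# Toy alignment: 4 of the 5 query residues match, one column is a gap in the query.
print(percent_identity("MKT-LV", "MKSALV"))  # 80.0
```

Because the denominator is the query length, the value can differ depending on which of the two genes is treated as the query.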
Click on the gene ID ENSPTRG00000014309.6 to open the corresponding homologue in the Entity Viewer. Count the number of transcripts you can see under the Transcripts tab.
- The chimp homologue has 4 transcripts.
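The identifiers met in this exercise (ENSG00000100345.23, ENSP00000216181.6 and ENSPTRG00000014309.6) follow the Ensembl stable ID pattern: `ENS`, an optional species prefix (none for human, `PTR` for chimpanzee), a feature-type letter, an 11-digit number and an optional version after the dot. The sketch below parses that pattern. It is deliberately limited to the gene, transcript, protein and exon letters, and the names `StableId` and `parse_stable_id` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Only the feature types used in this exercise plus exons; others are not handled.
FEATURE_TYPES = {"G": "gene", "T": "transcript", "P": "protein", "E": "exon"}

STABLE_ID = re.compile(
    r"^ENS(?P<species>[A-Z]*?)(?P<feature>[GTPE])(?P<number>\d{11})"
    r"(?:\.(?P<version>\d+))?$"
)


class StableId(NamedTuple):
    species_prefix: str      # empty string for human
    feature: str             # "gene", "transcript", "protein" or "exon"
    number: str              # 11-digit number, kept as text to preserve zeros
    version: Optional[int]   # None when the ID is unversioned


def parse_stable_id(text):
    """Split an Ensembl stable ID into its parts, or raise ValueError."""
    match = STABLE_ID.match(text)
    if match is None:
        raise ValueError(f"not a recognised stable ID: {text!r}")
    version = match.group("version")
    return StableId(
        species_prefix=match.group("species"),
        feature=FEATURE_TYPES[match.group("feature")],
        number=match.group("number"),
        version=int(version) if version else None,
    )


for stable_id in ("ENSG00000100345.23", "ENSP00000216181.6", "ENSPTRG00000014309.6"):
    print(parse_stable_id(stable_id))
```

Separating the version from the ID is useful because the version number can change between releases while the ID itself stays the same.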