Exploring the Drosophila ZAP3 gene
(a) Find the Drosophila ZAP3 gene on Ensembl Metazoa. On which chromosome and which strand of the genome is this gene located?
(b) Where in the cell is the ZAP3 protein located?
(c) How many transcripts does it have? How long is its longest transcript? How long is the protein it encodes? How many exons does it have? Are any of the exons completely or partially untranslated?
(d) Have a look at the external references for the transcript. What is the name of the FlyBase transcript?
(e) Export the sequence of the gene, cDNA and protein in FASTA format.
(a) Go to the Ensembl Metazoa homepage. Select Drosophila from the species list and type ZAP3 in the search box. Click Go. Click on ZAP3.
The Drosophila ZAP3 gene is located on chromosome X on the reverse strand.
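The same lookup can be scripted instead of clicked through. The following is a minimal sketch, not the method used in this exercise. It assumes the Ensembl REST service at rest.ensembl.org serves this species under the name drosophila_melanogaster, and that the lookup/symbol endpoint returns the fields seq_region_name, start, end and strand. The function names are illustrative.

```python
import json
import urllib.request

# Assumption: the public Ensembl REST server also covers Ensembl Metazoa species.
REST_SERVER = "https://rest.ensembl.org"


def lookup_url(species, symbol):
    """Build the URL of the lookup/symbol endpoint for one gene symbol."""
    return f"{REST_SERVER}/lookup/symbol/{species}/{symbol}?content-type=application/json"


def describe_location(record):
    """Turn a lookup record into a short location summary.

    The REST service reports the strand as 1 (forward) or -1 (reverse).
    """
    strand = "forward" if record["strand"] == 1 else "reverse"
    return (
        f"chromosome {record['seq_region_name']}:"
        f"{record['start']}-{record['end']}, {strand} strand"
    )


def fetch_gene(species, symbol):
    """Query the REST service (needs network access) and return the parsed JSON."""
    with urllib.request.urlopen(lookup_url(species, symbol)) as response:
        return json.load(response)


# Usage (needs network access; species name is an assumption):
#   print(describe_location(fetch_gene("drosophila_melanogaster", "ZAP3")))
```

The network call is kept in its own function so that the URL building and the formatting can be checked offline.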
(b) Click on GO: cellular component in the side menu.
The protein is located in the nucleus.
(c) Click on Show transcript table.
There are four transcripts. The longest one is 6218 base pairs and the length of the encoded protein is 1885 amino acids.
Click on the Ensembl Transcript ID FBtr0308583 in the transcript table.
This transcript has eleven exons.
Click on Sequence - Exons in the side menu.
The first and last exons are partially untranslated (sequence shown in orange). This can also be seen in the transcript diagrams on the Gene summary and Transcript summary pages, where the boxes representing the first and last exons are partially unfilled and narrower.
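The reasoning behind "partially untranslated" can be expressed as a comparison between exon coordinates and the coding region. The sketch below is illustrative only: the function name is made up, and the coordinates are hypothetical placeholders, not the real ZAP3 exon positions.

```python
def classify_exon(exon_start, exon_end, cds_start, cds_end):
    """Classify one exon against the coding region.

    All coordinates are inclusive and given with start <= end.
    """
    # No overlap with the coding region at all.
    if exon_end < cds_start or exon_start > cds_end:
        return "fully untranslated"
    # Overlaps the coding region but extends beyond one of its ends.
    if exon_start < cds_start or exon_end > cds_end:
        return "partially untranslated"
    return "fully translated"


# Hypothetical three-exon transcript (NOT the real ZAP3 coordinates).
exons = [(100, 250), (400, 520), (700, 900)]
cds_start, cds_end = 180, 810

labels = [classify_exon(start, end, cds_start, cds_end) for start, end in exons]
print(labels)
# ['partially untranslated', 'fully translated', 'partially untranslated']
```

As in the ZAP3 transcript, the first and last exons of this toy example extend beyond the coding region and are therefore only partially translated.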
(d) Click on General identifiers in the side menu.
The FlyBase transcript name is ZAP3-RE.
(e) Click on the blue Export data button. Under Options for FASTA sequence, select Genomic: Unmasked, cDNA and Peptide sequence. Click Next>. Click on Text.
This returns three sequences (one transcript, one protein and one gene sequence):
>FBtr0308583 cdna:KNOWN_protein_coding
CCATCTCTAGTTTTCAAGCCAAATATGGCGAAGTTTTCATTATCAATTCGTTCAGATAGC
AAACTGAATTAGTGAAAGGTGGACGATTTTAAACGGAACGGAGCCAATCCGCCGGATTTT
GTGAAAGCCCAACCGAGCGGTACAAGTGTGTTATTCGCTAATAGGCTAATCAATCCAGTG
etc.
>FBtr0308583 peptide: FBpp0300807 pep:KNOWN_protein_coding
MWGQWQTAAAVAPTAALPPQPSVPPPLPDAPPPPPPSDATAGAGASSGASAPIVTTAAAV
TTAPGFNPYSSGQATGAGNPYQQYTAAQYAAMTPEQQYALQHHWQQWQTYQEEYAKWHAQ
YGEQYKREMAAAAATTTAVQGVPAPVVATPAAAPVPVVAPPGPQAYPVAHNYYQGVSASP
etc.
>X dna:chromosome chromosome:BDGP6:X:10071858:10082492:-1
CCATCTCTAGTTTTCAAGCCAAATATGGCGAAGTTTTCATTATCAATTCGTTCAGATAGC
AAACTGAATTAGTGAAAGGTGGACGATTTTAAACGGAACGGAGCCAATCCGCCGGATTTT
GTGAAAGCCCAACCGAGCGGTACAAGTGTGTTATTCGCTAATAGGCTAATCAATCCAGTG
etc.
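The exported text can be read back into a script for further work. The sketch below is a minimal plain-Python FASTA reader plus a parser for the location field of the genomic header. The function names are illustrative, and the example uses only the header and first sequence line shown above, so the sequence is truncated.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def parse_location(header):
    """Split the last header field, e.g. 'chromosome:BDGP6:X:10071858:10082492:-1'."""
    location = header.split()[-1]
    coord_system, assembly, region, start, end, strand = location.split(":")
    start, end = int(start), int(end)
    return {
        "coord_system": coord_system,
        "assembly": assembly,
        "region": region,
        "start": start,
        "end": end,
        "strand": int(strand),
        "length": end - start + 1,  # coordinates are inclusive
    }


# Header and first sequence line of the genomic record from the export above.
example = """>X dna:chromosome chromosome:BDGP6:X:10071858:10082492:-1
CCATCTCTAGTTTTCAAGCCAAATATGGCGAAGTTTTCATTATCAATTCGTTCAGATAGC
"""

sequences = read_fasta(example)
gene_header = next(iter(sequences))
location = parse_location(gene_header)
print(location["region"], location["strand"], location["length"])
# X -1 10635
```

The strand value of -1 in the header matches the reverse-strand location found in part (a), and the inclusive coordinates give a genomic span of 10635 base pairs.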