Use of Virtual Ribosome: Practice and worked example

This assignment supports Bioinformatics II

Purpose

Gain practice with Virtual Ribosome and reading code by working through these examples.

Resource

Example

To use Virtual Ribosome, copy and paste your sequence into its text box. Virtual Ribosome expects the sequence in a particular format, the FASTA format. In FASTA format, the first line begins with “>” followed by text that identifies the sequence. The sequence itself follows on the next line with no hard returns (one long line). For example, the text written below is in FASTA format and can be pasted directly into the box for Virtual Ribosome:

> Sequence 1
ATCCGCGTGCAA
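
If you prefer to prepare the text with a script rather than by hand, the FASTA layout described above can be sketched in a few lines of Python. This is a minimal illustration only; the function names `to_fasta` and `read_fasta` are made up for this handout and are not part of Virtual Ribosome.

```python
def to_fasta(name, sequence):
    """Return a FASTA record: a '>' header line, then the sequence on one line."""
    # Remove spaces and line breaks so the sequence is one unbroken line.
    cleaned = "".join(sequence.split()).upper()
    return f">{name}\n{cleaned}"


def read_fasta(text):
    """Split a single FASTA record into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA text must begin with a '>' header line")
    header = lines[0][1:].strip()          # drop the '>' and surrounding spaces
    sequence = "".join(lines[1:]).upper()  # rejoin any wrapped sequence lines
    return header, sequence


record = to_fasta("Sequence 1", "ATCC GCGT GCAA")
print(record)
print(read_fasta(record))
```

The first function strips any stray spaces or returns from the sequence, which is the most common reason a pasted sequence is rejected.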

Here’s the screen from Virtual Ribosome after I copied and pasted the sequence in FASTA format into the text box (Fig. 1). Note also that I clicked on the “Reading frame” option to bring up the options for reading frames. Select “Plus 1,2,3” to get all three reading frames.

virtual_ribosome_example.png

Figure 1. Screenshot of Virtual Ribosome menu.

When you are ready, click on the Submit query button. A new window in your browser will show up indicating that your query is in process.

Upon completion, here is the output from Virtual Ribosome (plus 1,2,3 reading frames):

VIRTUAL RIBOSOME
----------------
Translation table: Standard SGC0 

>sequence 1 - reading frame(s): plus

      P  R  A  
     S  A  C  
    I  R  V  Q  
5' ATCCGCGTGCAA 12
   ............ 

 

Question 1. Looking at the output from Virtual Ribosome above, what polypeptide sequence corresponds to RF1? To RF2? To RF3?

Hint: “RF1” refers to “reading frame 1.”

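Answer: RF1 = IRVQ, RF2 = SAC, RF3 = PRA. Reading frame 1 is the row closest to the DNA sequence, reading frame 2 is the row above it, and reading frame 3 is the top row.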
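
To check the Virtual Ribosome output by hand, the same three plus-strand reading frames can be reproduced with a short Python sketch. It assumes the standard genetic code (the table Virtual Ribosome reports as “Standard SGC0”); the function name `translate` is hypothetical, chosen for this handout, and this is not how Virtual Ribosome itself is implemented.

```python
# Build the standard genetic code from the usual TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES),
        AMINO_ACIDS,
    )
}


def translate(dna, frame=1):
    """Translate the plus strand starting at reading frame 1, 2, or 3."""
    dna = dna.upper()
    start = frame - 1
    protein = []
    # Step through complete codons only; leftover bases at the end are ignored.
    for i in range(start, len(dna) - 2, 3):
        protein.append(CODON_TABLE[dna[i:i + 3]])  # '*' marks a stop codon
    return "".join(protein)


sequence = "ATCCGCGTGCAA"
for frame in (1, 2, 3):
    print(f"RF{frame}: {translate(sequence, frame)}")
# RF1: IRVQ
# RF2: SAC
# RF3: PRA
```

Shifting the start position by one base changes every codon, which is why the three frames give three unrelated peptides from the same DNA.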

Question 1a. Here’s another “mystery sequence.”

>Mystery sequence 2
ATGTAAAAATACATTTGTTCCATAAGATAACCTCCGTGAAGGTAGAGACTTGGTCTGTTTTGTTTATTGC

  • Run the mystery sequence through Virtual Ribosome, using the same settings (plus 1,2,3 reading frames).

  • Run a BLAST search on the protein predicted from each of the three reading frames. What is your best guess as to the identity of this DNA sequence (e.g., is it from a gene? Which gene?)?

Question 2. What are those single letters P R A and I R V Q? Write out the amino acid sequence in 3-letter code.

They are short codes for particular amino acids. For example, “A” stands for the amino acid “alanine” (the three-letter code is “Ala”). The three-letter and single-letter codes for the 20 common amino acids are shown in Table 1.

Table 1. Common amino acids in biology

Amino acid      3-letter  1-letter
Alanine         Ala       A
Arginine        Arg       R
Asparagine      Asn       N
Aspartic acid   Asp       D
Cysteine        Cys       C
Glutamic acid   Glu       E
Glutamine       Gln       Q
Glycine         Gly       G
Histidine       His       H
Isoleucine      Ile       I
Leucine         Leu       L
Lysine          Lys       K
Methionine      Met       M
Phenylalanine   Phe       F
Proline         Pro       P
Serine          Ser       S
Threonine       Thr       T
Tryptophan      Trp       W
Tyrosine        Tyr       Y
Valine          Val       V
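
The lookup in Table 1 can also be written as a small Python dictionary, which is handy for longer peptides. This is a sketch for practice; the names `ONE_TO_THREE` and `to_three_letter` are made up for this handout, and the example uses the RF2 peptide so that Question 2 is still yours to answer.

```python
# One-letter to three-letter codes, copied from Table 1.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(peptide, separator="-"):
    """Convert a one-letter peptide string to three-letter codes."""
    # A letter that is not in Table 1 raises KeyError, which flags typos.
    return separator.join(ONE_TO_THREE[letter] for letter in peptide.upper())


print(to_three_letter("SAC"))  # Ser-Ala-Cys
```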

While you are working on this, here are some additional hints about things you may encounter.

Note: if you get a server error, the error comes from the Virtual Ribosome site, not from Chaminade. It may be because the server limits the number of searches from a particular IP address and too many of us are searching within a limited time frame (e.g., 24 hours for Virtual Ribosome). Solution? Wait and try again at a later time, or better, move to your favorite off-campus site and run the work through their WiFi.

 

What to do, what to turn in

  1. Generate your own random 1000 nt long DNA sequence at https://www.bioinformatics.org/sms2/random_dna.html
    • You can use this sequence for ORF Finder exercise, too, or generate a new one for each exercise
  2. Paste your sequence in FASTA format into ORF Finder, set the minimum ORF length to 30, then check for the presence of ORFs.
    • Also copy and paste this same random sequence as text into the text submit box, not as a screenshot (so I can recreate your work).
  3. Capture a screenshot like the one in Figure 2. If no ORFs are found, report that.
    • Convert the image to PDF
  4. Select one ORF and run a BLAST search on it. Report your results.
  5. Your responses to other questions in this handout should be placed in your notebook.
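
If you want to see what steps 1 and 2 are doing, the sketch below generates a random DNA sequence and scans the three plus-strand frames for ORFs of at least 30 nt. It rests on stated assumptions: an ORF is taken to run from an ATG to the next in-frame stop codon, the stop codon is counted in the length, and only the plus strand is searched. ORF Finder has its own options and may report different or additional ORFs. The function names `random_dna` and `find_orfs` are hypothetical, and this sketch does not replace the web tools for what you turn in.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def random_dna(length, seed=None):
    """Return a random DNA string; pass a seed to get the same sequence again."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def find_orfs(sequence, min_length=30):
    """Return (frame, start, end) for ATG-to-stop ORFs on the plus strand.

    start is 0-based, end is exclusive, and the stop codon is included.
    Assumption: only ATG counts as a start codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open an ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_length:
                    orfs.append((frame + 1, start, i + 3))
                start = None                   # close the ORF, keep scanning
    return orfs


dna = random_dna(1000, seed=1)
print(">random sequence")
print(dna)
for frame, start, end in find_orfs(dna, min_length=30):
    print(f"frame {frame}: nt {start + 1}-{end} ({end - start} nt)")
```

Because a random sequence hits a stop codon about once every 21 codons on average, long ORFs are rare, so finding only short ones (or none) is a reasonable result to report.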