1. What are the relationships of different ranaviruses based on
the sequence of their MCP genes?
2. Based on the MCP gene sequence, how closely related are
Ranavirus isolates from fishes and frogs?
3. Is local adaptation seen between different strains of Ambystoma
tigrinum virus isolated from different regions of the western
United States if we use the immediate early protein ICP- 4
(ORF13L in FV3)?
Student Learning Objectives
1. Obtain data from GenBank by using information from the literature (e.g., Eaton et al., 2007) and format these data in a manner
that can be used for further analysis.
2. Use nucleotide-nucleotide BLAST (nBLAST) searches to create
a data set for phylogenetic analysis.
3. Manipulate sequence data and ensure that it is in the proper
format to be used in other programs (e.g., MAFFT and
4. Use multiple free servers and/or programs for the manipulation of sequence data and the production of potentially novel
5. Use MEGA to produce phylogenetic trees and to accurately
describe the results.
6. Interpret sometimes confusing and unexpected phylogenetic
Data Collection Methods
Obtaining Sequence Data
The most efficient way to collect data for analysis is to obtain the
sequence of the ORF of interest in FASTA format directly
from GenBank and to use that sequence as the query in an nBLAST search.
1. To obtain the sequence data, find the ORF that you are looking for; then click the link “gene” beside it.
2. This will bring up a window at the bottom of the screen;
click the FASTA link beside it to bring up the FASTA formatted sequence in a new window.
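For classes that would rather script this step than click through GenBank, the same FASTA text can be requested from NCBI's E-utilities `efetch` service. The following is a minimal sketch, not part of the original protocol. The accession `XX000000` is a hypothetical placeholder to be replaced with an accession taken from the literature, and the optional coordinates are assumed to be the start and stop positions of the ORF of interest within that record.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, seq_start=None, seq_stop=None):
    """Build an efetch URL that returns a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # GenBank nucleotide database
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    # Optionally restrict the request to one region (e.g., a single ORF).
    if seq_start is not None and seq_stop is not None:
        params["seq_start"] = seq_start
        params["seq_stop"] = seq_stop
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_fasta(accession, seq_start=None, seq_stop=None):
    """Download the FASTA text (requires an internet connection)."""
    with urlopen(efetch_fasta_url(accession, seq_start, seq_stop)) as response:
        return response.read().decode("utf-8")


# "XX000000" is a placeholder accession, not a real record.
print(efetch_fasta_url("XX000000"))
```

Calling `fetch_fasta` with a real accession returns the same text that the FASTA link displays in the browser, which can then be pasted into BLAST as described below.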
The Basic Local Alignment Search Tool (BLAST) is a web-based search
engine that compares a query sequence against a database and returns similar sequences
(https://blast.ncbi.nlm.nih.gov/Blast.cgi). To perform a nucleotide-nucleotide BLAST, go to the website above, paste your
sequence into the box, and then click the “BLAST” button at the
bottom of the page. There is no need to change any of the settings
before BLASTing your sequence. The results of your nBLAST search
will have three sections: a graphic summary, descriptions, and alignments. All three will be open.
1. Scroll down to the “Description” section; this is where you
can extract the data for your alignment. Depending on the
question your students are investigating, different sequences
can be selected using the “tick boxes” on the right-hand side.
2. After you have selected all the sequences you are interested in
working with, click on the “Download” link and a submenu
will pop up; select “FASTA (aligned sequences)” and continue.
3. You will be asked to save or open the file “seqdump.txt”.
Open the file in Notepad (or another text editor).
4. The next step is to organize the data in Notepad. When the
data are “dumped” into the file, the result is a continuous line
5. Each sequence needs to start on its own
line, with a blank line between sequences, so that the file can easily
be read by the alignment software.
6. Make sure that the different sequences have easily identifiable
names (e.g., if you have two strains of FV3, use “FV3-1” and
“FV3-2” or something similar to easily differentiate between
the two). Now save the file so that the sequences can be analyzed further.
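The tidying described in steps 4 to 6 can also be scripted, which helps when students download many sequences. The following is a minimal sketch under stated assumptions: the downloaded file still contains line breaks that the text editor is simply not displaying, and the `labels` dictionary (a hypothetical mapping from the first word of each FASTA header to a short name) is filled in by hand. The accessions and sequences in the demonstration are made up.

```python
from collections import Counter


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def unique_names(names):
    """Append -1, -2, ... to any name that occurs more than once."""
    totals, seen, result = Counter(names), Counter(), []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            result.append(f"{name}-{seen[name]}")
        else:
            result.append(name)
    return result


def tidy_fasta(text, labels):
    """Rename sequences and put a blank line between records."""
    records = parse_fasta(text)
    first_words = [(h.split() or ["unnamed"])[0] for h, _ in records]
    names = unique_names([labels.get(word, word) for word in first_words])
    blocks = [f">{name}\n{seq}" for name, (_, seq) in zip(names, records)]
    return "\n\n".join(blocks) + "\n"


# Made-up headers and sequences, for illustration only.
raw = (">A1.1 Frog virus 3 isolate one\nATGC\nGGTA\n"
       ">A2.1 Frog virus 3 isolate two\nATGCGG\n"
       ">B1.1 Ambystoma tigrinum virus\nATGA\n")
labels = {"A1.1": "FV3", "A2.1": "FV3", "B1.1": "ATV"}
print(tidy_fasta(raw, labels))
```

Here the two FV3 records come out as “FV3-1” and “FV3-2”, following the naming suggestion in step 6. The sketch does not guard against a suffixed name colliding with a name that is already in the file.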
One of the best pieces of alignment software is MAFFT (Katoh
et al., 2017), which can either be downloaded to a computer or
used on a server. The server that I recommend using is at
https://mafft.cbrc.jp/alignment/server/. The interface is easy to use, and
the results can be downloaded in multiple formats. Since the data
set has been saved as a plain text file (.txt), it can be uploaded
directly to the server.
1. Once the data set has been uploaded, click the “Submit”
button just above the “Advanced Settings” heading. There is no
need to adjust any of the settings or use the “Advanced Settings” further down the page to get usable results.
2. Once the server has computed the alignment, you will get a
summary screen. The results can then be downloaded in several different formats and a tree can be visualized (see step 1
in the next section below).
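For those who prefer the downloaded version of MAFFT, a comparable default run can be scripted. The following is a minimal sketch, assuming MAFFT is installed and available as `mafft` on the system path; the function names and file names are hypothetical. The `--auto` option asks MAFFT to choose an alignment strategy itself, and MAFFT writes the finished alignment to standard output in FASTA format.

```python
import shutil
import subprocess
from pathlib import Path


def mafft_command(input_fasta):
    """Build the command line for a default MAFFT run."""
    return ["mafft", "--auto", input_fasta]


def run_mafft(input_fasta, output_fasta):
    """Align input_fasta with a local MAFFT install.

    Returns False without doing anything if MAFFT is not on the path.
    """
    if shutil.which("mafft") is None:
        return False
    result = subprocess.run(
        mafft_command(input_fasta),
        capture_output=True,  # the alignment arrives on standard output
        text=True,
        check=True,           # raise an error if MAFFT reports a failure
    )
    Path(output_fasta).write_text(result.stdout)
    return True


# Hypothetical file name, shown only to illustrate the command that is built.
print(" ".join(mafft_command("ranavirus_mcp.txt")))
```

A call such as `run_mafft("ranavirus_mcp.txt", "ranavirus_mcp_aligned.fasta")` would then produce a FASTA-formatted alignment that can be imported into MEGA in the same way as the server output.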
It is important to teach students how to move data between different
programs. The free program MEGA (Molecular Evolutionary Genetics
Analysis; http://www.megasoftware.net/) is an excellent starting
point for the estimation of nucleotide substitution models and tree
visualization. Currently, if you use a Windows-based computer,
MEGA 7 (Kumar et al., 2016) is available only for 64-bit versions
of the operating system. Therefore, if you use an older operating
system, I recommend that you download MEGA 6 (Tamura et al.,
2013). To import the aligned sequences into MEGA, it is easiest if
you use FASTA formatted data.
1. To obtain the FASTA formatted data, select the FASTA format link at the top of the MAFFT summary page. This will
open a new tab in your web browser.
2. To select all the data, right click and then choose “Select
All” in the pop-up menu.
3. Copy and paste the data into a new Notepad file.
4. This file can then be opened in MEGA 6 and converted
to the .meg format for further analysis.
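To show students what the conversion in step 4 does to the file, the basic layout of a .meg file can be written out by hand. The following is a minimal sketch of that layout (a `#mega` line, a `!Title` line, then each sequence under a `#name` line), not a replacement for MEGA's own converter. The function names and the demonstration data are hypothetical, and MEGA may still ask for the data type when the file is opened. The sketch also checks that every sequence has the same length, which is a quick way to catch a file that was never aligned.

```python
def read_fasta(text):
    """Read FASTA text into a dict of {name: sequence}, keeping input order."""
    sequences, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # Keep only the first word of the header so names have no spaces.
            parts = line[1:].split()
            name = parts[0] if parts else f"seq{len(sequences) + 1}"
            if name in sequences:
                raise ValueError(f"duplicate sequence name: {name}")
            sequences[name] = ""
        elif line and name is not None:
            sequences[name] += line
    return sequences


def fasta_to_mega(text, title):
    """Convert aligned FASTA text to a basic MEGA-format string."""
    sequences = read_fasta(text)
    if not sequences:
        raise ValueError("no sequences found")
    if len({len(seq) for seq in sequences.values()}) != 1:
        raise ValueError("sequences differ in length; is the file aligned?")
    lines = ["#mega", f"!Title {title};", ""]
    for name, seq in sequences.items():
        lines += [f"#{name}", seq, ""]
    return "\n".join(lines)


# Made-up aligned sequences, for illustration only.
aligned = ">FV3-1\nATG-C\n>ATV\nATGGC\n"
print(fasta_to_mega(aligned, "MCP alignment"))
```

Saving the returned text with a .meg extension gives a file with the same structure that MEGA produces when it converts a FASTA alignment.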