many others. For environmental variables, WorldClim is the site
most researchers use for global climate data, but most of the data on this
website is not useful for marine species. Other sources, such as Bio-ORACLE, NASA’s SeaWiFS and Aqua MODIS, and the General Bathymetric Chart of the Oceans (GEBCO), provide environmental variables for
marine environments. When selecting environmental variables, it is
important to choose those hypothesized to affect the species
directly, because this improves the model output. (Reviewing the scientific literature to select relevant and sufficient variables is recommended.) For example, BioCLIM data (used to model terrestrial
species) offer 12 variables, including annual mean temperature, minimum temperature of the coldest month, annual precipitation, and precipitation
of the wettest month. Selecting all available variables can reduce
the reliability of the model output (Elith et al., 2011). In this
paper, a case study of the whale shark’s current and future habitat suitability is provided. Whale shark records were obtained from the IUCN
website, and the environmental variables were selected from Bio-ORACLE. Only salinity, sea surface temperature (SST), and sea air temperature (SAT) were selected. Other variables available from
Bio-ORACLE include ice thickness, phosphate, nitrate, pH, and sea
ice concentration, among others.
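One optional way to support the literature-based selection described above is to check whether candidate variables carry largely the same information. The following is a minimal sketch, not the procedure used in the case study: the function names, the 0.7 cutoff, and the sample values are all assumptions made up for illustration, and real values would come from the environmental layers at the occurrence locations.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(variables, threshold=0.7):
    """List variable pairs whose absolute correlation meets the threshold.

    The 0.7 default is an assumed rule of thumb, not a value from the article.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Invented illustrative values (not real Bio-ORACLE data), one per location.
samples = {
    "sst": [28.1, 27.5, 29.0, 26.2, 25.4, 24.8],
    "sat": [27.6, 27.1, 28.4, 25.9, 25.0, 24.1],
    "salinity": [34.9, 35.4, 34.6, 35.1, 34.7, 35.3],
}

flagged = correlated_pairs(samples)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are strongly correlated (r = {r})")
```

A flagged pair is only a prompt to look again at the biological rationale for each variable. Whether to keep both remains a judgment based on the species and the literature.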
Cleaning the Data
Novice users of open-source data should pay attention to the
source and quality of the dataset obtained. A trustworthy source helps ensure that the final output is reliable. For example,
museum records often include the location of the facility but not
where the species was observed. For whale shark data, only records
that were from the Wildbook for Whale Sharks were used. This
database is maintained by local researchers who validate the data.
The quality of open-source data can be compromised by inaccurate
observations, inaccurate or missing georeferencing (i.e., a specific location
that can be mapped), duplicate records, and sources that cannot be validated. Implausible records should be eliminated (e.g., a whale shark
observation documented in the Himalayas). For the whale shark case
study, only georeferenced records from 2000 to 2015 were chosen
to match the dates of the environmental variables used.
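The cleaning rules above can be sketched as a small filter. This is an illustrative example under stated assumptions, not the workflow used for the case study: the `Record` fields, the `clean_records` function, and the sample rows are hypothetical, and a real download would need its own column names mapped onto them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One occurrence record; field names are assumptions for this sketch."""
    species: str
    latitude: Optional[float]   # None when the record is not georeferenced
    longitude: Optional[float]
    year: int
    source: str


def clean_records(records, trusted_source, first_year=2000, last_year=2015):
    """Keep validated, georeferenced, in-range records and drop duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec.source != trusted_source:
            continue  # keep only the source whose records are validated
        if rec.latitude is None or rec.longitude is None:
            continue  # not georeferenced
        if not (-90 <= rec.latitude <= 90 and -180 <= rec.longitude <= 180):
            continue  # coordinates that cannot exist
        if not (first_year <= rec.year <= last_year):
            continue  # outside the period covered by the environmental data
        key = (rec.species, rec.latitude, rec.longitude, rec.year)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical rows, invented to exercise each rule.
sample = [
    Record("Rhincodon typus", -21.9, 113.9, 2010, "Wildbook for Whale Sharks"),
    Record("Rhincodon typus", -21.9, 113.9, 2010, "Wildbook for Whale Sharks"),
    Record("Rhincodon typus", None, None, 2012, "Wildbook for Whale Sharks"),
    Record("Rhincodon typus", 16.8, -88.1, 1998, "Wildbook for Whale Sharks"),
    Record("Rhincodon typus", 9.5, 123.4, 2011, "museum collection"),
]

cleaned = clean_records(sample, "Wildbook for Whale Sharks")
print(f"{len(cleaned)} of {len(sample)} records kept")
```

The range check only rejects impossible coordinates. Catching a marine species recorded on land, such as the Himalayas example, would additionally require a land mask or a visual check on a map.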
MaxEnt has a simple design that does not require extensive hours
of training. Inputting both datasets is straightforward, and the software
provides the output in a timely manner. MaxEnt and a tutorial can be
downloaded from the software’s website (see http://biodiversityinformatics.amnh.org/open_source/maxent/). The tutorial, with its
visuals, provides explanations of the various features, such as the
“Jackknife” (Figure 1) and “Response curve” (Figure 2), that address
the significance of each environmental variable and the value for
the area under the curve (AUC), respectively (Phillips et al.,
2006). The output MaxEnt provides is a species distribution map
Since all the information needed is available and accessible online,
the exercise can be part of a formative assessment during a class
period. Teachers are encouraged to follow the process detailed in
the previous sections “Finding the Data” and “Modeling” to familiarize themselves with the software, find the dataset, and prepare
the student handout with questions (Table 3). The teacher should
Table 1. Connection to the AP Biology curriculum. These are the relevant essential knowledge and science practices from the AP Biology Curriculum Framework that are covered by this activity. After completing this activity, students should know the topics listed here. (All information obtained from College Board, 2015.)
Essential Knowledge
• 4.B.3 Distribution of local and global ecosystems changes over time.
Science Practices
• 1 The student can use representations and models to communicate scientific phenomena and solve scientific problems.
• 4 The student can plan and implement data collection strategies appropriate to a particular scientific question.
• 5 The student can perform data analysis and evaluate evidence.
Table 2. Connection to the Next Generation Science Standards. Below are the key concepts high school students will know after completing this activity. (All information obtained from National Research Council, 2012.)
Discipline: HS-LS2 Ecosystems
Disciplinary Core Ideas:
• LS2.A Interdependent Relationships in Ecosystems
• LS2.C Ecosystem Dynamics, Functioning, and Resilience