Using Bioinformatics Tools and Databases in AP® Biology
Technical Support Specialist, Live Materials
As students move into research that uses molecular techniques, a solid understanding of bioinformatics tools becomes invaluable. The AP® Biology course should give students a basic understanding of the tools used in molecular research and how they apply to a variety of fields within the life sciences. We will look at two bioinformatics tools that any student or researcher with a computer and an Internet connection can access and use.
Nucleotide analysis using BLAST®
The National Center for Biotechnology Information (NCBI) maintains public molecular biology databases and develops software tools that researchers use to analyze genomic data. One of these tools is BLAST® (Basic Local Alignment Search Tool), which searches NCBI's sequence databases. Researchers can choose from several BLAST® algorithms depending on the type of sequence being analyzed and their specific research question.
To use BLAST®, the user submits a sequence of interest (DNA, RNA, or an amino acid chain) for analysis by a selected algorithm. The algorithm compares the submitted sequence with the sequences in the chosen database. BLAST® then returns a ranked list of the database sequences that most closely match the submitted sequence, along with scores that indicate how strong each match is. This tool can be used to link a variety of topics within the AP® Biology curriculum, such as evolution, protein structure and function, and some aspects of ecology and environmental science.
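The idea of comparing a query against a database and ranking the matches can be illustrated with a small sketch. This is not how BLAST® itself is implemented: BLAST® uses a faster heuristic, while the sketch below uses a plain Smith-Waterman local alignment score as a stand-in. The function names, scoring values, and the toy sequences are all assumptions made up for illustration.

```python
def local_alignment_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score for two sequences.

    The scoring values are arbitrary illustrative defaults, not BLAST settings.
    """
    previous = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        current = [0]
        for j, s in enumerate(subject, start=1):
            diagonal = previous[j - 1] + (match if q == s else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


def rank_hits(query, database):
    """Rank database entries from the best to the worst local alignment score."""
    scored = [(name, local_alignment_score(query, seq))
              for name, seq in database.items()]
    return sorted(scored, key=lambda hit: hit[1], reverse=True)


# Made-up toy sequences for illustration only (not real database records).
toy_database = {
    "toy_seq_A": "TTGACCGTAGGCTAAC",
    "toy_seq_B": "GGGACCGTTGGCTTTT",
    "toy_seq_C": "AAAAAAAATTTTTTTT",
}
query = "ACCGTAGGCT"

hits = rank_hits(query, toy_database)
for name, score in hits:
    print(name, score)
```

In this toy example the entry that contains the query exactly ranks first, the entry with one mismatch ranks second, and the unrelated entry ranks last, which mirrors the ranked hit list a student sees on a BLAST® results page.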
Wider availability of tools like BLAST® allows AP® Biology students to study cladistics and phylogeny at the molecular level. In the AP® Biology Investigative Labs: An Inquiry-Based Approach manual, Investigation 3 teaches students the basic functions of BLAST® and allows for open-ended inquiry once they have mastered the program. The investigation uses sequences that have been preloaded for students.
Identifying organisms has grown in importance as we monitor the effects of a changing climate and attempt to preserve biodiversity in our planet’s most compromised ecosystems. Several important molecular techniques can be used to analyze genomic information in collected samples. You can take the AP® investigation further and provide a molecular techniques component to your evolution unit with Carolina’s Using DNA Barcodes to Identify and Classify Living Things kits.
With these kits, students collect samples of biological material, extract DNA, and perform PCR and electrophoresis on the samples. You have the option to send samples to a sequencing service for a small additional fee. Once sequences are obtained, they can be analyzed using BLAST®. Students are then able to build phylogenetic trees using information obtained from their samples.
Amino acid sequence comparison activity using UniProt
UniProt is a freely accessible database of protein sequence and function. It contains information derived from primary literature sources and large sequencing projects. The database continues to grow as more sequencing projects are completed.
The open inquiry activity for Investigation 3 in the AP® Biology Investigative Labs: An Inquiry-Based Approach manual suggests that students create a phylogenetic tree for a protein found in a variety of organisms of their choosing. Students then explain, using bioinformatics, how the organisms in the group are related to one another at the protein level.
The manual gives a list of suggested proteins for students to research. Some additional options include hemoglobin (animals only), PEP carboxylase (plants only), tubulin, NADH-ubiquinone oxidoreductase, cytochrome c oxidase subunit, and collagen.
- Go to the UniProt site. Verify that the drop-down menu in the search box shows “UniProtKB.”
- Enter your chosen protein and chosen organism’s Latin name in the search box. See the following example searches:
  - Hemoglobin Mus musculus (house mouse)
  - Hemoglobin Canis lupus familiaris (dog)
  - Hemoglobin Procyon lotor (raccoon)
  - Hemoglobin Myotis lucifugus (little brown bat)
  - Hemoglobin Carassius auratus (goldfish)
- One or more proteins should be returned in each search. When searching for your protein in different organisms, check that the protein name and the amino acid length of the entries you select are roughly the same across organisms, so that you are comparing the same protein. Record the entry number (e.g., P02088) of the first result of each search by writing or pasting it into a document.
- Once the sequences are collected, use the “Align” tab to enter each entry number into the “Protein sequences” search box. Use a single space to separate each entry number (e.g., P02088 P11352 P70662). Click the “Run Align” button.
- Scrolling down on the results page will reveal a phylogenetic tree generated from the amino acid sequences entered into the query.
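To help students see what a tree-building step does with aligned sequences, the following minimal sketch computes pairwise differences and clusters the sequences with UPGMA. UPGMA is used here only because it is simple to read; it is not necessarily the method UniProt uses to draw its tree. The function names and the four already-aligned toy sequences are made up for illustration and are not real UniProt entries.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions that differ (columns with two gaps are skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        differing += a != b
    return differing / compared if compared else 0.0


def upgma(aligned):
    """Cluster aligned sequences with UPGMA and return a Newick-style topology."""
    sizes = {name: 1 for name in aligned}
    dist = {frozenset(pair): p_distance(aligned[pair[0]], aligned[pair[1]])
            for pair in combinations(aligned, 2)}
    while len(sizes) > 1:
        # Pick the closest pair; sorting the labels keeps tie-breaking stable.
        a, b = min(combinations(sorted(sizes), 2),
                   key=lambda pair: dist[frozenset(pair)])
        merged = f"({a},{b})"
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        for other in sizes:
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            # The new distance is the size-weighted average of the two old ones.
            dist[frozenset((merged, other))] = (
                (size_a * d_a + size_b * d_b) / (size_a + size_b)
            )
        del dist[frozenset((a, b))]
        sizes[merged] = size_a + size_b
    return next(iter(sizes)) + ";"


# Made-up, pre-aligned toy sequences (hypothetical, not real protein records).
toy_alignment = {
    "toy_A": "MKTAYIAKQR",
    "toy_B": "MKTAYIAKQL",
    "toy_C": "MKTGYLAKQL",
    "toy_D": "MRSGWLPKNL",
}

tree = upgma(toy_alignment)
print(tree)
```

The sequences that differ at the fewest positions are grouped first, which is the same reasoning students apply when they read the tree on the results page: organisms whose proteins differ least sit closest together.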
For an assessment, assign students a short paper explaining the conclusions they can draw about the evolutionary relationships among their chosen organisms based on their chosen protein. Would the results be the same if they analyzed a different protein? Encourage students to use relevant vocabulary from the phylogenetics unit, and concepts learned in previous investigations, to justify their conclusions.
AP® is a trademark registered and/or owned by the College Board®, which was not involved in the production of, and does not endorse, these products.