1. Given the following DNA sequence: GGTGTAAAGAATCTT
   a. Construct a keyword tree
   b. Construct a suffix tree
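
As a starting point for the suffix tree part, the sketch below builds an uncompressed suffix trie for the DNA sequence above using nested dictionaries. This is a simplification, not a true suffix tree: a suffix tree would additionally merge every chain of single-child nodes into one edge. The function names and the `$` terminator are illustrative choices, not part of the question.

```python
def build_suffix_trie(text):
    """Return a nested-dict trie holding every suffix of text plus a '$' terminator."""
    text += "$"  # terminator guarantees no suffix is a prefix of another
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            # Follow the existing child for ch, or create it if absent.
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern occurs somewhere in the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


def count_leaves(node):
    """Count leaves; each leaf corresponds to exactly one suffix."""
    if not node:
        return 1
    return sum(count_leaves(child) for child in node.values())


trie = build_suffix_trie("GGTGTAAAGAATCTT")
print(contains(trie, "AAAG"))  # substring query by walking from the root
print(count_leaves(trie))      # one leaf per suffix, including the lone '$'
```

Any substring of the sequence is a prefix of some suffix, which is why a walk from the root answers substring queries. Construction here takes quadratic time and space; linear-time suffix tree algorithms exist but are considerably more involved.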

2. How many different nucleotide sequences may code for the following protein sequence:
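
One way to approach this count, assuming the standard genetic code, is to multiply the number of synonymous codons for each residue. The sketch below uses one-letter amino acid codes; the function name, the `include_stop` flag, and the peptide `MLS` are hypothetical and stand in for the protein sequence of the question.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}
STOP_CODONS = 3  # TAA, TAG, TGA


def count_coding_sequences(protein, include_stop=False):
    """Return how many nucleotide sequences encode the given protein.

    Each position is chosen independently, so the total is the product
    of the per-residue codon counts (times 3 if a stop codon is appended).
    """
    total = prod(CODON_COUNTS[aa] for aa in protein.upper())
    return total * STOP_CODONS if include_stop else total


# Hypothetical peptide Met-Leu-Ser: 1 * 6 * 6 choices.
print(count_coding_sequences("MLS"))
```

Whether the stop codon should be counted depends on how the question is read, which is why it is left as an option here.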


3. Given the following MSA (Multiple Sequence Alignment), describe (in pseudocode) how you would determine which positions are informative sites:
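
A possible sketch of such a procedure, assuming "informative" means parsimony-informative: a column qualifies when at least two different characters each appear in at least two sequences. The alignment below is hypothetical, and ignoring gap characters is an assumption rather than a fixed rule.

```python
from collections import Counter


def informative_sites(alignment, gap_chars="-"):
    """Return 0-based column indices that are parsimony-informative."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    sites = []
    for col in range(length):
        # Tally the characters in this column, skipping gaps (an assumption).
        counts = Counter(row[col] for row in alignment if row[col] not in gap_chars)
        # Keep only the characters shared by two or more sequences.
        shared = [ch for ch, n in counts.items() if n >= 2]
        if len(shared) >= 2:
            sites.append(col)
    return sites


# Hypothetical four-sequence alignment, not the one from the question.
example = [
    "AAGTC",
    "AAGCC",
    "AGATC",
    "AGAGT",
]
print(informative_sites(example))
```

In the hypothetical alignment, column 0 is constant and columns 3 and 4 each have only one character that occurs twice or more, so only columns 1 and 2 qualify.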





4. Describe how gene-finding algorithms work. Include a description of all the elements that they search for to help determine whether or not a sequence is a protein-coding gene.

5. What is BLAST? Describe how the algorithm works. Be sure to include any statistical measures that are used in determining the strength of any BLAST results.
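
For the statistical part, the Karlin-Altschul relations between raw score, bit score, and E-value can be sketched as below. The values of `lam` and `k` depend on the scoring matrix and gap penalties; the numbers used here are illustrative placeholders, and the function names and lengths are hypothetical. Real BLAST also applies corrections such as effective sequence lengths, which this sketch omits.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score S: S' = (lam * S - ln k) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits scoring at least S: E = k * m * n * exp(-lam * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Illustrative parameters only; they vary with the scoring system.
lam, k = 0.267, 0.041
m, n = 250, 50_000_000  # hypothetical query length and database size

s = 100
print(bit_score(s, lam, k))
print(e_value(s, m, n, lam, k))
```

The two quantities are linked by E = m * n * 2^(-S'), so a higher bit score means a smaller E-value for a fixed search space, and a larger database raises the E-value of the same alignment.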

