BME 110 Computational Biology Tools


Homework 1 (30 pts total)

Using NCBI Resources, UCSC Genome Browser, and the UCSC Archaeal Browser

You are assigned to re-analyze the genome of Sulfolobus acidocaldarius DSM639, but you don't know anything about it.  Using the resources we practiced in class, answer the following questions:

1.  What domain of life and sub-domain is this species from?   (2 pts)

2.  What are its favorite growth conditions?  Give its   (4 pts)
(a) optimal growth temperature range,
(b) oxygen requirements: aerobic/anaerobic,
(c) what it uses for respiration (i.e. what it "oxidizes" for energy),
(d) optimum pH range for growth.

3.  According to the abstract associated with this genome sequence, why do the researchers believe this genome's stability and organization are so different from those of the two other Sulfolobus species previously sequenced?  (2 pts)

4.  What is the systematic name (e.g. Saci_0001) for "reverse gyrase" in S. acidocaldarius?  Give its genome coordinates. (2 pts)

5.  What Biological Processes (GO: terms) are associated with this gene? (2 pts)

6.  Give the protein sequence for this gene in FASTA format.  (2 pts)

7.  What is the name of the species and the systematic gene name that is most similar to this one?  (give your evidence) (2 pts)

8. How many species (and what are their names) have genomic DNA alignments for this gene? (2 pts)  
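
For question 6, the expected layout is a single header line beginning with ">" followed by the sequence on one or more lines. The sketch below is only an illustration of that layout: the function name, the header text, the residues, and the 60-character line width are all placeholders, not values from the browser.

```python
def to_fasta(header: str, sequence: str, width: int = 60) -> str:
    """Return one FASTA record: a '>' header line, then wrapped sequence lines.

    The header and sequence passed in below are placeholders for illustration;
    the line width of 60 is a common convention, not a requirement.
    """
    # Drop any whitespace or line breaks picked up while copying the sequence.
    cleaned = "".join(sequence.split()).upper()
    # Cut the sequence into fixed-width chunks.
    chunks = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([">" + header] + chunks) + "\n"


if __name__ == "__main__":
    # Hypothetical header and residues, used only to show the layout.
    print(to_fasta("example_gene placeholder description", "mkt ayi\nakq rqi"), end="")
```

Any sequence copied from a gene details page can be passed through a helper like this before it is pasted into an answer.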

---------------------------------------------------------------------------------------

9. Using the UCSC genome browser (genome.ucsc.edu), tell me  (3 pts)
(a) the chromosome,
(b) genome coordinates, and
(c) name (i.e. DK....) of the dyskerin gene in the human genome (March 2006 assembly).

10. According to the "RefSeq Genes" track in the human genome browser,  (4 pts)
(a) how many exons does this gene have?
(b) how long is the gene in the genome (Genomic Size)?
(c) how long is the mature spliced mRNA (see mRNA/Genome Alignments size)?
(d) what are the first five amino acids of this protein?

11.  Turn on the "RNA Genes" track.  Are there any RNA Genes hidden in the introns of this gene?  (2 pts)
If so, what are their names?
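
As a cross-check on the coordinates and genomic size asked for in questions 9 and 10, a position string can be turned into a length. This is a minimal sketch: it assumes the position is written in the chrom:start-end form shown in the browser's position box and that both ends are counted, the function names are hypothetical, and the coordinates in the example are made up rather than taken from any gene.

```python
def parse_position(position: str) -> tuple[str, int, int]:
    """Split a 'chrom:start-end' string into (chrom, start, end).

    Thousands separators (commas) in the numbers are accepted and removed.
    """
    chrom, span = position.strip().split(":")
    start_text, end_text = span.replace(",", "").split("-")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError("start must not be greater than end")
    return chrom, start, end


def span_length(position: str) -> int:
    """Length of the region, assuming both start and end are included."""
    _, start, end = parse_position(position)
    return end - start + 1


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    print(span_length("chr1:1,000-2,000"))
```

If the length computed this way differs from the size reported on a details page, it is worth checking whether that page uses a different coordinate convention.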

12.  Given the following partial protein sequence, tell me the gene name of the top-scoring hit, the chromosome where it is found, and the disease associated with this gene (use BLAT in the UCSC human genome browser, March 2006 assembly).  (3 pts)

>Protein-q11
RVNHCLTICENIVAQSVRNSPEFQKLLGIAMELFLLCSDDAESDVRMVAD
ECLNKVIKALMDSNLPRLQLELYKEIKKNGAPRSLRAALWRFAELAHLVR
PQKCRPYLVNLLPCLTRTSKRPEESVQETLAAAVPKIMASFGNFANDNEI
KVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRRTQYFYSWLLNVLLGL
LVPVEDEHSTLLILGVLLTLRYLVPLLQQQVKDTSLKGSFGVTRKEMEVS
PSAEQLVQVYELTLHHTQHQDHNVVTGALELLQQLFRTPPPELLQTLTAV
GGIGQLTAAKEESGGRSRSGSIVELIAGGGSSCSPVLSRKQKGKVLLGEE
EALEDDSESRSDVSSSALTASVKDEISGELAASSGVSTPGSAGHDIITEQ
PRSQHTLQADSVDLASCDLTSSATDGDEEDILSHSSSQVSAVPSDPAMDL
NDGTQASSPISDSSQTTTEGPDSAVTPSDSSEIVLDGTDNQYLGLQIGQP
QDEDEEATGILPDEASEAFRNSSMALQQAHLLKNMSHCRQPSDSSVDKFV
LRDEATEPGDQENKPCRIKGDIGQSTDDDSAPLVHCVRLLSASFLLTGGK
NVLVPDRDVRVSVKALALSCVGAAVALHPESFFSKLYKVPLDTTEYPEEQ
YVSDILNYIDHGDPQVRGATAILCGTLICSILSRSRFHVGDWMGTIRTLT
GNTFSLADCIPLLRKTLKDESSVTCKLACTAVRNCVMSLCSSSYSELGLQ
LIIDVLTLRNSSYWLVRTELLETLAEIDFRLVSFLEAKAENLHRGAHHYT
GLLKLQERVLNNVVIHLLGDEDPRVRHVAAASLIRLVPKLFYKCDQGQAD
PVVAVARDQSSVYLKLLMHETQPPSHFSVSTITRIYRGYNLLPSITDVTM
ENNLSRVIAAVSHELITSTTRALTFGCCEALCLLSTAFPVCIWSLGWHCG
VPPLSASDESRKSCTVGMATMILTLLSSAWFPLDLSAHQDALILAGNLLA
ASAPKSLRSSWASEEEANPAATKQEEVWPALGDRALVPMVEQLFSHLLKV
INICAHVLDDVAPGPAIKAALPSLTNPPSLSPIRRKGKEKEPGEQASVPL