Retrieve the sequence with the LOCUS accession number AAC60746
and give its (a) length, (b) gene name, (c) source species, and (d)
first four letters. [Nucleotide or Protein]
This accession refers to a protein. Also find the gene sequence
encoding it, and look at all the information you are given in both
records!
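If you would rather script the lookup than click through the web pages, the same record can be requested through NCBI's E-utilities. The following is a minimal sketch, not part of the exercise itself: the helper names efetch_fasta_url and summarize_fasta are made up for illustration, and the code only builds the request URL and summarizes FASTA text you have already downloaded.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities web service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_fasta_url(accession, db="protein"):
    """Build an EFetch URL that returns the record as plain FASTA text.

    Hypothetical helper for this sketch; it does not contact NCBI.
    """
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def summarize_fasta(fasta_text):
    """Return (header, sequence length, first four letters) of one FASTA record."""
    lines = [ln.strip() for ln in fasta_text.strip().splitlines()]
    header = lines[0].lstrip(">")
    sequence = "".join(lines[1:])  # sequence may be wrapped over several lines
    return header, len(sequence), sequence[:4]


# Paste the printed URL into a browser, or fetch it with urllib.request.
print(efetch_fasta_url("AAC60746"))
```

Passing the downloaded text to summarize_fasta gives the length and the first four letters. The gene name and source species are not part of the FASTA text, so you still need to read the full record for those.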
How many journal publications were authored by Sean R. Eddy in
1997? (Hint: search for "eddy sr" AND 1997.) [PubMed]
Try a similar search with another query, e.g. check up on one of
your professors from another class :-)
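The PubMed query can also be sent through the E-utilities ESearch service, which reports the number of matching records in a Count element. This is only a sketch under assumptions: esearch_url and hit_count are illustrative names, the query string is taken verbatim from the hint, and nothing is downloaded here, so no result is implied.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base address of the NCBI E-utilities web service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="pubmed"):
    """Build an ESearch URL for a query term (hypothetical helper)."""
    params = {"db": db, "term": term}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def hit_count(esearch_xml):
    """Read the total number of hits from ESearch XML that was already fetched."""
    root = ET.fromstring(esearch_xml)
    # The top-level Count element holds the total for the whole query.
    return int(root.findtext("Count"))


# Same query as in the hint; swap in another author and year to experiment.
print(esearch_url('"eddy sr" AND 1997'))
```

Compare the count you get this way with the number shown on the PubMed results page. If they differ, check how PubMed interpreted the unqualified terms in your query.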
How many archaeal genomes are currently fully sequenced?
[Start at http://www.ncbi.nlm.nih.gov/Genomes/, choose "Microbial"
under Genome Resources...]
Now see whether you can get to that same information, but starting
from the NCBI home page again and choosing the [Genome] database!
Who sequenced "Pyrobaculum aerophilum"?
So now you know where the sequence is from. Can you find where the
specific "sequence donor" was collected, and a bit more about it?
[Hint: the information is just one click away from where you are.]
While you are here, read around and learn a bit about why this
species is of interest to biologists! What is the optimal growth
temperature for this species? If you read the publication
abstract you often gain a little additional insight without
having to spend much time.
Looking at the "COGs" table, how many genes have "Function Unknown"
or are not in a COG?
[Click on the "C" character at the right side of the table listing
all species.]
Note: COGs are one way to gain an impression of the general life
processes happening inside an organism, based on its sequence, even
though many proteins are left unassigned (we'll talk later about why
this is). Keep in mind that these assignments are made computationally
and put on the web automatically, so they can sometimes be wrong.
Always be on the lookout: unless a functional class has many proteins
assigned to it in your organism of interest, there is no guarantee
that the corresponding process is actually happening. Always use your
biological judgment alongside automatic information.
How many P. aerophilum proteins have their best hit to another
archaeal protein? How many to a bacterial one?
[Click on the "T" character at the right side, back on the table
listing from which you linked to the COGs previously.]
Note the systematic gene naming (PAE....), e.g. for the first
protein whose strongest similarity is to a eubacterial protein.
Give the first 5 letters of the protein sequence with the
systematic name "PAE0034".
[This can be found in many different ways; one of them is to follow
the "P" link in the table (for ProTable) and take it from there.
This will take a few clicks!]
Have fun, and feel free to ask us anything that you were wondering
about during this exercise.