BIOL 210: Application and Analysis of (Biological) Microarrays

Spring Quarter, 2007

          Final Assignment

Part A - Pseudo Grant (50%)

Write up a 5-8 page pseudo-grant in which you propose a pilot study that uses microarray-based technology.  The study should be able to
be completed within 3-9 months, for a total budget of $35,000 or less.  Your goal is to collect enough preliminary data to later be able to write a full-fledged grant for NIH or NSF funding for 3-5 years (not part of this assignment).

You may work in groups of two (2) to brainstorm, outline, and write up your grant.  Each person should contribute equally to all parts of the
assignment (please make sure both group members carefully proofread and edit the final submission!).

This pilot grant proposal will include:

A. Brief introduction
    (1) biological problem to be studied
    (2) technology to be used (why is this an effective use of this technology? why is it better than a traditional molecular biology technique?)
B. Specific Aim(s) - specifically state what you are trying to do.
       For example:  To measure stage-specific changes in host cell/tissue gene expression upon infection with Salmonella in order to identify
            genes involved in host immunological response.
C. Experimental Plan - Describe how you plan to carry out your array experiment, including
    (1) Array platform, type of probes, approximate cost per array
    (2) Experimental design: how many biological replicates or time-course samples taken & when, how many technical replicates, use of dye swaps
    (3) Types of positive or negative controls you plan to use
D. Analysis Plan
    (1) What tools/programs will you use to analyze your data?
    (2) How will you assess the significance of your new gene list of interest?
E. Supporting, Follow-up, & Future experiments --
    (1) The best array papers we've read in class all include follow-up experiments to support major findings.  Propose at least two types of supporting non-array based experiments that could be done to increase confidence in the array findings.
    (2) If you can think of any computational / genomic / proteomic follow-up studies that could leverage your new findings, describe them briefly (e.g., identifying new transcription factor binding sites, pathogenicity islands of genes that could be targeted for drug development, working out full pathways
using genetic screens, etc.).

Part B - Data Analysis from your wet-lab experiments (50%)

A. Apply a grid to your array data (TIFF files), and extract GPR files.
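
Once gridding has produced a GPR file, the per-spot intensities can be read programmatically.  The following is a minimal sketch, not a required method.  It assumes the standard GenePix layout (an "ATF" line, a line giving the number of header records, then a tab-delimited table) and the usual column names such as "F635 Median"; check these against your own files.  The function names read_gpr and log_ratios are made up for illustration.

```python
import csv
import math


def read_gpr(handle):
    """Parse an open GPR file into a list of dicts, one per spot.

    Assumes the ATF layout: line 1 starts with "ATF", line 2 gives the
    number of optional header records followed by the number of columns.
    Open real files with open(path, newline="") so csv sees raw lines.
    """
    first = handle.readline().strip()
    if not first.startswith("ATF"):
        raise ValueError("not an ATF/GPR file")
    n_header = int(handle.readline().split()[0])
    for _ in range(n_header):
        handle.readline()  # skip the key=value header records
    # The next line holds the (quoted) column names; csv strips the quotes.
    return list(csv.DictReader(handle, delimiter="\t"))


def log_ratios(rows, red="F635 Median", red_bg="B635 Median",
               green="F532 Median", green_bg="B532 Median"):
    """Return (ID, log2(red/green)) for usable spots.

    Assumption: negative values in the Flags column mark spots to drop,
    and spots whose background-subtracted signal is not positive in both
    channels are skipped because their log ratio is undefined.
    """
    out = []
    for row in rows:
        if int(row.get("Flags", 0)) < 0:
            continue
        r = float(row[red]) - float(row[red_bg])
        g = float(row[green]) - float(row[green_bg])
        if r > 0 and g > 0:
            out.append((row["ID"], math.log2(r / g)))
    return out
```

Which channel is the heat-shock sample depends on the hyb (and flips in a dye swap), so the sign of each array's log ratios has to be set from the Experiment Details file.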

B. Normalize your data using an appropriate method covered in class.  Show your arrays before and after normalization, and comment on any
"blemishes" or other obvious problems pre-normalization and on whether the normalization fixed them.

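As an illustration of the before/after comparison asked for in B, here is a minimal sketch of one simple normalization, global median centering of log ratios.  This particular method is an assumption chosen for brevity; use whichever method from class suits your data (intensity-dependent methods such as lowess address problems that a single global shift cannot).  The function names are hypothetical.

```python
import math
from statistics import median


def ma_values(red, green):
    """Return (M, A) lists from background-corrected, positive intensities.

    M = log2(red / green) is the log ratio; A = 0.5 * log2(red * green)
    is the average log intensity.  Plotting M against A before and after
    normalization shows whether a dye bias has been removed.
    """
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a


def median_center(m):
    """Shift log ratios so that their median is zero.

    This assumes most genes do not change between the two samples.
    """
    shift = median(m)
    return [x - shift for x in m]
```

A global shift cannot repair localized blemishes (scratches, dust, uneven hybridization); those spots generally have to be flagged and excluded instead.
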
C. Do at least one test for statistical significance of differentially expressed genes (e.g., SAM, but you can use others as well if clearly described).
If using SAM, use a 3% false positive rate as the cutoff value.
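
To make the idea behind a permutation-based cutoff concrete, here is a simplified sketch in the spirit of SAM.  It is not the SAM software and will not reproduce its numbers: it uses a moderated one-class statistic, d = mean / (standard error + s0), and estimates false calls by flipping the sign of each array's log ratios.  The values of s0 and threshold, and the name sam_like, are illustrative assumptions.

```python
import itertools
from statistics import mean, median, stdev


def d_stat(values, s0):
    """Moderated statistic: mean over (standard error + fudge factor s0)."""
    se = stdev(values) / len(values) ** 0.5
    return mean(values) / (se + s0)


def sam_like(data, s0=0.1, threshold=3.0):
    """Call genes with |d| >= threshold and estimate the false-call rate.

    data maps gene -> list of log2 ratios, one per array, with dye swaps
    already sign-corrected so positive means up in heat shock.
    Returns (called_genes, estimated_false_call_proportion).
    """
    observed = {g: d_stat(v, s0) for g, v in data.items()}
    called = [g for g, d in observed.items() if abs(d) >= threshold]
    n_arrays = len(next(iter(data.values())))
    false_counts = []
    # Enumerate every sign-flip pattern (2**n_arrays of them, including
    # the unflipped data) and count the genes passing the threshold.
    for signs in itertools.product((1, -1), repeat=n_arrays):
        count = sum(
            1 for v in data.values()
            if abs(d_stat([s * x for s, x in zip(signs, v)], s0)) >= threshold
        )
        false_counts.append(count)
    rate = median(false_counts) / len(called) if called else 0.0
    return called, rate
```

In use, the threshold would be raised or lowered until the estimated rate is at or below 0.03.  With only a handful of arrays there are few distinct sign patterns, so the estimate is coarse.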

D. Give tables of genes that are (1) significantly up-regulated in heat shock (up to the top 30) and (2) significantly down-regulated (up to 30 genes).
Are there any unknown or unexpected genes, based on their annotation?  Comment on any trends in annotation -- you can use the archaeal
genome browser to look at additional comparative gene information (e.g., Pfam domains, conservation or lack thereof, nearby intergenic regions
that have similar expression patterns, etc.).
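
The two tables in D can be assembled from the significant gene list with a short helper.  This is a sketch only; top_genes and its arguments are hypothetical names, and it assumes you already have a mean log2 ratio per gene (positive meaning up in heat shock).

```python
def top_genes(mean_log_ratio, significant, limit=30):
    """Split significant genes into up and down tables, strongest first.

    mean_log_ratio maps gene -> mean log2(heat shock / control);
    significant is the list of genes that passed the statistical test.
    Each table holds at most `limit` (gene, ratio) pairs.
    """
    sig = [(g, mean_log_ratio[g]) for g in significant]
    up = sorted((x for x in sig if x[1] > 0), key=lambda x: -x[1])[:limit]
    down = sorted((x for x in sig if x[1] < 0), key=lambda x: x[1])[:limit]
    return up, down
```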

E. Perform average or complete linkage clustering using Pearson correlation on all genes and arrays (from all groups).  Did the arrays cluster by group,
by dye-swap, or by hyb protocol?  Describe at least two clusters that seem particularly well-defined with respect to the annotated genes, and give a 1-3 sentence
assessment of the biological significance of each.  Do any of these genes appear in the same operon when you look at the genome browser? (If so,
give several examples.)
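
For reference, average linkage clustering with a Pearson-based distance (1 - r) can be sketched in a few lines.  This is a teaching sketch under simple assumptions (no missing values, no constant profiles), written for clarity rather than speed; for all genes you would use a clustering program or a library routine such as scipy.cluster.hierarchy.linkage.  The function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Fails with a division by zero if either profile is constant.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def average_linkage(profiles):
    """Agglomerative clustering; returns the merges in the order made.

    profiles maps a name (gene or array) to its list of log ratios.
    Each merge is (members_a, members_b, distance), where the distance
    between clusters is the mean of 1 - r over all cross-cluster pairs.
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names) for b in names[i + 1:]
    }

    def cluster_distance(c1, c2):
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters and merge them.
        d, i, j = min(
            (cluster_distance(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters) if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

Passing array names with their per-gene log ratios clusters the arrays (to see whether they group by group, dye-swap, or hyb protocol); passing gene names with their per-array log ratios clusters the genes.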

F. Based on one gene known to be involved in heat shock (hsp20/PF1883) and at least three others described in this published heat shock experiment,
did the arrays reflect a successful heat shock experiment (albeit a very limited one)?

G. What follow-up experiments would you do to improve your confidence in your analyses using (1) more array-based experiments, and (2) non-array based experiments?

Requisite files: GAL file   Experiment Details file

Image Scan Data
Aaron & Kivanc Hyb TIFs:  C101  C102
Marcos & Daniel Hyb TIFs:  C103  C104
Pinal, John & John Hyb TIFs: C105  C106

GPR Files
From Nick:    C101   C102    C103    C104    C105    C106
From Grant:   C101   C102  
From Marcos:                         C103   C104 
From John:                                                      C105    C106  
Grant, Rachel, & Greg, please use Aaron and Kivanc's hyb experiments for your own group.

As soon as you have produced .GPR files (by Thursday) from gridding your TIF files, please send them to me so that
I can post them for the other groups to use in their analyses (this includes Grant's group).