Validating modules using independent data sources

We validated the quality of the results obtained with our module discovery algorithm by analyzing four independent data sources: the MIPS database, a conventional chromatin-IP (ChIP) experiment, transcription factor-gene interactions identified in the literature, and DNA sequence motif information.

  • Complete information regarding the MIPS results can be found on the modules page.
  • To investigate how our module discovery algorithm improved false-negative and false-positive rates, we focused on factor-gene interactions in the cell cycle network (click here to see modules discovered for this network). We performed a chromatin-IP experiment as an independent assay of gene-factor interactions. Briefly, we examined interactions between the factor Stb1 and 36 genes. The profiled genes were picked randomly from the full set of yeast genes, with representatives selected from four p-value ranges. The chromatin-IP experiment showed that Stb1 binds three additional genes whose p-values in the genomic binding experiments fell between 0.001 and 0.01; these genes were therefore excluded by the stringent cutoff. Our algorithm assigned all three genes to a module controlled by Stb1, without adding any genes that the chromatin-IP experiment did not detect as bound. You can download a text file containing the chromatin-IP experiment results from here.
  • We also investigated the extent to which genes in our modules are enriched for DNA sequence motifs, compared with results obtained using genomic binding data alone. For each factor, we computed the percentage of genes from each list that contained the appropriate known motif in the downstream region of DNA. A file containing the full list of factors and the percentage identified for each factor can be found here.
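
The motif comparison described in the last item can be illustrated with a minimal Python sketch. It is not the code used for the analysis; the function name `percent_with_motif`, the gene names, the sequences, and the motif pattern are all hypothetical placeholders. It assumes a known motif can be written as a regular expression and that each gene maps to the DNA region being scanned.

```python
import re

def percent_with_motif(genes, regions, motif_regex):
    """Return the percentage of genes whose scanned region contains the motif.

    genes       -- list of gene identifiers (hypothetical names below)
    regions     -- dict mapping gene identifier to its DNA region sequence
    motif_regex -- the known motif written as a regular expression (assumption)
    """
    if not genes:
        return 0.0
    pattern = re.compile(motif_regex)
    # Genes with no sequence on record are counted as not containing the motif.
    hits = sum(1 for gene in genes if pattern.search(regions.get(gene, "")))
    return 100.0 * hits / len(genes)

# Toy data, invented purely for illustration.
regions = {
    "geneA": "TTACGCGTAA",
    "geneB": "GGGGCCCC",
    "geneC": "ACGCGTTT",
    "geneD": "TTTTTTTT",
}
module_genes = ["geneA", "geneC"]                           # genes placed in a module
binding_only_genes = ["geneA", "geneB", "geneC", "geneD"]   # genes from binding data alone
motif = "ACGCGT"                                            # illustrative pattern only

module_pct = percent_with_motif(module_genes, regions, motif)
binding_pct = percent_with_motif(binding_only_genes, regions, motif)
print(f"module genes: {module_pct:.1f}%  binding data alone: {binding_pct:.1f}%")
```

A higher percentage for the module list than for the binding-only list would correspond to the kind of enrichment described above. A real analysis would more likely score position weight matrices than match exact strings.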