Figure 1. Overrepresented physical networks.

For each of the two organisms we collected several networks representing different genomic or physical interaction properties, listed in Tables (a) and (b); see Supplementary Notes S1 for data sources. The similarity matrices, computed with Pearson correlation (R), mutual information (I), conditional mutual information (Ic), partial Pearson correlation (Rc) and the graphical Gaussian model (Rcall), give the predicted likelihood of an edge between any two genes and are compared with the graphs of the various networks. The AUC values of the receiver operating characteristic are reported in the histograms for E. coli and S. cerevisiae (c). In panel (d) a coarse-grained statistic is used to describe the results. It consists of sorting the inferred weights, dividing them into 100 bins and computing the percentage of "true" edges (of each physical network) that fall in each bin. The percentages of true positives in the top bin are shown in the bottom histograms (a randomly chosen network would yield 1% of true positives). Both scoring methods lead to the same qualitative conclusions. E. coli inference: two networks emerge clearly, TU and PC. The first shows how visible the operon structure of the DNA is in the expression pattern. The detected TU and PC edges have a sizeable overlap that is still below 50% (of the 2632 TU edges and 1364 PC edges in the top 1%, 694 are in common), meaning that co-participation in a PC is also a strong, independent source of co-expression. S. cerevisiae inference (cDNA and Affymetrix data): the dominant index is PC1 in both datasets, followed by the map of duplicated genes. The high magnitude of the peaks in the cDNA data alone strongly suggests that this technology may be affected by a systematic bias towards nonspecific binding and cross-hybridization of genes with sequence similarities [46], [16]; see also Fig. 6.
The intersection of the results for the two platforms essentially corresponds to the Affymetrix edges; see Supplementary Notes S6. With the exception of TF-BS for S. cerevisiae, all histograms in panels (c) and (d) are statistically significant (q-value < 0.05; see Supplementary Notes S1 and S3).
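The two scores used in panels (c) and (d) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the use of NumPy, the rank-sum formula for the AUC and the handling of ties are all assumptions made for illustration. It assumes a symmetric gene-by-gene similarity matrix and a 0/1 adjacency matrix of one reference network over the same genes.

```python
import numpy as np


def upper_triangle(matrix):
    """Return the entries above the diagonal, one value per gene pair."""
    matrix = np.asarray(matrix)
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def auc_from_scores(scores, labels):
    """AUC of the ROC curve via the rank-sum identity (ties get average ranks)."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    # Average 1-based rank of every distinct score value, in ascending order.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    average_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = average_rank[inverse]
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def true_edge_percent_per_bin(scores, labels, n_bins=100):
    """Sort weights from highest to lowest, split them into n_bins bins and
    return, for each bin, the percentage of all true edges that fall in it."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    order = np.argsort(-scores, kind="stable")  # highest weight first
    bins = np.array_split(labels[order], n_bins)
    counts = np.array([chunk.sum() for chunk in bins], dtype=float)
    return 100.0 * counts / labels.sum()


# Toy usage with made-up numbers (hypothetical data, for illustration only).
similarity = np.array([[0.0, 0.9, 0.2],
                       [0.9, 0.0, 0.4],
                       [0.2, 0.4, 0.0]])
reference = np.array([[0, 1, 0],
                      [1, 0, 0],
                      [0, 0, 0]])
pair_scores = upper_triangle(similarity)
pair_labels = upper_triangle(reference)
print(auc_from_scores(pair_scores, pair_labels))
print(true_edge_percent_per_bin(pair_scores, pair_labels, n_bins=3))
```

Because each bin's count is divided by the total number of true edges, randomly assigned weights would place about 1% of the true edges in each of the 100 bins, which is the baseline quoted for the top bin.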