PLPTH 613 Bioinformatics Applications, Spring 2009
Lab 15. Genetical genomics and gene-regulatory-network reconstruction

Purpose

Become acquainted with expression QTL (eQTL) analysis. Using QGene, run a GGM analysis to construct an inferred network, and export it for visualization in Cytoscape.

This exercise can't show everything that is possible in an eQTL analysis, as QGene's features for this are still under development.

1. Download and unzip the development version, QGene 4.3. The version (if any) currently installed on your machine doesn't handle eQTL data.

2. Download this partial data set extracted from the Brem et al. expression data, representing markers and genes on yeast chromosome 4. The full data set of 6000 e-traits (genes with expression data) and 1300 markers would take a long time to subject to QTL analysis followed by network construction, and would also require more than 1 GB of RAM. The reduced data set will give the same idea in a few minutes.

3. Start QGene. Choose File/Load data and navigate to your data file. Once it has been loaded, choose Analysis/eQTL analysis. In the eQTL-mapping window, choose Analysis/New analysis and accept the default parameter values in the dialog, except for setting Analysis to SimpleIM.

4. When the analysis is complete, you should see a "heat map" in the right-hand panel. If you don't, examine the boxes labeled Low and High at top left. These indicate the statistical cutoff values for showing points in the heat map. Because of the sampling algorithm QGene uses, the High value is sometimes lower than the Low value, so that no points are drawn. If this happens, enter a new High value greater than the Low one and click Redraw.

5. Note the pattern in the heat map. Why are there horizontal line segments arranged along the diagonal?

6. Still in the eQTL window, select Analysis/Find hotspots and accept the default parameters. This method highlights vertical bands in which points are more densely concentrated than expected. Each band represents genes potentially co-regulated by another gene at the chromosomal position through which the band passes.

7. Now select Network/Graphical Gaussian model (GGM) and accept the default parameters. Lambda is a shrinkage factor that governs the strength of correlation required for an edge. If lambda is 0, all correlations are accepted and the network is full of false edges, while if lambda is 1, the requirement is too stringent and there are too few edges and too few highly connected subgraphs. With the appropriate box checked, lambda will be estimated automatically.

8. When the .gml file (Graph Modelling Language, another format recognized by Cytoscape) has been saved, open Cytoscape and import it. At first you will see nothing on the screen, but as soon as you select a layout the network will appear. You may want to use the VizMapper to adjust the visual properties. Clicking on the image of two nodes connected by an edge opens a dialog in which you can do this.

9. You should see a lot of nodes with no edges. Here's one way to get rid of them: choose Select/Edges/Select all edges (it will take Cytoscape a while to select these), then Select/Nodes/Nodes connected by selected edges, and finally New/Network/From selected nodes, selected edges. Alternatively, you might be able to replace the first two steps by just dragging a selection box around the connected part of the network. In either case, we now have a network in which every node has at least one edge.

10. As indicated in lecture, it is possible to add at least some directed edges to an eQTL-based network. QGene doesn't do this yet because the algorithm we are working with is numerically unstable.

11. Export the network from Cytoscape with File/Export/Network as SIF file and determine, as in Lab 13, whether it has the scale-free property. Be sure to use the Y flag when you run the Perl script (first run the script without arguments to see what is needed). Comment on your results and explain them if you can.

12. It would be of interest to examine this network in light of what we know about biology and yeast genetics. For example, do highly connected subnetworks correspond to known pathways? Unfortunately, BiNGO doesn't find the GO annotations for these nodes (I assume because a lot of them are markers, not genes, and the e-traits are labeled with gene names rather than ORF names), and I have not had the time to prepare them for us.

13. You may have noticed in step 11 that the minimum degree of any node is 10, which is somewhat surprising. Return to step 7, uncheck the box for automatic lambda estimation, and try creating networks with a couple of different values between 0 and 1. Examine the degree distribution and comment. Unfortunately, QGene doesn't presently export directly to .sif format, which makes this task somewhat tedious since we have to go through Cytoscape.

14. Finally, note that we can carry out ordinary QTL analysis with these data. In QGene, close the eQTL window and choose Analysis/QTL mapping. Select the sole chromosome in the upper left panel. In the right-hand analysis panel, choose SimpleIM and the F statistic. Finally, click in the Traits panel at left and press Ctrl-A to select all traits. After a couple of minutes you should see a plot. Discuss its correspondence with the eQTL heat map. If you need to look at the heat map again, it hasn't been lost. QGene saved it, and you may reload it by opening the eQTL window again, choosing Resume analysis, and locating the .qef file. You may have to adjust the plot thresholds again in order to see the heat map.
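To make the hotspot idea in step 6 concrete, here is a minimal Python sketch. It is not QGene's algorithm, which this handout does not describe. It assumes that the heat-map points surviving the Low/High cutoffs are available as (position index, trait name) pairs, and that "denser than expected" is judged against a Poisson model whose mean is the average number of points per position. The function names and the alpha parameter are hypothetical.

```python
import math

def poisson_upper_tail(k, mean):
    """Return P(X >= k) for X ~ Poisson(mean)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mean)      # P(X = 0)
    cdf = 0.0
    for i in range(k):          # accumulate P(X = 0) ... P(X = k - 1)
        cdf += term
        term *= mean / (i + 1)
    return max(0.0, 1.0 - cdf)

def find_hotspots(points, n_positions, alpha=0.01):
    """Flag map positions carrying more heat-map points than expected.

    points      : iterable of (position_index, trait_name) pairs
    n_positions : number of map positions tested (assumed known)
    alpha       : per-position tail probability cutoff (hypothetical default)
    """
    counts = [0] * n_positions
    for position, _trait in points:
        counts[position] += 1
    mean = sum(counts) / n_positions
    return [(pos, c) for pos, c in enumerate(counts)
            if c > 0 and poisson_upper_tail(c, mean) < alpha]

# Toy data: 15 traits map to position 7, five others are scattered.
demo_points = [(7, f"gene{i}") for i in range(15)]
demo_points += [(p, "other") for p in (0, 3, 12, 18, 19)]
print(find_hotspots(demo_points, n_positions=20))
```

The sketch applies no multiple-testing correction, so with many positions some flagged bands would be expected by chance alone.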
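The role of lambda in steps 7 and 13 can be explored outside QGene with a small sketch. The code below assumes one common form of shrinkage, in which the sample correlation matrix is pulled toward the identity matrix by a factor lambda before partial correlations are computed. Whether QGene uses exactly this estimator is an assumption, and the function names and the edge cutoff are hypothetical. The example uses NumPy and random toy data, not the Brem et al. data.

```python
import numpy as np

def shrunk_partial_correlations(X, lam):
    """Partial correlations from a correlation matrix shrunk toward identity.

    X   : (samples x variables) array of expression values
    lam : shrinkage intensity in [0, 1]; 0 = no shrinkage, 1 = identity
    """
    R = np.corrcoef(X, rowvar=False)
    p = R.shape[0]
    R_shrunk = (1.0 - lam) * R + lam * np.eye(p)
    # pinv tolerates the singular matrix that can arise when lam = 0
    # and there are more variables than samples.
    omega = np.linalg.pinv(R_shrunk)
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def edges_from_pcor(pcor, cutoff):
    """List (i, j, value) for pairs with |partial correlation| >= cutoff."""
    i_idx, j_idx = np.triu_indices_from(pcor, k=1)
    keep = np.abs(pcor[i_idx, j_idx]) >= cutoff
    return [(int(i), int(j), float(pcor[i, j]))
            for i, j in zip(i_idx[keep], j_idx[keep])]

# Toy data: variable 1 depends on variable 0, the rest are independent noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
X[:, 1] += 0.9 * X[:, 0]

for lam in (0.0, 0.2, 0.5, 1.0):
    pcor = shrunk_partial_correlations(X, lam)
    print(lam, len(edges_from_pcor(pcor, cutoff=0.2)))
```

At lambda = 1 the shrunk matrix is the identity, every partial correlation is zero, and no edges remain, which matches the behavior described in step 7. The fixed cutoff of 0.2 is only for illustration; a real analysis would choose edges by a statistical test.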
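For steps 11 and 13, the degree distribution can also be inspected with a short script once the SIF file has been exported. The sketch below is not the Lab 13 Perl script and does not reproduce its flags. It assumes the usual SIF layout (source node, interaction type, then one or more target nodes per line), treats edges as undirected, and uses a straight-line fit on log-log axes as a rough indicator of scale-free behavior. The function names are hypothetical.

```python
import math
from collections import Counter

def degrees_from_sif(lines):
    """Count node degrees from SIF lines (source, interaction, targets...)."""
    neighbours = {}
    for line in lines:
        # SIF files are tab-delimited if any tab is present, else space-delimited.
        fields = line.split("\t") if "\t" in line else line.split()
        fields = [f.strip() for f in fields if f.strip()]
        if not fields:
            continue
        source = fields[0]
        neighbours.setdefault(source, set())
        for target in fields[2:]:
            neighbours.setdefault(target, set())
            if target != source:            # ignore self-loops
                neighbours[source].add(target)
                neighbours[target].add(source)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}

def degree_distribution(degrees):
    """Map degree k to the fraction of connected nodes having that degree."""
    counts = Counter(k for k in degrees.values() if k > 0)
    total = sum(counts.values())
    return {k: c / total for k, c in sorted(counts.items())}

def loglog_slope(dist):
    """Least-squares slope and R^2 of log10 P(k) against log10 k."""
    xs = [math.log10(k) for k in dist]
    ys = [math.log10(p) for p in dist.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct degrees")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r2

# Toy SIF content; for a real file use: lines = open("network.sif").read().splitlines()
demo = ["A\tpp\tB\tC\tD", "B\tpp\tC", "E"]
deg = degrees_from_sif(demo)
print(deg)
print(degree_distribution(deg))
print(loglog_slope(degree_distribution(deg)))
```

A roughly linear log-log plot with a negative slope is consistent with a power-law degree distribution but does not prove it, so the fit should be read together with the minimum and maximum degrees observed.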