PLPTH 613
Bioinformatics Applications
Spring 2009



Lab 14. Automated network retrieval, viewing experimental data on networks

This lab doesn't have any embedded questions. Instructions for your report are given at the end.

Purpose

In this lab you'll use Cytoscape to contact public databases directly for retrieval of interaction networks, to view modules of a network, and to superimpose RNA-abundance data on an interaction network image. The tutorial will show you some possibilities, and then we will attempt to extend them to other data. You will have two weeks to report on this lab, but please finish as much as you can today. Software like Cytoscape, which supports community development and which exploits common standards for data integration across the WWW, is in my opinion the future of bioinformatics. Because the biological networks that Cytoscape is built to handle are arguably the future of biology, it's a good idea to learn how to use this software. You'll probably be seeing it again in your professional life.

1. You won't need to reinstall Cytoscape, as it has been installed in the PLPTH_613 directory on three of the lab computers. You'll find it convenient to install the program on your own computer if you have one available, since we must also download and install several plugins.

2. Begin with this tutorial on using Cytoscape with web services. When you reach the section on installing plugins, here is how to install them: choose menu Plugins/Manage Plugins and locate, select, and install the plugins listed in the tutorial (you may skip the Agilent Literature Search plugin, as we will skip the exercise that uses it). Most install quickly but a couple of them take a minute or two.

3. After all the plugins are installed, exit Cytoscape with File/Quit, and restart the program. If you don't do this, the newly installed web-services clients may not appear in the available-client list at step 2 in the tutorial.

4. As you follow the tutorial, follow these lab instructions too. When you get to the step labeled Import Known Pathways and Interactions from Pathway Commons, use menu File/Import/Network from web services... After the PPARG search returns, the way to "load all pathways", as directed in the tutorial, is rather unclear. First, the tutorial image shows only five pathways, but you'll find many more in the actual dialog and we don't need them all. So load only a few by double-clicking on them. When asked if you want to merge them, just select Create New Network. Sometimes when retrieving these networks you'll get an error message. I've found that if you just try the same thing again, it will work. Some of these WWW sites require a few seconds of delay between requests and will refuse to cooperate if too many requests are received in a short time.

5. To merge these networks as directed in the tutorial, first use Ctrl-clicking to select the networks you've just loaded in the Network Panel at left of the main Cytoscape viewing frame. Now choose menu Plugins/Merge networks. Apply a spring-embedded layout. Comparing the union to the individual networks, do the latter appear to share any nodes?

6. In the section Import Binary Interactions from IntAct, you can name this network PPARG from IntAct.

7. The section Import KEGG Pathway using BioRuby may seem very confusing, and you may not realize how cool it is to a bioinformatics geek. BioRuby is a bioinformatics library for Ruby, a scripting language (like a programming language, only used mainly to run other programs), and is being used here to drive Cytoscape's internal machinery both for retrieving remote data and for laying out networks. You can imagine that much larger research tasks could be automated in this way, though most biologists would need specialist help.

8. Examine the three networks imported from WikiPathways. What is different about these networks in comparison with the others?

9. We won't do anything further with these networks (and they will not be needed in your report), so you may Destroy them. However, note for future reference that Cytoscape permits you to save all of your networks at once (using File/Save, of course) to a .cys file. You can see how this would be useful if it took half an hour or more to import the network data.

10. Cytoscape allows us to combine an expression data set with a network in order to visualize regions that are coregulated. Follow this tutorial (which is reached from topic 13.5.3 Visualizing Expression Data on a Network on the Cytoscape manual page) to see how this is done. Don't use the tutorial 04_Expression_Data, which doesn't apply to the latest version of Cytoscape and will not work right.

11. This tutorial on modules and complexes will not be needed for the report, so whether you follow it is your choice. Like the Omics Viewer that we played with two weeks ago, it will show you how subnetworks can be made to "light up" during a time-course expression experiment. It may take you about half an hour to complete.

12. Applying your learning. We will now attack more open-ended problems, using Cytoscape with other interaction and pathway resources. You will have two weeks for this (submit reports by 5.14), but please don't put it off and do a hasty and careless job at the last minute. Select one of the following tasks (if it turns out easy, do the other one too and get extra credit):

Task 1
We have seen how to map expression data onto a gene network. We can also map metabolomics data onto a metabolic-pathway network. You are asked to acquire a suitable network and a set of experimental metabolite-abundance data from a public database, and create an exercise like the one you followed in step 10 above, but with metabolomic instead of RNA expression data. Cytoscape offers a plugin called MetScape, and you can also obtain several useful plugins from the Metabolic Network Exchange. You will have to find suitable experimental data on your own; don't just use sample data from someone else's tutorial! A good starting point may be PathGuide, which contains links to an overwhelming number of interaction and pathway databases.

Task 2
As you will have noticed, the most elaborate molecular-interaction data sets provided in public databases are from model organisms such as yeast, fly, Arabidopsis, rice, mouse, and human, whose genomes have been sequenced. In this task you are asked to create a network for the genes on a microarray of wheat (or any other organism you are interested in, but for which no gene-interaction network is directly available). You don't need to do this for all the thousands of genes -- just pick a manageable number (maybe 10 or 20), and focus on genes that are differentially regulated across some experimental conditions. Then map expression data onto them with Cytoscape as in the exercise above.
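If you want a quick way to pick that manageable set of genes, here is a minimal sketch in Python that ranks probesets by absolute log2 fold change between two conditions and keeps the top few. Everything in it is an assumption made for illustration: the probeset names, the intensity values, the pseudocount, and the choice of fold change as the ranking criterion (a proper statistical test on replicated data would be better if you have replicates).

```python
import math

def log2_fold_change(control, treated, pseudocount=1.0):
    """Log2 ratio of treated to control; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))

def pick_top_genes(expression, n=10):
    """Return the n probesets with the largest absolute log2 fold change.

    expression: dict mapping probeset name -> (control, treated) intensities.
    """
    scored = {p: log2_fold_change(c, t) for p, (c, t) in expression.items()}
    ranked = sorted(scored, key=lambda p: abs(scored[p]), reverse=True)
    return [(p, scored[p]) for p in ranked[:n]]

# Hypothetical intensities, invented for this example only.
expression = {
    "probe_A": (100.0, 800.0),
    "probe_B": (500.0, 510.0),
    "probe_C": (900.0, 100.0),
    "probe_D": (50.0, 60.0),
}

top = pick_top_genes(expression, n=2)
for probeset, lfc in top:
    print(f"{probeset}\t{lfc:.2f}")
```

Ranking on the absolute value keeps both strongly induced and strongly repressed genes, which is usually what you want when the goal is to see contrast on a network image.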

Recall that in Lab 10 we saw that about 30% of the 61K probesets on the wheat chip could be assigned GO classifications based on homology with counterparts in other organisms, and this allowed us to do a modest GO enrichment investigation. Surely the interactions among homologs of those genes in other species can be transferred in some way to our genes, so that we can map our expression results onto them. I'm thinking here of any kind of interaction: from protein-protein interaction to enzymatic pathways or the binding of transcription factors to genes. The MetNet plugins mentioned above may be helpful here, especially one called OmicsViz, which allows you to translate names from one species to another. Again, don't just use sample data from a tutorial -- you must create your own results.
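One simple way to picture the homology-based transfer described above: if two of our genes have homologs that interact in a reference species, infer an edge between our two genes. The Python sketch below does just that and prints the result in Cytoscape's simple interaction format (SIF: source, interaction type, target). The identifiers, the homolog table, the reference edges, and the interaction label "inferred" are all hypothetical; in practice the homolog table would come from your own BLAST or annotation results.

```python
def transfer_interactions(homolog_of, reference_edges):
    """Infer edges among our genes from edges among their homologs.

    homolog_of: dict mapping our gene/probeset ID -> reference-species gene ID.
    reference_edges: iterable of (gene_a, gene_b) pairs from the reference species.
    Returns a sorted list of unique, undirected (our_id_1, our_id_2) pairs.
    """
    # Invert the table: one reference gene may have several homologs among our genes.
    ours_by_ref = {}
    for our_id, ref_id in homolog_of.items():
        ours_by_ref.setdefault(ref_id, set()).add(our_id)

    inferred = set()
    for a, b in reference_edges:
        for ours_a in ours_by_ref.get(a, ()):
            for ours_b in ours_by_ref.get(b, ()):
                if ours_a != ours_b:
                    # Sort each pair so A-B and B-A count as the same edge.
                    inferred.add(tuple(sorted((ours_a, ours_b))))
    return sorted(inferred)

# Hypothetical data, invented for this example only.
homolog_of = {
    "wheat_1": "REF_X",
    "wheat_2": "REF_Y",
    "wheat_3": "REF_Y",
    "wheat_4": "REF_Z",
}
reference_edges = [("REF_X", "REF_Y"), ("REF_Y", "REF_W")]

edges = transfer_interactions(homolog_of, reference_edges)
for source, target in edges:
    print(f"{source}\tinferred\t{target}")
```

Note that reference edges with no homolog on our side (REF_W here) simply drop out, and that a gene with several homologs produces several inferred edges. Whether such transferred edges are biologically believable is a separate question that your report should address.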

One resource providing metabolic pathways for many organisms (for example, wheat) is the Plant Metabolic Network. If you follow a hyperlink to a pathway you'll see a button allowing you to download the network in BioPAX format, which is readable by Cytoscape. You might try substituting wheat probeset names with the names of the enzymes to which the probesets are homologous, for example.
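The name substitution suggested above can be sketched in a few lines of Python. This version also averages the expression values when several probesets map to the same enzyme, which is one possible choice among several (you might prefer the maximum, or the most variable probeset). The probeset names, enzyme names, and values are hypothetical, and the output is a plain tab-delimited table that I assume you would then load as node attributes.

```python
from collections import defaultdict
from statistics import mean

def expression_by_enzyme(probeset_to_enzyme, probeset_values):
    """Rename probesets to enzyme names, averaging probesets that share an enzyme.

    probeset_to_enzyme: dict mapping probeset name -> enzyme name.
    probeset_values: dict mapping probeset name -> list of values, one per condition.
    Probesets with no enzyme assignment are skipped.
    """
    grouped = defaultdict(list)
    for probeset, values in probeset_values.items():
        enzyme = probeset_to_enzyme.get(probeset)
        if enzyme is None:
            continue
        grouped[enzyme].append(values)
    # zip(*rows) walks the conditions column by column.
    return {enzyme: [mean(column) for column in zip(*rows)]
            for enzyme, rows in grouped.items()}

# Hypothetical data, invented for this example only.
probeset_to_enzyme = {
    "probe_1": "enzyme_A",
    "probe_2": "enzyme_A",
    "probe_3": "enzyme_B",
}
probeset_values = {
    "probe_1": [2.0, 4.0],
    "probe_2": [4.0, 8.0],
    "probe_3": [1.0, 1.5],
    "probe_4": [9.0, 9.0],  # no enzyme assignment, so it is dropped
}

table = expression_by_enzyme(probeset_to_enzyme, probeset_values)
print("name\tcondition_1\tcondition_2")
for enzyme in sorted(table):
    print(enzyme + "\t" + "\t".join(f"{v:.2f}" for v in table[enzyme]))
```

The enzyme names in your table must match the node names in the downloaded pathway exactly, or the values will not attach to any node.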

Try to do these tasks with available tools, but if you think you need a tool that isn't available, check with me and I may be able to supply a Perl script.

You are welcome to work together with classmates on a task (but, again, if it turns out very easy, please tackle the other task too). Reports must be written independently.

In your report, tell what you did and how you did it, in the usual concise but informative way, employing hyperlinks and images where helpful. Your results may be used in future exercises, so they must not leave anything vague. Work that makes use of many resources (showing that you really explored them and the Cytoscape plugins) will be especially favored.