Lab 14. Automated network retrieval, viewing experimental
data on networks
This lab doesn't have any embedded questions.
Instructions for your report are given at the end.
Purpose
In this lab you'll use Cytoscape to contact public databases directly for
retrieval of interaction networks, to view modules of a network, and to
superimpose RNA-abundance data
on an interaction network image. The tutorial will show you some
possibilities, and then we will attempt to extend them to other data.
You will have two weeks to report on this lab, but please finish as
much as you can today. Software like Cytoscape, which supports
community development and which exploits common standards for data
integration across the WWW, is in my opinion the future of
bioinformatics. Because the biological networks that Cytoscape is built
to handle are arguably the future of biology, it's a good idea to learn
how to use this software. You'll probably be seeing it again in your
professional life.
1. You won't need to
reinstall Cytoscape, as it has
been installed in the PLPTH_613
directory on three of the lab computers. You'll find it convenient to
install the program on your own computer if you have one available,
since we must also download and install several plugins.
2. Begin with this
tutorial on using
Cytoscape with web services. When you reach the section on
installing plugins, here is how to install them: choose menu Plugins/Manage Plugins and locate,
select, and install the plugins listed in the tutorial (you may skip
the Agilent Literature Search
plugin, as we will skip the exercise that uses it). Most install
quickly but a couple of them take a minute or two.
3. After all the plugins
are installed, exit Cytoscape with File/Quit,
and restart the program. If you don't do this, the
newly installed web-services clients may not appear in the
available-client list at step 2 in the tutorial.
4. As you follow the
tutorial, follow these lab instructions too. When you get to the step
labeled Import Known Pathways and
Interactions from Pathway Commons, use menu File/Import/Network from web services... After
the PPARG search returns, the way to "load all pathways", as directed
in the tutorial, is rather unclear. The tutorial image shows
only five pathways, but you'll find many more in the actual dialog, and
we don't need them all. So load only a few by double-clicking on them.
When asked if you want to merge them, just select Create New Network. Sometimes
when retrieving these networks you'll get an error message. I've found
that if you just try the same thing again, it will work. Some of these
WWW sites require a few seconds of delay between requests and will
refuse to cooperate if too many requests are received in a short time.
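If you ever script such retrievals yourself, the "wait a few seconds and try again" habit from step 4 can be sketched as below. This is only an illustration of the idea, not how Cytoscape does it internally; `fetch_with_retry` and `flaky_request` are hypothetical names, and the stand-in request simply pretends to fail twice.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch() until it succeeds, pausing between failed attempts."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as error:
            last_error = error
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error


# Hypothetical stand-in for a web-service call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("server busy")
    return "network data"


# delay_seconds=0.0 keeps this demonstration instant; a real server
# would want a pause of a few seconds.
result = fetch_with_retry(flaky_request, attempts=3, delay_seconds=0.0)
print(result, "after", calls["count"], "attempts")
```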
5. To merge these
networks as directed in the tutorial, first use Ctrl-clicking to select
the networks you've just loaded in the Network Panel at left of
the main Cytoscape viewing frame. Now choose menu Plugins/Merge networks. Apply a
spring-embedded layout. Comparing
the union to the individual networks, do the latter appear to share any
nodes?
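For those who like to see the logic spelled out, the union merge and the shared-node comparison in step 5 amount to simple set operations. The sketch below assumes each network is just a set of node pairs; the pathway names and the `GENE_*` nodes are made up for illustration and are not retrieved data.

```python
from itertools import combinations

# Hypothetical toy networks: each is a set of (node, node) edges.
networks = {
    "pathway_A": {("PPARG", "GENE_1"), ("PPARG", "GENE_2")},
    "pathway_B": {("PPARG", "GENE_1"), ("GENE_1", "GENE_3")},
    "pathway_C": {("GENE_4", "GENE_5")},
}


def nodes_of(edges):
    """Collect every node that appears in a set of edges."""
    return {node for edge in edges for node in edge}


def merge_union(named_networks):
    """Union merge: keep every edge that occurs in any network."""
    merged = set()
    for edges in named_networks.values():
        # Sort each pair so (A, B) and (B, A) count as the same edge.
        merged |= {tuple(sorted(edge)) for edge in edges}
    return merged


def shared_nodes(named_networks):
    """For each pair of networks, list the nodes they have in common."""
    return {
        (name_a, name_b): nodes_of(edges_a) & nodes_of(edges_b)
        for (name_a, edges_a), (name_b, edges_b)
        in combinations(named_networks.items(), 2)
    }


union = merge_union(networks)
overlap = shared_nodes(networks)
print(len(union), "edges in the union")
for pair, nodes in overlap.items():
    print(pair, sorted(nodes))
```

Networks that share no nodes, like the hypothetical `pathway_C` here, end up as separate islands in the merged view.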
6. In the section Import Binary Interactions from IntAct,
you can name this network PPARG from
IntAct.
7. The section Import KEGG Pathway using BioRuby
may seem very confusing, and you may not realize how cool it is to
a bioinformatics geek.
BioRuby is a bioinformatics library for Ruby, a scripting language (like a programming language, only used
mainly to run other programs), and is being used here to drive
Cytoscape's internal machinery both for retrieving remote data and for
laying out networks. You can imagine that much larger research tasks
could be automated in this way, though most biologists would need
specialist help.
8. Examine the three
networks imported from WikiPathways.
What is different about these
networks in comparison with the others?
9. We won't do
anything further with these networks (and they will not be needed in
your report), so you may Destroy
them. However, note for future reference that Cytoscape permits you to
save all of your networks at once (using File/Save, of course) to a .cys file. You can see how this
would be useful if it took half an hour or more to import the network
data.
10. Cytoscape allows us to combine an expression data set with a network in
order to visualize regions that are coregulated. Follow this
tutorial (which is reached from topic 13.5.3 Visualizing Expression Data on a
Network on the Cytoscape manual page) to see how this is done. Don't use the tutorial
04_Expression_Data, which
doesn't apply to the latest version of Cytoscape and will not work right.
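Behind the scenes, superimposing expression data on a network is a matter of looking up each node's value and translating it into a color. Here is a minimal sketch of that idea, assuming log ratios on a blue-white-red gradient clamped at -2 and +2; the gradient limits, the gray default, and the `GENE_*` names are my own assumptions for illustration, not Cytoscape's settings.

```python
def ratio_to_color(value, low=-2.0, high=2.0):
    """Map a log ratio to a hex color on a blue-white-red gradient."""
    clamped = max(low, min(high, value))
    if clamped < 0:
        fraction = clamped / low      # 0 at white, 1 at full blue
        red = green = round(255 * (1 - fraction))
        blue = 255
    else:
        fraction = clamped / high     # 0 at white, 1 at full red
        red = 255
        green = blue = round(255 * (1 - fraction))
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)


# Hypothetical network nodes and expression values (log ratios).
network_nodes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
expression = {"GENE_A": 2.0, "GENE_B": -2.0, "GENE_C": 0.0}

NO_DATA_COLOR = "#cccccc"  # nodes without a measurement stay gray

node_colors = {
    node: ratio_to_color(expression[node]) if node in expression else NO_DATA_COLOR
    for node in network_nodes
}
for node, color in node_colors.items():
    print(node, color)
```

Note that the join only works when the identifiers in the expression table match the node names in the network exactly, which is the usual stumbling block.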
11. This tutorial
on modules and complexes will not be needed for the report, so
whether you follow it is your choice. Like the Omics Viewer that we
played with two weeks ago, it will show you how subnetworks can be made
to "light up" during a time-course expression experiment. It may take
you about half an hour to complete.
12. Applying your learning. We will now attack more open-ended
problems, using Cytoscape with other interaction and pathway resources.
You will have two weeks for this (submit reports by 5.14), but
please don't put it off
and do a hasty and careless job at the last minute. Select one of the
following tasks (if it turns out easy, do the other one too and get
extra credit):
Task 1
We have seen how to map expression data onto a gene network. We can
also map metabolomics data onto a metabolic-pathway network. You are
asked to acquire
a suitable network and a set of experimental metabolite-abundance data
from a public database, and create an exercise like the one you
followed in step 10 above, but with
metabolomic instead of RNA expression data. Cytoscape offers a
plugin called MetScape, and you can also obtain several useful plugins
from the Metabolic Network
Exchange.
You will have to find suitable experimental data on your own; don't
just use sample data from someone else's tutorial! A good starting
point may be PathGuide, which
contains links to an overwhelming number of interaction and pathway
databases.
Task 2
As you will have noticed, the most elaborate molecular-interaction data
sets provided in public databases are from model organisms such as
yeast, fly, Arabidopsis,
rice, mouse, and human, whose genomes have been sequenced. In this task
you are asked to create a network
for the genes on a
microarray of wheat (or any other organism you are interested in, but
for which no gene-interaction network is directly available).
You don't
need to do this for all the thousands of genes -- just pick a
manageable number (maybe 10 or 20), and focus on genes that are
differentially regulated across some experimental conditions. Then map
expression data onto them with Cytoscape as in the exercise above.
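One crude way to pick a manageable set of differentially regulated genes is to rank probesets by the size of their log2 ratio between two conditions. The sketch below assumes one intensity value per probeset per condition and ignores replicates and statistics entirely; the function name `top_changed`, the probeset names, and the numbers are invented for illustration.

```python
import math


def top_changed(control, treated, count):
    """Return the `count` probesets with the largest absolute log2 ratio."""
    log_ratios = {
        probe: math.log2(treated[probe] / control[probe])
        for probe in control
        if probe in treated
    }
    ranked = sorted(log_ratios, key=lambda probe: abs(log_ratios[probe]), reverse=True)
    return [(probe, log_ratios[probe]) for probe in ranked[:count]]


# Hypothetical intensities for four probesets in two conditions.
control = {"probe_1": 100.0, "probe_2": 100.0, "probe_3": 100.0, "probe_4": 100.0}
treated = {"probe_1": 800.0, "probe_2": 110.0, "probe_3": 25.0, "probe_4": 100.0}

selected = top_changed(control, treated, count=2)
for probe, ratio in selected:
    print(probe, ratio)
```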
Recall that in Lab 10 we saw that about 30%
of the 61K probesets on the wheat chip could be assigned GO
classifications based on homology with counterparts in other organisms,
and this allowed us to do a modest GO enrichment investigation. Surely
the interactions among homologs of those genes in other species can be
transferred in some way to our genes, so that we can map our expression
results onto them. I'm thinking here of any kind of interaction, from
protein-protein interaction to enzymatic pathways or
transcription factor-gene binding. The MetNet plugins mentioned
above may be helpful here, especially one called OmicsViz,
which allows you to translate names from one species to another. Again,
don't just use sample data from a tutorial -- you must create your own
results.
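The transfer of interactions through homology described above can be sketched in a few lines. This assumes you already have a table assigning each wheat probeset to one homolog in a model organism, plus a list of interactions among the model-organism genes; the probeset and gene names below are invented. The output lines use Cytoscape's simple interaction format (SIF), with "pp" as the interaction type.

```python
def transfer_interactions(model_edges, homolog_of):
    """Carry model-organism interactions over to probesets via homology.

    homolog_of maps each probeset to its model-organism homolog.
    """
    # Invert the table: model gene -> all probesets homologous to it.
    probes_for = {}
    for probe, gene in homolog_of.items():
        probes_for.setdefault(gene, set()).add(probe)

    transferred = set()
    for gene_a, gene_b in model_edges:
        for probe_a in probes_for.get(gene_a, ()):
            for probe_b in probes_for.get(gene_b, ()):
                if probe_a != probe_b:
                    transferred.add(tuple(sorted((probe_a, probe_b))))
    return transferred


# Hypothetical inputs: interactions in a model organism and a homology table.
model_edges = [("MODEL_GENE_1", "MODEL_GENE_2"), ("MODEL_GENE_2", "MODEL_GENE_3")]
homolog_of = {
    "Ta.1": "MODEL_GENE_1",
    "Ta.2": "MODEL_GENE_2",
    "Ta.3": "MODEL_GENE_2",
    "Ta.9": "MODEL_GENE_9",
}

wheat_edges = transfer_interactions(model_edges, homolog_of)

# One tab-delimited SIF line per edge: source, interaction type, target.
sif_lines = ["{}\tpp\t{}".format(a, b) for a, b in sorted(wheat_edges)]
print("\n".join(sif_lines))
```

Keep in mind that an interaction inferred this way is only a hypothesis: when several probesets share one homolog, as `Ta.2` and `Ta.3` do here, each of them inherits the same edges.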
One resource providing metabolic pathways for many organisms (for
example, wheat)
is the Plant Metabolic Network.
If you follow a hyperlink to a pathway you'll see a button allowing you
to download the network in BioPAX format, which is readable by
Cytoscape. You might try substituting wheat probeset names with the
names of the enzymes to which the probesets are homologous, for example.
Try to do these tasks with available tools, but if you think you need a
tool that isn't available, check with me and I may be able to supply a
Perl script.
You are welcome to work together with classmates on a task
(but, again, if it turns out very easy, please tackle the other task
too). Reports must be written independently.
In
your report, tell what you did and how you did it, in the usual concise
but informative way, employing hyperlinks and images where helpful.
Your results may be used in future exercises, so they must not leave
anything vague. Work that makes use of many resources (showing that you
really explored them and the Cytoscape plugins) will be especially
favored.