This Special Advertising Feature is brought to you by AAAS OPMS

DOI: 10.1126/science.opms.p1000046

Protein-Protein Interaction Technologies Toward a Human Interactome



The human genome has been called the "blueprint of life," but it's really more of a parts list. Cellular architecture is better defined by its complexes, the molecular machines that actually make a cell, a cell. French researchers first coined the term "interactome" in 1999; the first protein-protein interactome data appeared in 2000. Today the field—like the 11-year-old it is—is maturing rapidly. Interactome research has racked up more than 560 publications, and databases now house interactions numbering in the hundreds of thousands. Still, as international efforts to map the human protein-interaction network get under way, it's clear interactomics has a long way to go.

By Jeffrey M. Perkel

Inclusion of companies in this article does not indicate endorsement by either AAAS or Science, nor is it meant to imply that their products or services are superior to those of other companies.

There are some 213,000 protein-protein interactions logged in the IntAct database, 169,000 in the BioGRID database. Represented graphically as starbursts of protein "nodes" linked by interaction "edges," these interactions are collected for one important reason: Proteins, like humans, are social animals. From DNA replication to protein degradation, the work of the cell is accomplished mostly by macromolecular complexes—a fact that researchers, awash in genome sequence data but bereft of functional annotation, can exploit to gain insight into what proteins do.

"Knowing about the sociology of your protein is an integral part of today's discovery process," says Giulio Superti-Furga, scientific director of the Center for Molecular Medicine at the Austrian Academy of Sciences.

It's like profiling a protein by cataloging its Facebook friends. Functional annotation by molecular association is becoming standard practice, and today a genome without an interactome is effectively unfinished. Large-scale efforts are under way to fill the gaps. The Canadian government, in collaboration with other partners, has awarded nearly $23 million "for the creation of a national technology platform aimed at mapping the human interactome," including $9.16 million in June 2009 from the Canada Foundation for Innovation (CFI). A subset of these researchers, with colleagues in the United States and Europe, now seeks to take the project international through a nascent effort called the International Interactome Initiative, or I3.

"After the sequencing of the human genome, the next step for many people is to find how these proteins and other molecules interact and combine together to sustain life," says Benoit Coulombe of the Clinical Research Institute of Montreal, who is the principal investigator on the CFI grant and a member of the I3 Steering Committee.

Technologies are already in place, but no one lab can do it alone; the interactome, like the genome, must be deciphered collaboratively. Yet if the human genome is any guide, the time may not be long until a first, albeit "drafty," human interactome is unveiled, says Marc Vidal, director of the Center for Cancer Systems Biology at the Dana-Farber Cancer Institute and professor of genetics at Harvard Medical School.

"We are now 21 years after Fields and Song [published the pivotal yeast two-hybrid method of interaction detection]," Vidal says. "It took 25 years between Sanger and Maxam and Gilbert to get a good version of the human genome sequence. So let's say we are half a decade away from that, give or take."


Vidal's enthusiasm notwithstanding, even a drafty interactome will require massive effort.

A contraction of interaction and genome, the term interactome reflects the fact that its practitioners apply whole-genome sensibilities to their work. But that's where the similarities end. Whereas genomes are discrete entities of defined size, interactomes are moving targets. Like Sisyphus with his boulder, researchers could labor forever and never "finish" the interactome.

First, there's an issue of scale. The human genome encodes about 20,000 genes, producing some 200 million possible total pairwise protein combinations. In 2009, Vidal calculated that the human interactome, "excluding splice variant complexity," contains between 74,000 and 200,000 binary interactions. Of these, he says, researchers have mapped "perhaps 10,000" high-quality interactions. "So we are probably one order of magnitude if not 50-fold away from where we need to go."
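The arithmetic behind these figures can be reproduced in a few lines. The sketch below assumes one protein per gene (no splice variants) and uses only the numbers quoted above; the variable names are illustrative.

```python
from math import comb

GENES = 20_000                 # approximate gene count quoted above
LOW, HIGH = 74_000, 200_000    # estimated range of binary interactions
MAPPED = 10_000                # "perhaps 10,000" high-quality interactions

# Unordered pairs of distinct proteins, assuming one protein per gene.
pairs = comb(GENES, 2)

# Share of all candidate pairs expected to interact, at each end of the estimate.
density_low = LOW / pairs
density_high = HIGH / pairs

# How many times larger the estimated network is than the mapped portion.
fold_low = LOW / MAPPED
fold_high = HIGH / MAPPED

print(f"candidate pairs: {pairs:,}")
print(f"interacting share of pairs: {density_low:.3%} to {density_high:.3%}")
print(f"estimated network vs. mapped: {fold_low:.1f}x to {fold_high:.1f}x")
```

On these figures alone, real interactions are a tiny fraction of the candidate pairs, and the estimated network is roughly 7 to 20 times larger than what has been mapped at high quality.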

Yet the problem certainly is bigger than that, if for no other reason than that these interactions need to be detected multiple times to be of high confidence. And in any event, the interaction landscape is fluid.

"These are very dynamic things," says Tony Pawson, distinguished scientist at the Samuel Lunenfeld Research Institute (SLRI) in Toronto. "Many of the interactions that control cellular behavior are dependent on posttranslational modifications induced by growth factors or other signals, and those interactions change continuously. And the expression of proteins and their interactions vary from cell to cell."

For every cell type, stressor, stimulus, nutritional condition—basically, for any variable one can imagine, including time—there exists a unique interactome, like a frame in a movie. Even without dramatic cellular changes, the interactome is a living, breathing thing. "The interactome is basically in flux all the time," says SLRI senior investigator Jeff Wrana.

Michael Rout of Rockefeller University and John Aitchison of the Institute for Systems Biology have devised one approach to deal with that problem. The two researchers, both PIs with the National Center for Dynamic Interactome Research (NCDIR), capture proteomic motion via interactome dynamics—effectively, collecting multiple interaction network movie frames over time.

"We are big believers that the most dangerous thing you can do in science is to get put off because you can't solve a problem," Aitchison says.

Focusing on such problems as viral infection, nuclear transport, and peroxisome biogenesis, NCDIR researchers collect interactomic "stills" under different biological conditions and then strive computationally to infer what's different.

"We start with two states and then try to bring down the time resolution to see how those interactions change with time," Aitchison says.


To collect their data, Rout and Aitchison use affinity purification coupled to mass spectrometry (AP/MS), in which endogenous multiprotein complexes are purified and analyzed en masse; Vidal prefers the yeast two-hybrid assay (Y2H), in which two proteins are coupled to the two halves of a transcription factor and expressed in yeast. If the two proteins—bait and prey—interact in the nucleus, they reconstitute a functioning DNA-binding protein, activating a reporter gene that signals the molecules' intracellular pas de deux.

Easily automated and amenable to high throughput analysis, both approaches have been applied in genome-scale studies. In one recent example, Vidal's team probed 100 million pairwise protein combinations in the Caenorhabditis elegans proteome using Y2H, from which they derived 1,816 interactions—about 1 percent of the predicted nematode network of 116,000 interactions.

Yet Y2H and AP/MS represent two complementary—some would say competing—strategies for interactome mapping, and most agree both are critical to a fully populated interactome map. "You are asking different questions with the [two] approaches," says Superti-Furga, who uses AP/MS to probe cell signaling pathways. On the one hand are what Vidal calls "binary interactome mappers," researchers who use techniques (such as Y2H, luminescence-based mammalian interactome [LUMIER], and the protein complementation assay [PCA]) to directly probe pairwise interaction potential—that is, can protein X interact with protein Y. On the other hand are "co-complex membership guys," scientists (including most members of the Canadian national platform) who use AP/MS to ask, when purifying protein X, what else comes along for the ride (whether directly or not)?

Not surprisingly, the two workflows produce different and often only partly overlapping results. For instance, AP/MS can detect indirect interactions in complexes mediated by a third (bridging) protein—contacts that cannot be seen via traditional binary methods. Yet co-complex-based approaches can distinguish neither proteins that physically interact directly from those that do not, nor mutually exclusive forms of the same complex, complicating the resulting networks.
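A toy example makes the ambiguity concrete. The sketch below is illustrative only: it takes a made-up purification in which bait X binds B directly and B bridges to C, then turns the co-purified list into network edges in two ways, often called "spoke" and "matrix" expansions in the literature (terms not used in this article). The protein names and the set of direct contacts are assumptions.

```python
from itertools import combinations

def spoke_edges(bait, preys):
    """Link the bait to each co-purified prey."""
    return {frozenset((bait, p)) for p in preys if p != bait}

def matrix_edges(bait, preys):
    """Link every pair of proteins seen in the same purification."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}

def show(edges):
    """Render a set of edges as sorted, readable tuples."""
    return sorted(tuple(sorted(e)) for e in edges)

# Hypothetical ground truth: X binds B directly, and B bridges to C.
true_direct = {frozenset(("X", "B")), frozenset(("B", "C"))}
bait, preys = "X", ["B", "C"]

spoke = spoke_edges(bait, preys)
matrix = matrix_edges(bait, preys)

print("spoke edges that are not direct contacts:", show(spoke - true_direct))
print("direct contacts the spoke reading misses:", show(true_direct - spoke))
print("matrix edges that are not direct contacts:", show(matrix - true_direct))
```

Neither reading recovers exactly the direct contacts from the purification alone, which is why binary assays and co-complex data are treated as complementary.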


Complicated as they are, protein-protein interactions represent only part of the interactome; a complete network map also contains protein-nucleic acid, protein-small molecule, and genetic interactions.

In 2009 Bernhard Palsson, Galletti Professor of Bioengineering at the University of California, San Diego (UCSD), and his group combined genome-scale protein-DNA interaction (ChIP-chip) data, expression profiling, 5'-end sequencing, and mass spectrometry data to detail what he calls "the metastructure of a genome"—basically, a multilayer atlas correlating gene expression with RNA polymerase binding and transcription start sites.

Using this dataset, collected using E. coli, Palsson's team determined that the classical model of bacterial operons is incomplete. "It turns out that segments of DNA can generate multiple different transcripts," he says. "For instance, the threonine operon in E. coli was once thought of as one operon, but we show that at least five different transcripts can come off that region of genome."

By analyzing such datasets and comparing them across multiple species, researchers can interrogate the evolutionary plasticity of regulatory networks. According to Trey Ideker, chief of medical genetics at UCSD, who has studied protein-protein network conservation across evolutionary time, while protein-protein interaction networks are largely static, transcriptional networks are relatively fluid. "Where those transcriptional complexes bind and what genes they control seems to change a lot from species to species," he says.

Also relatively plastic, Ideker says, are genetic interactions, which probe the programming logic of interaction networks.

Genetic interaction screens combine genetic mutations to infer redundancies and identify relationships between biological processes. Basically, when mutant A and mutant B are joined in a cell, if the resulting phenotype is synergistic rather than additive, an interaction is inferred.
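The logic can be sketched as a comparison against an additive expectation. This is a minimal illustration, not any group's published scoring method: the defect scale (0 for wild type, 1 for total loss of the measured phenotype), the cap at 1.0, and the tolerance are all assumptions.

```python
def additive_test(defect_a, defect_b, defect_ab, tolerance=0.05):
    """Compare a double mutant's defect with the sum of the single-mutant defects.

    Returns (expected, excess, interacts), where `interacts` is True when the
    double mutant is worse than the additive expectation by more than `tolerance`.
    """
    expected = min(defect_a + defect_b, 1.0)   # additive expectation, capped at total loss
    excess = defect_ab - expected              # positive means worse than additive
    return expected, excess, excess > tolerance

# Hypothetical measurements: each single mutant is mildly impaired,
# but the double mutant is far sicker than the two defects combined.
expected, excess, interacts = additive_test(0.10, 0.20, 0.80)
print(f"expected {expected:.2f}, excess {excess:.2f}, interaction inferred: {interacts}")
```

A double mutant whose defect simply matches the sum, such as `additive_test(0.10, 0.20, 0.30)`, would not be flagged.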

"You can't just assay which proteins physically bump up against one another; one has to know about how these proteins function relative to one another," says Ideker.

Vidal has devised a different approach to sussing out logical relationships. Recognizing that many proteins have multiple partners, and thus that probing interaction function by deleting those proteins altogether likely will have unintended consequences, his team has begun developing what he calls "edgetic" mutants, that is, mutants that affect only a single protein-protein interaction "edge" in the vast interactome network.

"We call them edgetic because we want to push the notion of 'genetics of edges,'" explains Vidal, who in 2009 used such mutants to probe the biology of the apoptotic protein CED-9 in C. elegans. "We want to go after genetic perturbations that affect one edge at a time, leaving all the others wild-type and then asking in vivo, what is the consequence?"


As with all genomic-scale datasets, such information is merely a jumping-off point, a wellspring of new hypotheses. Yet many researchers question the quality of interactions derived from high throughput assays. To assuage that concern, researchers have devised confidence metrics, similar to the Q20 scores assigned to sequenced nucleotides.

Alexey Nesvizhskii, a computational scientist at the University of Michigan, together with Anne-Claude Gingras and Mike Tyers, two SLRI investigators who study kinase/phosphatase interaction networks, has developed one such metric for individual protein-protein interactions detected by AP/MS, which they call SAINT (significance analysis of interactome).

SAINT, Gingras explains, essentially computes an interaction's P value based on the number of peptides detected for a given protein, how often the protein is detected across different purification conditions, the length of the protein, and other variables.

"We are actually able to give you a number saying, for this particular bait-prey [pair], you get a score of 0.9, which we expect gives you a probability of 90 percent that it is real and not just a spurious detection," she says.

With Nesvizhskii and Tyers, Gingras has used SAINT to qualify interactions in a yeast kinase-phosphatase interaction network comprising some 1,800 interactions, and is now working with interaction databases like BioGRID and IntAct to incorporate those scores into their record architecture.

Vidal and his colleagues have also developed a confidence score. They tested reference sets of 92 well-characterized, positive-control interacting pairs and 92 random negative-control pairs in each of five binary approaches: Y2H, LUMIER, PCA, MAPPIT (mammalian protein-protein interaction trap), and NAPPA (nucleic acid programmable protein array). By counting how often a given interaction appears in each of the five methods, the score reflects the likelihood that the interaction is "true."
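The counting step can be sketched as follows. This is a simplified illustration rather than the team's actual scoring procedure: the article does not say how counts are calibrated into a likelihood, so the sketch stops at tallying assay support. The protein identifiers and detections are made up.

```python
ASSAYS = ("Y2H", "LUMIER", "PCA", "MAPPIT", "NAPPA")

def assay_support(detections):
    """Count how many of the five binary assays reported each protein pair."""
    support = {}
    for assay, pairs in detections.items():
        if assay not in ASSAYS:
            raise ValueError(f"unknown assay: {assay}")
        # Treat A-B and B-A as the same pair, and count each assay at most once.
        for key in {frozenset(pair) for pair in pairs}:
            support[key] = support.get(key, 0) + 1
    return support

# Made-up detections, for illustration only.
detections = {
    "Y2H": [("P1", "P2"), ("P3", "P4")],
    "LUMIER": [("P2", "P1")],
    "PCA": [("P1", "P2"), ("P5", "P6")],
    "MAPPIT": [],
    "NAPPA": [("P1", "P2")],
}

support = assay_support(detections)
for pair, n in sorted(support.items(), key=lambda kv: -kv[1]):
    print(sorted(pair), f"{n}/{len(ASSAYS)} assays")
```

Pairs seen by several independent assays rank above those seen by only one, which is the intuition behind using the count as a confidence measure.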

Yet the team's analysis also highlighted a concern with existing interaction methodologies: As the team profiled and tweaked the various assays, they found that under conditions that maximized detection of the positive controls while minimizing detection of the negative controls, "none of the methods were perfect," Vidal says. In fact, each technique detected only 20 percent to 30 percent of the positive control interactions.

But there is a silver lining: Now that the team knows how to maximize positive-control detection while minimizing negative-control detection, confidence in future experiments should rise accordingly.

Given the magnitude of the problem, these experiments should keep Vidal and his colleagues busy for some time to come.

"When can we declare victory? I don't know if we ever will," Vidal says. "It depends on how we define it."

Commercial Interactome Tools

Interactomics, like genomics, requires serious automation. Pipetting, tracking, and logging all those pairwise combinations or AP/MS trials is not for the faint of heart.

For those who lack the stomach (or the infrastructure), firms like Hybrigenics and Dualsystems Biotech offer yeast two-hybrid-based screening services.

Alternatively, researchers can do the work in-house. Vectors, strains, and reagents for both omics-level strategies are widely available and include the Invitrogen ProQuest Two-Hybrid System with Gateway Technology from Life Technologies and the InterPlay Adenoviral TAP System from Agilent Technologies.

Lower throughput methods have also been commercialized. One new offering: Life Technologies' TaqMan Protein Assays. Combining TaqMan-based real-time PCR assays with Olink Biosciences' proximity ligation assay technology, the assay uses two antibodies, each directed to one member of an interacting pair and coupled to a unique DNA sequence. If the proteins physically interact in vitro, the binding of their cognate antibodies brings the oligonucleotide tails close enough for ligation to occur, the result of which may be detected via real-time PCR.

Whichever method you choose to tackle the interactome, be prepared for data overload. According to Anne-Claude Gingras, an investigator at the Samuel Lunenfeld Research Institute, data tracking "is a tremendous problem" in AP/MS studies, and in interactomics in general. A good laboratory information management system (LIMS), she says, "is actually really key."

Featured Participants

Austrian Academy of Sciences

Canada Foundation for Innovation

Clinical Research Institute of Montreal

Dana-Farber Cancer Institute

Harvard Medical School

Institute for Systems Biology

National Center for Dynamic Interactome Research

Rockefeller University

Samuel Lunenfeld Research Institute

University of California, San Diego

University of Michigan

Note: Readers can find out more about the companies and organizations listed by accessing their sites on the World Wide Web (WWW). If the listed organization does not have a site on the WWW or if it is under construction, we have substituted its main telephone number. Every effort has been made to ensure the accuracy of this information.

Jeffrey M. Perkel is a freelance science writer based in Pocatello, Idaho.


This article was published as a special advertising feature in the 23 July 2010 issue of Science magazine.

New Products: Interactomics

The Biacore 4000 is a powerful solution for large-scale, label-free molecular interaction analysis in drug discovery, from early screening to characterization. The system delivers high-quality binding, kinetic, affinity, concentration, and specificity data in both screening assays and detailed characterization studies. Designed for large-scale parallel interaction analyses, the Biacore 4000 is capable of analyzing up to 4,800 interactions in 24 hours. Dedicated software packages are available to support small-molecule drug discovery and antibody screening and characterization. In combination with the LMW Extension Package software, the Biacore 4000 delivers the sensitivity, throughput, and high-quality data required for fragment screening, lead selection, and optimization. The Antibody Extension Package provides dedicated software tools for high throughput kinetic studies and epitope mapping, enabling rapid identification of promising candidates.

GE Healthcare/Biacore
For info: 800-526-3593

Functional Protein Expression Kit

The Invitrogen MembranePro Functional Protein Expression (FPE) Kit is an easy-to-use method for obtaining functional membrane proteins, including GPCRs, from mammalian cells. The kit can be used to clone and transfect GPCRs or other membrane protein genes into 293FT cells, collect the culture media after 48 hours, and precipitate the secreted membrane particles overnight. After resuspension, the particles are ready for immediate use in downstream biochemical studies. The simple process produces particles comparable to conventional membrane preparations with the added advantage of increased receptor density. MembranePro products are offered in two configurations, an expression kit with 10 reactions and a support kit available in 10, 60, or 600 reaction sizes.

Life Technologies Corporation
For info: 760-603-7200

Human miRNA Expression Assay Kit

The Human microRNA (miRNA) Expression Assay Kit provides a simple and precise way to profile the human miRNA transcriptome in a single tube. Users can perform highly multiplexed, direct digital detection, as well as counting of miRNAs at single-base resolution without the need for polymerase chain reaction (PCR) amplification. The comprehensive and cost-effective assay enables users to profile more than 700 human and human-viral miRNAs with a specificity and sensitivity comparable to quantitative PCR. The assay kit contains all of the reagents and consumables required to conduct miRNA and gene expression experiments and can be combined with the easy-to-use, fully automated target profiling nCounter Analysis System.

NanoString Technologies
For info: 888-358-6266

ChIP-chip Kits

Magna ChIP2 chromatin immunoprecipitation DNA microarray (ChIP-chip) kits allow users to map entire gene regulatory networks and patterns of epigenetic marks using microarray technology. The kits provide reagents, microarrays, and validated protocols for the entire ChIP-chip workflow, allowing users to examine protein-DNA interactions on a genome-wide scale. Two types of kits are available to accommodate different scientific approaches. The Magna ChIP2 Promoter Microarray kits contain the necessary reagents for ChIP-chip and either human or mouse Agilent promoter microarrays. The Magna ChIP2 Universal Microarray kits supply the necessary reagents for performing ChIP-chip with microarrays provided by the user.

For info: 800-645-5476

Coated 96-Well Plates

Well-Coated plates are ready-to-use coated plates that are suitable for colorimetric, chemiluminescence, and fluorescent detection systems. The plates are supplied pre-blocked in G-Biosciences Superior Blocking Buffer and are available as single 96-well plates or as 12 x 8-well strips in a 96-well holder in clear, white, or black polystyrene. Well-Coated plates are offered with the following coatings: protein A, protein G, protein A/G, protein L, goat anti-mouse antibody, or goat anti-rabbit antibody for binding antibodies; neutravidin, streptavidin, and biotin for biotin studies; nickel and glutathione for His- or GST-tagged recombinant protein binding; and activated plates for binding proteins and other molecules through amine or sulfhydryl residues.

For info: 314-991-6034


Newly offered instrumentation, apparatus, and laboratory materials of interest to researchers in all disciplines in academic, industrial, and governmental organizations are featured in this space. Emphasis is given to purpose, chief characteristics, and availability of products and materials. Endorsement by Science or AAAS of any products or materials mentioned is not implied. Additional information may be obtained from the manufacturer or supplier.

Look for these Upcoming Articles

Proteomics 2: Biomarkers — September 10
Genomics 2: Structural — October 29
Flow Cytometry — November 5