Background

COMPLEAT was developed at Harvard Medical School, in the lab of Norbert Perrimon and at the Drosophila RNAi Screening Center (DRSC). COMPLEAT broadens the scope of high-throughput data analysis by using protein complexes as its backend annotation, which makes it complementary to existing tools. It also provides a comprehensive data-mining environment, including network-based visualization and interactive querying options.

Brief introduction to COMPLEAT

COMPLEAT is an online tool for analyzing high-throughput (or small-scale) datasets by protein complex enrichment analysis. The tool uses a protein complex resource as the backend annotation data instead of conventional Gene Ontology- or pathway-based annotations. There is no need to pre-select ‘hits’ from the study; users can simply upload the full dataset. Users can also upload multiple datasets and quickly zoom in to view the complexes that are enriched in one or both datasets, as well as complexes enriched differentially between two datasets.

Citing COMPLEAT

If you use COMPLEAT in your research, please cite our paper:
A. Vinayagam, Y. Hu, M. Kulkarni, C. Roesel, R. Sopko, S. E. Mohr, N. Perrimon, Protein Complex-Based Analysis Framework for High-Throughput Data Sets. Sci. Signal. 6, rs5 (2013).

Assembly of the human, fly and yeast complex resource

Protein complexes were assembled in two different ways. First, complexes annotated from literature were collected from various public resources and mapped to human, fly and yeast genes using the ortholog prediction tool DIOPT. Second, complexes were predicted based on high-quality protein-protein interaction (PPI) networks from human, fly, or yeast using CFinder and NetworkBlast.

Table 1. Original resources for literature-based complexes

Source        | Focus                              | Original species  | Ortholog mapping
CORUM         | Protein complex from literature    | Mammal            | Mammal -> Fly/Yeast; Mouse -> Human
PINdb         | Protein complex from literature    | Human, Yeast      | Human/Yeast -> Fly; Human <-> Yeast
CYC2008       | Protein complex from literature    | Yeast             | Yeast -> Human/Fly
              | Protein complex from HT data       | Yeast             | Yeast -> Human/Fly
Gene Ontology | Protein complex                    | Human, Fly, Yeast | No mapping
DPiM (manual) | Protein complex from MS data       | Fly               | Fly -> Human/Yeast
KEGG module   | Signaling pathways protein complex | Not clear         | Human -> Fly/Yeast
SignaLink     | Core signaling pathways            | Human, Fly        | Human/Fly -> Yeast
flyReactome   | Core signaling pathways            | Fly               | Fly -> Human/Yeast

Table 2. Number of complexes derived from different sources

Organism | Literature complexes | Literature proteins | Predicted complexes | Predicted proteins | Combined complexes | Combined proteins
Human    | 3638                 | 7524                | 6251                | 6334               | 9881               | 9293
Fly      | 3077                 | 5619                | 3639                | 3933               | 6703               | 6536
Yeast    | 2173                 | 3280                | 5551                | 3366               | 7713               | 3994

Annotation of the human, yeast and fly complex resource

In addition to the annotation that each complex carries from its original resource, representative GO terms, common publications, and common subcellular localization are searchable and displayed in the complex detail view. Binary interactions between complex members are tracked and displayed in the Cytoscape view of selected complexes.

Enrichment algorithms

Values from the input data are mapped to the complex members and sorted highest-to-lowest. A complex score is computed as the interquartile mean (IQM), which preserves the direction (positive/negative or up-/down-regulation) of the original data. A p-value is also computed to estimate the significance of complex scores as compared to 1000 random complexes of the same size. Enriched complexes (shown in color) are those that meet the p-value cutoff selected by the user (slider bar below the graph).
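
The scoring described above can be illustrated with a minimal Python sketch. It is not COMPLEAT's actual implementation. The function names, the truncation rule used for the IQM (drop the lowest and highest quarter of values by count), and the two-sided comparison of absolute scores are all assumptions made for illustration, as is drawing the random complexes from the full set of input values.

```python
import random
from statistics import mean


def interquartile_mean(values):
    """Mean of the middle half of the values (simple truncated form).

    Assumption: the lowest and highest n // 4 values are dropped, so
    complexes with fewer than four members fall back to a plain mean.
    """
    ordered = sorted(values)  # sort direction does not affect the result
    n = len(ordered)
    k = n // 4
    return mean(ordered[k:n - k])


def complex_score_and_p_value(complex_values, background_values,
                              n_random=1000, seed=0):
    """Return (IQM score, empirical p-value) for one complex.

    Assumption: the p-value is the fraction of random complexes of the
    same size whose absolute IQM is at least as large as the observed
    one, with a +1 correction so that it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = interquartile_mean(complex_values)
    size = len(complex_values)
    hits = 0
    for _ in range(n_random):
        # A random complex: same size, members drawn without replacement.
        sample = rng.sample(background_values, size)
        if abs(interquartile_mean(sample)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_random + 1)


# Hypothetical data: scores for 101 genes and one four-member complex.
background = list(range(-50, 51))
score, p_value = complex_score_and_p_value([47, 48, 49, 50], background)
```

Because the IQM is an average of the original values rather than of their magnitudes, the sign of the score still indicates whether the complex as a whole is shifted up or down, which is the property the text highlights.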