Reverse Phase Protein Microarrays
Molecular cancer research is also driven by advancements in technologies and tools. We have established reverse phase protein microarrays (RPPA) as a reliable and cost-effective experimental platform for quantitative protein profiling, and we apply this platform in the tumor-related research topics of the division and in collaborations. Furthermore, the division has been a driving force in national and international projects aimed at generating and providing tools to the scientific community.
In RPPA, samples are printed directly on solid-phase carriers. The detection of a specific protein, or of a certain phosphorylation site, is carried out with a single, highly specific antibody per slide. We have adapted this approach, which was initially published by Paweletz et al. (Paweletz et al. 2001). We switched to fluorescence detection in the near-infrared (NIR) range, thus permitting protein profiling from as little as 20,000 cells with detection sensitivities in the fg range (Loebke et al. 2007 Proteomics, Korf et al. 2008 Expert Opin Drug Discov). This way, anywhere from fewer than 100 up to a few thousand different samples can be analyzed in parallel. Furthermore, we have advanced the RPPA technology by introducing the "RPPanalyzer" tool for data analysis (Mannsperger et al. 2010 Bioinformatics, von der Heyde et al. 2014 Biotechniques) as well as with protocols for antibody validation (Mannsperger et al. 2010 Proteome Sci, Mannsperger et al. 2011 Meth Mol Biol).
We have applied the RPPA technology to analyze the activation status of signaling pathways, for example after RNAi-based silencing experiments (Sahin et al. 2007 PNAS), to identify protein networks regulated by miRNAs (Uhlmann et al. 2012 Mol Syst Biol), to profile tumor biopsy samples (Haller et al. 2008 J Pathol, Henjes et al. 2012 Oncogenesis, Wruck et al. 2015 Sci Data), to characterize drug resistance mechanisms (Borgoni et al. 2020 Cancers, Haga et al. 2021 Cancer Research, Noronha et al. 2022 Cancer Discovery) and to uncover protein bursts upon ligand-induced cellular responses (Golan-Lavi et al. 2017 Cell Rep). Furthermore, we investigated signaling pathways associated with tumor progression and inter-patient heterogeneity (Menck et al. 2021 J Exp Clin Cancer Res, Menck et al. 2022 Cancers).
The International ORFeome Collaboration
The long-term goal of the human genome project is to establish a comprehensive gene catalogue that contains all human genes as well as physical clones for every single gene, and to functionally analyze these genes and gene products. The International ORFeome Collaboration joins scientists from around the world who aim to generate and make widely accessible a comprehensive resource of cloned ORFs that shall cover the entire protein-coding part of the genome/transcriptome. The division Molecular Genome Analysis has been a contributor to this project and has been deeply involved in its further development (Wiemann et al., 2016).
The LIFEdb functional genomics resource
Based on the ORF collection of the division Molecular Genome Analysis and the International ORFeome Collaboration (see above), we generate tools for the expression of the encoded proteins. These tools are systematically exploited to determine the subcellular localization of these proteins.
We have implemented the LIFEdb infostructure to disseminate information on the ORFeome resource (e.g., clone quality metrics) and on the utilization of this resource towards a functional annotation of encoded proteins (genes, subcellular localization data). The concept was originally described in [Simpson et al. 2000].
This project is a collaboration with the EMBL-Heidelberg (Pepperkok Team) and the UCD Dublin (Simpson Group). The data presented have been published, e.g., in [Simpson et al. 2012, Laketa et al. 2007, Neubrand et al. 2005, Starkuviene et al. 2004, Simpson et al. 2001, Pepperkok et al. 2001]. LIFEdb has been described in [Mehrle et al. 2006, Bannasch et al. 2004].
The data is available here. The table provides several sort options as well as a number of links.
- GeneSymbol -> ENTREZ Gene symbol of the respective genes
- ParentCloneID -> information on the respective proteins (e.g., GO-term) - under construction
- EntryCloneID -> information on the quality and availability of clones through the ORFeome Collaboration - under construction
- NCBI -> link to sequence in EMBL/GenBank/DDBJ database
- UCSC -> link to mapping position in the UCSC genome browser
- ProteinLocalization -> experimentally determined localization of encoded proteins
- ImageFile -> link to higher resolution microscopic images of GFP-tagged fusion proteins (N- and C-terminal tagging)
(The presentation of data has been implemented by Oliver Heil)
Controls
In order to evaluate potential effects of GFP on the localization of fusion proteins we performed two types of assays.
1. We analyzed the sub-cellular localization of several fusion proteins in Vero (monkey kidney) and HeLa (human cervix carcinoma) cells (-> Cell Line Control). In every case the results obtained with the two cell lines were identical, indicating that the sub-cellular localization will in most cases be independent of the cell line used in the analysis. Nevertheless, we will use cell lines of different origin (e.g. hepatocytes, neuronal cells) for future experiments.
2. We analyzed the sub-cellular localization of several known proteins with predicted or known localization (-> Known Protein Control). In most cases, the localization was verified - independent of the GFP fusion part.
We conclude that the sub-cellular localization of the proteins is mostly not affected by the GFP fusion part. However, we always determine the localization of both N- and C-terminal fusions (GFP-ORF and ORF-GFP), since the relative orientation of the two fusion parts (ORF vs. GFP) does in many cases impact the localization of either fusion protein.
The 3of5 web application for complex and comprehensive pattern matching in protein sequences
The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Such patterns are often complex and highly variable, especially in protein sequences. They are frequently described using regular expression (RegEx) terms because of the user-friendly terminology. As patterns grow more complex, however, RegEx queries run into limitations and enhanced capabilities are required. This is especially true for patterns containing ambiguous characters and positions and/or length ambiguities.
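The length-ambiguity limitation can be illustrated with a short Python sketch. It is not taken from any particular tool; the motif, the test sequence, and the helper name `all_matches` are hypothetical and chosen only for illustration. A standard RegEx engine reports a single hit per start position (the longest with greedy quantifiers, the shortest with lazy ones), whereas an exhaustive scan lists every substring that satisfies the pattern.

```python
import re


def all_matches(pattern: str, sequence: str) -> list[tuple[int, int, str]]:
    """Return (start, end, substring) for every substring that fully matches.

    Brute-force illustration: every start/end pair is tested, so the cost is
    quadratic in the sequence length. Fine for short peptides, not optimized.
    """
    compiled = re.compile(pattern)
    hits = []
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            if compiled.fullmatch(sequence, start, end):
                hits.append((start, end, sequence[start:end]))
    return hits


# Hypothetical example: a basic residue, 1 to 4 arbitrary residues, a basic residue.
sequence = "MKRRAKKS"
greedy_pattern = r"[KR].{1,4}[KR]"
lazy_pattern = r"[KR].{1,4}?[KR]"

greedy_hits = [m.group() for m in re.finditer(greedy_pattern, sequence)]
lazy_hits = [m.group() for m in re.finditer(lazy_pattern, sequence)]
every_hit = all_matches(greedy_pattern, sequence)

print(greedy_hits)     # ['KRRAKK']  (longest hit only)
print(lazy_hits)       # ['KRR']     (shortest hit only)
print(len(every_hit))  # 7 distinct solutions
for start, end, hit in every_hit:
    print(start, end, hit)
```

In this toy case the greedy and the lazy search each return one hit, while seven distinct substrings actually satisfy the pattern.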
We have implemented the 3of5 web application in order to enable complex pattern matching in protein sequences. 3of5 is named after a special use of its main feature, the novel n-of-m pattern type. This feature allows for an extensive specification of variable patterns where the individual elements may vary in their position, order, and content within a defined stretch of sequence. The number of distinct elements can be constrained by operators, and individual characters may be excluded. The n-of-m pattern type can be combined with common regular expression terms and thus also allows for a comprehensive description of complex patterns. 3of5 increases the fidelity of pattern matching and, for length-ambiguous patterns, finds all possible solutions in protein sequences instead of simply reporting the longest or shortest hits. Grouping and combined search for patterns provide a hierarchical arrangement of larger pattern sets. The algorithm is implemented as an internet application and is freely accessible. The application is available at http://dkfz.de/mga2/3of5/3of5.html.
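The basic idea behind an n-of-m pattern can be sketched in a few lines of Python. This is a simplified reading, not the 3of5 implementation or its query syntax: it only checks whether at least n residues within a window of m consecutive positions belong to an allowed set, with an optional set of excluded characters. The function name `n_of_m_hits`, its parameters, and the test sequence are assumptions made for illustration; the operators and the combination with RegEx terms described above are not reproduced.

```python
def n_of_m_hits(sequence: str, allowed: str, n: int, m: int,
                excluded: str = "") -> list[tuple[int, str]]:
    """Return (start, window) for every window of length m that contains
    at least n residues from `allowed` and no residue from `excluded`.

    Hypothetical helper illustrating the n-of-m idea only.
    """
    allowed_set = set(allowed)
    excluded_set = set(excluded)
    hits = []
    for start in range(len(sequence) - m + 1):
        window = sequence[start:start + m]
        if excluded_set & set(window):
            continue  # window contains a forbidden residue
        count = sum(1 for residue in window if residue in allowed_set)
        if count >= n:
            hits.append((start, window))
    return hits


# Hypothetical query: at least 3 basic residues (K or R) within any 5 positions.
sequence = "MSPKKRAKLEDG"
hits = n_of_m_hits(sequence, allowed="KR", n=3, m=5)
print(hits)
# [(1, 'SPKKR'), (2, 'PKKRA'), (3, 'KKRAK'), (4, 'KRAKL')]

# The same query, additionally excluding proline from the window.
hits_no_pro = n_of_m_hits(sequence, allowed="KR", n=3, m=5, excluded="P")
print(hits_no_pro)
# [(3, 'KKRAK'), (4, 'KRAKL')]
```

Because the matching residues may occupy any positions within the window, a single n-of-m query of this kind stands in for a long list of alternative fixed-position expressions.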
The 3of5 application offers an extended vocabulary for the definition of search patterns and thus allows the user to comprehensively specify and identify peptide patterns with variable elements. The n-of-m pattern type offers an improved accuracy for pattern matching in combination with the ability to find all solutions, without compromising the user friendliness of regular expression terms (Seiler et al. 2006).