PTMcode 2

General description

PTMcode is a web resource for known and predicted functional associations between protein post-translational modifications (PTMs) within and between proteins. Its second release contains more than 300,000 modified sites from 69 different PTM types. PTM types are abbreviated with two- or three-letter codes, as listed in PTMtypes_acronyms.txt. These PTMs are present in more than 100,000 proteins from 19 different eukaryotes.

The database focuses on the post-translational regulation of proteins, not only by describing the set of modifications of a protein, but also by identifying the functional associations among PTMs within the protein and between interacting proteins. We combine five evidence channels that explore different properties of coupled PTMs: (i) the 'co-evolution' channel (Figure 3A), which extracts pairs of PTMs with a similar evolutionary history (see Methods Summary); (ii) the 'structural distance' channel (Figure 3B), which reports modified residues that are close in 3D structures; (iii) the 'competition' channel, which highlights residues modified by several PTM types; (iv) the 'manual annotation' channel, which offers associated PTM pairs described in the literature; and (v) the 'hotspots' channel, which calculates protein regions with a high density of modifications.

Browser requirements

PTMcode relies on Adobe Flash and JavaScript for full functionality. Any modern browser that supports Flash should work. Visit Adobe's Flash pages to install or update the Flash plugin for your browser. PTMcode requires Flash version 11.1.0 or later.

We tested the following:
  • Linux/Unix: Chrome, Firefox and Opera work fine
  • Windows: Microsoft Internet Explorer 9 or later, Chrome, Firefox and Opera work fine. Although many PTMcode features work in Internet Explorer 8 or earlier, we do not support those versions and do not plan to make the site fully functional in them.
  • OS X: Chrome, Firefox, Opera and Safari all work without problems.

PTMcode does not use browser cookies at the moment.

Propagated PTMs

PTMcode v2 propagates the signal from experimentally validated PTMs to the conserved sites in orthologs from other species; we tag these sites as 'propagated PTMs'. With this simple procedure we obtained 1,347,165 sites with a modification signal propagated from verified PTMs, allowing us to provide post-translational information for species that otherwise have almost no experimental data. Propagated PTMs are labeled with a red border and only appear if the user activates them through the checkbox in the PTMs visualization menu.


Figure: Propagated PTMs
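The propagation step can be sketched in a few lines. This is only an illustration under stated assumptions, not the PTMcode pipeline: it assumes that a site counts as "conserved" when the ortholog has the same amino acid in the same alignment column, and the function name `propagate_ptm` and the toy alignment are hypothetical.

```python
def propagate_ptm(alignment, source_id, source_pos):
    """Return {ortholog_id: 1-based position} for orthologs that keep the residue.

    alignment:  dict of sequence id -> gapped sequence (all the same length).
    source_pos: 1-based position of the verified PTM in the ungapped source.
    Assumption: "conserved" means the identical amino acid in the same column.
    """
    source = alignment[source_id]
    # Locate the alignment column that holds the modified source residue.
    seen = 0
    column = None
    for i, ch in enumerate(source):
        if ch != "-":
            seen += 1
            if seen == source_pos:
                column = i
                break
    if column is None:
        raise ValueError("source_pos is outside the source sequence")

    residue = source[column]
    propagated = {}
    for seq_id, seq in alignment.items():
        if seq_id == source_id or seq[column] != residue:
            continue
        # Convert the column back to a position in the ungapped ortholog.
        propagated[seq_id] = len(seq[: column + 1].replace("-", ""))
    return propagated


# Toy alignment (hypothetical data): the human serine at position 3 is verified.
alignment = {
    "human": "MK-SPTA",
    "mouse": "MKASPTA",
    "fly":   "MR-APTA",
}
print(propagate_ptm(alignment, "human", 3))  # {'mouse': 4}
```

In this toy case the mouse ortholog keeps the serine and receives a propagated site, while the fly ortholog does not.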

Functional associations between PTMs

PTMcode aims to provide a global picture of the post-translational regulation of a protein; from our second release onwards we also explore the regulation of its interactions with other proteins. Thus, we provide the functional associations between PTMs within proteins and between interacting proteins. A functional association between two PTMs is broadly defined here: it ranges from the physical interaction between two modifications, or their competition for the same site, to their participation in the same function even if they do not happen at the same time. To capture all these possible situations, we combine five channels of evidence based on different approaches, which we explain in detail below.

Co-evolution

We used the co-evolution of two modified residues to predict their functional association, as described in Minguez et al. 2012. We measured the co-evolution between any two modified sites in a particular protein using the mutual information (MI) algorithm (see the explanation in the figure below) and extracted those pairs that co-evolve significantly more than pairs of non-modified residues in the same protein(s) that have the same type of amino acid and are located in similar protein regions (ordered or disordered).

Co-evolving score based on MI: Two PTMs (acetylation in blue and phosphorylation in red) are pairwise evaluated in three different situations in which both residues are present in half of the orthologs from 10 species. The score includes the capacity of MI to address the level of dependency of two variables (maximum in both the right and left sequence alignments), and the degree of conservation of the amino acid across the species in the alignment (which is maximal in the left coupling).


Figure: Co-evolving score based on MI
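As a rough illustration of the MI component only, the following sketch computes the mutual information between two alignment columns. It does not reproduce the full PTMcode score, which also accounts for residue conservation and a comparison against non-modified background pairs. Treating gaps as ordinary symbols and using bits as the unit are assumptions of this sketch, and the example columns are made up.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length alignment columns."""
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


# Ten species; each residue is present in half of them (hypothetical columns).
coupled_a, coupled_b = "KKKKKRRRRR", "SSSSSAAAAA"    # residues change together
shuffled_a, shuffled_b = "KKKKKRRRRR", "SASASASASA"  # residues change independently

print(mutual_information(coupled_a, coupled_b))
print(mutual_information(shuffled_a, shuffled_b))
```

The perfectly coupled pair reaches the maximum of 1 bit for two equally frequent states, while the second pair scores close to zero.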

Structural distance

A straightforward mechanism by which two PTMs can be associated is their proximity, measured here using the 3D structure of the protein. If they are close enough, they could either be in physical contact or be competing for the same space. We mapped PTM residues to three-dimensional structures of proteins from the Protein Data Bank and calculated the spatial distance between pairs of modified residues. To obtain a first estimate of a distance that indicates physical interaction, we measured the average distance for twelve pairs of associated modifications reported in the literature to physically interact. Thus, modified residues closer than 4.69 Ångströms are predicted either to be in physical contact or to be mutually exclusive, competing for the same protein niche.
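A minimal sketch of the distance filter is shown below. It assumes that each modified residue has already been reduced to a single representative coordinate in Ångströms; which atoms PTMcode actually uses to measure the distance is not stated here, and the site labels and coordinates are hypothetical.

```python
import math
from itertools import combinations

CONTACT_THRESHOLD = 4.69  # Ångströms, the cutoff quoted in the text


def close_pairs(sites, threshold=CONTACT_THRESHOLD):
    """Return (site_a, site_b, distance) for all pairs closer than the threshold.

    sites: dict of label -> (x, y, z) coordinate in Ångströms.
    Assumption: one representative coordinate per modified residue.
    """
    pairs = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(sites.items(), 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance < threshold:
            pairs.append((name_a, name_b, distance))
    return pairs


# Hypothetical coordinates for three modified residues.
sites = {
    "S10": (0.0, 0.0, 0.0),
    "K12": (3.0, 2.0, 1.0),
    "T50": (12.0, 0.0, 0.0),
}
for name_a, name_b, distance in close_pairs(sites):
    print(f"{name_a} - {name_b}: {distance:.2f} A")
```

Only the S10/K12 pair falls under the cutoff in this example.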

Competing PTMs

Another mechanism for direct crosstalk between two PTMs is their competition for the same residue, which implies that they are mutually exclusive at a given time.
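Finding candidates for this channel amounts to grouping annotated sites by residue position, as in the sketch below. The PTM type codes used here are placeholders, not necessarily the acronyms listed in PTMtypes_acronyms.txt, and the input format is an assumption.

```python
from collections import defaultdict


def competing_sites(ptms):
    """Return {position: sorted PTM types} for residues with more than one type.

    ptms: iterable of (position, ptm_type) tuples for a single protein.
    """
    types_by_position = defaultdict(set)
    for position, ptm_type in ptms:
        types_by_position[position].add(ptm_type)
    return {
        position: sorted(types)
        for position, types in types_by_position.items()
        if len(types) > 1
    }


# Hypothetical annotations: residue 9 carries two different PTM types.
ptms = [(9, "Ac"), (9, "Me"), (10, "Ph"), (14, "Ac")]
print(competing_sites(ptms))  # {9: ['Ac', 'Me']}
```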

Manual annotation

PTMcode not only holds predicted associations, but also PTM sites that are reported in the literature to crosstalk.

Hotspots

PTMs can also be part of regulatory hotspots, small regions in the protein sequence that are enriched in modifications. According to Beltrao et al. 2012, modified lysines are more likely to be located within 15 amino acids of a phosphorylated residue, forming hotspot regions where PTMs tend to cluster. Thus, using this threshold, for each modified residue in a protein we define a window of 31 amino acids (15 upstream and 15 downstream), count the number of modifications in it and compare that count with the number of modifications in the whole protein using a Fisher exact test. Overlapping hotspots are then merged.
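The windowing, testing and merging steps can be sketched as follows. Several details are assumptions of this sketch rather than documented PTMcode behaviour: the test is one-sided (enrichment only) and is computed as a hypergeometric tail over modified versus unmodified residues, the significance cutoff `alpha` is 0.05 with no multiple-testing correction, and windows are clipped at the protein ends.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p-value, P(X >= k), for a hypergeometric X.

    k: modified residues in the window, n: window size,
    K: modified residues in the protein, N: protein length.
    """
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def find_hotspots(mod_positions, protein_length, flank=15, alpha=0.05):
    """Return merged (start, end) regions enriched in modifications (1-based)."""
    mods = sorted(set(mod_positions))
    significant = []
    for pos in mods:
        start = max(1, pos - flank)
        end = min(protein_length, pos + flank)
        in_window = sum(start <= m <= end for m in mods)
        p = fisher_enrichment_p(in_window, end - start + 1, len(mods), protein_length)
        if p < alpha:
            significant.append((start, end))

    # Merge overlapping windows; they are already sorted by start position.
    merged = []
    for start, end in significant:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical protein of 500 residues with a cluster around position 105.
print(find_hotspots([100, 103, 107, 110, 112, 300], 500))
```

Here the five clustered sites produce overlapping significant windows that merge into one region, while the isolated site at position 300 does not pass the test.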

Hotspots are mapped onto the interactive protein representation that PTMcode generates. A red line (purple if propagated PTMs are included) is displayed between the scale and the PTMs, as shown in the figure on the right. Clicking a hotspot region highlights it in the interactive view and displays a table with all the PTMs within it.


Figure: PTM hotspots

Browsing the PTMcode database

The known and predicted functional associations between any two of the available PTM types are accessible by selecting the two types in an interactive network that offers all possible connections (see image on the right). The table is split into associations within and between proteins. Propagated PTMs are hidden by default; clicking "Show propagated" displays them.

The 13 most abundant or remarkable PTM types are represented here by symbols (the rest are merged under the label 'Oth', for Others). The symbols are consistent across the whole PTMcode framework, and their size corresponds to the relative abundance of each type in the database. The link widths represent the number of proteins modified by the two respective PTM types, normalized by the total number of proteins harboring the less abundant PTM type. The width thus indicates the relative coverage of the particular pairwise functional associations. Upon selection in the web interface, two types of modifications are activated and all the pairs of known and predicted functional associations in all proteins are shown in a table (see below), where each entry can be further explored. These tables can be downloaded as plain text files.


Figure: Network
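The link-width normalization described above can be illustrated with sets of protein identifiers. This sketch reads "proteins modified by the two respective PTM types" as proteins that carry both types, and measures abundance by protein count; both readings are assumptions, and the identifiers are invented.

```python
def link_width(proteins_a, proteins_b):
    """Shared proteins divided by the protein count of the less abundant type."""
    smaller = min(len(proteins_a), len(proteins_b))
    if smaller == 0:
        return 0.0
    return len(proteins_a & proteins_b) / smaller


# Hypothetical protein sets for two PTM types.
phosphorylated = {"P1", "P2", "P3", "P4"}
sumoylated = {"P2", "P4", "P9"}
print(round(link_width(phosphorylated, sumoylated), 3))
```

In this example two of the three SUMOylated proteins are also phosphorylated, so the link covers two thirds of the less abundant type.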

All the functional associations between phosphorylation and SUMOylation are shown in a table that can be sorted by any of its columns. Clicking any of the proteins fills in the protein box in the search form, so that the user can explore all the functional associations for that particular protein. The table contains information about the conservation of the modified residues and the evidence supporting the predictions. See the conservation score explanation below for more details. Within the table, users will also see whether the modification is part of a hotspot. See more information about hotspots above.


Figure: PTM information table

For every protein PTMcode v2 offers the possibility to explore its interacting partners. Selecting a protein in the network (right) will display the 'PPI view' mode where the regulatory model of the interaction between the two proteins is displayed.


Figure: Protein-protein interaction network

Searching the PTMcode database

PTMcode has a protein-centric approach. Our aim is to provide predicted models for the post-translational regulation of proteins from eukaryotes (19 in the second release). Thus, we provide users with a search engine where they can enter their protein of interest (all commonly used protein IDs are supported, and a protein sequence can also be submitted) and even restrict the retrieved model to a particular amino acid (which should be a modified residue) or to a region of interest. Users can also restrict the regulation model to the PTMs of interest by checking the individual types in the select option. The regulation of protein-protein interactions can be searched through the 'Explore a protein pair' tab.

Once the Explore protein button is pressed, the predicted model is retrieved in the form of an interactive image. The protein backbone is represented by a straight grey line, with globular protein domains and unstructured regions extracted from the SMART database displayed on top. SMART domains and other features are clickable and redirect the user to the corresponding entry in the SMART web server, which gives detailed information about the domain. Below the protein representation, all the PTMs are mapped following a shape/color code. In the top left corner, PTMcode provides a draggable functions box that allows users to zoom the image, show/hide all PTM associations or save the currently displayed picture as a bitmap image.

The interactive image of the protein's regulation model

Below the protein representation, all the PTMs are mapped following a shape/color code. In the top left corner, PTMcode provides a menu where users can:
  • Zoom in/out into the protein regions.
  • Show/hide all the known and predicted functional associations among the PTMs.
  • Save the image (as shown on the screen) in .png/.jpg formats, with a customizable resolution. Be aware that generating high-resolution images for proteins with many links may take 30-60 seconds.

A second menu in the top right corner allows users to filter the functional associations shown in the image according to the evidence channel they belong to. To remove the current selection, simply click the selected evidence channel again.

Pressing the Display all associations button displays all the functional associations. The associations are color coded according to the evidence channel that retrieved them. See a global representation on the right.

Clicking any of the modifications shows the functional associations for that particular PTM; see two examples on the right.

Selecting any PTM will also display a table where all functional associations can be further explored.

The table provides information about all the PTMs predicted to be functionally coupled to the selected one, including the conservation score (see details) and the evidence channels for each functional association. Moreover, users can visualize the two associated PTMs within the 3D structure of the protein (when available) by clicking the Highlight in structure button, which opens the Jmol plugin with full Jmol functionality. Clicking any entry in the table also highlights that particular association in the interactive image.

The evidence channels are color coded, and clicking each of them displays more information about the prediction in a pop-up window. For instance, from the co-evolution evidence pop-up, users can explore the multiple sequence alignment (using Jalview) of the orthologous group on which the co-evolution analysis was performed. In the Jalview alignment, the two columns of the analysed residues are highlighted (note that the residue number in the alignment corresponds to the alignment position, not the position in the protein).
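Because alignment columns and protein positions differ whenever gaps precede the residue, converting between the two is a common step when reading such an alignment. The helper below is a generic sketch, not part of PTMcode or Jalview; it assumes '-' is the gap character and uses a made-up sequence.

```python
def column_to_protein_position(gapped_seq, column):
    """Map a 1-based alignment column to a 1-based protein position.

    Returns None if the sequence has a gap ('-') in that column.
    """
    if not 1 <= column <= len(gapped_seq):
        raise ValueError("column is outside the alignment")
    if gapped_seq[column - 1] == "-":
        return None
    return column - gapped_seq[:column].count("-")


# Hypothetical aligned sequence: column 5 is the third residue of the protein.
print(column_to_protein_position("MK--SPTA", 5))  # 3
```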

Within the table, users will also see whether the modification is part of a hotspot. See more information about hotspots.

PTMcode v2 also provides regulation models for protein-protein interactions. The 'PPI view' shows two proteins and the functional associations between their PTMs.


Figure: Default protein display
Figure: All PTM associations
Figure: Single PTM functional associations
Figure: Single PTM functional associations
Figure: PTM functional associations details
Figure: PTM functional associations in interacting proteins

Conservation score (RCS and rRCS)

PTMcode gives a score (the rRCS, relative Residue Conservation Score) that measures the conservation of the modified residue within the protein. It is shown under the column "CON" in the results table and is calculated as defined in Minguez et al. 2012.

The rRCS is a score (given as a percentage) that measures the conservation of the modified residue compared with residues of the same amino acid type in the same type of protein region (ordered or disordered). An rRCS > 95 means that the modified residue is more conserved than 95% of the same amino acids within the same type of protein region. A 0 in the "CON" column means that we were not able to calculate the conservation score for that particular modification.
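Read as a percentile, the rRCS can be sketched as the share of background residues with a lower RCS than the modified site. The handling of ties (strictly lower values only) and the return value of 0 for an empty background are assumptions of this sketch, and the background values are invented.

```python
def relative_rcs(site_rcs, background_rcs):
    """Percentage of background RCS values strictly below the site's RCS.

    background_rcs: RCS values of residues with the same amino acid type in
    the same type of region (ordered or disordered).
    Assumption: an empty background yields 0, i.e. "not calculated".
    """
    if not background_rcs:
        return 0.0
    below = sum(value < site_rcs for value in background_rcs)
    return 100.0 * below / len(background_rcs)


# Hypothetical background of 20 residues with RCS values 0.05, 0.10, ..., 1.00.
background = [i / 20 for i in range(1, 21)]
print(relative_rcs(0.97, background))
```

With these numbers the site is more conserved than 19 of the 20 background residues, which corresponds to an rRCS of 95.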

The actual value that gives the conservation, the RCS (Residue Conservation Score, not shown in the results table), has two components: the MBL (Maximum Branch Length), which is the longest evolutionary distance among the species that contain a conserved modified residue, and the RCR (Residue Conservation Ratio), which quantifies the conservation ratio of the modified residue across the species in a taxonomic group in which at least one conserved residue has been observed. In the figure below, the score is illustrated by a modified serine (circled) within a column of an MSA of orthologs where the species with the longest branch length containing the residue are Macaca mulatta and Rattus norvegicus. In the respective taxonomic group, 3 out of 4 species maintain the serine in the same position and thus the RCR is 0.75.
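The RCR arithmetic in this example can be written out as a short sketch. The composition of the taxonomic group below is hypothetical and was chosen only so that 3 of 4 species keep the serine. How the MBL and RCR are combined into the RCS is not described here, so the sketch covers the ratio alone.

```python
def residue_conservation_ratio(column, residue):
    """Fraction of species in a taxonomic group that keep the residue.

    column: dict of species name -> amino acid at the aligned position.
    """
    if not column:
        raise ValueError("the taxonomic group is empty")
    conserved = sum(1 for amino_acid in column.values() if amino_acid == residue)
    return conserved / len(column)


# Hypothetical group of four species; three keep the modified serine.
group = {
    "Homo sapiens": "S",
    "Macaca mulatta": "S",
    "Rattus norvegicus": "S",
    "Mus musculus": "A",
}
print(residue_conservation_ratio(group, "S"))  # 0.75
```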


Figure: Residue Conservation Score (RCS)
License: Except as otherwise noted, this work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. For commercial use or customized versions, please contact biobyte solutions GmbH.