Conceived and designed the experiments: AK. Performed the experiments: AK SW AS. Analyzed the data: AK SW AS. Contributed reagents/materials/analysis tools: DB AK. Wrote the paper: AK. Other: Provided funding, edited manuscript: DB.
Current address: Department of Biochemistry and Biophysics, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
The authors have declared that no competing interests exist.
Specific protein associations define the wiring of protein interaction networks and thus control the organization and functioning of the cell as a whole. Peptide recognition by PDZ and other protein interaction domains represents one of the best-studied classes of specific protein associations. However, a mechanistic understanding of the relationship between selectivity and promiscuity commonly observed in the interactions mediated by peptide recognition modules as well as its functional meaning remain elusive. To address these questions in a comprehensive manner, two large populations of artificial and natural peptide ligands of six archetypal PDZ domains from the synaptic proteins PSD95 and SAP97 were generated by target-assisted iterative screening (TAIS) of combinatorial peptide libraries and by synthesis of proteomic fragments, respectively. A comparative statistical analysis of affinity-ranked artificial and natural ligands yielded a comprehensive picture of known and novel PDZ ligand specificity determinants, revealing a hitherto unappreciated combination of specificity and adaptive plasticity inherent to PDZ domain recognition. We propose a reconceptualization of the PDZ domain in terms of a complex adaptive system representing a flexible compromise between the rigid order of exquisite specificity and the chaos of unselective promiscuity, which has evolved to mediate two mutually contradictory properties required of such higher order sub-cellular organizations as synapses, cell junctions, and others – organizational structure and organizational plasticity/adaptability. The generalization of this reconceptualization in regard to other protein interaction modules and specific protein associations is consistent with the image of the cell as a complex adaptive macromolecular system as opposed to clockwork.
Protein interaction modules, such as PDZ, SH3, WW, EH, SH2 and other domains, mediate protein-protein interactions by recognizing and binding short and usually linear peptide epitopes within their interacting partners
The PDZ domain is a prototypical and one of the best-characterized protein interaction modules. Approximately 90 amino acids long, the PDZ domain was first discovered as sequence repeats in the primary structures of the post-synaptic density 95 (
While able to interact with internal amino acid sequences properly constrained within secondary structure, in their canonical and by far the most common mode of interaction PDZ domains recognize and bind short specific sequences at the extreme C-termini of their interacting partners
Binding histograms were obtained by individual phage ELISA performed on purified GST fusions of the indicated domains immobilized in micro-titer plate wells
Over the 15 years since the discovery of PDZ domains, the biochemistry and structural basis of PDZ domain recognition as well as the biology of PDZ domain-containing proteins have been subjects of numerous studies, which are summarized in a number of reviews
The first uncertainty is illustrated by the continual but so far failed attempts to classify PDZ domains in accord with their specificities (see the examples of at least five different classifications in Refs.
The second major uncertainty pertains to the contribution of the ligand residues that are situated upstream of the last three to four C-terminal amino acids. Since the initial structural studies implicating only the few carboxy-terminal ligand residues in direct interactions with PDZ domains, it has been assumed that the influence of the upstream residues in PDZ ligands is inconsequential for PDZ domain interactions, and the occasional experimental evidence to the contrary is normally regarded as exceptional, relevant only for a particular domain or even a particular domain-ligand pair
The widely diverse affinities reported for PDZ domain-peptide interactions, spanning more than three orders of magnitude, represent another source of confusion. Because in-solution methods, such as fluorescence polarization (FP), tend to estimate PDZ domain-peptide interaction affinities in the low micromolar range, well within the affinity range expected from protein interaction domains mediating transient specific associations inside the cell, they are generally perceived as more trustworthy than the
It should be emphasized that the ambiguities detailed above for the PDZ domain family are common, to a larger or smaller degree, to all the peptide recognition domain families
To address the above-mentioned questions in a comprehensive manner, we applied a number of novel biochemical and statistical approaches to generate and analyze large populations of peptide ligands for a number of well-studied PDZ domains. The results of this study reveal a hitherto unappreciated combination of specificity and adaptive plasticity inherent to PDZ domain recognition. The complexity of PDZ domain recognition and the seemingly contradictory and/or confusing observations accumulated in the field are reconciled within a novel, if unexpected, image of the PDZ domain emerging as a complex adaptive system evolved to ensure both structure and organizational plasticity of higher order dynamic macromolecular systems such as synapses, cell junctions, and others.
This study capitalizes on distinct advantages of the novel screening format for phage-displayed peptide libraries, target-assisted iterative screening (TAIS), introduced recently and described elsewhere
Visual examination of binding histograms suggests that 1) the recognition specificities of all six domains examined are very similar, albeit not identical; 2) the second domains of both proteins are noticeably more promiscuous than their first and third domains; and 3) the specificities of homologous domains across different proteins appear to be more similar than the specificities of the domains belonging to the same protein. The first two observations are in agreement with the established body of experimental evidence
To delineate recognition preferences of the target PDZ domains, the last sixteen C-terminal amino acids of the peptide ligands selected in TAIS screens were analyzed using the
The relative frequencies of four residues, valine, arginine, threonine and serine, are twice as high as expected. Since the probability of such frequencies arising by chance is vanishingly low (about 2.8E-15, assuming Bernoulli trials approximation), it is fair to hypothesize that V, R, T and S are the ligand residues that are preferred at the domain-ligand interaction interface and thus are likely to be important for binding to the target PDZ domains. The relative overrepresentation of V, T and S does not come as a surprise, as the known minimal recognition consensus of the PSD95 PDZ domains is X-(S/T)-X-(V/I/L)-COOH
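The chance probability of an overrepresented residue under the Bernoulli trials approximation can be illustrated with a binomial upper-tail calculation. The following is a minimal sketch, not the authors' software: the function names and the residue counts in the example are hypothetical, and a uniform background of 1/20 per amino acid is assumed. Log-space arithmetic is used so that large sample sizes do not overflow.

```python
from math import exp, lgamma, log


def log_binom_pmf(i: int, n: int, p: float) -> float:
    """Natural log of P(X = i) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_comb = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_comb + i * log(p) + (n - i) * log(1 - p)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k): chance of seeing at least k occurrences in n independent draws."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


def enrichment(observed: int, total: int, p_expected: float) -> float:
    """Observed relative frequency divided by the expected frequency."""
    return (observed / total) / p_expected


# Hypothetical counts (not taken from the study): 160 valines among 1600 residues,
# compared with a uniform background of 1/20 per amino acid.
fold = enrichment(160, 1600, 0.05)
p_value = binomial_upper_tail(160, 1600, 0.05)
print(f"fold enrichment = {fold:.1f}, upper-tail probability = {p_value:.3g}")
```

A twofold enrichment in a sample of this size yields a very small tail probability, which is the sense in which such frequencies are unlikely to arise by chance.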
To explore the relationships between amino acid frequencies at specific ligand positions and the strength of PDZ domain-ligand interactions we arranged peptide ligands into four groups in accord with their relative affinities: 1) best binders (normalized phage ELISA signal from 0.8 to 1.0); 2) good binders (0.6 to 0.8 ELISA signal); 3) moderate binders (0.4 to 0.6 ELISA signal) and 4) weak binders (0.2 to 0.4 ELISA signal) (see
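The affinity ranking described above amounts to binning ligands by normalized ELISA signal. A minimal sketch follows; the function names and the peptide sequences are hypothetical, and the assignment of boundary values (for example, 0.8 going to the best binders) is an assumption, since the text does not specify it.

```python
from collections import defaultdict


def affinity_group(signal: float) -> str:
    """Assign a normalized ELISA signal (0 to 1) to one of the four affinity groups."""
    if not 0.0 <= signal <= 1.0:
        raise ValueError("normalized signal must lie between 0 and 1")
    if signal >= 0.8:  # assumption: lower bound of each group is inclusive
        return "best"
    if signal >= 0.6:
        return "good"
    if signal >= 0.4:
        return "moderate"
    if signal >= 0.2:
        return "weak"
    return "below threshold"  # signals under 0.2 fall outside the four groups


def group_ligands(signals: dict[str, float]) -> dict[str, list[str]]:
    """Map each affinity group to the list of peptide sequences that fall into it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for peptide, signal in signals.items():
        groups[affinity_group(signal)].append(peptide)
    return dict(groups)


# Hypothetical peptides and signals, for illustration only.
example = {"AAARETSV": 0.93, "QQQQESDV": 0.65, "GGGGATAL": 0.25}
print(group_ligands(example))
```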
As the described positional patterns of overrepresented residues hold for all six target domains (not shown), it is fair to conclude that the general recognition consensus of the PSD95 and SAP97 PDZ domains is X-R-E-(T/S)-X-V-COOH. Indeed, this inferred consensus represents a refinement of the well-known minimal recognition consensus of the class I PDZ domains, X-(S/T)-X-(V/I/L)-COOH, first defined for the PSD95 PDZ domains through analysis of C-terminal sequences in the PSD95 interacting partners
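The two consensus motifs can be written as regular expressions anchored at the C-terminus, which makes the relationship between them explicit: every sequence matching the refined consensus also matches the class I consensus. This is a minimal sketch in which X is taken to mean any residue written as an uppercase one-letter code; the function names are hypothetical and the test sequences are illustrative strings, not ligands from the study.

```python
import re

# X-(S/T)-X-(V/I/L)-COOH: minimal class I consensus, anchored at the C-terminus.
CLASS_I = re.compile(r"[A-Z][ST][A-Z][VIL]$")
# X-R-E-(T/S)-X-V-COOH: refined consensus inferred for the PSD95 and SAP97 domains.
REFINED = re.compile(r"[A-Z]RE[TS][A-Z]V$")


def matches_class_i(sequence: str) -> bool:
    """True if the C-terminal four residues fit the class I consensus."""
    return CLASS_I.search(sequence.strip().upper()) is not None


def matches_refined(sequence: str) -> bool:
    """True if the C-terminal six residues fit the refined consensus."""
    return REFINED.search(sequence.strip().upper()) is not None


for peptide in ("AAARETSV", "QQQQESDV", "AAARETSA"):
    print(peptide, matches_class_i(peptide), matches_refined(peptide))
```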
A search of SWISS-PROT and TrEMBL databases with the queries described above gave 126 potential interacting partners for the PSD95 PDZ domains (see the individual C-terminal sequences of putative interactors together with their i.d. numbers in
Binding histograms were obtained by peptide ELISA performed on purified GST fusions of the indicated domains immobilized in micro-titer plate wells as described previously
From visual inspection of the binding histograms shown in
To pinpoint the molecular determinants in natural ligands that are responsible for strong interactions with the target PDZ domains we looked for statistical biases in the relative amino acid frequencies within the positional window “−4 to −7”, the fully degenerate positions in our queries. It is worth emphasizing that all 126 natural ligands had been selected based on their match with the last four C-terminal amino acids of artificial ligands only. In other words, if the residues upstream of the last four amino acids in natural ligands were relatively unimportant for interactions with the target PDZ domains, one would expect no significant biases in amino acid frequencies at those positions. If, on the contrary, they are both essential and specific for the target domains, then the statistical biases within this region in natural ligands should be analogous to, or at least reminiscent of, the amino acid frequency patterns observed within the same positional window in artificial ligands.
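The test for upstream biases rests on tallying residues at each position of the window, counted back from the C-terminus (position 0). A minimal sketch of such a tally is given below; the function name and the example sequences are hypothetical, and sequences too short to reach a given position are simply skipped for that position.

```python
from collections import Counter


def positional_counts(sequences, positions=(-4, -5, -6, -7)):
    """Count residues at each ligand position, where 0 is the C-terminal residue."""
    counts = {position: Counter() for position in positions}
    for sequence in sequences:
        for position in positions:
            index = len(sequence) - 1 + position  # position -4 is the fifth residue from the end
            if 0 <= index < len(sequence):
                counts[position][sequence[index]] += 1
    return counts


# Hypothetical C-terminal sequences, for illustration only.
ligands = ["KRKRETSV", "AAKRESDV"]
for position, tally in positional_counts(ligands).items():
    print(position, dict(tally))
```

The resulting counts per position can then be compared with the frequencies expected from the chosen background distribution.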
The histograms in
In order to gain insight into the pattern recognition differences of the PDZ domains that belong to the same class but are able to differentiate between peptide ligands sharing the class-defining C-terminal consensus, we investigated a particular case of this general puzzle, namely, the differences in recognition preferences between the second and third PDZ domains of PSD95, which are classified as type I PDZ domains but are known to discriminate between various X-(S/T)-X-(V/I/L)-COOH ligands
To generalize, we suggest that by increasing statistical power one is likely to detect and define robust differences in pattern recognition between any pair of individual PDZ domains. Indeed, the comparison of the natural ligands exhibiting several-fold preference, in terms of relative affinity, for the PSD95-PDZ2 domain over the PSD95-PDZ1 domain with the best binders of the latter clearly suggests that these domains are also capable of differential recognition (
In summary, the comparative analysis of the amino acid organization of affinity-ranked artificial and natural ligands indicates that the PSD95 and SAP97 PDZ domains prefer positively charged residues, lysine and arginine, in the positional window “−4 to −7”, while strongly favoring lysine or arginine at the position “−4”, glutamate at the position “−3”, threonine at the position “−2” and valine at the position “0”. Even though the ligand position “−1” appears to accept various residues, it is apparently used for discrimination between individual PDZ domains – lysines and arginines are disfavored in this position by the first two PDZ domains (
On the whole, it appears that PDZ domain interactions are driven by the interdependent contributions of multiple ligand positions to the overall energy of interaction, which may span the last eight or more C-terminal amino acids of ligands. The unexpected plasticity and complexity of PDZ recognition are rooted in an apparently integral nature of the individual ligand residue contributions. Sub-optimal amino acids at some of the ligand positions can be compensated by optimal amino acids at other positions to preserve the strength of interaction. At the same time, even the major favorable energetic contributions of threonine and valine at the “−2” and “0” positions can be compromised by delinquent residues acting somewhere else along the chain. In the ligands featuring sub-optimal amino acids within the last four or five C-terminal ligand positions the individual contributions of upstream residues may become critical, thus allowing for highly differential recognition of such ligands by very similar PDZ domains.
The paradoxical behavior of the third PDZ domains of PSD95 and SAP97, which appear to be exquisitely selective towards natural ligands, but promiscuous toward artificial ligands, is unlikely to find its explanation in the physicochemical idiosyncrasies of the third domains only. Instead, we suggest that what appears as the exquisite selectivity of the third domain towards natural ligands may simply reflect the selective pressures imposed by evolution on functional organization of the postsynaptic density, which led to a relatively limited number of the PDZ3 ligands encoded in the genome. In this regard, a few examples from the artificial ligands dataset are most illustrative. Tryptophan is a significantly overrepresented amino acid at the “−1” position in artificial ligands, in all affinity groups, suggesting that the presence of tryptophan at the “−1” ligand position is not detrimental for interaction
What is then the biological meaning of adaptive plasticity in PDZ domain recognition? And what are physiologically relevant affinities of PDZ domain interactions? We speculate that both the adaptive plasticity and the wide range of interaction affinities of SAP PDZ domains are directly linked to the managerial/scaffolding role of SAP proteins in synapse organization. It is fair to suggest that both a large ligand sequence space and a wide affinity range of scaffold-mediated interactions are beneficial, if not essential, for synapse plasticity, because they define the spatio-temporal ranges within which the synapse organizational dynamics operate. The imaging studies of molecular dynamics in living cells, tissues and animals indicate that synapses
We also speculate that the synaptic environmental and organizational invariants are encoded in the genome in the form of matching spectra of synapse-associated PDZ domains and their cognate ligands. In this way the genome loosely specifies the overall schematics and principles of synapse organization, while maturation, fine-tuning, and adaptation of individual synaptic structures take place as a result of their individual development and experience. In the same sense as neuronal organization of every newborn brain has been shaped by evolution to recognize certain perceptual/environmental invariants, but is not limited to recognition of those patterns only, the PDZ domains have been shaped by evolution to recognize certain C-terminal sequences present in a given proteome, but are not limited to the recognition of those sequences only. In this way, the composition, organization, and functioning of individual synapses remain open for evolution at both ontogenetic and phylogenetic levels, accommodating novel C-terminal sequences that can potentially arise from a plethora of the epigenetic and genetic molecular mechanisms known to generate molecular diversity, including posttranslational modifications, regulated proteolysis, RNA splicing, mutations, DNA rearrangements, protein splicing, and others. In short, we suggest that the adaptive plasticity of SAP PDZ domain recognition and the wide affinity range of SAP PDZ domain interactions are evolutionarily enforced by requirements of synapse plasticity and reflect the managerial role of SAP scaffolds in synaptic organizational dynamics.
It should be emphasized that the proposed conceptualization of the PDZ domain as a complex adaptive system evolved to ensure both structure and organizational flexibility of higher order macromolecular organizations not only resolves the uncertainties pertaining to PDZ domain recognition, but also suggests a fundamental molecular mechanism underlying the adaptive plasticity of sub-cellular molecular organization revealed in a number of the recent studies in which advanced imaging techniques were used to address molecular dynamics in living cells
In the same sense as the overall organization of a newborn brain represents, essentially, a form of evolutionary memory, subject to both ontogenetic and phylogenetic development and maturation, the overall organization of cellular protein interaction networks encoded in the matching spectra of peptide interaction modules and their cognate ligands within a given genome may represent an evolutionary memory that is subject to ontogenetic and phylogenetic development, maturation, and adaptation (see reference
The 16-mer random peptide library was generated in-house using the T7 phage display library construction kit from Novagen. The human brain cDNA library was purchased from Novagen. The GST fusion protein expression constructs of the PSD95-PDZ2, PSD95-PDZ3, SAP97-PDZ1 and SAP97-PDZ2 domains were kindly provided by Dr. B. K. Kay (The University of Illinois at Chicago). The GST fusion constructs of the PSD95-PDZ1 and SAP97-PDZ3 domains were generated by PCR amplification of the corresponding PDZ domain coding regions from SAP cDNAs (generously provided by Dr. David S. Bredt (University of California, San Francisco)) followed by cloning into the pGEX2TK expression vector (Amersham Pharmacia). The PDZ domain borders were defined by the SMART software tools (
A detailed description of the TAIS method is presented in Kurakin et al. (
GST fusion-coated microtiter ELISA plates (COSTAR) were prepared by passive immobilization of 1 µg of the indicated GST fusion proteins (
Wells of microtiter plates were coated with 1 µg of the indicated GST-PDZ domain fusions, washed with TBS-T and blocked with 1% BSA in the same way as described above for phage ELISA. Individual biotinylated peptides (30 ng) were pre-incubated with 1 µg of streptavidin-HRP conjugate (Pierce) in 300 µl of TBS-T for 30 min at RT. One hundred µl of the peptide-streptavidin-HRP conjugate were added to 100 µl of TBS-T left in each coated well after the final wash of the protein immobilization/blocking procedure. Microtiter plates were incubated for 1 hour at RT, and then washed five times (1 ml each) with TBS-T. The amounts of peptides retained were quantified colorimetrically by adding soluble HRP substrate (ABTS/H2O2) and measuring ELISA kinetic slopes. ELISA readings were taken on a SpectraMAX190 plate reader (Molecular Devices) at 405 nm. To ensure reproducibility, all peptide ELISA experiments presented were repeated at least three times in at least three separate experiments performed by two different experimenters. The representative sets of binding histograms are shown in
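Quantification by kinetic slope can be illustrated as a least-squares fit of absorbance at 405 nm against time. The sketch below is not the plate reader's algorithm; the function names, the readings, and the peptide labels are hypothetical, and scaling each slope to the strongest binder on the plate is an assumed normalization that the text does not specify.

```python
def kinetic_slope(times, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per time unit)."""
    n = len(times)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, absorbances))
    return sxy / sxx


def normalize(slopes):
    """Scale slopes so that the strongest signal equals 1 (assumed normalization)."""
    top = max(slopes.values())
    if top <= 0:
        raise ValueError("no positive signal to normalize against")
    return {name: value / top for name, value in slopes.items()}


# Hypothetical readings in minutes and absorbance units, for illustration only.
minutes = [0, 1, 2, 3]
slopes = {
    "peptide_1": kinetic_slope(minutes, [0.10, 0.30, 0.50, 0.70]),
    "peptide_2": kinetic_slope(minutes, [0.10, 0.20, 0.30, 0.40]),
}
print(normalize(slopes))
```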
Evaluation of statistical significance of amino acid frequency biases was based on the Bernoulli trials approximation, i.e. on the assumption of random and independent sampling of individual amino acids from a population with specified amino acid frequencies. In the case of artificial ligands obtained from the cDNA library (about 50% of ligands) and from the random peptide library (another 50%), we assumed that sampling was done from a population where individual amino acids are equally represented. We believe it is a reasonable, albeit coarse-grained, approximation both for the random peptide library and for the cDNA library used, considering that about 90% of the peptides isolated from the latter represented frameshifts. It was assumed that natural ligands were sampled from a population where individual amino acids are distributed in accord with their average occurrence in the SWISS-PROT database (the v. 51.1 issue statistics). The chi-square and binomial tests
Distribution of charged residues within affinity-ranked artificial ligands of the PSD95-PDZ2 and PSD95-PDZ3 domains. The aligned sequences of artificial peptide ligands are arranged in four groups based on their relative affinities to the indicated PDZ domains. The numbers in parentheses indicate the range of normalized phage ELISA values within a given affinity group. Arginines and lysines are highlighted green, while aspartic and glutamic acids are red. Upper panel - PSD95-PDZ2 ligands; lower panel - PSD95-PDZ3 ligands.
(0.86 MB TIF)
Distribution of charged residues within affinity-ranked artificial ligands of the SAP97-PDZ1 and SAP97-PDZ2 domains. The aligned sequences of artificial peptide ligands are arranged in four groups based on their relative affinities to SAP PDZ domains. The numbers in parentheses indicate the range of normalized phage ELISA values within a given affinity group. Arginines and lysines are highlighted green, while aspartic and glutamic acids are red. Upper panel - SAP97-PDZ1 ligands; lower panel - SAP97-PDZ2 ligands.
(0.80 MB TIF)
Distribution of charged residues within affinity-ranked artificial ligands of the SAP97-PDZ3 domain. The aligned sequences of artificial peptide ligands are arranged in four groups based on their relative affinities to SAP PDZ domains. The numbers in parentheses indicate the range of normalized phage ELISA values within a given affinity group. Arginines and lysines are highlighted green, while aspartic and glutamic acids are red.
(0.41 MB TIF)
Artificial peptide ligands isolated from phage-displayed random peptide and cDNA libraries by TAIS using various SAP PDZ domains as targets
(0.05 MB DOC)
Putative natural peptide ligands of SAP PDZ domains
(0.06 MB DOC)
We thank Dr. B. K. Kay (The University of Illinois at Chicago), Dr. F. W. Studier (Brookhaven National Laboratory) and Dr. David S. Bredt (University of California, San Francisco) for providing research reagents. We express our gratitude to Dr. D. A. Greenberg (Buck Institute) for critical reading of the manuscript. We thank Afanasy Kurakin (Moscow State University) for software tools development and Dr. M.-Y. Chou (Industrial Technology Research Institute, Taiwan) for technical assistance. We thank Dr. M. Zhang (Hong Kong University of Science and Technology) and an anonymous reviewer for their comments and suggestions.