Conceived and designed the experiments: GS NAM AC CAW GM. Performed the experiments: GS NAM. Analyzed the data: GS NAM AC MA AMS CAW GM. Contributed reagents/materials/analysis tools: GS NAM AC MA PS AMS CAW GM. Wrote the paper: GS NAM AC PS AMS CAW GM.
The authors have declared that no competing interests exist.
The study of Parkinson's disease (PD), like other complex neurodegenerative disorders, is limited by access to brain tissue from patients with a confirmed diagnosis. Alternatively the study of peripheral tissues may offer some insight into the molecular basis of disease susceptibility and progression, but this approach still relies on brain tissue to benchmark relevant molecular changes against. Several studies have reported whole-genome expression profiling in post-mortem brain but reported concordance between these analyses is lacking. Here we apply a standardised pathway analysis to seven independent case-control studies, and demonstrate increased concordance between data sets. Moreover data convergence increased when the analysis was limited to the five substantia nigra (SN) data sets; this highlighted the down regulation of dopamine receptor signaling and insulin-like growth factor 1 (IGF1) signaling pathways. We also show that case-control comparisons of affected post mortem brain tissue are more likely to reflect terminal cytoarchitectural differences rather than primary pathogenic mechanisms. The implementation of a correction factor for dopaminergic neuronal loss predictably resulted in the loss of significance of the dopamine signaling pathway while axon guidance pathways increased in significance. Interestingly the IGF1 signaling pathway was also over-represented when data from non-SN areas, unaffected or only terminally affected in PD, were considered. Our findings suggest that there is greater concordance in PD whole-genome expression profiling when standardised pathway membership rather than ranked gene list is used for comparison.
Parkinson's disease (PD, OMIM: #168600) is a uniquely human disease that is clinically characterised by cardinal motor symptoms such as postural instability, bradykinesia and resting tremor
High throughput discovery platforms, such as microarrays, that assume no
One study assayed SNc dopaminergic neurons only, following isolation by laser capture microscopy (LCM)
Given an apparent lack of concordance in published data sets one might ask what relevance these transcriptional approaches can have to PD pathogenesis. Certainly the utilisation of post mortem brain tissue appears to represent the best opportunity for finding PD-specific changes in gene expression. Furthermore such ‘benchmarks’ facilitate the evaluation of clinical samples and model systems for their utility in PD research. However the approaches to generating and analysing microarray data are not standardised and therefore could account for much of the apparent discrepancy between reported gene lists.
Here we apply a common analytical approach to the available human transcriptomic data in an attempt to find greater data convergence and generate new insight into the pathways systematically altered during PD pathogenesis. We have also generated an online search tool and extend an invitation to other researchers to explore the data themselves
Ten published transcriptomic studies met the initial criterion of comparing primary tissues derived from PD patients and controls (
First Author | PMID | Database Accession # | Tissue used in study | # of PD cases | # of controls | NPDC | Array Type | Summarisation & normalisation Methods | Statistical test used to determine differential expression |
Grünblatt | 15455214 | - | SN,CB | 7 | 7 | - | HG-Focus | MAS5 | Wilcoxon rank-sum test+fold-change cut-off |
Hauser | 15956162 | - | SN | 6 | 5 | 3 (2 PSP, 1 FTDP) | U133A | MAS5 | 2-sample t test |
Vogt | 16626704 | - | OCTX, CB, PT | 4 | 4 | 4 MSA | U133A | RMA | ANOVA |
Zhang | 15965975 | - | SN, PT, BA9 | 15 | 19 | - | U133A | RMA | ANOVA+FDR |
Miller | 16143538 | - | SN, STR | 6&4 | 8&4 | - | CodeLink | CodeLink | Student t-test |
Moran | 16344956 | GSE8397 | LSN, MSN, SFG | 15&9&5 | 8&7&3 | - | U133A+B | GC-RMA+MAS5+PLIER | two-class unpaired+FDR |
Lesnick | 17571925 | GSE7621 | SN | 16 | 9 | - | U133 plus 2.0 | MAS5 | ANOVA |
Scherzer | 17215369 | GSE6613 | whole blood | 50 | 22 | AD, PSP, MSA, CBD, ET | U133A | MAS5 | LOOCV |
Castelvetri | 17412603 | E-MEXP-1416 | LCM DA-SN | 8 | 8 | - | X3P | GC-RMA | SAM |
Bossers | 18462474 | - | SN, PT, CN | 3 | 4 | 1 PD/DEM | Agilent 22K | Loess | Linear regression with FDR |
Studies | Number of PD patients used in microarray analysis | Number of controls used in microarray analysis | RNA tissue source | Number of differentially expressed probes ≤0.01 | Number of differentially expressed probes ≤0.01 after neuronal correction |
Hauser | 6 | 5 | SN | 159 | 152 |
Zhang | 11 | 18 | SN | 1014 | 951 |
Moran | 15 | 6 | LSN | 1975 | 1779 |
Moran | 15 | 6 | MSN | 2149 | 1924 |
Lesnick | 16 | 9 | SN | 2030 | 1993 |
Zhang | 14 | 19 | BA9 | 2373 | - |
Zhang | 15 | 15 | PT | 197 | - |
Moran | 15 | 6 | SFG | 598 | - |
Vogt | 4 | 4 | OCTX | 1727 | - |
Vogt | 3 | 3 | PT | 155 | - |
Vogt | 4 | 4 | CB | 174 | - |
Castelvetri | 8 | 8 | LCM DA-SN | 491 | - |
Scherzer | 55 | 22 | Blood | 208 | - |
No probes were removed from the non-SN brain region data sets.
We postulated, as others had done previously, that genes and pathways that appear consistently as differentially expressed in multiple studies and different source tissues are likely to be important in PD
IPA Pathway category | Number of studies with over-represented pathways at ≤0.05 in all published lists (11) |
ERK/MAPK Signaling | 4 |
G-Protein Coupled Receptor Signaling | 3 |
Huntington's Disease Signaling | 3 |
α-Adrenergic Signaling | 3 |
Synaptic Long Term Potentiation | 3 |
PPARα/RXRα Activation | 3 |
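The pathway counts in the table above rest on a per-list over-representation p-value. How IPA computes that value is not detailed in this section, so the following is only a minimal sketch of one common approach, a right-tailed hypergeometric test (equivalent to a one-sided Fisher's exact test). The function name, its parameters and the example numbers are hypothetical and are not taken from the study.

```python
from math import comb


def overrepresentation_p(n_universe, n_pathway, n_selected, n_overlap):
    """Right-tailed hypergeometric p-value, P(X >= n_overlap).

    n_universe: genes on the array (the background)
    n_pathway:  background genes that belong to the pathway
    n_selected: genes in the differentially expressed list
    n_overlap:  differentially expressed genes that belong to the pathway
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers, for illustration only.
p = overrepresentation_p(n_universe=20000, n_pathway=100,
                         n_selected=500, n_overlap=8)
print(f"over-representation p-value: {p:.4g}")
```

A pathway would then be scored as over-represented in a given list when this p-value is at or below the chosen threshold (≤0.05 in the table above).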
Our common analysis method was applied to the 13 datasets (from seven studies) that met our platform inclusion criteria, and new ranked probe lists were produced. These are listed in
Dopaminergic neuron loss in the SNc is the prominent neuropathological feature in PD, so we initially focused on the SN data sets for their convergence and reproducibility. Over-representation of the dopamine receptor signaling pathway was consistently and significantly observed in all SN data sets (p-values 0.003–0.026), suggesting that not only can PD-related pathways be dissected out of complex transcriptomic data but that these changes are robustly reproducible between comparable studies (
IPA Pathway category | Number of studies with over-represented pathways at ≤0.05 in SN data sets (n = 5) | Number of studies with over-represented pathways at ≤0.05 in SN data sets (n = 5) after neuronal correction |
Dopamine Receptor Signaling | 5 | 1 |
IGF-1 Signaling | 3 | 2 |
PTEN Signaling | 3 | 3 |
JAK/Stat Signaling | 3 | 3 |
Glucocorticoid Receptor Signaling | 3 | 3 |
Huntington's Disease Signaling | 3 | 3 |
PPAR Signaling | 3 | 3 |
Ephrin Receptor Signaling | 2 | 4 |
VEGF Signaling | 2 | 2 |
Axonal Guidance Signaling | 2 | 3 |
PI3K/AKT Signaling | 2 | 2 |
Insulin Receptor Signaling | 2 | 2 |
BMP signaling pathway | 2 | 2 |
Synaptic Long Term Depression | 2 | 2 |
Synaptic Long Term Potentiation | 2 | 0 |
PDGF Signaling | 2 | 1 |
B Cell Receptor Signaling | 2 | 2 |
Lysine Degradation | 2 | 2 |
Estrogen Receptor Signaling | 2 | 2 |
G-Protein Coupled Receptor Signaling | 2 | 0 |
Inositol Phosphate Metabolism | 2 | 1 |
IL-2 Signaling | 2 | 1 |
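The counts in the table above are a tally of how many data sets show a given pathway as over-represented at p ≤ 0.05. A minimal sketch of such a tally follows, assuming each data set has already been reduced to a mapping from pathway name to p-value. The data set names, pathway names and p-values are hypothetical placeholders, not the study's results.

```python
from collections import Counter


def tally_pathways(results, alpha=0.05):
    """Count, per pathway, the data sets in which its p-value is <= alpha.

    results: {dataset_name: {pathway_name: p_value}}
    """
    counts = Counter()
    for pathway_pvalues in results.values():
        for pathway, p_value in pathway_pvalues.items():
            if p_value <= alpha:
                counts[pathway] += 1
    return counts


# Hypothetical p-values, for illustration only.
example = {
    "dataset_A": {"Pathway X": 0.003, "Pathway Y": 0.20},
    "dataset_B": {"Pathway X": 0.026, "Pathway Y": 0.04},
    "dataset_C": {"Pathway X": 0.010, "Pathway Y": 0.60},
}

counts = tally_pathways(example)
for pathway, n_datasets in counts.most_common():
    print(f"{pathway} | {n_datasets}")
```

Ranking pathways by this count, rather than comparing ranked gene lists directly, is what allows data sets with little gene-level overlap to be compared.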
Neuronal loss in PD is more severe in the lateral SN compared to the medial SN
In order to bias our analysis towards underlying pathogenic mechanisms rather than terminal pathology, we devised a correction paradigm based on Moran's observations on neuronal loss. Our rationale is described in detail in
Given the loss of dopaminergic cells in the PD SN the major contribution to the expression profile in the SN PD samples would presumably now come from the non-dopaminergic cells. Furthermore neuropathology in the PD SN is characterised by a reactive gliosis or “glial inflammation” (reviewed by Orr
We also analysed non-SN tissues as they are not subject to the cytoarchitectural changes seen in the SN or are only affected late in the disease. Five IPA pathways were over-represented in three out of six non-SN data sets (
IPA Pathway category | Number of studies with over-represented pathways at ≤0.05 in non-SN data sets (n = 6) | Number of studies with over-represented pathways at ≤0.05 in SN data sets (n = 5) after neuronal correction |
IGF-1 Signaling | 3 | 2 |
VEGF Signaling | 3 | 2 |
Synaptic Long Term Potentiation | 3 | 0 |
Calcium Signaling | 3 | 0 |
ERK/MAPK Signaling | 3 | 0 |
PTEN Signaling | 2 | 3 |
JAK/Stat Signaling | 2 | 3 |
Ephrin Receptor Signaling | 2 | 4 |
Axonal Guidance Signaling | 2 | 3 |
Finally, given its clinical accessibility, we also re-analysed a whole blood dataset
Microarrays promise much in elucidating the pathogenesis of complex diseases such as PD but the lack of concordance in published data sets to date certainly questions their relevance. Here we have shown that a standardised approach to analysing PD-related microarray data can account for a considerable proportion of the discordance. We used a common analytical approach which improved data convergence and uncovered new leads for PD pathogenesis. We also recognised a potential anatomical bias in the datasets derived from brain regions with high neuronal loss. Our approach therefore provided an improved comparative analysis between existing datasets and further considered ‘tissue-of-origin’ effects.
Complex phenotypes, by their very nature, are aetiologically heterogeneous. This implies that single gene signatures may not be shared by all affected individuals. However, particularly relevant genetic pathways have a higher probability than individual genes of being revealed as convergent across multiple individuals and multiple studies
Context is very important in gene expression studies and as expected the analysis of SN tissue-derived data sets further improved our concordance and highlighted the ‘dopamine receptor signaling’ pathway. However rather than representing a primary pathogenic effect the extensive down regulation of genes such as DOPA decarboxylase (
Accordingly we pursued two alternative approaches to maximise potentially useful information on the underlying biological processes: first, the implementation of a correction factor for dopaminergic neuronal loss in the SN data sets and, second, the analysis of non-SN or unaffected tissues for data convergence. The purpose of the correction factor was not to magically recreate the early disease landscape but to remove ‘red herrings’ that solely reflected the relative numbers of neurons between cases and controls. The retention of genes differentially regulated in the residual dopaminergic neuron data set should have improved the overall specificity of this approach.
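The correction described above can be sketched in a few lines. The version below assumes that a probe is treated as neuronal loss-associated when the magnitude of its fold change follows the gradient LSN > MSN > SFG in the Moran data, and that probes differentially regulated in the residual dopaminergic neuron data set are retained. The use of a simple strict ordering with no additional threshold, the function names and the probe identifiers are all assumptions made for illustration; the full rationale is given in the supporting information.

```python
def flag_neuronal_loss_probes(fold_changes, retain=frozenset()):
    """Return probes whose fold-change magnitude follows LSN > MSN > SFG.

    fold_changes: {probe_id: (lsn, msn, sfg)} fold-change magnitudes
    retain:       probe ids to keep regardless of the gradient
    """
    flagged = set()
    for probe_id, (lsn, msn, sfg) in fold_changes.items():
        follows_gradient = lsn > msn > sfg
        if follows_gradient and probe_id not in retain:
            flagged.add(probe_id)
    return flagged


def remove_flagged(ranked_probes, flagged):
    """Drop flagged probes from a ranked list, preserving the ranking order."""
    return [probe_id for probe_id in ranked_probes if probe_id not in flagged]


# Hypothetical probes and fold changes, for illustration only.
moran_fold_changes = {
    "probe_1": (3.0, 2.0, 1.1),  # follows the gradient, so it is removed
    "probe_2": (1.2, 1.9, 1.0),  # no gradient, so it is kept
    "probe_3": (2.5, 1.8, 1.0),  # follows the gradient but is retained
}
retained = {"probe_3"}

flagged = flag_neuronal_loss_probes(moran_fold_changes, retain=retained)
sn_ranked = ["probe_2", "probe_1", "probe_3"]
corrected = remove_flagged(sn_ranked, flagged)
print(corrected)
```

The corrected list, rather than the original ranked list, would then be passed to the pathway analysis for each SN data set.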
Following our ‘neuronal loss’ correction, two alternative pathways gained prominence: ephrin receptor signaling and the axonal guidance pathway. The latter is consistent with the findings of recent studies
As discussed above microarray data is very powerful in illustrating cytoarchitectural differences between cases and controls such as dopaminergic neuron loss. Given the considerable literature supporting the involvement of glia in PD pathogenesis
The distinct gene expression pattern of brain areas that are not overtly affected by PD pathology may be less confounded than the SN with respect to the cell death associated with PD. It could be argued that the transcriptomes of unaffected tissues might be too divergent from those of predilection sites such as the SN, such that they provide very little informative data. Our analysis, which has highlighted consistent differences in growth-factor signaling in non-SN datasets, argues that areas affected late in the disease, such as the prefrontal cortex
This pathway has been largely unexamined for associations with PD although IGF1 signaling is reported to have neuroprotective effects on dopaminergic neurons
Similarly VEGF is known to promote the growth and survival of dopaminergic neurons
Case-control expression analysis in a degenerative disease like PD poses difficult issues when attempting to uncover pathways contributing to disease initiation. It would be advantageous to target tissues that express the proteins that are fundamental to the disease process and are different in individuals who are at risk of the disease. At the same time we need to account for any influences of the pathological process on these profiles. Microarray data of predilection sites such as the SN illustrates cytoarchitectural differences between cases and controls but to understand some of the early pathogenic processes, we would ideally want to assay a brain region very similar to SN but that is only belatedly affected.
An additional consideration is the ability of the pathway approach, used in our analyses, to provide adequate specificity for PD over other neurodegenerative conditions. This issue remains to be clarified, and requires further investigation. It is important to recognise that there may be genetic expression patterns common to neurodegenerative diseases, generally. These may reflect common pathological changes (such as cell death, markers of oxidative stress or neuro-inflammation etc) or shared risk factors influencing neurodegeneration.
There are still inherent difficulties in obtaining reproducible gene expression data from post mortem brain, even if an optimal region of the brain could be assayed
Interestingly one peripherally accessible neural tissue, the olfactory mucosa, has been used to demonstrate significant differences in functional assays and gene expression between patients with schizophrenia, patients with bipolar affective disorder and controls
In this paper we have presented a summary of the available microarray data from PD case-control studies and have suggested some potential strategies for uncovering primary pathogenic mechanisms. For others who wish to use and explore these data we have constructed an online database which enables rapid evaluation on a single gene or pathway basis.
We conducted literature searches in National Center for Biotechnology Information (NCBI) PubMed and dataset searches in NCBI gene expression omnibus and ArrayExpress (EBI)
Although all studies used Affymetrix arrays, the probes on the arrays and the experimentally chosen fluorescence thresholds varied. Consequently, the data could not simply be combined without introducing study bias and the effects of probe-level sequence information
The ranked genes lists for each study were assessed by integrating the data at a pathway level. Each ranked list was imported into the Ingenuity Pathways Analysis 6.3 (IPA, from Ingenuity® Systems,
The substantia nigra pars compacta of PD patients is characterised by the loss of neuromelanin-containing dopaminergic neurons
The differentially expressed gene list generated for each study by this re-analysis, with the respective p-values and fold changes, can be found at
The National Center for Biotechnology Information Entrez Gene website (
Comparison of overlap in genes between PD-related transcriptomic studies. The enclosed table illustrates the increase in data convergence between PD-related transcriptomic studies following the implementation of our common analysis methodology.
(0.02 MB PDF)
The differentially expressed probes generated by common analysis for each study. Ranked probe lists for each study generated by common analysis method with fold change and p-value.
(1.08 MB XLS)
Over-represented pathway categories from IPA analysis. Over-represented pathway categories from IPA analysis of the differentially expressed probes of PD patients compared to controls: A- SN - Hauser study; B- SN - Zhang study; C- lateral SN - Moran study; D- medial SN - Moran study; E- SN - Lesnick study; F- Brodmann Area 9 - Zhang study; G- Putamen - Zhang study; H- Superior Frontal Gyrus - Moran study; I- Occipital Cortex - Vogt study; J- Putamen - Vogt study; K- Cerebellum - Vogt study; L- Whole Blood - Scherzer study; M- Laser Captured SN dopaminergic neurons - Castelvetri study.
(0.07 MB XLS)
Selection of neuronal loss-associated genes. A work flow diagram and hypothetical examples illustrate the selection of neuronal loss-associated genes that were removed from the SN datasets prior to pathway analysis.
(0.12 MB PDF)
‘Neuronal-loss’ associated genes removed by correction paradigm. Identified probes with fold changes for the three brain regions in the Moran study that followed the pattern, LSN>MSN>SFG.
(0.05 MB XLS)
Over-represented pathway categories from IPA analysis after neuronal correction of the differentially expressed probes from SN tissue: A- Hauser study; B- Zhang study; C- lateral SN - Moran study; D- medial SN - Moran study; E- Lesnick study.
(0.04 MB XLS)
Fold changes in PD-related glial markers. The lack of differential expression of PD-related glial markers is illustrated in the enclosed table.
(0.07 MB PDF)
The authors would like to acknowledge our PD colleagues who have kindly made their gene expression data available through public databases or forwarding it directly to us. We would like to thank Amanda Miotto and Othmar Korn for informatics support.