Conceived and designed the experiments: JB MS AS. Performed the experiments: AB. Analyzed the data: KS JF QRA ML. Contributed reagents/materials/analysis tools: JF SS VS QRA. Wrote the paper: KS JB ML. Coordinated the study: KS. Performed the data analysis: KS. Was responsible for protein isolation, proteomic analyses, and Western blotting: AB. Was responsible for creation of the relational database: JF. Contributed to the development of the relational database: SS VS. Wrote custom software for and participated in proteomic data analysis: QRA. Contributed to the study design: AYK JB. Contributed to the writing of the manuscript: AYK JB ML. Contributed to the data analysis: JF ML. Contributed to conception of the study and design of the proteomics experiments: MS. Conceived the study and oversaw all aspects of the study: AS.
The authors have declared that no competing interests exist.
Although prior studies have demonstrated a smoking-induced field of molecular injury throughout the lung and airway, the impact of smoking on the airway epithelial proteome and its relationship to smoking-related changes in the airway transcriptome are unclear.
Airway epithelial cells were obtained from never (n = 5) and current (n = 5) smokers by brushing the mainstem bronchus. Proteins were separated by one-dimensional polyacrylamide gel electrophoresis (1D-PAGE). After in-gel digestion, tryptic peptides were processed via liquid chromatography/tandem mass spectrometry (LC-MS/MS) and proteins identified. RNA from the same samples was hybridized to HG-U133A microarrays. Protein detection was compared to RNA expression in the current study and a previously published airway dataset. The functional properties of many of the 197 proteins detected in a majority of never smokers were similar to those observed in the never smoker airway transcriptome. LC-MS/MS identified 23 proteins that differed between never and current smokers. Western blotting confirmed the smoking-related changes of PLUNC, P4HB1, and uteroglobin protein levels. Many of the proteins differentially detected between never and current smokers were also altered at the level of gene expression in this cohort and the prior airway transcriptome study. There was a strong association between protein detection and expression of its corresponding transcript within the same sample, with 86% of the proteins detected by LC-MS/MS having a detectable corresponding probeset by microarray in the same sample. Forty-one proteins identified by LC-MS/MS lacked detectable expression of a corresponding transcript and were detected in ≤5% of airway samples from a previously published dataset.
1D-PAGE coupled with LC-MS/MS effectively profiled the airway epithelium proteome and identified proteins expressed at different levels as a result of cigarette smoke exposure. While there was a strong correlation between protein and transcript detection within the same sample, we also identified proteins whose corresponding transcripts were not detected by microarray. This noninvasive approach to proteomic profiling of airway epithelium may provide additional insights into the field of injury induced by tobacco exposure.
Cigarette smoking, the leading cause of preventable death in the United States, is responsible for 440,000 deaths per year
Cigarette smoke creates a field of molecular injury in the epithelial cells lining the entire respiratory tract. Changes include cellular atypia
Prior studies have analyzed lung tissue from never, current and former smokers using two-dimensional electrophoresis (2DE) coupled with mass spectrometry, leading to the hypothesis that smoke exposure induces an unfolded-protein-like response
Although studies have tried to address the large-scale correlation between protein production and mRNA expression in both cell lines
In this study, we profiled proteins and genes expressed within the same bronchial epithelium of never and current smokers via 1D-PAGE with LC-MS/MS and DNA microarrays, respectively. The relationship between protein detection and mRNA expression was explored both globally and for individual proteins of interest. We found that the majority of airway proteins detected by mass spectrometry have their corresponding transcripts detected at measurable levels by microarray, and that changes at the protein level in response to cigarette smoke parallel smoking-induced changes in mRNA. This approach also detected proteins whose corresponding transcript expression was not detected by microarrays. This study represents the first application of this approach to the simultaneous proteomic and transcriptomic profiling of airway epithelium within the same individual, providing insight into the normal and smoking-affected airway proteome and the relationship between protein changes and the previously described changes in airway gene expression.
The demographics for subjects recruited into this study are shown in
Subject | Age | Sex | Pack-years | FVC% | FEV1% | FEV1/FVC |
NS1 | 23 | Male | 0 | 101% | 96% | 0.82 |
NS2 | 32 | Male | 0 | 88% | 97% | 0.91 |
NS3 | 28 | Male | 0 | 98% | 101% | 0.87 |
NS4 | 32 | Female | 0 | 108% | 111% | 0.89 |
NS5 | 27 | Male | 0 | 127% | 140% | 0.92 |
CS1 | 34 | Male | 17 | 87% | 84% | 0.81 |
CS2 | 34 | Female | 15 | 84% | 85% | 0.72 |
CS3 | 45 | Female | 14 | 90% | 94% | 0.88 |
CS4 | 45 | Male | 16 | 88% | 97% | 0.91 |
CS5 | 47 | Male | 39.5 | 91% | 89% | 0.81 |
NS indicates never smokers, and CS indicates current smokers. FVC indicates the forced vital capacity as a percent of the predicted value. FEV1% indicates the forced expiratory volume at one second as a percent of the predicted value. A Student's t-test was performed for continuous variables, and a chi square test for dichotomous variables. Never and current smokers differed in age and pack years of smoking (p<0.05).
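The age comparison mentioned in this legend can be reproduced approximately from the table itself. The following is a minimal sketch, assuming that the second column of the table is age in years and that the Student's t-test was the equal-variance (pooled) form; the function name and the use of a tabulated critical value in place of an exact p-value are illustrative choices, not details taken from the study.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic with pooled (equal) variance.

    Returns the t statistic and its degrees of freedom.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df

# Ages read from the demographics table (assumed to be column 2).
never_age = [23, 32, 28, 32, 27]
current_age = [34, 34, 45, 45, 47]

t_stat, df = pooled_t_statistic(never_age, current_age)

# Two-sided 5% critical value of Student's t with 8 degrees of freedom.
T_CRIT_DF8 = 2.306
significant = abs(t_stat) > T_CRIT_DF8

print(f"t = {t_stat:.2f}, df = {df}, significant at 0.05: {significant}")
```

Under these assumptions the magnitude of the statistic exceeds the critical value, which agrees with the reported age difference at p<0.05.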
A total of 652 proteins were detected in one or more never smokers, with 197 proteins found in the majority of never smokers (
The circles represent proteins detected in at least one sample. A total of 859 proteins were detected by LC-MS/MS in any sample. 652 proteins were detected by LC-MS/MS in any never smoker, and 613 proteins were detected in at least one current smoker. The inner oval represents proteins detected by LC-MS/MS in the majority of samples. 197 proteins were detected in the majority of never smokers, and 169 proteins were detected in the majority of current smokers. *A total of 23 proteins differ between never and current smokers based on the criteria described in the
| 8.9×10−5 | 3.5×10−2 |
| 1.0×10−5 | 7.0×10−3 |
| 9.6×10−6 | 8.8×10−3 |
| 8.5×10−6 | 1.2×10−2 |
| 8.4×10−6 | 2.3×10−2 |
Oxidoreductase activity, acting on the aldehyde or oxo group of donors | 7.0×10−5 | 3.2×10−2 |
Oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor | 3.3×10−5 | 1.8×10−2 |
Statistically enriched functional categories (FDR<0.05) and subcategories of the 197 proteins detected in the majority of never smokers as determined by DAVID. Over-represented categories that contain more than two probe sets are included. Functional categories that are also over-represented (FDR<0.05) among transcripts detected in all never smokers in this cohort are
613 proteins were detected in one or more current smokers, and 169 proteins were detected in the majority of current smokers (
Due to the small sample size, a second list of differentially detected proteins was defined using a qualitative criterion: proteins were included if present in three or more samples of one class compared to the other. Twenty-three proteins differed between never and current smokers based on these criteria (
alpha-2-macroglobulin precursor | NP_000005 | 0/3 |
transferrin; PRO2086 protein | NP_001054 | 0/3 |
ribosomal protein S2; 40S ribosomal protein S2 | NP_002943 | 1/4 |
superoxide dismutase 2, mitochondrial | NP_000627 | 2/5 |
prolyl 4-hydroxylase, beta subunit | NP_000909 | 2/5 |
S-adenosylhomocysteine hydrolase | NP_000678 | 3/0 |
aldehyde dehydrogenase 9A1 | NP_000687 | 3/0 |
dynein, axonemal, heavy polypeptide 5 | NP_001360 | 3/0 |
dynein, axonemal, heavy polypeptide 9 isoform 2 | NP_001363 | 3/0 |
dynein, cytoplasmic, heavy polypeptide 1 | NP_001367 | 3/0 |
prostatic binding protein | NP_002558 | 3/0 |
phosphoglycerate mutase 1 (brain) | NP_002620 | 3/0 |
secretoglobin, family 1A, member 1 (uteroglobin) | NP_003348 | 3/0 |
Fc fragment of IgG binding protein | NP_003881 | 3/0 |
aminopeptidase puromycin sensitive | NP_006301 | 3/0 |
arachidonate 15-lipoxygenase | NP_001131 | 4/1 |
S100 calcium binding protein A11 | NP_005611 | 4/1 |
valosin-containing protein | NP_009057 | 4/1 |
CGI-38 protein | NP_057048 | 5/2 |
tubulin beta MGC4083 | NP_115914 | 5/2 |
The proteins that are differentially detected in never and current smokers are listed by protein name and by RefSeq identification number. The right column shows the numbers of never and current smoker samples in which the protein was detected. Proteins with a Fisher exact p≤0.05 comparing never and current smokers are shown in
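The two selection rules used for differential detection can be illustrated with a short sketch. It assumes that the qualitative criterion means a difference of at least three samples between the classes (consistent with the never/current counts listed in the table, such as 3/0, 4/1 and 5/2) and that the Fisher exact test was two-sided; the function names are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(k1, n1, k2, n2):
    """Two-sided Fisher exact p-value for detection in k1 of n1 vs k2 of n2 samples."""
    total_detected = k1 + k2
    denom = comb(n1 + n2, total_detected)

    def prob(x):
        # Hypergeometric probability that group 1 holds x of the detections.
        return comb(n1, x) * comb(n2, total_detected - x) / denom

    p_obs = prob(k1)
    lo = max(0, total_detected - n2)
    hi = min(n1, total_detected)
    # Sum all tables that are no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )

def classify(never_hits, current_hits, n_never=5, n_current=5):
    """Label a protein by which selection rule (if any) it satisfies."""
    p = fisher_exact_two_sided(never_hits, n_never, current_hits, n_current)
    if p <= 0.05:
        return "fisher"
    # Assumed reading of the qualitative rule: counts differ by >= 3 samples.
    if abs(never_hits - current_hits) >= 3:
        return "qualitative"
    return "none"

for counts in [(4, 0), (3, 0), (4, 1), (5, 2), (2, 1)]:
    print(counts, classify(*counts))
```

With five samples per class, a 3/0 split does not reach p≤0.05 under this test, which is consistent with such proteins appearing only in the qualitative list.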
We validated mass spectrometry findings by immunoblot for three of the proteins that differed between never and current smokers (
Western blotting shows significantly higher levels of PLUNC in the never smokers. Higher levels of uteroglobin were also observed in never smokers, although there was heterogeneity among the current smokers. There was a small increase in P4HB in two of the current smoker samples.
An average of 93% of proteins detected by mass spectrometry had at least one matching probe set on the HG-U133A array. Of these, an average of 86% had detectable gene expression (Pdetection<0.05) in samples collected from the same participants, demonstrating a significant level of co-detection (χ2 = 347, p = 2.2×10−16). There was no significant difference in the rate of co-detection between never and current smokers.
For select proteins where detection varied between never and current smokers, we examined the expression of the corresponding mRNA for smoking-related differential expression. PLUNC (NP_570913), ALDH3B1 (NP_000685), and hypothetical protein DKFZP586A0522 (NP_054752) were selected based on the results of the Fisher exact test. Uteroglobin (NP_003348) and the prolyl 4-hydroxylase beta subunit (P4HB) (NP_000909) were selected based on their qualitative differences between never and current smokers. Within this cohort, mRNA expression positively correlated with protein detection for PLUNC, uteroglobin, and P4HB (
Boxplots of the gene expression levels and bar graphs of LC-MS/MS results for A) ALDH3B1, B) hypothetical protein DKFZP586A0522, C) PLUNC, D) uteroglobin (CC10), and E) P4HB subunit. The borders of each boxplot represent the interquartile range of z-score normalized natural logarithm of the MAS5 gene expression data from this cohort of 5 never smokers and 5 current smokers, and from a previously published cohort (AGED) of 23 never smokers and 34 current smokers, excluding one never smoker in common to this study. The solid line within each box represents the median gene expression, and the whiskers of the plot extend to the upper and lower extremes of the data for each gene. Bar plots represent the number of smoker and nonsmoker samples in the current study where the protein was detected. Proteomic analysis detected ALDH3B1, hypothetical protein DKFZP586A0522, PLUNC and uteroglobin in more never smokers, while P4HB was detected in more current smokers. There is concordance in the direction of change for smoking-related protein and gene expression changes for these 5 genes. * p<0.05. ** p<0.005. *** p<0.0005.
The association between smoking and gene expression was also examined in a previously published cohort
Differences in protein detection by mass spectrometry and transcript detection by microarray were also explored. In the matched samples, there was no expression by microarray of transcripts corresponding to 41 proteins that were detected in ≥50% of samples by mass spectrometry (
Actin, alpha 1, skeletal muscle (NP_001091) |
Succinate dehydrogenase complex, subunit B, iron sulfur (Ip) (NP_002991) |
Myosin, heavy polypeptide 14 (NP_079005) |
Superoxide dismutase 2, mitochondrial (NP_000627) |
Tubulin, beta 1 (NP_110400) |
Phosphorylase, glycogen; brain (NP_002853) |
Tubulin, beta 4 (NP_006078) |
Phosphorylase, glycogen; muscle (McArdle syndrome, glycogen storage disease type V) (NP_005600) |
Spectrin, alpha, non-erythrocytic 1 (alpha fodrin) (NP_003118) |
3-hydroxyisobutyrate dehydrogenase (NP_689953) |
Spectrin, beta, non-erythrocytic 1 (NP_842565) |
Adenylate kinase 1 (NP_000467) |
Villin 2 (ezrin) (NP_003370) |
N-acylsphingosine amidohydrolase (acid ceramidase) 1 (NP_808592) |
Histone 1, H1b (NP_005313) |
Apolipoprotein A-I (NP_000030) |
Histone 1, H3f (NP_066298) |
Cytochrome c oxidase subunit IV isoform 1 (NP_001852) |
Histone 1, H4k (NP_068803) |
Heat shock 70 kDa protein 1-like (NP_005518) |
RAB6A, member RAS oncogene family (NP_002860) |
Heat shock 70 kDa protein 6 (HSP70B′) (NP_002146) |
Albumin (NP_000468) |
Heterogeneous nuclear ribonucleoprotein C (C1/C2) (NP_112604) |
Karyopherin (importin) beta 1 (NP_002256) |
Heterogeneous nuclear ribonucleoprotein M (NP_005959) |
Lamin A/C (NP_733821) |
Peptidylprolyl isomerase A (cyclophilin A) (NP_066953) |
Lamin B2 (NP_116126) |
Peroxiredoxin 2 (NP_005800) |
Stomatin (NP_004090) |
Phosphoglycerate kinase 1 (NP_000282) |
Carbonic anhydrase I (NP_001729) |
Pyruvate kinase, muscle (NP_002645) |
Carbonic anhydrase II (NP_000058) |
Solute carrier family 4, anion exchanger, member 1 (erythrocyte membrane protein band 3, Diego blood group) (NP_000033) |
Catalase (NP_001743) |
Tumor rejection antigen (gp6) 1 (NP_003290) |
Hemoglobin, delta (NP_000510) |
Voltage-dependent anion channel 3 (NP_005653) |
Hemoglobin, gamma A (NP_000550) |
A total of 41 proteins detected in at least half of the samples by LC-MS/MS lacked detectable expression by microarray at a detection p-value<0.05. Fewer than 5% of airway samples from a previously published dataset
Cell organization and biosynthesis (PDAVID<0.05).
Cortical cytoskeleton (PDAVID<0.05).
Cytoskeleton (PDAVID<0.05).
Cell cortex (PDAVID<0.05).
Transition metal ion binding (PDAVID<0.05).
Pyridoxal phosphate binding (PDAVID<0.05).
Oxygen binding (PDAVID<0.05).
Unclassified in DAVID.
Component of the erythrocyte proteome
We applied 1D-PAGE coupled with LC-MS/MS to the study of the airway epithelium proteome and its response to cigarette smoke exposure. This study presents the first proteomic profile of a relatively pure population of bronchial epithelial cells obtained from bronchoscopy brushings. We also used differences in the rate of protein detection between never and current smokers to identify candidates for proteins that vary in abundance in response to tobacco-smoke exposure. The effect of smoking on several of these proteins was confirmed by Western blot. We also found that for many candidates, smoking similarly affected expression of the mRNA transcripts that gave rise to these proteins. This was accomplished by measuring gene expression in the same samples that were profiled at the proteomic level and in an independent data set. The majority of proteins identified by LC-MS/MS had detectable levels of their corresponding transcript by microarray. Differing methodologies may account for the stronger relationship between protein and gene expression reported here relative to prior studies
Analysis of the proteome using 1D-PAGE coupled with LC-MS/MS resulted in the detection of 41 proteins for which expression of corresponding transcripts was not detected by microarray. Some of these failures to detect transcript expression could represent technical limitations of the microarray platform. However, we were intrigued that several of the proteins whose transcripts were not detected by microarray represent erythrocyte-specific proteins. This suggests that 1) the airway epithelial samples collected for this study were likely contaminated with erythrocytes, and 2) more generally, stable proteins may be detected by proteomic methods long after the mRNA that encodes them has disappeared.
Using habitual smoking as a paradigm for inhalational exposures affecting airway epithelium, we have identified changes in protein among smokers by LC-MS/MS and validated select changes with Western blotting. A decrease in the short isoform of PLUNC has previously been described in the pooled nasal lavage fluid of current smokers when compared with nonsmokers
This study was limited by a relatively small sample size, the sensitivity of the proteomic technique, and challenges in the quantification of proteins. While age was a confounding variable in this study, the gene expression changes in the airway epithelium of never and current smokers were validated using age-matched samples from current and never smokers in a previously published gene-expression study
In summary, we have described the proteomic profile of normal bronchial epithelial cells using 1D-PAGE coupled with LC-MS/MS and linked this profile to smoking-induced transcriptional changes in these same cells. This approach has the potential to provide additional insight into host response to tobacco smoke and the pathogenesis of smoking-related lung disease.
Never (n = 5) and current smokers (n = 5) were recruited for fiberoptic bronchoscopy at Boston Medical Center. Detailed medical and smoking histories were obtained, including number of cigarettes smoked per day, cumulative tobacco exposure measured in pack-years, and an estimation of second-hand smoke exposure. Screening prior to bronchoscopy included an electrocardiogram, chest radiograph and spirometry. Participants with a history of underlying lung disease, significant second-hand smoke exposure, an abnormal baseline EKG, or evidence of obstructive lung disease on spirometry (defined as an FEV1/FVC<0.7) were excluded from the study. This study was approved by the Institutional Review Board at Boston Medical Center, and all subjects provided written informed consent.
Bronchial epithelial cell brushings from the right mainstem bronchus were obtained at the time of bronchoscopy with an endoscopic cytology brush (Cellebrity Endoscopic Cytology Brush, Boston Scientific, Natick, MA). Cytokeratin staining has demonstrated that this method results in the collection of greater than 90% pure population of bronchial epithelial cells
After cell lysis with 2% SDS, proteins were separated on a 4–20% polyacrylamide minigel by electrophoresis and stained with Coomassie Blue (Supporting
All samples were analyzed by LC-MS/MS using an LTQ ProteomeX ion trap mass spectrometer (ThermoFinnigan, Waltham, MA). Peptides from each gel slice were serially injected onto a home-packed C18 reverse-phase column (Magic C18AQ, 15 cm×100 micron ID, Michrom Bioresources, Inc., Auburn, CA) interfaced directly to the mass spectrometer. Peptides were separated using short, biphasic, 20-minute gradients of 0–90% acetonitrile in the presence of 0.5% acetic acid. From each parent ion scan (MS scan), the ten most intense ions were selected for collision-induced dissociation, and the spectra of the peptide fragments were recorded (MS2 scan).
The data were analyzed using SEQUEST software
Residual protein lysates from two never and five current smoker samples were quantified by 1D-PAGE and Coomassie blue staining (Supporting
Six to eight micrograms of RNA obtained from five of the never smoker and four of the current smoker participants was processed and hybridized to an Affymetrix HG-U133A GeneChip (Affymetrix Inc., Santa Clara, CA) containing ∼22,215 probesets as previously described
Expression Console Version 1.0 (Affymetrix Inc.) was used to generate a MAS5 weighted-mean expression level for each transcript and a detection p-value (Pdetection), which indicates the reliability of detection of that transcript above background on the array. The mean intensity for each array was scaled to 100. Each array included in the final analysis had at least 30% of the probesets detected above background (percent present >30%) and a 3′ to 5′ ratio of signal intensity for GAPDH of less than or equal to 5. One never smoker microarray was excluded based on these quality control filters (low percent present, high 3′ to 5′ GAPDH ratio), leaving four never and four current smoker arrays for analysis.
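The quality-control filters described above amount to a simple rule per array. The sketch below is illustrative only: the class, function names and sample values are hypothetical, and the scaling helper uses a plain arithmetic mean for simplicity, which may differ from the averaging applied by the vendor software.

```python
from dataclasses import dataclass

@dataclass
class ArrayQC:
    sample_id: str
    percent_present: float   # % of probesets detected above background
    gapdh_3to5_ratio: float  # 3'/5' signal intensity ratio for GAPDH

def passes_qc(array, min_present=30.0, max_ratio=5.0):
    """Keep arrays with percent present > 30% and GAPDH 3'/5' ratio <= 5."""
    return (array.percent_present > min_present
            and array.gapdh_3to5_ratio <= max_ratio)

def scale_to_target(signals, target=100.0):
    """Rescale signals so that the array's mean intensity equals the target."""
    factor = target / (sum(signals) / len(signals))
    return [s * factor for s in signals]

# Hypothetical QC metrics, not values from the study.
arrays = [
    ArrayQC("array_A", percent_present=45.0, gapdh_3to5_ratio=1.2),
    ArrayQC("array_B", percent_present=22.0, gapdh_3to5_ratio=7.5),
]
kept = [a.sample_id for a in arrays if passes_qc(a)]
print(kept)
```

In this toy input the second array fails both filters, mirroring the kind of exclusion (low percent present, high 3′ to 5′ GAPDH ratio) described for the one never smoker array.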
Sample contamination with significant numbers of non-epithelial cells was evaluated, as described previously
For each protein, we queried the microarray data from the same patient for expression (Pdetection<0.05) of a matching transcript. The significance of the overlap between detected proteins and transcripts was determined using Pearson's Chi-squared test with Yates' continuity correction.
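As an illustration of the test named above, the following sketch computes Pearson's chi-squared statistic with Yates' continuity correction for a 2×2 table. The table layout (protein detected or not versus transcript detected or not) and the example counts are assumptions for demonstration and are not the counts underlying the reported statistic.

```python
import math

def yates_chi_square(a, b, c, d):
    """Chi-squared test with Yates' correction for the 2x2 table [[a, b], [c, d]].

    Returns the corrected statistic and its p-value (1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates' correction subtracts n/2 from |ad - bc|, floored at zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = protein detected / not detected,
# columns = transcript detected / not detected.
chi2, p_value = yates_chi_square(10, 20, 30, 40)
print(f"chi2 = {chi2:.4f}, p = {p_value:.3f}")
```

The closed-form p-value is valid only for the single degree of freedom of a 2×2 table; larger tables would need a general chi-squared distribution function.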
A comparison of protein detection and transcript expression level was also performed for individual proteins of interest using the microarray data generated in this study and a previously published cohort of 23 never smokers and 34 current smokers
Functional enrichment analysis was performed using DAVID (
Additional information, including clinical data for all of the study participants, the complete list of proteins detected in each sample, percent peptide coverage for each protein and the expression levels for all genes in all samples are stored in a relational MYSQL database that is available at
1D-PAGE of a current smoker sample prior to mass spectrometry. Proteins from each sample were separated by 1D-PAGE prior to mass spectrometry. A representative sample is shown. MW indicates the molecular weight marker. BSA indicates a bovine serum albumin standard. CS indicates current smoker.
(2.28 MB TIF)
1D-PAGE for approximation of protein yield prior to Western Blot. A small amount of material from each sample was retained for Western blotting. To roughly normalize the protein contribution from each sample, a small amount of material from the remaining samples was analyzed on 1D-PAGE and stained with Coomassie blue. MW indicates a molecular weight standard. NS indicates never smokers, and CS indicates current smokers.
(2.04 MB TIF)
The authors thank Yves-Martine Dumas for her assistance with sample collection.