Research Article

A Genome-Wide Association Study Identifies rs2000999 as a Strong Genetic Determinant of Circulating Haptoglobin Levels

  • Philippe Froguel (equal contributor, corresponding author),

    Contributed equally to this work with: Philippe Froguel, Ndeye Coumba Ndiaye

    Affiliations: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France, Genomic Medicine, Imperial College London, Hammersmith Hospital, London, England

  • Ndeye Coumba Ndiaye (equal contributor),

    Contributed equally to this work with: Philippe Froguel, Ndeye Coumba Ndiaye

    Affiliation: EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France

  • Amélie Bonnefond,

    Affiliations: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France, EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France

  • Nabila Bouatia-Naji,

    Affiliation: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France

  • Aurélie Dechaume,

    Affiliation: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France

  • Gérard Siest,

    Affiliation: EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France

  • Bernard Herbeth,

    Affiliation: EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France

  • Mario Falchi,

    Affiliation: Genomic Medicine, Imperial College London, Hammersmith Hospital, London, England

  • Leonardo Bottolo,

    Affiliation: Genomic Medicine, Imperial College London, Hammersmith Hospital, London, England

  • Rosa-Maria Guéant-Rodriguez,

    Affiliation: Institut National de la Santé et de la Recherche Médicale (INSERM) U954, Faculté de Médecine, Nancy-Université, Nancy, France

  • Cécile Lecoeur,

    Affiliation: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France

  • Michel R. Langlois,

    Affiliations: Department of Clinical Chemistry, Ghent University Hospital, Ghent, Belgium, Department of Cardiovascular Diseases, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium

  • Yann Labrune,

    Affiliation: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France

  • Aimo Ruokonen,

    Affiliation: Institute of Clinical Medicine/Biochemistry, University of Oulu, Oulu, Finland

  • Said El Shamieh,

    Affiliation: EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France

  • Maria G. Stathopoulou,

    Affiliation: EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France

  • Anita Morandi,

    Affiliations: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France, Regional Centre for Juvenile Diabetes, Obesity and Clinical Nutrition, Verona, Italy

  • Claudio Maffeis,

    Affiliation: Regional Centre for Juvenile Diabetes, Obesity and Clinical Nutrition, Verona, Italy

  • David Meyre,

    Affiliation: Centre National de la Recherche Scientifique (CNRS) 8199 - Institute of Biology, Pasteur Institute, Lille 2 University, Lille, France

  • Joris R. Delanghe,

    Affiliation: Department of Clinical Chemistry, Ghent University Hospital, Ghent, Belgium

  • Peter Jacobson,

    Affiliation: Department of Molecular and Clinical Medicine, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden

  • Lars Sjöström,

    Affiliation: Department of Molecular and Clinical Medicine, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden

  • Lena M. S. Carlsson,

    Affiliation: Department of Molecular and Clinical Medicine, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden

  • Andrew Walley,

    Affiliation: Genomic Medicine, Imperial College London, Hammersmith Hospital, London, England

  • Paul Elliott,

    Affiliation: Department of Epidemiology and Biostatistics, MRC Health Protection Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College London, London, England

  • Marjo-Riitta Jarvelin,

    Affiliations: Department of Epidemiology and Biostatistics, MRC Health Protection Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College London, London, England, Institute of Health Sciences, Biocenter Oulu, University of Oulu, Oulu, Finland, Department of Children, Young People and Families, National Institute for Health and Welfare, Oulu, Finland

  • George V. Dedoussis,

    Affiliations: EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France, Department of Nutrition–Dietetics, Harokopio University, Athens, Greece

  • Sophie Visvikis-Siest (corresponding author)

    Affiliations: EA4373–‘Cardio-vascular Genetics’ Research Unit, Université de Lorraine, Nancy, France, Department of Internal Medicine and Geriatrics, ‘Centre Hospitalier Universitaire de Nancy’, Nancy, France

  • Published: March 05, 2012
  • DOI: 10.1371/journal.pone.0032327


Haptoglobin is an acute phase inflammatory marker. Its main function is to bind hemoglobin released from erythrocytes to aid its elimination, and thereby haptoglobin prevents the generation of reactive oxygen species in the blood. Haptoglobin levels have been repeatedly associated with a variety of inflammation-linked infectious and non-infectious diseases, including malaria, tuberculosis, human immunodeficiency virus, hepatitis C, diabetes, carotid atherosclerosis, and acute myocardial infarction. However, a comprehensive genetic assessment of the inter-individual variability of circulating haptoglobin levels has not been conducted so far.

We performed a genome-wide association study in 631 French children, followed by replication in three additional European sample sets, and identified a common single nucleotide polymorphism (SNP), rs2000999, located in the Haptoglobin gene (HP), as a strong genetic predictor of circulating haptoglobin levels (Poverall = 8.1×10−59), explaining 45.4% of its genetic variability (11.8% of Hp global variance). The functional relevance of rs2000999 was further supported by its specific association with HP mRNA levels (β = 0.23±0.08, P = 0.007). Finally, SNP rs2000999 was associated with decreased total and low-density lipoprotein cholesterol in 8,789 European children (Ptotal cholesterol = 0.002 and PLDL = 0.0008).

Given the central position of haptoglobin in many inflammation-related metabolic pathways, the relevance of rs2000999 genotyping when evaluating haptoglobin concentration should be further investigated in order to improve its diagnostic/therapeutic and/or prevention impact.


Human haptoglobin (Hp) is an acute phase inflammatory glycoprotein essentially synthesized by the liver and up-regulated by cytokines [1]. Hp is polymorphic with two co-dominant alleles, Hp1 and Hp2, encoded by the Haptoglobin (HP) gene located on chromosome 16 and resulting in three common isoforms: Hp1-1, Hp2-2 and Hp2-1 (called the HP ‘common polymorphism’) [2]. In normal physiological conditions, Hp protein concentration in blood ranges between 0.3 and 2.0 g/L in adults [3] but is significantly lower during the first decade of life [4]. The main property of Hp is to scavenge circulating hemoglobin (Hb) released by hemolysis or normal red blood cell turnover [5]. The resulting circulating Hp-Hb complexes are eliminated by Kupffer cells in the liver, preventing the generation of reactive oxygen species [2], [6]. Therefore, Hp plays an important role in preventing the renal damage and iron loss that can occur following intravascular hemolysis. Hp is also able to bind apolipoprotein (Apo) A-I [7] to protect the Apo A-I effector domain of lecithin-cholesterol acyltransferase against oxidative stress, and Hp consequently modulates high-density lipoprotein (HDL) function [8]. Furthermore, Hp can bind Apo E and the resulting complexes influence cholesterol esterification [9]. These functional characteristics confer on Hp a major role in the reverse transport of cholesterol from peripheral cells to the liver for degradation.

Hp levels and the HP rs72294371 ‘common polymorphism’ (Data S1) have been consistently associated with inflammation-linked infectious [10], [11] and non-communicable diseases [11], [12]. Malaria caused by Plasmodium falciparum, which is associated with extensive intravascular hemolysis, decreases Hp to undetectable levels as the Hb-scavenging system is saturated [13]. In malaria-endemic areas, hypohaptoglobinemia has been proposed as an indirect biochemical indicator of malaria [14]. The HP ‘common polymorphism’ should be considered at diagnosis of tuberculosis. Eisaev and colleagues [15] described an increased recurrence of pulmonary tuberculosis with worse prognosis in Hp2-2 Caucasians. Furthermore, the HP ‘common polymorphism’ contributes to mortality and viral load in human immunodeficiency virus (HIV) infection [16]. Hp2-2 HIV carriers have a more pronounced viral replication rate and a worse prognosis compared to Hp1-1 or Hp2-1 HIV carriers [16], [17]. Hepatitis C infection has also been associated with low serum Hp concentrations [18], and an overrepresentation of the Hp1-1 phenotype has been associated with high risk for chronic hepatitis C [19], [20]. The HP ‘common polymorphism’ also has an effect on various other infectious diseases [21], [22], [23], [24].

In Type 2 diabetes patients, the Hp2-2 phenotype has been suggested to confer greater risk of cardiovascular events [12], [25], [26] and of carotid atherosclerosis [27]. Moreover, a high Hp level is a risk factor for acute myocardial infarction, stroke and heart failure [28], [29]. It has therefore been suggested that routine measurement of Hp levels be incorporated into daily medical practice to evaluate cardiovascular risk [29].

Despite these findings, the basis of Hp level inter-individual variability is still unknown. To identify genetic variants modulating physiological levels of Hp, we analyzed genome-wide association study (GWAS) data generated in European children for whom no age-related disease may influence Hp concentrations. We also assessed the association between the identified variants and cardiovascular risk factors (total, HDL and low density lipoprotein-cholesterol, Apolipoproteins A1 and B).


Table 1 shows the phenotypic characteristics of the studied populations. Hp levels in children were low, in accordance with its reference distribution [4].


Table 1. Phenotypic characteristics of the studied populations.


Patterns of family correlations for serum Hp concentrations were assessed using both unadjusted and adjusted values (Table 2). Model 1, which was not adjusted, did not show any family correlation. Model 2, which took into account age and body mass index (BMI) as covariates, showed significant correlations for all the various pairs of relatives. Model 3, which hypothesized no effect of gender on family correlations, showed significant father-mother, father-son and son-son correlations (Table 2).


Table 2. Estimates of familial correlations ± standard error for serum haptoglobin concentration (656 families/2680 individuals).


We then assessed the components of variance attributable to additive genetic effects, shared household effects and residual environmental factors (including assay variability) in 656 nuclear bi-parental families (2,680 individuals) from the STANISLAS Family Study (SFS) cohort (Table 3). Model 2, which included the three components after adjustment for age and BMI gave a better description of the variance decomposition than model 1 which was not adjusted. Hp genetic variance represented 26% (P<0.001) of the total variance. Shared (i.e. within families) and random environmental variances were 11.6 and 62.4% respectively (Table 3).


Table 3. Variance components of serum haptoglobin concentrations (656 families/2680 individuals, rs2000999 & rs4788597 as allelic frequency).


Our GWAS based on 631 unrelated children of the SFS cohort showed the strongest association signal for Hp levels in a 218-kb linkage disequilibrium (LD) block on chromosome 16 that includes the HP gene (Figure 1). Using the square-root transformed Hp measurement adjusted for gender, age and z-BMI under the additive model, we identified in this region two significant association signals 90 kb apart: rs2000999 (with A as effect allele: β = −0.123, standard error [SE] = 0.017, P = 6.32×10−13; Table 4) and rs10492825 (with C as effect allele: β = −0.0876, SE = 0.016, P = 5.50×10−08; Table 4). SNPs rs2000999 and rs10492825 display moderate LD (r2 = 0.48, HapMap CEU release #27). To assess the redundancy between these two signals, we ran conditional regression analyses for both SNPs adjusted for each other and found that rs2000999 alone drove the association observed at the HP locus (rs2000999: Prs10492825 adjusted = 1.95×10−7; rs10492825: Prs2000999 adjusted = 0.91; Data S2, Table S1).
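The single-SNP test described above can be illustrated with a minimal sketch. It assumes a plain least-squares regression of the square-root transformed trait on the effect-allele count (0, 1 or 2) and a normal approximation for the P value; the covariate adjustment (gender, age, z-BMI) used in the study is deliberately left out, and the function names and toy values are hypothetical.

```python
import math

def additive_association(allele_counts, hp_levels):
    """Regress sqrt(Hp) on effect-allele count; return (beta, standard error)."""
    y = [math.sqrt(v) for v in hp_levels]      # square-root transform of the trait
    x = [float(g) for g in allele_counts]      # additive coding: 0, 1 or 2 copies
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum((yi - intercept - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)        # standard error of the slope
    return beta, se

def wald_p_value(beta, se):
    """Two-sided P value from a normal approximation to beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))

# Toy data (hypothetical): Hp in g/L for carriers of 0, 1 and 2 effect alleles.
beta, se = additive_association([0, 0, 1, 1, 2, 2],
                                [1.00, 0.81, 0.64, 0.49, 0.36, 0.25])
```

Applied to the reported β = −0.123 and SE = 0.017, the normal approximation in `wald_p_value` gives a P value on the order of 10−13, in line with the reported 6.32×10−13.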


Figure 1. Manhattan plot of the GWAS of the discovery cohort comprising 631 children.

A, A Manhattan plot showing the −log10(P values) of SNPs from the association analysis of the 631 SFS children from stage 1. B, An overview of the −log10(P values) of Chromosome 16. C, The genomic region of the 218-kb LD block displayed in UCSC Genome Browser.


Table 4. Discovery and replication of rs2000999 association data for Hp levels.


We confirmed the association between SNP rs2000999 and circulating Hp levels in three additional independent European cohorts: the GENDAI study of Greek children, a subset of obese children from the North of France (Ntotal = 1,434), and a familial subset (Ntotal = 2,957) of the SFS cohort (Preplication = 3.49×10−41, Poverall = 8.09×10−59; Table 4).

Accounting for the rs2000999 allelic frequency, the pattern of familial correlation (Table 2, model 4) decreased from 0.230 to 0.206 and from 0.274 to 0.239 for sibling and child-parent respectively, whereas the adequacy of the model was significantly improved. Additional adjustment for rs2000999 for the components of variance attributable to additive genetic effects, shared household effects and residual environmental factors (Table 3, model 3) significantly improved the likelihood function and the proportion of phenotypic variability accounted for by genetic effects decreased (26.0% to 14.2%, in comparison to model 2). Moreover, the component attributable to household factors increased (11.6% to 14.7%). We thus determined that rs2000999 is the major genetic determinant of Hp levels accounting for 11.8% of Hp global variance and 45.4% of the genetic variance of this trait.
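The two percentages quoted for rs2000999 follow directly from the drop in the genetic variance component between models 2 and 3. The short sketch below reproduces that arithmetic; the function name is hypothetical and the inputs are the values given in the text.

```python
def variance_explained_by_snp(genetic_pct_before, genetic_pct_after):
    """Share of variance attributed to a SNP from the drop in genetic variance.

    Both arguments are percentages of total phenotypic variance, estimated
    without and with adjustment for the SNP.
    """
    share_of_total = genetic_pct_before - genetic_pct_after
    share_of_genetic = share_of_total / genetic_pct_before
    return share_of_total, share_of_genetic

total, genetic = variance_explained_by_snp(26.0, 14.2)
print(f"{total:.1f}% of total variance, {genetic:.1%} of genetic variance")
# prints "11.8% of total variance, 45.4% of genetic variance"
```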

In order to assess the degree of independence of rs2000999 from the HP rs72294371 ‘common polymorphism’ (Data S1), we genotyped the latter in the GWAS first-stage children (SFS cohort) using a PCR-based method with gel reading. Only a subset of 341 out of 631 samples was successfully genotyped after independent readings by two readers. In this sample set, we found no evidence for LD between the HP ‘common polymorphism’ and rs2000999 (r2 = 0.135), nor with the other SNPs genotyped by the Illumina array within the 218-kb LD block that includes both the HP ‘common polymorphism’ and rs2000999 (0.001<r2<0.137; N = 31 SNPs). Furthermore, the HP ‘common polymorphism’ (minor allele frequency [MAF] = 0.46) and rs2000999 (MAF = 0.20) were both highly associated with Hp levels, as expected (P = 4×10−7 and P = 1×10−7, respectively). When both variants were included in the same regression model, we found that they significantly and independently contributed to Hp levels (PHP rs72294371 ‘common polymorphism’ = 0.001 and Prs2000999 = 5×10−5), indicating that the association with rs2000999 would be novel and not redundant with the HP ‘common polymorphism’. However, it is noteworthy that, despite strong efforts, we fell far short of genotyping all samples. We used two other technologies: a pre-designed TaqMan copy number assay (Applied Biosystems) and a PCR-based method with a different design from the one previously used. Unfortunately, concordance between the three methods was poor (<70%). Given the current state of the art, we therefore cannot definitively exclude that the present association signal is related to the HP ‘common polymorphism’ genotype.

In order to validate our main results, we secondarily assessed the effect of SNP rs2000999 on HP gene expression in subcutaneous adipose tissue samples from 194 non-obese subjects ascertained from the Swedish SibPair cohort (Data S2). We found a significant contribution of rs2000999 to HP expression (β = 0.23±0.08; P = 0.007; PBayesian = 0.006).

Finally, we assessed under an additive model the effect of SNP rs2000999 on total, HDL and low-density lipoprotein (LDL) cholesterol and on Apolipoproteins A1 and B in five independent European pediatric cohorts totaling 8,789 children. Total cholesterol was ln-transformed and LDL cholesterol was normalized by taking the square root. All measurements were adjusted for gender, age (except in the NFBC1986) and z-score BMI. Our data showed that rs2000999, with A as effect allele, was associated with total cholesterol (β = −0.011, SE = 0.003, P = 0.002; Table 5) and LDL-cholesterol (β = −0.017, SE = 0.004, P = 0.0008; Table 5). The associations with HDL-cholesterol and Apolipoproteins A1 and B are displayed in Table S2.


Table 5. Association of rs2000999 with lipid traits.



We first determined in 656 nuclear families that 26% of the Hp plasma level variance was under genetic control. Then, using a GWAS in 631 children from the same population and replicating in three independent populations, we identified rs2000999 as the major genetic determinant of Hp levels. This genetic variant alone explained 45.4% of the genetic variance of this trait (11.8% of Hp global variance). SNP rs2000999 is located in an intronic region of the HP gene, in a region previously believed to be the HPR gene (encoding the haptoglobin-related protein), which shares more than 90% nucleotide sequence homology with HP [30]. It lies 17 kb from the HP ‘common polymorphism’, a duplication of 59 α chain amino acid residues that results from a 1.7-kb intragenic duplication [31].

Our study shows that SNP rs2000999 also modulated expression levels of the Hp mRNA in human adipose tissue suggesting that this SNP (or a SNP in very strong LD with this one) is indeed functional. It is noteworthy that SNP rs2000999 has been previously reported to associate with total cholesterol in 4,200 adults from the EUROSPAN consortium [32] and with both total and LDL-cholesterol in 100,000 adults of European and non-European ancestry [33]. Interestingly, we confirmed the effect of this SNP on these lipid traits in European children.

Increased plasma levels of several inflammatory markers correlate with higher incidence and worse prognosis of various cardiovascular diseases [34], [35], [36], [37]. Hp level measurement has recently been shown to improve the prediction of major cardiovascular events [29]. As rs2000999 is also associated with lipid levels, this marker may link inflammation and cardiovascular risk. It is noteworthy that the association of rs2000999 with lipids occurs early in life, consistent with previous findings that the precursors of cardiovascular diseases originate in childhood [38], [39].

Interestingly, the effect of rs2000999 on Hp levels is stronger in our discovery cohort, which includes healthy children with low Hp concentrations (0.65±0.39 g/L). As shown in analyses of other diseases [40], the statistical power of GWAS can be increased in healthy homogeneous controls.

In addition, the effects of aging and of the environment are minimized in children. Thus, by using healthy pediatric populations, we were able to assess more accurately the effect of SNP rs2000999 on Hp levels.

We tried to assess the degree of independence between rs2000999 and the HP ‘common polymorphism’. Three different methods were evaluated to genotype the HP ‘common polymorphism’ in our whole GWAS sample set. Unfortunately, concordance between the three methods was poor, which underlines a major difficulty in accurately genotyping this polymorphism. Although, to our knowledge, this difficulty has not been clearly discussed or published, it is acknowledged in the scientific field and is likely to arise in the clinical diagnostic setting as well. In contrast, SNP rs2000999 can be accurately and easily genotyped.

Our findings should be further replicated in non-European adults, especially in those affected by infectious diseases. More generally, rs2000999 should be assessed in cohorts of patients affected by the large variety of diseases associated with Hp levels. This is not a trivial task, as Hp is a trait that has been infrequently measured in cohorts used for genetic studies. Given the major effect of rs2000999 on HP gene expression and on Hp levels, a Mendelian randomization approach would be of interest to test the causative effect of this SNP on infectious and non-communicable phenotypes in order to assess its clinical relevance.

Materials and Methods

Ethics Statement

All the populations involved in the present study were recruited in accordance with the latest version of the Declaration of Helsinki for Ethical Principles for Medical Research Involving Human Subjects. All participants and their parents gave written informed consent. Genetic study protocols were approved by the local ethics committees for the protection of subjects for biomedical research: the Comité Consultatif de Protection des Personnes dans la Recherche Biomédicale (CCPPRB).

Study populations

The STANISLAS Family Study (SFS).

The SFS is a 10-year longitudinal survey involving 1,006 volunteer families of European ancestry whose members were free of chronic disease (cardiovascular or cancer) with recruitment taking place from 1993–95 [41]. The SFS samples and data are part of the Biological Resources Centre (BRC) “Interactions Gène-Environnement en Physiopathologie CardioVasculaire” (IGE-PCV) in Nancy, France. Genome-wide genotyping was performed on a subset of 631 unrelated children (mean age 11.93 years [11.76–12.11]) constituting the discovery cohort [42] after screening for latent population substructure (Data S2). The 2,957 remaining individuals after quality control were analysed in the replication studies (mean age 29.84 [29.38–30.30]). Hp levels, BMI and the cardiovascular risk traits including total, high density lipoprotein (HDL) and low density lipoprotein (LDL)-cholesterol (calculated by the Friedewald formula [43]), Apolipoprotein A1 and B were available for all participants.

Obese Children.

We studied obese children (defined as BMI>97th percentile for age and sex according to a French cohort [44]) ascertained from 449 nuclear families with at least one obese offspring, recruited in the Paediatric Endocrine Unit of Jeanne de Flandres Hospital of Lille, France or through a national media campaign. We analyzed 1,015 children (mean age 11.07 years [10.86–11.27]) for whom Hp, BMI, total, HDL and LDL-cholesterol, Apolipoprotein A1 and B measurements were available.

The GeNe and Diet Attica Investigation (GENDAI).

The GENDAI pediatric cohort was recruited from children living in the Attica region of Greece [45]. From November 2005 to June 2006, 1,138 peri-adolescent children were recruited from randomly selected elementary schools of Attica. We analyzed 419 children (mean age 11.16 years [11.10–11.23]) for whom Hp, BMI, total, HDL and LDL-cholesterol, Apolipoprotein A1 and B measurements were available.

The Northern Finland 1986 Birth Cohort (NFBC1986).

The NFBC1986 is a prospective birth cohort including all Finnish mothers of European ancestry with children whose expected date of birth fell between July 1, 1985 and June 30, 1986 in the two northernmost provinces in Finland [46]. Clinical examination at 15–16 years follow-up was conducted between August 2001 and June 2002. All cohort members living in Finland with known address (n = 9,215) were invited, and 6,798 participated (74%). We analyzed 5,310 adolescents successfully genotyped in the NFBC1986 cohort for whom BMI, total, HDL and LDL-cholesterol, Apolipoprotein A1 and B measurements were available.

The Verona cohort.

The Verona cohort consists of Italian children recruited from the general population of Verona, Italy, whose families were randomly chosen from the registry office database of the town, and contacted by post. We analysed 401 children (mean age 10.90 years [10.75–11.04]) successfully genotyped for whom at least BMI, total, HDL and LDL-cholesterol, Apolipoprotein A1 and B measurements were available.

The SibPair cohort.

The SibPair cohort comprises 154 nuclear families (732 subjects) from Sweden, each containing an obesity-discordant sib pair (at least 10 kg/m2 difference in BMI). Gene expression and genetic variation were analysed in 194 non-obese subjects from the SibPair cohort.


Genomewide genotypes were generated for the 631 unrelated SFS children using the Illumina Human CNV370-Duo array [42]. Briefly, 750 ng of genomic DNA was processed using Illumina's protocol for the BeadStation genotyping platform (Illumina), followed by GenCall software analysis (Illumina) to automatically cluster, call genotypes, and assign confidence scores using the GenTrain clustering algorithm (Illumina). We discarded a total of 2,552 SNPs for the following reasons: extreme Hardy-Weinberg disequilibrium (P<0.001), low genotyping call rates (<95%) or low minor-allele frequencies (<1%). We retained 318,237 SNPs for analysis. Genomic control λGC was 1.01.
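The three exclusion criteria can be written as a per-SNP filter. The sketch below is only an illustration: it assumes genotype counts are available for each SNP and uses a one-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium, which is an assumption since the text does not state which test was applied. The function name and argument layout are hypothetical.

```python
import math

def passes_qc(n_aa, n_ab, n_bb, n_missing,
              min_call_rate=0.95, min_maf=0.01, hwe_alpha=0.001):
    """Return True if a SNP survives the call-rate, MAF and HWE filters."""
    n_called = n_aa + n_ab + n_bb
    n_total = n_called + n_missing
    # Genotyping call rate: fraction of samples with a called genotype.
    if n_called == 0 or n_called / n_total < min_call_rate:
        return False
    # Minor-allele frequency from the called genotypes.
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    if min(freq_a, 1.0 - freq_a) < min_maf:
        return False
    # Hardy-Weinberg chi-square test (1 df), assumed here for illustration.
    expected = (n_called * freq_a ** 2,
                2 * n_called * freq_a * (1.0 - freq_a),
                n_called * (1.0 - freq_a) ** 2)
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip((n_aa, n_ab, n_bb), expected))
    p_hwe = math.erfc(math.sqrt(chi2 / 2.0))   # upper tail of chi-square, 1 df
    return p_hwe >= hwe_alpha
```

For example, counts of 360/480/160 with no missing calls match Hardy-Weinberg proportions exactly and pass, whereas a SNP with no heterozygotes among 1,000 samples (500/0/500) is rejected by the HWE filter.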

We used the Applied Biosystems SNPlex™ technology to replicate the association of genome-wide significant genetic variants in the SFS replication set, obese children and GENDAI, NFBC1986 and Verona cohorts.

SNPlex is based on the Oligonucleotide Ligation Assay (OLA) combined with multiplex PCR target amplification and was carried out as per the manufacturer's instructions. Allelic discrimination was performed by capillary electrophoresis analysis using an Applied Biosystems 3730xl DNA Analyzer and GeneMapper 3.7 software. Genotyping call rate was above 95% in all populations studied and genetic variants were in HW equilibrium (p>0.001).

We used a PCR-based method [47] to genotype the HP ‘common polymorphism’ in the 631 children of the discovery cohort (SFS cohort) in order to determine any linkage disequilibrium with the genome-wide significant variants identified in the analysis. Only genotypes that were concordant following a double-blind genotyping call by two independent readers were retained for statistical analyses (N = 341). Two additional genotyping methods for the HP ‘common polymorphism’ were used in order to validate the above method: a custom TaqMan copy number assay (Applied Biosystems) following the manufacturer's recommendations and another PCR-based method using the following oligonucleotide primers: 5′-CTCTCCTTTCTCCCTTCCTGTC-3′ and 5′-TTTATCCACTGCTTCTCATTGT-3′. We did not obtain correspondence between the banding patterns and the Hp genotypes.

Haptoglobin measurement

Blood samples were collected between 8:00 and 9:00 am or 11:30 and 12:30 pm by venipuncture after overnight fasting. Hp protein levels were measured in blood plasma samples by high-sensitivity immunophotometric analysis using the BN™II Siemens analyzer (Siemens, Marburg, Germany) and Siemens reagents, following the manufacturer's instructions.

Lipids measurements

Total cholesterol, HDL-cholesterol and apolipoproteins A1 and B were assayed using enzymatic methods (AU640 [Olympus, Watford, UK]) and LDL-cholesterol was calculated using the Friedewald formula [43].

Statistical analyses

Heritability estimate of Hp levels in the SFS.

Intra-familial correlations were estimated by using maximum likelihood techniques [48] with and without adjustment for covariates. This statistical approach allowed adjustment for covariates within models, simultaneously and separately for fathers, mothers, sons and daughters. The significance of various familial correlations, or sex and generation differences in correlations, was tested using the log-likelihood ratio test. Correlations were computed under two sets of hypotheses: gender effects on correlations for parents and children and no gender effect for all correlations.

Variance component analysis was applied in order to assess the relative contributions of genetic factors, common household factors and individual-specific environment to the familial aggregation of serum haptoglobin concentrations. The variable used to estimate variance components was adjusted for age and BMI, separately for fathers, mothers, sons and daughters. The analysis was conducted by using a multivariate normal model for pedigree analysis as described by Lange and colleagues [49], [50] with the software FISHER, which also performed tests of goodness-of-fit of the underlying multinormal distribution. The general model assumed that the studied trait was the result of the sum of three independent random components: a polygenic component (G) representing additive genetic factors, household factors common to individuals within a family (H) and unmeasured environmental factors particular to an individual (including measurement error) (E). These three components were assumed to be normally distributed with mean equal to 0 and variance equal to σ2G, σ2H and σ2E, respectively.
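For reference, the three-component model can be written out explicitly. The formulation below is the standard polygenic variance-component form and is given as a sketch: the symbols μ (trait mean after covariate adjustment), φ_ij (kinship coefficient between relatives i and j), γ_ij (1 if i and j share a household, 0 otherwise) and h² (heritability) are notational assumptions that do not appear in the text.

```latex
\begin{aligned}
y_i &= \mu + G_i + H_i + E_i, \\
\operatorname{Var}(y_i) &= \sigma^2_G + \sigma^2_H + \sigma^2_E, \\
\operatorname{Cov}(y_i, y_j) &= 2\phi_{ij}\,\sigma^2_G + \gamma_{ij}\,\sigma^2_H \qquad (i \neq j), \\
h^2 &= \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_H + \sigma^2_E}.
\end{aligned}
```

With the estimates reported in Table 3 (model 2), the three components account for 26%, 11.6% and 62.4% of the total variance, so h² = 0.26.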

The hypothesis of no polygenic component or no household effect was checked by comparing a model including σ2G, σ2H and σ2E with a model including only σ2H and σ2E or σ2G and σ2E, respectively. In addition, possible effects of covariates (age and BMI) and genome-wide significant variants' allelic frequency on these variance components were tested.

Comparison of nested models was based on the likelihood ratio criteria. Eventually, the best parsimonious model was selected. The percentage contributions of the three components, additive genetic factors (heritability), household factors and residual environmental, to residual phenotypic variance (after adjustment for covariates) were determined.
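The nested-model comparison can be sketched as a likelihood ratio test. The example assumes the two models differ by a single parameter and refers the statistic to a chi-square distribution with one degree of freedom; when the parameter dropped is a variance component constrained to be non-negative, this reference distribution is conservative. The function name and the log-likelihood values used for illustration are hypothetical.

```python
import math

def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Compare two nested models differing by one parameter.

    Returns the likelihood ratio statistic and its P value under a
    chi-square distribution with 1 degree of freedom.
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    p_value = math.erfc(math.sqrt(stat / 2.0))   # upper tail of chi-square, 1 df
    return stat, p_value

# Hypothetical log-likelihoods: full model (G + H + E) versus model without G.
stat, p_value = likelihood_ratio_test(-100.0, -101.92)
```

A statistic of about 3.84, as in this illustration, corresponds to a P value of about 0.05.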

Genome-wide association and replication analyses.

We carried out genome-wide association and replication analyses on Hp levels using linear mixed regression models under the additive genetic model with one degree of freedom, adjusting for age, gender and BMI and using PLINK [51]. The summary statistics were combined in the meta-analyses (Data S2), using the inverse normal method with equal weight for each population. In this method, the P values of each study are transformed into their inverse normal z scores and the weighted sum, over all studies, is compared to a normal N (0, 1), provided the sum of squared weights equals 1. The estimates of variant effects on Hp and their standard errors for each separate analysis were combined in the meta-analysis using the weighted inverse normal method, and the overall effect and its confidence interval were estimated using the inverse variance method implemented in the ‘meta.summaries’ function of the R RMETA package (meta/index.html). No major heterogeneity in effects was observed (P>0.02). The same mixed model and the same software were used to analyse the association of genetic variants with lipid traits.
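Both combination rules can be sketched in a few lines. The first function implements the inverse normal method as described, assuming one-sided P values that all test the same direction of effect (an assumption; handling of effect direction is not detailed in the text). The second implements fixed-effect inverse variance pooling. The function names are hypothetical and the code is not the RMETA implementation.

```python
import math
from statistics import NormalDist

def inverse_normal_combined_p(p_values, weights=None):
    """Combine one-sided P values with the weighted inverse normal method."""
    if weights is None:
        weights = [1.0] * len(p_values)           # equal weight per population
    norm = math.sqrt(sum(w * w for w in weights))
    weights = [w / norm for w in weights]         # squared weights now sum to 1
    std_normal = NormalDist()
    # z = -inv_cdf(p) equals inv_cdf(1 - p) and stays accurate for tiny p.
    z_scores = [-std_normal.inv_cdf(p) for p in p_values]
    z_combined = sum(w * z for w, z in zip(weights, z_scores))
    return 0.5 * math.erfc(z_combined / math.sqrt(2.0))   # upper tail of N(0, 1)

def inverse_variance_pooled(betas, std_errors):
    """Fixed-effect pooled estimate and standard error across studies."""
    inv_var = [1.0 / se ** 2 for se in std_errors]
    pooled_beta = sum(w * b for w, b in zip(inv_var, betas)) / sum(inv_var)
    pooled_se = math.sqrt(1.0 / sum(inv_var))
    return pooled_beta, pooled_se
```

As a check on the weighting, two studies that each give a one-sided P value of 0.05 combine to about 0.01 under equal weights.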

Gene-expression investigation.

To investigate the effect of genome-wide significant variants on gene expression (Data S2), we used data from 194 non-obese individuals from the SibPair cohort [52]. Gene expression data for HP was measured in subcutaneous adipose tissue [53] from 347 siblings using the Affymetrix Human U133 Plus 2.0 platform (208470_s_at and 208471_at, respectively). DNA was isolated from peripheral blood and genotypes were generated using Illumina 610-Quad arrays.

We used a linear mixed model (Pinheiro and Bates, 2000) to assess the association of SNPs with gene expression. Log-transformed expression level was regressed on a random-effect term, which accommodates the family pedigree structure, and on the fixed-effect terms, i.e. sex, age, BMI and the SNP of interest (recoded as 0 = AA; 1 = AG; 2 = GG according to an additive model). Analysis was carried out using the R function lmer() (package lme4), with p-values obtained from the t-statistic.
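
The additive recoding and log transformation described above can be illustrated with a short Python sketch. It covers only the preparation of the response and fixed-effect terms, not the mixed model fit itself; the function and field names are hypothetical, and the natural logarithm is an assumption, since the base of the transformation is not stated.

```python
from math import log


def additive_dosage(genotype, counted_allele="G"):
    """Recode a biallelic genotype as a count of one allele.

    With the default, this gives 0 = AA, 1 = AG, 2 = GG.
    """
    alleles = genotype.upper()
    if len(alleles) != 2:
        raise ValueError("expected a two-letter genotype such as 'AG'")
    return sum(1 for allele in alleles if allele == counted_allele)


def model_row(expression, sex, age, bmi, genotype):
    """Build one observation: log-transformed response plus fixed-effect terms."""
    return {
        "log_expression": log(expression),  # natural log, an assumption
        "sex": sex,
        "age": age,
        "bmi": bmi,
        "snp": additive_dosage(genotype),
    }
```

Under this coding, the regression coefficient of the SNP term is the estimated change in log expression per additional copy of the counted allele.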

Significance of the fixed effects was further investigated in a Bayesian framework using the R function mcmcsamp() (package lme4), which generates Markov chain Monte Carlo samples from the posterior distribution of the parameters of a linear mixed model. The prior on the fixed-effect parameters is taken to be locally uniform, while the prior on the variance-covariance matrices of the random effects is taken to be the locally non-informative prior. Based on 100,000 samples drawn from the posterior distribution, we calculated the smallest p such that the (1−p) credible interval does not contain the value 0. This value was then used to check the p-value obtained from the t-statistic: if the t-statistic p-value was smaller than p, it was considered anticonservative and discarded.
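
One way to compute such a value from posterior samples is sketched below in Python. The sketch assumes an equal-tailed credible interval, which the text does not specify: under that assumption, the boundary value of p is twice the smaller of the two posterior tail proportions on either side of 0. The function name is hypothetical.

```python
def smallest_excluding_p(samples):
    """Boundary p at which the equal-tailed (1 - p) credible interval reaches 0.

    Assumption: the interval runs from the p/2 to the 1 - p/2 posterior
    quantile, so the boundary is twice the smaller tail proportion around 0,
    estimated empirically from the samples.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("at least one posterior sample is required")
    at_or_below_zero = sum(1 for s in samples if s <= 0) / n
    at_or_above_zero = sum(1 for s in samples if s >= 0) / n
    return min(1.0, 2.0 * min(at_or_below_zero, at_or_above_zero))
```

With a finite number of samples the result can only take multiples of 2/n, so 100,000 samples resolve values down to 0.00002.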

Supporting Information

Data S1.

HP rs72294371 ‘common polymorphism’ flanking sequence. Source: 1000 Genomes (ens/Variation/Summary?r=16:72090747-72091766v=rs72294371vdb=variationvf=13544249).



Data S2.

Supplementary Methods. 1. Screening of latent population substructure. 2. Conditional analysis. 3. Meta-analysis. 4. Gene-expression analysis.



Table S1.

Conditional regression within the recombination hotspot around rs2000999. UNADJ: unadjusted p-values, SNP: single nucleotide polymorphism, add: additive model.



Table S2.

Association of rs2000999 with HDL-cholesterol and Apolipoproteins A1 and B. N: sample size; MAF: Minor Allele Frequency; β: beta coefficient for the effect allele A.




Acknowledgments

The authors thank the study participants and their families for the time and effort they devoted to this study. They also thank all the field investigators for the recruitment and examination of the populations involved in this study.

We thank Sidonie Vivequin for the double reading of all gels generated for the genotyping of the HP ‘common polymorphism’, Jean-Claude Chèvre and Franck De Greave for bioinformatics support, and Marianne Deweirder and Michèle Pfister for DNA preparation.

Author Contributions

Conceived and designed the experiments: PF NCN SVS. Performed the experiments: RMGR AB NBN AD MF LB SES AW PJ LS LC. Analyzed the data: NCN BH CL YL. Contributed reagents/materials/analysis tools: PF SVS. Wrote the paper: PF NCN AB NBN GS MF AW GVD SVS. Study supervision: PF NCN SVS. Literature search: PF NCN AB NBN GS MRL SES MGS JRD SVS. Population providers: PF AM CM PJ LS LC PE MRJ GVD SVS AR. Haptoglobin measurements: GS RMGR SVS. Common polymorphism genotyping: AB NBN AD. Functionality studies/Gene-expression investigation: MF LB SES AW. Microarray expression profiling of adipose tissue: PJ LS LC. Data interpretation: PF NCN AB NBN MRL MGS DM JRD SVS. Equally contributed as first authors: PF NCN. Equal corresponding authors: PF SVS.


  1. Gabay C, Kushner I (1999) Acute-phase proteins and other systemic responses to inflammation. N Engl J Med 340: 448–454.
  2. Bowman BH, Kurosky A (1982) Haptoglobin: the evolutionary product of duplication, unequal crossing over, and point mutation. Adv Hum Genet 12: 189–261.
  3. Dati F, Schumann G, Thomas L, Aguzzi F, Baudner S, et al. (1996) Consensus of a group of professional societies and diagnostic companies on guidelines for interim reference ranges for 14 proteins in serum based on the standardization against the IFCC/BCR/CAP Reference Material (CRM 470). International Federation of Clinical Chemistry. Community Bureau of Reference of the Commission of the European Communities. College of American Pathologists. Eur J Clin Chem Clin Biochem 34: 517–520.
  4. Ritchie RF, Palomaki GE, Neveux LM, Navolotskaia O, Ledue TB, et al. (2000) Reference distributions for the positive acute phase serum proteins, alpha1-acid glycoprotein (orosomucoid), alpha1-antitrypsin, and haptoglobin: a practical, simple, and clinically relevant approach in a large cohort. J Clin Lab Anal 14: 284–292.
  5. Quaye IK (2008) Haptoglobin, inflammation and disease. Trans R Soc Trop Med Hyg 102: 735–742.
  6. Okazaki T, Yanagisawa Y, Nagai T (1997) Analysis of the affinity of each haptoglobin polymer for hemoglobin by two-dimensional affinity electrophoresis. Clin Chim Acta 258: 137–144.
  7. Braeckman L, De Bacquer D, Delanghe J, Claeys L, De Backer G (1999) Associations between haptoglobin polymorphism, lipids, lipoproteins and inflammatory variables. Atherosclerosis 143: 383–388.
  8. Balestrieri M, Cigliano L, Simone ML, Dale B, Abrescia P (2001) Haptoglobin inhibits lecithin-cholesterol acyltransferase in human ovarian follicular fluid. Mol Reprod Dev 59: 186–191.
  9. Salvatore A, Cigliano L, Carlucci A, Bucci EM, Abrescia P (2009) Haptoglobin binds apolipoprotein E and influences cholesterol esterification in the cerebrospinal fluid. J Neurochem 110: 255–263.
  10. Kasvosve I, Speeckaert MM, Speeckaert R, Masukume G, Delanghe JR (2010) Haptoglobin polymorphism and infection. Adv Clin Chem 50: 23–46.
  11. Langlois MR, Delanghe JR (1996) Biological and clinical significance of haptoglobin polymorphism in humans. Clin Chem 42: 1589–1600.
  12. Asleh R, Marsh S, Shilkrut M, Binah O, Guetta J, et al. (2003) Genetically determined heterogeneity in hemoglobin scavenging and susceptibility to diabetic cardiovascular disease. Circ Res 92: 1193–1200.
  13. Rother RP, Bell L, Hillmen P, Gladwin MT (2005) The clinical sequelae of intravascular hemolysis and extracellular plasma hemoglobin: a novel mechanism of human disease. JAMA 293: 1653–1662.
  14. Mohapatra MK, Mohanty S, Mohanty BK, Sahu GN (1999) Hypohaptoglobinaemia as a biochemical and epidemiological marker of falciparum malaria. J Assoc Physicians India 47: 874–877.
  15. Eisaev BA (1995) [Results of the treatment of patients with recurrence of pulmonary tuberculosis with different types of haptoglobin]. Probl Tuberk 20–22.
  16. Delanghe JR, Langlois MR, Boelaert JR, Van Acker J, Van Wanzeele F, et al. (1998) Haptoglobin polymorphism, iron metabolism and mortality in HIV infection. AIDS 12: 1027–1032.
  17. Friis H, Gomo E, Nyazema N, Ndhlovu P, Krarup H, et al. (2003) Iron, haptoglobin phenotype, and HIV-1 viral load: a cross-sectional study among pregnant Zimbabwean women. J Acquir Immune Defic Syndr 33: 74–81.
  18. Bacq Y, Schillio Y, Brechot JF, De Muret A, Dubois F, et al. (1993) [Decrease of haptoglobin serum level in patients with chronic viral hepatitis C]. Gastroenterol Clin Biol 17: 364–369.
  19. Louagie HK, Brouwer JT, Delanghe JR, De Buyzere ML, Leroux-Roels GG (1996) Haptoglobin polymorphism and chronic hepatitis C. J Hepatol 25: 10–14.
  20. Van Vlierberghe H, Delanghe JR, De Bie S, Praet M, De Paepe A, et al. (2001) Association between Cys282Tyr missense mutation and haptoglobin phenotype polymorphism in patients with chronic hepatitis C. Eur J Gastroenterol Hepatol 13: 1077–1081.
  21. Delanghe J, Langlois M, Ouyang J, Claeys G, De Buyzere M, et al. (1998) Effect of haptoglobin phenotypes on growth of Streptococcus pyogenes. Clin Chem Lab Med 36: 691–696.
  22. Rohde KH, Dyer DW (2004) Analysis of haptoglobin and hemoglobin-haptoglobin interactions with the Neisseria meningitidis TonB-dependent receptor HpuAB by flow cytometry. Infect Immun 72: 2494–2506.
  23. Calderoni DR, Andrade Tdos S, Grotto HZ (2006) Haptoglobin phenotype appears to affect the pathogenesis of American trypanosomiasis. Ann Trop Med Parasitol 100: 213–221.
  24. Speeckaert R, Speeckaert MM, Padalko E, Claeys LR, Delanghe JR (2009) The haptoglobin phenotype is associated with the Epstein-Barr virus antibody titer. Clin Chem Lab Med 47: 826–828.
  25. Asleh R, Guetta J, Kalet-Litman S, Miller-Lotan R, Levy AP (2005) Haptoglobin genotype- and diabetes-dependent differences in iron-mediated oxidative stress in vitro and in vivo. Circ Res 96: 435–441.
  26. Levy AP, Roguin A, Hochberg I, Herer P, Marsh S, et al. (2000) Haptoglobin phenotype and vascular complications in patients with diabetes. N Engl J Med 343: 969–970.
  27. Ryndel M, Behre CJ, Brohall G, Prahl U, Schmidt C, et al. (2010) The haptoglobin 2-2 genotype is associated with carotid atherosclerosis in 64-year old women with established diabetes. Clin Chim Acta 411: 500–504.
  28. Holme I, Aastveit AH, Hammar N, Jungner I, Walldius G (2009) Haptoglobin and risk of myocardial infarction, stroke, and congestive heart failure in 342,125 men and women in the Apolipoprotein MOrtality RISk study (AMORIS). Ann Med 41: 522–532.
  29. Holme I, Aastveit AH, Hammar N, Jungner I, Walldius G (2010) Inflammatory markers, lipoprotein components and risk of major cardiovascular events in 65,005 men and women in the Apolipoprotein MOrtality RISk study (AMORIS). Atherosclerosis 213: 299–305.
  30. Maeda N (1985) Nucleotide sequence of the haptoglobin and haptoglobin-related gene pair. The haptoglobin-related gene contains a retrovirus-like element. J Biol Chem 260: 6698–6709.
  31. Koda Y, Soejima M, Yoshioka N, Kimura H (1998) The haptoglobin-gene deletion responsible for anhaptoglobinemia. Am J Hum Genet 62: 245–252.
  32. Igl W, Johansson A, Wilson JF, Wild SH, Polasek O, et al. (2010) Modeling of environmental effects in genome-wide association studies identifies SLC2A2 and HP as novel loci influencing serum cholesterol levels. PLoS Genet 6: e1000798.
  33. Teslovich TM, Musunuru K, Smith AV, Edmondson AC, Stylianou IM, et al. (2010) Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466: 707–713.
  34. Zethelius B, Berglund L, Sundstrom J, Ingelsson E, Basu S, et al. (2008) Use of multiple biomarkers to improve the prediction of death from cardiovascular causes. N Engl J Med 358: 2107–2116.
  35. Melander O, Newton-Cheh C, Almgren P, Hedblad B, Berglund G, et al. (2009) Novel and conventional biomarkers for prediction of incident cardiovascular events in the community. JAMA 302: 49–57.
  36. Turner SJ, Ketch TR, Gandhi SK, Sane DC (2008) Routine hematologic clinical tests as prognostic markers in patients with acute coronary syndromes. Am Heart J 155: 806–816.
  37. Marcovina SM, Crea F, Davignon J, Kaski JC, Koenig W, et al. (2007) Biochemical and bioimaging markers for risk assessment and diagnosis in major cardiovascular diseases: a road to integration of complementary diagnostic tools. J Intern Med 261: 214–234.
  38. Raitakari OT, Juonala M, Kahonen M, Taittonen L, Laitinen T, et al. (2003) Cardiovascular risk factors in childhood and carotid artery intima-media thickness in adulthood: the Cardiovascular Risk in Young Finns Study. JAMA 290: 2277–2283.
  39. Mahoney LT, Burns TL, Stanford W, Thompson BH, Witt JD, et al. (1996) Coronary risk factors measured in childhood and young adult life are associated with coronary artery calcification in young adults: the Muscatine Study. J Am Coll Cardiol 27: 277–284.
  40. Bouatia-Naji N, Rocheleau G, Van Lommel L, Lemaire K, Schuit F, et al. (2008) A polymorphism within the G6PC2 gene is associated with fasting plasma glucose levels. Science 320: 1085–1088.
  41. Visvikis-Siest S, Siest G (2008) The STANISLAS Cohort: a 10-year follow-up of supposed healthy families. Gene-environment interactions, reference values and evaluation of biomarkers in prevention of cardiovascular diseases. Clin Chem Lab Med 46: 733–747.
  42. Bouatia-Naji N, Bonnefond A, Cavalcanti-Proenca C, Sparso T, Holmkvist J, et al. (2009) A variant near MTNR1B is associated with increased fasting plasma glucose levels and type 2 diabetes risk. Nat Genet 41: 89–94.
  43. Friedewald WT, Levy RI, Fredrickson DS (1972) Estimation of the concentration of low-density lipoprotein cholesterol in plasma, without use of the preparative ultracentrifuge. Clin Chem 18: 499–502.
  44. Rolland-Cachera MF, Cole TJ, Sempe M, Tichet J, Rossignol C, et al. (1991) Body Mass Index variations: centiles from birth to 87 years. Eur J Clin Nutr 45: 13–21.
  45. Papoutsakis C, Vidra NV, Hatzopoulou I, Tzirkalli M, Farmaki AE, et al. (2007) The Gene-Diet Attica investigation on childhood obesity (GENDAI): overview of the study design. Clin Chem Lab Med 45: 309–315.
  46. Jarvelin MR, Elliott P, Kleinschmidt I, Martuzzi M, Grundy C, et al. (1997) Ecological and individual predictors of birthweight in a northern Finland birth cohort 1986. Paediatr Perinat Epidemiol 11: 298–312.
  47. Koch W, Latz W, Eichinger M, Roguin A, Levy AP, et al. (2002) Genotyping of the common haptoglobin Hp 1/2 polymorphism based on PCR. Clin Chem 48: 1377–1382.
  48. Donner A, Koval JJ (1981) A multivariate analysis of family data. Am J Epidemiol 114: 149–154.
  49. Lange K, Weeks D, Boehnke M (1988) Programs for Pedigree Analysis: MENDEL, FISHER, and dGENE. Genet Epidemiol 5: 471–472.
  50. Lange K, Westlake J, Spence MA (1976) Extensions to pedigree analysis. III. Variance components by the scoring method. Ann Hum Genet 39: 485–491.
  51. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81: 559–575.
  52. Carlsson LM, Jacobson P, Walley A, Froguel P, Sjostrom L, et al. (2009) ALK7 expression is specific for adipose tissue, reduced in obesity and correlates to factors implicated in metabolic disease. Biochem Biophys Res Commun 382: 309–314.
  53. Trayhurn P, Wood IS (2004) Adipokines: inflammation and the pleiotropic role of white adipose tissue. Br J Nutr 92: 347–355.