Total Hip Replacement for the Treatment of End Stage Arthritis of the Hip: A Systematic Review and Meta-Analysis

  • Alexander Tsertsvadze,

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

  • Amy Grove,

    A.L.Grove@Warwick.ac.uk

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

  • Karoline Freeman,

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

  • Rachel Court,

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

  • Samantha Johnson,

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

  • Martin Connock,

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

  • Aileen Clarke,

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

  • Paul Sutcliffe

    Affiliation Warwick Evidence, Division of Health Sciences, Warwick Medical School, The University of Warwick, Coventry, England

Abstract

Background

Developments in the design, fixation methods, size, and bearing surface of implants for total hip replacement (THR) have led to a variety of options for healthcare professionals to consider. There is a need to determine the optimal combinations of THR implant components. This systematic review evaluated the clinical effectiveness of different types of THR used for the treatment of end stage arthritis of the hip.

Methods

A comprehensive literature search was undertaken in major health databases. Randomised controlled trials (RCTs) and systematic reviews published from 2008 onwards comparing different types of primary THR in patients with end stage arthritis of the hip were included.

Results

Fourteen RCTs and five systematic reviews were included. Patients experienced significant post-THR improvements in Harris Hip scores, but this did not differ between implant types. There was a reduced risk of implant dislocation after receiving a larger femoral head size (36 mm vs. 28 mm; RR = 0.17, 95% CI: 0.04, 0.78) or cemented cup (vs. cementless cup; pooled odds ratio: 0.34, 95% CI: 0.13, 0.89). Recipients of cross-linked vs. conventional polyethylene cup liners experienced reduced femoral head penetration and revision. There was no impact of femoral stem fixation and cup shell design on implant survival rates. Evidence on mortality and complications (aseptic loosening, femoral fracture) was inconclusive.

Conclusions

The majority of evidence was inconclusive due to poor reporting, missing data, or uncertainty in treatment estimates. The findings warrant cautious interpretation given the risk of bias (blinding, attrition), methodological limitations (small sample size, low event counts, short follow-up), and poor reporting. Long-term pragmatic RCTs are needed to allow for more definitive conclusions. Authors are encouraged to specify the minimal clinically important difference and power calculation for their primary outcome(s), and to follow the CONSORT, PRISMA, and STROBE guidelines, to ensure better reporting and more reliable production and assessment of evidence.

Introduction

Over the past few decades, total hip replacement (THR) has been reported as clinically effective in treating pain and disability resulting from late stage arthritis of the hip [1]. THR is indicated for patients who have failed to respond to non-surgical management options such as pharmaceutical treatments (e.g., analgesics, anti-inflammatory agents, steroid injections, topical treatments), self-management, patient education, acupuncture, exercise, physical therapy, or manual therapy [2]–[3]. The procedure involves the replacement of a damaged hip joint with an artificial hip prosthesis consisting of an acetabular cup (with or without shell), a femoral stem, and a femoral head.

Rates of THR in the western world have steadily increased between 2005 and 2010 [3]. A total of 86,488 hip procedures were recorded on the UK National Joint Registry in 2012; a 7.5% increase from 2011 [4]. In 2012, 76,448 primary hip procedures were undertaken and 10,040 revisions. This ‘revision’ burden now stands at 12% of total hip activity compared to 11% in 2011 [4].

Continued marketing approval for evolving implant component designs, prosthesis-to-bone fixation methods (e.g., cemented, cementless, hybrid), femoral head sizes, and bearing surface articulations (e.g., metal, ceramic, polyethylene) has resulted in a multitude of options for care providers and patients.

This systematic review aimed to evaluate the evidence on the clinical effectiveness of different types of THR used in the treatment of pain and disability in people with end stage arthritis of the hip.

Materials and Methods

This systematic review forms part of independent research commissioned by the National Institute for Health Research (project number 11/118); the full protocol and guidance is accessible from: http://www.nice.org.uk.

Search strategy

Searches were undertaken in December 2012 and were date-limited from 2008. Electronic searches were conducted in MEDLINE, MEDLINE In-Process, Embase, Science Citation Index, Cochrane Library (Cochrane Database of Systematic Reviews and Cochrane Central Register of Controlled Trials), Current Controlled Trials, ClinicalTrials.gov, Database of Abstracts of Reviews of Effectiveness (DARE), and HTA databases. Reference lists and websites of hip implant manufacturers and major orthopaedic organisations were screened for relevant publications. Details of MEDLINE and Embase searches are presented in Appendix supporting information File S1. Searches were adapted for other databases.

Study eligibility criteria

Full text English-language reports of RCTs and systematic reviews comparing different types of primary THR were eligible for inclusion. The population included patients with end stage hip arthritis for whom non-surgical management has failed. The THR types were compared on the composition/material, design, bearing surface, fixation method, and size of components (acetabular cup, femoral stem, and femoral head). Non-RCTs, cohort studies, economic evaluations, editorials, letters, and conference abstracts were excluded. Studies focusing on indications other than end stage arthritis of the hip, on revision surgery, on hip resurfacing or those comparing different THR operative approaches (e.g., mini-incision vs. standard-incision) were also excluded.

We further limited our inclusion to studies with a sample size of 100 participants or more. This was done to minimize evidence with inconclusive (i.e., uninformative) results, such as statistically non-significant effect estimates with wide 95% confidence intervals. Based on our calculations, a sample size of 100 was the minimum for a study to have 90% power (two-tailed test, significance level of 0.05) to detect a mean difference of at least 10 points on the Harris Hip score (with a standard deviation of 15 based on external sources) [5]–[6].
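The sample size threshold above can be approximately reproduced with a standard formula. The following is a minimal sketch, assuming a two-arm parallel trial with equal allocation and a normal approximation; the review does not state which formula was used, and the function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Participants needed in each arm to detect a mean difference `delta`.

    Assumption: two equal arms, common standard deviation `sd`,
    normal approximation (no t-distribution correction).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


per_arm = n_per_group(delta=10, sd=15)  # Harris Hip score inputs from the text
total = 2 * per_arm
print(per_arm, total)  # 48 96
```

Under these assumptions the total is 96 participants, which is in line with the threshold of 100 used for inclusion.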

Outcomes of interest

Primary outcome measures were measures of hip function and symptoms (Harris Hip; [7] Oxford Hip; [8] Western Ontario and McMaster University Osteoarthritis Index [WOMAC] [9]), all-cause mortality; risk of revision (or implant survival rate); and femoral head penetration rate. Secondary outcomes included other validated clinical/functional measures (McMaster-Toronto Arthritis patient Preference Disability Questionnaire [MACTAR] [10], Merle D'Aubigne Postel [11], University of California Los Angeles [UCLA] activity score [12], health-related quality of life [HRQOL] measures), and peri/post-procedural complications (i.e., implant dislocation, infection, osteolysis, aseptic loosening, femoral fracture, and deep vein thrombosis).

Study selection and data extraction

Two independent reviewers screened all bibliographic records by title/abstract and then by full text. Reasons for exclusion of full text papers were documented in the study flow diagram [13]. The same reviewers independently extracted relevant data, which were then cross-checked. Disagreements were resolved by discussion and, where needed, with a third reviewer. The extracted data included study, participant, intervention/comparator (types of THR, basis of comparison, operator skill), and outcome characteristics. If data permitted, we attempted to calculate missing statistical parameters (e.g., risk ratios, mean differences, and 95% confidence intervals). For individual studies with zero events in one or both treatment arms, risk ratios and 95% confidence intervals (95% CIs) were not estimated. The 95% CIs and standard errors were used to derive standard deviations or vice versa. All calculated parameters were entered into the data extraction sheets.
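The derivations described above can be sketched as follows. This is an illustration using common textbook formulas (a standard deviation recovered from the 95% CI of a single group mean, and a risk ratio with a log-scale Wald interval); the review does not specify its exact formulas, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def sd_from_ci(lower, upper, n):
    """Standard deviation of a group mean, given its 95% CI and group size n."""
    se = (upper - lower) / (2 * Z95)
    return se * math.sqrt(n)


def risk_ratio(events_a, n_a, events_b, n_b):
    """Risk ratio (arm A vs. arm B) with a 95% CI computed on the log scale.

    Returns None when either arm has zero events, mirroring the rule in the
    text that such estimates were not calculated.
    """
    if events_a == 0 or events_b == 0:
        return None
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - Z95 * se_log)
    upper = math.exp(math.log(rr) + Z95 * se_log)
    return rr, lower, upper
```

For example, `risk_ratio(10, 100, 20, 100)` gives a point estimate of 0.5, while `risk_ratio(0, 50, 3, 50)` returns `None`.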

Assessment of risk of bias (ROB) and methodological quality

Two reviewers independently assessed ROB of RCTs and methodological quality of systematic reviews using the Cochrane Collaboration ROB tool [14] and the AMSTAR tool [15], respectively.

The Cochrane ROB tool [14] addresses threats to several internal validity domains (selection, performance, detection, attrition, reporting, and other pre-specified bias). The ROB for performance, detection, and attrition bias was assessed for a priori defined groups of objective and subjective outcomes separately and was classified as high, low, or unclear. Afterwards, for each RCT, within-study summary ROB rating was derived for subjective and objective outcomes. At data synthesis stage (evidence grading), the across-study average summary ROB was determined and assigned to each outcome of interest.

The AMSTAR tool [15] covers domains of research question, inclusion/exclusion criteria, search strategy, data extraction, ROB assessment, heterogeneity, and publication bias. For convenience of presentation, the quality of each SR was rated according to the number of items satisfied: high (range: 9–11), medium (range: 5–8), and low (range: 0–4).
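The rating bands above amount to a simple lookup. A minimal sketch, with a hypothetical function name, assuming the score is the count of satisfied AMSTAR items (0 to 11):

```python
def amstar_quality(items_satisfied):
    """Map the number of satisfied AMSTAR items (0-11) to the review's bands."""
    if not 0 <= items_satisfied <= 11:
        raise ValueError("AMSTAR item count must be between 0 and 11")
    if items_satisfied >= 9:
        return "high"
    if items_satisfied >= 5:
        return "medium"
    return "low"
```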

Meta-analysis

The decision to pool study results was based on degree of similarity in the methodological and clinical characteristics of studies under consideration. Estimates of post-treatment mean difference (MD) for continuous outcomes and risk ratios (RR) for binary outcomes (except for rare events) were pooled using a random-effects model [16]. Dichotomous outcomes with low event rates (5.0%–10.0%) were pooled as RR using Mantel-Haenszel (MH) fixed-effect models. Dichotomous outcomes for studies with very low event rates (≤5.0%) or zero events in one of the treatment arms were pooled as odds ratio (OR) using Peto fixed-effect model [17]. The heterogeneity was assessed through inspection of forest plots, Cochran's Q and I² statistics, and was judged according to pre-determined levels of statistical significance (Chi-square p<0.10 and/or I²>50%).
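The pooling rules and heterogeneity statistics described above can be sketched as follows. This is an illustration only: the random-effects estimator shown is DerSimonian-Laird, which is an assumption (the text says only "a random-effects model"), an event rate of exactly 5.0% is assigned to the Peto branch because the stated bands overlap at that value, and the function names are hypothetical.

```python
import math


def choose_pooling_model(event_rate, zero_events_in_an_arm=False):
    """Map a binary outcome's event rate onto the pooling approach in the text."""
    if zero_events_in_an_arm or event_rate <= 0.05:
        return "Peto fixed-effect OR"
    if event_rate <= 0.10:
        return "Mantel-Haenszel fixed-effect RR"
    return "random-effects RR"


def random_effects_pool(estimates, variances):
    """DerSimonian-Laird pooling; returns (pooled, se, Q, I2 in percent).

    `estimates` are study effects on an additive scale (e.g. MD or log RR)
    and `variances` are their within-study variances.
    """
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, q, i2
```

The I² value can be compared directly with the 50% threshold; the chi-square p-value for Q is omitted here because it needs a chi-square distribution function that the Python standard library does not provide.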

Other analyses

We planned to examine publication bias by visual inspection of funnel plot asymmetry and by regression tests [18]. We planned to explore clinical and methodological sources of statistical heterogeneity through a priori defined subgroup and sensitivity analyses (age, gender, activity levels, duration of follow-up, risk of bias items).

Grading overall quality of clinical effectiveness evidence

The overall quality of evidence for each gradable outcome was assessed using the system developed by the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) Working Group (http://www.gradeworkinggroup.org). This approach [19] indicates levels of confidence in the observed treatment effect(s) and categorizes the evidence for each outcome into high, moderate, low, or very low grade based on the summary ROB across studies, consistency (heterogeneity), directness (applicability), precision, and publication/reporting bias. Gradable outcomes were Harris Hip score, WOMAC score, revision, mortality, femoral head penetration, and implant dislocation.

Evidence synthesis and interpretation

Comparison and synthesis of results for each outcome of interest were summarised and categorised as conclusive (either ‘there is difference’ or ‘there is no difference’) or inconclusive (indeterminate results due to statistical uncertainty, statistical heterogeneity/inconsistency in treatment effects, and/or incomplete information). This conclusion was based on statistical significance of the observed difference, magnitude of the effect estimate, width of the 95% CIs, whether the 95% CI included a minimal clinically important difference (MCID) for a given outcome, and consistency in terms of effect direction and statistical significance. We ascertained the MCIDs for clinical/functional measures such as Harris Hip score (MCID range: 7–10), Oxford Hip score (MCID range: 5–7), WOMAC score (MCID: 8), and EQ-5D (MCID: 0.074) from previous empirical research evidence [6], [20]–[21].
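One part of this judgement, the comparison of a 95% CI against the MCID, can be illustrated with a deliberately simplified rule. This sketch is an assumption and not the reviewers' actual procedure, which also weighed effect magnitude, consistency, and completeness of reporting; the function name and the labels are hypothetical.

```python
def classify_mean_difference(ci_lower, ci_upper, mcid):
    """Simplified reading of a 95% CI for a mean difference against an MCID."""
    if -mcid < ci_lower and ci_upper < mcid:
        # The whole interval is smaller in magnitude than the MCID.
        return "no clinically important difference"
    if ci_lower > 0 or ci_upper < 0:
        # The interval excludes zero and is not confined within the MCID.
        return "difference"
    # The interval spans zero and reaches a clinically important effect.
    return "inconclusive"


# Pooled Harris Hip MD for XLPE vs. conventional PE liners reported in the
# Results (95% CI: -0.88, 5.45), against the lower Harris Hip MCID bound of 7.
print(classify_mean_difference(-0.88, 5.45, mcid=7))
```

With those inputs the interval lies entirely within ±7 points, so the sketch returns "no clinically important difference", which agrees with the review's reading of that comparison.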

Results

Our searches identified 1,523 unique records, of which 27 were included in this review [22], [23] (includes information from a study with multiple publications [66]; see Table S1 in File S1), [24]–[48]. Four RCTs were represented by multiple publications, and the review cites them as Bjorgul 2010 [22], Engh 2012 [26] (includes information from a study with multiple publications [69]; see Table S1 in File S1), Capello 2008 [28] (includes information from a study with multiple publications [70]; see Table S1 in File S1), and Corten 2011 [32].

Thus, the review included 14 RCTs [22], [24]–[26], [28], [32], [36]–[43] and five systematic reviews [44]–[48]. The study flow diagram is given in Figure 1 and Checklist S1. Full details of the systematic reviews are given in Table S20 in File S1 (the reviews include information from studies with multiple publications [80], [81]).

RCTs

Study characteristics.

Included RCTs compared the clinical effectiveness of different types of THR based on the composition [40], design [28], [41], bearing surface [25]–[26], [28], [37]–[39], [43], fixation method [22], [24], [32], [42], and size [36] of implant components (Table 1) [The studies contain information from multiple publications [28]– (See Table S1 in File S1)]. RCTs were conducted in the USA, the UK, Australia, Norway, Serbia, South Korea, and Canada. Full details of the RCTs are given in Table S1 in File S1 [23], [24], [26], [28], [29], [30], [31], [33], [43], [67]–[79].

Table 1. Randomized controlled trials according to basis of hip implant comparison.

https://doi.org/10.1371/journal.pone.0099804.t001

Maximum length of follow-up was 20 years [32], [42]. The mean age in individual studies ranged from 45 [42] to 72 years [25], [36] and the proportion of women ranged from 24% [42] to 75% [43]. The mean follow-up period of included studies was 8.4 years, with a range of 1 [74] to 20 [33], [71]–[73], [79] years. Participant baseline characteristics are given in Table S1 in File S1.

Risk of bias.

Overall, five (36%) and eight (57%) RCTs reported an adequate method for random sequence generation and treatment allocation concealment respectively (low ROB). RCTs had lower risks of performance and detection bias for objective (e.g., mortality, dislocation) vs. subjective (e.g., functional scores) outcomes (92%–100% vs. 15%–23%). Most RCTs failed to report the blinding status of patients, study personnel, and/or outcome assessors. Attrition bias was judged at low risk for at least eight RCTs (57%). Five RCTs (36%) were at high risk of selective reporting of outcome. Risk of other bias (e.g., funding source, baseline imbalance, inappropriate analysis) was rated as high for about one third of the RCTs. See the ROB assessment for the included RCTs (File S1 and Table S2 and Figure S1 in File S1).

Synthesis of evidence on clinical effectiveness.

Outcome-specific results are provided in Appendix Tables (File S1 and Tables S3–S18 in File S1).

To render outcome reporting bias and consistency criteria applicable for grading, only THR comparison categories which included at least two studies (cup fixation: cemented vs. cementless; cup liner surface: cross-linked polyethylene [XLPE] vs. [non-XLPE]) were selected. The overall quality grade for gradable outcomes was very low/low (for WOMAC, revision, mortality), moderate (for Harris Hip score, femoral head penetration), and high (for implant dislocation). See the results for graded outcomes (File S1 and Table S19 in File S1).

Summary results are provided in Table 2. Across seven studies, the mean post-THR Harris Hip score measured at different follow-ups (6 months to 10 years) did not differ between the THR groups of cup fixation (cemented vs. cementless; moderate grade) [22], [24], cup liner surface (XLPE vs. traditional polyethylene [PE]; moderate grade; pooled MD = 2.29, 95% CI: −0.88, 5.45) [Figure 2] [25]–[26], cup and stem fixation (cemented vs. cementless) [32], and femoral head-on-cup articulation (metal/oxinium-on-XLPE vs. metal/oxinium-on-PE [39]; ceramic-on-ceramic vs. metal-on-XLPE [43]). Similarly, there were no differences in WOMAC and Short Form (SF)-12 scores between the THR groups of XLPE vs. traditional PE cup liners (very low grade) [25], or in MACTAR and Merle D'Aubigne Postel scores between the THR groups of cup and femoral stem fixation (cemented vs. cementless) [32].

Figure 2. Mean post Harris hip score measured at follow up.

https://doi.org/10.1371/journal.pone.0099804.g002

Table 2. Summary of evidence regarding the differences between the compared types of THR for each reported outcome (randomized controlled trials).

https://doi.org/10.1371/journal.pone.0099804.t002

There was a reduced risk of implant dislocation with use of cemented cup (vs. cementless cup; high grade; pooled OR = 0.34, 95% CI: 0.13, 0.89) (Figure 3) [22], [24] or larger femoral head size (36 mm vs. 28 mm) [36]. In three other RCTs, patients who received THR with XLPE cup liners experienced reduced femoral head penetration rate (moderate grade evidence) [25]–[26], [39] and risk of revision (risk ratio: 0.18, 95% CI: 0.04, 0.78; very low grade evidence) [26] compared to recipients of conventional PE cup liners. The recipients of ceramic-on-ceramic articulations (vs. metal-on-PE) had a reduced risk of osteolysis [28]. Although, in one trial, the use of cementless fixation of cup and femoral stem (vs. cemented) was associated with better implant survival rate [32], other trials showed no apparent impact of cup [22], [24] or femoral stem [42] fixation (cemented vs. cementless) and cup shell design (porous-coated vs. arc-deposited HA-coated) [28] on implant survival rates.

Figure 3. Implant dislocation of cemented cup vs. cementless cup.

https://doi.org/10.1371/journal.pone.0099804.g003

Evidence on revision [24], [28], [32]–[38], [40]–[43], the UCLA score [42], mortality (very low-to-low grade; pooled RR = 1.39, 95% CI: 0.78, 2.49) [Figure 4] [22], [25]–[26], [32], [36], [41], aseptic loosening [24], [26], [32], [37], [40], femoral fracture [26], [28], [40], infection [24], [37]–[38], [40], [43], and deep vein thrombosis [38], [43] was inconclusive. Also, the evidence for all outcomes reported in four studies was rendered inconclusive (very low grade evidence) [37]–[38], . Results were considered inconclusive due to partial reporting (missing data to allow for effect estimates, confidence intervals, standard errors, standard deviations, p-values), great uncertainty (wide confidence intervals), zero event counts, and/or inconsistency in estimates (Table 2).

Systematic reviews

Five systematic reviews evaluated the effectiveness of THRs (see Table S20 in File S1) according to cup fixation methods (cemented vs. cementless) [44]–[46] and implant articulations [47]–[48] on post-operative functional scores (Harris Hip score, Oxford Hip score) [44]–[45], [47], risk of revision, and implant survival rate [45]–[46]. Searches in the systematic reviews were undertaken between July 2007 [48] and June 2011 [46].

The methodological quality of the five systematic reviews is presented in File S1 (and Table S21 in File S1). Two systematic reviews [44], [47] were of high quality (AMSTAR score range: 9–10) and two systematic reviews [45], [48] were of medium quality (AMSTAR score range: 5–7). The remaining systematic review [46] was of low quality (AMSTAR score: 4) because of inappropriate analysis, absence of duplicate study selection, limited literature search, failure to address publication bias, and lack of information on conflict of interest.

The outcome-specific and summary evidence results for the systematic reviews [44]–[48] are provided in File S1 (and Tables S22–S29 in File S1) and Table 3, respectively. Most evidence was rendered inconclusive due to unreported pooled results across RCTs (i.e., only narrative synthesis), inappropriate pooling methods (e.g., indirect naïve comparison of single group cohorts; pooling of studies of different design) [45]–, or inconsistent summary findings [47]. One review indicated no difference in the risk of revision between zirconium-on-polyethylene vs. non zirconium-on-polyethylene articulations [48].

Table 3. Summary of evidence regarding the differences between the compared types of THR for each reported outcome (Systematic Reviews).

https://doi.org/10.1371/journal.pone.0099804.t003

Publication bias and heterogeneity

The extent of publication bias could not be explored due to insufficient numbers of data points in the forest/funnel plots. The data from RCTs were too sparse and heterogeneous (in terms of different types of THRs) to allow for the exploration of whether study-level methodological or patient-related characteristics influenced treatment effects. None of the included RCTs reported within-study subgroup treatment effects.

Discussion

A large proportion of the evidence summarised in this review was inconclusive due to poor reporting, missing data, inconsistent results, and/or great uncertainty in the treatment effect estimates. The majority of studies suggested significantly improved post-surgery scores for functional and clinical measures (Harris Hip, Oxford Hip, WOMAC, MACTAR, Merle D'Aubigne Postel, and SF-12) in participants regardless of the type of THR they received. Most evidence indicated no difference for these measures between different types of THR. There was a reduced risk of implant dislocation for participants receiving THR with a larger femoral head size (vs. smaller head size) or with cemented cup (vs. cementless; high grade evidence). Moreover, the evidence suggested reduced femoral head penetration rate and risk of implant revision for participants who received cross-linked polyethylene vs. conventional polyethylene cup liner bearings. Participants with ceramic-on-ceramic articulations (vs. metal-on-polyethylene) experienced reduced risk of osteolysis.

The limitations of the evidence warrant cautious interpretation of the findings. Great uncertainty in treatment effect estimates and incomplete reporting rendered some of the evidence inconclusive. The evidence on complications was scarce. It is unclear whether this is due to the absence or rarity of these events or simply due to under-reporting. In light of poor reporting, it was not possible to explore contextual factors which might have influenced study results. For example, the lack of blinding of participants and study personnel may have led to systematic differences in care giving or co-interventions across implant groups which would independently influence outcome measures. None of the studies reported the experience levels and skills of study personnel and care givers. Any imbalance between study treatment groups in these factors may have influenced participants' prognosis independently of treatment. Systematic differences in the maturity of any given implant technology may have additionally influenced the observed treatment effects [49]–[53]. The paucity of data hindered the exploration of variation in treatment effect across subgroups of patients or methodological features of RCTs. Apart from limitations of the evidence itself, we limited the scope of this review to evidence published in English in 2008 or later. However, note that systematic reviews would provide the summary evidence for individual studies published before 2008. We limited our focus to studies with a sample size of 100 or more participants. Since this limitation was not dependent on statistical significance (i.e., smaller studies were excluded regardless of statistical significance of their effect estimates), the effect of selection bias is less likely. Moreover, it has been empirically shown that inclusion of smaller studies may bias the observed treatment benefit upwards due to a phenomenon called the ‘small study effect’ [54]–[57].

The poor reporting reduces the applicability of the findings to routine clinical practice in the UK. Generally, most studies were conducted in the Western world and reported patient-oriented as well as other important outcomes (e.g., revision, survival, mortality, complications) representative of those measured in clinical practice. The proportion of patients with primary osteoarthritis across the majority of studies was 60% or greater.

Auto alerts of searches set up to capture relevant articles published after the dates of the searches identified three new relevant systematic reviews which compared the effectiveness of THR using different articulations (metal-on-metal vs. metal-on-polyethylene) [58], implant fixation methods (cemented vs. cementless) [59], or femoral stem coating materials (hydroxyapatite-coated vs. non-hydroxyapatite-coated) [60]. Outcomes measured were risk of revision, Harris Hip score, mortality, and complications. In agreement with our findings, pooled estimates for post-surgery Harris Hip scores reported in all three systematic reviews showed no difference between THR groups. Pooled estimates for revision (6 RCTs; RR = 1.44, 95% CI: 0.88, 2.36), mortality (5 RCTs; RR = 1.06, 95% CI: 0.73, 1.52), and complications (4 RCTs; RR = 1.54, 95% CI: 0.21, 11.03) between THR groups with cemented vs. cementless fixation methods were statistically non-significant in one systematic review with wide 95% CIs (due to low event counts and small sample size of trials) compatible with a moderate-to-large effect size in either direction, rendering these findings inconclusive [59]. The pooled result from another systematic review [58] showed a greater risk of complications in the metal-on-metal vs. metal-on-polyethylene articulation group (3 RCTs; OR = 3.37, 95% CI: 1.57, 7.26).

Future large and long-term pragmatic RCTs are needed to replicate the findings of this review before more definitive conclusions are made. Study authors are encouraged to specify the minimal clinically important difference and power calculation for their primary outcome(s). This information would help to interpret the study findings in both clinical and statistical terms. To improve the quality of reporting, authors are encouraged to conform to the recommendations outlined in the CONSORT (Consolidated Standards of Reporting Trials) Statement [61] and its extension for RCTs evaluating non-pharmacologic interventions [62]. The recent CONSORT extension on patient-reported outcomes (PROs) would help to further improve the reporting quality of patient-reported functional and health quality outcome measures [63]. Use of the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) [13] statement for reporting systematic reviews and meta-analyses and the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) [63] statement for reporting observational studies is also encouraged. Adequate reporting would facilitate more reliable assessment of evidence to inform health care decision makers, physicians, and patients regarding the selection of the most appropriate implants for particular patient groups.

In the absence of definitive findings from RCTs on the clinical effectiveness of different types of THR, patients and surgeons should probably consider observational data presented in the large National Registry reports; these are updated annually (e.g. UK NJR, Australian Registry, Swedish Registry), and hold data on important outcomes, notably revision rates, for tens to hundreds of thousands of patients who have received a variety of THR prostheses over one or more decades. Issa and Mont 2013 [64] point to the potential limitations of such large registries including: unequal distribution of measures that are included in the database, missing data for some patients, duplicated or unreported cases, delays in reporting, misclassification of outcomes, and difficulty in establishing causality. However, in the absence of high quality randomised study reports as here, judicious consideration of Registry analyses may provide a better guide than inconclusive results from small RCTs of short duration. Nevertheless, well-designed clinical trials with appropriate power and follow-up are clearly preferred.

Supporting Information

File S1.

Figure S1, Tables S1–S29. Figure S1. Risk of bias graph for randomized controlled trials: review author's judgments about each risk of bias item. Table S1. Study and participant characteristics (randomized controlled trials). Table S2. Risk of bias summary table for randomized controlled trials: review author's judgments about each risk of bias item. Table S3. Harris Hip score (range: 0–100) (NB: Tables S3–S18 give results for specific outcomes reported in randomized controlled trials). Table S4. The Western Ontario and McMaster University Osteoarthritis Index (range: 0–100). Table S5. The McMaster-Toronto Arthritis Patient Preference Disability Questionnaire score (range: 0–30). Table S6. Merle D'Aubigne and Postel score (range: 0–18). Table S7. The University of California, Los Angeles activity scale (range: 1–10). Table S8. Short Form Health Survey (SF-12; range: 0–100). Table S9. Risk of revision (n/N). Table S10. Risk of mortality (n/N). Table S11. Femoral head penetration rate (mm/year). Table S12. Implant survival rate (%). Table S13. Risk of implant dislocation (n/N). Table S14. Risk of osteolysis (n/N). Table S15. Risk of aseptic loosening (n/N). Table S16. Risk of femoral fracture (n/N). Table S17. Risk of infection (n/N). Table S18. Risk of deep vein thrombosis (n/N). Table S19. GRADE evidence profile for gradable outcomes reported in randomized controlled trials (adapted from Guyatt et al., 2011 [19]). Table S20. Characteristics of included systematic reviews. Table S21. Methodological quality of systematic reviews (AMSTAR items). Table S22. Harris Hip score (range: 0–100) (NB: Tables S22–S29 give results for each outcome reported in systematic reviews). Table S23. Oxford Hip score (range: 0–48). Table S24. Short Form Health Survey (SF-12; range: 0–100). Table S25. Risk of revision (n/N). Table S26. Implant survival rate (%). Table S27. Risk of implant dislocation (n/N). Table S28. Risk of osteolysis (n/N). Table S29. Risk of aseptic loosening (n/N).

https://doi.org/10.1371/journal.pone.0099804.s002

(DOCX)

Acknowledgments

The authors would like to acknowledge the clinical advice and expert contribution of Professor Matthew Costa, Professor of Trauma and Orthopaedic Surgery at Warwick Clinical Trials Unit.

Author Contributions

Analyzed the data: AT PS AG KF MC. Wrote the paper: AT AG AC. Commented on final manuscript: AG AT KF RC SJ MC AC PS. Performed the information searches and obtained the data: SJ RC. Scoping and refining the research questions and design: AG AT KF RC SJ MC AC PS.

References

  1. Smith MA, Smith WT (2012) The American Joint Replacement Registry. Orthopaedic Nursing 31: 296–299.
  2. National Institute for Health and Clinical Excellence. Guidance on the selection of prostheses for primary total hip replacement: NICE technology appraisal guidance 2 (2000) National Institute for Health and Care Excellence: 1–22 Available: http://www.nice.org.uk/TA2. Accessed 18 April 2013.
  3. Pivec R, Johnson K, Mears S, Mont MA (2012) Hip Arthroplasty. The Lancet 380: 1768–1777.
  4. National Joint Registry for England, Wales and Northern Ireland: 10th Annual Report (2013) National Joint Registry. Available: http://www.njrcentre.org.uk/njrcentre/Portals/0/Documents/England/Reports/10th_annual_report/NJR%2010th%20Annual%20Report%202013.pdf. Accessed 8 Oct 2013.
  5. Kalairajah Y, Azurza K, Hulme C, Molloy S, Drabu KJ (2005) Health outcome measures in the evaluation of total hip arthroplasties-a comparison between the Harris hip score and the Oxford hip score. Journal of Arthroplasty 20: 1037–1041.
  6. Achten J, Parsons NR, Edlin RP, Griffin DR, Costa ML (2010) A randomised controlled trial of total hip arthroplasty versus resurfacing arthroplasty in the treatment of young patients with arthritis of the hip joint. BMC Musculoskeletal Disorders 11: 8.
  7. Dawson J, Fitzpatrick R, Carr A, Murray D (1996) Questionnaire on the perceptions of patients about total hip replacement. J Bone Joint Surg Br 78: 185–90.
  8. Harris WH (1969) Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg Am 51: 737–55.
  9. Bellamy N, Wilson C, Hendrikz J, Whitehouse SL, Patel B, et al. (2010) WOMAC NRS 3.1 Osteoarthritis index delivered by mobile phone (m-WOMAC) is valid, reliable and responsive. Internal Medicine Journal 40: 6.
  10. Verhoeven AC, Boers M, van der Linden S (2000) Validity of the MACTAR questionnaire as a functional index in a rheumatoid arthritis clinical trial. The McMaster Toronto Arthritis. Journal of Rheumatology 27: 2801–2809.
  11. Merle D'Aubigné R, Postel M (1954) Functional results of hip arthroplasty with acrylic prosthesis. Journal of Bone & Joint Surgery - American Volume 36: 451–475.
  12. Zahiri CA, Schmalzried TP, Szuszczewicz ES, Amstutz HC (1998) Assessing activity in joint replacement patients. Journal of Arthroplasty 13: 890–895.
  13. Moher D, Liberati A, Tetzlaff J, Altman DG (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 339: b2535.
  14. Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, et al. (2011) The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ 343: d5928.
  15. Shea BJ, Grimshaw JM, Wells GA, Boers M, Andersson N, et al. (2007) Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol 15: 10.
  16. DerSimonian R, Laird N (1986) Meta-analysis in clinical trials. Controlled Clinical Trials 7: 177–188.
  17. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (2011) The Cochrane Collaboration. Available: http://handbook.cochrane.org/. Accessed 24 April 2013.
  18. Egger M, Davey Smith G, Schneider M, Minder C (1997) Bias in meta-analysis detected by a simple, graphical test. BMJ 315: 629–634.
  19. Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, et al. (2011) GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings. J Clin Epidemiol 64: 383–394.
  20. Tubach F, Ravaud P, Baron G, Falissard B, Logeart I, et al. (2005) Evaluation of clinically relevant changes in patient reported outcomes in knee and hip osteoarthritis: the minimal clinically important improvement. Annals of the Rheumatic Diseases 64: 29–33.
  21. Walters SJ, Brazier JE (2005) Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D. Quality of Life Research 14: 1523–1532.
  22. Bjorgul K, Novicoff WM, Andersen ST, Brevig K, Thu F, et al. (2010) No differences in outcomes between cemented and uncemented acetabular components after 12–14 years: results from a randomized controlled trial comparing Duraloc with Charnley cups. Journal of Orthopaedics & Traumatology 11: 37–45.
  23. Bjorgul K, Novicoff WM, Andersen ST, Brevig K, Thu F, et al. (2010) The Charnley stem: Clinical, radiological and survival data after 11–14 years. Orthopaedics and Traumatology: Surgery and Research 96 (2) 97–103.
  24. Angadi DS, Brown S, Crawfurd EJ (2012) Cemented polyethylene and cementless porous-coated acetabular components have similar outcomes at a mean of seven years after total hip replacement: A prospective randomised study. J Bone Joint Surg Br 94: 1604–1610.
  25. McCalden RW, MacDonald SJ, Rorabeck CH, Bourne RB, Chess DG, et al. (2009) Wear rate of highly cross-linked polyethylene in total hip arthroplasty. A randomized controlled trial. Journal of Bone & Joint Surgery - American Volume 91: 773–782.
  26. Engh CA, Hopper RH, Huynh C, Ho H, Sritulanondha S, et al. (2012) A prospective, randomized study of cross-linked and non-cross-linked polyethylene for total hip arthroplasty at 10-year follow-up. Journal of Arthroplasty 27: 2–7.
  27. Engh CA, Stepniewski AS, Ginn SD, Beykirch SE, Sychterz-Terefenko CJ, et al. (2006) A randomized prospective evaluation of outcomes after total hip arthroplasty using cross-linked marathon and non-cross-linked Enduron polyethylene liners. J Arthroplasty 21: 17–25.
  28. Capello WN, D'Antonio JA, Feinberg JR, Manley MT, Naughton M (2008) Ceramic-on-ceramic total hip arthroplasty: update. Journal of Arthroplasty 23: 39–43.
  29. D'Antonio J, Capello W, Manley M, Naughton M, Sutton K (2005) Alumina ceramic bearings for total hip arthroplasty: five-year results of a prospective randomized study. Clinical Orthopaedics & Related Research 436: 164–171.
  30. D'Antonio J, Capello W, Manley M (2003) Alumina ceramic bearings for total hip arthroplasty. Orthopedics 26: 39–46.
  31. Mesko JW, D'Antonio JA, Capello WN, Bierbaum BE, Naughton M (2011) Ceramic-on-Ceramic Hip Outcome at a 5- to 10-Year Interval. Journal of Arthroplasty 26: 172–177.
  32. Corten K, Bourne RB, Charron KD, Au K, Rorabeck CH (2011) Comparison of total hip arthroplasty performed with and without cement: a randomized trial. A concise follow-up, at twenty years, of previous reports. Journal of Bone & Joint Surgery - American Volume 93: 1335–1338.
  33. Laupacis A, Bourne R, Rorabeck C, Feeny D, Tugwell P, et al. (2002) Comparison of total hip arthroplasty performed with and without cement: a randomized trial. Journal of Bone & Joint Surgery - American Volume 84: 1823–1828.
  34. Bourne RB, Corten K (2010) Cemented versus cementless stems: a verdict is in. Orthopedics 33: 638.
  35. Corten K, Bourne RB, Charron KD, Au K, Rorabeck CH (2011) What works best, a cemented or cementless primary total hip arthroplasty?: minimum 17-year followup of a randomized controlled trial. Clinical Orthopaedics & Related Research 469: 209–217.
  36. Howie DW, Holubowycz OT, Middleton R, Large Articulation Study Group (2012) Large femoral heads decrease the incidence of dislocation after total hip arthroplasty: a randomized controlled trial. Journal of Bone & Joint Surgery - American Volume 94: 1095–1102.
  37. Lewis PM, Moore CA, Olsen M, Schemitsch EH, Waddell JP (2008) Comparison of mid-term clinical outcomes after primary total hip arthroplasty with Oxinium vs cobalt chrome femoral heads. Orthopedics 31: 371–383.
  38. Amanatullah DF, Landa J, Strauss EJ, Garino JP, Kim SH, et al. (2011) Comparison of surgical outcomes and implant wear between ceramic-ceramic and ceramic-polyethylene articulations in total hip arthroplasty. Journal of Arthroplasty 26: 72–77.
  39. Kadar T, Hallan G, Aamodt A, Indrekvam K, Badawy M, et al. (2011) Wear and migration of highly cross-linked and conventional cemented polyethylene cups with cobalt chrome or Oxinium femoral heads: A randomized radiostereometric study of 150 patients. Journal of Orthopaedic Research 29: 1222–1229.
  40. Healy WL, Tilzey JF, Iorio R, Specht LM, Sharma S (2009) Prospective, randomized comparison of cobalt-chrome and titanium trilock femoral stems. Journal of Arthroplasty 24: 831–836.
  41. Kim YH, Choi Y, Kim JS (2011) Comparison of bone mineral density changes around short, metaphyseal-fitting, and conventional cementless anatomical femoral components. Journal of Arthroplasty 26: 931–940.
  42. Kim YH, Kim JS, Park JW, Joo JH (2011) Comparison of total hip replacement with and without cement in patients younger than 50 years of age: the results at 18 years. Journal of Bone & Joint Surgery - British Volume 93: 449–455.
  43. Bascarevic Z, Vukasinovic Z, Slavkovic N, Dulic B, Trajkovic G, et al. (2010) Alumina-on-alumina ceramic versus metal-on-highly cross-linked polyethylene bearings in total hip arthroplasty: a comparative study. International Orthopaedics 34: 1129–1135.
  44. Voigt JD, Mosier MC (2012) Cemented all-polyethylene acetabular implants vs other forms of acetabular fixation: a systematic review and meta-analysis of randomized controlled trials. Journal of Arthroplasty 27: 1544–1553.
  45. Pakvis D, van Hellemondt G, de Visser E, Jacobs W, Spruit M (2011) Is there evidence for a superior method of socket fixation in hip arthroplasty? A systematic review. International Orthopaedics 35: 1109–1118.
  46. Clement ND, Biant LC, Breusch SJ (2012) Total hip arthroplasty: to cement or not to cement the acetabular socket? A critical review of the literature. Archives of Orthopaedic and Trauma Surgery 132: 411–427.
  47. Sedrakyan A, Normand S-L, Dabic S, Jacobs S, Graves S, et al. (2011) Comparative assessment of implantable hip devices with different bearing surfaces: Systematic appraisal of evidence. BMJ 343: d7434.
  48. Yoshitomi H, Shikata S, Ito H, Nakayama T, Nakamura T (2009) Manufacturers affect clinical results of THA with zirconia heads: A systematic review. Clinical Orthopaedics and Related Research 467: 2349–2355.
  49. van der Linden W (1980) Pitfalls in randomized surgical trials. Surgery 87: 258–262.
  50. McCulloch P, Taylor I, Sasako M, Lovett B, Griffin D (2002) Randomised trials in surgery: problems and possible solutions. BMJ 324: 1448–1451.
  51. McLeod RS (1999) Issues in surgical randomized controlled trials. World Journal of Surgery 23: 1210–1214.
  52. Boutron I, Tubach F, Giraudeau B, Ravaud P (2003) Methodological differences in clinical trials evaluating nonpharmacological and pharmacological treatments of hip and knee osteoarthritis. JAMA 290: 1062–1070.
  53. Ergina PL, Cook JA, Blazeby JM, Boutron I, Clavien PA, et al. (2009) Challenges in evaluating surgical innovation. Lancet 374: 1097–1104.
  54. Turner RM, Bird SM, Higgins JPT (2013) The Impact of Study Size on Meta-analyses: Examination of Underpowered Studies in Cochrane Reviews. PLoS ONE 8: e59202.
  55. Stanley TD, Jarrell SB, Doucouliagos H (2010) Could It Be Better to Discard 90% of the Data? A Statistical Paradox. The American Statistician 64: 70–77.
  56. Kraemer HC, Gardner C, Brooks JO, Yesavage JA (1998) Advantages of excluding underpowered studies in meta-analysis: Inclusionist versus exclusionist viewpoints. Psychological Methods 3: 23–31.
  57. Nuesch E, Trelle S, Reichenbach S, Rutjes AW, Tschannen B, et al. (2010) Small study effects in meta-analyses of osteoarthritis trials: meta-epidemiological study. BMJ 341: c3515.
  58. Voleti PB, Baldwin KD, Lee GC (2012) Metal-on-Metal vs Conventional Total Hip Arthroplasty: A Systematic Review and Meta-Analysis of Randomized Controlled Trials. Journal of Arthroplasty 27: 1844–1849.
  59. Abdulkarim A, Ellanti P, Motterlini N, Fahey T, O'Byrne JM (2013) Cemented versus uncemented fixation in total hip replacement: a systematic review and meta-analysis of randomized controlled trials. Orthopedic Reviews 5: e8.
  60. Li S, Huang B, Chen Y, Gao H, Fan Q, et al. (2013) Hydroxyapatite-coated femoral stems in primary total hip arthroplasty: A meta-analysis of randomized controlled trials. International Journal Of Surgery 11: 477–482.
  61. Schulz KF, Altman DG, Moher D, CONSORT Group (2010) CONSORT 2010 statement: updated guidelines for reporting parallel group randomized trials. Annals of Internal Medicine 152: 726–732.
  62. Boutron I, Moher D, Altman DG, Schulz KF, Ravaud P, CONSORT Group (2008) Methods and processes of the CONSORT Group: example of an extension for trials assessing nonpharmacologic treatments. Annals of Internal Medicine 148: 60–66.
  63. Calvert M, Blazeby J, Altman DG, Revicki DA, Moher D, et al. (2013) Reporting of patient-reported outcomes in randomized trials: the CONSORT PRO extension. JAMA 309: 814–822.
  64. Issa K, Mont MA (2013) Total hip replacement: mortality and risks. The Lancet 382: 1074–1076.
  65. Lie SA, Engesaeter LB, Havelin LI, Furnes O, Vollset SE (2002) Early postoperative mortality after 67,548 total hip replacements - Causes of death and thromboprophylaxis in 68 hospitals in Norway from 1987 to 1999. Acta Orthopaedica Scandinavica 73: 392–399.
  66. Pakvis D, van Hellemondt G, de Visser E, Jacobs W, Spruit M (2011) Is there evidence for a superior method of socket fixation in hip arthroplasty? A systematic review. International Orthopaedics 35 (8) 1109–18.
  67. Bjorgul K, Novicoff WM, Andersen ST, Brevig K, Thu F, et al. (2010) No differences in outcomes between cemented and uncemented acetabular components after 12–14 years: results from a randomized controlled trial comparing Duraloc with Charnley cups. Journal of Orthopaedics & Traumatology Mar 11 (1) 37–45.
  68. McCalden RW, MacDonald SJ, Rorabeck CH, Bourne RB, Chess DG, et al. (2009) Wear rate of highly cross-linked polyethylene in total hip arthroplasty. A randomized controlled trial. Journal of Bone & Joint Surgery - American Volume Apr 91 (4) 773–82.
  69. Engh CA Jr, Stepniewski AS, Ginn SD, Beykirch SE, Sychterz-Terefenko CJ, et al. (2006) A randomized prospective evaluation of outcomes after total hip arthroplasty using cross-linked marathon and non-cross-linked Enduron polyethylene liners. J Arthroplasty Sep 21 (6 Suppl 2) 17–25.
  70. D'Antonio J, Capello W, Manley M, Naughton M, Sutton K (2005) Alumina ceramic bearings for total hip arthroplasty: five-year results of a prospective randomized study. Clinical Orthopaedics & Related Research Jul (436) 164–71.
  71. Corten K, Bourne RB, Charron KD, Au K, Rorabeck CH (2011) Comparison of total hip arthroplasty performed with and without cement: a randomized trial. A concise follow-up, at twenty years, of previous reports. Journal of Bone & Joint Surgery - American Volume Jul 20 93 (14) 1335–8.
  72. Bourne RB, Corten K (2010) Cemented versus cementless stems: a verdict is in. Orthopedics Sep 33 (9) 638.
  73. Corten K, Bourne RB, Charron KD, Au K, Rorabeck CH (2011) What works best, a cemented or cementless primary total hip arthroplasty?: minimum 17-year followup of a randomized controlled trial. Clinical Orthopaedics & Related Research Jan 469 (1) 209–17.
  74. Howie DW, Holubowycz OT, Middleton R, Large Articulation Study Group (2012) Large femoral heads decrease the incidence of dislocation after total hip arthroplasty: a randomized controlled trial. Journal of Bone & Joint Surgery - American Volume Jun 20 94 (12) 1095–102.
  75. Lewis PM, Moore CA, Olsen M, Schemitsch EH, Waddell JP (2008) Comparison of mid-term clinical outcomes after primary total hip arthroplasty with Oxinium vs cobalt chrome femoral heads. Orthopedics Dec 31 (12 Suppl 2) 371–83.
  76. Amanatullah DF, Landa J, Strauss EJ, Garino JP, Kim SH, et al. (2011) Comparison of surgical outcomes and implant wear between ceramic-ceramic and ceramic-polyethylene articulations in total hip arthroplasty. Journal of Arthroplasty Sep 26 (6 Suppl) 72–7.
  77. Healy WL, Tilzey JF, Iorio R, Specht LM, Sharma S (2009) Prospective, randomized comparison of cobalt-chrome and titanium trilock femoral stems. Journal of Arthroplasty Sep 24 (6) 831–6.
  78. Kim YH, Choi Y, Kim JS (2011) Comparison of bone mineral density changes around short, metaphyseal-fitting, and conventional cementless anatomical femoral components. Journal of Arthroplasty Sep 26 (6) 931–40.
  79. Kim YH, Kim JS, Park JW, Joo JH (2011) Comparison of total hip replacement with and without cement in patients younger than 50 years of age: the results at 18 years. Journal of Bone & Joint Surgery - British Volume Apr 93 (4) 449–55.
  80. Voigt JD, Mosier MC (2012) Cemented all-polyethylene acetabular implants vs other forms of acetabular fixation: a systematic review and meta-analysis of randomized controlled trials. Journal of Arthroplasty Sep 27 (8) 1544–53.
  81. Sedrakyan A, Normand S-L, Dabic S, Jacobs S, Graves S, et al. (2011) Comparative assessment of implantable hip devices with different bearing surfaces: Systematic appraisal of evidence. BMJ 343: d7434.