## Abstract

The ever-growing volume of data routinely collected and stored in everyday life presents researchers with a number of opportunities to gain insight and make predictions. This study aimed to demonstrate the usefulness in a specific clinical context of a simulation-based technique called probabilistic sensitivity analysis (PSA) in interpreting the results of a discrete time survival model based on a large dataset of routinely collected dairy herd management data. Data from 12,515 dairy cows (from 39 herds) were used to construct a multilevel discrete time survival model in which the outcome was the probability of a cow becoming pregnant during a given two day period of risk, and presence or absence of a recorded lameness event during various time frames relative to the risk period amongst the potential explanatory variables. A separate simulation model was then constructed to evaluate the wider clinical implications of the model results (i.e. the potential for a herd’s incidence rate of lameness to influence its overall reproductive performance) using PSA. Although the discrete time survival analysis revealed some relatively large associations between lameness events and risk of pregnancy (for example, occurrence of a lameness case within 14 days of a risk period was associated with a 25% reduction in the risk of the cow becoming pregnant during that risk period), PSA revealed that, when viewed in the context of a realistic clinical situation, a herd’s lameness incidence rate is highly unlikely to influence its overall reproductive performance to a meaningful extent in the vast majority of situations. Construction of a simulation model within a PSA framework proved to be a very useful additional step to aid contextualisation of the results from a discrete time survival model, especially where the research is designed to guide on-farm management decisions at population (i.e. herd) rather than individual level.

**Citation: **Hudson CD, Huxley JN, Green MJ (2014) Using Simulation to Interpret a Discrete Time Survival Model in a Complex Biological System: Fertility and Lameness in Dairy Cows. PLoS ONE 9(8):
e103426.
doi:10.1371/journal.pone.0103426

**Editor: **Dawit Tesfaye, University of Bonn, Germany

**Received:** January 10, 2014; **Accepted:** July 2, 2014; **Published:** August 7, 2014

**Copyright:** © 2014 Hudson et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Funding: **This work was part of an internally (University of Nottingham) funded Ph.D. studentship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

### Introduction

The ever-growing volume of data routinely collected and stored in everyday life presents researchers with a number of opportunities to gain insight and make predictions. Routine collection of data with potential research application is now very widespread, and is facilitating research using much larger sample sizes than in the past. One example of this is in agriculture, where widespread adoption of computerised recording systems has largely been driven by the need to manage larger enterprises and maximise efficiency, but is also creating an invaluable resource for researchers. A wide variety of traditional and new techniques have been applied to analysis of the large, retrospective datasets generated in this way, but in some cases more sophisticated and robust analytical techniques can yield results which are harder for the end user of the research to interpret and understand.

This study focuses on the relationship between a time-to-event outcome (in this case, the time between parturition and subsequent conception in a dairy cow) and a disease event (in this case lameness). Techniques for analysis of such data have evolved over the years, and this specific field has seen publications evaluating this relationship in a univariate way [1] using Kaplan-Meier survival analysis, and in a multivariate framework, using various modifications of the Cox proportional hazards model [2]–[4]. However, accounting appropriately for time-dependent variables (for example, accounting for the possibility that a case of lameness may affect probability of conception within a specific frame of time around the case) using such approaches can be challenging, and model assumptions can be difficult to satisfy and are not always tested [5].

Another approach is discrete time survival analysis [6], [7], where the dataset is amplified into smaller units of time for each individual animal and logistic regression is used to predict the probability of the outcome of interest at each time-point. This method is substantially more flexible, and more easily incorporates statistical advances such as multilevel regression using random effects to account for hierarchical clustering within data [7], [8] (for example, of cows within herds), and Markov chain Monte Carlo sampling for parameter estimation within a Bayesian framework [9]. However, results from this type of analysis can be difficult to interpret, especially at the population level. For example, such analysis may yield an estimated odds ratio for the association between a lameness event and the probability of conception occurring during a given period of time, but there is no intuitive way to interpret the likely importance of this at the population level. In this context, on-farm interpretation is very important, because decision makers (e.g. a dairy herd’s manager or veterinary clinician) need to be able to estimate the potential improvement in a herd’s reproductive performance that could result from a reduction in lameness in order to conduct a cost benefit analysis for intervention. Simulation based approaches can be used to help address this issue, allowing the researcher to evaluate relationships between inputs and outputs of a given system across plausible scenarios. One such technique is known as probabilistic sensitivity analysis (PSA), and is commonly employed in health economic evaluations [10], [11] to explore which inputs to a complex model have the most capacity to perturb the output. Such simulation models can also be used as the basis for decision support tools, an application common in the financial sector [12].

Good reproductive performance is essential for efficiency in dairy production [13], which in turn is increasingly important in the context of a global increase in demand and downward pressure on resource use [14]. A wide range of factors are known to affect dairy cow fertility, including incidence of clinical disease. Lameness is one of the most common endemic diseases in the modern dairy herd, with reported prevalence in the UK at over 35% [15], and has previously been associated with depressed reproductive performance in affected cows compared to unaffected controls [2], [4], [16], [17]. However, a very high proportion of previous studies have been carried out using either a single herd or a small number of herds, and those deriving data from wider populations have often failed to detect an association [18], [19], as did the most recent study in UK dairy cows [1]. Alongside this, a very wide variety of other factors are known to affect cow fertility. Therefore the clinician wishing to improve a herd’s reproductive performance needs to interpret this research evidence in the context of the other influences on fertility when deciding how much weight should be given to control of lameness to improve reproduction.

In this study, the association between clinical lameness events and reproductive performance was evaluated using routinely collected management data from a group of dairy herds. The aim of the study was to explore the usefulness of simulation-based techniques as an aid to interpret the clinical significance of a discrete time survival model evaluating association between disease events and reproductive performance at herd level.

### Materials and Methods

#### Data Collection and Restructuring

Routinely recorded farm management data were collected from 39 dairy herds across England and Wales. These were the subset of herds described in an earlier study [20] which demonstrated consistent recording of clinical lameness events (i.e. treatment of lame cows). Data collection, quality auditing and study inclusion criteria are described in detail by Hudson et al. [20]. Herds were not excluded on the basis of breed: 38 were mainly Holstein or Holstein–Friesian and one predominantly Guernsey. Detail regarding each event (for example, which limb was affected and the diagnosis made) was not evaluated in this study: all clinical lameness events were treated as equal. Where two lameness events were recorded for the same cow within 7 days, the second was removed (since both treatment records would have been likely to reflect the same disease event). Table 1 shows descriptive information for these herds.

**Table 1. Summary statistics of basic herd information for 39 dairy herds with good fertility and lameness records.**

Data were restructured into a format where each unit (line) of data was a two-day period during each lactation between 20 and 220 days after parturition (days in milk, DIM) where the cow was “at risk” of becoming pregnant (lactations were censored after culling, death, sale or conception occurred). For each of these two-day risk periods, a binary variable was used to represent whether the cow became pregnant during the risk period. Clinical lameness records were used to determine whether a case of lameness was recorded at a variety of different time-frames relative to each risk period (see Table 2). Additional variables at both lactation level (e.g. parity of cow, lactation 305-day adjusted milk yield) and risk period level (e.g. DIM at beginning of risk period, month and year of risk period) were calculated for each risk period (Table 2). Where necessary, categorical variables were recoded to avoid categories containing small numbers of risk periods/lactations (e.g. animals of parity 5 or above were grouped as a single category). This generated a dataset consisting of 1,247,677 risk periods from 21,913 lactations in 12,515 cows from 39 herds. Initial data collation and restructuring was carried out using Microsoft Access 2010 (Microsoft Corp.), with further restructuring and variable calculation carried out using R v2.14 [21].

**Table 2. Potential explanatory variables calculated for each risk period in a study investigating the association between lameness and fertility in 39 dairy herds.**

#### Discrete-time survival analysis

A multilevel discrete-time survival model [22] was constructed to evaluate the association between the probability of a cow becoming pregnant during a two-day risk period (the outcome) and the potential explanatory variables described in Table 2. A three-level hierarchical structure (with risk periods nested within cows nested within herds) was used to account for correlations between risk periods from the same cow and cows from the same herd.

The model took the standard form:

(3)

where t represents a two-day risk period and i and j the i^{th} cow in the j^{th} herd; µ_{tij} the fitted probability of Preg_{tij} (the outcome of the i^{th} cow in the j^{th} herd becoming pregnant during risk period t); lnDIM_{tij} the natural logarithm of DIM at the beginning of risk period t; α the regression intercept; β_{1} and β_{2} the coefficients for the terms representing days in milk; **X**_{tij} the vector of risk period level covariates and **β**_{3} the corresponding vector of coefficients; **X**_{ij} the vector of cow-level covariates and **β**_{4} the corresponding vector of coefficients; u_{ij} the random effect to reflect variation between individual cows (e.g. due to genetic variation) and v_{j} the random effect representing variation between herds (e.g. due to nutritional management or environmental conditions of the herd), with and the variances of the normal distributions of the respective random effects terms.

Model building and final parameter estimation was carried out using MLwiN v2.20 [23]. Model building and selection used the approach described in Hudson et al. [20], with Markov chain Monte Carlo (MCMC) sampling used for final parameter estimation [9] and retention in the model of variables where the 95% area of highest posterior density (HPD) for the variable’s coefficient did not cover zero. Biologically plausible first order interaction terms were tested, and retained in the model only if their inclusion made a substantial difference to parameter estimates for coefficients of the main effects. Inclusion of herd-level random effects (slope variation) for the lameness-related model terms was also tested, to account for the possibility that the association between lameness and reproductive performance could vary between herds. These were again retained in the model only if they altered parameter estimates for main effects by more than 1%, or if between-herd variation was large relative to mean effect size (such that the variance of the herd-level random effect for the variable was more than 20% of the mean/overall effect).

Model sensitivity analysis revealed that the parameters of interest were not sensitive to choices made during data restructuring and model building (e.g. choice of risk period duration, choice of function to represent DIM or selection of timeframes for lameness events). Simulation-based posterior predictions were used to evaluate model fit as described in Hudson et al. [20], by subsetting the data in a variety of ways, using the model to predict probability of pregnancy for each risk period in the subset and checking that the observed proportion of risk periods where pregnancy occurred lay within the 95% coverage interval of the predicted risk. Model results were illustrated as relative risks using a similar prediction-based approach [20]. Posterior predictions were carried out in R v2.14, using MCMC chains exported from MLwiN.

#### Probabilistic sensitivity analysis

In order to explore the relationship between herd reproductive performance and the incidence rate of lameness at herd level, a simulation model was developed. The aim of this part of the study was to evaluate the results of the discrete time survival analysis in a wider context to assess its potential usefulness to inform clinical on-farm management decisions.

##### Simulation model structure and process.

The outline structure of the simulation model is shown in Figure 1. The model was constructed in Microsoft Excel 2010 (Microsoft Corp.), using Visual Basic for Applications (Microsoft Corp.) for process control. The explanatory variables in the final discrete-time survival model became input parameters for the herd-level simulation model, which was used to simulate 50,000 herds of 200 lactations each. Simulating a herd first involved drawing the herd-level input parameters (e.g. the herd’s mean 305-day adjusted milk yield and incidence rate of clinical lameness) from the distributions shown in Table 3.

**Figure 1. Structure of the simulation model used for probabilistic sensitivity analysis.**

Solid black lines indicate process flow, dotted lines indicate that information from the source of the line is used in the step of the process to which the line leads (denoted by a diamond).

doi:10.1371/journal.pone.0103426.g001**Table 3. Input parameters used for each level of simulation in a study investigating the association between lameness and fertility.**

Simulation of the first cow-lactation in the herd was then commenced by drawing the lactation-level inputs (e.g. the parity of the cow) from the relevant distributions and simulating a clinical lameness history for the lactation. The latter was accomplished by using the distribution of DIM of all clinical lameness events from the original dataset (Figure 2) to assign a probability that a lameness event would occur at each two-day risk period through the lactation in a herd with a given overall lameness incidence rate. The discrete-time survival model described in the previous section was used to calculate the predicted probability of pregnancy occurring during each two-day risk period given the input parameters for that herd, lactation and risk period. This probability was adjusted to account for the herd’s overall (“background”) level of submission rate and pregnancy rate (i.e. the variation in these parameters not explained by lameness, milk yield or other model inputs). These are measures of specific aspects of a dairy herd’s reproductive performance, submission rate being the proportion of eligible cows inseminated every 21 days (the normal length of the oestrous cycle) and pregnancy rate being the proportion of inseminations leading to a pregnancy. Both of these “background” herd fertility characteristics were represented as herd-level input parameters with a separate value for each simulated herd drawn from the relevant distribution.

**Figure 2. Distribution of lameness cases observed by days in milk.**

A binary outcome to represent whether or not the cow became pregnant during the risk period was drawn from a binomial distribution based on this calculated probability. Repeated risk periods were simulated for each cow, until she either became pregnant or reached 300 DIM (a point at which farmers would commonly elect to remove cows from the herd if not pregnant), at which time simulation of the next lactation was begun. When 200 lactations had been simulated, the herd was considered complete. The mean number of DIM at pregnancy and the proportion of lactations ending without a pregnancy in each herd were stored along with the herd-level input parameters before beginning simulation of the next herd.

##### Simulation model inputs.

Uniform input distributions were specified for all herd-level inputs, so that every potential combination of herd-level inputs was equally likely to be selected at each iteration of the simulation. The ranges for these distributions were selected based on the authors’ clinical experience, such that they would be expected to cover the vast majority of realistic possibilities in UK dairy herds (Table 3). This was not considered to represent the true joint distributions of these parameters across herds: the objective was not to speculate on which situations might occur most commonly, but to evaluate the potential impact of all different lameness incidence rates across as wide a variety of herd scenarios as possible. Some of the lactation-level inputs were drawn from non-uniform distributions so that the architecture of each simulated herd was realistic (so, for example, the milk yield for a lactation was drawn from a beta distribution parameterised such that a cow was likely to draw a lactation yield close to the herd average, and there was a smaller chance of drawing a yield much further from the average), as described in Table 3.

##### Simulation model outputs and analysis.

A single herd-level outcome was devised to represent reproductive performance for each simulated herd (to allow evaluation of associations between this and the various input parameters). The mean number of DIM at pregnancy and the proportion of cows reaching 300 DIM without conceiving were combined using a modification of the method of Esslemont et al. [24] to produce a “modified FERTEX” (mFX) score. This involved comparing each value to a pre-set target (set at 60 days for the herd’s mean DIM at pregnancy and zero for proportion of cows in the herd reaching 300 DIM without conceiving), and applying a unit cost to the difference from target for each. The sum of these two costs on a per-cow basis for each simulated herd gave that herd’s mFX score. Since the cost of a culled cow and an additional empty day are widely acknowledged to vary from herd to herd, these were considered as herd-level inputs, and each drawn randomly for each herd from the distributions described in Table 3. The mFX score for each simulated herd was therefore a cost-based single measure of overall fertility performance (so that higher performing herds had lower mFX scores and vice versa).

Results from the simulations were analysed initially by illustrating associations between herd-level input parameters and mFX scores graphically using high-density scatterplots. Spearman rank correlation coefficients were calculated for the association between each herd-level input and mFX score (a non-parametric measure of correlation was selected as the mFX scores were positively skewed). Multiple regression (with the natural logarithm of mFX score as the outcome) was used to partition variance in mFX score between the various herd-level inputs. The resulting regression model was also used to predict the effect on mFX score of increasing each individual input in turn from the middle of its input distribution to the upper quartile so that results could be displayed graphically as a tornado plot (a standard approach for presentation of PSA results).

### Results

There were a total of 16,706 pregnancies from the 1,247,677 risk periods in the dataset, so that 1.34% of risk periods resulted in a pregnancy (corresponding to around 14% of cows becoming pregnant during each 21 day oestrous cycle). Of the 22,319 lactations in the dataset, 4,360 involved at least one case of lameness (corresponding to a lactational first case incidence rate of 19.5%).

#### Discrete-time survival analysis

Table 4 shows the parameter estimates for the regression model derived to predict the probability of pregnancy resulting during a two-day risk period. The predictor variables not directly associated with lameness showed very similar associations to those seen by Hudson et al. [20], with probability of pregnancy peaking at around 110 DIM, decreasing with increasing 305-day adjusted milk yield and lower predicted probabilities of pregnancy for cows in higher parities and during the months April to September. Clinical lameness events during four different time frames relative to the two-day risk period showed associations with the probability of pregnancy during the risk period. The largest association was seen when a lameness event was recorded within 14 days of the risk period, when the odds of pregnancy were reduced by almost 25% (odds ratio [OR] 0.76, area of 95% highest posterior density [HPD] 0.69–0.84). Lameness events recorded 43 to 70 days before, 43 to 70 days after and 71 to 100 days after a risk period were all associated with a reduction in the odds of pregnancy during the risk period of around 15% (ORs 0.85, 0.88 and 0.86 respectively; areas of 95% HPD 0.76–0.95, 0.80–0.98 and 0.79–0.95 respectively). These associations are represented as posterior predicted relative risks in Figure 3. Predicted risks were also used to demonstrate that model fit was good. For each subset of data tested, the observed proportion of risk periods where pregnancy occurred fell within the 95% area of HPD of predicted risk for that subset (Figure 4).

**Figure 3. Association between predicted relative risk of pregnancy at a given risk period and clinical lameness.**

Error bars represent the 95% credible interval for each predicted relative risk.

doi:10.1371/journal.pone.0103426.g003**Figure 4. Predicted and observed risk of pregnancy across various categories.**

Predicted absolute risk of pregnancy (black bars) at risk periods in various categories (x-axis) compared to the observed proportion of risk periods in that category where a pregnancy occurred (white bars). Error bars represent the 95% credible interval for each predicted relative risk.

doi:10.1371/journal.pone.0103426.g004**Table 4. Parameter estimates for discrete time survival model with pregnancy during a two-day risk period as the outcome, in a study investigating the association between lameness and fertility in 39 dairy herds.**

#### Probabilistic sensitivity analysis

##### Univariate analysis.

Univariate analysis of PSA results is presented using high-density scatterplots in Figure 5. These show that a herd’s “background” level of submission and pregnancy rate were the individual inputs with the strongest influence on overall herd fertility performance, with both being moderately strongly correlated with herd mFX score (Spearman rank correlation coefficient −0.65 for submission rate and −0.59 for pregnancy rate). The herd incidence rate of clinical lameness had no clear relationship with mFX score, with a Spearman rank correlation coefficient of 0.028 and the scatterplot showing a square appearance with no clear trend in the area of highest point density.

**Figure 5. Associations between simulation inputs and overall herd-level reproductive performance.**

High density scatterplots showing the association between each simulated herd’s reproductive performance (represented by modified FERTEX score, mFX, y-axis) and selected simulation input variables. Darker colours indicate areas of higher point density, IRCL: Incidence rate of clinical lameness.

doi:10.1371/journal.pone.0103426.g005##### Multivariate analysis.

Analysis of the simulation results in a multivariate framework allows visualisation of results from the discrete time survival model in a clinical context. Table 5 shows that the herd’s “background” level of submission and pregnancy rate explained the vast majority of the variation in herd mFX score, with 75% of overall variance explained by these two input parameters. It is important to remember that these inputs represent the marginal effect of between-herd variation in these aspects of fertility performance after the other model inputs have been accounted for (so, for example, a herd’s “background” pregnancy rate would reflect its insemination success rate after accounting for any effects of milk yield, age structure and level of lameness).

**Table 5. Multiple regression derived partition of variance in modified FERTEX score across simulation input variables in a study evaluating associations between lameness and fertility in dairy herds.**

Figure 6 shows the predicted change in herd mFX score which would result from a herd increasing each input parameter in turn from the middle of the range of the input distribution by 25% of the total range while the other inputs remain at the population median. For example, the top line on the plot shows that an increase in submission rate from the median value of the range of distribution for this input (45%) to the value representing the lower boundary of the upper quartile of the range (62.5%) would be expected to result in a decrease in mFX score (i.e. an improvement in overall reproductive performance) of around £100/cow/year. Increasing the herd’s incidence rate of lameness cases from 80 to 115 cases/100 cow-years would be expected to increase herd mFX score by just over £5/cow/year. Therefore, a reduction in lameness incidence of 35 cases/100 cow-years (which would represent a large improvement, and may require substantial financial and time investment from the farmer) would be expected to lead to the same degree of improvement in fertility as an increase in submission rate of less than 1% (a small change, which would be expected to require substantially less investment).

**Figure 6. Predicted effect of an equivalent increase in each input parameter on overall reproductive performance.**

Tornado plot showing the predicted effect of increasing each input parameter in turn by a value representing 25% of the range of its input distribution from the median value, while the other input parameters are held at their population medians. The input parameters are listed on the right hand side of the graph, and the change in each input (from median to upper quartile) is given in parentheses. For example, the top bar shows that the predicted effect of moving from a submission rate of 45% (the median of the input distribution for this parameter) to 62.5% (the upper quartile of the input distribution) would be a decrease of just over £100/cow/year in the herd’s modified FERTEX (mFX) score.

doi:10.1371/journal.pone.0103426.g006### Discussion

This study showed that relatively large associations between clinical lameness events and reproductive performance could be demonstrated at the level of a risk period within lactation (e.g. occurrence of a lameness case within 14 days of a risk period was associated with a 25% reduction in the risk of the cow becoming pregnant during the risk period, Figure 3). However, PSA revealed that a herd’s incidence rate of lameness was highly unlikely to make a significant contribution to its overall level of reproductive performance when other factors affecting fertility were also taken into account. As the simulation model was constructed to represent a herd with even all-year-round calving, it is possible that the results will be less applicable to block calving herds, where cows may have a limited timeframe in which to conceive. It is plausible that clinical lameness events may have an increased importance in the latter situation, as even a modest reduction in risk of pregnancy during the breeding period could increase the risk of a cow being culled.

There is substantial variation in the conclusions of existing work evaluating the association between lameness and reproductive performance. A variety of previous studies have found associations between decreased fertility and either clinical lameness events [2], [3], [25] and/or identification of lameness through visual gait assessment [26], [27]. In contrast, other studies have failed to reveal such an association [1], [18], [19]: notably there seems to be a tendency for studies involving larger numbers of herds to fail to identify significant associations. Many of the pre-existing papers in this area describe studies involving less than five herds (and most use a single herd); the notable exceptions to this are Loeffler et al. [18] (43 herds) and Sogstad et al. [19] (112 herds), neither of which found significant associations between lameness events and reproductive outcomes. It is biologically plausible that any effect of lameness on reproductive performance will vary between herds (for example, due to the variation in the predominant causes of lameness in each herd and variation in the effectiveness of management of lame cows). The current study used data from 39 herds, but from a much larger number of cows compared to previous work. The possibility of between-herd variability in the association between lameness events and fertility was explored here using herd-level random effects terms for the explanatory variables related to lameness. This revealed relatively little between-herd variability in effect within this group of herds. It is possible that somewhat different results would have been derived from building the statistical model using data from a different group of herds, but since the simulation model showed such a small potential for lameness to influence herd fertility, the difference in the statistical model results would have had to be extremely large in order to change the interpretation of the simulation results to any meaningful degree.

Some of the variation in previous published results may also be related to the way in which reproductive outcomes were measured: this study revealed significant associations between lameness events and the probability of pregnancy over a specific window of time relative to the lameness case, but when results were used to evaluate this within a PSA framework it transpired that lameness incidence rate was unlikely to influence overall herd reproductive performance. This means that previous studies focussing on particular categories or timings of lameness event and/or reproductive outcome may have been more likely to generate significant findings than those using broader categories or timeframes.

This study illustrates the usefulness of simulation-based techniques (such as PSA) to aid interpretation and contextualisation of model results. The approach we describe provides a potential route for researchers to facilitate better understanding of the results of their work and how they should be interpreted in a clinical context. This in turn can enhance research impact, and accelerate change in clinical practice. Although this example describes application of PSA to help interpret the results of a discrete time survival analysis, the technique would be equally applicable to other types of complex model, and to other analyses based on logistic regression. In logistic regression, the model coefficients themselves can be difficult to interpret. Classically the coefficients are exponentiated to produce odds ratios (as shown in Table 4), but odds ratios themselves can be misleading because humans intuitively tend to think in terms of risk or probability rather than odds (and these can be quite different, especially where the risk is close to 0.5). This topic has been extensively explored in the medical literature [28]–[30], where results of such analyses must be interpreted by clinicians, some of whom may have a limited understanding of statistical methods. It is possible to convert an odds ratio to a relative risk for more intuitive interpretation (as shown in Figure 3), but where decisions are to be made at population level these can also be difficult to interpret. For example, in this case the relative risks would have been hard to interpret without a method to incorporate the likely range of herd-level lameness incidence rates and the distribution of lameness events through lactation. Here, the results from the discrete time survival model alone (along with some of the pre-existing literature) may have encouraged clinicians to place too much emphasis on control of lameness to improve herd-level reproductive performance.

This study highlights the usefulness of simulation-based techniques such as PSA as an extension of statistical modelling to help illustrate model results in an intuitive way within a clinical veterinary context. In this example, while there are associations between lameness events and reproductive performance at specific time-points, it is unlikely that a herd’s incidence rate of lameness will have a substantial impact on herd fertility. This does not mean that lameness control is not important: lameness has significant impacts on both animal welfare and productivity [31]. Rather, our analysis suggests that herd lameness control is unlikely to lead to a significant improvement in overall reproductive performance in the majority of situations.

### Acknowledgments

The authors acknowledge the assistance of the farmers and their veterinary surgeons who provided the data on which this work was based.

### Author Contributions

Conceived and designed the experiments: CDH MJG JNH. Performed the experiments: CDH. Analyzed the data: CDH. Wrote the paper: CDH JNH MJG.

### References

- 1. Peake KA, Biggs AM, Argo CM, Smith RF, Christley RM, et al. (2011) Effects of lameness, subclinical mastitis and loss of body condition on the reproductive performance of dairy cows. Vet Rec 168: 301. doi: 10.1136/vr.c6180
- 2. Alawneh JI, Laven RA, Stevenson MA (2011) The effect of lameness on the fertility of dairy cattle in a seasonally breeding pasture-based system. J Dairy Sci 94: 5487–5493. doi: 10.3168/jds.2011-4395
- 3. Hernandez J, Shearer JK, Webb DW (2001) Effect of lameness on the calving-to-conception interval in dairy cows. J Am Vet Med Assoc 218: 1611–1614. doi: 10.2460/javma.2001.218.1611
- 4. Machado VS, Caixeta LS, McArt JAA, Bicalho RC (2010) The effect of claw horn disruption lesions and body condition score at dry-off on survivability, reproductive performance, and milk production in the subsequent lactation. J Dairy Sci 93: 4071–4078. doi: 10.3168/jds.2010-3177
- 5. Bellera CA, MacGrogan G, Debled M, de Lara CT, Brouste V, et al. (2010) Variables with time-varying effects and the Cox model: some statistical concepts illustrated with a prognostic factor study in breast cancer. BMC Med Res Methodol 10: 20. doi: 10.1186/1471-2288-10-20
- 6. Singer JD, Willett JB (1993) It’s about time: Using discrete-time survival analysis to study duration and the timing of events. J Educ Behav Stat 18: 155. doi: 10.3102/10769986018002155
- 7. Steele F (2003) A discrete-time multilevel mixture model for event history data with long-term survivors, with an application to an analysis of contraceptive sterilization in Bangladesh. Lifetime Data Anal 9: 155–174.
- 8.
Rasbash J, Steele F, Browne WJ, Goldstein H (2009) A User’s Guide to MLwiN, v2.10. Centre for Multilevel Modelling, University of Bristol.
- 9.
Browne WJ (2009) MCMC Estimation in MLwiN v2.20. Centre for Multilevel Modelling, University of Bristol.
- 10. Andronis L, Barton P, Bryan S (2009) Sensitivity analysis in economic evaluation: an audit of NICE current practice and a review of its use and value in decision-making. Health Technol Assess 13: 1–45.
- 11. Anderson K, Jacobson JS, Heitjan DF, Zivin JG, Hershman D, et al. (2006) Cost-effectiveness of preventive strategies for women with a BRCA1 or a BRCA2 mutation. Ann Intern Med 144: 397–406. doi: 10.7326/0003-4819-144-6-200603210-00006
- 12. Evans GE, Jones B (2009) The application of Monte Carlo simulation in finance, economics and operations management. 2009 WRI World Congress on Computer Science and Information Engineering, CSIE 2009. Vol. 4: 379–383. doi: 10.1109/csie.2009.703
- 13. LeBlanc S (2007) Economics of improving reproductive performance in dairy herds. Adv Dairy Technol 19: 201–214.
- 14.
Steinfeld H, Gerber P, Wassenaar T, Castel V, Rosales M, et al.. (2006) Livestock’s long shadow. Available: http://www.fao.org/docrep/010/a0701e/a0701e00.HTM. Accessed 25 April 2013.
- 15. Barker ZE, Leach KA, Whay HR, Bell NJ, Main DCJ (2010) Assessment of lameness prevalence and associated risk factors in dairy herds in England and Wales. J Dairy Sci 93: 932–941. doi: 10.3168/jds.2009-2309
- 16. Garbarino EJ, Hernandez JA, Shearer JK, Risco CA, Thatcher WW (2004) Effect of lameness on ovarian activity in postpartum Holstein cows. J Dairy Sci 87: 4123–4131. doi: 10.3168/jds.s0022-0302(04)73555-9
- 17. Melendez P, Bartolome J, Archbald LF, Donovan A (2003) The association between lameness, ovarian cysts and fertility in lactating dairy cows. Theriogenology 59: 927–937. doi: 10.1016/s0093-691x(02)01152-4
- 18. Loeffler SH, de Vries MJ, Schukken YH (1999) The effects of time of disease occurrence, milk yield, and body condition on fertility in dairy cows. J Dairy Sci 82: 2589–2604. doi: 10.3168/jds.s0022-0302(99)75514-1
- 19. Sogstad AM, Østeraas O, Fjeldaas T (2006) Bovine claw and limb disorders related to reproductive performance and production diseases. J Dairy Sci 89: 2519–2528. doi: 10.3168/jds.s0022-0302(06)72327-x
- 20. Hudson CD, Bradley AJ, Breen JE, Green MJ (2012) Associations between udder health and reproductive performance in United Kingdom dairy cows. J Dairy Sci 95: 3683–3697. doi: 10.3168/jds.2011-4629
- 21.
R Core Development Team (2010) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available: http://www.R-project.org.
- 22.
Yang M, Goldstein H (2003) Modelling survival data in MLwiN 1.20. Centre for Multilevel Modelling, University of Bristol.
- 23.
Rasbash J, Charlton C, Browne WJ, Healy M, Cameron B (2010) MLwiN Version 2.2. UK: Centre for Multilevel Modelling, University of Bristol.
- 24.
Esslemont RJ, Kossaibati M (2002) DAISY Research Report No. 5: The costs of poor fertility and disease in UK dairy herds - Trends in DAISY herds over 10 seasons. UK: University of Reading.
- 25. Bahonar AR, Azizzadeh M, Stevenson MA, Vojgani M, Mahmoudi M (2009) Factors affecting days open in Holstein dairy cattle in Khorasan Razavi province, Iran; A cox proportional hazard model. JAVA 8: 747–754.
- 26. Hernandez JA, Garbarino EJ, Shearer JK, Risco CA, Thatcher WW (2005) Comparison of the calving-to-conception interval in dairy cows with different degrees of lameness during the prebreeding postpartum period. J Am Vet Med Assoc 227: 1284–1291. doi: 10.2460/javma.2005.227.1284
- 27. Bicalho RC, Vokey F, Erb HN, Guard CL (2007) Visual locomotion scoring in the first seventy days in milk: Impact on pregnancy and survival. J Dairy Sci 90: 4586–4591. doi: 10.3168/jds.2007-0297
- 28. Zhang J, Yu KF (1998) What’s the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA 280: 1690–1691. doi: 10.1001/jama.280.19.1690
- 29. Bland JM, Altman DG (2000) The odds ratio. BMJ 320: 1468. doi: 10.1136/bmj.320.7247.1468
- 30. Davies HTO, Crombie IK, Tavakoli M (1998) When can odds ratios mislead? BMJ 316: 989–991. doi: 10.1136/bmj.316.7136.989
- 31. Huxley JN (2013) Impact of lameness and claw lesions in cows on health and production. Livest Sci 156: 64–70. doi: 10.1016/j.livsci.2013.06.012