Research Article

Estimating Summer Nutrient Concentrations in Northeastern Lakes from SPARROW Load Predictions and Modeled Lake Depth and Volume

  • W. Bryan Milstead,

    Affiliation: United States Environmental Protection Agency, Office of Research and Development, National Health and Environmental Effects Research Laboratory, Atlantic Ecology Division, Narragansett, Rhode Island, United States of America

  • Jeffrey W. Hollister,

    Affiliation: United States Environmental Protection Agency, Office of Research and Development, National Health and Environmental Effects Research Laboratory, Atlantic Ecology Division, Narragansett, Rhode Island, United States of America

  • Richard B. Moore,

    Affiliation: United States Geological Survey, Pembroke, New Hampshire, United States of America

  • Henry A. Walker

    Affiliation: United States Environmental Protection Agency, Office of Research and Development, National Health and Environmental Effects Research Laboratory, Atlantic Ecology Division, Narragansett, Rhode Island, United States of America

  • Published: November 19, 2013
  • DOI: 10.1371/journal.pone.0081457


Global nutrient cycles have been altered by the use of fossil fuels and fertilizers, resulting in increased nutrient loads to aquatic systems. In the United States, excess nutrients have been repeatedly reported as the primary cause of lake water quality impairments. Setting nutrient criteria that are protective of a lake's ecological condition is one common solution; however, the data required to do this are not always readily available. A useful alternative is to combine available field data (i.e., the United States Environmental Protection Agency (USEPA) National Lake Assessment (NLA)) with average annual nutrient load models (i.e., the USGS SPARROW model) to estimate summer concentrations across a large number of lakes. In this paper we use this combined approach and compare the observed total nitrogen (TN) and total phosphorus (TP) concentrations in Northeastern lakes from the 2007 National Lake Assessment to those predicted by the Northeast SPARROW model. We successfully adjusted the SPARROW predictions to the NLA observations with the use of Vollenweider equations, simple input-output models that predict nutrient concentrations in lakes based on nutrient loads and hydraulic residence time. This allows us to better predict summer concentrations of TN and TP in Northeastern lakes and ponds. On average we improved our predicted concentrations of TN and TP with Vollenweider models by 18.7% for nitrogen and 19.0% for phosphorus. These improved predictions are being used in other studies to model ecosystem services (e.g., aesthetics) and dis-services (e.g., cyanobacterial blooms) for ~18,000 lakes in the Northeastern United States.


Global nutrient cycles have been disrupted by the combustion of fossil fuels and the use of fertilizers derived from industrially fixed nitrogen and mined phosphorus [1-6]. A large proportion of this anthropogenic increase in nitrogen and phosphorus flux is delivered to ground or surface waters through direct runoff, human and animal wastes, and atmospheric deposition. Ultimately, excess nutrients are transported to coastal waters [7,8].

Increases in nutrient loads to aquatic systems often result in enhanced primary production. This process, known as cultural eutrophication, leads to undesirable changes in aquatic resources such as reduced water clarity, hypoxia, harmful algal blooms, fish kills, loss of biodiversity, and increases in nuisance species [9-11]. Eutrophication can also affect human health through increased exposure to cyanobacterial toxins [12,13], nitrites, and nitrates [14,15]. Furthermore, the economic costs of eutrophication resulting from lost ecosystem services (e.g., housing amenity value, recreation opportunities, freshwater provisioning, and food and fiber production) are high [16-18].

In the United States, excess nutrients were reported as the primary cause of lake water quality impairments in the biennial United States Environmental Protection Agency (USEPA) reports to Congress from 1994 to 2002 [19-23]. Given the importance of nutrient pollution, the USEPA [24] requires states to adopt water quality standards with specific numeric nutrient criteria; however, fewer than half of the states have complied [25].

The development of nutrient criteria for lakes requires access to reliable information on nitrogen and phosphorus concentrations at the statewide or ecoregion level [24,26]. These data, however, are not always comparable because field and laboratory methods vary (see [27]), often leaving only a few sites with consistently collected and analyzed nutrient observations. Additionally, differences among sample times can obscure seasonal and inter-annual patterns. In 2007, the USEPA coordinated the National Lake Assessment (NLA), a survey of the biological, physical, habitat, and water quality condition of lakes in the 48 contiguous United States [28]. The survey provides consistently collected and analyzed nutrient data for 1152 lakes from the summer of 2007. The majority of the NLA lakes (1028) were selected with a spatially balanced, probabilistic sampling design that was developed to provide inference on the condition of the lakes in the contiguous United States at the national and ecoregional level [28]. An additional set of “hand-selected lakes” was included as reference sites. The National Lake Assessment found that, with respect to nutrients, approximately half of the lakes are in “Good” condition with the remainder split between “Fair” and “Poor” [28]. Although these data provide useful information on the quality of the nation's lakes, the sampling density (mean=21.4 sites/state) is too low for use in the development of nutrient criteria.

An alternative approach is to use models such as the USGS SPARROW (SPAtially Referenced Regression On Watershed attributes) models [29-31] to estimate nutrient loads to lakes. SPARROW models are watershed-based models that estimate nutrient loads to streams based on landscape characteristics and known or estimated nutrient sources [30]. An example of this is the Northeast SPARROW model [31]. This model uses an enhanced version of the medium resolution (1:100,000) National Hydrography Dataset (NHDPlus version 1 [V1]) [32] to represent the hydrology of the Northeast United States as a network of reaches (called “flowlines” within the NHD, but referred to as “reaches” throughout this article). A reach is a discrete, spatially defined, linear feature located on a hydrologic flow network (i.e., each reach has known upstream and downstream connections) that represents either a stream segment or an artificial path through a waterbody (e.g., wetland, lake, pond, or reservoir). Reaches are associated with unique catchments that include the submerged parts of the local watershed the reach flows through and any land area that drains directly to it.

The Northeast SPARROW model estimates nitrogen and phosphorus contributions to each reach based on 2002 land cover, point source discharges (e.g., waste water treatment facilities), crop types, agricultural fertilizer use levels, animal manure production, and atmospheric deposition (nitrogen only) for the catchment. Not all nutrients that are applied to the land will be delivered to or exported from the streams. Some nutrients will be lost during the land to water delivery phase and others will be lost through instream processes such as nutrient retention. Instream processing varies with stream order and waterbody type [29,31]. The model is calibrated with long-term monitoring data and therefore represents long-term average annual nutrient loads based on 2002 catchment condition. Non-linear least squares regression is used to estimate the model coefficients for source contributions and loss functions that will maximize the fit to the monitoring data [29,31].

In this paper we propose a novel use for the SPARROW model predictions. Although SPARROW was designed to estimate nutrient loads to reaches, the inclusion of waterbody features in both the SPARROW model and NHDplusV1 allows us to aggregate load predictions to lakes. From NHDplusV1 we can identify the stream reaches directly upstream of a lake (the inflows). The sum of the export loads for lake inflows represents the nutrient inputs to the lake from upstream sources. Incremental load predictions (i.e., loads generated within the local catchments before instream processing occurs) for reaches within a lake can be added to the upstream loads to estimate total nutrient inputs. Nutrient loads for all reaches exiting the lake can be summed to predict nutrient exports.

Moore et al. [31] have demonstrated that predicted nutrient concentrations for lakes from the Northeast SPARROW model are consistent with, though higher than, values observed during the 2007 National Lake Assessment. This is not surprising since water and nutrient inputs are greatest during the spring runoff period and therefore the average annual predictions of concentration by the SPARROW model will be heavily weighted by spring conditions when nutrient retention within the lake is minimal [33]. Inter-annual variation in precipitation, nutrient inputs, or changes in land cover between 2002 and 2007 will also explain some of the variation. Additionally, estimates of nutrient retention in lakes by the Northeast SPARROW model are lower than expected based on a survey of the literature (see below) and this will also affect nutrient load predictions. We hypothesize that the differences between the NLA observations and the SPARROW predictions for TN and TP result from a combination of inter-annual and seasonal variation in water and nutrient inputs coupled with an underestimation of nutrient retention by the model.

Our overall goal for this paper was to demonstrate that average annual nutrient predictions from SPARROW could be coupled with monitoring data to estimate summer concentrations of nutrients in lakes. The objectives were to (1) adjust the SPARROW predicted annual average concentrations to the observed summer values for 2007 with simple linear models; (2) use Vollenweider-type input-output models [35-37] and modeled maximum lake depth and volume [38,39] to improve predictions; and (3) extrapolate results from the best fit model to the ca. 18,000 lakes in the Northeast United States with SPARROW nutrient flux predictions.

Materials and Methods

Study Area

The study area includes Hydrologic Unit Code regions 01 and 02 from the National Hydrography Dataset [32], the same areal extent and spatial resolution as the Northeast SPARROW model [31]. This area comprises all or most of Connecticut, Delaware, Maine, Maryland, Massachusetts, New Hampshire, New Jersey, Rhode Island, Vermont, and Washington, D.C.; much of New York, Pennsylvania, and Virginia; and the eastern part of West Virginia (Figure 1).


Figure 1. Map of the study area.

Shown are the locations of the lakes within Hydrologic Unit Code (HUC) regions 01 (New England) and 02 (Mid-Atlantic).


Lake Hydrology, Morphometric, and Water Quality Data

The medium resolution (1:100,000) NHDplusV1 [32], associated tools and value added data tables were used for this analysis. NHDplusV1 for HUC 01 and 02 includes 208,185 reaches and their catchments, and 28,879 waterbody polygons identified as feature type (FTYPE) lake/pond, or reservoir (hereafter lakes). The NHDplusV1 shapefiles and data files were saved in an ESRI personal geodatabase. This allowed the data to be queried as a relational database and geo-processed in ArcGIS (v. 9.2). In NHDplusV1, lakes that intersected the USGS 1:100,000 quad boundaries were split into two or more contiguous polygons. These were re-aggregated into 28,122 uniquely identified (WB_ID) waterbody units by dissolving contiguous polygons of the same feature type in ArcGIS.

All reaches are assigned a unique identifier (a ComID) by NHDplusV1. Hydrologic relationships among ComIDs are defined in a table included with NHDplusV1 (NHDflow) that identifies the ComIDs of upstream (FromComID) and downstream (ToComID) connections. We used ArcGIS to identify the ComIDs of reaches within lakes, and these were joined to the table NHDflow in MS Access to identify upstream and downstream connections following the procedures outlined in the NHDplusV1 user guide [40]. When the ComIDs for reaches within a lake are joined to the ToComID in the table NHDflow, the resulting list of FromComIDs identifies all upstream connections. Since many lakes have multiple internal reaches, some of these upstream connections represent connections within the lake and others represent stream segments immediately upstream of the lake (input reaches). Input reaches can be distinguished from internal connections by excluding all FromComIDs that match ComIDs within the lake. Lake outflows, the most downstream reaches within a lake, were identified in a similar manner. When the ComIDs for reaches within a lake are joined to the FromComID in the table NHDflow, the resulting list of ToComIDs identifies all downstream connections. The outflows are those reaches associated with ToComIDs that do not correspond to any of the ComIDs within the lake.
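The join logic described above can be sketched with plain Python sets. This is only an illustration, not the authors' ArcGIS and MS Access workflow: the function name, the representation of NHDflow as a list of (FromComID, ToComID) pairs, and the ComID values are all hypothetical.

```python
def lake_inflows_and_outflows(lake_comids, flow_table):
    """Return (input reaches, outflow reaches) for one lake.

    lake_comids: ComIDs of the reaches inside the lake.
    flow_table:  iterable of (FromComID, ToComID) pairs, standing in
                 for the NHDflow table (assumed layout).
    """
    inside = set(lake_comids)
    # Reaches that flow into a lake reach but are not themselves in the lake.
    inflows = {frm for frm, to in flow_table
               if to in inside and frm not in inside}
    # Lake reaches whose downstream connection lies outside the lake.
    outflows = {frm for frm, to in flow_table
                if frm in inside and to not in inside}
    return inflows, outflows


# Toy network with made-up ComIDs: streams 1 and 2 enter lake reach 10,
# reach 10 flows to lake reach 11, and reach 11 drains to stream 20.
flow = [(1, 10), (2, 10), (10, 11), (11, 20)]
inflows, outflows = lake_inflows_and_outflows([10, 11], flow)
print(sorted(inflows), sorted(outflows))
```

The internal connection (10, 11) is excluded from both sets because both of its ends lie inside the lake, which mirrors the exclusion rule in the text.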

The surface area (m2) of each lake was calculated in ArcGIS version 9.2 following transformation to the Albers equal area projection. Maximum depth (m) and volume (m3) were estimated from the surrounding topography from the National Elevation Dataset [41] following the methods of Hollister and Milstead [38] and Hollister et al. [39]. Hydraulic Residence Time (years) was calculated as the ratio of volume to flow. Mean depth (m) was calculated as the ratio of volume to surface area.
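The two ratios defined in this paragraph can be written out as a short sketch. The function names and the example lake are hypothetical; flow is assumed to be expressed in m3/year so that the residence time comes out in years.

```python
def mean_depth_m(volume_m3, surface_area_m2):
    """Mean depth (m) as the ratio of lake volume to surface area."""
    return volume_m3 / surface_area_m2


def hydraulic_residence_time_years(volume_m3, flow_m3_per_year):
    """Hydraulic residence time (years) as the ratio of volume to flow."""
    return volume_m3 / flow_m3_per_year


# Made-up lake: 2,000,000 m3 volume, 500,000 m2 surface area,
# and 10,000,000 m3/year of through-flow.
z = mean_depth_m(2_000_000, 500_000)
tau = hydraulic_residence_time_years(2_000_000, 10_000_000)
print(z, tau)
```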

The National Lake Assessment water quality data (TN and TP) were obtained from the USEPA [42] and converted to an ESRI personal Geodatabase. Lakes were spatially joined to the NHD waterbodies and assigned the corresponding WB_ID. A total of 131 lakes with NLA data and SPARROW estimates were identified in the Northeast. Of these, 98 lakes were selected for sampling using the probability design and 33 were reference sites. All TN values were above the method detection limit (0.01 mg/l) but 18 of the TP observations were below the detection limit (0.004 mg/l). TP values below the limit were arbitrarily set to 0.002 mg/l for our analysis.

Northeast SPARROW Model

The Northeast SPARROW model predictions [31] for total and incremental nitrogen and phosphorus loads (kg/yr) and flow (average annual water inputs [CFS]) were retrieved from the USGS SPARROW Decision Support System [43-45] for all reaches in the study area. Total nitrogen and phosphorus load predictions represent the average annual flux (kg/yr) of nitrogen and phosphorus delivered to the next downstream reach. Total load equals the sum of the nutrient inputs from all reaches immediately upstream of a reach plus the incremental load (load delivered directly to the reach from sources within the local catchment area) minus estimated nutrient decay within the reach itself [29-31]. In this study we used the same logic to calculate nitrogen, phosphorus, and water (flow) inputs and exports for lakes. Upstream inputs of nitrogen and phosphorus were calculated by summing the nitrogen and phosphorus loads for all reaches immediately upstream of the lake. The incremental inputs of nutrients were represented by the sums of the individual incremental nitrogen and phosphorus loads for all reaches within the lake. The total inputs to the lake equal the sums of the upstream inputs and the incremental inputs for each nutrient. The lake nutrient inputs do not include losses due to nutrient retention within the lake and therefore estimate total load before in situ processing occurs. Lake exports, the nutrient delivered to the reaches immediately downstream of a lake, are the total load minus in lake nutrient retention. These were calculated by summing the nitrogen and phosphorus loads for the lake outflows. Flow is the average annual input of water to the lake. In the SPARROW model, water is conserved so water exports equal water inputs. Total flow (the average annual input or export of water to the lake) was calculated as the sum of the flows for all lake outflows. Total flow was converted to m3/year by multiplying the total CFS by 893,593. 
Nutrient inputs and exports of nitrogen (Nin & Nout) and phosphorus (Pin & Pout) were converted to concentrations (mg/l) by dividing the total load (kg/yr) by the total flow (m3/yr) and multiplying the result (kg/m3, equivalent to g/l) by 1,000.
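A minimal sketch of this aggregation and unit conversion follows. The function name and the load and flow values are made up for illustration. The CFS factor is the one quoted in the text, and the factor of 1,000 follows from unit arithmetic (1 kg/m3 = 1 g/l = 1,000 mg/l).

```python
CFS_TO_M3_PER_YEAR = 893_593  # conversion factor quoted in the text


def lake_input_concentration_mg_l(upstream_loads_kg_yr,
                                  incremental_loads_kg_yr,
                                  outflow_cfs):
    """Input concentration (mg/l) for one lake.

    upstream_loads_kg_yr:    total loads of the reaches immediately
                             upstream of the lake
    incremental_loads_kg_yr: incremental loads of the reaches in the lake
    outflow_cfs:             flows of the lake outflow reaches (CFS)
    """
    total_load = sum(upstream_loads_kg_yr) + sum(incremental_loads_kg_yr)
    total_flow = sum(outflow_cfs) * CFS_TO_M3_PER_YEAR  # m3/year
    return total_load / total_flow * 1000.0  # kg/m3 -> mg/l


# Made-up lake: two inflows, one internal reach, one 2 CFS outflow.
n_in = lake_input_concentration_mg_l([800.0, 200.0], [50.0], [2.0])
print(round(n_in, 4))
```

The same function gives the export concentration if the summed loads of the lake outflows are passed as the only load term.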

The Northeast SPARROW model provided nutrient load predictions for 18,016 of the 28,122 lakes identified in the study area. Lakes without predictions were either small, isolated basins without connections to the larger NHDplusV1 network (no input or output reaches) or were associated with coastal salt pond systems. The SPARROW model estimates a nutrient decay coefficient (based on hydraulic load for lakes) from the data but the decay term is only included if it is statistically significant. For reaches in lakes the Northeast SPARROW model found a significant decay coefficient for phosphorus but not nitrogen [31]. As a result, nitrogen inputs to lakes should equal exports unless there are hydrological issues such as water diversions within or immediately upstream of the lakes. A total of 17,792 lakes met this conservation of nitrogen mass balance criterion (Nin = Nout) for inclusion in this study.

Vollenweider Models

Vollenweider models (input-output models) are used to predict lake nutrient concentrations from nutrient inputs, residence time, and (sometimes) mean depth [34-36]. Brett and Benjamin [35] identified five input-output model variations (H1 to H5 in Table 1) used to estimate lake phosphorus concentrations from phosphorus input concentrations. They compared the five models to the null hypothesis (H0: lake concentration = output concentration) and found H4 to be best supported. A sixth variation (H6; Table 1) was used by Reckhow [36] to estimate phosphorus concentrations for the EUTROMOD model. The majority of the input-output models in the literature have been used to predict phosphorus concentrations in lakes. However, Vollenweider [34] applied his original input-output models to both nitrogen and phosphorus, Bachman [46] and Reckhow [36] used a variation of H4 for nitrogen, and Windolf et al. [37] developed two input-output model variations for nitrogen (see H7 and H8; Table 1). The input-output models are simple mass balance equations that estimate the concentration of chemical substances based on inputs, outputs, and sedimentation [34]. Therefore they are equally applicable to the estimation of TN and TP concentrations. To test whether an independent estimate of nutrient retention improves our ability to predict nutrient concentrations, we used the eight input-output models in Table 1 to predict nitrogen and phosphorus concentrations for the Northeast lakes. The input-output models were parameterized with the SPARROW predicted nutrient input concentrations (Nin & Pin) and our estimates of hydraulic residence time and mean depth; the NLA observed TN and TP concentrations were used to validate the models.


Table 1. Hypotheses tested.

H0 is the null hypothesis that lake concentration of nutrients (nitrogen and phosphorus) is equal to the export concentration for reaches leaving the lake. H1-H8 are the Vollenweider (input-output) models with parameters and initial coefficients taken from the literature used to fit the models. Note: H1-H6 were originally derived to estimate total phosphorus and H7 & H8 were derived to estimate total nitrogen. Nutrientlake = Measured nutrient (nitrogen or phosphorus) concentration (mg/l) for the lake; Nutrientin = SPARROW predicted average annual nutrient (nitrogen or phosphorus) input concentration (mg/l) for the lake; Nutrientout = SPARROW predicted average annual nutrient (nitrogen or phosphorus) export concentration (mg/l) for the lake; z = mean depth (m); and, τ = hydraulic residence time (years).

Statistical Analysis

All analyses were completed with the open source statistical package R version 2.14.2 [47]. The R package “spsurvey” [48] was used to calculate confidence intervals for the NLA observations. The statistical analyses, tables, and figures 2-7 can be reproduced with the R-script (Text S1) and the R-dataset (Dataset S1) included as supplements to this publication.


Figure 2. Total nitrogen (A) and phosphorus (B) in Northeast Lakes.

National Lake Assessment 2007 observed summer concentrations versus the average annual SPARROW predicted concentrations. Observations are color coded by hydraulic residence time (HRT: Short < 0.04 years; Medium = 0.04 to 0.4 years; Long > 0.4 years). TN = Total Nitrogen. TP = Total Phosphorus.


Figure 3. Adjusted (Linear Model) total nitrogen (A) and phosphorus (B) in Northeast Lakes.

National Lake Assessment observed 2007 summer concentrations of (A) total nitrogen and (B) phosphorus in Northeast Lakes versus the linear model (LM) adjusted average annual SPARROW predicted concentrations. Linear regression was used to adjust SPARROW predictions to the 2007 NLA observations. Observations are color coded by hydraulic residence time (HRT: Short < 0.04 years; Medium = 0.04 to 0.4 years; Long > 0.4 years). TN = Total Nitrogen. TP = Total Phosphorus.


Figure 4. Adjusted (Vollenweider Model) total nitrogen (A) and phosphorus (B) in Northeast Lakes.

National Lake Assessment observed 2007 summer concentrations of (A) total nitrogen and (B) phosphorus in Northeast Lakes versus the Vollenweider (Vw) adjusted average annual SPARROW predicted concentrations. Robust non-linear regression was used to fit SPARROW predictions to 2007 NLA observations using the Vollenweider equation (H6). Observations are color coded by hydraulic residence time (HRT: Short < 0.04 years; Medium = 0.04 to 0.4 years; Long > 0.4 years). TN = Total Nitrogen. TP = Total Phosphorus.


Figure 5. Cumulative distribution functions for observed and predicted nitrogen (A) and phosphorus (B) concentrations.

Cumulative distribution functions for SPARROW predictions, the Vollenweider (H6) adjusted SPARROW predictions, and the 2007 National Lake Assessment observations. The grey polygons represent the weighted 95% confidence intervals for the NLA cumulative distributions. Note: Only the SPARROW predictions consistent with the NLA sampling design (area ≥ 4 ha; maximum depth ≥ 1 m; n=7669) and the NLA lakes selected under the probabilistic design were included (n=98).


Figure 6. Plot of model residuals by hydraulic residence time.

Panels: (A) linear model (LM) for total nitrogen; (B) Vollenweider (Vw) model for total nitrogen; (C) linear model (LM) for total phosphorus; and, (D) Vollenweider (Vw) model for total phosphorus. For both nutrients, the linear models (A and C) show a bias towards negative residuals (overestimation) for longer residence times. In contrast, the residuals for the Vollenweider models (B and D) are independent of residence time. Observations are color coded by hydraulic residence time (Short: < 0.04 years; Medium: 0.04 to 0.4 years; Long: > 0.4 years).


Figure 7. The geographical distribution of the Northeast lakes coded by trophic state.

The National Lake Assessment Lake centroids with trophic state (TS) estimated from observed total (A) Nitrogen and (B) Phosphorus concentrations. Lakes with trophic state estimated from the Vollenweider adjusted predicted total (C) nitrogen and (D) phosphorus concentrations.


Our null hypothesis (H0; Table 1) was that the NLA observed in-lake nutrient concentrations could be estimated from the SPARROW predicted export load concentrations (Nout & Pout). The expectation was that the observed values would differ from the SPARROW predictions due to seasonal and inter-annual variation in inputs but that the variation could be reduced by fitting the data to a simple linear model. For NLA lakes the unadjusted SPARROW predictions were fit to observed values with linear regression (lm procedure in R; Nutrientobs ~ Nutrientout); all variables were log10 transformed. The resulting model was used to produce new estimates of nutrient concentrations for the NLA lakes.
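The linear adjustment described above can be sketched as an ordinary least squares fit on log10-transformed values. The authors used R's lm procedure; the Python below is an illustrative stand-in, and the function names and concentrations are hypothetical.

```python
import math


def fit_log10_linear(observed, predicted):
    """OLS fit of log10(observed) = intercept + slope * log10(predicted)."""
    x = [math.log10(v) for v in predicted]
    y = [math.log10(v) for v in observed]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def adjust_prediction(predicted, intercept, slope):
    """Back-transform the fitted line to a concentration (mg/l)."""
    return 10 ** (intercept + slope * math.log10(predicted))


# Made-up data in which every observation is half the predicted value.
sparrow_out = [0.1, 1.0, 10.0]
nla_observed = [0.05, 0.5, 5.0]
intercept, slope = fit_log10_linear(nla_observed, sparrow_out)
print(round(intercept, 3), round(slope, 3))
```

Fitting on the log10 scale means the intercept acts as a multiplicative correction after back-transformation, which is why a constant halving of the predictions shows up as an intercept of log10(0.5) with a slope of 1.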

To test whether the SPARROW predictions could be improved by the application of input-output models we followed a two step procedure. First, we used robust non-linear regression (nlrob procedure from the R robustbase package [49]) to fit the observed NLA TN and TP concentrations, predicted SPARROW nutrient input concentrations (Nin & Pin), and the estimated hydraulic residence times and mean depths to the eight Vollenweider models listed in Table 1. Second, for each nutrient the input-output model with the lowest AIC (Akaike Information Criterion) value was selected and used to generate new nutrient concentration predictions for the NLA lakes.

Three estimates of TN and TP concentrations were made for each NLA lake: the uncorrected SPARROW output predictions (Nout & Pout), linear model adjustments to Nout and Pout, and the predictions from the best fit input-output models. Each estimate was compared to the observed values with linear regression and the adjusted R2 was used as an estimate of the amount of variance explained by the model. Root mean squared error (RMSE) was used as an estimate of goodness of fit. The model with the lowest RMSE and highest adjusted R2 was selected and used to predict the nutrient concentrations of the 17,792 lakes in the region with useable SPARROW information.
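The two comparison statistics can be sketched as follows. This assumes the textbook definitions (RMSE as the square root of the mean squared difference, and adjusted R2 for a simple regression of observed on predicted values with one predictor); the function names and the numbers are hypothetical, and the authors' R script may compute them differently.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2
                         for o, p in zip(observed, predicted)) / n)


def adjusted_r2(observed, predicted, n_predictors=1):
    """Adjusted R2 of the simple linear regression observed ~ predicted.

    For one predictor, R2 equals the squared Pearson correlation.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    soo = sum((o - mean_o) ** 2 for o in observed)
    spp = sum((p - mean_p) ** 2 for p in predicted)
    sop = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    r2 = sop ** 2 / (soo * spp)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Made-up log10 concentrations for four lakes.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(round(rmse(obs, pred), 3), round(adjusted_r2(obs, pred), 3))
```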


No permits were required for the described study, which complied with all relevant regulations.

Results and Discussion

We successfully accomplished the three objectives of this paper: (1) adjust the SPARROW predicted annual average concentrations to the observed summer values for 2007 with simple linear models; (2) use Vollenweider-type input-output models [35-37] and modeled maximum lake depth and volume [38,39] to improve predictions; and (3) extrapolate results from the best fit model to the ca. 18,000 lakes in the Northeast United States with SPARROW nutrient flux predictions.

Comparison of SPARROW to the National Lakes Assessment

The relationship between observed and predicted nutrient concentrations appears to be linear for both nitrogen and phosphorus (Figure 2). However, for the 131 lakes with both NLA observations and SPARROW predictions, 76% of the TP estimates and 83% of the TN estimates fall below the one-to-one line (Figure 2), indicating that SPARROW predicts higher nutrient concentrations than were observed during the summer of 2007.

A large proportion of these differences is likely due to a combination of inter-annual and seasonal variation in nutrient loads along with model and sampling error. SPARROW estimates are based on long-term annual means at monitored sites. Therefore, observations from any given year will differ from the SPARROW estimates due to fluctuations in inputs, flow, and climatic conditions. In the Northeast U.S. most of the annual nutrient flux occurs following snow melt in the spring [33]. Therefore, summer nutrient concentrations are expected to be lower than the annual means [33] since inputs are lowest and aquatic decay is greatest during the summer months. Based on the assumption that these changes are both linear and additive, linear regression was used to adjust the predicted 2002 mean annual nutrient concentrations from the SPARROW model to the 2007 NLA observed summer conditions. This approach (H0) explains 43% and 35% of the variation in TN and TP concentrations, respectively (Table 2 and Figure 3). These results indicate that the SPARROW model can be used to estimate nutrient concentrations with a reasonable level of accuracy. Similar results are also reported in Moore et al. [31].


Table 2. Model selection results by hypothesis and nutrient.

Hypothesis H6 (in bold) had the lowest aic and rmse values for both nutrients. Abbreviations: rmse = the root mean squared error; adjR2 = coefficient of determination adjusted for number of estimated parameters; aic = Akaike information criterion.

Improved Predictions with Vollenweider Models

An important source of bias becomes evident when nutrient concentration data are coded by hydraulic residence time (Figures 2 and 3). Estimated nutrient concentrations for lakes with long residence times (>0.4 years; the fourth quartile for NLA lakes) tended to be higher than observed values while the opposite trend occurred in lakes with short residence times (<0.04 years; the first quartile for NLA lakes). For both nutrients, the means of the model residuals were significantly different (p < 0.0001) between lakes with long and short residence times (Table 3), thus confirming a prediction bias due to residence time. To adjust for this, the data were fit to Vollenweider-type input-output models H1-H8 (Table 1) by robust non-linear regression. Most of the models improved the prediction accuracy of the SPARROW model (Table 2). Based on AIC, H6 was selected as the best model for both phosphorus and nitrogen. Use of this model explained substantially more of the variation in observed vs. predicted concentration for both nutrients than the linear model alone. The adjusted R2 values for TN improved from 0.431 for the linear model to 0.618 for the Vollenweider model (H6) and the root mean squared error (RMSE) decreased from 0.260 to 0.214 (Table 2). For TP, H6 resulted in a higher adjusted R2 (0.541) and lower RMSE (0.359) than the linear model (adjusted R2 = 0.351; RMSE = 0.426; Table 2). The final parameterized Vollenweider models (H6) with calibrated coefficients are shown in Equations 1 and 2 (Nlake and Plake = lake TN and TP concentrations [mg/l]; Nin and Pin = TN and TP input concentrations [mg/l]; τ = hydraulic residence time [years]; and, z = mean depth [m]).


Table 3. T-test of residual means by hydraulic residence time for each nutrient.

Residuals from the linear and Vollenweider models for lakes with short (< 0.04 years) and long (> 0.4 years) residence times were compared with a t-test of the means. SD = standard deviation. N=number of observations by group. t = Student’s t-statistic. d.f. = degrees of freedom. P = probability.


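$$\log_{10}(N_{lake}) = \log_{10}\left(\frac{N_{in}}{1 + 2.0\,\tau^{0.38}\,z^{0.29}\,N_{in}^{1.14}}\right) \qquad \text{(Equation 1)}$$

$$\log_{10}(P_{lake}) = \log_{10}\left(\frac{P_{in}}{1 + 89.0\,\tau^{0.40}\,z^{0.57}\,P_{in}^{1.08}}\right) \qquad \text{(Equation 2)}$$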
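As a rough illustration, Equations 1 and 2 can be evaluated directly. The sketch assumes that both equations take the H6 form, lake concentration = input / (1 + c1 τ^c2 z^c3 input^c4), with the coefficients printed above and all exponents positive as they appear in the text. The function names and the example lake are hypothetical.

```python
def vollenweider_h6(c_in, tau, z, c1, c2, c3, c4):
    """Lake concentration (mg/l) from the assumed H6 form.

    c_in: input concentration (mg/l); tau: hydraulic residence time
    (years); z: mean depth (m); c1..c4: fitted coefficients.
    """
    return c_in / (1.0 + c1 * tau ** c2 * z ** c3 * c_in ** c4)


def predict_tn(n_in, tau, z):
    """Equation 1 with the coefficients quoted for nitrogen."""
    return vollenweider_h6(n_in, tau, z, 2.0, 0.38, 0.29, 1.14)


def predict_tp(p_in, tau, z):
    """Equation 2 with the coefficients quoted for phosphorus."""
    return vollenweider_h6(p_in, tau, z, 89.0, 0.40, 0.57, 1.08)


# Made-up lake: residence time 0.2 years, mean depth 4 m.
tn = predict_tn(1.0, 0.2, 4.0)
tp = predict_tp(0.05, 0.2, 4.0)
print(tn, tp)
```

Because the denominator is always greater than 1 for positive inputs, the predicted lake concentration is always below the input concentration, and the gap widens as residence time increases.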

Use of the Vollenweider model H6 improves estimates of summer nutrient levels (Figure 4). Cumulative distribution functions and their 95% confidence intervals were calculated for the observed NLA concentrations of TN and TP with the R package “spsurvey” [48]. These were compared to the TN and TP concentrations predicted by SPARROW and the Vollenweider adjusted SPARROW predictions. For both nitrogen and phosphorus the Vollenweider adjusted predictions closely approximated the observed distribution (within the 95% confidence interval) whereas the unadjusted SPARROW predictions did not (Figure 5).

More importantly, however, the Vollenweider model decreased the bias associated with residence time. When the Vollenweider predictions were compared to the observed summer values the predictions were symmetrical around the one-to-one line (Figure 4). The differences between the linear models and the Vollenweider models are apparent when the model residuals are plotted against residence time (Figure 6). For the linear models there is a clear change from under prediction to over-prediction as residence time increases, whereas the Vollenweider residuals for both nutrients are symmetrical around the zero line. Overall the residuals for the Vollenweider models showed less deviation than those for the linear model and there were no significant differences (p > 0.05) in means of the residuals for lakes with long and short residence times (Table 3). This highlights the importance of accounting for both nutrient inputs and residence time when estimating nutrient concentrations in lakes. Whereas both the linear model and the non-linear Vollenweider model adjust the model results to summer 2007 conditions, only the Vollenweider adjusted estimates control for differences in nutrient retention related to hydraulic residence time.

The Northeast SPARROW model predicts no nitrogen retention (100*[Input - Output] / Input) and low phosphorus retention (median = 8.3%; mean = 14.0%; s.d. = 15.40) for the Northeast lakes. These nutrient retention predictions are low compared to other published studies. Saunders and Kalff [50] found that lakes retain on average 34% of nitrogen, and similar values for nitrogen retention are reported by Reckhow ([36]; mean = 35%), Windolf et al. ([37]; mean = 33%), and Harrison et al. ([51]; reported as ranges). In an analysis of data from 305 lakes, Brett and Benjamin [35] report a mean phosphorus retention of 40% (median = 45%), which is consistent with the results of Hejzlar et al. ([52]; lake mean = 46%; reservoir mean = 43%) but lower than the 60% reported by Reckhow [36]. When compared to the SPARROW model, the H6 Vollenweider models show much higher levels of retention for both nitrogen (median = 20.7%; mean = 24.6%; s.d. = 18.51) and phosphorus (median = 29.5%; mean = 33.5%; s.d. = 25.17). Caution needs to be exercised in interpreting the nutrient retention calculations from H6 because they are confounded with the corrections for seasonal and annual variation in inputs. However, the reductions in bias related to residence time suggest that the Vollenweider model gives a more accurate representation of nutrient retention than the SPARROW model alone.
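To make the retention calculation concrete, the sketch below computes percent retention as 100*(Input - Output)/Input and then the retention implied by an H6-type model. It assumes the H6 form C_lake = C_in / (1 + c1 τ^c2 z^c3 C_in^c4) and that inflow and outflow water volumes are equal, so that the ratio of concentrations stands in for the ratio of loads. Both assumptions and both function names are illustrative and are not taken from the authors' code.

```python
def retention_percent(input_value, output_value):
    """Percent retention: 100 * (Input - Output) / Input."""
    if input_value <= 0:
        raise ValueError("input_value must be positive")
    return 100.0 * (input_value - output_value) / input_value


def h6_retention_percent(input_conc, tau, z, coef):
    """Retention implied by an H6-type Vollenweider model (assumed form).

    coef is (c1, c2, c3, c4); the predicted lake concentration is treated
    as the output concentration.
    """
    c1, c2, c3, c4 = coef
    lake_conc = input_conc / (1.0 + c1 * tau**c2 * z**c3 * input_conc**c4)
    return retention_percent(input_conc, lake_conc)
```

Under these assumptions retention depends only on the size of the power-law term, so it is zero when the residence time is zero and approaches 100% as the term grows.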

The Northeast SPARROW model estimates nutrient inputs, nutrient retention, and land-to-water delivery fractions directly from the data [31]. It is possible to include a user-defined nutrient retention estimate in the SPARROW model, and this approach has been used successfully by Alexander et al. [53]. Our results suggest that incorporating a user-defined retention estimate or a nutrient loss function based on hydraulic residence time for the reaches in lakes will improve the fit of the Northeast SPARROW model.

Predicting Summer Nutrient Concentration in Northeastern Lakes

The Vollenweider-adjusted SPARROW predictions provide reasonable estimates for the TN and TP concentrations observed during the 2007 National Lake Assessment. Equations 1 and 2 are used to extend these predictions to 17,810 lakes in the Northeast region of the United States. To visualize the data, we follow the 2007 National Lake Assessment [28] in assigning trophic status to lakes as follows: oligotrophic (TN ≤ 0.35 mg/l; TP ≤ 0.01), mesotrophic (0.35 < TN ≤ 0.75 mg/l; 0.01 < TP ≤ 0.25), eutrophic (0.75 < TN ≤ 1.4 mg/l; 0.25 < TP ≤ 0.5), and hypereutrophic (TN > 1.4 mg/l; TP > 0.5). Figure 7 shows the trophic status of Northeastern lakes based on observed (NLA) and predicted (Vollenweider-adjusted) nutrient concentrations. Visually, there is a high degree of spatial concordance between observed and predicted trophic state, with similar patterns for both nutrients. Lower-nutrient oligotrophic and mesotrophic lakes predominate in the north and at higher-elevation sites in the south. In contrast, higher-nutrient eutrophic and hypereutrophic lakes are more common in the agricultural areas of the Chesapeake drainage and the urbanized areas of the mid-Atlantic region.
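The assignment of trophic status from a predicted concentration can be sketched as a simple threshold lookup. The example below uses only the TN break points given in the text (0.35, 0.75, and 1.4 mg/l) and reads the hypereutrophic class as TN above the eutrophic upper bound of 1.4 mg/l. The function and constant names are hypothetical, and other break points can be passed in for a different nutrient.

```python
from bisect import bisect_left

# Upper bounds (mg/l) of the oligotrophic, mesotrophic, and eutrophic TN classes.
TN_BREAKS = (0.35, 0.75, 1.4)
CLASSES = ("oligotrophic", "mesotrophic", "eutrophic", "hypereutrophic")


def trophic_state(conc, breaks=TN_BREAKS):
    """Return the trophic class for a concentration.

    bisect_left counts the break points strictly below conc, so a value
    equal to a break point stays in the lower class (the "<=" bounds).
    """
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    return CLASSES[bisect_left(breaks, conc)]
```

Applying the same function to observed and predicted concentrations gives two class labels per lake, which is the kind of paired comparison a map such as Figure 7 conveys visually.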

Potential Use of Predicted Nutrient Concentrations

In this paper we demonstrate how the predictions from the USGS SPARROW model can be used to assess summer nutrient concentrations in lakes. Although the SPARROW model was designed to give reach-level information for streams, reaches within lakes can be aggregated to estimate long-term flow-weighted average annual nutrient concentrations in lakes. These concentrations, however, may not reflect the summer conditions that are of greater interest to lake managers. Furthermore, average annual conditions may not accurately capture inter-annual variation in inputs. By fitting Vollenweider models to monitoring data, the SPARROW predictions can be used to more accurately predict summer nutrient concentrations in lakes. Care should be taken in interpreting the results. The Northeast SPARROW model is based on 2002 landscape conditions, and it is highly likely that for any given watershed these conditions will have changed. Whereas the prediction uncertainty may be high for individual lakes, the SPARROW model gives a reasonable assessment of conditions for lakes aggregated at the state and regional levels.

The modified SPARROW predictions for nutrient concentrations in lakes could be useful to States for the design of monitoring programs aimed at the development of water quality standards and the assessment of impaired waters under section 303d of the Clean Water Act. Although modeled nutrient concentrations will never replace monitoring data, they could be used to target limited sampling funds to areas with the highest estimated nutrient concentrations or the greatest uncertainties [54].

Once impairments have been established, the SPARROW model predictions could also be used as a tool to evaluate scenarios for the TMDL (total maximum daily load) reductions necessary to remove impaired lakes from the 303d list. In addition to predicting total loads, the SPARROW model also provides numerical estimates of loads by source, such as agriculture, atmospheric deposition, and runoff from urban areas (see [31]). The USGS has recently released a web-based SPARROW decision support system that allows managers to estimate changes in loads that will accrue from modifications of nutrient input sources [43]. This tool could be used to predict how changes in management practices will affect nutrient loads to streams and lakes.

Many ecosystem services, such as lake shore housing amenity value, recreational opportunities for fishing, wildlife viewing, boating, contemplation, and the provisioning of safe drinking and irrigation water, are strongly affected by nutrient loads. Ecosystem dis-services such as cyanobacteria and their human and animal health risks are also affected by nutrients in lakes. As a result, it will also be possible to use the SPARROW decision support system to model how changes to loads could affect ecosystem services in lakes.

Supporting Information

Dataset S1.

Nitrogen, phosphorus, flow, and lake morphometry data used in the analyses. See Text S1 for data definitions. Data in comma separated value (CSV) format.



Text S1.

R-code in text format to replicate the analyses. Use this code in conjunction with Dataset S1 to replicate the statistical analyses, tables and figures 2-7.




Acknowledgments

We thank Kristen Hychka, Betty Kreakie, Jamie Shanley, Tim Gleason, Wayne Munns, Tomoya Iwata, and an anonymous reviewer for the comments and criticisms that greatly improved the manuscript. Eric Everman provided assistance with the USGS SPARROW decision support system. This paper has not been subjected to Agency review. Therefore, it does not necessarily reflect the views of the Agency. Mention of trade names or commercial products does not constitute endorsement or recommendation for use. This contribution is identified by the tracking number ORD-003873 of the Atlantic Ecology Division, Office of Research and Development, National Health and Environmental Effects Research Laboratory, US Environmental Protection Agency.

Author Contributions

Conceived and designed the experiments: WBM JWH RBM HAW. Performed the experiments: WBM JWH RBM. Analyzed the data: WBM JWH RBM. Contributed reagents/materials/analysis tools: WBM JWH RBM HAW. Wrote the manuscript: WBM JWH.


  1. Falkowski P, Scholes RJ, Boyle E, Canadell J, Canfield D et al. (2000) The global carbon cycle: A test of our knowledge of Earth as a system. Science 290: 291-296. doi:10.1126/science.290.5490.291. PubMed: 11030643.
  2. Galloway JN, Aber JD, Erisman JW, Seitzinger SP, Howarth RW et al. (2003) The nitrogen cascade. BioScience 53: 341-356. doi:10.1641/0006-3568(2003)053[0341:TNC]2.0.CO;2.
  3. Galloway JN, Dentener FJ, Capone DG, Boyer EW, Howarth RW et al. (2004) Nitrogen cycles: Past, present, and future. Biogeochemistry 70: 153-226. doi:10.1007/s10533-004-0370-0.
  4. Galloway JN, Townsend AR, Erisman JW, Bekunda M, Cai Z et al. (2008) Transformation of the nitrogen cycle: Recent trends, questions, and potential solutions. Science 320: 889-892. doi:10.1126/science.1136674. PubMed: 18487183.
  5. Seitzinger SP, Harrison JA, Dumont E, Beusen AHW, Bouwman AF (2005) Sources and delivery of carbon, nitrogen, and phosphorus to the coastal zone: An overview of Global Nutrient Export from Watersheds (NEWS) models and their application. Global Biogeochemical Cycles 19: 1-11. doi:10.1029/2005gb002606.
  6. Cordell D, Drangert J, White S (2009) The story of phosphorus: Global food security and food for thought. Global Environmental Change 19: 292-305. doi:10.1016/j.gloenvcha.2008.10.009.
  7. Liu Y, Villalba G, Ayres RU, Schroder H (2008) Global phosphorus flows and environmental impacts from a consumption perspective. Journal of Industrial Ecology 12: 229-247. doi:10.1111/j.1530-9290.2008.00025.x.
  8. Turner RE, Rabalais NN (2003) Linking landscape and water quality in the Mississippi river basin for 200 years. BioScience 53: 563-572. doi:10.1641/0006-3568(2003)053[0563:LLAWQI]2.0.CO;2.
  9. Hasler AD (1969) Cultural eutrophication is reversible. BioScience 19: 425-431. doi:10.2307/1294478.
  10. Smith VH, Tilman GD, Nekola JC (1999) Eutrophication: impacts of excess nutrient inputs on freshwater, marine, and terrestrial ecosystems. Environ Pollut 100: 179-196. doi:10.1016/S0269-7491(99)00091-3. PubMed: 15093117.
  11. Schindler DW, Vallentyne JR (2008) The algal bowl: Overfertilization of the world's freshwaters and estuaries. Edmonton: University of Alberta Press. 348 pp.
  12. Hudnell HK (2010) The state of U.S. freshwater harmful algal blooms assessments, policy and legislation. Toxicon 55: 1024-1034. doi:10.1016/j.toxicon.2009.07.021. PubMed: 19646465.
  13. Hudnell HK, Dortch Q (2008) A synopsis of research needs identified at the Interagency, International Symposium on Cyanobacterial Harmful Algal Blooms (ISOC-HAB). Advances in Experimental Medicine and Biology: 17-43.
  14. Wolfe AH, Patz JA (2002) Reactive nitrogen and human health: Acute and long-term implications. Ambio 31: 120-125. doi:10.1579/0044-7447-31.2.120;2. PubMed: 12078000.
  15. Townsend AR, Howarth RW, Bazzaz FA, Booth MS, Cleveland CC et al. (2003) Human health effects of a changing global nitrogen cycle. Frontiers in Ecology and the Environment 1: 240-246. doi:10.1890/1540-9295(2003)001[0240:HHEOAC]2.0.CO;2.
  16. Moomaw WR, Birch MB (2005) Cascading costs: an economic nitrogen cycle. Science in China Series C, Life Sciences. Chinese Academy of Sciences 48 Spec No. 2: 678-696.
  17. Pretty JN, Mason CF, Nedwell DB, Hine RE, Leaf S et al. (2003) Environmental costs of freshwater eutrophication in England and Wales. Environ Sci Technol 37: 201-208. doi:10.1021/es032467q. PubMed: 12564888.
  18. Dodds WK, Bouska WW, Eitzmann JL, Pilger TJ, Pitts KL et al. (2009) Eutrophication of U.S. freshwaters: Analysis of potential economic damages. Environ Sci Technol 43: 12-19. doi:10.1021/es801217q. PubMed: 19209578.
  19. USEPA (1994) National water quality inventory: 1994 report to Congress. Washington, D.C. Available online at: wa/305b/94report_index.cfm. Accessed 2013 October 29.
  20. USEPA (1996) National water quality inventory: 1996 report to Congress. Washington, D.C. Available online at: wa/305b/96report_index.cfm. Accessed 2013 October 29.
  21. USEPA (1998) National water quality inventory: 1998 report to Congress. Washington, D.C. Available online at: wa/305b/98report_index.cfm. Accessed 2013 October 29.
  22. USEPA (2000) National water quality inventory 2000 report. Washington, D.C. Available online at: wa/305b/2000report_index.cfm. Accessed 2013 October 29.
  23. USEPA (2002) National water quality inventory: Report to Congress 2002 reporting cycle. Washington, D.C. Available online at: wa/305b/2002report_index.cfm. Accessed 2013 October 29.
  24. USEPA (1998) National strategy for the development of regional nutrient criteria. Available: .waterquality/standards/criteria/aqlife/pollutants/nutrient/nutsi.cfm. Accessed 2013 October 29.
  25. USEPA (2009) EPA needs to accelerate adoption of numeric nutrient water quality standards. Available: .0826-09-P-0223.pdf. Accessed 2013 October 29.
  26. Reckhow KH, Arhonditsis GB, Kenney MA, Hauser L, Tribo J et al. (2005) A predictive approach to nutrient criteria. Environ Sci Technol 39: 2913-2919. doi:10.1021/es048584i. PubMed: 15926533.
  27. Lamon EC III, Qian SS (2008) Regional scale stressor-response models in aquatic ecosystems. Journal of the American Water Resources Association 44: 771-781. doi:10.1111/j.1752-1688.2008.00205.x.
  28. USEPA (2009) National lakes assessment: A collaborative survey of the Nation’s lakes. U.S. Environmental Protection Agency. Available: .la_newlowres_fullrpt.pdf. Accessed 2013 October 29.
  29. Schwarz GE, Hoos AB, Alexander RB, Smith RA (2006) The SPARROW surface water-quality model theory, application, and user documentation. US Geological Survey Techniques and Methods Report, Book 6, Chapter. Reston, VA, U.S. Dept. of the Interior, U.S. Geological Survey. p. 248, Available: .3. Accessed 2013 October 29.
  30. Preston SD, Alexander RB, Woodside MD, Hamilton PA (2009) SPARROW modeling enhancing understanding of the Nation's water quality. US Geological Survey Scientific Fact Sheet 2009–3019. Available: . Accessed 2013 October 29.
  31. Moore RB, Johnston CM, Smith RA, Milstead B (2011) Source and delivery of nutrients to receiving waters in the Northeastern and Mid-Atlantic regions of the United States. J Am Water Resour Assoc 47: 965-990. doi:10.1111/j.1752-1688.2011.00582.x. PubMed: 22457578.
  32. USEPA USGS (2006) National Hydrography Dataset Plus version 1 (NHDPlusV1). Available: .HDPlusV1_home.php. Accessed 2013 October 29.
  33. Alexander R, Böhlke J, Boyer E, David M, Harvey J et al. (2009) Dynamic modeling of nitrogen losses in river networks unravels the coupled effects of hydrological and biogeochemical processes. Biogeochemistry 93: 91-116. doi:10.1007/s10533-008-9274-8.
  34. Vollenweider R (1975) Input-output models. Schweizerische Zeitschrift für Hydrologie 37: 53-84. doi:10.1007/bf02505178.
  35. Brett MT, Benjamin MM (2008) A review and reassessment of lake phosphorus retention and the nutrient loading concept. Freshwater Biology 53: 194-211. doi:10.1111/j.1365-2427.2007.01862.x.
  36. Reckhow KH (1988) Empirical models for trophic state in Southeastern U.S. lakes and reservoirs. Journal of the American Water Resources Association 24: 723-734. doi:10.1111/j.1752-1688.1988.tb00923.x.
  37. Windolf J, Jeppesen E, Jensen JP, Kristensen P (1996) Modelling of seasonal variation in nitrogen retention and in-lake concentration: A four-year mass balance study in 16 shallow Danish lakes. Biogeochemistry 33: 25-44. doi:10.1007/bf00000968.
  38. Hollister J, Milstead WB (2010) Using GIS to estimate lake volume from limited data. Lake and Reservoir Management 26: 194-199. doi:10.1080/07438141.2010.504321.
  39. Hollister JW, Milstead WB, Urrutia MA (2011) Predicting maximum lake depth from surrounding topography. PLOS ONE 6: e25764. doi:10.1371/journal.pone.0025764. PubMed: 21984945.
  40. USEPA USGS (2010) NHDPlus version 1 (NHDPlusV1) user guide. Available: .DPlusV1/documentation/NHDPLUSV1_UserGuide.pdf. Accessed 2013 October 29.
  41. Gesch D, Evans G, Mauck J, Hutchinson J, Carswell WJ Jr (2009) National Map - Elevation: U.S. Geological Survey Fact Sheet: 2009-3053. Available: . Accessed 2013 October 29.
  42. USEPA (2007) National lake assessment data. Available: ..cfm. Accessed 2013 October 29.
  43. Booth NL, Everman EJ, Kuo IL, Murphy L, Sprague L (2011) A web-based decision support system for assessing regional water-quality conditions and management actions. Journal of the American Water Resources Association 47: 1136-1150. doi:10.1111/j.1752-1688.2011.00573.x. PubMed: 22457585.
  44. USGS (2012) SPARROW Decision Support System: Total nitrogen model for the New England and Mid-Atlantic region - 2002. Available: . Accessed 2013 October 29.
  45. USGS (2012) SPARROW Decision Support System: Total phosphorus model for the New England and Mid-Atlantic region - 2002. Available: . Accessed 2013 October 29.
  46. Bachmann RW (1980) Prediction of total nitrogen in lakes and reservoirs. In USEPA; Restoration of lakes and inland waters: International symposium on inland waters and lake restoration (Sept. 8-12, 1980). Portland, ME. EPA. p. 320. pp 324.
  47. R Development Core Team (2012) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available: . Accessed 2013 October 29.
  48. Kincaid TM, Olsen AR (2012) spsurvey: Spatial survey design and analysis. R package version 2.6. Available: .psurvey/index.html. Accessed 2013 October 29.
  49. Rousseeuw P, Croux C, Todorov V, Ruckstuhl A, Salibian-Barrera M et al. (2012) robustbase: Basic robust statistics. R package version 0.8-1-1. Available: .obustbase/index.html. Accessed 2013 October 29.
  50. Saunders DL, Kalff J (2001) Nitrogen retention in wetlands, lakes and rivers. Hydrobiologia 443: 205-212. doi:10.1023/A:1017506914063.
  51. Harrison JA, Maranger RJ, Alexander RB, Giblin AE, Jacinthe PA et al. (2009) The regional and global significance of nitrogen removal in lakes and reservoirs. Biogeochemistry 93: 143-157. doi:10.1007/s10533-008-9272-x.
  52. Hejzlar J, Šámalová K, Boers P, Kronvang B (2006) Modelling phosphorus retention in lakes and reservoirs. In: B. Kronvang, J. Faganeli, N. Ogrinc. The Interactions Between Sediments and Water. Springer Netherlands. pp. 123-130.
  53. Alexander RB, Boyer EW, Smith RA, Schwarz GE, Moore RB (2007) The role of headwater streams in downstream water quality. J Am Water Resour Assoc 43: 41-59. doi:10.1111/j.1752-1688.2007.00005.x. PubMed: 22457565.
  54. National Research Council (2001) Assessing the TMDL approach to water quality management. National Academy Press, Washington, D.C.