Conceived and designed the experiments: YT. Performed the experiments: YT PWR. Analyzed the data: YT. Wrote the paper: YT. Reviewed and edited manuscript: PWR DC. Obtained funding and permit: DC.
The authors have declared that no competing interests exist.
Animal tracking is a growing field in ecology and previous work has shown that simple speed filtering of tracking data is not sufficient and that improvements of tracking location estimates are possible. To date, this has required methods that are complicated and often time-consuming (state-space models), resulting in limited application of this technique and the potential for analysis errors due to poor understanding of the fundamental framework behind the approach. We describe and test an alternative and intuitive approach consisting of bootstrapping random walks biased by forward particles. The model uses recorded data accuracy estimates, and can assimilate other sources of data such as sea-surface temperature, bathymetry and/or physical boundaries. We tested our model using ARGOS and geolocation tracks of elephant seals that carried GPS tags in addition to PTTs, enabling true validation. Among pinnipeds, elephant seals are extreme divers that spend little time at the surface, which considerably impacts the quality of both ARGOS and light-based geolocation tracks. Despite such low overall quality tracks, our model provided location estimates within 4.0, 5.5 and 12.0 km of true location 50% of the time, and within 9, 10.5 and 20.0 km 90% of the time, for above, equal or below average elephant seal ARGOS track qualities, respectively. With geolocation data, 50% of errors were less than 104.8 km (<0.94°), and 90% were less than 199.8 km (<1.80°). Larger errors were due to lack of sea-surface temperature gradients. In addition we show that our model is flexible enough to solve the obstacle avoidance problem by assimilating high resolution coastline data. This reduced the number of invalid on-land locations by almost an order of magnitude. The method is intuitive, flexible and efficient, promising extensive utilization in future research.
Monitoring the movement of animals is fundamental for investigating processes and patterns of animal distribution, habitat use and selection, habitat connectivity, recruitment, migrations, and foraging strategies. Movements of freely ranging animals are typically studied using some form of telemetry due to the difficulties of visually tracking individual animals in the wild. However, the various forms of telemetry come with certain limitations, such as limited spatial accuracy and low and/or uneven temporal resolution of recorded locations
A variety of approaches have been developed to correct telemetry data by: 1) reducing spatial errors and 2) correcting for temporal lags and unevenness between data points. For the first process, filtering techniques are commonly applied to the data, based on the previous estimation of a maximum traveling speed of the animal
State-space modeling (or state-space models; SSM) is an alternative process that uses the error in the data as a source of information to infer the likelihood of the animal's position
The aim of this paper is to propose a simpler, non-state-based random walk (RW) modeling approach that uses forward particle sampling as a parsimonious, intuitive, efficient and practical alternative for correcting and interpolating tracking data.
In developing this new methodology, we imposed several requirements:
As in state-space models, we use estimates of spatial accuracy as a source of information to infer a probable animal position for a given time.
Contrary to state-space models, we do not speculate on the unknown state of an animal to infer a subsequent position.
The method must output a track with a custom fixed time interval, thus dealing with the corrective and interpolating processes in a single step.
In many cases, other information independent from the tracking data can also be used in the modeling process, such as physical boundaries known to constrain dispersal or habitat characteristics known to provide more or less favorable habitat. For example, these can be forest or city limits in terrestrial environment, or coastlines in the marine environment. We must have a way to include these sources of information in the process.
Each output estimate of the animal's position must come with a valid estimation of confidence.
The method must be tested on real data in a way that performance can be validated.
Each step must be intuitively easy to understand, i.e. as simple as possible. While this is subjective, it will be crucial in determining both the usefulness of the method and the probability that it will be accepted and employed by the greater animal tracking community.
All procedures used were approved by the UCSC CARC (IACUC) committee and permitted under NMFS marine mammal permits #786-1463 and #87-143.
Our focus here is on tracking data collected via the ARGOS satellite system and archival light-based geolocation telemetry, which are the two major techniques requiring post-processing of raw data. The general framework is, however, not restricted to these tracking techniques and can be adapted to any tracking data. Few studies have evaluated the performance of a model for tracking data because the true position of animals (at sea) has until recently been impossible to determine with better accuracy than with the actual tracking method used. To validate our method, we considered several movement pathways from marine animals, each bringing a possibility of evaluating the accuracy of the method and/or posing a particular analytical challenge, as detailed below:
Dataset 1
The first dataset was composed of 3 ARGOS tracks of adult female northern elephant seals,
Dataset 2
The second dataset is one ARGOS track from one adult female northern elephant seal. This animal was equipped with an ARGOS-only transmitter (SMRU SRDL). Although no GPS data were available for comparison, this track was chosen because the animal ventured into coastal waters of British Columbia (Canada), into a meander of fjords and islands. Elephant seals do not cross islands and do not haul out during their migrations. Therefore, this track must be constrained by coastlines. We used this knowledge by assimilating the Global Self-consistent Hierarchical, High-resolution Shoreline database
Dataset 3
The third dataset is a track obtained from one of the three female elephant seals from Dataset 1, but in this case her track was estimated using the geolocation method. Diurnal patterns of light levels measured via a time series are used to estimate one position per day
Track number | #1 | #2 | #3 |
Duration (Days) | 83.8 | 68.3 | 222.2 |
Number of raw locations | 988 | 652 | 1042 |
Number of raw locations per day | 11.8 | 9.5 | 4.7 |
ARGOS Class 3 (%) | 0.2 | 0 | 0.4 |
ARGOS Class 2 (%) | 1.2 | 0.6 | 0.7 |
ARGOS Class 1 (%) | 2.9 | 1.5 | 2.2 |
ARGOS Class 0 (%) | 11.0 | 7.4 | 8.7 |
ARGOS Class A (%) | 29.9 | 23 | 22.7 |
ARGOS Class B (%) | 40.4 | 49.5 | 53.4 |
ARGOS Class Z (%) | 14.4 | 17.9 | 11.8 |
Percentage of locations removed by speed filter | 52.8 | 64.4 | 55.0 |
Number of filtered locations per day | 5.6 | 3.4 | 2.1 |
Tracks are sorted in decreasing order of quality (based on the number of locations per day). The number of locations in each ARGOS class is given as a percentage.
Field methodologies followed standard procedures approved by the Institutional Animal Care and Use Committee at the University of California (Santa Cruz) and are described elsewhere (see
Animal movement is best described as a time series of movement steps
The model is illustrated in
Based on the raw data (A) and knowledge about their inaccuracy, the first step consists of generating a number of possible locations for each recorded point (B). The distribution of these particles follows a known or estimated error distribution for each point. Based on the likelihood of the speed required to get from a point to a given particle or any other known information if relevant, a weight is assigned to each particle (C). From a starting position, some forward particles will serve as attractors for constructing random walks. These forward particles define a distribution of speed and azimuth from which one random step is selected (D). The repetition of this process generates one random walk. This process is bootstrapped in order to generate many possible random walks (E). From this set of random walks, an average track is calculated. For each position of the average track, an error can be estimated from all of the corresponding locations of the set of random walks (F).
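The sequence of steps above can be illustrated with a minimal sketch. It is not the IKNOS-WALK implementation: it assumes planar coordinates in km, a Gaussian particle scatter (a simplification, since real errors are non-Gaussian), an exponential speed-based weight, and a quantile radius as the error measure. All function names, `sigma_km`, and the weighting function are hypothetical; the 30-minute step, the 12.6 km/h threshold, the 50 particles per record and the 30 walks follow values quoted elsewhere in the text.

```python
import math
import random

def make_particles(fix, sigma_km, n, rng):
    """Scatter n candidate positions around one recorded fix (t_h, x_km, y_km)."""
    _, x, y = fix
    return [(rng.gauss(x, sigma_km), rng.gauss(y, sigma_km)) for _ in range(n)]

def speed_weights(prev_fix, fix, cloud, vmax_kmh):
    """Down-weight particles that would require an implausible speed to be
    reached from the previous recorded fix (assumed weighting function)."""
    t0, x0, y0 = prev_fix
    dt = max(fix[0] - t0, 1e-6)
    weights = []
    for px, py in cloud:
        speed = math.hypot(px - x0, py - y0) / dt
        weights.append(max(math.exp(-(speed / vmax_kmh) ** 2), 1e-12))
    return weights

def random_walk(fixes, clouds, weights, dt_h, vmax_kmh, rng):
    """One walk: at each step draw a forward particle as attractor and move
    toward it at the speed needed to arrive on time, capped at vmax."""
    t0, x, y = fixes[0]
    n_steps = int((fixes[-1][0] - t0) / dt_h + 1e-9)
    path, nxt = [(x, y)], 1
    for k in range(n_steps):
        t = t0 + k * dt_h
        # Move on to the first recorded fix that lies ahead in time.
        while nxt < len(fixes) - 1 and fixes[nxt][0] <= t + 1e-9:
            nxt += 1
        px, py = rng.choices(clouds[nxt], weights=weights[nxt])[0]
        dist = math.hypot(px - x, py - y)
        remaining = max(fixes[nxt][0] - t, dt_h)
        step = min(dist * dt_h / remaining, vmax_kmh * dt_h)
        if dist > 0:
            x += step * (px - x) / dist
            y += step * (py - y) / dist
        path.append((x, y))
    return path

def bootstrap_track(fixes, sigma_km=5.0, n_particles=50, n_walks=30,
                    dt_h=0.5, vmax_kmh=12.6, q=0.99, seed=0):
    """Average many walks; return (mean_x, mean_y, radius) for each step."""
    rng = random.Random(seed)
    clouds = [make_particles(f, sigma_km, n_particles, rng) for f in fixes]
    weights = [[1.0] * n_particles] + [
        speed_weights(fixes[i - 1], fixes[i], clouds[i], vmax_kmh)
        for i in range(1, len(fixes))
    ]
    walks = [random_walk(fixes, clouds, weights, dt_h, vmax_kmh, rng)
             for _ in range(n_walks)]
    track = []
    for pts in zip(*walks):
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        d = sorted(math.hypot(p[0] - mx, p[1] - my) for p in pts)
        radius = d[min(len(d) - 1, math.ceil(q * len(d)) - 1)]
        track.append((mx, my, radius))
    return track

# Three hypothetical fixes: (time in hours, x in km, y in km).
track = bootstrap_track([(0.0, 0.0, 0.0), (2.0, 10.0, 2.0), (4.0, 18.0, 8.0)])
```

In this sketch the spread of the walks, and therefore the radius, grows wherever the particle clouds disagree, which is the behavior the confidence estimate is meant to capture.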
Panel A shows the whole track as obtained using a classical speed filter and the location of the inset panels (black rectangle). From the recorded raw data (B) a set of particles is generated (gray dots in C). Using these particles a number of random walks are computed (D), which allow the calculation of an average track (green line in E) and associated error footprint (gray area in E: the accumulation of all error circles for every step in the model). Red circles in panel E are the locations of highly accurate GPS fixes obtained on a duty cycle for this track.
To create the particles, we used estimated errors from our own static tests of ARGOS data
With this dataset, particles were weighted according to a probability distribution of the local speeds estimated from the four prior and four following recorded points. From these locations, all combinations of speed were calculated, and only the likely ones were kept, i.e. those below a maximum speed threshold set by the user, here 12.6 km/h in accordance with previous study
One characteristic of tracking data (including ARGOS) is that the error of recorded locations is probabilistic. That is, any location has a low probability of being very wrong, independently of its given error (i.e. errors are strongly non-Gaussian
In our simulations, we generated random walks with steps every 30 minutes, which is close to the average duration of a dive in elephant seals
Track quality is not easily defined because it is a combination of location quality and frequency in relation to the animal's speed, and it can be variable within a track record. In our case, reported location qualities were roughly similar in the three tracks (
The box plot depicts the 25th and 75th percentiles around the median, with whiskers extending to the last non-outlier value. Outliers (dots) are observations that lie more than 1.5 times the inter-quartile range beyond either end of the 25th–75th percentile box. Panel B shows the relationship between the model estimate of confidence (99% confidence radius of the modeled positions) and the actual error. The relationship was only calculated using points within the 95% confidence ellipse.
In order to evaluate the difference between our model and classical methods, we applied a speed filter at 12.6 km/h on the raw data
Consistent with relatively low location errors, instantaneous speeds calculated from our model were positively related to the speeds recorded with GPS for all locations that were less than 2 minutes apart (R = 0.638, P<0.001, N = 97). Such a correlation at a small time scale is remarkable given the relative scarcity of ARGOS data. Smoothing the pattern of speed with a moving average over the previous and subsequent 3 points in the record allows speed to be examined at a slightly coarser scale and further improves the fit, yielding a quasi one-to-one relationship (
Smoothing was done using a moving average including the previous and next 3 points.
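As an illustration of this smoothing, the following sketch computes a centered moving average over the previous and next 3 points. The function name is hypothetical, and shrinking the window at the ends of the series is an assumption, since the text does not state how edges were handled.

```python
def centered_moving_average(values, half_window=3):
    """Average each value with up to half_window neighbors on either side.

    Near the ends of the series the window is truncated (an assumption)."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Hypothetical speed series in km/h.
speeds = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
smoothed = centered_moving_average(speeds)
```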
For each average location estimated with the model (i.e. each averaged step) we calculated the 99% confidence radius using the 30 location alternatives. This radius defines a circle that can be used as a standard measure of spatial dispersion. We showed that 75.6, 66.7, and 39.7% of the GPS positions fell within the circle in the 3 tracks respectively. By doubling the radius size, we found that 92.7, 91.7 and 87.1% of the GPS locations fell in the circle footprint. We therefore suggest that twice the 99% confidence radius calculated on the model output can be used as a valid estimate for the “real” 90% confidence error for each step. However, we found that the footprint made by the successive 99% confidence radii was usually larger than the footprint made by the random walks (
The 99% confidence radius was positively related to the actual distance between the model average location and the corresponding GPS location (Radius = 6.0+0.33×Distance, n = 87, R = 0.491, P<0.001,
We ran a sensitivity analysis to assess how many random walks were necessary to obtain a reliable and stable track estimate. Convergence was reached when the average standard deviations of latitude and longitude became stable, indicating that any additional track would not change the location or spatial extent of the track. This was achieved using between 15 and 20 track iterations (
With Dataset 2 we confronted the problem of obstacle avoidance. In our case the seal obviously cannot cross over land, presenting a problem common in tracking studies involving coastal marine species. Solving the obstacle contouring problem in a bootstrapping context requires taking into account the compromise between a satisfactory result and the computing time needed to obtain it. Here, we added several intermediate particles to the track data in order to re-route the path when it cut through the obstacles. The geographic average of these added particles belonged to the convex hull of the coastline polygon crossed, and the particles were selected to achieve the shortest possible path. When several polygons were crossed, the various combinations of convex hull points were processed through Dijkstra's algorithm in order to find the shortest path between the different combinations of possible paths
The black line represents the track from the raw ARGOS data. Darker polygons represent land masses and the light grey background represents water. Note that only the part of the track that was within the islands (British Columbia, North East Pacific, Canada) is shown.
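The re-routing idea used for Dataset 2 can be sketched for the simplest case of a single obstacle. The example assumes planar coordinates, an island already reduced to its convex hull (vertices listed in order), and start and end points lying outside it. The function names are hypothetical, and segments that graze a hull vertex exactly are not treated specially. Dijkstra's algorithm then selects the shortest route through the hull vertices.

```python
import heapq
import math

def _orient(a, b, c):
    """Signed area test: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _crosses(p, q, a, b):
    """True if segments p-q and a-b properly cross (shared endpoints excluded)."""
    return (_orient(p, q, a) * _orient(p, q, b) < 0
            and _orient(a, b, p) * _orient(a, b, q) < 0)

def shortest_detour(start, end, hull):
    """Shortest path from start to end that goes around a convex hull."""
    n = len(hull)
    hull_edges = [(hull[i], hull[(i + 1) % n]) for i in range(n)]

    def clear(p, q):
        return not any(_crosses(p, q, a, b) for a, b in hull_edges)

    nodes = [start, end] + list(hull)      # node 0 = start, node 1 = end
    graph = {i: [] for i in range(len(nodes))}

    def link(i, j):
        w = math.dist(nodes[i], nodes[j])
        graph[i].append((j, w))
        graph[j].append((i, w))

    if clear(start, end):
        link(0, 1)
    for k in range(n):
        link(2 + k, 2 + (k + 1) % n)        # travel along the hull boundary
        for endpoint in (0, 1):
            if clear(nodes[endpoint], hull[k]):
                link(endpoint, 2 + k)

    # Dijkstra's algorithm from node 0 to node 1.
    dist, prev, heap = {0: 0.0}, {}, [(0.0, 0)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == 1:
            break
        if d > dist.get(u, math.inf):
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))

    path, u = [1], 1
    while u != 0:
        u = prev[u]
        path.append(u)
    return [nodes[i] for i in reversed(path)], dist[1]

# Hypothetical square island lying between two consecutive track positions.
island = [(2.0, -1.0), (4.0, -1.0), (4.0, 1.0), (2.0, 1.0)]
path, length = shortest_detour((0.0, 0.0), (6.0, 0.0), island)
```

With several polygons, the same graph would simply include the hull vertices of every polygon crossed, at the cost of more visibility tests.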
Due to differences in the characteristics of light-based geolocation, we treated these data differently. In particular, errors from geolocation data are much higher latitudinally than longitudinally. The geolocation solver produced estimates of errors with some south and north boundaries for each location. We randomly generated particles so that they lay in the ellipse delimited by these boundaries. Because geolocation errors are much larger than for ARGOS data, we generated 500 particles per record instead of the 50 used with ARGOS data (a number chosen empirically). Geolocation data provide one location per day, and SST correction applies to these locations
Fifty percent of the distances between modeled locations and the corresponding GPS locations (n = 41, locations less than 2 minutes apart) were less than 104.8 km (<0.94°) and 90% were less than 199.8 km (<1.80°). The average error was 108.4±66.8 km (0.98±0.60°). Careful examination of the track revealed that larger errors occurred at the end of the track, at locations where SST gradients were weak, and therefore SST correction had little effect (
The black line and dots represent the light-based geolocation data and the blue line and crosses represent the result of the model applied to the ARGOS data instead of the geolocation data (used here as reference since it is much more accurate – see
Bootstrapping random walks generated using forward particles in tracking data is an intuitive, relatively fast and efficient way of handling the caveats associated with current tracking techniques. To our knowledge, this is the first time that a track improvement model technique has been directly validated with animals for which “true” positions are known. Although we have illustrated our approach using ARGOS and geolocation data, the technique is applicable for any remotely sensed movement data for which an estimate of accuracy can be made.
When selecting a method, one must define an acceptable tradeoff between performance and the complexity or computation time required. In the case of tracking data, deciding on acceptable performance also depends on what is usually expected with the type of tracking methodology used. For example, methods can be referred to as “ARGOS” or “Geolocation” to imply common knowledge accuracy, but both methods may give track records of very different qualities depending on the type of animal tracked and the type of tracking device used, etc. For example, diving animals may provide decreased ARGOS and geolocation track qualities compared to non-diving animals
In elephant seals, very long dive durations and very short surface intervals between dives leave few opportunities for the ARGOS tag to transmit signals to overhead satellites, which produces the poorest ARGOS track qualities
Our approach is based on calculating a location from a cloud of weighted particles which can be manipulated as needed. This yields great flexibility, allowing us to apply corrections based on known constraints or data available. For example, if gaps in the data exist, some particles may be added in a wide footprint within the gap, thus allowing the random walks to disperse and increasing uncertainty (i.e. decreasing confidence) where data are absent.
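One possible way to add such gap particles is sketched below. Everything in it is an assumption made for illustration: the function name, the linear interpolation of cloud centers between the two fixes bounding the gap, the Gaussian scatter, and a spread that grows with the time elapsed to the nearest fix (`spread_kmh`).

```python
import random

def gap_particles(fix_a, fix_b, max_gap_h=6.0, n=50, spread_kmh=2.0, seed=0):
    """Return {time_h: [(x_km, y_km), ...]} of extra particles inside a gap.

    fix_a and fix_b are (time_h, x_km, y_km). A cloud is added every
    max_gap_h hours; nothing is added if the fixes are close enough in time."""
    rng = random.Random(seed)
    (ta, xa, ya), (tb, xb, yb) = fix_a, fix_b
    clouds = {}
    k = 1
    while ta + k * max_gap_h < tb - 1e-9:
        t = ta + k * max_gap_h
        f = (t - ta) / (tb - ta)
        cx, cy = xa + f * (xb - xa), ya + f * (yb - ya)
        # The footprint widens with distance in time from the nearest fix.
        sigma = spread_kmh * min(t - ta, tb - t)
        clouds[t] = [(rng.gauss(cx, sigma), rng.gauss(cy, sigma))
                     for _ in range(n)]
        k += 1
    return clouds

# Hypothetical 10-hour gap between two fixes 20 km apart.
extra = gap_particles((0.0, 0.0, 0.0), (10.0, 20.0, 0.0), max_gap_h=2.0, n=10)
```

Because the added clouds are widest in the middle of the gap, walks drawn through them disperse most where the data are least informative.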
We showed that Sea Surface Temperature (SST) data can be assimilated into our model. To our knowledge, this is the first study reporting sub-degree average accuracy with SST-corrected geolocation data. The few studies that had assessed the accuracy of the SST-corrected geolocation method reported accuracies of 1.82°±1.54° (mean±SD) in albatrosses
Interestingly, one study assimilated both SST and bathymetry in order to correct geolocation tracks of gray seals (
Analyzing animals' behavior in relation to environmental characteristics may enable estimation of a confidence metric for each position, which is rarely available with recorded ARGOS data (ARGOS locations of class A, B and Z for example) and not obtainable with speed filter and interpolation methods. This confidence metric is essential for interpreting the behavior of the animal at an appropriate scale. For example, if the confidence radius is 10 km, it is probably inappropriate to interpret movements shorter than this distance. Similarly, the environmental characteristics at a given location may be gathered within the location error footprint rather than under each average location, potentially reducing noise in the data and improving habitat model confidence and determination.
In this paper we used a circle to estimate confidence, but this could be done differently. For example, if the errors are systematically biased towards one direction (latitudinally, as in the case of geolocation data), it would be straightforward to determine another metric based on the output from the random walks. The standard deviation in latitude and longitude, or the maximum distance between the various positions and their average, could be used as well. More complex methods such as the determination of an ellipse (as mentioned earlier) could be used to account for particular spatial structure that might occur in the error footprint. This could allow us to discriminate between the error vectors occurring along the animal path and the error vectors occurring laterally to the animal path. Overall, this shows that our approach is not tied to one scheme but is instead very open to user input and experimentation.
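These alternative measures are simple to compute from the positions that the random walks give for a single time step. The sketch below assumes planar coordinates and uses a hypothetical function name. The quantile radius it returns is only one possible reading of a confidence radius and may differ from the definition used in our analysis.

```python
import math
import statistics

def dispersion_metrics(points, q=0.99):
    """Summarize the spread of the walk positions for one time step.

    points: list of (x, y) positions, one per random walk."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    dists = sorted(math.hypot(x - mx, y - my) for x, y in points)
    idx = min(len(dists) - 1, math.ceil(q * len(dists)) - 1)
    return {
        "radius_q": dists[idx],          # circle containing a fraction q of points
        "max_dist": dists[-1],           # farthest position from the average
        "sd_x": statistics.pstdev(xs),   # spread along the first axis
        "sd_y": statistics.pstdev(ys),   # spread along the second axis
    }

# Hypothetical cloud stretched along the second axis, as with geolocation data.
metrics = dispersion_metrics([(0.0, -10.0), (0.0, 10.0), (1.0, 0.0), (-1.0, 0.0)])
```

For a cloud elongated along one axis, the two standard deviations differ markedly while a single radius hides that structure, which is the motivation for considering an ellipse.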
For most applications a biased random walk approach like ours seems to be an excellent compromise between complexity, computation time, ease of implementation and effectiveness, especially when compared to state-space models (SSM). Some SSM users and developers have themselves considered the approach as a technically difficult statistical framework
A recent attempt to assess SSM accuracy was made using artificial tracks, and reported mean absolute errors to be at best one to two times larger than what we recorded in this study
State-space models have the ability to produce an estimate of the animal's behavioral mode, and this can be seen as a major advantage over our method. The behavioral mode is usually characterized by a certain bearing (or turn) variability and a certain speed
State-space models were presented as a way to use all available information in the tracking data without the need for filtering processes to be performed
Finally, it is important to note that the advantages or drawbacks of using one method or another might also depend on one's initial goals. By fitting a mechanistic model to the data, state-space models are by construction more predictive than our approach. This possibility might be of interest for comparing different mechanistic models of individual movement and therefore for exploring the effects of different behavioral processes on the dispersal of individuals. This cannot be done directly with our approach, but it can be done as a second step, by comparing the output of our method to the output obtained under a given theoretical framework.
Our model uses particles, but it differs from particle sampling filters by the fact that particles are not generated using prior behavioral information
Previous work has shown that simple filtering is wasteful and inefficient and that additional, valuable behavioral information can be extracted from tracking data. To date, this has required methods that are both complicated and time-consuming, resulting in limited application and the potential for analysis errors due to poor understanding. The particle filter model outlined here attempts to improve the quality of tracking data while operating within a framework that is both accessible and efficient. This method improves the accuracy of positions and assigns an estimate of spatial error, facilitating subsequent post-hoc behavioral analyses. In order to share this method with the research community, we will establish a dedicated website to provide source code, examples, and a manual. The package will be known as the IKNOS-WALK program.
The authors are grateful to all field volunteers and assistants for their work in the field, in particular S. Simmons and J. Hassrick. We are grateful to Año Nuevo State Park for allowing access to the elephant seal colonies and to the UC Natural Reserve System for logistical support. Animal handling procedures were approved by the Chancellor's Animal Research Committee, UC Santa Cruz.