# Robotic ecology: Tracking small dynamic animals with an autonomous aerial vehicle

## INTRODUCTION

Conservation management of certain critically endangered species relies on understanding how these species interact with their environment. This is achieved by tagging and tracking individual animals in the wild (*1*–*3*). Aerial robot systems can access rugged areas that are difficult for humans to traverse and thus are viewed as a potentially revolutionary tool for data collection in wildlife ecology (*4*, *5*). However, this potential remains largely unrealized. Robot systems have yet to achieve levels of signal detection, tracking accuracy, and speed that are sufficient to legitimize their role as a replacement for human trackers. Despite recent advances in automated wildlife telemetry tracking, very little is known about the movement of small, dynamic migratory species, of which many have reached critically endangered status. For large to medium animals, the miniaturization of GPS tags with remote data readout has facilitated a marked increase in understanding the movements of a diversity of species (*6*, *7*). Methods such as satellite telemetry have far-reaching applications, from investigating migration routes and wintering areas of large migratory birds (*8*–*10*) to studying the dynamics of aquatic predators (*11*, *12*). Unfortunately, these approaches are still only suitable for about 70% of bird species and 65% of mammal species (*2*). In the case of smaller species that return to the same breeding areas seasonally, miniature nontransmitting data loggers can be used (*2*); however, retrieving these data requires relocating the animals in situ. Because of this challenge, very high frequency (VHF) tracking has become one of the most useful techniques in ecology and management (*13*). This involves attaching small radio transmitters to animals and subsequently tracking the target species. 
Although scientists have been using VHF tracking since the early 1960s (*14*), data yielded by this approach are sparse because of the manual labor involved (*2*) and the need to constantly seek higher elevations to increase signal detectability. Thus, researchers are increasingly using low-cost unmanned aerial vehicles (UAVs) equipped with visual sensors as alternative tools for improving conservation management and wildlife monitoring (*4*, *5*). However, the capability of these systems is limited in terms of identifying individual animals and locating animals in unknown locations. Practical considerations when using remotely piloted aircraft for fieldwork are discussed in (*15*).

In recent years, there has been increased interest in end-to-end wildlife telemetry tracking with robotic systems (*4*), where the robot moves autonomously to track a live target animal. The usefulness of these systems, however, has yet to be proven in direct performance comparison with the traditional manual approach. Ongoing research is aimed at tracking radio-tagged carp in Minnesotan lakes using autonomous surface vehicles on the water and mobile ground vehicles when the lake is frozen (*16*–*21*). Although this project has yielded seminal work in the field, the use of ground and surface vehicles is untenable for wildlife situated in rugged habitats. We recently validated the use of a multirotor UAV for autonomously localizing live radio-tagged birds (the noisy miner, *Manorina melanocephala*) in such environments (*22*). Here, we present validation that robotic systems can perform comparably to experienced human trackers in head-to-head experiments. Our system is illustrated in Fig. 1.

Fig. 1

Swift parrots (*L. discolor*) (**A**) are considered here; noisy miners (*M. melanocephala*) (**B**) were considered in our previous work (*22*). (**C**) The robot was able to track swift parrots and yielded performance comparable with an expert human operator performing traditional wildlife telemetry tracking. The multirotor platform, shown from the front (**D**) and rear (**E**), includes a lightweight directional antenna system and payload that receives the signal strength from the tag (*22*). These data were then transmitted to a GCS for processing and online decision-making.

The majority of research in radio tracking with an aerial vehicle has focused on isolated subsystems. Although these systems were typically motivated by the idea of tracking small animals [e.g., bird (*23*–*26*) and fish species (*27*, *28*)], only simulations or prototypes have been presented with limited field testing. Alternatively, when tracking a relatively stationary target, the observations can be considered more robust, and thus, attention in this field has shifted to optimizing planning for single- (*18*, *20*, *21*) or multirobot (*19*, *29*) systems. The main assumption the authors have made is that the sequential observations are homoscedastic, meaning that the uncertainty over each measurement is constant or bounded. However, with a sporadic and unpredictable live target, this assumption is violated because of the resulting wide spectrum of observation quality from noisy to precise. As we show in this paper, this induces heteroscedastic observations, where the uncertainty varies with every observation. Failing to distinguish between low- and high-quality observations can lead to overconfident measurements that cause spurious location estimates or highly uncertain location estimates that are of little value.

A mathematically valid observation model is critical in planning the motion of the robot to improve the location estimate. In robotics, this general problem is known as active perception (*30*, *31*) and introduces a coupling between data collection and planning. The idea of passively locating transmitting radio sources has been investigated in operations research motivated by search-and-rescue missions, where stationary distress beacons must be recovered rapidly. Hence, the task is a coverage problem solved via offline strategies with an emphasis on minimizing path cost over the entire area or teleoperated by humans (*32*). Alternatively, when the wildlife habitat is known and bounded, sensor networks can be placed to precisely track an animal’s location (*33*, *34*). In our case, we require fast, precise estimates without intervention and thus use active strategies where the observation quality relies crucially on an appropriate sequence of viewpoints (*35*). Our objective is to reduce uncertainty (entropy) of the target location; thus, the task of actively tracking targets falls under the informative path planning paradigm (*36*). This problem is known to be NP-hard (*37*) and has been studied extensively over the past decade (*38*), with many applications focusing specifically on UAVs (*39*–*44*). In this paper, we leveraged these results to obtain an approximately optimal sequence of actions by greedily selecting the most informative viewpoints at each decision step.

In this work, we present rigorous theoretical analysis and in-field validation of a complete system for autonomous wildlife tracking. We show that this system addresses many theoretical and engineering challenges to a degree that is sufficient to match or surpass the performance of skilled human trackers from a location where signals are detectable. First, we provided a mathematical derivation for our data-driven sensor model, which has previously been validated over a number of trials on real birds and stationary targets (*22*). This range-azimuth model was further used to predict the quality of future viewpoints in planning an approximately optimal sequence of observations. We then directly compared this system with human operators in the problem of tracking the critically endangered swift parrot (*Lathamus discolor*) species in the wild. In six of eight trials, the estimated bird locations fell within 50 m of the true locations on average, which was sufficient for recapture, detailed field observation, or data readout. In the remaining two trials, we were unable to establish ground truth position because the target bird moved during the flight and visual confirmation was lost. Moreover, the time taken to achieve these estimates was comparable to, and often faster than, experienced human trackers. Our demonstration validates the concept of a robotic system for wildlife telemetry tracking used as an autonomous or human-assistive device in real field conditions. This milestone motivates further engineering development that may enable more widespread use of autonomous robots in movement ecology and in conservation management of small, dynamic moving species.

## RESULTS

In this section, we report on the approaches used for sensor modeling, data fusion, and decision-making. We then present field trial results, followed by a discussion of insights into the global and local spatiotemporal movements of swift parrots that were gathered from the trials.

The material in this section has its basis in the bearing-only, heuristic approach presented in previous work (*22*). Here, we present a full range-azimuth algorithm derived rigorously from first principles. We also provide variance analysis and proof that the objective function is monotone submodular, an important property that is useful in designing efficient planning algorithms.

### Likelihood functions for observations

The most critical component of the system is the sensor model, which allowed us to convert the signal received from the radio tag to an instantaneous estimate of the target’s location. An inaccurate or overconfident observation can lead to poor decision-making and imprecise final location estimates.

Let **x** denote the vehicle location, **y** the target's geographic coordinates within some workspace S ⊂ *ℝ*^{2}, and **z** an observation in some measurement space H. We are interested in learning the likelihood function ℓ(**y**; **x**, **z**), that is, the probability of receiving the measurement **z** at location **x**, given the target location **y**:

ℓ(**y**; **x**, **z**) = Pr(**Z** = **z** | **X** = **x**, **Y** = **y**) (1)

Note that we could consider uncertainty in the vehicle location **x** by including it in the measurement **z**; however, we assumed full knowledge of the vehicle state in this paper. We use the convention that uppercase symbols (e.g., **Y** and **Z**) are random variables, lowercase symbols (e.g., **y** and **z**) denote their realizations, and bold denotes a set. Further, true (or optimal) quantities are denoted with an asterisk (e.g., **y*** is the true location of the bird), and estimates are denoted with a hat (e.g., **ŷ** is the target estimate).

To construct our sensor models, we must determine what we are measuring and the uncertainty over these measurements. In this work, we take both range and azimuth readings of the target, where both observations are assumed to be normally distributed. This results in each measurement comprising the mean and variance **z** = {μ, σ^{2}}. Given a measurement function *h* that maps the vehicle state **x** and target state **y** to the measurement space H, the Gaussian likelihood function is

ℓ(**y**; **x**, **z**) = *f*(*h*(**x**, **y**); μ, σ^{2}) (2)

where *f* is the probability density function (PDF) of the normal distribution.

### Observed and expected sensor data

To derive our likelihood functions, we first characterized the raw sensor data and described the model used for obtaining our measurements. More details on the system collecting these measurements are given in Materials and Methods.

The radio tag emits an on-off keyed pulse signal; this transmission was received by the payload on board the UAV, and the received signal strength indicator (RSSI) values of the signal were captured and filtered. These RSSI values are linearly related to the power received during a transmission and are used as the raw sensor data for the observation. The range and bearing likelihood functions are based on these raw values and the learned sensor model.

The antenna used on board the UAV was a two-point phased array: a lightweight, unambiguous directional antenna designed for radio telemetry with multirotor vehicles (*22*). The array comprised two monopoles fed through a passive combiner circuit, which yielded a radiation pattern with a front lobe and back null. To reduce noise and spurious readings due to multipath propagation, the UAV remained stationary while yawing through a full rotation. During this rotation, the continuous RSSI values were filtered and sampled at a constant rate to give a scalar value *g*_{n} associated with the bearing of the *n*th value φ_{n}. These values were then transmitted to a base station, giving the recorded gain pattern **g** = (*g*_{1}, *g*_{2}, … , *g*_{N}). As a result, the random vector **G** = **g** is a function of the vehicle **X** and target **Y** location. Further, let *b*(**x**, **y**) denote the bearing from **x** to **y**. The true bearing to the target from robot location **x** is then θ* = *b*(**x**, **y***). We assume that the error for each recorded RSSI value is normally distributed with an unknown variance σ^{2}(θ*) that remains constant throughout an observation, that is, for arbitrary *g*_{n} ∈ **g**

Pr(*G*_{n} = *g*_{n} | Θ = θ*) = *f*(*g*_{n}; **E**[*G*_{n} | Θ = θ*], σ^{2}(θ*)) (3)

where σ^{2}(θ*) = **V**(*G*_{n}|Θ = θ*).

We obtain the expected gain pattern **E**[**G**|Θ] by linear regression. Specifically, we fit the expected gain pattern to a *J*th-order Fourier series φ: *ℝ* → *ℝ*; that is, given the true bearing θ*,

φ(θ*) = *a*_{0} + Σ_{j = 1}^{J} [*a*_{j} cos(*j*θ*) + *b*_{j} sin(*j*θ*)] (4)

From this Fourier model, we obtain the expected gain pattern **φ**(θ) = **E**[**G**|Θ = θ], where **φ**: *ℝ* → *ℝ*^{N} is generated by sampling the Fourier series (Eq. 4) with a phase offset θ at *N* regular intervals, that is, **φ**(θ) = (φ(θ), φ(θ + 2π/*N*), …, φ(θ + 2π(*N* − 1)/*N*)).
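Because the Fourier series in Eq. 4 is linear in its coefficients, it can be fitted with ordinary linear least squares. The sketch below is our own illustration, not the authors' implementation; the function names and synthetic data are assumptions.

```python
import numpy as np

def fit_fourier_gain(bearings, rssi, order=3):
    """Fit a Jth-order Fourier series to RSSI samples g_n recorded at
    bearings phi_n, via ordinary linear least squares.
    Returns the coefficient vector [a0, a1, b1, ..., aJ, bJ]."""
    bearings = np.asarray(bearings, dtype=float)
    cols = [np.ones_like(bearings)]
    for j in range(1, order + 1):
        cols.append(np.cos(j * bearings))
        cols.append(np.sin(j * bearings))
    A = np.column_stack(cols)  # design matrix: [1, cos(j*phi), sin(j*phi)]
    coef, *_ = np.linalg.lstsq(A, np.asarray(rssi, dtype=float), rcond=None)
    return coef

def gain_model(theta, coef):
    """Evaluate the fitted gain pattern phi(theta) at arbitrary bearings."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    out = np.full_like(theta, coef[0])
    order = (len(coef) - 1) // 2
    for j in range(1, order + 1):
        out += coef[2 * j - 1] * np.cos(j * theta) + coef[2 * j] * np.sin(j * theta)
    return out
```

Sampling `gain_model` at *N* regular offsets then yields the expected gain pattern **φ**(θ) used in the correlation step.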

Given the expected and observed sensor output, φ and **g**, the main goal of Bayesian sensor data fusion is to compute PDFs of the bearing and range to a target from the robot. Given that the likelihoods are assumed to be Gaussian, the measurement tuple **Z** equals {μ(**G**), σ^{2}(**G**)}. To learn the mapping from **G** to **Z**, we use a data-driven approach described in the following subsection.

### Azimuth likelihood function

We model the likelihood of each azimuth measurement with a Gaussian bearing-error model (*45*), where **z**_{Θ} = {μ_{Θ}, σ^{2}_{Θ}}. That is, the difference between the true bearing to the target θ* and the estimated bearing μ_{Θ} (i.e., the bearing error) is Gaussian distributed. The bearing estimate μ_{Θ} and its variance σ^{2}_{Θ} are not measured directly but are instead given as functions of observation quality (i.e., the correlation coefficient, discussed below). As a result, when **G** = **g**, the bearing-error likelihood function ℓ_{Θ} is given by

ℓ_{Θ}(**y**; **x**, **z**_{Θ}) = *f*(*b*(**x**, **y**); μ_{Θ}(**g**), σ^{2}_{Θ}(**g**)) (5)

Now, given our model φ of the gain pattern, our problem becomes that of inverse regression to find the expected bearing and uncertainty. The Gaussian bearing-error assumption states

Θ | **G** = **g** ~ N(μ_{Θ}(**g**), σ^{2}_{Θ}(**g**)) (6)

where μ_{Θ}(**g**) = **E**[Θ | **G** = **g**] and σ^{2}_{Θ}(**g**) = **V**(Θ | **G** = **g**). We find the expected azimuth by minimizing the sum of squares of the residuals, that is,

μ_{Θ}(**g**) = arg min_{θ ∈ [0,2π)} Σ_{n = 1}^{N} (*g*_{n} − φ_{n}(θ))^{2} (7)

where φ_{n}(θ) is the *n*th element of **φ**(θ).

To infer the variance **V**(Θ|**g**) for a given signal **g**, we note that the collection of {**G**} is heteroscedastic, that is, the conditional variance can change with each observation. This is shown in the scattergram in Fig. 2, where the bearing error is plotted against observation quality (correlation). Sensor data used to construct this example were collected from a stationary radio tag as described in our previous work (*22*). We assume that this unexplained variance is due to hidden causes of observation noise, such as the target animal moving during a measurement or spurious recordings due to multipath interference. In typical regression, heteroscedasticity is considered undesirable and is reduced by introducing more regressors or nonlinear transformations of the existing variables. In our case, given that this knowledge is hidden, we cannot introduce more variables and instead marginalize out this quantity to infer the conditional variance from data. Below, we show how the coefficient of determination expresses the proportion of variability in our model (i.e., the heteroscedasticity is attributed to bearing error).

Fig. 2

(**Top row**) Two example observations taken online with a stationary target. The radial plots illustrate real RSSI readings (green line) **g** and a third-order Fourier series model φ(θ) of the radiation pattern (black line). The model is offset (rotated) such that it is oriented toward the true bearing to the target θ*, and the RSSI values are offset by the maximum-correlation bearing estimate μ_{Θ}(**g**) = arg max_{θ} *r*_{φ(θ),**g**}. These offsets are illustrated with dashed green and black radial lines. (**Left**) The maximum correlation coefficient *r*_{φ(μ_{Θ}),**g**} maps to a bearing-error variance σ^{2}_{Θ}(**g**), which is illustrated in the grid plots below. (**Right**) The maximum RSSI value *g*_{max} maps to an expected range μ_{R}(**g**) with a fixed range-error variance σ^{2}_{R}, giving the associated grid plots below.

In the context of regression, we can obtain the fraction of variance unexplained (FVU) for a response variable through the coefficient of determination. In linear regression, where we have the sample variance *s*^{2}_{G} as an estimate of the population variance **V**(*G*), the FVU is given by the sample correlation coefficient *r*^{2}:

**V**(*G*|Θ)/**V**(*G*) = 1 − *r*^{2} (8)
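Computing the FVU from paired observed and model-predicted values is a one-liner; the following minimal illustration is our own (`fraction_variance_unexplained` is a hypothetical helper name, not from the paper).

```python
import numpy as np

def fraction_variance_unexplained(observed, predicted):
    """FVU via the coefficient of determination: 1 - r^2, where r is the
    sample correlation between observed values and model predictions."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return 1.0 - r * r
```

A perfect linear relationship gives an FVU of zero; added noise pushes it toward one.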

However, we are interested in the bearing variance **V**(Θ|**G**), which we can approximate from the model variance **V**(φ(Θ)|**G**) by Taylor expansion. Recall that our estimate μ_{Θ}(**G**) is a function of the random vector **G**. We can approximate the variance of this mapping via a first-order Taylor expansion (*46*),

**V**(Θ|**G**) ≈ ∇μ_{Θ}(**E**[**G**])^{T} **Σ** ∇μ_{Θ}(**E**[**G**]) (9)

Now, because the measurement **G** comprises independent and identically distributed variables *G*_{n}, the covariance matrix is given by **Σ** = **V**(*G*|Θ)**I**_{N}, where **I**_{N} is the identity matrix. This gives the conditional variance in Eq. 9 as

**V**(Θ|**G**) ≈ **V**(*G*|Θ) ‖∇μ_{Θ}(**E**[**G**])‖^{2} (10)

Because small changes in each realization of *G* will introduce small changes in μ_{Θ}, the variance in Eq. 10 is approximately linear for low noise ν_{G}; however, the approximation becomes worse as ν_{G} becomes large. By using the coefficient of determination (Eq. 8), we can express the variance of a given sensor reading **g** in Eq. 10 as

σ^{2}(**g**) ≈ (1 − *r*^{2}_{φ(μ_{Θ}),**g**}) *s*^{2}_{G} ‖∇μ_{Θ}(**E**[**G**])‖^{2} (11)

Thus, σ^{2}(**g**) can be expressed as a function of *r*^{2}_{φ(μ_{Θ}),**g**}.

In practice, we regress only on σ^{2}(**g**), assuming that the variable is a piecewise continuous function of the explanatory variable *r*_{φ(μ_{Θ}),**g**}. We can also determine the azimuth μ_{Θ}(**g**) for each measurement **g** by the correlation coefficient *r*_{φ(θ),**g**}. That is, after each observation, the recorded gain pattern is correlated against the model **φ**(θ) with regular phase offsets θ, and the lag that corresponds to the maximum correlation then gives the estimated angle of arrival; that is, Eq. 7 becomes μ_{Θ}(**g**) = arg max_{θ ∈ [0,2π)} *r*_{φ(θ),**g**}. This process of obtaining an azimuth observation is illustrated on the left of Fig. 2, and example likelihood functions from one trial can be seen in Fig. 3.

Fig. 3

The distributions shown are spatially discrete grids over a 750-m^{2} area (with grid lines every 100 m for illustrative purposes only). (**Left** to **right**) The bearing-only likelihood function ℓ_{Θ}, the range-only likelihood function ℓ_{R}, the combined likelihood function ℓ, and the posterior belief *p*(·). In each column, the first observation (*k* = 1) is shown in the bottom grid, the last observation (*k* = 4) is at the top, and higher probability mass is represented as darker, raised regions. The UAV location **x**_{k} is indicated by a green dot, the target location **y*** in purple, and the maximum likelihood estimate **ŷ**_{k} in yellow.

### Range likelihood function

Next, we estimate the distance to the target using a Gaussian range-error model, where **z**_{R} = {μ_{R}, σ^{2}_{R}}. The range errors are assumed to be logarithmic, as discussed below. Furthermore, unlike the bearing observations, the scattergram in Fig. 2 does not indicate that the noise is heteroscedastic; that is, the variance is constant for each observation. This yields the likelihood function

ℓ_{R}(**y**; **x**, **z**_{R}) = *f*(log *d*(**x**, **y**); log μ_{R}(**g**), σ^{2}_{R}) (12)

In general, range measurements in cluttered environments can be highly imprecise due to multipath interference. We anticipate the vehicle to be deployed in similar environments and estimate the variance under these conditions. Although the error in range measurements can be substantial, including such observations is still useful. Because the noise is homoscedastic, we can rely on range measurements to provide an approximate location. The ability to focus on an approximate location is particularly beneficial when the search area would otherwise be expansive, such as in tracking scenarios where there is little prior knowledge of the target’s location and when bearing uncertainty is high.

We are interested in mapping the sensor output **g** to the distance between transmitter and receiver. Because of atmospheric interactions, the signal amplitude will decrease with range. We denote *d*(**x**, **y**) as the Euclidean distance between our receiver **x** and the transmitter **y**. Then, the received power *p*_{r} is a function of the transmitted power *p*_{t} and the attenuation per meter α (*47*):

*p*_{r} = *p*_{t} *e*^{−α*d*(**x**,**y**)} (13)

In Eq. 13, we have assumed that *p*_{r} and *p*_{t} take into account the link budget, which characterizes all gains and losses in the telecommunication system. Most of these components are fixed for a given system (e.g., transmitter and receiver losses); however, for a directional antenna, the gain relative to the average radiation intensity (the isotropic directivity) depends on the immediate angle of arrival, φ_{n}. As a result, the RSSI values *g*_{n} are a function of the received power *p*_{r} and angle of arrival φ_{n}. The isotropic directivity is approximately constant if we take the maximum RSSI value *g*_{max} = max_{n}*g*_{n}. Thus, we use the value *g*_{max} to estimate distance.

Now, let the true distance to the target be *r** = *d*(**x**, **y***) and its estimate μ_{R}(**g**) be a function of **g**. From the above discussion and Eq. 13, *p*_{r} = *p*(*g*_{max}) for some linear function *p*: *ℝ* → *ℝ*. Moreover, *r** is a function of log *p*(*g*_{max}), and the Gaussian range-error assumption may be expressed as

log *r** − log μ_{R}(**G**) ~ N(0, σ^{2}_{R}) (14)

where μ_{R}(**g**) = **E**[*R* | **G** = **g**]. We thus obtain the estimated range μ_{R}(**g**) as

μ_{R}(**g**) = β_{1} log *p*(*g*_{max}) + β_{0} (15)

The function μ_{R}(**G**) can be fitted to a first-degree polynomial function of log *g*_{max}. The variance σ^{2}_{R} is estimated by the sample variance *s*^{2}_{R}. The procedure for obtaining a Gaussian range-error observation is illustrated on the right of Fig. 2, and example range likelihood functions can be seen in Fig. 3.

### Bayesian data fusion

Given a likelihood function that combines the bearing and range observations under an independence assumption,

ℓ(**y**; **u**) = ℓ_{Θ}(**y**; **x**, **z**_{Θ}) ℓ_{R}(**y**; **x**, **z**_{R}) (16)

we can combine numerous observations to determine the most likely position of the target animal. To achieve this, we use Bayesian data fusion, assuming independent observations; this process is illustrated in Fig. 3.

By time *t*, we have obtained observations of the target animal at a set of times 0 ≤ *t*_{1} ≤ … ≤ *t*_{k} ≤ *t*. The above notation is extended to include the observation number *k*. Denote **Y**_{k} = **Y**(*t*_{k}) as the target’s location at time *t*_{k} and, similarly, **Z**_{k} = **Z**(*t*_{k}) is the observation taken at this time. Moreover, let **U**_{k} = {**X**_{k}, **Z**_{k}}; denote the measurement process as **Z**_{1:k} = (**Z**_{1}, …, **Z**_{k}) and the values it takes on as **z**_{1:k} = (**z**_{1}, …, **z**_{k}). We are ultimately interested in knowing the probability of the target’s state after all *K* observations, that is, the posterior belief (*36*, *45*),

*p*(*t*_{k}, **y**_{k}) := Pr(**Y**_{k} = **y**_{k} | **Z**_{1:k} = **z**_{1:k}) (17)

Further, we assume that the target can transition between observations such that **y**_{k} = **y**_{k−1} + **ν**_{Y} with **ν**_{Y} ~ N(**0**, Σ_{Y}) for some covariance Σ_{Y}. This leads to the transition density

Pr(**Y**_{k} = **y**_{k} | **Y**_{k−1} = **y**_{k−1}) = *f*(**y**_{k}; **y**_{k−1}, Σ_{Y}) (18)

Computing the posterior belief (Eq. 17) becomes simpler if the process (**Y**(*t*))_{t≥0} is assumed to be Markovian and each observation **Z**_{k} only depends on **Y**_{k}, that is, Pr(**Z**_{k} | **Y**_{1:k}, **Z**_{1:k−1}) = Pr(**Z**_{k} | **Y**_{k}). Because the likelihood function in Eq. 16 is defined this way, recursive Bayesian filtering (*45*) can be used to update the belief. That is, the posterior belief is computed as

*p*(*t*_{k}^{−}, **y**_{k}) = ∫ *d***y**_{k−1} Pr(**Y**_{k} = **y**_{k} | **Y**_{k−1} = **y**_{k−1}) *p*(*t*_{k−1}, **y**_{k−1}) (19)

*p*(*t*_{k}, **y**_{k}) = η ℓ(**y**_{k}; **u**_{k}) *p*(*t*_{k}^{−}, **y**_{k}) (20)

where η is a normalization constant such that ∫ *d***y**_{k} *p*(*t*_{k}, **y**_{k}) = 1 and ℓ(**y**_{k}; **u**_{k}) is the likelihood function (Eq. 16). The first step, Eq. 19, gives a motion update, and the second step, Eq. 20, gives the information update to obtain a new belief of the target location (*45*).

Early approaches to recursive Bayesian filtering focused on Gaussian implementations because of convenient analytical solutions to computing the posterior belief in Eq. 17, for example, Kalman filters (KFs) and extensions such as the unscented and extended KFs. However, these methods are approximations to the nonlinear, non-Gaussian Bayesian filter (shown in Eqs. 19 and 20). Grid-based filtering allows for resolution-complete recursive estimation (*45*, *48*) and can be computed in reasonable time over our workspace. Thus, we represent our workspace S as an *I* × *J* grid in *ℝ*^{2}.

The evolution model, Eq. 19, is functionally equivalent to Gaussian convolution. Further, given our grid-based workspace S, this convolution is simply a Gaussian blur, a spatial (low-pass) filter commonly used in image processing. To efficiently implement this model, we leverage results from computer vision for convolution and use a separable Gaussian kernel of width 3|Σ_{Y}|.
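A minimal grid-based implementation of the motion and information updates might look as follows; this is our own sketch (the kernel truncation radius and function names are assumptions), with the motion update realized as a separable Gaussian blur as described above.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Separable 1D Gaussian kernel used for the motion (diffusion) update."""
    if radius is None:
        radius = max(1, int(3 * sigma))  # truncate at ~3 sigma (assumption)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def motion_update(belief, sigma):
    """Motion update on a grid: convolve the belief with a separable
    Gaussian along rows and then columns, and renormalize."""
    k = gaussian_kernel_1d(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, belief)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out / out.sum()

def information_update(belief, likelihood):
    """Information update on a grid: pointwise product with the
    likelihood, renormalized (the normalization plays the role of eta)."""
    post = belief * likelihood
    return post / post.sum()
```

Each new observation is folded in by one `motion_update` followed by one `information_update`, exactly mirroring the two-step recursion.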

Last, we require an estimate of the location of the target **ŷ**_{k} given the posterior *p*(*t*_{k}, **y**_{k}). Two obvious choices are the expected value of the posterior, **E**[**Y**_{k}] = ∫ *d***y**_{k} **y**_{k} *p*(*t*_{k}, **y**_{k}), or the maximum a posteriori probability (MAP) estimate, **ŷ**_{k} = arg max_{**y**_{k}} *p*(*t*_{k}, **y**_{k}). The MAP estimate performed marginally better in preliminary trials; however, in practice, the target does not remain stationary, and so we instead maximize recursively over all posteriors:

**ŷ**_{k} = arg max_{**y** ∈ {**ŷ**_{k−1}, arg max_{**y**_{k}} *p*(*t*_{k}, **y**_{k})}} *p*(*t*_{k}, **y**)

In this way, the location estimate likelihood is strictly increasing.

### Decision-making by information gain

Our overall objective is to know where our target animal is and with what certainty. Thus, the problem can be considered under the framework of information gathering (*36*). To quantify uncertainty, we use Shannon entropy, a standard measure for this purpose. The conditional (Shannon) entropy of a random variable *V*, given another variable *W*, quantifies the uncertainty over the outcomes of *V* in the context of *W*; mathematically, this is given by *H*(*V*|*W*) = −**E**[log Pr(*V*|*W*)].

In this context, we aim to choose a sequence **u**_{1:K} = {**x**_{1:K}, **z**_{1:K}} of state-measurement pairs such that the final entropy of the belief *H*(**Y**_{K}) is minimized. That is, letting **Y**_{K} be distributed according to the posterior belief *p*(*t*_{K}, **y**_{K}) and fixing the measurement space H, the objective can be stated as

**u***_{1:K} = arg min_{**u**_{1:K}} *H*(**Y**_{K}) (21)

where the posterior belief *p*(*t*_{k}, **y**_{k}) is a function of the robot position and measurements **u**_{1:k}.

However, it is more convenient to consider the equivalent problem of maximizing the information gain of each observation. Let **Ỹ**_{k} be distributed according to the target belief after the motion update step, that is, Eq. 19. The information gained in taking the action **U**_{k} = **u**_{k} is quantified by the mutual information *I*(**Ỹ**_{k}; **u**_{k}) between the posterior and the prior belief:

*I*(**Ỹ**_{k}; **u**_{k}) = *H*(**Ỹ**_{k}) − *H*(**Ỹ**_{k} | **U**_{k} = **u**_{k}) (22)

Decomposing Eq. 22 using the chain rule, the entropy minimization problem defined in Eq. 21 can be expressed as

**u***_{1:K} = arg max_{**u**_{1:K}} Σ_{k = 1}^{K} *I*(**Ỹ**_{k}; **u**_{k}) (23)

The objective of Eq. 23 is equivalent to entropy minimization and is, in general, nonconvex and analytically intractable. However, the mutual information given in Eq. 22 is monotone submodular, and thus, the quality of the solution provided by a greedy algorithm is at least 63% of optimal (*37*). That is, given a deterministic greedy algorithm that selects the action

**û**_{k} = arg max_{**u**_{k}} *I*(**Ỹ**_{k}; **u**_{k}) (24)

at each decision step, the resulting path **û**_{1:K} is within a constant factor of optimal of the objective shown in Eq. 23, that is,

Σ_{k = 1}^{K} *I*(**Ỹ**_{k}; **û**_{k}) ≥ (1 − 1/*e*) Σ_{k = 1}^{K} *I*(**Ỹ**_{k}; **u***_{k}) (25)

Furthermore, this is the most efficient algorithm to obtain such a bound unless P = NP (*37*).
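The greedy decision step can be sketched as follows: for each candidate viewpoint, form the posterior that its expected observation would induce and pick the viewpoint with the largest entropy reduction. This is our own abstraction of the procedure; the candidate representation and the `expected_likelihood` callback are assumptions, not the authors' interfaces.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a discretized belief grid."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def greedy_viewpoint(prior, candidates, expected_likelihood):
    """Choose the candidate viewpoint whose expected observation yields the
    largest reduction in belief entropy (i.e., the largest information gain).
    expected_likelihood(s, prior) -> likelihood grid for viewpoint s."""
    h0 = entropy(prior)
    best_s, best_gain = None, -np.inf
    for s in candidates:
        lik = expected_likelihood(s, prior)
        post = prior * lik
        post /= post.sum()            # normalize the hypothetical posterior
        gain = h0 - entropy(post)     # expected information gain at s
        if gain > best_gain:
            best_gain, best_s = gain, s
    return best_s, best_gain
```

Running this once per decision step implements the greedy rule; an uninformative (flat) likelihood yields zero gain and is never preferred over a peaked one.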

Optimizing each observation **U**_{k} is constrained in that only the vehicle locations **x**_{k} ⊂ **u**_{k} can be selected and, consequently, only the expected information gain at each sample **s** can be computed; that is, we choose future waypoints **x**_{k} such that

**x̂**_{k} = arg max_{**s** ∈ S} **E**[*I*(**Ỹ**_{k}; {**s**, **Z**_{k}})] (26)

As mentioned above, we assume independent errors in the likelihood functions shown in Eq. 16, giving Pr(**Z**) = Pr(**Z**_{Θ}) Pr(**Z**_{R}). However, even solving for independent priors requires inverting all possible distributions at all sample locations **s** ∈ S; this is generally intractable.

As an efficient alternative, we assume that the target location for the next observation is the maximum likelihood position after the motion update, that is, **ỹ**_{k} = arg max_{**y**_{k}} *p*(*t*_{k}^{−}, **y**_{k}). As a result, for a fixed viewpoint **s**, the expected range measurement *d*(**s**, **ỹ**_{k}) and expected bearing measurement *b*(**s**, **ỹ**_{k}) to the target are given. Moreover, the expected bearing variance σ̄^{2}_{Θ} is given by marginalizing out **G**, that is, σ̄^{2}_{Θ} = **E**[σ^{2}_{Θ}(**G**)] (in radians). In this case, the expected observation is a function of the viewpoint **s**, **z̄**(**s**) = {*b*(**s**, **ỹ**_{k}), σ̄^{2}_{Θ}, *d*(**s**, **ỹ**_{k}), σ^{2}_{R}}, and the optimization over potential viewpoints **s** from Eq. 26 becomes

**x̂**_{k} = arg max_{**s** ∈ S} *I*(**Ỹ**_{k}; {**s**, **z̄**(**s**)}) (27)

To reduce computation time, instead of sampling every location **s** in the workspace S as indicated in Eq. 27, we simply sample a uniformly distributed subset. Given the stochastic nature of observations, this does not appear to affect the quality of the planner.

### Evaluating the performance of the system

To validate our approach in real field conditions, we compared the performance of the robotic system to human tracker performance in locating swift parrots in the wild. The box plot in Fig. 4 collates the tracking performance from eight flights at four different sites near Temora, New South Wales, Australia. The data for individual trials are presented in table S1 for further detail. At each site, we obtained the GPS trajectory of a novice and an expert tracker performing manual wildlife telemetry. Once the human tracker had established the true location **y** for the target through visual confirmation, the UAV began its flight trial. The procedure for human trackers locating the bird is provided in Materials and Methods. After confirmation of the target by the human, a flight trial was performed for each type of tracker (novice and expert). Thus, we obtained two tests of the robot system at each site with a known true bird location.

Fig. 4

We performed two flights at each of four trial sites (eight flights in total). The box plot illustrates the estimate errors (on the *y* axis) for both the robot (green and blue) and the human (white and gray) trackers as a function of the observation number (on the *x* axis). The blue boxes labeled “Robot (certain)” indicate scenarios where the bird remained stationary during the trial and the final location was known.

For fair comparison, both the robot and the humans began trials from the same initial coordinates, with the target animal location unknown. This starting location was chosen such that the radio signal was strong enough to be measured by the onboard payload. To quantify performance, we compared the error of the robotic tracker’s estimate **ŷ**_{k} against the Euclidean distance between the human tracker’s GPS position at time *t*_{k} and the final ground truth **y**. Specifically, in Fig. 4, the *y* axis shows (i) the error in the robot system’s target estimate **ŷ**_{k} for each observation *k* (in green and blue) and (ii) the Euclidean distance between the human tracker’s location at time *t*_{k} and their final estimate (in white and gray). The *x*-axis labels provide the observation number (e.g., observation 1) and the time interval during which that observation was taken for all flights (e.g., 46 to 60 s). Each data point in the box plot thus represents an observation from one of the eight trials. In two of the flights, the final location was uncertain because the bird moved during the flight and visual confirmation was lost. The case where data from these two flights were removed is labeled “Robot (certain)” (blue boxes).

The results shown in Fig. 4 and table S1 indicate that the robot was able to approximate the location of the target species in less time than human trackers. Specifically, by the second observation (between 143 and 289 s), the robot successfully located the bird to within 55 m in five of the trials (see table S1). In each of these five trials, the human tracker error was greater than the robot’s. Moreover, the final position estimates had an average error of 51.47 m and were obtained in fewer than five observations (about 10 min). Table S1 indicates that the observation error does not monotonically decrease over time (e.g., in trial 7). This is not surprising because of the various environmental conditions that can cause observation noise; these include the bird moving during readings as well as strong winds interfering with the UAV remaining stationary while spinning. In future work, we aim to be able to detect these conditions to minimize the effect of such noise.

### Ecological significance of trials

The quantitative data from our trials demonstrate the potential of our system to shed light on the movements of small, highly mobile animals such as swift parrots. The Temora region was chosen because, based on a small number of sightings, it was assumed that numerous swift parrots had migrated to the area in the weeks leading up to the trial (see Materials and Methods). The results in this paper were obtained over a 7-day trial in the region, and the posterior estimates from all flights were aggregated to yield the heatmap shown in Fig. 5. The figure shows that the flocks used two distinct areas for foraging and roosting, including sites where the species had not previously been recorded.

Fig. 5

A map of southeast Australia, where a majority of swift parrots have been sighted. Inset: Heatmap illustrating aggregated posterior distributions from our trial in July 2017. The posterior distributions of all trials were normalized and aggregated to give an indication of the most likely foraging and roosting areas. White symbols indicate locations where ground truth data (confirmed by visual sightings) were available; each tag has a different symbol: ○, ×, +, or *. Map data: Google, DigitalGlobe, CNES/Airbus.

Swift parrots are small, critically endangered migratory birds that depend on highly variable winter nectar resources. As a result, the small population (less than 2000 birds) spreads across vast areas of southeastern Australia each year in search of suitable food. Given their small body and hence tag size, as well as their capacity for highly variable and large movements, this species has not been successfully radio-tracked.

Science Robotics