Multiple fish tracking with an NACA airfoil model for collective behavior analysis
- Kei Terayama^{1},
- Hitoshi Habe^{2} and
- Masa-aki Sakagami^{3}
https://doi.org/10.1186/s41074-016-0004-1
© The Author(s) 2016
Received: 28 April 2016
Accepted: 17 June 2016
Published: 2 August 2016
Abstract
We propose a visual tracking method with an NACA airfoil model for dense fish schools in which occlusions occur frequently. Although much progress has been made in tracking multiple objects, tracking individuals remains challenging because of factors such as occlusion and variation in target appearance. In this paper, we first introduce an NACA airfoil model as a deformable appearance model of fish. For occluded fish, we estimate their positions, angles, and postures with template matching, using simulated annealing to optimize the parameters efficiently. To improve tracking performance, we repeatedly run the parameter estimation algorithm forwards and backwards through the video. We prepared two real fish scenes in which the average number of fish per frame is over 25 and fish overlap one another more than 50 times. Experimental results on these scenes show that our method tracks fish more reliably than a tracking method based on a mixture particle filter. Over 75 % of the fish in each scene were tracked throughout the scene, and the average difference between estimated and ground-truth positions is less than 4 % of the mean body length of the school.
1 Introduction
The tracking of multiple fish in a tank to measure their behaviors has many important applications in various fields of natural science, such as animal behavior and neuroscience [1–3]. Automatic surveillance of fish in aquariums and fish farms is also important for observing the growth and health of fish in order to improve their survival rate.
Terayama et al. tracked multiple fish in such a dense school using their appearance model based on the images of fish in a video [11]. They showed that if the number of fish in a cluster of fish is known, their positions and other parameters can be estimated by matching all of the combinations of the possible parameters. However, their algorithm is quite slow because of the number of their parameter combinations, and their model is not parameterized.
In this paper, we propose a novel multiple fish tracking method for a dense school of fish. First, we introduce a parameterized appearance model based on the NACA0012 airfoil model^{1}, which has been adopted in biomechanics and computational fluid dynamics research, e.g., in [12], to represent a fish body. The model is simple, yet it can effectively represent the deformations of fish caused by tail beating with fewer parameters than the models in [7–9]. The results of our experiments, in which two types of swimming event were easily detected, show the effectiveness of this model. Second, we propose a practical tracking method that estimates the parameters of fish within a realistic time by using simulated annealing (SA) [13]. The approach to parameter estimation is based on that in [11], whose exhaustive search over parameter combinations is impractically slow. Since it is difficult to estimate the number and positions of fish in a cluster, the proposed method begins by tracking only isolated fish that do not overlap with others. Consequently, a fish that is occluded at first and becomes isolated only in the middle of the video cannot be tracked from the beginning of its trajectory. Finally, to deal with this problem, we propose a forward-backward tracking algorithm. This algorithm corresponds to manual tracking, in which we track fish in a cluster of overlapping fish by playing the video forward and backward repeatedly.
In the rest of the paper, we describe our tracking method for multiple fish in Section 2. We show the results of experiments using movies recorded in an aquarium and our event detection results in Section 3. Finally, we summarize this paper and state the plan for future work in Section 4.
2 Our method
2.1 Appearance model of fish
where the parameters A, λ, c, x, and ϕ represent the maximum amplitude, wavelength, phase velocity, position from the head (as shown in Fig. 3 a), and phase of one beat cycle, respectively. For each scene, we first calculate the average brightness of the fish and construct 92 normal appearance models based on the NACA0012 model and Eq. (1), varying A and ϕ and filling each shape with that brightness. We call these models the NACA model. Figure 3 b, c shows examples of the NACA model. We set λ to 2 and c to 2 and vary A from 0.01 to 0.3 and ϕ from 0 to 1. To deal with large deformation (bending), we add some strongly deformed models based on h(x,ϕ) with a large amplitude. Figure 3 d shows an example of a strongly deformed model.
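As an illustration of how such a template might be generated, the sketch below evaluates the standard NACA four-digit half-thickness formula for NACA0012 and rasterizes the profile around a midline. The function names, canvas size, and the `midline` callback are assumptions made for this example. In particular, `midline` stands in for h(x,ϕ) of Eq. (1) and is supplied by the caller rather than reproduced here.

```python
import numpy as np

def naca0012_half_thickness(x):
    """Half-thickness of a NACA0012 section at chordwise position x in [0, 1]."""
    x = np.asarray(x, dtype=float)
    t = 0.12  # "12" in NACA0012: maximum thickness is 12 % of the chord
    return 5.0 * t * (0.2969 * np.sqrt(x) - 0.1260 * x - 0.3516 * x**2
                      + 0.2843 * x**3 - 0.1015 * x**4)

def render_model(midline, length_px=60, brightness=200, canvas=(41, 81)):
    """Rasterize a fish-shaped template: a NACA0012 profile around a midline.

    midline(x) is a caller-supplied lateral offset in body lengths. It is a
    placeholder for h(x, phi); its exact form is an assumption of the caller.
    """
    rows, cols = canvas
    image = np.zeros(canvas, dtype=np.uint8)
    x = np.linspace(0.0, 1.0, length_px)        # position from the head
    centre = rows / 2.0 + midline(x) * length_px
    half = naca0012_half_thickness(x) * length_px
    col0 = (cols - length_px) // 2              # centre the body horizontally
    yy = np.arange(rows)[:, None]               # row index of every pixel
    inside = np.abs(yy + 0.5 - centre[None, :]) <= half[None, :]
    image[:, col0:col0 + length_px][inside] = brightness
    return image

# A straight (gliding-like) body: zero lateral offset along the whole length.
straight = render_model(lambda x: np.zeros_like(x))
```

A bank of templates could then be built by calling `render_model` once per (A, ϕ) pair with the corresponding midline.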
2.2 Multiple fish tracking with simulated annealing
Initially, we track only isolated fish, because it is difficult to estimate the number of fish in a cluster of overlapping fish. When two or more fish tracked by our method begin to overlap, we estimate their parameters using SA, matching the observed overlapped image against an image drawn from their parameters with the NACA model. The details of our tracking algorithm are as follows.
Table 1 Parameters of our method
Parameter | Units | Value explored |
---|---|---|
Position (x, y) | Pixels | The entire image |
Direction angle | Degrees | Omnidirectional |
A (amplitude) | | 0.01, 0.04, 0.07, 0.10, 0.15, 0.22, 0.30 |
ϕ (phase of beat cycle) | 0.05 | 0–1 |
Length scale | 1 % | 75–150 % |
Thickness scale | 1 % | 70–130 % |
For each frame t in a scene, we first binarize the frame and extract fish candidate regions (FCRs) from the binarized image, as shown in image (ii) in Fig. 2 a. To each FCR, we assign fish IDs from the tracking results of the previous frames by calculating the minimum of the similarity^{2} between the FCR and all tracked fish. Image (iii) in Fig. 2 a shows examples of the assignment of IDs to each FCR. If no IDs are assigned to an FCR and its area is in the range [a_{l}, a_{m}], we assign a new ID and begin tracking the FCR as a new fish. If there is little or no overlap between an FCR and any image drawn from the FPs, we do not assign IDs to the FCR and terminate tracking.
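The region-extraction step can be sketched as follows. This is a minimal illustration using only NumPy, not the authors' implementation: the threshold, the assumption that fish are brighter than the background, the use of 4-connectivity, and the names `extract_fcrs`, `area_min`, and `area_max` (standing in for a_{l} and a_{m}) are all choices made for this example.

```python
from collections import deque
import numpy as np

def extract_fcrs(frame, threshold, area_min, area_max):
    """Binarize a grayscale frame and return its connected regions.

    Regions whose area lies in [area_min, area_max] are flagged as having
    the size of a single, isolated fish (candidates for starting a track).
    """
    binary = frame > threshold           # assumption: fish brighter than background
    labels = np.zeros(frame.shape, dtype=int)
    regions = []
    for seed in zip(*np.nonzero(binary)):
        if labels[seed]:
            continue                     # pixel already belongs to a region
        label = len(regions) + 1
        labels[seed] = label
        queue, pixels = deque([seed]), []
        while queue:                     # breadth-first flood fill, 4-connectivity
            r, c = queue.popleft()
            pixels.append((r, c))
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < frame.shape[0] and 0 <= nc < frame.shape[1]
                        and binary[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = label
                    queue.append((nr, nc))
        regions.append({"label": label, "pixels": pixels, "area": len(pixels),
                        "single_fish_size": area_min <= len(pixels) <= area_max})
    return labels, regions
```

In an OpenCV-based pipeline, `cv2.threshold` followed by `cv2.connectedComponentsWithStats` would typically replace the explicit flood fill.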
For an FCR that consists of multiple fish, it is difficult to estimate the number of fish in the FCR and their FPs simultaneously. However, if we know the number of fish in the FCR, we can accurately estimate their FPs by minimizing the sum of absolute differences (SAD) between the FCR and the image drawn from the FPs and the NACA model, as shown in [11]. We minimize the SAD using SA [13] to accelerate the tracking process, whereas all combinations of parameters were matched in [11].
We also terminate the optimization process if lp ≥ lp_{max}.
Note that our optimization process is more practical than that in [11], because the order of our algorithm for an FCR that has m assigned IDs is \(\mathcal {O}(m)\) from Eq. (2). We refer to the proposed tracking method with SA as SAT (SA Tracking).
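To make the optimization concrete, the following sketch minimizes the SAD between an observed image and a rendered image with a generic simulated annealing loop. It is an illustration under stated assumptions, not the schedule used in the paper: the geometric cooling, the Metropolis acceptance rule, the default values, and the toy `render_square` and `propose_shift` helpers are all hypothetical. Only the SAD objective and the iteration cap (here `lp_max`) come from the text.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two images of equal shape."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def anneal(observed, render, init, propose, rng,
           t_start=50.0, cooling=0.95, lp_max=500):
    """Minimize SAD(observed, render(params)) by simulated annealing.

    render and propose are caller-supplied functions; the cooling schedule
    and acceptance rule are assumptions made for this sketch.
    """
    current, cost = init, sad(observed, render(init))
    best, best_cost = current, cost
    temperature = t_start
    for _ in range(lp_max):              # stop once the iteration cap is reached
        if best_cost == 0:
            break                        # perfect match, nothing left to improve
        candidate = propose(current, rng)
        cand_cost = sad(observed, render(candidate))
        delta = cand_cost - cost
        # Always accept improvements; accept worse states with prob. exp(-delta/T).
        if delta <= 0 or rng.random() < np.exp(-delta / temperature):
            current, cost = candidate, cand_cost
            if cost < best_cost:
                best, best_cost = current, cost
        temperature *= cooling
    return best, best_cost

# Toy stand-ins for the NACA renderer and the parameter perturbation.
def render_square(params, shape=(32, 32), size=5):
    image = np.zeros(shape, dtype=np.uint8)
    r, c = params
    image[r:r + size, c:c + size] = 255
    return image

def propose_shift(params, rng):
    step = rng.integers(-2, 3, size=2)   # random shift of up to 2 pixels per axis
    return tuple(int(v) for v in np.clip(np.array(params) + step, 0, 27))

rng = np.random.default_rng(0)
observed = render_square((10, 12))
best, best_cost = anneal(observed, render_square, (14, 15), propose_shift, rng)
```

For an FCR with several assigned IDs, `params` would hold the parameters of every fish in the region and `render` would draw all of them into one image before the SAD is computed.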
2.3 Forward-backward tracking
We then apply the SAT process to the same scene in reverse, i.e., we track the fish forward and then backward. We refer to the preceding tracking pass as the former process. Figure 2 b shows an overview of reverse tracking.
During reverse tracking, if the FPs of an FCR were appropriately estimated in the former process, we simply trace those FPs (case 1 in Fig. 2 b). If the FPs were not estimated in the former process for the FCR, we calculate new FPs according to the SAT (case 2 in Fig. 2 b).
where x_{p}, x_{t}, y_{p}, and y_{t} are positions and θ_{p} and θ_{t} are direction angles. We integrate the trajectories when the distance is smaller than d_{cn}.
We repeat the forward-backward tracking process in order to improve the tracking performance until no new estimated FPs appear. We call the process FBT.
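The control flow of this repetition can be sketched as a simple loop. The `track_pass` callable, the dictionary layout of `estimates`, and the `max_rounds` cap are hypothetical and only illustrate the stopping rule of repeating until no new FPs appear.

```python
def forward_backward_tracking(num_frames, track_pass, max_rounds=10):
    """Alternate forward and reverse passes until no new FPs are estimated.

    track_pass(frame_order, estimates) is a hypothetical callable that runs
    one tracking pass over the frames in the given order, adds newly
    estimated parameters to the dict `estimates` in place, and returns the
    number of entries it added.
    """
    estimates = {}                       # (frame, fish_id) -> parameters
    forward = list(range(num_frames))
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        added = track_pass(forward, estimates)
        added += track_pass(forward[::-1], estimates)
        if added == 0:                   # converged: nothing new in this round
            break
    return estimates, rounds
```

With this structure, a fish that first becomes isolated in the middle of the video is picked up by a forward pass, and the following reverse pass extends its trajectory back toward the first frame.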
3 Experimental results
We conducted experiments to show the effectiveness of the proposed method. We first describe the dataset used in the experiments. As a baseline for comparing tracking performance, we also performed tracking using an implementation based on [4], which we call MPF (Mixture Particle Filter). We prepared 30,000 and 40,000 particles for scenes A and B, respectively, in order to assign hundreds of particles to each fish. As in the proposed SAT, the parameters of each particle (position, direction angle, and NACA model parameters) and the likelihood function of MPF are based on the NACA model.
3.1 Dataset
We recorded videos of schools of sardines at Kujukushima Aquarium Umikirara, Nagasaki, Japan in March 2015. The videos were recorded at 30 fps using a HERO4 video camera. Figure 1 a, b shows a snapshot of the video and the camera setup, respectively.
We manually prepared the ground-truth (GT) trajectories of all the fish in the scenes. In this study, we tracked only the fish having a connected (occluded) component that is completely contained in the frame of the scenes. For example, the fish in the white dotted oval in Fig. 4 b is not tracked in this frame.
Table 2 Basic data of scenes A and B
Scene | AN | NT | NO |
---|---|---|---|
A | 28.61 | 80 | 104 |
B | 40.05 | 114 | 117 |
3.2 Evaluation metrics
We used five metrics to evaluate the tracking results, based on the metrics in [6]. For each scene, we calculated the average ratio of the fish detected correctly by our tracking methods to the GT (Rcll) and to all fish detections, which may contain failures (Prcn). We considered an estimated parameter and a GT correctly matched if the distance between the estimated head position and the GT position was less than five pixels. To measure the tracking performance, we calculated the ratio of GT trajectories that are covered by the estimated tracklets for more than 95 % of their length to all the GT trajectories (MT). Our method cannot track a fish that is overlapped in every frame of the scene, so we also measured the MT excluding such fish (MT-I). To measure tracking failures, we counted the total number of ID switches during fish crossings and particle migrations (switchings and migrations, SMs).
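The following sketch shows one way these ratios could be computed. The five-pixel threshold and the 95 % coverage criterion come from the text; the greedy one-to-one matching, the pooling of counts over frames instead of averaging per-frame ratios, and all function names are assumptions made for this example.

```python
import numpy as np

def match_frame(estimated, ground_truth, max_dist=5.0):
    """Count correct matches between estimated and GT head positions.

    A pair counts as correct if its distance is below max_dist pixels.
    Greedy nearest-pair matching is an assumption made for this sketch.
    """
    est = np.asarray(estimated, dtype=float).reshape(-1, 2)
    gt = np.asarray(ground_truth, dtype=float).reshape(-1, 2)
    if len(est) == 0 or len(gt) == 0:
        return 0
    dist = np.linalg.norm(est[:, None, :] - gt[None, :, :], axis=2)
    matches = 0
    while True:
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        if dist[i, j] >= max_dist:
            break
        matches += 1
        dist[i, :] = np.inf              # each estimate and each GT fish is used once
        dist[:, j] = np.inf
    return matches

def recall_precision(frames):
    """frames: iterable of (estimated_positions, gt_positions), one per frame.

    Assumes at least one GT position and one estimate over all frames.
    """
    tp = n_est = n_gt = 0
    for est, gt in frames:
        tp += match_frame(est, gt)
        n_est += len(est)
        n_gt += len(gt)
    return tp / n_gt, tp / n_est         # (Rcll, Prcn)

def mostly_tracked(coverage, min_ratio=0.95):
    """Fraction of GT trajectories covered for more than min_ratio of their length."""
    coverage = np.asarray(coverage, dtype=float)
    return float(np.mean(coverage > min_ratio))
```

MT-I would follow from `mostly_tracked` by first dropping the coverage values of fish that are overlapped in every frame.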
3.3 Experiment 1
Table 3 Quantitative comparison of tracking results
Scene | Method | Rcll | Prcn | MT | MT-I | SM |
---|---|---|---|---|---|---|
A | MPF | 0.779 | 0.753 | 0.575 | 0.687 | 6 |
A | SAT | 0.844 | 0.944 | 0.625 | 0.746 | 0 |
A | SAT+FBT | 0.875 | 0.939 | 0.788 | 0.940 | 0 |
B | MPF | 0.810 | 0.796 | 0.519 | 0.612 | 9 |
B | SAT | 0.826 | 0.936 | 0.557 | 0.657 | 2 |
B | SAT+FBT | 0.866 | 0.936 | 0.760 | 0.896 | 2 |
We tracked the fish by repeating the FBT process five times. The results are shown in the SAT+FBT rows in Table 3. Figure 5 c, d shows space-time trajectory plots of the entire sequences of scenes A and B. By virtue of backward tracking, we can estimate the FPs of a cluster of nine fish in the white oval in Fig. 5 b in the first frame. Figure 4 e, f shows the improvement in the tracking performance using FBT. Over 75 % (approximately 90 % in MT-I) of the fish in each scene are correctly tracked by FBT, and the average differences between GT and estimated positions are less than 4 % of the mean body length in each scene.
The experiments were performed using our non-optimized implementation in Python and OpenCV. The average SAT computation times for processing one frame of scenes A and B were about 4 and 6 min, respectively.
3.4 Experiment 2
Since we employed the parametrized appearance model, we can easily find events that are useful for collective behavior analysis, such as bending (Fig. 3 d) and gliding. Gliding is a swimming phase in which there is no beating and the fish sometimes strongly bend their bodies to change the direction of their movement. We extracted such events using the amplitude parameter A. The blue points in Fig. 5 e, f show gliding events. The fish are bending at the red points in Fig. 5 e, f.
From the viewpoints of biomechanics and animal behavior, such measurements are essential, because gliding and bending are related to the energy consumption of swimming [14] and constitute a type of information transmission in a school [2, 15].
4 Conclusion
In this paper, we proposed an appearance-based tracking method for multiple fish tracking. Over 75 % (approximately 90 % in MT-I) of the fish in two scenes were successfully tracked. The experimental results indicate that our method is practical for multiple fish tracking and collective motion analysis.
Our method is suitable for fish filmed from the bottom because the NACA model represents a fish viewed from the bottom or top. We would like to extend the applicable range of our method to movies taken from other directions. Our future work also includes improving the tracking performance by introducing data association frameworks and interaction models between fish to estimate the states in the next frame. Furthermore, it is worth accelerating our algorithm in order to track thousands of fish in schools.
5 Endnotes
^{1} The NACA airfoils are models of shapes for aircraft wing sections originally developed by the National Advisory Committee for Aeronautics (NACA). The digits represent the parameters of the shapes.
^{2} We employed the sum of absolute differences (SAD) as the similarity measure.
Declarations
Acknowledgements
This work was supported by JSPS KAKENHI Grant Numbers 26240023 and 26610114. The authors sincerely appreciate the cooperation of Kujukushima Aquarium Umikirara in filming the school of sardines.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Vicsek T, Zafeiris A (2012) Collective motion. Phys Rep 517(3): 71–140.
- Strandburg-Peshkin A, Twomey CR, Bode NW, Kao AB, Katz Y, Ioannou CC, Rosenthal SB, Torney CJ, Wu HS, Levin SA, et al. (2013) Visual sensory networks and effective information transfer in animal groups. Curr Biol 23(17): 709–711.
- Delcourt J, Denoël M, Ylieff M, Poncin P (2013) Video multitracking of fish behaviour: a synthesis and future perspectives. Fish Fish 14(2): 186–204.
- Vermaak J, Doucet A, Pérez P (2003) Maintaining multimodality through mixture tracking In: Proc 9th IEEE Int Conf Comput Vis, 1110–1116. http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1238473&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1238473.
- Okuma K, Taleghani A, De Freitas N, Little JJ, Lowe DG (2004) A boosted particle filter: Multitarget detection and tracking In: Proc 8th European Conf Comput Vis, 28–39. http://link.springer.com/chapter/10.1007/978-3-540-24670-1_3.
- Wu B, Nevatia R (2006) Tracking of multiple, partially occluded humans based on static body part detection In: Proc IEEE Comput Soc Conf Comput Vis Pattern Recog, 951–958. http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1640854&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1640854.
- Ukita N, Kitajima T, Kidode M (2004) Estimating the positions and postures of non-rigid objects lacking sufficient features based on the stick and ellipse model In: Proc Conf Comput Vis Pattern Recog Workshop. http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1384794&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1384794.
- Mitsugami I, Kakusho K, Minoh M (2009) Efficient particle filtering for a non-rigid object based on PCA about changes of its shape and motion. IEICE Trans Inf Syst192-D(8): 1270–1278.
- Butail S, Paley DA (2012) Three-dimensional reconstruction of the fast-start swimming kinematics of densely schooling fish. J R Soc Interface 9(66): 77–88.
- Fukunaga T, Kubota S, Oda S, Iwasaki W (2015) Grouptracker: Video tracking system for multiple animals under severe occlusion. Comput Biol Chem 57: 39–45.
- Terayama K, Hongo K, Habe H, Sakagami M-a (2015) Appearance-based multiple fish tracking for collective motion analysis In: Proc 3rd IAPR Asian Conf Pattern Recog, 361–365. http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7486526.
- Akimoto H, Miyata H (1993) Finite-volume simulation of a flow about a moving body with deformation In: Proc 5th Int Symp Comput Fluid Dynamics, 13–18. https://www.researchgate.net/profile/Hiromichi_AKIMOTO/publication/260146162_Finitevolume_simulation_of_a_flow_about_a_moving_body_with_deformation/links/0deec52fc2c9c0a556000000.pdf.
- Kirkpatrick S, Vecchi MP, et al. (1983) Optimization by simulated annealing. Science 220(4598): 671–680.
- Hemelrijk CK, Reid DAP, Hildenbrandt H, Padding JT (2014) The increased efficiency of fish swimming in a school. Fish Fish 16(3): 511–521.
- Radakov DV (1973) Schooling in the Ecology of Fish. Wiley, New York.