International Journal of Fuzzy Logic and Intelligent Systems 2020; 20(1): 43-51
Published online March 25, 2020
https://doi.org/10.5391/IJFIS.2020.20.1.43
© The Korean Institute of Intelligent Systems
School of Information Convergence Technology, Daegu University, Gyeongsan, 38453, Korea
Correspondence to :
Seokwon Yeom (yeom@daegu.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Small unmanned aerial vehicles can be effectively used for aerial video surveillance. Although the field of view of the camera mounted on the drone is limited, flying drones can expand their surveillance coverage. In this paper, we address the detection of moving targets in urban environments with a moving drone. The drone moves at a constant velocity and captures video clips of moving vehicles such as cars, buses, and bicycles. Moving vehicle detection consists of frame registration and subtraction followed by thresholding, morphological operations, and false blob reduction. First, two consecutive frames are registered; the coordinates of the next frame are compensated by a displacement vector that minimizes the sum of absolute difference between the two frames. Second, the compensated next frame is subtracted from the current frame, and a binary image is generated by thresholding. Finally, morphological operations and false alarm removal extract the target blobs. In the experiments, the drone flies at a constant speed of 5.1 m/s at an altitude of 150 m while capturing video clips of nine moving targets. The detection and false alarm rates as well as the receiver operating characteristic curves are obtained, and the drone velocities in the x and y directions are estimated from the displacement vectors obtained during frame registration.
Keywords: Drone, Unmanned aerial vehicle, Frame registration, Object detection, Velocity estimation
Small drones or unmanned aerial vehicles (UAVs) are highly useful for aerial video surveillance. Moreover, moving object detection is essential for identifying harmful threats in advance. Drones can capture video from a distance while hovering at a fixed point or moving from one point to another as programmed. This remote scene capturing is cost effective and does not require highly trained personnel. However, the computational resources of small drones are often limited in battery, memory, computing power, and bandwidth, which poses a challenge to analyzing high-resolution video sequences in real time.
The importance of automatic video surveillance was highlighted in [1]. Various applications of aerial surveillance by drones were presented in [2]. In [3], a distant moving object was detected using a background area subtractor. A coarse-to-fine detection scheme was proposed in [4]. In [5], the background was subtracted under a Gaussian mixture assumption, followed by a morphological filter. In [6], various object detection methods were reviewed in three categories: background subtraction, frame difference, and optical flow.
An all-in-one camera-based target detection and positioning system was studied for search and rescue missions in [7]. The studies in [8] and [9] addressed the detection and tracking of moving vehicles using Kalman and interacting multiple model (IMM) filters, respectively. The detection and tracking of pedestrians were studied with a drone hovering at a low altitude in [10]. In these past studies, the drone hovered in one position as a stationary sensor, and the field of view was fixed to capture the same scene. The surveillance coverage of the drone depends on its distance to the object and the focal length of the camera. As the distance from the object increases, the drone coverage increases. However, the image quality easily degrades due to blur, noise, and low resolution.
Image registration is the process of matching two or more images of the same scene into the same coordinates. Various image registration methods were surveyed in [11]. Frames from different sensors were registered via modified dynamic time warping in [12]. Coarse-to-fine level image registration was proposed in [13]. Telemetry-assisted registration of frames from a drone was studied in [14]. The speed of small UAVs can be set, but the direction of flight is determined by two waypoints on the map. Thus, the drone velocities in the x and y directions are not immediately available.
In this paper, we address the detection of moving targets in urban environments with a flying drone. The drone flies from one point to another at a constant speed while creating a wider coverage than a stationary drone. To compensate for the drone’s movement in the video sequence, frame registration is performed between two consecutive frames. The displacement vector for the next frame is obtained when the sum of absolute difference (SAD) between the two consecutive frames is minimized. This displacement vector compensates for the coordinates of the next frame. Moving objects are then detected in the current frame by frame subtraction and thresholding; the compensated next frame is subtracted from the current frame, and thresholding is performed to generate a binary image. Two morphological operations (erosion and dilation) are sequentially applied to the binary image. The erosion operation removes small clutter, while the dilation operation compensates for erosion in the object areas and connects the segmented areas of one object to generate target blobs. Finally, target blobs smaller than the minimum target size are removed. Meanwhile, the drone velocities in the x and y directions are estimated from the displacement vectors obtained during frame registration.
In the experiments, a drone flies in a straight line at a speed of 5.1 m/s and captures a video clip at a height of 150 m. The drone camera points directly downwards while capturing the video sequences. Nine moving vehicles (6 sedans, 2 buses, and 1 bicycle) are captured for approximately 15 seconds. The frames used for registration are three frames (0.1 seconds) apart, and moving targets are detected every three frames. The detection and false alarm rates are obtained with different minimum blob sizes, and the receiver operating characteristic (ROC) curve is obtained. The average detection rate ranges from 0.9 to 0.97, while the false alarm rate ranges from 0.06 to 0.5. The root mean square error (RMSE) of the speed is 0.07 m/s when the reference frame is set to the first frame for frame registration.
The remainder of the paper is organized as follows: moving object detection and drone velocity estimation are discussed in Section 2. Section 3 presents experimental results and the conclusion follows in Section 4.
Figure 2 shows the block diagram of frame registration and moving object detection. The system consists of two stages: frame registration and object detection. The drone velocity is estimated by the displacement vector. The detailed procedures are described in the next subsections.
Two consecutive frames, the current frame $f_t(x, y)$ and the next frame $f_{t+1}(x, y)$, are registered by finding the displacement vector that minimizes the SAD between them:

$$\hat{\mathbf{d}}_t = (\hat{d}_x, \hat{d}_y) = \arg\min_{(d_x, d_y)} \sum_{x}\sum_{y} \left| f_t(x, y) - f_{t+1}(x + d_x, y + d_y) \right|,$$

where $(d_x, d_y)$ is a candidate displacement in pixels and the sum is taken over the overlapping region of the two frames. The displacement vectors are estimated in the x and y directions, and the coordinates of the next frame are compensated by $\hat{\mathbf{d}}_t$.
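As an illustrative sketch of the SAD-based registration described above (not the authors' implementation), the exhaustive displacement search over a small window can be written with NumPy; the search radius `max_shift` is an assumed parameter:

```python
import numpy as np

def register_sad(curr, nxt, max_shift=30):
    """Find the (dx, dy) shift of `nxt` that minimizes the SAD
    against `curr`, searching a square window of +/- max_shift pixels."""
    best, best_sad = (0, 0), np.inf
    h, w = curr.shape
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # overlapping region of the two frames under this shift
            y0, y1 = max(0, -dy), min(h, h - dy)
            x0, x1 = max(0, -dx), min(w, w - dx)
            a = curr[y0:y1, x0:x1].astype(np.int32)
            b = nxt[y0 + dy:y1 + dy, x0 + dx:x1 + dx].astype(np.int32)
            sad = np.abs(a - b).mean()  # mean SAD, so overlap size is fair
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best
```

In practice a coarse-to-fine or gradient-based search would replace the brute-force loop, but the brute-force form matches the minimization stated above directly.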
After the coordinates of the next frame are compensated, subtraction and thresholding generate a binary image:

$$b_t(x, y) = \begin{cases} 1, & \left| f_t(x, y) - f_{t+1}(x + \hat{d}_x, y + \hat{d}_y) \right| > \tau, \\ 0, & \text{otherwise}, \end{cases}$$

where $\tau$ is the intensity threshold. Erosion and dilation are then applied sequentially to $b_t$; erosion removes small clutter, while dilation restores the eroded object areas and connects the segmented regions of one object into a target blob. Finally, blobs whose area is smaller than the minimum blob size $\beta$ (in pixels) are removed as false alarms.
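The subtraction, thresholding, morphology, and blob-filtering chain can be sketched as follows, assuming grayscale frames; the threshold `tau`, the minimum blob size `min_area`, and the use of `scipy.ndimage` are our illustrative choices, not the paper's exact settings:

```python
import numpy as np
from scipy import ndimage

def detect_blobs(curr, nxt_compensated, tau=40, min_area=400):
    """Frame subtraction -> thresholding -> erosion -> dilation
    -> removal of blobs smaller than min_area pixels."""
    diff = np.abs(curr.astype(np.int16) - nxt_compensated.astype(np.int16))
    binary = diff > tau                       # thresholded difference image
    binary = ndimage.binary_erosion(binary)   # remove small clutter
    binary = ndimage.binary_dilation(binary, iterations=2)  # reconnect parts
    labels, n = ndimage.label(binary)         # connected components
    areas = ndimage.sum(binary, labels, index=range(1, n + 1))
    keep = [i + 1 for i, a in enumerate(areas) if a >= min_area]
    return labels, keep                       # label map and surviving blob ids
```

A large moving vehicle survives the size filter while small residual clutter does not, which mirrors the false blob reduction step.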
To evaluate the detection performance, three metrics are defined as

$$P_d = \frac{N_d}{N_a}, \quad P_f = \frac{N_f}{N_p}, \quad P_{ef} = \frac{N_{ef}}{N_p},$$

where $N_d$ is the number of frames in which a target is detected, $N_a$ is the number of frames in which the target appears, $N_f$ and $N_{ef}$ are the numbers of false alarms and effective false alarms, respectively, and $N_p$ is the number of processed frames.
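As a sanity check of these definitions, the rates can be computed directly from the counts reported in the tables below; the function and variable names are ours:

```python
def detection_rate(n_detected, n_appearances):
    # fraction of the frames in which the target is found while present
    return n_detected / n_appearances

def false_alarm_rate(n_false, n_processed_frames):
    # average number of false alarms per processed frame
    return n_false / n_processed_frames

# Table 3 reports 76 false alarms over the 151 processed frames
assert round(false_alarm_rate(76, 151), 2) == 0.50
```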
The velocity estimation of the drone is useful when a high-precision global navigation satellite system is not available. The accuracy of the frame registration can also be verified by the estimate of the drone speed. The following cumulative displacement vector minimizes the cumulative SAD, where the reference frame is set to the first frame:

$$\hat{\mathbf{D}}_t = (\hat{D}_x(t), \hat{D}_y(t)) = \arg\min_{(D_x, D_y)} \sum_{x}\sum_{y} \left| f_1(x, y) - f_t(x + D_x, y + D_y) \right|.$$
The velocities of the moving drone in the x and y directions are obtained from the displacement between registered frames:

$$v_x(t) = \frac{c\,\hat{d}_x(t)}{\Delta t}, \quad v_y(t) = \frac{c\,\hat{d}_y(t)}{\Delta t},$$

where $c$ is the ground distance corresponding to one pixel at the flight altitude and $\Delta t$ is the time interval between the registered frames. The speed of the drone is calculated as

$$s(t) = \sqrt{v_x(t)^2 + v_y(t)^2}.$$

The instant speed uses the frame-to-frame displacement $\hat{\mathbf{d}}_t$, whereas the cumulative speed uses the cumulative displacement $\hat{\mathbf{D}}_t$ divided by the elapsed time.
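The pixel-to-ground conversion can be sketched as follows; the ground sampling distance `gsd` (meters per pixel) is an assumed input. The example value of about 0.11 m/px is consistent with the coverage reported below of 2779 pixels spanning roughly 305 m:

```python
import math

def drone_velocity(dx_pixels, dy_pixels, gsd, dt):
    """Convert a pixel displacement between two registered frames
    into ground velocities (m/s) and speed, given the ground
    sampling distance gsd (m/px) and frame interval dt (s)."""
    vx = dx_pixels * gsd / dt
    vy = dy_pixels * gsd / dt
    return vx, vy, math.hypot(vx, vy)

# e.g., a 4.64 px shift over 0.1 s at 0.11 m/px gives roughly 5.1 m/s
vx, vy, speed = drone_velocity(4.64, 0.0, gsd=0.11, dt=0.1)
```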
The accuracy of the speed is evaluated by the RMSE between the ground truth and the estimate:

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( s_{gt} - \hat{s}(t) \right)^2},$$

where $s_{gt}$ is the ground-truth speed (5.1 m/s), $\hat{s}(t)$ is the estimated speed at frame $t$, and $N$ is the number of estimates.
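The RMSE computation, with the ground-truth speed of 5.1 m/s as the default, might look like:

```python
import math

def speed_rmse(estimates, ground_truth=5.1):
    # root mean square error between the estimated and true speed
    n = len(estimates)
    return math.sqrt(sum((ground_truth - s) ** 2 for s in estimates) / n)

# a perfect sequence of estimates yields zero error
assert speed_rmse([5.1, 5.1, 5.1]) == 0.0
```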
The drone (DJI Phantom 4 Advanced) flew in a straight line at an altitude of 150 m at a constant speed of 5.1 m/s around the Daegu University main gate fountain. The speed was set at the control box by the operator. The drone camera pointed directly toward the ground and captured a video clip of 454 frames, approximately 15 seconds at 30 fps.
Figure 3(a) shows Targets 1–4 at the first frame, Figure 3(b) shows Targets 1–6 at the 54th frame, and Figure 3(c) shows Targets 1–3 and 6–9 at the 131st frame. In Figure 3, the red circles indicate the moving targets. The coverage of each frame continued to move as the drone flew slightly upwards from left to right.
Table 1 shows the characteristics of the nine targets, including the initial and final frames in which each target appears, as well as its moving direction and component (vehicle type and color). This analysis was performed manually to identify the target characteristics for the vehicle detection.
Two consecutive frames used for registration were 0.1 seconds (3 frames) apart. After the coordinates of the next frame were compensated, frame subtraction was performed between the current frame and the compensated frame 0.1 seconds apart. Thus, object detection was performed on 151 frames (frames 1, 4, 7, ..., 451). Figure 4(a) shows the results of the detection process applied to the scene in Figure 3(a).
Figure 5(a) to 5(c) show the detection results (bounding boxes) of Figure 3(a) to 3(c), respectively. The numbers of detections in Figure 3(a) to 3(c) are 3, 6, and 6 out of 4, 6, and 7 targets, respectively. No false alarm was detected in Figure 5(a) to 5(c). The bus (Target 1) was missing in Figure 5(a) due to tree obstruction, and the gray sedan (Target 6) was missing in Figure 5(c) because the car had stopped at the traffic signal. It should be noted that when a target stops, there is no change in the target area between the frames.
Table 2 shows the detection rates with varying minimum blob sizes (380, 400, and 420 pixels). Table 3 shows the corresponding numbers of false alarms and false alarm rates, and Figure 6 shows the resulting ROC curve.
Figure 7 shows the expanded coverage as the drone moves. In Figure 7, the centroids of all the targets, including false alarms, are shown as blue circles. The coverage was expanded to 2779 × 1096 pixels, which corresponds to 305 × 120 m. In the upper left part, several people were detected; these detections disappeared with a larger minimum blob size.
Figure 8(a) and 8(b) show the estimated velocities of the drone in the x and y directions, respectively.
Table 4 shows the RMSEs of the instant and cumulative speeds. The RMSE of the cumulative speed is around 0.07 m/s, corresponding to 98.6% accuracy relative to the actual speed, while the accuracy of the instant speed is 92.2%.
In this study, multiple moving targets were detected by a flying drone. To compensate for the movement of the drone, the frames were registered by minimizing the SAD. Moving objects were then successfully detected by frame subtraction, morphological operations, and false blob removal. In addition, the drone speed was estimated, and the registration accuracy was verified against the ground truth. The detection rate was as high as 97%, and the RMSE of the drone speed was as low as 0.07 m/s. However, if a target moves too slowly, the detection rate can decrease, and blob fragments may be generated if the speed of the target is too high.
This technology can be applied to smart surveillance, such as unmanned security systems or search and rescue missions. Multiple target tracking after object detection remains for future study.
No potential conflict of interest relevant to this article was reported.
This research was supported by the Daegu University Research Grant 2015.
Table 1. Target characteristics.
Target ID | Initial frame | Final frame | Direction | Component |
---|---|---|---|---|
1 | 1 | 454 | Right | Blue Bus |
2 | 1 | 454 | Right | Black Sedan |
3 | 1 | 454 | Right | White Sedan |
4 | 1 | 196 | Upward | White Sedan |
5 | 31 | 229 | Upward | Bicycle |
6 | 130 | 208 | Left | Gray Sedan |
7 | 211 | 454 | Left | Blue Bus |
8 | 295 | 454 | Left | White-black Sedan |
9 | 370 | 415 | Parking | White Sedan |
Table 2. Detection rates with minimum blob sizes of 380, 400, and 420 pixels.
Target ID | Number of appearances | 380 | 400 | 420 |
---|---|---|---|---|
1 | 151 | 0.94 | 0.85 | 0.82 |
2 | 151 | 1 | 1 | 1 |
3 | 151 | 1 | 1 | 1 |
4 | 66 | 1 | 1 | 1 |
5 | 67 | 0.92 | 0.83 | 0.77 |
6 | 27 | 1 | 0.96 | 0.96 |
7 | 81 | 1 | 0.82 | 0.66 |
8 | 53 | 1 | 1 | 1 |
9 | 16 | 0.75 | 0.62 | 0.56 |
Avg. | 84 | 0.97 | 0.92 | 0.90 |
Table 3. Number of false alarms and false alarm rates with minimum blob sizes of 380, 400, and 420 pixels.
 | 380 | 400 | 420 |
---|---|---|---|
Number of false alarms | 76 | 23 | 9 |
Number of effective false alarms | 12 | 7 | 4 |
False alarm rate | 0.50 | 0.15 | 0.06 |
Effective false alarm rate | 0.08 | 0.046 | 0.03 |
Figure 1. Illustration of a flying drone for detection of moving targets.
Figure 2. Block diagram of moving object detection.
Figure 3. (a) Targets 1–4 at the first frame, (b) Targets 1–6 at the 54th frame, and (c) Targets 1–3 and 6–9 at the 131st frame.
Figure 4. Detection process of (a) the scene in Figure 3(a), (b) Figure 3(b), and (c) Figure 3(c).
Figure 5. Detection results (bounding boxes) of (a) Figure 3(a), (b) Figure 3(b), and (c) Figure 3(c).
Figure 6. ROC curve for target detection.
Figure 7. All detections including false alarms on the expanded coverage with varying minimum blob sizes.
Figure 8. (a) Velocity in the x direction and (b) velocity in the y direction of the drone.
Table 4. Speed RMSE.
 | RMSE (m/s) |
---|---|
Instant | 0.40 |
Cumulative | 0.07 |