UAV
COMPUTER VISION PROJECT

UAV Human Detection: Thermal Imaging AI

2,898 Thermal Images
Faster R-CNN Architecture
2 Models: Baseline vs Augmented
CHAPTER 01

Seeing from Above

Can data augmentation improve robustness of UAV detection in adverse conditions?

Search and Rescue (SAR) operations rely on UAVs equipped with thermal imaging to locate missing persons in remote areas. However, current detection models are trained on clean aerial images and fail when faced with adverse conditions like heavy snowfall or wildfire smoke.

We investigate whether data augmentation strategies can simulate realistic SAR environmental conditions to improve model robustness without significantly sacrificing baseline performance on clean images.

[Figure: thermal view with 3 detections: Person 0.94, Person 0.87, Person 0.79]

Applications

Search & Rescue

Locate missing persons in remote or blocked-off terrain where manual image review is slow and error-prone.

Disaster Response

Operate in wildfire smoke and heavy snowfall conditions that degrade standard detection models.

Reduced False Negatives

Minimize missed detections in SAR operations where a false negative can mean a life lost.

All-Weather Operation

Extend operation windows beyond ideal weather conditions using augmentation-trained models.

CHAPTER 02

Intelligence in the Sky

Faster R-CNN with ResNet-50 backbone for thermal human detection

Model Architecture

Input Layer

512×512 thermal image

HIT-UAV thermal dataset

Backbone

ResNet-50 + FPN

Layers 1–3 frozen, Layer 4+ trained

Detection Head

Faster R-CNN

79.4% params trainable

Output

Bounding Boxes

Class + Confidence

Model Variants

Model A

Baseline

Trained on clean thermal images only. F1 is 0.78 on clean data but falls to 0.62 on perturbed data, a relative drop of about 20%.

  • Clean image training only
  • F1: 0.78 clean, 0.62 perturbed
  • mAP drops 27.9% under adverse conditions

Model B (Recommended)

SAR Augmented

Trained with SAR augmentation (snow and smoke) applied to 50% of training images. The cost is about 0.02 F1 on clean data (0.78 to 0.76) in exchange for a 19.4% robustness gain.

  • Snow + smoke augmentation at 50% rate
  • F1: 0.76 clean, 0.70 perturbed
  • mAP drops only 8.6% under adverse conditions

Key Finding

Model B is 19.4% more robust under adverse conditions

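SAR augmentation trades about 0.02 of clean-data F1 (0.78 to 0.76) for a 19.4% mAP robustness improvement on data perturbed with snow and smoke: the mAP drop under adverse conditions shrinks from 27.9% for Model A to 8.6% for Model B.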
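The headline numbers can be checked with a few lines of arithmetic. The sketch below uses only the rounded values reported on this page and assumes the 19.4% figure is the gap between the two models' mAP drops; the rounded inputs give 19.3, so the reported figure presumably comes from unrounded results.

```python
# Rounded metrics as reported on this page.
f1 = {
    "baseline": {"clean": 0.78, "perturbed": 0.62},
    "augmented": {"clean": 0.76, "perturbed": 0.70},
}
map_drop_pct = {"baseline": 27.9, "augmented": 8.6}


def relative_drop(clean: float, perturbed: float) -> float:
    """Relative degradation from clean to perturbed data, in percent."""
    return 100.0 * (clean - perturbed) / clean


baseline_f1_drop = relative_drop(**f1["baseline"])
augmented_f1_drop = relative_drop(**f1["augmented"])

# Absolute F1 given up on clean data by training with augmentation.
clean_f1_cost = f1["baseline"]["clean"] - f1["augmented"]["clean"]

# Assumed definition of the robustness gain: difference in mAP drop.
robustness_gap = map_drop_pct["baseline"] - map_drop_pct["augmented"]

print(f"Baseline F1 drop under perturbation:  {baseline_f1_drop:.1f}%")
print(f"Augmented F1 drop under perturbation: {augmented_f1_drop:.1f}%")
print(f"Clean-data F1 cost of augmentation:   {clean_f1_cost:.2f}")
print(f"Gap in mAP drop (percentage points):  {robustness_gap:.1f}")
```

Read this way, the baseline loses roughly a fifth of its F1 under perturbation while the augmented model loses under a tenth.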

CHAPTER 03

Impact Across Industries

From rescue missions to commercial deployment

Search & Rescue

Remote Terrain

Our primary target application. SAR teams need reliable detection in snow, smoke, and low-visibility conditions where current models fail. Our augmented model retains 80% recall under adverse conditions.

84.9% Recall (Clean)
80.2% Recall (Perturbed)
+19.4% Robustness Gain
Targeting SAR teams, disaster response, and first responders
CHAPTER 04

Training Vision

Building accuracy through diverse aerial perspectives

Thermal Imaging

Infrared captures heat signatures day and night at 60–130 m altitude

2,898 Images

Sampled from 43,470 UAV video frames with camera angles 30–90°

YOLO Format Labels

Bounding box annotations for Person, Car, Bicycle, OtherVehicle

HIT-UAV Dataset

Published in Scientific Data by Suo et al., CC0 licensed
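YOLO-format labels store each box as a class index followed by a normalized center and size, whereas Faster R-CNN implementations typically expect pixel corner coordinates. A minimal conversion sketch follows; the function name `yolo_to_xyxy` and the 512×512 default are illustrative assumptions, not taken from the project code.

```python
from typing import Tuple


def yolo_to_xyxy(
    line: str, img_w: int = 512, img_h: int = 512
) -> Tuple[int, float, float, float, float]:
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels.

    A YOLO line is "class cx cy w h" with cx, cy, w, h normalized to [0, 1].
    """
    cls, cx, cy, w, h = line.split()
    cx_px, cy_px = float(cx) * img_w, float(cy) * img_h
    w_px, h_px = float(w) * img_w, float(h) * img_h

    # Center/size to corner coordinates, clamped to the image bounds.
    x1 = max(0.0, cx_px - w_px / 2)
    y1 = max(0.0, cy_px - h_px / 2)
    x2 = min(float(img_w), cx_px + w_px / 2)
    y2 = min(float(img_h), cy_px + h_px / 2)
    return int(cls), x1, y1, x2, y2


# A box centered in the image, a quarter of its width and half its height.
example = yolo_to_xyxy("0 0.5 0.5 0.25 0.5")
print(example)  # (0, 192.0, 128.0, 320.0, 384.0)
```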

SAR Augmentations

Using the Albumentations library, we perturb images with snow and smoke effects to simulate realistic SAR conditions. For Model B, these perturbations are applied to 50% of training images.

Snow Effect

50%

Albumentations-based snow perturbation with Gaussian noise for realistic SAR winter conditions

Smoke/Fog Effect

50%

Albumentations-based fog overlay with Gaussian noise to simulate wildfire and disaster scenarios

Training Distribution
Clean Images: 50%
Augmented: 50%
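The 50/50 split between clean and augmented training images can be illustrated with a pure-Python stand-in. This is not the Albumentations pipeline the project uses: `fog_blend`, `add_snow`, and `sar_augment` are hypothetical helpers, the even choice between snow and fog is an assumption, and the Gaussian noise step is omitted.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of 0-255 intensities


def fog_blend(image: Image, alpha: float = 0.4, fog_value: int = 255) -> Image:
    """Blend every pixel toward a uniform fog intensity (alpha compositing)."""
    return [[round((1 - alpha) * px + alpha * fog_value) for px in row]
            for row in image]


def add_snow(image: Image, rng: random.Random, density: float = 0.05,
             snow_value: int = 255) -> Image:
    """Replace a random fraction of pixels with bright 'snowflake' pixels."""
    return [[snow_value if rng.random() < density else px for px in row]
            for row in image]


def sar_augment(image: Image, rng: random.Random,
                p: float = 0.5) -> Tuple[Image, str]:
    """With probability p apply snow or fog; otherwise keep the image clean."""
    if rng.random() >= p:
        return image, "clean"
    if rng.random() < 0.5:  # assumed even split between the two effects
        return add_snow(image, rng), "snow"
    return fog_blend(image), "fog"


rng = random.Random(0)
tiny = [[10, 20], [30, 40]]
kinds = [sar_augment(tiny, rng)[1] for _ in range(2000)]
clean_fraction = kinds.count("clean") / len(kinds)
print(f"clean fraction over 2000 draws: {clean_fraction:.2f}")
```

Drawing the decision per image at load time means each epoch sees a different random half of the data perturbed, rather than a fixed augmented copy.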

Training Configuration

Image Size: 512×512
Batch Size: 4
Epochs: 6
Learning Rate: 0.005
Optimizer: SGD
IoU Threshold: 0.5
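The IoU threshold of 0.5 determines when a predicted box counts as a correct detection, which in turn drives precision, recall, and F1. The sketch below shows one common way to compute these with greedy matching; the matching strategy and the helper names are assumptions, not the project's evaluation code.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def match_detections(preds: List[Tuple[Box, float]], gts: List[Box],
                     iou_thr: float = 0.5) -> Tuple[int, int, int]:
    """Greedy matching by descending confidence; returns (TP, FP, FN)."""
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j not in matched and iou(box, gt) > best_iou:
                best_iou, best_j = iou(box, gt), j
        if best_j is not None and best_iou >= iou_thr:
            matched.add(best_j)  # each ground-truth box is matched once
            tp += 1
    return tp, len(preds) - tp, len(gts) - tp


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy example: two ground-truth people, three predictions (one spurious).
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [((0, 0, 10, 10), 0.9), ((21, 21, 31, 31), 0.8), ((50, 50, 60, 60), 0.7)]
tp, fp, fn = match_detections(preds, gts)
precision, recall, f1_score = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, round(precision, 3), recall, round(f1_score, 3))
```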

Training Progress

[Figure: Loss over Epochs; x-axis: epoch (0–6), y-axis: loss (0.0–1.0)]
CHAPTER 05

See It In Action

Try the detection models yourself

UAV Human Detection - Live Demo

Upload Image

Drop any thermal or aerial image for instant detection

Adjust Threshold

Fine-tune confidence threshold for precision vs recall trade-off

Compare Models

Side-by-side comparison of baseline vs augmented model

Robustness Test

Test both models under snow and fog conditions

The demo runs on Hugging Face Spaces. The first inference may take a moment while the Space warms up. Upload your own thermal images or use the provided examples.
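The confidence threshold control amounts to filtering detections by score: lowering it favors recall, raising it favors precision. A minimal sketch, using the three example scores shown earlier on this page and a hypothetical `filter_by_confidence` helper:

```python
from typing import Dict, List

# Example detections with the scores from the thermal view illustration.
detections: List[Dict] = [
    {"label": "Person", "score": 0.94},
    {"label": "Person", "score": 0.87},
    {"label": "Person", "score": 0.79},
]


def filter_by_confidence(dets: List[Dict], threshold: float) -> List[Dict]:
    """Keep only detections whose score meets the threshold."""
    return [d for d in dets if d["score"] >= threshold]


kept_counts = {}
for thr in (0.5, 0.8, 0.9):
    kept_counts[thr] = len(filter_by_confidence(detections, thr))
    print(f"threshold={thr:.1f}: {kept_counts[thr]} detection(s) kept")
```

In a SAR setting, where a missed person is costlier than a false alarm, a lower threshold combined with human review is the more cautious choice.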

CHAPTER 06

Next Horizons

Advancing capabilities for tomorrow's challenges

Multi-Class Detection

Next Priority

Expand beyond person detection to also detect cars, bicycles, and other vehicles. These objects can indicate a missing person's location in SAR operations.

Ensemble Models

Proposed

Combine multiple model types and average results to reduce overfitting and bias, creating a more robust detection system overall.

Video Tracking

Proposed

Extend from single-frame detection to continuous tracking across video sequences, leveraging the 43,470 frames available in the HIT-UAV source data.

Longer Training

Recommended

Both models were still improving at epoch 6. Extending to 15–20 epochs with early stopping and learning rate warmup would likely improve F1 by 5–10%.
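Learning rate warmup and early stopping are both small pieces of logic. The sketch below shows a linear warmup to the base rate of 0.005 and a patience-based stopping rule; the warmup length, start factor, patience, and the validation scores in the example are all hypothetical values chosen for illustration.

```python
from typing import Optional


def warmup_lr(step: int, base_lr: float = 0.005, warmup_steps: int = 500,
              start_factor: float = 0.001) -> float:
    """Linearly ramp the learning rate from start_factor*base_lr to base_lr."""
    if step >= warmup_steps:
        return base_lr
    alpha = step / warmup_steps
    return base_lr * (start_factor * (1 - alpha) + alpha)


class EarlyStopping:
    """Stop when the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best: Optional[float] = None
        self.bad_epochs = 0

    def step(self, metric: float) -> bool:
        """Record one epoch's metric (higher is better); return True to stop."""
        if self.best is None or metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation F1 per epoch, for illustration only.
hypothetical_val_f1 = [0.70, 0.74, 0.76, 0.76, 0.75, 0.76, 0.77]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, score in enumerate(hypothetical_val_f1, start=1):
    if stopper.step(score):
        stopped_at = epoch
        break
print(f"stopped at epoch {stopped_at}, best F1 {stopper.best}")
```

With early stopping in place, the epoch budget can be set generously (for example the 15–20 suggested above) without committing to train that long.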

Improved Localization

Recommended

AP@0.75 is below 0.43 for both models. Switching to GIoU/CIoU loss and tuning anchor box sizes for UAV data would improve bounding box precision.
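Generalized IoU (GIoU) extends IoU with a penalty based on the smallest box enclosing both the prediction and the target, so that non-overlapping boxes still receive a useful training signal. A plain-Python sketch of the quantity and the corresponding loss, assuming corner-format boxes with positive area:

```python
from typing import Sequence

Box = Sequence[float]  # (x1, y1, x2, y2) with x2 > x1 and y2 > y1


def giou(a: Box, b: Box) -> float:
    """Generalized IoU in [-1, 1]: IoU minus the enclosing-box penalty."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])

    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box that encloses both a and b.
    enclose = ((max(a[2], b[2]) - min(a[0], b[0]))
               * (max(a[3], b[3]) - min(a[1], b[1])))
    if union <= 0 or enclose <= 0:
        return 0.0  # degenerate input; not expected for valid boxes

    return inter / union - (enclose - union) / enclose


def giou_loss(pred: Box, target: Box) -> float:
    """Loss in [0, 2]; zero only when the boxes coincide."""
    return 1.0 - giou(pred, target)


same = giou_loss((0, 0, 10, 10), (0, 0, 10, 10))
apart = giou_loss((0, 0, 1, 1), (2, 0, 3, 1))
print(same, round(apart, 4))
```

Unlike plain IoU, the loss keeps growing as disjoint boxes move further apart, which is why it is often proposed for tightening localization at stricter thresholds such as AP@0.75.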

Additional Augmentations

Proposed

Add more perturbation types beyond snow and smoke (e.g., rain, dust, varying altitudes) to further improve robustness across diverse SAR scenarios.

Development Roadmap

SHORT TERM

Training Improvements

  • Extend to 15–20 epochs with early stopping
  • Increase batch size to 8–16
  • Add LR warmup schedule

MEDIUM TERM

Model Enhancements

  • Multi-class detection (person + vehicles)
  • GIoU/CIoU loss for localization
  • Ensemble model averaging

LONGER TERM

Data & Robustness

  • Additional perturbation types (rain, dust)
  • Video frame tracking pipeline
  • Larger thermal datasets

DEPLOYMENT

Real-World Readiness

  • Confidence threshold tuning per scenario
  • Human-in-the-loop validation
  • Ethics & privacy safeguards

Explore the Project

View our code, training notebook, and results on GitHub. Built by Lindsay Gross, Andrew Jin, and Shreya Mendi.