COMPUTER VISION PROJECT

UAV Human Detection: Thermal Imaging AI

2,898 Thermal Images

Faster R-CNN Architecture

2 Models: Baseline vs Augmented

CHAPTER 01

Seeing from Above

Can data augmentation improve robustness of UAV detection in adverse conditions?

Search and Rescue (SAR) operations rely on UAVs equipped with thermal imaging to locate missing persons in remote areas. However, current detection models are trained on clean aerial images and fail when faced with adverse conditions like heavy snowfall or wildfire smoke.

We investigate whether data augmentation strategies can simulate realistic SAR environmental conditions to improve model robustness without significantly sacrificing baseline performance on clean images.

[Figure: thermal view with 3 detections (Person 0.94, Person 0.87, Person 0.79)]

Applications

Search & Rescue

Locate missing persons in remote or blocked-off terrain where manual image review is slow and error-prone.

Disaster Response

Operate in wildfire smoke and heavy snowfall conditions that degrade standard detection models.

Reduced False Negatives

Minimize missed detections in SAR operations where a false negative can mean a life lost.

All-Weather Operation

Extend operation windows beyond ideal weather conditions using augmentation-trained models.

CHAPTER 02

Intelligence in the Sky

Faster R-CNN with ResNet-50 backbone for thermal human detection

Model Architecture

Input Layer

512×512 thermal image

HIT-UAV thermal dataset

Backbone

ResNet-50 + FPN

Layers 1–3 frozen, Layer 4+ trained

Detection Head

Faster R-CNN

79.4% params trainable

Output

Bounding Boxes

Class + Confidence

Model Variants

Model A

Baseline

Trained on clean thermal images only. F1 = 0.78 on clean data, but it falls by about 20% (to 0.62) on perturbed data.

Clean image training only
F1: 0.78 clean, 0.62 perturbed
mAP drops 27.9% under adverse conditions

Recommended

Model B

SAR Augmented

Trained with 50% SAR augmentation (snow + smoke). Costs only about 0.02 F1 on clean data (0.78 to 0.76) for a 19.4% robustness gain.

Snow + smoke augmentation at 50% rate
F1: 0.76 clean, 0.70 perturbed
mAP drops only 8.6% under adverse conditions

Key Finding

Model B is 19.4% more robust under adverse conditions

SAR augmentation trades only about 0.02 of clean-data F1 (0.78 to 0.76) for a 19.4% mAP robustness improvement on data perturbed with snow and smoke effects.
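
The headline numbers can be checked with a few lines of arithmetic. The sketch below only reuses the figures reported on this page; the helper name `relative_drop` is ours. It assumes the F1 degradation is quoted relative to the clean score and that the robustness gain is the gap between the two mAP drops.

```python
def relative_drop(clean: float, perturbed: float) -> float:
    """Fractional drop from the clean score to the perturbed score."""
    return (clean - perturbed) / clean

# F1 scores reported above as (clean, perturbed).
f1 = {"Model A": (0.78, 0.62), "Model B": (0.76, 0.70)}
# Reported mAP drops under adverse conditions, in percent.
map_drop = {"Model A": 27.9, "Model B": 8.6}

f1_drop = {name: relative_drop(c, p) for name, (c, p) in f1.items()}
clean_f1_cost = f1["Model A"][0] - f1["Model B"][0]
map_gap = map_drop["Model A"] - map_drop["Model B"]

for name, drop in f1_drop.items():
    print(f"{name}: F1 falls {drop:.1%} under perturbation")
print(f"Clean-data F1 cost of augmentation: {clean_f1_cost:.2f}")
print(f"Difference in mAP drop: {map_gap:.1f} points")
```

With the rounded figures the gap comes out at 19.3 points, close to the reported 19.4%, which was presumably computed from unrounded values.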

CHAPTER 03

Impact Across Industries

From rescue missions to commercial deployment

LAT: 46.8523° LNG: -121.7603°

Search & Rescue

Remote Terrain

Our primary target application. SAR teams need reliable detection in snow, smoke, and low-visibility conditions where current models fail. Our augmented model still reaches about 80% recall under adverse conditions.

84.9% Recall (Clean)

80.2% Recall (Perturbed)

+19.4% Robustness Gain

Targeting SAR teams, disaster response, and first responders

CHAPTER 04

Training Vision

Building accuracy through diverse aerial perspectives

Thermal Imaging

Infrared captures heat signatures day and night at 60–130m altitude

2,898 Images

Sampled from 43,470 UAV video frames with camera angles 30–90°

YOLO Format Labels

Bounding box annotations for Person, Car, Bicycle, OtherVehicle
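
As a rough illustration of the label format, the sketch below converts one YOLO-style line (`class x_center y_center width height`, all normalized to 0..1) into pixel corner coordinates for a 512×512 image. The class index order in `CLASS_NAMES` and the function name `yolo_to_pixel_box` are assumptions for this example, not taken from the dataset files.

```python
# Assumed index order; check the dataset's own class list before relying on it.
CLASS_NAMES = ["Person", "Car", "Bicycle", "OtherVehicle"]

def yolo_to_pixel_box(line: str, img_w: int = 512, img_h: int = 512):
    """Convert one YOLO label line to (class_name, x_min, y_min, x_max, y_max) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalized center and size to pixel units.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Move from center/size to corner coordinates.
    return (CLASS_NAMES[int(class_id)], xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)

example = yolo_to_pixel_box("0 0.5 0.25 0.125 0.25")
print(example)  # ('Person', 224.0, 64.0, 288.0, 192.0)
```

Corner coordinates in pixels are the box layout that torchvision's Faster R-CNN expects for training targets, so a conversion of this kind is typically needed when the labels are stored in YOLO format.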

HIT-UAV Dataset

Published in Scientific Data by Suo et al., CC0 licensed

SAR Augmentations

Using the Albumentations library, we perturb images with snow and smoke effects to simulate realistic SAR conditions. For Model B, the perturbations are applied to 50% of training images.

Snow Effect

50%

Albumentations-based snow perturbation with Gaussian noise for realistic SAR winter conditions

Smoke/Fog Effect

50%

Albumentations-based fog overlay with Gaussian noise to simulate wildfire and disaster scenarios

Training Distribution

Clean Images: 50%

Augmented: 50%
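
The 50/50 split can be pictured as a per-image coin flip. The following is a dependency-free sketch, not the project's pipeline: `choose_perturbation` is a hypothetical helper, and the equal split between snow and smoke among augmented images is an assumption. In an Albumentations pipeline the returned labels would map to transforms such as `RandomSnow` and `RandomFog`; which transforms the authors used is not stated here.

```python
import random

def choose_perturbation(rng: random.Random, rate: float = 0.5):
    """Return 'snow', 'smoke', or None (leave the image clean)."""
    if rng.random() >= rate:
        return None  # image stays clean
    # Assumption: snow and smoke are equally likely among augmented images.
    return rng.choice(["snow", "smoke"])

rng = random.Random(0)  # fixed seed so the plan is reproducible
plan = [choose_perturbation(rng) for _ in range(10_000)]
augmented_share = sum(p is not None for p in plan) / len(plan)
print(f"Augmented share: {augmented_share:.1%}")
```

Drawing the decision per image, rather than fixing which images are perturbed, means each epoch sees a different clean/perturbed mix of the same data.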

Training Configuration

Image Size: 512×512

Batch Size: 4

Epochs: 6

Learning Rate: 0.005

Optimizer: SGD

IoU Threshold: 0.5
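
To make the IoU threshold concrete, here is a minimal sketch of intersection over union for two axis-aligned boxes. It assumes the 0.5 threshold decides whether a predicted box counts as matching a ground-truth box at evaluation time. The function names and example boxes are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred, truth, threshold=0.5):
    """A prediction matches if its IoU with the ground truth reaches the threshold."""
    return iou(pred, truth) >= threshold

truth = (100, 100, 200, 200)
good = (110, 110, 210, 210)  # shifted by 10 px in each direction
poor = (150, 150, 250, 250)  # shifted by 50 px in each direction

print(round(iou(good, truth), 3), is_true_positive(good, truth))  # 0.681 True
print(round(iou(poor, truth), 3), is_true_positive(poor, truth))  # 0.143 False
```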

Training Progress

[Figure: loss over epochs. x-axis: epoch, 0 to 6; y-axis: loss, 0.0 to 1.0]

CHAPTER 05

See It In Action

Try the detection models yourself

UAV Human Detection - Live Demo

Upload Image

Drop any thermal or aerial image for instant detection

Adjust Threshold

Fine-tune confidence threshold for precision vs recall trade-off
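
The trade-off behind the threshold slider can be sketched as follows. The detections and the ground-truth count are hypothetical, and `precision_recall` is an illustrative helper rather than the demo's code.

```python
def precision_recall(detections, num_ground_truth, threshold):
    """detections: list of (confidence, is_correct) pairs for one image or dataset."""
    kept = [ok for conf, ok in detections if conf >= threshold]
    tp = sum(kept)  # correct detections that survive the threshold
    precision = tp / len(kept) if kept else 1.0
    recall = tp / num_ground_truth if num_ground_truth else 0.0
    return precision, recall

# Hypothetical detections for a scene containing 4 people.
detections = [(0.94, True), (0.87, True), (0.79, True),
              (0.55, False), (0.42, True), (0.35, False)]

for t in (0.3, 0.5, 0.7):
    p, r = precision_recall(detections, 4, t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

In this toy example, raising the threshold from 0.3 to 0.7 lifts precision from 0.67 to 1.00 and lowers recall from 1.00 to 0.75. Because a missed person is the costlier error in SAR, a lower threshold is often the safer choice there.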

Compare Models

Side-by-side comparison of baseline vs augmented model

Robustness Test

Test both models under snow and fog conditions

The demo runs on Hugging Face Spaces. The first inference may take a moment while the Space warms up. Upload your own thermal images or use the provided examples.

CHAPTER 06

Next Horizons

Advancing capabilities for tomorrow's challenges

Multi-Class Detection

Next Priority

Expand beyond person detection to also detect cars, bicycles, and other vehicles. These objects can indicate a missing person's location in SAR operations.

Progress: 0%

Ensemble Models

Proposed

Combine multiple model types and average results to reduce overfitting and bias, creating a more robust detection system overall.

Progress: 0%

Video Tracking

Proposed

Extend from single-frame detection to continuous tracking across video sequences, leveraging the 43,470 frames available in the HIT-UAV source data.

Progress: 0%

Longer Training

Recommended

Both models were still improving at epoch 6. Extending to 15–20 epochs with early stopping and learning rate warmup would likely improve F1 by 5–10%.
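
A minimal sketch of the two proposed mechanisms, assuming linear warmup up to the current base learning rate of 0.005 and patience-based early stopping on a validation metric. The warmup length, the patience value, and the validation scores are hypothetical.

```python
def warmup_lr(step: int, base_lr: float = 0.005, warmup_steps: int = 500) -> float:
    """Linearly ramp the learning rate up to base_lr over warmup_steps steps."""
    if step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps

class EarlyStopping:
    """Stop when the validation metric has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, metric: float) -> bool:
        if metric > self.best + self.min_delta:
            self.best = metric       # new best: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1     # no improvement this epoch
        return self.bad_epochs >= self.patience

# Hypothetical validation F1 per epoch.
val_f1 = [0.70, 0.74, 0.77, 0.78, 0.78, 0.77, 0.78, 0.76]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, score in enumerate(val_f1, start=1):
    if stopper.should_stop(score):
        stopped_at = epoch
        break
print(f"Stopped at epoch {stopped_at}, best F1 {stopper.best:.2f}")
```

With these scores the run stops at epoch 7, three epochs after the best value of 0.78 was first reached. In practice the checkpoint from the best epoch would be kept rather than the last one.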

Progress: 0%

Improved Localization

Recommended

AP@0.75 is below 0.43 for both models. Switching to GIoU/CIoU loss and tuning anchor box sizes for UAV data would improve bounding box precision.
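
For reference, GIoU extends IoU with a penalty based on the smallest box enclosing both inputs, so the loss still changes when boxes do not overlap at all. The sketch below is a plain-Python illustration for boxes with positive area. It is not how the loss would be wired into the detector, and the function names are ours.

```python
def giou(a, b):
    """Generalized IoU of two (x_min, y_min, x_max, y_max) boxes, in [-1, 1]."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = area_a + area_b - inter
    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    # IoU minus the share of the enclosing box not covered by the union.
    return inter / union - (enclose - union) / enclose

def giou_loss(a, b):
    """Loss is 0 for identical boxes and grows as the boxes move apart."""
    return 1.0 - giou(a, b)

box = (0, 0, 10, 10)
near = (20, 0, 30, 10)  # disjoint, 10 px gap
far = (40, 0, 50, 10)   # disjoint, 30 px gap

print(giou_loss(box, box))             # 0.0
print(round(giou_loss(box, near), 3))  # 1.333
print(round(giou_loss(box, far), 3))   # 1.6
```

Plain IoU is 0 for both disjoint pairs, while the GIoU loss is larger for the more distant box, which is the property that makes it useful as a regression signal.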

Progress: 0%

Additional Augmentations

Proposed

Add more perturbation types beyond snow and smoke (e.g., rain, dust, varying altitudes) to further improve robustness across diverse SAR scenarios.

Progress: 0%

Development Roadmap

SHORT TERM

Training Improvements

Extend to 15–20 epochs with early stopping
Increase batch size to 8–16
Add LR warmup schedule

MEDIUM TERM

Model Enhancements

Multi-class detection (person + vehicles)
GIoU/CIoU loss for localization
Ensemble model averaging

LONGER TERM

Data & Robustness

Additional perturbation types (rain, dust)
Video frame tracking pipeline
Larger thermal datasets

DEPLOYMENT

Real-World Readiness

Confidence threshold tuning per scenario
Human-in-the-loop validation
Ethics & privacy safeguards

Explore the Project

View our code, training notebook, and results on GitHub. Built by Lindsay Gross, Andrew Jin, and Shreya Mendi.
