The Clinical AI Arc

Two flagship systems. One coherent clinical capability.

Read these as a sequence, not a list. Each solved a different clinical problem and transferred a different capability to what came next.

01
ASD Detection from Structural MRI
Clinical classification from NIfTI brain volumes. Learned: multi-site validation, quality gating, dual XAI, the limit of a high-AUC model with no explainability or uncertainty.
Also: established the 8-stage clinical inference pipeline used in all subsequent systems.
02
Fetal Head Circumference Measurement
Clinical measurement from ultrasound. Learned: segmentation, temporal cine reasoning, Hadlock GA estimation, clinical threshold validation. Exposed the deployment gap → led directly to CNN pruning research.
Also: Hybrid Crossover pruning (see Research section) runs in parallel, motivated by this system.
★ Origin Project · B.Tech 2023 + MS 2026 Rebuild · Neuroimaging AI
ASD Detection from Structural Brain MRI
1,067 subjects · 17 acquisition sites · ABIDE-I · NIfTI volumes
Live on HuggingFace
0.994
AUC-ROC
95.6%
Sensitivity
97.2%
Specificity
0.027
Brier Score
17
Acquisition Sites

The clinical problem: ASD diagnosis typically takes 18–24 months from referral to confirmed diagnosis — limited by specialist availability. Structural MRI is already routinely acquired. This system analyses that existing data algorithmically, flagging high-probability cases for expedited review.

In clinical terms: 95.6% sensitivity means 956 of every 1,000 children with ASD are correctly identified. 97.2% specificity means only 28 false alarms per 1,000 non-ASD children. The Brier score of 0.027 means stated confidence levels are genuinely calibrated — when the system reports 85% probability, roughly 85 of every 100 such cases really are positive.
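For concreteness, a small Python sketch of how those per-1,000 figures and the Brier score are computed; the probability vector here is an illustrative placeholder, not model output.

```python
import numpy as np

sensitivity = 0.956   # P(flagged | ASD)
specificity = 0.972   # P(cleared | non-ASD)

detected_per_1000_asd = round(sensitivity * 1000)        # 956 true positives
false_alarms_per_1000 = round((1 - specificity) * 1000)  # 28 false positives

# Brier score: mean squared error between predicted probability and outcome.
p = np.array([0.92, 0.10, 0.85, 0.03])   # predicted P(ASD), illustrative only
y = np.array([1,    0,    1,    0   ])   # true labels
brier = np.mean((p - y) ** 2)             # lower is better; 0.027 reported above
print(detected_per_1000_asd, false_alarms_per_1000, round(brier, 3))
```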

Site variance is the critical deployment finding: Sensitivity spans 88.5% (PITT) to 98.5% (UM_1) — a 10 percentage-point gap driven by scanner heterogeneity. Site-specific calibration or ComBat harmonisation would be required before production deployment. This is documented in the Model Card with explicit implications.

The 2023→2026 rebuild: B.Tech baseline had no explainability, no uncertainty, no quality gating, no clinical output. The 2026 system added GradCAM + LIME dual explanation, MC-Dropout (30 stochastic passes), 4-metric quality gate, LLM clinical report, site reliability indicators, and FDA SaMD Class II governance documentation.
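A minimal sketch of what the MC-Dropout uncertainty step looks like in PyTorch, assuming a binary classifier with standard dropout layers; the function and variable names are illustrative, not the project's actual code.

```python
import torch

def mc_dropout_predict(model, volume, passes=30):
    """Monte Carlo dropout: keep dropout active at inference and
    aggregate repeated stochastic forward passes."""
    model.eval()
    # Re-enable only the dropout layers so batch norm stays in eval mode.
    for m in model.modules():
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d, torch.nn.Dropout3d)):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(volume)) for _ in range(passes)])
    return probs.mean(dim=0), probs.std(dim=0)   # P(ASD) estimate, uncertainty sigma
```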

Clinical proposition
Pre-assessment triage layer targeting the 18–24 month diagnostic delay — surfaces high-probability cases for expedited specialist review
What a clinician receives
P(ASD) with CI, GradCAM + LIME spatial heatmaps, MC-Dropout uncertainty σ, site reliability badge, LLM-generated PDF report with regulatory framing
Documented failure mode
Confident false positives (σ ≈ 0.005) — the dangerous mode where the model is wrong and certain. Human override protocol required. Documented explicitly in Model Card.
Regulatory status
Research-grade decision support only. FDA SaMD Class II / De Novo pathway, sex and site bias audits, and known failure modes documented in full Model Card.
→ What the ASD project couldn't do — and what came next

The ASD system classified from whole-brain structural MRI but measured nothing. Clinical practice often needs a precise numerical output — not just a probability. The fetal head project moved from classification to clinical measurement, adding temporal reasoning (cine-loop sequences) and a hard deployment constraint: the model needs to run in a busy ultrasound suite on shared hardware, not a research server. That deployment constraint was what motivated the CNN pruning work running in parallel. The question "how do we make this small enough to actually deploy?" became its own research thread.

★ Capstone System · Obstetric AI · Deployed · CSCE 6260 + Post-course rebuild
Fetal Head Circumference Measurement
HC18 dataset · Static + cine-loop · ISUOG ±3mm threshold · Hadlock 1984 GA estimation
Live on HuggingFace
1.75mm
HC Error (ISUOG ≤3mm)
97.36%
Dice (Static)
2.10mm
HC Error (Cine-loop)
3.4×
Better than SOTA
153 hrs
Saved / unit / year

The clinical problem: HC measurement is mandatory at every routine antenatal scan. Manual calliper placement takes 2–4 minutes, introduces up to 7mm inter-observer variation, and accumulates to ~153 sonographer hours per year at a unit doing 20 scans per day. This system replaces that step with a reproducible automated measurement with 1.75mm mean error — about 40% inside the ISUOG ±3mm acceptability threshold.
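Since the HC18 labels are ellipse annotations (see the rebuild note below), the measurement itself can be read out as an ellipse fit to the predicted skull boundary followed by Ramanujan's perimeter approximation. A hedged OpenCV sketch, assuming a binary mask and known pixel spacing; the post-processing details are assumptions.

```python
import math
import cv2
import numpy as np

def head_circumference_mm(mask, mm_per_pixel):
    """Fit an ellipse to the predicted skull mask and return its perimeter in mm
    using Ramanujan's first approximation of the ellipse circumference."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    largest = max(contours, key=cv2.contourArea)
    (_, _), (axis1, axis2), _ = cv2.fitEllipse(largest)   # full axis lengths, px
    a = (axis1 / 2) * mm_per_pixel                          # semi-axes in mm
    b = (axis2 / 2) * mm_per_pixel
    # Ramanujan: C ≈ pi * [ 3(a + b) - sqrt((3a + b)(a + 3b)) ]
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))
```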

Course baseline → clinical system: The course project achieved 17.25mm MAE — 5.75× the ISUOG threshold. The post-course rebuild closed that gap to 1.75mm through three targeted changes: flood-fill correction of hollow-ellipse annotations (single biggest fix), boundary-weighted loss with distance-transform upweighting, and clinically motivated augmentation (Rician speckle, not Gaussian). No architecture change — engineering, not a better model.
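The boundary-weighted loss can be sketched as a per-pixel weight map built from a distance transform of the annotation; the Gaussian decay and constants below are assumptions, not the rebuild's exact scheme.

```python
import numpy as np
import torch
from scipy.ndimage import distance_transform_edt

def boundary_weight_map(mask, w0=5.0, sigma=5.0):
    """Per-pixel weights that peak at the annotated skull boundary.
    mask: binary numpy array (1 = skull). Constants are illustrative."""
    dist_in = distance_transform_edt(mask)         # distance to boundary, inside
    dist_out = distance_transform_edt(1 - mask)    # distance to boundary, outside
    dist_to_boundary = dist_in + dist_out          # one term is zero at every pixel
    return 1.0 + w0 * np.exp(-(dist_to_boundary ** 2) / (2 * sigma ** 2))

def boundary_weighted_bce(logits, target, weight_map):
    """BCE where pixels near the boundary are upweighted by the map above."""
    w = torch.as_tensor(weight_map, dtype=logits.dtype, device=logits.device)
    per_pixel = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, target, reduction="none")
    return (per_pixel * w).mean()
```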

The cine-loop system: Sonographers assess HC over a probe sweep, not a single frozen frame. The temporal model processes 16-frame sequences via a shared 2D U-Net encoder + temporal self-attention (MAE 2.10mm, ISUOG compliant). Training data was synthesized via Pseudo-LDDM v2 (Ornstein-Uhlenbeck probe motion, per-frame skull variation, Rician speckle) because real cine acquisitions were not available.
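A minimal sketch of the shared-encoder plus temporal self-attention pattern. The real system segments every frame; here the encoder is assumed to return a flat feature per frame, and all dimensions and layer names are assumptions.

```python
import torch
import torch.nn as nn

class TemporalHCModel(nn.Module):
    """Shared 2D encoder applied per frame, then self-attention across
    the 16-frame cine sequence (sketch with assumed dimensions)."""
    def __init__(self, encoder_2d, feat_dim=256, heads=4):
        super().__init__()
        self.encoder = encoder_2d                       # shared 2D U-Net encoder
        self.attn = nn.MultiheadAttention(feat_dim, heads, batch_first=True)
        self.head = nn.Linear(feat_dim, 1)              # per-frame HC regression

    def forward(self, clip):                            # clip: (B, T, C, H, W)
        B, T, C, H, W = clip.shape
        feats = self.encoder(clip.view(B * T, C, H, W)) # assumed (B*T, feat_dim)
        feats = feats.view(B, T, -1)
        fused, _ = self.attn(feats, feats, feats)       # temporal self-attention
        return self.head(fused).squeeze(-1)             # (B, T) HC per frame
```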

Documented limitation: Third-trimester MAE 7.60mm exceeds the ISUOG threshold. Cause: acoustic shadowing from the ossified skull. Explicitly documented in Model Card with recommendation for manual verification at >30 weeks GA.

Clinical proposition
Replaces 2–4 min manual calliper placement with reproducible automated measurement. 153 sonographer hours saved per unit per year at 20 scans/day.
What a clinician receives
HC in mm, gestational age ± 2-week CI, trimester classification, GradCAM++ overlay, frame-level uncertainty map, dual-mode PDF report (LLM + template)
Deployment connection
Hybrid Crossover pruning (see Research section) delivers 2× compression + accuracy improvement for this backbone — enabling clinical hardware deployment without accuracy penalty
Governance
GA-trimester bias audit, FDA SaMD Class II + EU IVDR Class B framing, Model Card with acoustic-shadowing limitation explanation and trimester-specific reliability indicators

Research Contribution

A novel compression method — motivated by clinical deployment.

A directed study with a clear research question, a novel contribution, rigorous evaluation, and an IEEE-format report. Originated from the deployment constraint in the fetal HC system.

★ Novel Method · Directed Study · CSCE 5934 · Prof. Russel Pears
CNN Filter Pruning — Hybrid Crossover Method
VGG-16 · CelebA · CIFAR-10 · IEEE-format report · Individual contribution
2×
Channel Compression
+0.37%
Accuracy Improvement
2×
Faster Runtime
2.05×
Best Latency Speedup

The research question: When two convolutional filters are found to be redundant, standard structured pruning discards the weaker one permanently. Is it better to delete — or to synthesise a new filter that preserves the information from both?

The contribution: Hybrid Crossover replaces deletion with regression-based synthesis. Given two redundant filters A and B, a new filter is learned via least-squares regression to match the element-wise max of their activation maps — the peak activation from each parent, in a single filter. Integrated into a Global ILR scoring pipeline with hard accuracy guard rails (≤2pp overall, ≤6pp per-class on CelebA).
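A hedged sketch of the synthesis step as described: build the layer's input patch matrix on a calibration set, take the element-wise max of the two parents' responses as the regression target, and solve for the new filter by least squares. The names and the im2col formulation are assumptions.

```python
import numpy as np

def synthesise_crossover_filter(patches, w_a, w_b):
    """Hybrid Crossover synthesis (sketch).

    patches : (N, k) im2col matrix of calibration inputs to the layer,
              one row per spatial position, k = in_channels * kH * kW
    w_a, w_b: flattened weights (k,) of the two redundant parent filters
    """
    resp_a = patches @ w_a
    resp_b = patches @ w_b
    target = np.maximum(resp_a, resp_b)          # peak activation of either parent
    # Least-squares fit of a single filter that reproduces the combined response.
    w_new, *_ = np.linalg.lstsq(patches, target, rcond=None)
    return w_new
```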

The counterintuitive result: Hybrid Crossover achieves 2× more compression AND higher accuracy AND 2× faster runtime than standard drop under identical constraints. Faster because regression-synthesized filters satisfy guard rails on the first attempt — standard drop under aggressive compression triggers expensive rollback loops that dominate total runtime.

Clinical deployment relevance: The CIFAR-10 5-CNN result (2.05× latency speedup, 1.2pp accuracy cost, 75% of B4 channels removed) demonstrates the method on a realistic inference backbone. Applied to the fetal HC segmentation backbone: same clinical accuracy, half the GPU memory, twice the inference speed — the difference between requiring a dedicated GPU server and running on existing radiology workstation infrastructure.

The novel idea
Synthesis over deletion: create a new filter from two redundant parents rather than discarding one. Reframes compression as an information-preservation problem, not a removal problem.
Why it's faster AND better
Synthesized filters satisfy accuracy guard rails on the first attempt. Standard drop under aggressive compression repeatedly violates constraints and triggers rollbacks — those rollback loops dominate runtime.
Limitations
Middle layers (B2–B3) resist compression due to low filter redundancy. Evaluated on binary classification only — multi-class and medical imaging domain extension are the natural next steps.
PyTorch · VGG-16 · ILR Saliency · Guard Rails · CelebA · CIFAR-10

Clinical Systems

Deployed systems from course and team work.

Substantial team projects with real datasets and documented clinical relevance. Individual contributions are clearly marked.

Edge AI · Wearable · 2-person team
WECARE — Cardiac & Fall Detection
ECG F1: 0.9864 · 0.033ms inference · 8 missed falls / 228

Arrhythmia and fall detection run entirely on-device — no cloud round-trip in a cardiac emergency. ECG stream: 1D CNN on MIT-BIH, class-weighted sampling for 75% normal-beat imbalance, TorchScript export at 0.033ms — 1,200× below the 40ms real-time clinical threshold. Fall stream: MobiFall, threshold set at 0.65 (not 0.50) to deliberately minimise missed falls at the cost of more false alarms. Threshold as clinical safety policy, not hyperparameter.
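A sketch of the ECG front end (band-pass filter, R-peak detection, beat windowing) using scipy; the cutoffs, peak-height heuristic, and window length are assumptions rather than the project's exact values.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def segment_beats(ecg, fs=360, window=128):
    """Band-pass the raw ECG, detect R-peaks, and cut fixed-length
    beat windows for the 1D CNN (MIT-BIH records are sampled at 360 Hz)."""
    b, a = butter(3, [0.5 / (fs / 2), 40 / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, ecg)
    peaks, _ = find_peaks(filtered,
                          distance=int(0.25 * fs),                 # refractory gap
                          height=np.percentile(filtered, 90))      # crude R threshold
    half = window // 2
    beats = [filtered[p - half:p + half] for p in peaks
             if p - half >= 0 and p + half <= len(filtered)]
    return np.stack(beats) if beats else np.empty((0, window))
```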

Individual contribution: full ECG pipeline (bandpass filter, R-peak segmentation, 1D CNN, class weighting, evaluation) + TorchScript export. Teammate: full IMU pipeline.
Digital Pathology · Full-Stack · 5-person team
Histopathologic Cancer Detection
AUC 0.921 · F1 0.819 · 187K patches · CADe compliance

Binary cancer detection on PCam (Camelyon16). Key finding: 4-stage data-volume study showed non-monotonic AUC scaling — adding more data initially degraded performance before recovering. This is directly informative for clinical AI validation practice. Post-course additions: GradCAM spatial explainability (FDA CADe audit requirement), confidence-tiered decision zones (auto-clear at p<0.10 removes 24.5% of patches at 2.6% miss rate), full Model Card. Full-stack Django REST API + React frontend.
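The confidence-tiered decision zones reduce to a simple thresholding policy; the upper "priority review" cutoff below is an assumption, and only the p < 0.10 auto-clear tier comes from the project.

```python
def triage_zone(p_cancer, clear_below=0.10, review_above=0.90):
    """Confidence-tiered routing of a patch-level cancer probability."""
    if p_cancer < clear_below:
        return "auto-clear"                 # ~24.5% of patches, 2.6% miss rate
    if p_cancer > review_above:
        return "priority review"            # assumed upper tier
    return "standard pathologist review"
```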

Individual contribution: one of two model developers in team of five. Led CNN architecture, training, hyperparameter tuning, and post-course GradCAM + governance additions.

Technical Breadth

Skills that transfer to clinical AI pipelines.

Each project below built a capability used somewhere in the clinical systems above. Included for breadth, not as clinical work.

Audio · Cloud
BirdCLEF Audio Classification

107-feature librosa pipeline, hierarchical XGBoost (Order → Family), live AWS Streamlit deployment. Acoustic signal processing and cloud deployment skills — directly transferable to auscultation, phonocardiography, respiratory AI.
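A small sketch of the librosa feature-extraction pattern feeding the hierarchical XGBoost models; this covers only a handful of the 107 features, and the sample rate is an assumption.

```python
import librosa
import numpy as np

def acoustic_features(path, sr=32000):
    """Summary statistics over frame-level librosa features (illustrative subset)."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    zcr = librosa.feature.zero_crossing_rate(y)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                           centroid.mean(axis=1), zcr.mean(axis=1)])
```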

Case Study →
Multimodal · Fusion
Multimodal Emotion Recognition

VGG-16 + BiLSTM late fusion on MELD. Fusion underperformed individual streams — diagnosing why (temporal misalignment) taught more than a success would. Multimodal failure analysis is directly applicable to imaging + EHR pipelines.
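A minimal sketch of the late-fusion pattern, concatenating a VGG-16 feature vector with a BiLSTM summary before a shared classifier; all dimensions and the seven-class output are assumptions about this setup.

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Concatenate a visual feature vector with a BiLSTM sequence summary."""
    def __init__(self, vis_dim=4096, seq_dim=300, hidden=128, classes=7):
        super().__init__()
        self.lstm = nn.LSTM(seq_dim, hidden, bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(vis_dim + 2 * hidden, classes)

    def forward(self, vis_feat, seq):        # vis_feat: (B, vis_dim), seq: (B, T, seq_dim)
        _, (h, _) = self.lstm(seq)           # h: (2, B, hidden) for a single BiLSTM layer
        seq_feat = torch.cat([h[0], h[1]], dim=-1)
        return self.classifier(torch.cat([vis_feat, seq_feat], dim=-1))
```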

Case Study →
Big Data · Cloud
Customer Segmentation — PySpark

541K transactions on AWS S3 + EMR. RFM feature engineering + K-Means / GMM. Large-scale clinical data (EHR, PACS, claims) uses the same infrastructure pattern — PySpark on distributed compute, not pandas.
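The same infrastructure pattern in a few lines of PySpark, assuming pre-computed RFM columns; the S3 path and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler, StandardScaler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("rfm-segmentation").getOrCreate()
rfm = spark.read.parquet("s3://bucket/rfm_features/")   # hypothetical path

# Assemble and standardise the RFM columns, then cluster with K-Means.
assembler = VectorAssembler(inputCols=["recency", "frequency", "monetary"],
                            outputCol="raw_features")
scaler = StandardScaler(inputCol="raw_features", outputCol="features")
assembled = assembler.transform(rfm)
scaled = scaler.fit(assembled).transform(assembled)

model = KMeans(k=4, seed=42, featuresCol="features").fit(scaled)
segments = model.transform(scaled)    # adds a 'prediction' cluster column
```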

Case Study →
Classical Vision · MATLAB
Industrial Quality Classification

Sobel + morphological operations, Gabor + Wavelet features, mIoU evaluation in MATLAB. Classical image processing foundations remain relevant for interpretable feature engineering where training data is scarce.

Case Study →

Course Collections

Theoretical foundations and regulatory context.

Multi-project course pages — each covers the assignments within a single course that collectively built the groundwork for the clinical systems above.

Machine Learning · Fall 2024 · Prof. Russel Pears
Machine Learning — 3 Projects
Glass ID (RBF SVM), CelebA attribute recognition (VGG-16, 0.9594 acc), Bayesian network construction. Kernel methods, transfer learning, probabilistic inference.
View →
Fundamentals of AI · Spring 2025 · Prof. Russel Pears
AI Algorithms — 3 Projects
Warehouse robot (Dijkstra + A*), genetic algorithm for job-shop scheduling (makespan 298), value iteration RL (100% goal success, 22 iterations).
View →
AI in Wearables & Healthcare · Fall 2025 · Prof. Mahdi Pedram
Clinical AI Stack — 15 Activities
FDA/SaMD frameworks, Seeed XIAO nRF52840 hardware (PPG + IMU at 50Hz), YOLOv8 food detection (mAP@50 = 0.913), MIT App Inventor mobile deployment, MIMIC-IV ICU analysis. Real hardware, real clinical regulatory context.
View →
Scientific Data Visualization · Summer 2025 · Prof. Zeenat Tariq
Data Visualization — 2 Assignments
Tableau (Global Superstore), D3.js via Observable (Iris), Power BI (tech layoffs). Clinical dashboards and decision-support interfaces require these same skills.
View →
Applications of AI in Health · Spring 2026 · Prof. Haihua Chen · Health Informatics Dept.
HINF 5506 — In Progress
Clinical decision support, health informatics, AI integration into care workflows, clinical necessity vs. technical feasibility. Bridges the gap between notebook model output and clinical deployment requirements.
Prof. Chen, Health Informatics · Completing Spring 2026
In Progress
Ready to talk about clinical AI?
MS AI (Biomedical) · UNT · Available June 2026 · STEM OPT · Open to relocation
Get in Touch →