INSIGHT: Explainable Weakly-Supervised Medical Image Analysis

A novel approach to interpretable and efficient medical image analysis using weakly supervised learning.

Wenbo Zhang

University of Rochester

Junyu Chen

University of Rochester

Christopher Kanan

University of Rochester

Visualization

Below are examples of heatmaps generated by INSIGHT, which highlight diagnostically relevant regions in whole-slide images (WSIs). Our method achieves this using only WSI-level labels, making it both efficient and interpretable without requiring costly pixel-level annotations.

Motivation

The rapid growth of medical imaging data has presented significant challenges for developing diagnostic systems that are both accurate and interpretable. Traditional methods often rely on fully supervised approaches that require dense annotations, which are labor-intensive and costly to obtain. Moreover, existing aggregators, such as those based on multiple-instance learning (MIL), struggle to achieve a balance between classification accuracy and spatial calibration. While they can identify regions of interest, they typically depend on post-hoc visualization methods like Grad-CAM to generate interpretable outputs. This reliance on external tools introduces additional complexity and fails to integrate interpretability as a core feature of the model.

About INSIGHT

INSIGHT (Integrated Network for Segmentation and Interpretation with Generalized Heatmap Transmission) is a novel framework designed to analyze large-scale medical images, such as whole-slide pathology images (WSIs) and volumetric CT scans, while maintaining interpretability for clinicians. It addresses the limitations of traditional methods by embedding interpretability directly into its architecture, eliminating the need for post-hoc visualization tools like Grad-CAM. INSIGHT combines fine-grained local feature detection with broader contextual awareness through two key modules: the Detection Module, which captures small, diagnostically critical details, and the Context Module, which suppresses irrelevant activations by incorporating global contextual information. This design enables INSIGHT to generate heatmaps that closely align with ground-truth diagnostic regions, offering both accuracy and transparency. By requiring only image-level labels, INSIGHT significantly reduces the annotation burden while delivering state-of-the-art classification and weakly supervised segmentation performance.
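The detect-then-suppress idea described above can be sketched in a few lines. This is only an illustrative composition, not INSIGHT's actual layers: `detection` stands for per-patch class scores from the Detection Module, and `context_logits` for a hypothetical global-context gate produced by the Context Module.

```python
import numpy as np

def combine_heatmaps(detection, context_logits):
    """Illustrative detect-then-suppress composition (not the paper's
    exact architecture): patch scores from a detection branch are
    multiplied by a sigmoid gate derived from global context, so
    activations the context deems irrelevant are pushed toward zero."""
    gate = 1.0 / (1.0 + np.exp(-np.asarray(context_logits, dtype=np.float64)))
    return np.asarray(detection, dtype=np.float64) * gate
```

With a strongly negative context logit the gate is near 0 and the detection score is suppressed; with a strongly positive logit the score passes through almost unchanged.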

INSIGHT Architecture

Overview of the INSIGHT framework: (a) Input images (WSIs or CT volumes) are preprocessed and transformed into spatial embeddings using a pretrained encoder. (b) Each slice or patch embedding is processed by INSIGHT, which consists of a detection module to capture fine-grained signals and a context suppression module to reduce false positives. This produces patch-level heatmaps for each category, which are then aggregated to generate whole-slide or volume-level heatmaps. Final predictions for each category are obtained via SmoothMax pooling over the heatmaps. Throughout this process, spatial resolution is preserved, encouraging the model to produce reliable and interpretable heatmaps.
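The SmoothMax pooling step can be sketched as follows. The overview above does not spell out the exact formula, so this uses one common "smooth max": a softmax-weighted average that interpolates between mean pooling (small `alpha`) and max pooling (large `alpha`); INSIGHT's precise pooling may differ.

```python
import numpy as np

def smoothmax(heatmap, alpha=8.0):
    """Softmax-weighted average of heatmap scores. As alpha grows the
    result approaches max(heatmap); as alpha -> 0 it approaches the
    mean. One common 'SmoothMax' formulation, shown for illustration."""
    x = np.asarray(heatmap, dtype=np.float64).ravel()
    w = np.exp(alpha * (x - x.max()))  # shift by max for numerical stability
    return float((w * x).sum() / w.sum())
```

Applied per category to the slide- or volume-level heatmap, this yields one scalar prediction per class while letting gradients flow to every patch, unlike a hard max.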

Quantitative Performance

Below we compare INSIGHT with prior models on the CAMELYON16, BRACS, and MosMed datasets, reporting classification AUC and segmentation Dice.

CAMELYON16 & BRACS

                 CAMELYON16              BRACS (per-subtype AUC)
Aggregator       AUC    Dice (%)         ADH    FEA    DCIS   Invasive
ABMIL            0.975  55.8 ± 25.0      0.656  0.744  0.804  0.995
CLAM-SB          0.966  64.7 ± 24.1      0.611  0.757  0.833  0.999
CLAM-MB          0.973  67.7 ± 22.6      0.701  0.687  0.828  0.998
TransMIL         0.982  12.4 ± 22.4      0.644  0.653  0.769  0.989
WiKG             0.967  66.1 ± 24.2      0.454  0.653  0.771  0.990
INSIGHT (Ours)   0.990  74.6 ± 19.1      0.734  0.790  0.837  0.999

MosMed

Task            Model       Setting        AUC / Dice (%)
Classification  PR-3D-CNN   Five-fold CV   AUC 0.914 ± 0.049
                INSIGHT     Five-fold CV   AUC 0.962 ± 0.012
Segmentation    3D U-Net    Voxel-level    Dice 40.5 ± 21.3
                3D GAN      Volume-level   Dice 41.2 ± 14.7
                INSIGHT     Volume-level   Dice 42.7 ± 15.3

Acknowledgments

This work was supported in part by NSF award #2326491. The views and conclusions contained herein are those of the authors and should not be interpreted as the official policies or endorsements of any sponsor. We thank Jhair Gallardo and Shikhar Srivastava for their comments on early drafts.
