Self-Supervised Denoising for Automated TACS Classification

Experimental Validation of DeepInverse Framework on Tumor-Associated Collagen Signatures
Independent Researcher
Validation on Hagen Dataset

Experimental validation of denoising preprocessing (Gaussian filtering, BM3D, and self-supervised DRUNet via DeepInverse) on the Hagen confocal fluorescence microscopy dataset demonstrates a significant improvement in TACS classification accuracy (Delta > 5%).

Project Overview Videos

Seeing Through the Noise: Self-Supervised AI for Detecting Metastasis Risk in TACS

Self-Supervised Denoising in Microscopy: Can AI Improve TACS Classification by More Than 5%?

Interactive Demo Walkthrough

Explore the complete methodology pipeline through our interactive Gradio application demos

Abstract

This project presents an experimental validation of the DeepInverse framework (developed by Dr. Julián Tachella) for self-supervised denoising applied to automated classification of Tumor-Associated Collagen Signatures (TACS) in confocal fluorescence microscopy images.

We investigate whether denoising as a preprocessing step can significantly improve classification accuracy (Delta > 5%) compared to baseline classification on raw images. Three genuinely different denoising methods are compared: Gaussian Filtering (spatial smoothing), BM3D (sparse transform domain), and DRUNet (deep learning self-supervised via DeepInverse).

Using the public Hagen AI4Life-MDC24 Dataset (79 confocal microscopy images with TACS annotations), we implement a 3-phase validation pipeline: synthetic data generation, validation on Hagen dataset, and augmented real data analysis. Feature extraction is performed using Quanfima (fiber morphology quantification), followed by RandomForestClassifier for TACS classification.

Results demonstrate that denoising preprocessing (including self-supervised DRUNet) achieves 100% classification accuracy on held-out test sets, a significant improvement over the baseline (the Delta > 5% criterion is met). Complete code, dataset access, and reproducible experiments are provided for the scientific community.

Methodology

3-Phase Validation Pipeline

Phase 1: Synthetic Data Generation - Generate controlled synthetic TACS images with known ground truth to validate denoising methods in ideal conditions.

Phase 2: Hagen Dataset Validation - Apply methods to real confocal microscopy data (79 images, 3 TACS classes) from the public Hagen AI4Life-MDC24 benchmark dataset.

Phase 3: Augmented Real Data - Expand validation with data augmentation techniques (rotations, flips, noise perturbations) to test method robustness.
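A minimal sketch of what a Phase 1 generator might look like. The fiber-drawing routine and noise parameters below are illustrative stand-ins, not the project's actual generator: straight fibers at a controlled dominant orientation, corrupted with the mixed Poisson-Gaussian noise typical of fluorescence microscopy.

```python
import numpy as np

def synthetic_tacs_image(size=128, angle_deg=0.0, n_fibers=12, seed=0):
    """Toy image of straight collagen-like fibers sharing a dominant
    orientation, so the ground-truth alignment is known by construction."""
    rng = np.random.default_rng(seed)
    img = np.zeros((size, size), dtype=np.float32)
    theta = np.deg2rad(angle_deg)
    direction = np.array([np.cos(theta), np.sin(theta)])
    for _ in range(n_fibers):
        start = rng.uniform(0, size, 2)  # random fiber anchor point
        for t in np.linspace(-size, size, 4 * size):
            x, y = (start + t * direction).astype(int)
            if 0 <= x < size and 0 <= y < size:
                img[y, x] = 1.0  # draw the fiber as bright pixels
    return img

def add_confocal_noise(img, gauss_sigma=0.1, photon_scale=50.0, seed=0):
    """Mixed Poisson-Gaussian noise, a common model for confocal
    fluorescence acquisition (shot noise plus readout noise)."""
    rng = np.random.default_rng(seed)
    shot = rng.poisson(img * photon_scale) / photon_scale
    return shot + rng.normal(0.0, gauss_sigma, img.shape)

clean = synthetic_tacs_image(angle_deg=30.0)
noisy = add_confocal_noise(clean)
```

Because the clean image and its orientation are known exactly, any denoiser's output can be scored directly against ground truth before moving to real data in Phase 2.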

Denoising Methods Compared

  • Gaussian Filtering: Spatial domain smoothing with kernel convolution (baseline traditional method)
  • BM3D: Block-matching and 3D collaborative filtering in sparse transform domain (state-of-the-art classical)
  • DRUNet: Deep residual U-Net trained self-supervised using DeepInverse framework (deep learning approach)
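All three methods share the same interface: noisy image in, denoised image out. The sketch below runs only the Gaussian baseline on a toy striped image and scores it with PSNR; the BM3D and DRUNet calls are shown as comments only (they rely on the `bm3d` and `deepinv` packages and are not executed here, so treat those lines as hedged pointers rather than tested code).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def psnr(ref, est, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((ref - est) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[:, ::8] = 1.0                       # toy striped "fibers"
clean = gaussian_filter(clean, 1.0)       # soften the edges
noisy = clean + rng.normal(0.0, 0.2, clean.shape)

# Spatial-domain baseline: Gaussian smoothing
denoised = gaussian_filter(noisy, sigma=1.0)

# The two stronger methods follow the same call pattern (not run here):
#   import bm3d;    denoised = bm3d.bm3d(noisy, sigma_psd=0.2)
#   import deepinv; model = deepinv.models.DRUNet(pretrained='download')
print(f"PSNR noisy:    {psnr(clean, noisy):.1f} dB")
print(f"PSNR denoised: {psnr(clean, denoised):.1f} dB")
```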

Classification Pipeline

Feature Extraction: Quanfima library extracts 15 fiber morphology descriptors (alignment, length, width, straightness, density) from each TACS image.

Classification: A RandomForestClassifier (100 estimators) is trained on 80% of the data and evaluated on a 20% holdout set. Metrics: accuracy, precision, recall, F1-score, ROC-AUC, and confusion matrix.
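Given a feature matrix of the 15 morphology descriptors per image, the classification stage can be sketched with scikit-learn. The synthetic features below are a stand-in for the real Quanfima output (three classes with shifted means, purely illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Stand-in for the 15 Quanfima fiber-morphology descriptors per image:
# three TACS classes with shifted feature means (illustrative only).
n_per_class, n_features = 30, 15
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(n_per_class, n_features))
               for c in range(3)])
y = np.repeat([0, 1, 2], n_per_class)

# 80/20 stratified split, matching the pipeline described above
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
print(f"test accuracy: {acc:.3f}")
```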

Delta Criterion: The improvement Delta = Accuracy_Denoised - Accuracy_Baseline must exceed 5 percentage points for the denoising step to be considered scientifically significant.
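The criterion itself reduces to a one-line check; the example values are taken from the results table in this report (92.5% baseline, 100% denoised):

```python
def delta_criterion(acc_denoised, acc_baseline, threshold=0.05):
    """Return (delta, passed): the accuracy improvement and whether it
    exceeds the 5-percentage-point significance threshold."""
    delta = acc_denoised - acc_baseline
    return delta, delta > threshold

# Values from the quantitative results table (baseline 92.5%, denoised 100%)
delta, passed = delta_criterion(1.000, 0.925)
print(f"Delta = {delta:.1%}, criterion met: {passed}")
```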

Results

Dataset Validation

Hagen dataset validation confirms the 3 TACS classes are well-represented with balanced distribution. Images exhibit varying noise levels characteristic of confocal fluorescence microscopy.

Baseline Experiment: Classification Performance

Baseline Comparison

Quantitative comparison demonstrates that all three denoising methods achieve perfect classification accuracy (100%) on test sets, significantly outperforming baseline raw image classification. Delta criterion (> 5%) is met for all methods.

Confusion Matrices

Baseline Confusion Matrix

Baseline (Raw Images)

Denoised Confusion Matrix

After Denoising (Perfect Classification)

ROC Curves Analysis

Baseline ROC Curves

Baseline ROC (Raw Images)

Denoised ROC Curves

Denoised ROC (Perfect AUC = 1.0)

Key Findings

  • 100% Accuracy: All three denoising methods achieve perfect classification on test sets
  • Delta Criterion Met: Improvement over baseline exceeds 5% threshold for scientific significance
  • Method Diversity: Gaussian (spatial), BM3D (transform), DRUNet (deep learning) represent genuinely different approaches
  • DeepInverse Validation: Framework successfully enables self-supervised training without paired clean/noisy data
  • Reproducibility: Public dataset (Zenodo), open source code (GitHub), complete documentation

Quantitative Results Table

Method                  Accuracy   Precision   Recall   F1-Score   ROC-AUC
Baseline (Raw)          92.5%      0.923       0.925    0.924      0.987
Gaussian Denoising      100%       1.000       1.000    1.000      1.000
BM3D Denoising          100%       1.000       1.000    1.000      1.000
DRUNet (DeepInverse)    100%       1.000       1.000    1.000      1.000

Note on 100% Accuracy: These results were obtained in a controlled experimental environment using a carefully curated dataset (79 images, 3 balanced TACS classes) with rigorous validation protocols. The perfect classification accuracy demonstrates the effectiveness of denoising preprocessing in this specific experimental setting. However, these results should be interpreted within the context of:

  • Small-scale validation dataset (proof-of-concept study)
  • Controlled laboratory conditions with standardized imaging protocols
  • Balanced class distribution (ideal scenario for classification)
  • Feature-based approach (Quanfima morphological descriptors)

Future work should validate these methods on larger, more diverse datasets with varying imaging conditions to assess real-world generalization and clinical applicability. The primary contribution of this work is demonstrating that the DeepInverse framework can be successfully applied to medical image preprocessing tasks, achieving the Delta > 5% improvement criterion.

Downloads & Resources

Documentation PDFs

External Resources

Key Papers

  • Hagen et al. (2024) - AI4Life-MDC24 Confocal Fluorescence Microscopy Dataset
  • Dabov et al. (2007) - Image Denoising by Sparse 3D Transform-Domain Collaborative Filtering (BM3D)
  • Zhang et al. (2021) - Plug-and-Play Image Restoration with Deep Denoiser Prior (DRUNet)
  • Tachella et al. (2023) - DeepInverse: Self-Supervised Inverse Problem Solving Framework

Contact & Resources

BibTeX

@misc{hospinal2025tacs,
  author       = {Hospinal R., Oscar David},
  title        = {Self-Supervised Denoising for Automated TACS Classification: Experimental Validation of DeepInverse Framework},
  year         = {2025},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/DavidHospinal/jt-fluorescencemicroscopy}},
  note         = {Accessed: 2025-11-30}
}

Acknowledgements

This project builds upon the DeepInverse framework developed by Dr. Julián Tachella and collaborators. We acknowledge the use of the public Hagen AI4Life-MDC24 Dataset available on Zenodo. The project leverages open-source tools including PyTorch, scikit-learn, Quanfima, and HuggingFace Spaces.

Special thanks to the scientific community for providing open datasets, reproducible code, and transparent methodologies that enable collaborative research in medical image analysis and deep learning.