GenSeg-Baselines

Reproducible code for a 2D medical-image segmentation benchmark: 8 methods × 10 datasets × 3 seeds/folds, 7 metrics, evaluated under a unified resolution-fair protocol. Companion to the GenSegDataset.

This is a code-only repository — trained checkpoints and the generated result tables are not hosted here.

Methods: UNet, UNet++, DeepLabV3+ (ResNet-50/ImageNet), Attention-UNet (from scratch), TransUNet (R50-ViT-B/16, input 256), Swin-UNet (Swin-Tiny, input 224), nnU-Net v2 (250 ep), U-Mamba (UMambaBot, 100 ep).

Datasets: cvc_clinicdb, kvasir_seg, fives, busi, refuge2, acdc, idridd, pannuke, isic2018, kits19.
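As a quick sanity check on the size of the benchmark, the experiment grid can be enumerated as below. This is only a sketch: the method labels are the display names from this README (not necessarily the registry keys used by the code), and the seed values `0, 1, 2` are an assumption, since the README states only that there are three seeds/folds.

```python
from itertools import product

# Display names taken from the README; the actual registry keys may differ.
METHODS = [
    "UNet", "UNet++", "DeepLabV3+", "Attention-UNet",
    "TransUNet", "Swin-UNet", "nnU-Net v2", "U-Mamba",
]
DATASETS = [
    "cvc_clinicdb", "kvasir_seg", "fives", "busi", "refuge2",
    "acdc", "idridd", "pannuke", "isic2018", "kits19",
]
SEEDS = [0, 1, 2]  # hypothetical seed/fold indices

# One entry per (method, dataset, seed) training run.
runs = list(product(METHODS, DATASETS, SEEDS))
print(len(runs))  # 8 * 10 * 3 = 240
```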

Metrics (computed per image, then aggregated): Dice, IoU, HD95, ASSD, Sensitivity, Specificity, Precision — plus per-class Dice for the multi-class datasets and paired-Wilcoxon significance on per-image Dice.
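The overlap-based metrics can be illustrated for a single binary mask pair as follows. This is a minimal NumPy sketch, not the code in `metrics/`: the function names `overlap_metrics` and `aggregate`, the `eps` smoothing term, and the resulting convention that two empty masks score 0 are all assumptions made for the example.

```python
import numpy as np

def overlap_metrics(pred, gt, eps=1e-8):
    """Per-image overlap metrics for one binary prediction / ground-truth pair."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    tp = np.sum(pred & gt)    # foreground predicted as foreground
    fp = np.sum(pred & ~gt)   # background predicted as foreground
    fn = np.sum(~pred & gt)   # foreground predicted as background
    tn = np.sum(~pred & ~gt)  # background predicted as background
    return {
        "dice": float(2 * tp / (2 * tp + fp + fn + eps)),
        "iou": float(tp / (tp + fp + fn + eps)),
        "sensitivity": float(tp / (tp + fn + eps)),
        "specificity": float(tn / (tn + fp + eps)),
        "precision": float(tp / (tp + fp + eps)),
    }

def aggregate(per_image):
    """Average each metric over a list of per-image metric dicts."""
    keys = per_image[0].keys()
    return {k: float(np.mean([m[k] for m in per_image])) for k in keys}

# Two toy 2x2 images: one with a false positive, one perfect.
scores = [
    overlap_metrics([[1, 1], [0, 0]], [[1, 0], [0, 0]]),
    overlap_metrics([[1, 0], [0, 1]], [[1, 0], [0, 1]]),
]
summary = aggregate(scores)
```

Keeping the per-image scores (rather than only the mean) is what makes a paired test possible: two methods evaluated on the same images yield paired Dice vectors, which could be compared with, for example, `scipy.stats.wilcoxon`.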

Resolution-fair protocol

Convolutional nets are trained at 512. The fixed-input transformers (Swin-UNet at 224, TransUNet at 256) and nnU-Net / U-Mamba run at their native input size. Each prediction and its ground truth are resized to a common 512×512 grid before scoring, so boundary metrics (HD95/ASSD, in pixels) are directly comparable across methods.
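The sketch below illustrates why the common grid matters for pixel-valued boundary metrics. It is not the repository's `eval_at_res.py`: the helper names, the choice of nearest-neighbour interpolation, and the HD95 convention (maximum of the two directed 95th-percentile boundary distances) are assumptions, and the brute-force distance matrix is only practical for small masks.

```python
import numpy as np

def resize_nearest(mask, size=512):
    """Nearest-neighbour resize of a 2D label mask to size x size."""
    mask = np.asarray(mask)
    h, w = mask.shape
    # Map each output pixel centre back to a source row / column.
    rows = np.minimum(((np.arange(size) + 0.5) * h / size).astype(int), h - 1)
    cols = np.minimum(((np.arange(size) + 0.5) * w / size).astype(int), w - 1)
    return mask[np.ix_(rows, cols)]

def boundary(mask):
    """Foreground pixels with at least one 4-connected background neighbour."""
    m = np.pad(np.asarray(mask).astype(bool), 1)  # pad with background
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return m[1:-1, 1:-1] & ~interior

def hd95(pred, gt):
    """95th-percentile symmetric boundary distance, in pixels."""
    p = np.argwhere(boundary(pred))
    g = np.argwhere(boundary(gt))
    if len(p) == 0 or len(g) == 0:
        return float("nan")  # undefined when either mask is empty
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))

# Toy example: the same masks scored on a 4x4 and on an 8x8 grid.
pred = np.zeros((4, 4), dtype=np.uint8)
gt = np.zeros((4, 4), dtype=np.uint8)
pred[0, 0] = 1
gt[0, 3] = 1
hd_native = hd95(pred, gt)                                       # 3.0
hd_common = hd95(resize_nearest(pred, 8), resize_nearest(gt, 8))  # 6.0
```

The same pair of masks gives a distance of 3 pixels on the 4×4 grid and 6 pixels on the 8×8 grid, so HD95/ASSD values from models evaluated at 224, 256, and 512 are only comparable after mapping everything to one resolution.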

Layout (code only)

code/framework/ — training/evaluation framework: train.py, test.py, eval_at_res.py, nnunet_eval.py; metrics/ (the 7 metrics + boundary distances); models/ (SMP wrappers, Attention-UNet, Swin/TransUNet wrappers, model registry); report/aggregate.py builds the summary tables (per-dataset Dice/HD95/IoU, per-class Dice, Sensitivity/Precision, significance).
code/sota/{Swin-Unet,TransUNet}/ — upstream network definitions imported by the Swin-UNet / TransUNet wrappers.
code/scripts/ — reproduction scripts (unified-512 training & evaluation, nnU-Net / U-Mamba pipelines).
code/envs/ — conda environments (seggen.yml, nnunet.yml, umamba.yml).
