Semi-supervised segmentation by students using transformers and CNN with TCC
Transformer-CNN Cohort: Semi-supervised Semantic Segmentation by the Best of Both Students
arXiv paper abstract https://arxiv.org/abs/2209.02178v1
arXiv PDF paper https://arxiv.org/pdf/2209.02178v1.pdf
... methods for semi-supervised semantic segmentation mostly adopt a unitary network model using convolutional neural networks (CNNs) and enforce consistency of the model predictions over small perturbations applied to the inputs or model.
... propose a novel Semi-supervised Learning approach, called Transformer-CNN Cohort (TCC), that consists of two students with one based on the vision transformer (ViT) and the other based on the CNN.
... First, as the inputs of the ViT student are image patches, the feature maps extracted encode crucial class-wise statistics.
... propose class-aware feature consistency distillation (CFCD) that first leverages the outputs of each student as the pseudo labels and generates class-aware feature (CF) maps.
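The excerpt above does not spell out how the CF maps are computed, so the following is only a minimal sketch of one plausible reading: average the feature vectors of all pixels that share a pseudo label, then compare the per-class averages of the two students. The function names `class_aware_features` and `feature_consistency`, the mean-squared distance, and the assumption that both students produce features with the same channel count are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def class_aware_features(features, pseudo_labels):
    """Average the feature vectors of all pixels sharing a pseudo label.

    features:      H x W grid of feature vectors (lists of floats)
    pseudo_labels: H x W grid of integer class ids
    Returns a dict {class_id: mean feature vector}.
    NOTE: hypothetical helper; masked average pooling is an assumption.
    """
    sums = {}
    counts = defaultdict(int)
    for feat_row, label_row in zip(features, pseudo_labels):
        for vec, cls in zip(feat_row, label_row):
            if cls not in sums:
                sums[cls] = [0.0] * len(vec)
            sums[cls] = [s + v for s, v in zip(sums[cls], vec)]
            counts[cls] += 1
    return {cls: [s / counts[cls] for s in total] for cls, total in sums.items()}


def feature_consistency(cf_a, cf_b):
    """Mean squared distance between class-aware features over shared classes.

    NOTE: the distance measure is an assumption made for illustration.
    """
    shared = sorted(set(cf_a) & set(cf_b))
    if not shared:
        return 0.0
    total = 0.0
    for cls in shared:
        diffs = [(a - b) ** 2 for a, b in zip(cf_a[cls], cf_b[cls])]
        total += sum(diffs) / len(diffs)
    return total / len(shared)


# Toy 2 x 2 image with 2-channel features; both students use the same pseudo labels here.
labels = [[0, 0], [1, 1]]
vit_feats = [[[1.0, 0.0], [3.0, 0.0]], [[0.0, 2.0], [0.0, 4.0]]]
cnn_feats = [[[2.0, 0.0], [2.0, 0.0]], [[0.0, 1.0], [0.0, 3.0]]]

cf_vit = class_aware_features(vit_feats, labels)
cf_cnn = class_aware_features(cnn_feats, labels)
loss = feature_consistency(cf_vit, cf_cnn)
```

In a real training loop the features would be tensors and each student would be guided by the other student's pseudo labels; the toy above only shows the pooling and comparison steps.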
... Second, as the ViT student has more uniform representations for all layers, ... propose consistency-aware cross distillation to transfer knowledge between the pixel-wise predictions from the cohort.
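To make the idea of transferring knowledge between pixel-wise predictions concrete, here is a minimal sketch of mutual distillation between two students using a symmetric per-pixel KL divergence. The excerpt does not describe how the "consistency-aware" part weights or selects pixels, so that part is left out; the function name `cross_distillation_loss` and the choice of KL divergence are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_distillation_loss(logits_a, logits_b):
    """Symmetric KL between two students' predictions, averaged over pixels.

    logits_a, logits_b: lists of per-pixel logit vectors (same length).
    NOTE: hypothetical helper; the paper's consistency-aware weighting
    is not described in the excerpt and is not modeled here.
    """
    total = 0.0
    for la, lb in zip(logits_a, logits_b):
        pa, pb = softmax(la), softmax(lb)
        total += kl_divergence(pa, pb) + kl_divergence(pb, pa)
    return total / len(logits_a)


# Two pixels, three classes: the students agree on pixel 0 and disagree on pixel 1.
vit_logits = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
cnn_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]

same = cross_distillation_loss(vit_logits, vit_logits)
diff = cross_distillation_loss(vit_logits, cnn_logits)
```

The loss is zero when the students agree everywhere and grows with disagreement. In a framework with automatic differentiation, each direction would typically treat the other student's prediction as a fixed target.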
... TCC ... significantly outperforms existing semi-supervised methods by a large margin.
Web site with my other posts by category https://morrislee1234.wixsite.com/website