Combine knowledge of many vision transformers into a smaller multi-talented model
Improve segmentation in bin picking by using part awareness
Get 3D layout of room from a panoramic image with LGT-Net
Convert a blurry image to a sharp video using a continuous intensity function with E-CIR
Survey of continuous human action recognition
Survey of re-identifying people seen by multiple cameras for various problem types
Real-time object segmentation in video with faster training using EfficientVIS
Better recognition of human actions using two-stage detection transformers
Improve scene segmentation with smaller models by distilling knowledge with SKR+PEA
Estimate the pixel map homography using image features and pixels
Segment objects of any type in image using model trained without manual annotations with FreeSOLO
Better object detection when the domain changes by balancing domains with BD-DIR
Improve object discovery with self-supervised transformers using TokenCut
Better image stitching by using depth information
Improve any image feature matcher on large appearance changes using OETR
Human pose estimation with 80% smaller model and 68% less CPU using STNet
Improve vision models by pretraining on uncurated images without supervision with SEER
Better 3D surface reconstruction from point clouds by using sensor's viewpoint
Remove haze in a single image using estimated transmission map with EDN-GTM
Improved object detection when test and train domains differ with MS-DAYOLO