Survey of transformers for vision, text, and audio
Survey of deep learning for image retrieval
Unblurring defocused images using multi-branch neural networks
Advantages of nested transformers for computer vision
Neural network for counting crowds
Vision transformer morphed into a CNN works better
Getting better placement of objects in a scene
Multi-layer perceptrons for vision competitive with transformers and CNNs
Image segmentation of camouflaged objects
Facebook AI software does speech recognition without any transcribed data
Making street maps from satellite images
Google Vertex AI builds, trains, and deploys scalable machine learning models
Survey of work on remaining challenges in biometrics
Getting 3D shapes of people, animals, and other non-rigid objects from video
Detecting and segmenting 3D objects in a room after training only with a list of the room's objects
Automating Data Science: Prospects and Challenges
Computer vision transformer models (CLIP, ViT, DeiT) released by Hugging Face
Retrieving images that match a text-plus-image query, and generating descriptions of images
Results of a competition on enhancing video resolution
Getting the shape of a room and the people in it from echoes using one microphone