Improved object segmentation in video by using object descriptors instead of pixel matching
HODOR: High-level Object Descriptors for...

morrislee
Dec 20, 2021 · 1 min read


Answer questions about scene using image and 3D information
3D Question Answering arXiv paper abstract https://arxiv.org/abs/2112.08359v1...

morrislee
Dec 17, 2021 · 1 min read


Get 3D shape, pose, and relative depth of people from a single image despite occlusion
Putting People in their Place: Monocular...

morrislee
Dec 16, 2021 · 1 min read


Improved super-resolution for display screens by using transformer designed for screen content
Implicit Transformer Network for Screen...

morrislee
Dec 15, 2021 · 1 min read


Better enhancement of dim images with edge-awareness using CSDNet
Learning Deep Context-Sensitive Decomposition for Low-Light Image...

morrislee
Dec 14, 2021 · 1 min read


Robot grips new objects in new poses from 10 examples using neural descriptor fields
Neural Descriptor Fields: SE(3)-Equivariant Object...

morrislee
Dec 13, 2021 · 1 min read


Recognize 3D objects when only trained on 2D image and text pairs with PointCLIP
PointCLIP: Point Cloud Understanding by CLIP arXiv paper...

morrislee
Dec 10, 2021 · 1 min read


Track people in images better by building 3D model from image and predicting appearance
Tracking People by Predicting 3D Appearance,...

morrislee
Dec 9, 2021 · 1 min read


Calibrate cameras from video with sub-pixel error using self-supervision
Self-Supervised Camera Self-Calibration from Video arXiv paper...

morrislee
Dec 8, 2021 · 1 min read


Restore faces blurred by air turbulence with prior knowledge from GAN network
LTT-GAN: Looking Through Turbulence by Inverting GANs arXiv...

morrislee
Dec 7, 2021 · 1 min read


Detect new objects better by teaching classifier not to ignore unlabeled objects
Learning to Detect Every Thing in an Open World arXiv...

morrislee
Dec 6, 2021 · 1 min read


Correcting an image classifier prediction by using a single image
Editing a classifier by rewriting its prediction rules arXiv paper...

morrislee
Dec 3, 2021 · 1 min read


Segment objects in a video that are mentioned in a text query
End-to-End Referring Video Object Segmentation with Multimodal Transformers...

morrislee
Dec 2, 2021 · 1 min read


Better document understanding without OCR using Donut transformer
Donut: Document Understanding Transformer without OCR arXiv paper...

morrislee
Dec 1, 2021 · 1 min read


Get centimeter depth image from smartphone using LiDAR and unsteadiness of hand
The Implicit Values of A Good Hand Shake: Handheld...

morrislee
Nov 30, 2021 · 1 min read


Classifying visual and audio events of various durations in videos with MM-Pyramid
MM-Pyramid: Multimodal Pyramid Attentional Network for...

morrislee
Nov 29, 2021 · 1 min read


Multi-label image classification using information on context, space, and meaning
Spatial-context-aware deep neural network for...

morrislee
Nov 26, 2021 · 1 min read


Survey of panoptic image segmentation for objects and regions
Panoptic Segmentation: A Review arXiv paper abstract...

morrislee
Nov 24, 2021 · 1 min read


Many types of computer vision tasks possible with new customizable vision foundation model, Florence
Florence: A New Foundation Model for...

morrislee
Nov 23, 2021 · 1 min read


Correcting Face Distortion in Wide-Angle Videos
Correcting Face Distortion in Wide-Angle Videos arXiv paper abstract...

morrislee
Nov 22, 2021 · 1 min read