News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware

morrislee

Segment a scene from the words in its caption in one stage with PPMN


PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding



Panoptic Narrative Grounding (PNG) is an emerging task whose goal is to segment visual objects of things and stuff categories described by dense narrative captions of a still image.


... two-stage approach first extracts segmentation region proposals ... then conducts coarse region-phrase matching to ground the candidate regions for each noun phrase.


However, the two-stage pipeline usually suffers from the performance limitation of low-quality proposals in the first stage ... as well as complicated strategies designed for things and stuff


... To alleviate ... drawbacks, ... propose a one-stage end-to-end Pixel-Phrase Matching Network (PPMN), which directly matches each phrase to its corresponding pixels instead of region proposals
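
The pixel-phrase matching idea above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes every pixel and every noun phrase has already been embedded into the same feature space, scores each pixel-phrase pair with a dot product followed by a sigmoid, and thresholds the scores to get one mask per phrase. The function name `match_pixels_to_phrases`, the 0.5 threshold, and the toy features are all hypothetical.

```python
import math


def sigmoid(x):
    """Squash a raw matching score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def match_pixels_to_phrases(pixel_feats, phrase_feats, threshold=0.5):
    """Return one binary mask per phrase (hypothetical helper).

    pixel_feats: H x W grid of feature vectors (lists of floats).
    phrase_feats: list of phrase feature vectors of the same length.
    """
    masks = []
    for phrase in phrase_feats:
        mask = []
        for row in pixel_feats:
            mask_row = []
            for pixel in row:
                # Dot product between the pixel and phrase embeddings.
                score = sum(p * q for p, q in zip(pixel, phrase))
                mask_row.append(1 if sigmoid(score) > threshold else 0)
            mask.append(mask_row)
        masks.append(mask)
    return masks


# Toy 2x2 "image" with 2-dimensional features (made-up values).
pixel_feats = [
    [[2.0, -1.0], [1.5, -2.0]],
    [[-1.0, 2.0], [-2.0, 1.0]],
]
# Two toy phrase embeddings (made-up values).
phrase_feats = [[1.0, 0.0], [0.0, 1.0]]

masks = match_pixels_to_phrases(pixel_feats, phrase_feats)
print(masks[0])  # mask for the first phrase
print(masks[1])  # mask for the second phrase
```

Because each phrase is compared with every pixel directly, no region proposals are needed, which is the contrast with the two-stage pipeline that the abstract draws.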


... model can exploit sufficient and finer cross-modal semantic correspondence from the supervision of densely annotated pixel-phrase pairs


... method achieves new state-of-the-art performance on the PNG benchmark with 4.0 absolute Average Recall gains.
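
For readers unfamiliar with recall-style segmentation metrics, the sketch below shows how a predicted mask can be scored against a ground-truth mask with Intersection over Union (IoU), and how recall can be averaged over several IoU thresholds. The exact Average Recall protocol of the PNG benchmark is not described in this post, so the threshold grid, the toy values, and the helper names `iou` and `average_recall` are assumptions for illustration only.

```python
def iou(pred, target):
    """Intersection over Union of two binary masks (lists of 0/1 rows)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 0.0


def average_recall(ious, thresholds):
    """Mean, over thresholds, of the fraction of masks with IoU >= threshold.

    Hypothetical helper; the benchmark's own protocol may differ.
    """
    if not ious or not thresholds:
        return 0.0
    recalls = [
        sum(1 for value in ious if value >= th) / len(ious)
        for th in thresholds
    ]
    return sum(recalls) / len(recalls)


# One predicted mask versus its ground truth (made-up values).
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
score = iou(pred, target)

# Per-phrase IoU values and an assumed threshold grid (made-up values).
ious = [0.9, 0.5, 0.2]
thresholds = [0.25, 0.5, 0.75]
ar = average_recall(ious, thresholds)
print(score, ar)
```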





Stay up to date. Subscribe to my posts: https://morrislee1234.wixsite.com/website/contact

Website with my other posts by category: https://morrislee1234.wixsite.com/website


