


By morrislee

Better document understanding without OCR using Donut transformer


Donut: Document Understanding Transformer without OCR

arXiv paper abstract https://arxiv.org/abs/2111.15664



Understanding document images (e.g., invoices) has been an important research topic


... current Visual Document Understanding (VDU) systems have come to be designed based on OCR.


... suffer from critical problems induced by the OCR, e.g., (1) expensive computational costs and (2) performance degradation due to the OCR error propagation.
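To get a feel for point (2), here is a minimal sketch of how small per-character OCR errors can compound over a whole field such as an invoice number. This is not from the paper: the function name `field_exact_match_rate` is hypothetical, and the model assumes each character is misread independently with the same probability, which is a simplification.

```python
def field_exact_match_rate(char_accuracy: float, field_length: int) -> float:
    """Chance that every character in a field is read correctly.

    Assumption (not from the paper): character errors are independent
    and all characters share the same accuracy.
    """
    if not 0.0 <= char_accuracy <= 1.0:
        raise ValueError("char_accuracy must be between 0 and 1")
    if field_length < 0:
        raise ValueError("field_length must be non-negative")
    # Independent events multiply, so the exact-match rate is p ** n.
    return char_accuracy ** field_length


if __name__ == "__main__":
    # Illustrative numbers only: 99% per-character accuracy.
    for length in (5, 10, 20, 40):
        rate = field_exact_match_rate(0.99, length)
        print(f"{length:>2} characters -> exact-match rate {rate:.3f}")
```

Under this toy model, longer fields are more likely to contain at least one OCR mistake, and any downstream parser inherits that mistake. That is one way to picture the error propagation the abstract describes.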


... propose a novel VDU model that is end-to-end trainable without underpinning OCR framework.


... pre-train the model to mitigate the dependencies on large-scale real document images.


... achieves state-of-the-art performance on various document understanding tasks in public benchmark datasets ...






