Better document understanding without OCR using Donut transformer
Donut: Document Understanding Transformer without OCR
arXiv paper abstract https://arxiv.org/abs/2111.15664
arXiv PDF paper https://arxiv.org/pdf/2111.15664.pdf
Understanding document images (e.g., invoices) has been an important research topic
... current Visual Document Understanding (VDU) systems have come to be designed based on OCR.
... suffer from critical problems induced by the OCR, e.g., (1) expensive computational costs and (2) performance degradation due to the OCR error propagation.
... propose a novel VDU model that is end-to-end trainable without underpinning OCR framework.
... pre-train the model to mitigate the dependencies on large-scale real document images.
... achieves state-of-the-art performance on various document understanding tasks in public benchmark datasets ...