arxiv:2308.13418

Nougat: Neural Optical Understanding for Academic Documents

Published on Aug 25, 2023
Submitted by AK on Aug 28, 2023
#1 Paper of the day

Abstract

Nougat, a Visual Transformer model, performs OCR on scientific documents, converting them to a markup language and enhancing digital accessibility.

Scientific knowledge is predominantly stored in books and scientific journals, often in the form of PDFs. However, the PDF format leads to a loss of semantic information, particularly for mathematical expressions. We propose Nougat (Neural Optical Understanding for Academic Documents), a Visual Transformer model that performs an Optical Character Recognition (OCR) task for processing scientific documents into a markup language, and demonstrate the effectiveness of our model on a new dataset of scientific documents. The proposed approach offers a promising solution to enhance the accessibility of scientific knowledge in the digital age, by bridging the gap between human-readable documents and machine-readable text. We release the models and code to accelerate future work on scientific text recognition.

Community

Breakthrough in Document OCR: Meet Nougat - The Neural Transformer for Scientific PDFs!

Links 🔗:

👉 Subscribe: https://www.youtube.com/@Arxflix
👉 Twitter: https://x.com/arxflix
👉 LMNT (Partner): https://lmnt.com/

By Arxflix

Get this paper in your agent:

hf papers read 2308.13418
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 8

Datasets citing this paper 0

Spaces citing this paper 44

Collections including this paper 8