
Extract text from a PDF

Pull the text out of any PDF right in your browser. Fast for normal PDFs, with optional on-device OCR for scans. Nothing is uploaded.

Options

OCR reads scanned pages on your device. It downloads a one-time language pack (a few MB), is slower, and is optimized for major languages.
Extract Text from PDF is a free tool that runs entirely in your browser. It reads the text layer of normal PDFs instantly, and offers optional on-device OCR for scanned PDFs that have no text layer. You can copy the extracted text or download it as a plain text file, and the tool shows a word and character count for the result. The OCR is optimized for major languages. Your PDF is never uploaded.
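
As an illustration of the word and character count mentioned above, here is a minimal sketch in TypeScript. The names `TextStats` and `countText` are hypothetical and not taken from the tool; the counting rules (words are runs of non-whitespace, characters are UTF-16 code units including whitespace) are assumptions, and the tool may count differently.

```typescript
// Hypothetical helper, not the tool's actual code.
interface TextStats {
  words: number;
  characters: number;
}

function countText(text: string): TextStats {
  // Assumption: a "word" is any run of non-whitespace characters.
  const trimmed = text.trim();
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  // Assumption: characters are UTF-16 code units, whitespace included.
  const characters = text.length;
  return { words, characters };
}
```

Under these assumptions, `countText("Extract text from a PDF")` would report 5 words and 23 characters.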

Frequently asked questions

Are my PDFs uploaded anywhere?

No. The PDF is read and its text is pulled out inside your browser on your device, and nothing is ever sent to a server.

Does it work on scanned PDFs?

Normal digital PDFs (made from Word, Google Docs, print to PDF and so on) carry a real text layer, so their text comes out instantly. A scanned PDF is just an image of text with no text layer; turn on "Use OCR for scanned pages" and the text is read on your device. The OCR engine downloads once (a few MB) and is then cached. OCR is slower than text-layer extraction and is optimized for major languages, so accuracy is best on common scripts and clear scans.
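
The distinction between text-layer pages and scanned pages can be sketched as a simple check. This is only an illustration: the `PageText` shape, the `pagesNeedingOcr` function and the `minChars` threshold are hypothetical, and the tool itself relies on the "Use OCR for scanned pages" option rather than necessarily detecting scans this way.

```typescript
// Hypothetical shape: the text already pulled from each page's text layer.
interface PageText {
  pageNumber: number;
  text: string;
}

// Returns the page numbers whose text layer is effectively empty,
// i.e. the pages that would need OCR to yield any text.
// Assumption: fewer than `minChars` non-whitespace characters means "no text layer".
function pagesNeedingOcr(pages: PageText[], minChars: number = 1): number[] {
  return pages
    .filter((page) => page.text.replace(/\s/g, "").length < minChars)
    .map((page) => page.pageNumber);
}
```

A page that yields only whitespace from its text layer is treated here as a scan; a higher `minChars` would also catch pages that carry nothing but a stray page number.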

Will the layout and formatting be kept?

It pulls out the words and line breaks, but extraction is essentially linear, so complex multi-column layouts, tables and heavy formatting can come out in an awkward reading order. You can clean up the result in the editable box before saving.
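
To show why linear extraction scrambles multi-column pages, here is a minimal sketch of one common approach: sort text fragments top to bottom, then left to right. The `TextItem` shape, the `linearize` function and the `lineTolerance` value are hypothetical, and this is not necessarily how the tool orders text.

```typescript
// Hypothetical fragment of positioned text; y grows downward in this sketch.
interface TextItem {
  text: string;
  x: number;
  y: number;
}

function linearize(items: TextItem[], lineTolerance: number = 2): string {
  // Sort a copy top to bottom, then left to right.
  const sorted = items.slice().sort((a, b) => a.y - b.y || a.x - b.x);
  const lines: TextItem[][] = [];
  for (const item of sorted) {
    const current = lines[lines.length - 1];
    // Fragments at nearly the same height are treated as one line.
    if (current && Math.abs(current[0].y - item.y) <= lineTolerance) {
      current.push(item);
    } else {
      lines.push([item]);
    }
  }
  return lines
    .map((line) =>
      line
        .sort((a, b) => a.x - b.x)
        .map((item) => item.text)
        .join(" ")
    )
    .join("\n");
}
```

With a left column "A1", "A2" and a right column "B1", "B2" at the same heights, this ordering returns "A1 B1" then "A2 B2": the columns are interleaved line by line rather than read one after the other, which is the awkward reading order described above.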

Which languages are supported?

Text-layer extraction works for any language already stored in the PDF. The optional OCR is optimized for major languages and uses English by default, so it is most accurate on common scripts and may struggle with unusual fonts or other writing systems.

Can it handle big PDFs?

Yes, for the fast text-layer method. OCR is much heavier because every scanned page is analyzed on your device, so on older phones or very long scanned documents it can be slow; work in smaller batches if needed.
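
Working in smaller batches simply means splitting the page range into chunks and processing one chunk at a time. A minimal sketch follows; `pageBatches` is a hypothetical helper, not part of the tool, and the right batch size depends on your device.

```typescript
// Hypothetical helper: splits pages 1..pageCount into consecutive batches.
function pageBatches(pageCount: number, batchSize: number): number[][] {
  if (batchSize < 1) {
    throw new RangeError("batchSize must be at least 1");
  }
  const batches: number[][] = [];
  for (let start = 1; start <= pageCount; start += batchSize) {
    const end = Math.min(start + batchSize - 1, pageCount);
    const batch: number[] = [];
    for (let page = start; page <= end; page++) {
      batch.push(page);
    }
    batches.push(batch);
  }
  return batches;
}
```

For a 7-page scan and a batch size of 3, this gives pages 1 to 3, 4 to 6, and then page 7 on its own.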