Runs 100% in your browser — nothing uploaded

PDF to Text Extractor

Pull the text layer out of a PDF entirely in your browser — choose page ranges, keep layout line breaks, then copy the result or download it as a .txt file. The PDF is read on your device with Mozilla's pdf.js engine and never uploaded, so it is safe for contracts, reports and other private documents.

How to extract text from a PDF

Drag a PDF onto the drop zone or click to choose a file. The text is extracted page by page on your device and appears in the box, with a live count of pages, words and characters. Enter a page range to limit which pages are read, toggle whether to keep the document's line breaks, and add page separators if you want each page marked. When you are done, copy the text or download it as a .txt file. Because everything runs in your browser, even a large or confidential PDF is processed without an upload.
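The live count described above could be computed along these lines. This is a minimal sketch, not the tool's actual code: the `countStats` name, the `TextStats` shape, and the definition of a word as any run of non-whitespace characters are all assumptions made for illustration.

```typescript
// Hypothetical helper: the names and the word definition are assumptions.
interface TextStats {
  pages: number;
  words: number;
  characters: number;
}

function countStats(pageTexts: string[]): TextStats {
  // Pages are joined with a newline, which is counted as one character.
  const joined = pageTexts.join("\n");
  const trimmed = joined.trim();
  // A "word" here is any run of non-whitespace characters.
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  // String length counts UTF-16 code units, so some emoji count as two.
  const characters = joined.length;
  return { pages: pageTexts.length, words, characters };
}
```

Counting from the per-page strings keeps the page total in step with whatever page range was selected.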

What the text layer is

A PDF stores text as a text layer: the actual characters, positioned on the page. This tool reads that layer directly using Mozilla's pdf.js, the same open-source engine that powers PDF viewing in Firefox, so the extracted text comes from the characters stored in the document rather than from image recognition. Most PDFs created from a word processor, browser or design tool carry a full text layer, which is why their text is selectable in a normal PDF viewer. If you can select and copy text inside the original PDF, this tool can extract it.
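As a rough illustration of reading the text layer page by page, the sketch below walks a document object and concatenates each page's text items. The interfaces are hypothetical stand-ins that mirror only the parts of pdf.js assumed here (`numPages`, `getPage`, `getTextContent`, and items with `str` and `hasEOL`); in a real page the document would come from pdf.js's `getDocument(...)`. The `extractPageTexts` name and the simple joining rule are assumptions, not this tool's implementation.

```typescript
// Hypothetical structural types mirroring the parts of pdf.js used below.
interface TextItemLike {
  str?: string; // marked-content entries carry no `str`
  hasEOL?: boolean; // set when the item ends a line
}
interface PageLike {
  getTextContent(): Promise<{ items: TextItemLike[] }>;
}
interface DocLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>; // page numbers are 1-based
}

// Returns one string per requested page, in the order requested.
async function extractPageTexts(doc: DocLike, pages?: number[]): Promise<string[]> {
  const wanted: number[] = [];
  if (pages !== undefined) {
    for (const n of pages) wanted.push(n);
  } else {
    for (let n = 1; n <= doc.numPages; n++) wanted.push(n);
  }
  const out: string[] = [];
  for (const n of wanted) {
    if (n < 1 || n > doc.numPages) continue; // skip pages outside the document
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    let text = "";
    for (const item of content.items) {
      // Simplistic join: append the item's string, then a newline at line ends.
      text += item.str !== undefined ? item.str : "";
      if (item.hasEOL) text += "\n";
    }
    out.push(text);
  }
  return out;
}
```

Typing against small interfaces rather than the library itself keeps the sketch self-contained and lets it be exercised with a mock document.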

Scanned PDFs and OCR

A PDF made by scanning paper, or by exporting an image, usually has no text layer: each page is a picture of text. The exception is a scan that has already been run through OCR, which embeds a text layer this tool can read. Without a text layer there are no characters to read, so extraction returns nothing and the tool tells you the document looks scanned. Turning a picture of text back into characters requires OCR (optical character recognition), which is a different process and is not performed here. If you need OCR, use a dedicated tool; this extractor is for PDFs that already contain real text, and it stays fast and fully local precisely because it does no image recognition.
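One plausible way to decide that a document "looks scanned" is to check whether every page came back empty. The source does not say how the tool makes this call, so the `looksScanned` name and the all-pages-blank rule below are assumptions for illustration only.

```typescript
// Hypothetical heuristic: treat a document as scanned when it has at least
// one page and no page yielded any non-whitespace text.
function looksScanned(pageTexts: string[]): boolean {
  if (pageTexts.length === 0) return false; // nothing was read, so no verdict
  return pageTexts.every((text) => text.trim().length === 0);
}
```

A stricter variant could flag documents where only a few characters were found per page, at the cost of misjudging sparse but genuine text pages.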

Page ranges and layout

The page field accepts single pages, ranges and open-ended ranges combined with commas — for example 1-3,5,8- reads pages 1 through 3, page 5, and page 8 to the end. Leave it blank for the whole document. Keep layout breaks uses each text item's position to reconstruct line and paragraph breaks, which suits prose and reports; turn it off to collapse everything into space-separated words, which is often better when you will re-chunk the text for embeddings. Page separators insert a labelled marker between pages so you can tell where each one starts.
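The two options above can be sketched as a pair of small functions. Both are illustrative only: `parsePageRange`, `joinItems`, the `PositionedItem` shape, the choice to throw on invalid input, and the line and paragraph thresholds are assumptions rather than the tool's actual rules. The item shape follows pdf.js text items, where `transform[5]` holds the vertical position.

```typescript
// Parse a spec such as "1-3,5,8-" into sorted, de-duplicated 1-based pages.
// Blank input selects every page. Throwing on bad tokens is an assumption.
function parsePageRange(spec: string, numPages: number): number[] {
  const trimmed = spec.trim();
  const parts = trimmed === "" ? ["1-"] : trimmed.split(",");
  const picked: boolean[] = [];
  for (const raw of parts) {
    const part = raw.trim();
    const m = /^(\d+)(?:\s*(-)\s*(\d*))?$/.exec(part);
    if (!m) throw new Error(`Invalid page range: "${part}"`);
    const start = parseInt(m[1], 10);
    // "5" is a single page, "8-" is open-ended, "1-3" is a closed range.
    const open = m[2] !== undefined && m[3] === "";
    const end = m[2] === undefined ? start : open ? numPages : parseInt(m[3], 10);
    if (start < 1 || (!open && end < start)) {
      throw new Error(`Invalid page range: "${part}"`);
    }
    for (let p = start; p <= Math.min(end, numPages); p++) picked[p] = true;
  }
  const pages: number[] = [];
  for (let p = 1; p <= numPages; p++) if (picked[p]) pages.push(p);
  return pages;
}

// Hypothetical item shape modelled on pdf.js text items.
interface PositionedItem {
  str: string;
  transform: number[]; // transform[4] is x, transform[5] is y
  height: number;
}

// Rebuild breaks from vertical movement, or collapse to single spaces.
function joinItems(items: PositionedItem[], keepLayout: boolean): string {
  if (!keepLayout) {
    return items.map((it) => it.str).join(" ").replace(/\s+/g, " ").trim();
  }
  let out = "";
  let prev: PositionedItem | null = null;
  for (const it of items) {
    if (prev !== null) {
      const drop = Math.abs(prev.transform[5] - it.transform[5]);
      const lineHeight = Math.max(prev.height, it.height, 1);
      // Thresholds are guesses: a large drop is a paragraph, a small one a line.
      if (drop > lineHeight * 1.5) out += "\n\n";
      else if (drop > lineHeight * 0.5) out += "\n";
      else if (!/\s$/.test(out) && !/^\s/.test(it.str)) out += " ";
    }
    out += it.str;
    prev = it;
  }
  return out.trim();
}
```

For a 10-page document, `parsePageRange("1-3,5,8-", 10)` selects pages 1, 2, 3, 5, 8, 9 and 10. With layout off, `joinItems` returns one space-separated run of words, which is the form that suits re-chunking.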

Why extract PDF text locally

PDFs are full of sensitive material — signed contracts, invoices, medical and legal documents, internal reports. Uploading one to a conversion website hands a third party the whole file. Running pdf.js in your browser keeps the document on your device: it is read with the local FileReader, parsed in memory, and nothing is logged or transmitted, matching the gitime.dev default that your data stays local.

Frequently asked questions

Is my PDF uploaded to a server?
No. It is read locally with FileReader and parsed by pdf.js in your browser, so nothing is uploaded.
Why does my scanned PDF produce no text?
Scanned PDFs are images with no text layer; extracting them needs OCR, which this tool does not do.
Can I extract only certain pages?
Yes — enter a range like 1-3,5,8- in the Pages field. Blank means the whole document.
Does it work on password-protected PDFs?
Encrypted PDFs must have the password removed first; the tool reports when a file is protected.
