PDF to Text Extractor

How to extract text from a PDF

Drag a PDF onto the drop zone or click to choose a file. The text is extracted page by page on your device and appears in the box, with a live count of pages, words and characters. Enter a page range to limit which pages are read, toggle whether to keep the document's line breaks, and add page separators if you want each page marked. When you are done, copy the text or download it as a .txt file. Because everything runs in your browser, even a large or confidential PDF is processed without an upload.
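The live count described above could be computed along these lines. This is a minimal sketch, not the tool's actual code: the name `countStats`, the `TextStats` shape, and the definitions of "word" (a whitespace-separated run) and "character" (a Unicode code point) are all assumptions made for illustration.

```typescript
// Hypothetical shape for the counters shown next to the text box.
interface TextStats {
  pages: number;
  words: number;
  characters: number;
}

// Count pages, words and characters over the extracted per-page strings.
// Assumption: a word is a run of non-whitespace, a character is one code point.
function countStats(pageTexts: string[]): TextStats {
  let words = 0;
  let characters = 0;
  for (const text of pageTexts) {
    const trimmed = text.trim();
    if (trimmed !== "") words += trimmed.split(/\s+/).length;
    // Array.from splits by code point, so an emoji counts once.
    characters += Array.from(text).length;
  }
  return { pages: pageTexts.length, words, characters };
}
```

Counting per page keeps any joiners or page separators added later out of the totals.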

What the text layer is

A PDF stores text as a text layer: the actual characters, positioned on the page. This tool reads that layer directly using Mozilla's pdf.js, the same open-source engine that powers PDF viewing in Firefox, so the output is the character data stored in the file rather than a guess reconstructed from an image. Most PDFs created from a word processor, browser or design tool carry a full text layer, which is why their text is selectable in a normal PDF viewer. As a rule of thumb, if you can select and copy text inside the original PDF, this tool can extract it.
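To make the text layer concrete, here is a rough sketch of reading it page by page. The interfaces are hand-written stand-ins that mirror only the parts of the pdf.js document, page and text-item objects used here (`numPages`, `getPage`, `getTextContent`, `str`, `transform`), so the snippet runs without the library. The helper names `flattenItems` and `extractPages` are hypothetical, and the tool's real code may differ.

```typescript
// Structural stand-ins for the pdf.js objects this sketch touches.
interface TextItemLike {
  str: string;          // the characters of this run
  transform: number[];  // 6-number matrix; the last two entries are x and y
}
interface PageLike {
  getTextContent(): Promise<{ items: Array<TextItemLike | { type: string }> }>;
}
interface DocLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}

// Marked-content entries have no `str`; keep only real text items.
function isTextItem(item: object): item is TextItemLike {
  return "str" in item;
}

// Join one page's text items into a single whitespace-normalised string.
function flattenItems(items: object[]): string {
  return items
    .filter(isTextItem)
    .map((item) => item.str)
    .join(" ")
    .replace(/\s+/g, " ")
    .trim();
}

// Walk the document page by page (pdf.js page numbers start at 1).
async function extractPages(doc: DocLike): Promise<string[]> {
  const pages: string[] = [];
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    pages.push(flattenItems(content.items));
  }
  return pages;
}
```

With the real library, `doc` would be the document object that pdf.js resolves after loading the file bytes; nothing in the loop needs a network request.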

Scanned PDFs and OCR

A PDF made by scanning paper, or by exporting an image, has no text layer — each page is a picture of text. There are no characters to read, so extraction returns nothing and the tool tells you the document looks scanned. Turning a picture of text back into characters requires OCR (optical character recognition), which is a different process and is not performed here. If you need OCR, use a dedicated tool; this extractor is for PDFs that already contain real text, and it stays fast and fully local precisely because it does no image recognition.
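The "looks scanned" check can be as simple as noticing that no page produced any characters. The following is a sketch of that idea only; the function name `looksScanned` and the exact rule (every page empty after trimming whitespace) are assumptions, not the tool's documented behaviour.

```typescript
// Hypothetical heuristic: a document looks scanned when it has pages
// but none of them yielded any non-whitespace text.
function looksScanned(pageTexts: string[]): boolean {
  if (pageTexts.length === 0) return false; // nothing was read, so no verdict
  return pageTexts.every((text) => text.trim().length === 0);
}
```

A mixed document, where some pages are scans and others carry text, would not trip this check; a per-page variant of the same test could flag those pages individually.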

Page ranges and layout

The Pages field accepts single pages, ranges and open-ended ranges combined with commas. For example, 1-3,5,8- reads pages 1 through 3, page 5, and page 8 to the end. Leave it blank for the whole document. The "Keep layout breaks" option uses each text item's position to reconstruct line and paragraph breaks, which suits prose and reports; turn it off to collapse everything into space-separated words, which is often better when you will re-chunk the text for embeddings. The "Page separators" option inserts a labelled marker between pages so you can tell where each one starts.
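The three options above can be sketched as small pure functions. Everything here is illustrative: the names `parsePageRange`, `itemsToText` and `joinPages`, the `LineItem` shape, the break thresholds (0.5 and 1.8 times the item height) and the `--- Page N ---` marker format are assumptions, not the tool's actual values. With pdf.js, `y` would come from the last entry of a text item's `transform` array.

```typescript
// Parse a spec such as "1-3,5,8-" into sorted, unique, 1-based page numbers.
// A blank spec selects every page; pages outside 1..numPages are ignored.
function parsePageRange(spec: string, numPages: number): number[] {
  const selected = new Set<number>();
  const parts = spec.split(",").map((s) => s.trim()).filter((s) => s !== "");
  if (parts.length === 0) {
    for (let p = 1; p <= numPages; p++) selected.add(p);
  }
  for (const part of parts) {
    const m = /^(\d+)(-(\d*))?$/.exec(part);
    if (!m) throw new Error(`Invalid page range: "${part}"`);
    const start = Number(m[1]);
    // "5" is a single page, "8-" runs to the end, "1-3" is a closed range.
    const end = m[2] === undefined ? start : m[3] === "" ? numPages : Number(m[3]);
    for (let p = Math.max(start, 1); p <= Math.min(end, numPages); p++) selected.add(p);
  }
  return Array.from(selected).sort((a, b) => a - b);
}

// Hypothetical simplified text item: its string, baseline y, and height.
interface LineItem {
  str: string;
  y: number;
  height: number;
}

// Rebuild line and paragraph breaks from vertical jumps between items,
// or flatten to space-separated words when keepLayout is false.
function itemsToText(items: LineItem[], keepLayout: boolean): string {
  if (!keepLayout) {
    return items.map((i) => i.str).join(" ").replace(/\s+/g, " ").trim();
  }
  let out = "";
  let prev: LineItem | undefined;
  for (const item of items) {
    if (prev) {
      const gap = Math.abs(prev.y - item.y);
      const lineHeight = prev.height || item.height || 1;
      if (gap > lineHeight * 1.8) out += "\n\n"; // big jump: paragraph break
      else if (gap > lineHeight * 0.5) out += "\n"; // new baseline: line break
      else out += " "; // same line
    }
    out += item.str;
    prev = item;
  }
  return out;
}

// Join pages, optionally prefixing each with a labelled marker.
function joinPages(pages: { pageNumber: number; text: string }[], withSeparators: boolean): string {
  return pages
    .map((p) => (withSeparators ? `--- Page ${p.pageNumber} ---\n${p.text}` : p.text))
    .join("\n\n");
}
```

Returning a sorted, de-duplicated list means overlapping specs such as 1-3,2-4 read each page once and in document order. A reversed range such as 3-1 selects nothing in this sketch; a stricter parser might reject it instead.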

Why extract PDF text locally

PDFs are full of sensitive material — signed contracts, invoices, medical and legal documents, internal reports. Uploading one to a conversion website hands a third party the whole file. Running pdf.js in your browser keeps the document on your device: it is read with the local FileReader, parsed in memory, and nothing is logged or transmitted, matching the gitime.dev default that your data stays local.

Local — the PDF is never uploaded.
Accurate — reads the real text layer via pdf.js.
Selective — extract any page range.
Layout-aware — keep or flatten line breaks.
Exportable — copy or download as .txt.

Frequently asked questions

Is my PDF uploaded to a server? No. It is read locally with FileReader and parsed by pdf.js in your browser, so nothing is uploaded.
Why does my scanned PDF produce no text? Scanned PDFs are images with no text layer; extracting their text needs OCR, which this tool does not do.
Can I extract only certain pages? Yes. Enter a range like 1-3,5,8- in the Pages field; blank means the whole document.
Does it work on password-protected PDFs? Not directly. Encrypted PDFs must have the password removed first; the tool reports when a file is protected.