Tools in This Collection
PDF Conversion Workflow
PDFs store content in fundamentally different ways depending on how they were created. Digital PDFs (exported from Word, PowerPoint, or another application) contain actual text that can be extracted and searched. Scanned PDFs contain only page images, essentially photos of paper, with no searchable text. The four tools in this collection cover both types, plus conversion in the opposite direction, from images to PDF.
Extracting Text from Digital PDFs
The PDF to Text tool extracts all selectable text from a digital PDF into plain text. It works for any PDF where you can click and highlight text in a viewer. Common use cases: copying a contract's key terms into a document, extracting a report's data for spreadsheet analysis, or pulling article text for summarization. Extraction is near-instant for typical documents and preserves paragraph structure.
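As a rough illustration of the "can you highlight text" test, the sketch below classifies a PDF from the text already extracted from each page. The function name `classifyPdf` and the 20-character threshold are assumptions made for this example, not part of the PDF to Text tool.

```typescript
// Hypothetical heuristic: a page that yields almost no extractable text is
// probably a scanned image. The threshold is an assumption for illustration.
type PdfKind = "digital" | "scanned" | "mixed";

function classifyPdf(pageTexts: string[], minCharsPerPage = 20): PdfKind {
  // No pages means no extractable text at all.
  if (pageTexts.length === 0) return "scanned";

  // Count pages whose text, ignoring whitespace, reaches the threshold.
  const textPages = pageTexts.filter(
    (text) => text.replace(/\s+/g, "").length >= minCharsPerPage
  ).length;

  if (textPages === pageTexts.length) return "digital";
  if (textPages === 0) return "scanned";
  return "mixed";
}
```

A result of "scanned" or "mixed" suggests running the affected pages through OCR rather than direct extraction.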
Converting PDF Pages to Images
The PDF to Image converter renders each page as a PNG or JPEG file. Pages are rendered at 1.5x scale by default for sharp output on retina displays. Common use cases: creating thumbnail previews of PDF pages for a website, extracting a diagram from a report for use in a presentation, or converting a single-page PDF form to an image for embedding. A 5-page PDF produces 5 separate image files.
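To make the 1.5x default concrete, here is a minimal sketch of how output pixel dimensions follow from page size. It assumes that at scale 1 one PDF point (1/72 inch) maps to one pixel, which is a common convention but not confirmed for this converter; `renderSize` is a hypothetical helper.

```typescript
interface PixelSize {
  width: number;
  height: number;
}

// Assumption: at scale 1, one PDF point (1/72 inch) becomes one pixel.
function renderSize(widthPt: number, heightPt: number, scale = 1.5): PixelSize {
  return {
    width: Math.round(widthPt * scale),
    height: Math.round(heightPt * scale),
  };
}

// US Letter is 8.5 x 11 inches, i.e. 612 x 792 points.
const letter = renderSize(612, 792);
console.log(letter); // { width: 918, height: 1188 }
```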
OCR for Scanned PDFs
The PDF OCR tool uses Tesseract.js, a JavaScript port of Tesseract, the open-source OCR engine used in many document scanning apps, to extract text from scanned PDF pages. OCR is significantly slower than direct text extraction (15 to 30 seconds per page for a complex scan), but it handles documents where the text exists only as an image. Accuracy depends on scan quality: clean, high-contrast scans achieve 95%+ accuracy, while faded or skewed scans are less accurate and may need cleanup first. The tool supports English and many other languages.
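For planning larger jobs, the per-page figure above can be turned into a rough time range. This sketch assumes pages are processed one after another and that every page is a complex scan; `estimateOcrSeconds` is a hypothetical helper, and simpler pages may finish faster.

```typescript
// Per-page bounds quoted in the text above for a complex scan.
const SECONDS_PER_PAGE = { min: 15, max: 30 };

// Assumption: pages are processed sequentially, so time grows linearly.
function estimateOcrSeconds(pageCount: number): { min: number; max: number } {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  return {
    min: pageCount * SECONDS_PER_PAGE.min,
    max: pageCount * SECONDS_PER_PAGE.max,
  };
}

// A 10-page scan: between 150 and 300 seconds under these assumptions.
console.log(estimateOcrSeconds(10)); // { min: 150, max: 300 }
```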
Bundling Images into a PDF
The Image to PDF tool converts JPEG, PNG, or WebP images into a single PDF document. Common use cases: combining photos of paper documents (receipts, handwritten notes, whiteboards) into a single shareable file, or converting a series of scanned images into a PDF for archiving. Each image becomes one page; images are scaled to fit the page while maintaining aspect ratio.
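Scaling an image to fit a page while keeping its aspect ratio comes down to choosing the smaller of the two width and height ratios. The sketch below shows that calculation; the A4 page size (595 x 842 points), the absence of margins, the centering, and the name `fitToPage` are all assumptions for illustration rather than the tool's documented behavior.

```typescript
interface Placement {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Assumptions: A4 page in points, no margins, image centered on the page.
function fitToPage(
  imgWidth: number,
  imgHeight: number,
  pageWidth = 595,
  pageHeight = 842
): Placement {
  // The smaller ratio is the largest scale at which both sides still fit.
  const scale = Math.min(pageWidth / imgWidth, pageHeight / imgHeight);
  const width = imgWidth * scale;
  const height = imgHeight * scale;
  return {
    x: (pageWidth - width) / 2,
    y: (pageHeight - height) / 2,
    width,
    height,
  };
}
```

For a 4000 x 3000 photo, the width is the limiting side: the image fills the page width and is centered vertically, with its 4:3 ratio unchanged.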
Frequently Asked Questions
Why can't I extract text from my scanned PDF?
Scanned PDFs contain page images (photos of paper), not actual text characters. The PDF to Text tool only works for digital PDFs where you can click and highlight text. For scanned documents, use the PDF OCR tool instead, which uses Tesseract.js to recognize text from the page images.
How accurate is the PDF OCR tool?
OCR accuracy depends heavily on scan quality. Clean, high-contrast, properly aligned scans of printed text typically achieve 95%+ character accuracy. Faded text, skewed pages, handwriting, or complex layouts reduce accuracy. The tool works best on standard business documents, forms, and printed text.
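One common way to measure character accuracy is to compare OCR output against a known-correct transcription using edit distance. The sketch below is an illustration of that idea, not necessarily how the 95%+ figure above was measured; `editDistance` and `characterAccuracy` are names chosen for this example.

```typescript
// Levenshtein distance: minimum number of single-character insertions,
// deletions, and substitutions needed to turn `a` into `b`.
// Note: this compares UTF-16 code units, which is fine for plain Latin text.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Accuracy = 1 - (errors / reference length), clamped at 0.
function characterAccuracy(reference: string, ocrOutput: string): number {
  if (reference.length === 0) return ocrOutput.length === 0 ? 1 : 0;
  const errors = editDistance(reference, ocrOutput);
  return Math.max(0, 1 - errors / reference.length);
}
```

For example, if the reference is "Invoice total: 100" and OCR returns "Invoice tota1: l00", two of 18 characters are wrong, giving an accuracy of roughly 89%.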
What image formats does Image to PDF support?
The Image to PDF tool accepts JPEG, PNG, and WebP files. You can add multiple images at once, and each image becomes one page in the resulting PDF. Images are automatically scaled to fit the page while maintaining the original aspect ratio.