The PDF Table Extractor finds tables in PDF documents and converts them to structured data. Choose a PDF, select a page, and export the detected tables as CSV or HTML. Powered by PDF.js: all processing happens in your browser, and your file is never sent to a server.
1. Upload PDF
Drop your PDF here or browse
Text-based PDFs only — scanned documents require OCR
How to Extract Tables from PDFs
Many reports, financial statements, and academic papers store important data in PDF tables. This tool uses PDF.js to read the PDF directly in your browser, analyzes text positions to detect table structure, and exports clean data without any server-side processing.
Step 1: Upload Your PDF
Drag and drop your PDF into the upload area or click to browse. The tool loads the PDF using PDF.js and detects how many pages it contains. Only text-based PDFs work — if text in the PDF is selectable when you open it in a viewer, it will work here.
Step 2: Select a Page and Adjust Settings
Choose which page contains the tables. The Row grouping (px) setting determines how close text items need to be vertically to be grouped into the same row. The Column gap (px) setting determines the minimum horizontal space that indicates a column boundary. Decrease the row grouping for dense tables where neighboring rows are being merged; decrease the column gap if columns are being merged together.
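To make the two thresholds concrete, here is a minimal TypeScript sketch of how a row tolerance and a column gap could be applied. It is an illustration only, not this tool's actual source: the `TextItem` shape and the `groupRows` and `splitCells` names are hypothetical, and it assumes pixel coordinates with y increasing down the page.

```typescript
// Hypothetical shape for one positioned text fragment (not the tool's real type).
interface TextItem {
  text: string;
  x: number;     // left edge, in px
  y: number;     // vertical position, in px (assumed to grow downward)
  width: number; // rendered width, in px
}

// Row grouping: an item joins the current row when its y is within
// `rowTolerance` px of the first item that started that row.
function groupRows(items: TextItem[], rowTolerance: number): TextItem[][] {
  const sorted = [...items].sort((a, b) => a.y - b.y);
  const rows: TextItem[][] = [];
  let current: TextItem[] = [];
  let anchorY = 0;
  for (const item of sorted) {
    if (current.length > 0 && Math.abs(item.y - anchorY) <= rowTolerance) {
      current.push(item);
    } else {
      current = [item];
      anchorY = item.y;
      rows.push(current);
    }
  }
  return rows;
}

// Column gap: a new cell starts when the horizontal space between the end of
// the previous item and the start of the next is at least `columnGap` px.
// Smaller gaps are treated as words inside the same cell.
function splitCells(row: TextItem[], columnGap: number): string[] {
  const sorted = [...row].sort((a, b) => a.x - b.x);
  const cells: string[] = [];
  let cell = "";
  let started = false;
  let prevRight = 0;
  for (const item of sorted) {
    if (started && item.x - prevRight >= columnGap) {
      cells.push(cell);
      cell = item.text;
    } else {
      cell = started ? cell + " " + item.text : item.text;
    }
    started = true;
    prevRight = item.x + item.width;
  }
  if (started) cells.push(cell);
  return cells;
}
```

In this sketch, two items separated by 50 px of empty space land in separate cells with a column gap of 15, and collapse into a single cell with a column gap of 100.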
Step 3: Extract and Export
Click "Extract Tables" to process the page. Each detected table appears with options to copy as CSV, download as CSV, or copy as HTML. For Excel or Google Sheets, use the CSV export. For web pages, use the HTML export, which gives you a properly structured <table> element.
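The two export formats can be sketched roughly as follows. This is a hedged example rather than the tool's own code: the function names are hypothetical, the CSV quoting follows the common RFC 4180 convention, and treating the first row as the header is an assumption.

```typescript
// Quote a field only when it contains a comma, a double quote, or a line
// break; embedded double quotes are escaped by doubling them.
function csvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return '"' + value.replace(/"/g, '""') + '"';
  }
  return value;
}

// Join cells with commas and rows with CRLF.
function toCsv(rows: string[][]): string {
  return rows.map((row) => row.map(csvField).join(",")).join("\r\n");
}

// Escape the characters that would otherwise be read as markup.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a <table> with the first row in <thead> (an assumption of this
// sketch) and every remaining row in <tbody>.
function toHtmlTable(rows: string[][]): string {
  const head = rows
    .slice(0, 1)
    .map((r) => "<tr>" + r.map((c) => "<th>" + escapeHtml(c) + "</th>").join("") + "</tr>")
    .join("");
  const body = rows
    .slice(1)
    .map((r) => "<tr>" + r.map((c) => "<td>" + escapeHtml(c) + "</td>").join("") + "</tr>")
    .join("");
  return "<table><thead>" + head + "</thead><tbody>" + body + "</tbody></table>";
}
```

Quoting matters in practice because a cell such as `Acme, Inc.` would otherwise be split into two columns when the CSV is opened in a spreadsheet.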
Improving Extraction Quality
If columns are merging incorrectly, reduce the column gap threshold. If rows are splitting when they shouldn't, increase the row grouping value. For tables with merged cells or complex headers, you may need to clean up the CSV manually after export. Tables that continue across a page break need to be extracted page by page and then combined.
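Combining a table that was extracted from several pages is mostly a matter of concatenating rows. The sketch below shows one possible approach; `combinePages` is a hypothetical helper, and it assumes the first row of the first page is the header and that later pages may repeat that same header as their first row.

```typescript
// Each page is a list of rows, and each row is a list of cell strings.
// Rows are appended in page order; a continuation page's first row is
// dropped when it is identical to the header seen on the first page.
function combinePages(pages: string[][][]): string[][] {
  const combined: string[][] = [];
  let headerKey: string | null = null;
  for (const rows of pages) {
    let isFirstRow = true;
    for (const row of rows) {
      const key = JSON.stringify(row);
      const repeatedHeader = isFirstRow && headerKey !== null && key === headerKey;
      if (headerKey === null) {
        headerKey = key; // remember the very first row as the header
      }
      if (!repeatedHeader) {
        combined.push(row);
      }
      isFirstRow = false;
    }
  }
  return combined;
}
```

If the column count differs between pages, the thresholds were probably not the same for each extraction, so it is worth re-extracting with identical settings before combining.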
FAQ
Is this PDF table extractor free?
Yes, completely free with no signup required. Your PDF files are processed entirely in your browser using PDF.js — nothing is uploaded to any server. Your documents remain completely private on your device.
What types of PDFs work with this tool?
This tool works with text-based PDFs where the text is selectable. PDFs created from Word documents, Excel spreadsheets, or other digital sources typically work well. Scanned documents (images of paper) contain no selectable text, so they need to go through OCR (optical character recognition) before their tables can be extracted.
How does automatic table detection work?
The tool uses the text position data from PDF.js to group text items by their Y-coordinate (rows) and X-coordinate (columns). Items on the same horizontal line become a row; consistent vertical alignment across multiple rows becomes a column. This works well for grid-structured tables with clear column alignment.
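As a rough illustration of this idea, the sketch below groups items into rows by Y-coordinate and then keeps only the X positions that line up across several rows. It is not the tool's actual algorithm. The `str` and `transform` fields mirror what PDF.js reports for each text item from `getTextContent()`, where positions 4 and 5 of `transform` hold the x and y offsets in PDF coordinates (y grows upward from the bottom of the page). Every function name, tolerance, and the `minRows` parameter is an assumption made for this example.

```typescript
// Local stand-in for the two PDF.js text item fields this sketch reads.
interface PdfTextItemLike {
  str: string;
  transform: number[]; // [a, b, c, d, e, f]; e = x offset, f = y offset
}

interface Positioned {
  text: string;
  x: number;
  y: number;
}

// Drop whitespace-only items and read x/y from the transform matrix.
function toPositioned(items: PdfTextItemLike[]): Positioned[] {
  return items
    .filter((item) => item.str.trim() !== "")
    .map((item) => ({
      text: item.str,
      x: item.transform[4] ?? 0,
      y: item.transform[5] ?? 0,
    }));
}

// Rows: items within yTolerance of the row's first item. Sorting by
// descending y puts the top of the page first; each row is then sorted by x.
function groupByY(items: Positioned[], yTolerance: number): Positioned[][] {
  const sorted = [...items].sort((a, b) => b.y - a.y);
  const rows: Positioned[][] = [];
  let current: Positioned[] = [];
  let anchorY = 0;
  for (const item of sorted) {
    if (current.length > 0 && Math.abs(item.y - anchorY) <= yTolerance) {
      current.push(item);
    } else {
      current = [item];
      anchorY = item.y;
      rows.push(current);
    }
  }
  return rows.map((row) => [...row].sort((a, b) => a.x - b.x));
}

// Columns: left edges that recur (within xTolerance) in at least minRows rows.
function findColumnAnchors(rows: Positioned[][], xTolerance: number, minRows: number): number[] {
  const clusters: { x: number; rowCount: number; lastRow: number }[] = [];
  rows.forEach((row, rowIndex) => {
    for (const item of row) {
      let match: { x: number; rowCount: number; lastRow: number } | null = null;
      for (const c of clusters) {
        if (Math.abs(c.x - item.x) <= xTolerance) { match = c; break; }
      }
      if (match === null) {
        clusters.push({ x: item.x, rowCount: 1, lastRow: rowIndex });
      } else if (match.lastRow !== rowIndex) {
        match.rowCount += 1; // count each row at most once per cluster
        match.lastRow = rowIndex;
      }
    }
  });
  return clusters.filter((c) => c.rowCount >= minRows).map((c) => c.x).sort((a, b) => a - b);
}

// Place each item in the cell of its nearest column anchor.
function detectTable(items: PdfTextItemLike[], yTolerance: number, xTolerance: number, minRows: number): string[][] {
  const rows = groupByY(toPositioned(items), yTolerance);
  const anchors = findColumnAnchors(rows, xTolerance, minRows);
  return rows.map((row) => {
    const cells: string[] = anchors.map(() => "");
    for (const item of row) {
      let best = -1;
      let bestDist = Infinity;
      for (let i = 0; i < anchors.length; i++) {
        const dist = Math.abs((anchors[i] ?? 0) - item.x);
        if (dist < bestDist) { bestDist = dist; best = i; }
      }
      if (best >= 0) {
        const prev = cells[best] ?? "";
        cells[best] = prev === "" ? item.text : prev + " " + item.text;
      }
    }
    return cells;
  });
}
```

Requiring an X position to recur in several rows is what separates a column from a stray indent: a left edge that appears in only one row never becomes a column anchor in this sketch.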
Can I extract tables from password-protected PDFs?
Password-protected PDFs cannot be opened without the password. If your PDF is encrypted, unlock it with the correct password first, then extract tables from the unlocked copy.
What export formats are available?
You can export extracted tables as CSV (comma-separated values) for Excel/Sheets, TSV (tab-separated values) for tab-delimited imports, or copy the raw HTML table code to embed in websites or documents.
Why are some tables not detected correctly?
Table detection works by analyzing text position coordinates. Tables with merged cells, complex multi-level headers, or irregular column spacing may not extract perfectly. Tables embedded as images (rather than text) cannot be extracted at all — they would require OCR. Try adjusting the row and column detection sensitivity if tables look misaligned.