Add a searchable text layer to your scanned PDF documents using FunPDF's Scan to Searchable PDF tool.
What is OCR?
OCR (Optical Character Recognition) converts images of text into actual searchable and selectable text. This technology:
Enables Text Search:
- Use Ctrl+F to find any word or phrase
- Quickly locate specific information in long documents
- Search through archives of scanned documents
Allows Copy-Paste:
- Select and copy text directly from scanned PDFs
- Extract text without manual retyping
- Quote and reference content easily
Preserves Original Appearance:
- OCR text layer is invisible - sits behind the image
- Document looks exactly like the original scan
- Visual fidelity, layout, and colors unchanged
- Well suited to legal documents where the original appearance must be preserved
Improves Accessibility:
- Screen readers can read the text aloud
- Helps documents meet accessibility standards (WCAG, Section 508)
- Helps visually impaired users access scanned content
Step-by-Step Guide
1. Upload Your Scanned PDF
Upload your scanned PDF to the FunPDF Editor:
Using Drag and Drop:
- Drag your scanned PDF file into the editor window
- Drop it anywhere on the page
- File loads automatically
Using Upload Button:
- Click "Upload PDFs" or "Add files" button
- Select your scanned PDF (up to 200MB)
- Click "Open" to upload
Supported Files:
- Image-based PDFs from scanners
- Phone camera photos of documents
- Photocopies saved as PDF
- Screenshots and digital scans
2. Select OCR Tool
3. Configure OCR Options
Language Selection
Choose the language of your document's text for accurate recognition:
Supported Languages (12 total):
- English (eng)
- Chinese Simplified (chi_sim)
- Chinese Traditional (chi_tra)
- Japanese (jpn)
- Korean (kor)
- Spanish (spa)
- French (fra)
- German (deu)
- Italian (ita)
- Portuguese (por)
- Russian (rus)
- Arabic (ara)
Important: Select the correct language for best OCR accuracy. Wrong language selection can result in garbled or incorrect text.
Page Range
All Pages (Recommended):
- Process entire document
- Best for complete searchability
Custom Ranges:
- Single pages: "5, 8, 12"
- Ranges: "1-5" for pages 1 through 5
- Mixed: "1-10, 15, 20-25" for multiple sections
- Use for large files to save time
- Process only relevant sections
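If you prepare page ranges in a script before pasting them into the tool, the range syntax above can be expanded into explicit page numbers. The sketch below is only an illustration: `parse_page_ranges` and its `total_pages` argument are hypothetical names, not part of FunPDF, and the tool's own parser may handle edge cases differently.

```python
def parse_page_ranges(spec: str, total_pages: int) -> list[int]:
    """Expand a range string such as "1-10, 15, 20-25" into sorted page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1-3,,5"
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start  # a single page is a one-page range
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_page_ranges("1-3, 5, 8-9", total_pages=10))  # [1, 2, 3, 5, 8, 9]
```

Checking a range string this way before you start OCR catches typos such as a reversed range ("10-1") or a page number beyond the end of the document.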
Text Handling Mode (If PDF Contains Text)
The tool automatically detects if your PDF already has text. Choose how to handle it:
Skip Text (Default - Recommended):
- OCR only image pages without text
- Keep existing text untouched
- Fastest option for mixed PDFs
- Best for PDFs with some text already
Redo OCR:
- Replace existing text with fresh OCR
- Use if original text is inaccurate
- Useful for poorly extracted text
Force OCR:
- OCR all pages regardless of existing text
- Creates new text layer on all pages
- Use when you want uniform OCR across entire document
Output Format
PDF (Standard - Recommended):
- Standard PDF format
- Maximum compatibility
- Use for everyday documents
PDF/A-2 (Archival):
- Long-term preservation format
- For legal compliance and archiving
- Embeds all fonts and resources
- Ensures document looks identical decades later
PDF/A-3 (Archival with Attachments):
- Same as PDF/A-2
- Plus support for embedded files
- For complex archival requirements
When to Use PDF/A:
- Legal documents requiring preservation
- Government records and official archives
- Institutional repositories
- Long-term storage (10+ years)
Compression Level
None:
- No compression
- Largest file size
- Maximum quality
- Use when file size doesn't matter
Low (1):
- Minimal compression
- Near-perfect quality
- Large files
- Good for print documents
Medium (2) - Recommended:
- Balanced quality and size
- Very good quality maintained
- Reasonable file size
- Best for most use cases
High (3):
- Maximum compression
- Smallest file size
- Slight quality reduction
- Good for storage optimization
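The options in this step can be summarized as one small settings record. The following is a minimal sketch, assuming you want to keep your preferred settings in a script or checklist. FunPDF does not document a programmatic API in this guide, so `OcrOptions`, the mode and format identifiers, and the use of `0` for "None" compression are all hypothetical; only the language codes, numbered compression levels, and recommended defaults come from the text above.

```python
from dataclasses import dataclass

# Language codes listed in this guide.
SUPPORTED_LANGUAGES = frozenset({
    "eng", "chi_sim", "chi_tra", "jpn", "kor", "spa",
    "fra", "deu", "ita", "por", "rus", "ara",
})
TEXT_MODES = frozenset({"skip", "redo", "force"})        # hypothetical identifiers
OUTPUT_FORMATS = frozenset({"pdf", "pdfa-2", "pdfa-3"})  # hypothetical identifiers
COMPRESSION_LEVELS = frozenset({0, 1, 2, 3})             # 0 stands in for "None" (assumption)


@dataclass(frozen=True)
class OcrOptions:
    language: str               # no default: the guide asks you to choose it
    pages: str = "all"          # or a range string such as "1-10, 15"
    text_mode: str = "skip"     # the guide's default
    output_format: str = "pdf"  # the guide's recommendation
    compression: int = 2        # Medium, the guide's recommendation

    def __post_init__(self) -> None:
        if self.language not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported language: {self.language!r}")
        if self.text_mode not in TEXT_MODES:
            raise ValueError(f"unknown text mode: {self.text_mode!r}")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unknown output format: {self.output_format!r}")
        if self.compression not in COMPRESSION_LEVELS:
            raise ValueError(f"compression must be 0-3, got {self.compression!r}")


standard = OcrOptions(language="eng")
archival = OcrOptions(language="deu", output_format="pdfa-2", compression=1)
print(standard)
print(archival)
```

Here `standard` mirrors the recommended settings for everyday documents, while `archival` shows one plausible combination for long-term storage.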
4. Start OCR Processing
Click "Start OCR" to begin:
Small Files (≤10MB):
- Process instantly on server
- Files not stored
- Results in seconds
Large Files (>10MB):
- Upload to secure temporary storage
- Process in background
- Real-time progress bar shows:
  - Current page being processed
  - Overall progress percentage
  - Status messages
- Files auto-delete after 2 hours
- Can cancel processing if needed
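The size thresholds above decide which path a file takes. As a rough sketch, assuming 1 MB means 1024 × 1024 bytes (the guide does not say whether the limits are binary or decimal), you could check a file before uploading it. The function name `processing_path` and its return labels are hypothetical.

```python
MB = 1024 * 1024              # assumption: binary megabytes
SMALL_FILE_LIMIT = 10 * MB    # at or below this, files are processed instantly
MAX_UPLOAD_SIZE = 200 * MB    # upload limit stated in step 1


def processing_path(size_bytes: int) -> str:
    """Classify a file by size: "instant", "background", or "too_large"."""
    if size_bytes < 0:
        raise ValueError("size cannot be negative")
    if size_bytes > MAX_UPLOAD_SIZE:
        return "too_large"    # split the PDF before uploading
    if size_bytes <= SMALL_FILE_LIMIT:
        return "instant"      # processed immediately, not stored
    return "background"       # temporary storage, progress bar, auto-delete


print(processing_path(4 * MB))    # instant
print(processing_path(75 * MB))   # background
print(processing_path(250 * MB))  # too_large
```

To check a real file, pass `os.path.getsize(path)` as `size_bytes`.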
5. Download Result
After OCR completes:
- Click "Download" to save the searchable PDF
- Test searchability: Open the file in a PDF viewer, press Ctrl+F, and search for a word that appears on the page
- Text layer is invisible but fully functional
- Document looks identical to original scan
Understanding OCR Results
OCR Accuracy
Factors Affecting Accuracy:
High Accuracy (95-99%):
- Clean, high-resolution scans (300+ DPI)
- Good contrast between text and background
- Clear, printed text
- Correct language selected
Medium Accuracy (80-95%):
- Lower resolution scans (150-300 DPI)
- Some background noise
- Faded or light text
- Slight tilt or skew
Low Accuracy (<80%):
- Very low resolution (<150 DPI)
- Blurry or out-of-focus scans
- Heavy background noise, stains
- Severely tilted or distorted pages
- Wrong language selected
Not Suitable for OCR:
- Handwritten or cursive text (very low accuracy)
- Artistic or stylized fonts
- Severely damaged documents
- Pages that contain only pictures or graphics, with no text to recognize
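If you know a scan's pixel width and the paper size, you can estimate its resolution and see which accuracy tier above it is likely to fall into. This is a minimal sketch with hypothetical function names, and it considers resolution only: contrast, skew, noise, and language choice also affect the result, so treat the output as a rough guide.

```python
def estimate_dpi(pixel_width: int, page_width_inches: float) -> float:
    """Resolution in dots per inch, from the scan width and the paper width."""
    if pixel_width <= 0 or page_width_inches <= 0:
        raise ValueError("dimensions must be positive")
    return pixel_width / page_width_inches


def expected_accuracy(dpi: float) -> str:
    """Map resolution to the accuracy tiers listed in this guide."""
    if dpi >= 300:
        return "high (95-99%)"
    if dpi >= 150:
        return "medium (80-95%)"
    return "low (<80%)"


# A US Letter page (8.5 inches wide) scanned 2550 pixels wide is 300 DPI.
dpi = estimate_dpi(2550, 8.5)
print(dpi, expected_accuracy(dpi))  # 300.0 high (95-99%)
```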
Improving OCR Quality
Before OCR:
- Use Scanned PDF Enhancement tool first
- Enable deskew to straighten tilted pages
- Enable background removal to clean stains
- Enable artifact cleaning for clearer text
During OCR:
- Select correct document language
- Choose medium or low compression for better quality
- Process a test page first to verify accuracy
Use Cases
Make Scanned Contracts Searchable
- Search for specific clauses instantly
- Find all instances of terms or names
- Quick reference without reading entire document
- Essential for legal review
Digitize Paper Archives
- Convert old letters, reports, meeting minutes
- Create searchable digital library
- Find historical information quickly
- Preserve while adding modern functionality
Extract Text from Academic Papers
- Copy citations and quotes without retyping
- Search for specific research topics
- Create reference databases
- Extract data for analysis
Searchable Receipt Archives
- Find specific purchases by vendor or item
- Track expenses by searching keywords
- Organize accounting records
- Quick retrieval for tax or audit purposes
Accessibility Compliance
- Screen readers require text to read aloud
- Add text layer for visually impaired users
- Help meet WCAG and Section 508 requirements
- Make documents inclusive
Best Practices
For Standard Documents
- Language: Select document language
- Pages: All pages
- Format: Standard PDF
- Compression: Medium (2)
- Text handling: Skip text (default)
For Large Documents
- Process in batches (custom page ranges)
- Use high compression to reduce file size
- Monitor progress, cancel if needed
- Download and verify each batch
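For batch processing, it helps to work out the page ranges in advance. The sketch below generates range strings in the same format the Custom Ranges field accepts. `batch_ranges` and the batch size of 50 are illustrative choices, not FunPDF settings.

```python
def batch_ranges(total_pages: int, batch_size: int) -> list[str]:
    """Split pages 1..total_pages into consecutive range strings."""
    if total_pages < 1 or batch_size < 1:
        raise ValueError("total_pages and batch_size must be at least 1")
    ranges = []
    for start in range(1, total_pages + 1, batch_size):
        end = min(start + batch_size - 1, total_pages)
        # A batch holding a single page is written as "101", not "101-101".
        ranges.append(f"{start}-{end}" if end > start else str(start))
    return ranges


print(batch_ranges(120, 50))  # ['1-50', '51-100', '101-120']
```

Enter each string as a custom page range, run OCR, then download and verify the result before moving on to the next batch.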
For Archival Documents
- Format: PDF/A-2 or PDF/A-3
- Compression: Low or Medium
- Process all pages
- Preserve for long-term storage
For Low-Quality Scans
- Use PDF Enhancement first:
  - Enable deskew
  - Enable background removal
  - Enable artifact cleaning
- Then OCR the enhanced PDF
- Results are usually noticeably better than running OCR on the raw scan
Troubleshooting
OCR Accuracy Too Low
Causes:
- Wrong language selected
- Poor scan quality (blurry, faded)
- Low resolution
- Heavy background noise
Solutions:
- Verify correct language is selected
- Use PDF Enhancement to improve scan quality
- Re-scan at higher resolution (300+ DPI recommended)
- Clean original document before scanning
Handwritten Text Not Recognized
Explanation:
- OCR works best on printed text
- Handwriting has very low accuracy
- Cursive text especially problematic
Solution:
- Manual transcription required for handwriting
- OCR not suitable for handwritten documents
PDF Already Contains Text
This is not an error. The tool automatically detects existing text and lets you choose a text handling mode:
- Skip text: Keep existing text, OCR image-only pages (recommended)
- Redo OCR: Replace existing text with fresh OCR
- Force OCR: OCR all pages
File Too Large (>200MB)
Solutions:
- Split PDF into smaller parts
- OCR each part separately
- Merge searchable parts if needed
- Or process with custom page ranges in batches
Processing Failed
Solutions:
Privacy and Security
Small Files (≤10MB)
- Processed on the server without being stored
- Immediate results
- Maximum privacy
Large Files (>10MB)
- Temporarily stored during processing
- Automatically deleted after 2 hours
- Secure temporary storage
- Files don't stay permanently
See "Is My Data Secure?" for full privacy details.
Next Steps
Start making your scanned PDFs searchable with Scan to Searchable PDF now!