Recognize text in scanned documents

You can use Acrobat to recognize text in previously scanned documents that have already been converted to PDF. Optical character recognition (OCR) software enables you to search, correct, and copy the text in a scanned PDF. To apply OCR to a PDF, the original scanner resolution must have been set at 72 dpi or higher.
Note: Scanning at 300 dpi produces the best text for conversion. At 150 dpi, OCR accuracy is slightly lower.
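To check whether a scan meets these resolution figures before applying OCR, the arithmetic can be sketched as follows. This is an illustrative Python sketch, not part of Acrobat: the function names and the wording of the returned labels are assumptions, and treating anything at or above 300 dpi as the best case is also an assumption. Only the 72 dpi minimum and the 300 dpi recommendation come from the text above.

```python
def effective_dpi(width_px: int, height_px: int,
                  width_in: float, height_in: float) -> float:
    """Return the lower of the horizontal and vertical scan resolutions."""
    if width_in <= 0 or height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return min(width_px / width_in, height_px / height_in)


def ocr_readiness(dpi: float) -> str:
    """Classify a resolution against the thresholds named in the text.

    The labels are hypothetical; Acrobat does not report these strings.
    """
    if dpi < 72:
        return "below the 72 dpi minimum"
    if dpi < 300:
        return "accepted, accuracy may be lower than at 300 dpi"
    return "recommended resolution"


if __name__ == "__main__":
    # A US Letter page (8.5 x 11 in) scanned to 1275 x 1650 pixels.
    dpi = effective_dpi(1275, 1650, 8.5, 11.0)
    print(dpi, ocr_readiness(dpi))
```

A page image whose pixel dimensions are known can be checked this way before deciding whether to rescan it at a higher resolution.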

Recognize text in a single document

  1. Open the scanned PDF.
  2. Choose Document > OCR Text Recognition > Recognize Text Using OCR.
  3. In the Recognize Text dialog box, select an option under Pages.
  4. Optionally, click Edit to open the Recognize Text - Settings dialog box, and specify the options as needed.

Recognize text in multiple documents

  1. In Acrobat, choose Document > OCR Text Recognition > Recognize Text In Multiple Files Using OCR.
  2. In the Paper Capture Multiple Files dialog box, click Add Files, and choose Add Files, Add Folders, or Add Open Files. Then select the files or folder.
  3. In the Output Options dialog box, specify a target folder for output files, filename preferences, and an output format.
  4. In the Recognize Text - Settings dialog box, specify the options, and then click OK.
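When preparing a large batch, it can help to preview where the output files would land before running the steps above. The following Python sketch only plans file paths; it does not call Acrobat or perform OCR. The function name and the "_ocr" filename suffix are hypothetical and stand in for whatever filename preference you choose in the Output Options dialog box.

```python
from pathlib import Path
from typing import Dict, Iterable


def plan_output_paths(inputs: Iterable[str], target_folder: str,
                      suffix: str = "_ocr") -> Dict[Path, Path]:
    """Map each input file to a planned output path in the target folder.

    The suffix is an assumed naming preference, not Acrobat's default.
    """
    target = Path(target_folder)
    plan: Dict[Path, Path] = {}
    for src in map(Path, inputs):
        # Keep the original extension and append the suffix to the base name.
        plan[src] = target / f"{src.stem}{suffix}{src.suffix}"
    return plan


if __name__ == "__main__":
    for src, dst in plan_output_paths(["scans/a.pdf", "scans/b.pdf"], "out").items():
        print(src, "->", dst)
```

Reviewing the planned paths first makes it easier to spot name collisions, such as two input folders that contain files with the same name.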

Recognize text in component PDFs in a PDF Portfolio

  1. Select one or more scanned PDFs in a PDF Portfolio.
  2. Choose Document > OCR Text Recognition > Recognize Text Using OCR.
  3. Specify the options in the Recognize Text - Settings dialog box.

Recognize Text - Settings dialog box

Primary OCR Language
Specifies the language for the OCR engine to use to identify the characters.

PDF Output Style
Determines the type of PDF to produce. All options require an input resolution of at least 72 dpi; higher resolutions are recommended. All options apply OCR and font and page recognition to the text images and convert them to normal text.

Searchable Image
Ensures that text is searchable and selectable. This option keeps the original image, deskews it as needed, and places an invisible text layer over it. The selection for Downsample Images in this same dialog box determines whether the image is downsampled and to what extent.

Searchable Image (Exact)
Ensures that text is searchable and selectable. This option keeps the original image and places an invisible text layer over it. Recommended for cases requiring maximum fidelity to the original image.

ClearScan
Converts the fonts in the document to custom fonts that closely approximate the original, and preserves the page background using a low-resolution copy. To edit text in PDFs created using ClearScan, replace the custom fonts with fonts that you have on your computer. For more information, see Replace custom fonts with local fonts.

Downsample Images
Decreases the number of pixels in color, grayscale, and monochrome images after OCR is complete. Choose the degree of downsampling to apply. Higher-numbered options do less downsampling, producing higher-resolution PDFs.
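The effect of a downsampling choice on image size can be estimated with simple arithmetic. The Python sketch below is a rough illustration under stated assumptions: it treats each option as a target resolution in dpi, scales both dimensions by the ratio of target to source resolution, and leaves an image unchanged when the target is not lower than the source. The function name is hypothetical, and Acrobat's actual resampling method is not described in the text above.

```python
from typing import Tuple


def downsampled_size(width_px: int, height_px: int,
                     source_dpi: float, target_dpi: float) -> Tuple[int, int]:
    """Estimate pixel dimensions after downsampling to target_dpi."""
    if source_dpi <= 0 or target_dpi <= 0:
        raise ValueError("resolutions must be positive")
    if target_dpi >= source_dpi:
        # Assumption: downsampling never increases resolution.
        return width_px, height_px
    scale = target_dpi / source_dpi
    # Keep at least one pixel in each dimension.
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


if __name__ == "__main__":
    # A 600 dpi US Letter scan reduced to 300 dpi keeps a quarter of its pixels.
    print(downsampled_size(5100, 6600, 600, 300))
```

Because both dimensions shrink by the same ratio, halving the resolution reduces the pixel count to roughly one quarter, which is why higher-numbered options produce noticeably larger files.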