July 1, 2025 at 06:21 PM
This code provides functionality for optical character recognition (OCR) using the Tesseract OCR engine. It performs the following tasks:
Individual Image OCR (the ocr_image function):
- Input: Takes the path of an image.
- Preprocessing:
  - Reads the image using OpenCV (cv2.imread).
  - Converts the image to grayscale (cv2.cvtColor).
  - Applies thresholding using Otsu's method to make it binary (cv2.threshold).
  - Removes noise using morphological operations like opening and dilating (cv2.morphologyEx and cv2.dilate).
- OCR: Extracts text from the preprocessed image using Tesseract OCR (pytesseract.image_to_string) and allows specifying the language (e.g., lang='rus' for Russian).
- Postprocessing: Strips extra whitespace from the recognized text.
- Output: Returns the recognized text. If an error occurs, it returns None.
Batch OCR (the batch_ocr function):
- Input:
  - A directory containing images.
  - An optional output file path to save the results (default: output.txt).
- Processing:
  - Iterates through all supported image files (.png, .jpg, .jpeg) in the given directory.
  - Processes each image using the ocr_image function to extract text.
  - Writes the recognized text (or error message) along with the filename to the output file.
- Output:
  - Saves recognized text for all processed images in the specified output file.
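The batch loop described above might look something like the sketch below. Several details are assumptions rather than facts about the original script: the ocr_fn parameter (added here so the sketch runs on its own; the original calls ocr_image directly), the case-insensitive extension check, the "File: ..." output layout, the error message wording, and the returned count.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def batch_ocr(directory, ocr_fn, output_file="output.txt"):
    """Run ocr_fn on every supported image in directory and log the results.

    ocr_fn is any callable taking an image path and returning text or None
    (hypothetical parameter; in the original this role is played by ocr_image).
    """
    processed = 0
    with open(output_file, "w", encoding="utf-8") as out:
        # Sorted so the output order is stable across runs.
        for path in sorted(Path(directory).iterdir()):
            if not path.is_file() or path.suffix.lower() not in SUPPORTED_EXTENSIONS:
                continue
            text = ocr_fn(str(path))
            out.write(f"File: {path.name}\n")
            out.write(text if text is not None else "Error: could not recognize text")
            out.write("\n\n")
            processed += 1
    return processed
```

With a single-image function such as ocr_image in scope, a call would be batch_ocr("scans", ocr_image, "output.txt"), where "scans" is a placeholder directory name.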
Code Customization and Notes:
- The Tesseract OCR executable path (tesseract_cmd) must be correctly set for Tesseract to work (e.g., C:\Program Files\Tesseract-OCR\tesseract.exe on Windows).
- The script supports multi-language OCR by modifying the lang parameter in the pytesseract.image_to_string call.
Example Usage:
- Individual OCR: Uncomment and set the image_path variable in the ocr_image usage example to process a single image.
- Batch OCR: Uncomment the block related to batch_ocr and set the directory_path to process all images in a directory and save the results to a specified file.
Purpose:
This script is designed for automating OCR on images, with preprocessing to improve text recognition accuracy. It is suitable for applications like digitizing scanned documents, extracting text from images, or performing batch text recognition.