This code provides functionality for optical character recognition (OCR) using...

July 1, 2025 at 06:21 PM

from PIL import Image
import pytesseract
import cv2

# Path to the Tesseract OCR executable (set this to your own install path)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

def ocr_image(image_path):
    """
    Recognizes text in an image using Tesseract OCR.

    Args:
        image_path: Path to the image.

    Returns:
        The recognized text, or None if an error occurs.
    """
    try:
        # 1. Image preprocessing
        img = cv2.imread(image_path)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1] # Otsu binarization (inverted: text becomes white)

        # Noise removal
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
        opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)

        # Thicken the text strokes, then invert back to dark text on a light background
        result = 255 - cv2.dilate(opening, kernel, iterations=2)

        # 2. Text recognition with Tesseract OCR
        text = pytesseract.image_to_string(result, lang='rus') # Set the language you need

        # 3. Text postprocessing (strip leading and trailing whitespace)
        text = text.strip()

        return text
    except Exception as e:
        print(f"OCR error: {e}")
        return None

# Example usage
# Comment out or delete this block if you do not need it on every run
# image_path = 'example.png' # Replace with the path to your image
# recognized_text = ocr_image(image_path)

# if recognized_text:
#     print(f"Recognized text:\n{recognized_text}")
# else:
#     print("Failed to recognize text.")

# Function for batch file processing
import os

def batch_ocr(directory, output_file="output.txt"):
    """
    Batch-processes the images in the given directory and saves the recognized text to a file.

    Args:
        directory: Path to the directory with images.
        output_file: Path to the file where the recognized text is saved.
    """
    with open(output_file, "w", encoding="utf-8") as f:
        for filename in os.listdir(directory):
            if filename.endswith((".png", ".jpg", ".jpeg")):  # Supported image formats
                image_path = os.path.join(directory, filename)
                recognized_text = ocr_image(image_path)

                if recognized_text:
                    f.write(f"File: {filename}\n")
                    f.write(recognized_text + "\n\n")
                else:
                    f.write(f"File: {filename} - Failed to recognize text.\n\n")

# Batch processing example (comment out or delete if not needed)
# directory_path = "images"  # Replace with the path to your image directory
# batch_ocr(directory_path, "output.txt")

This code performs optical character recognition (OCR) with the Tesseract engine, called through pytesseract, and uses OpenCV to preprocess the images. It provides two functions:

Individual Image OCR (the `ocr_image` function):

Input: Takes the path of an image.
Preprocessing:
- Reads the image using OpenCV (cv2.imread).
- Converts the image to grayscale (cv2.cvtColor).
- Applies inverted binary thresholding with Otsu's method (cv2.threshold with cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU), so the text becomes white on a black background.
- Removes small specks of noise with a morphological opening (cv2.morphologyEx), then thickens the text strokes with a dilation (cv2.dilate) and inverts the result back to dark text on a light background.
OCR: Extracts text from the preprocessed image with pytesseract.image_to_string. The language is set through the lang parameter (lang='rus', Russian, in this script).
Postprocessing: Strips leading and trailing whitespace from the recognized text.
Output: Returns the recognized text. If an exception is raised, it prints the error message and returns None.
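
To make the Otsu step less of a black box, here is a minimal pure-Python sketch of how Otsu's method picks a threshold: it tries every gray level and keeps the one that maximizes the between-class variance of the two resulting pixel groups. The function name `otsu_threshold` and the sample pixel values are illustrative assumptions. The script itself relies on OpenCV's built-in implementation (cv2.THRESH_OTSU), which may choose a slightly different value when several thresholds score equally.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance.

    Pixels with value <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet, nothing to split
        w1 = total - w0
        if w1 == 0:
            break     # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical sample: a dark cluster (text) and a bright cluster (paper).
sample = [10] * 50 + [12] * 50 + [200] * 40 + [205] * 60
t = otsu_threshold(sample)

# Same rule as cv2.THRESH_BINARY_INV: values above t become 0, the rest 255.
binary_inv = [255 if p <= t else 0 for p in sample]
print(t, binary_inv.count(255))
```

With the inverted rule, the dark text pixels end up white, which is the form the opening and dilation steps in `ocr_image` operate on.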

Batch OCR (the `batch_ocr` function):

Input:
- A directory containing images.
- An optional output file path to save the results (default: output.txt).
Processing:
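- Iterates through the files in the given directory whose names end in .png, .jpg, or .jpeg (the match is case-sensitive, and subdirectories are not searched).
- Processes each image using the ocr_image function to extract text.
- Writes the filename and the recognized text to the output file. If ocr_image returns None or an empty string, it writes a failure note for that file instead.
Output:
- Saves the results for all processed images in the specified output file, which is written as UTF-8 and overwritten on each run.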
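
The file-selection step can be sketched on its own. The helper below is hypothetical (it is not part of the script) and differs from the original loop in two deliberate ways: it compares lowercased names, so a file such as SCAN.JPG is also picked up, and it sorts the names so the output order is reproducible, since os.listdir returns entries in arbitrary order.

```python
import os

SUPPORTED_EXTENSIONS = (".png", ".jpg", ".jpeg")  # same formats as batch_ocr


def list_image_files(directory):
    """Return sorted paths of supported image files directly inside `directory`."""
    paths = []
    for filename in sorted(os.listdir(directory)):
        path = os.path.join(directory, filename)
        # Assumption: match on the lowercased name so "A.PNG" also qualifies.
        # Directories are skipped even if their name looks like an image file.
        if os.path.isfile(path) and filename.lower().endswith(SUPPORTED_EXTENSIONS):
            paths.append(path)
    return paths
```

A batch loop could then iterate over `list_image_files(directory)` and pass each path to `ocr_image`.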

Code Customization and Notes:

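The Tesseract executable path (pytesseract.pytesseract.tesseract_cmd) must point to a valid installation, for example C:\Program Files\Tesseract-OCR\tesseract.exe on Windows.
To recognize other languages, change the lang parameter in the pytesseract.image_to_string call. Several languages can be combined with a plus sign (for example, lang='rus+eng'), provided the corresponding language data is installed.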
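
As a rough sketch of how these two settings might be made less machine-specific, the helpers below look for the executable in a few places and build the lang string from a list of codes. Both `find_tesseract` and `build_lang` are hypothetical names introduced here for illustration, and the search order (explicit path, then PATH, then the Windows default from the script) is an assumption rather than anything pytesseract prescribes.

```python
import os
import shutil

# Default Windows install location used in the script above.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(explicit_path=None):
    """Return a path to the tesseract executable, or None if none is found."""
    candidates = [explicit_path, shutil.which("tesseract"), WINDOWS_DEFAULT]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate):
            return candidate
    return None


def build_lang(codes):
    """Join Tesseract language codes with '+', e.g. ['rus', 'eng'] -> 'rus+eng'."""
    return "+".join(codes)
```

The returned path could be assigned to pytesseract.pytesseract.tesseract_cmd when it is not None, and `build_lang(["rus", "eng"])` passed as the lang argument.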

Example Usage:

Individual OCR: Uncomment and set the image_path variable in the ocr_image usage example to process a single image.
Batch OCR: Uncomment the block related to batch_ocr and set the directory_path to process all images in a directory and save the results to a specified file.
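
An alternative to commenting blocks in and out is a small command-line wrapper. The sketch below is an assumption about how that could look, not part of the original script: the argument names (path, --batch, --output) are invented for illustration, and the two OCR functions are passed in as parameters so the dispatch logic can be tried without Tesseract installed.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical command-line front end."""
    parser = argparse.ArgumentParser(
        description="Run OCR on one image or on a directory of images."
    )
    parser.add_argument("path", help="image file, or directory when --batch is set")
    parser.add_argument("--batch", action="store_true",
                        help="treat PATH as a directory of images")
    parser.add_argument("--output", default="output.txt",
                        help="result file for batch mode")
    return parser


def run(args, ocr_image, batch_ocr):
    """Dispatch to the single-image or batch function supplied by the caller."""
    if args.batch:
        batch_ocr(args.path, args.output)
        return None
    return ocr_image(args.path)
```

In the script this could be wired up under an `if __name__ == "__main__":` guard, calling `run(build_parser().parse_args(), ocr_image, batch_ocr)` and printing the returned text in single-image mode.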

Purpose:

This script is designed for automating OCR on images, with preprocessing to improve text recognition accuracy. It is suitable for applications like digitizing scanned documents, extracting text from images, or performing batch text recognition.
