Image-to-Text with Spring Boot 3.3 and EasyOCR

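Text embedded in images matters more and more. Document scanning, business card recognition, and license plate recognition all rely on OCR (Optical Character Recognition). This article shows how to integrate EasyOCR into a Spring Boot application to convert images to text. Working code examples cover the whole flow, from uploading an image in the front end, through back-end processing, to the final text recognition.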

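What is OCR?

OCR is a technology that converts printed or handwritten text in an image into editable text. It is widely used in document management systems, license plate recognition, invoice and receipt processing, ID card recognition, and similar fields. Traditional OCR solutions are typically built on complex machine learning algorithms and need large amounts of training data. With the rapid progress of deep learning, more flexible and easier-to-use OCR frameworks have appeared, and EasyOCR is a prominent example.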

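Introduction to the EasyOCR framework

Overview

EasyOCR is an open-source OCR solution developed by Jaided AI. It is built on the PyTorch deep learning framework, works out of the box, is easy to integrate, and supports many languages. Compared with traditional OCR tools, EasyOCR recognizes text quickly and can also handle difficult text images, such as curved text, varied fonts, and text that mixes several languages.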

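EasyOCR features

Multi-language support: EasyOCR supports more than 80 languages, including Chinese, English, Japanese, Korean, and Arabic, which makes it a good fit for scenarios that involve multilingual text.
Open source and free: EasyOCR is fully open source and is maintained and updated on GitHub. Developers can use it for free and build on top of it.
Easy to integrate: a few lines of code are enough to add EasyOCR to an existing project. Its API is simple and clear, which suits rapid development and deployment.
High accuracy: because it uses deep learning models, EasyOCR achieves relatively high recognition accuracy in complex scenes and copes with curved text, busy backgrounds, and similar difficulties.
Relatively lightweight: compared with some other deep-learning OCR solutions, EasyOCR is modest in its resource use. It still depends on PyTorch, so memory and CPU/GPU requirements should be checked on the target device or server.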

Environment setup

Python environment: EasyOCR is a Python library, so Python must be installed on the system.
EasyOCR installation: install EasyOCR with pip.

pip install easyocr

Spring Boot project: we will create a Spring Boot project that receives the image over HTTP and passes it to the Python script for OCR processing.
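
Before wiring up Spring Boot, it can help to confirm that the interpreter you plan to use can actually see EasyOCR. The following is a minimal sketch. The helper name is_module_available is hypothetical and not part of EasyOCR, and the check only verifies that the package is importable without loading any models.

```python
import importlib.util
import sys


def is_module_available(module_name):
    """Return True if `module_name` can be imported by this interpreter."""
    # find_spec returns None when a top-level module cannot be located
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("easyocr importable:", is_module_available("easyocr"))
```

Run it with the same interpreter that the Spring Boot application will call, since a system Python and a virtual environment may have different packages installed.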

Project structure

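easyocr
|-- src
|   |-- main
|       |-- java
|       |   |-- com
|       |       |-- icoderoad
|       |           |-- easyocr
|       |               |-- EasyOcrApplication.java
|       |               |-- config
|       |               |   |-- OcrProperties.java
|       |               |-- controller
|       |                   |-- OcrController.java
|       |-- resources
|           |-- application.yml
|           |-- templates
|           |   |-- index.html
|           |-- static
|               |-- js
|                   |-- app.js
|-- pom.xml
|-- ocr_script.py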

Python OCR script

First, we write a Python script, ocr_script.py, that takes an image file path and runs EasyOCR text recognition on it.

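import sys

import easyocr


def extract_text_from_image(image_path):
    # Initialize the EasyOCR Reader for Simplified Chinese and English
    # ('ch_sim' is Simplified Chinese; use 'ch_tra' for Traditional Chinese)
    reader = easyocr.Reader(['ch_sim', 'en'])
    # Each result is a (bounding box, text, confidence) tuple
    results = reader.readtext(image_path)

    text = ""
    for result in results:
        text += result[1] + "\n"
    return text


if __name__ == "__main__":
    # Emit UTF-8 regardless of platform so the caller can decode the output
    sys.stdout.reconfigure(encoding="utf-8")
    image_path = sys.argv[1]  # image path from the command-line argument
    text = extract_text_from_image(image_path)
    print(text)  # print the recognition result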
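
The script keeps only the text of each result. By default, readtext returns one tuple per detected region containing the bounding box, the text, and a confidence score, so low-confidence fragments can be dropped before the lines are joined. Below is a minimal sketch, assuming a hypothetical helper join_confident_text and an arbitrary threshold of 0.5. The sample data is made up to show the shape of the results and is not real OCR output.

```python
def join_confident_text(results, min_confidence=0.5):
    """Join the text of results whose confidence meets the threshold.

    `results` is expected to have the shape of EasyOCR's default readtext
    output: a list of (bounding_box, text, confidence) tuples.
    """
    lines = []
    for _bbox, text, confidence in results:
        if confidence >= min_confidence:
            lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample in the same shape as readtext output (not real OCR data)
    sample = [
        ([[0, 0], [80, 0], [80, 20], [0, 20]], "Hello", 0.98),
        ([[0, 30], [80, 30], [80, 50], [0, 50]], "w0rld", 0.21),
        ([[0, 60], [80, 60], [80, 80], [0, 80]], "EasyOCR", 0.87),
    ]
    print(join_confident_text(sample))
```

A suitable threshold depends on the images involved, so it is worth tuning against a few representative samples rather than treating 0.5 as a recommendation.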

Spring Boot configuration

pom.xml configuration

Add the spring-boot-starter-web and commons-io dependencies, used to build the REST API and to handle file operations.

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.11.0</version>
    </dependency>
</dependencies>

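application.yml configuration

Configure the upload size limits, the paths to the Python interpreter and the OCR script, and the temporary directory for uploaded files.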

server:
  port: 8080
  
spring:
  servlet:
    multipart:
      max-file-size: 10MB
      max-request-size: 10MB
      
ocr:
  python-path: /path/python/bin/python
  script-path: /path/to/ocr_script.py
  upload-dir: /tmp/uploads/

EasyOcrApplication.java

The Spring Boot startup class.

package com.icoderoad.easyocr;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class EasyOcrApplication {
    public static void main(String[] args) {
        SpringApplication.run(EasyOcrApplication.class, args);
    }
}

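Creating the configuration class

Use the @ConfigurationProperties annotation to create a configuration class that binds the settings in the YAML file into the Spring Boot application.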

OcrProperties.java

package com.icoderoad.easyocr.config;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
@Component
@ConfigurationProperties(prefix = "ocr")
public class OcrProperties {
    private String pythonPath;
    private String scriptPath;
    private String uploadDir;
    
    public String getPythonPath() {
        return pythonPath;
    }
    public void setPythonPath(String pythonPath) {
        this.pythonPath = pythonPath;
    }
    public String getScriptPath() {
        return scriptPath;
    }
    public void setScriptPath(String scriptPath) {
        this.scriptPath = scriptPath;
    }
    public String getUploadDir() {
        return uploadDir;
    }
    public void setUploadDir(String uploadDir) {
        this.uploadDir = uploadDir;
    }
}

OcrController.java

The controller handles the file upload and calls the Python script.

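package com.icoderoad.easyocr.controller;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.FilenameUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;
import com.icoderoad.easyocr.config.OcrProperties;
@RestController
@RequestMapping("/api/ocr")
public class OcrController {
    @Autowired
    private OcrProperties ocrProperties;
    @PostMapping("/extract")
    public String extractText(@RequestParam("file") MultipartFile file) {
        File tempFile = null;
        try {
            // Save the uploaded file. Keep only the last segment of the client-supplied
            // name so it cannot escape the upload directory, and add a random prefix
            // so that concurrent uploads with the same name do not overwrite each other
            String safeName = UUID.randomUUID() + "-" + FilenameUtils.getName(file.getOriginalFilename());
            tempFile = new File(ocrProperties.getUploadDir(), safeName);
            FileUtils.writeByteArrayToFile(tempFile, file.getBytes());
            // Call the Python script; its stderr goes to the application's own stderr
            ProcessBuilder processBuilder = new ProcessBuilder(ocrProperties.getPythonPath(), ocrProperties.getScriptPath(), tempFile.getAbsolutePath());
            processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT);
            Process process = processBuilder.start();
            // Read the output before waiting: waiting first can block forever
            // once the script has written more than the pipe buffer holds
            String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            int exitCode = process.waitFor();
            return exitCode == 0 ? output : "OCR recognition failed";
        } catch (IOException e) {
            e.printStackTrace();
            return "OCR recognition failed";
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can see the interruption
            Thread.currentThread().interrupt();
            return "OCR recognition failed";
        } finally {
            // Remove the temporary upload; deleteQuietly ignores null and missing files
            FileUtils.deleteQuietly(tempFile);
        }
    }
}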
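
When the endpoint returns the failure message, it can help to reproduce the controller's call outside Spring Boot. The sketch below runs a script with the same three arguments the controller uses (interpreter, script path, image path) and returns the exit code along with the captured output. The function name run_script, the file name run_check.py, and the 60-second timeout are assumptions for illustration, not part of the project.

```python
import subprocess
import sys


def run_script(python_path, script_path, image_path, timeout_seconds=60):
    """Run `python_path script_path image_path`; return (exit_code, stdout, stderr)."""
    completed = subprocess.run(
        [python_path, script_path, image_path],
        capture_output=True,      # collect stdout and stderr instead of inheriting them
        encoding="utf-8",         # decode the captured bytes as UTF-8 text
        errors="replace",         # do not fail on bytes that are not valid UTF-8
        timeout=timeout_seconds,  # raises subprocess.TimeoutExpired if exceeded
    )
    return completed.returncode, completed.stdout, completed.stderr


# Usage (hypothetical file name): python run_check.py /path/to/ocr_script.py /path/to/image.png
if __name__ == "__main__" and len(sys.argv) == 3:
    code, out, err = run_script(sys.executable, sys.argv[1], sys.argv[2])
    print("exit code:", code)
    print("stdout:", out)
    print("stderr:", err)
```

A non-zero exit code together with the stderr text usually points to the cause, such as a missing package or an unreadable image path.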

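Front-end example

Use a Thymeleaf template, Bootstrap, and JavaScript to create a simple front-end page where the user can upload an image and view the OCR result. Serving the page from templates/ requires the spring-boot-starter-thymeleaf dependency and a controller mapping that returns the index view, neither of which is listed in this article. Because the page contains no Thymeleaf expressions, it can alternatively be placed under src/main/resources/static and served as a static file.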

src/main/resources/templates/index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>OCR Image Recognition</title>
    <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css">
</head>
<body>
    <div class="container">
        <h1 class="mt-5">OCR Image Recognition</h1>
        <form id="uploadForm">
            <div class="form-group">
                <label for="fileInput">Select an image file:</label>
                <input type="file" class="form-control" id="fileInput" name="file" required>
            </div>
            <button type="submit" class="btn btn-primary">Upload and recognize</button>
        </form>
        <div class="mt-3">
            <h2>Recognition result:</h2>
            <pre id="result">Upload an image to see the recognition result</pre>
        </div>
    </div>
    <script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
    <script src="/js/app.js"></script>
</body>
</html>

src/main/resources/static/js/app.js

$(document).ready(function() {
    $('#uploadForm').on('submit', function(event) {
        event.preventDefault();
        // Get the selected file
        var fileInput = $('#fileInput')[0].files[0];
        // Check that a file was selected
        if (!fileInput) {
            alert("Please select a file");
            return;
        }
        // Create the FormData object
        var formData = new FormData();
        formData.append('file', fileInput);
        // Send the POST request with jQuery AJAX
        $.ajax({
            url: '/api/ocr/extract',
            type: 'POST',
            data: formData,
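            contentType: false, // do not set Content-Type; let the browser handle it
            processData: false, // do not process the data; send it as-is
            success: function(result) {
                // Show the recognition result on the page
                $('#result').text(result);
            },
            error: function(xhr, status, error) {
                console.error('Error:', error);
                alert('Recognition failed, please try again later.');
            }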
            }
        });
    });
});

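Summary

This article showed how to integrate EasyOCR with Spring Boot to recognize text in images. A Python script performs the OCR work, while the Spring Boot application handles the file upload, calls the script, and returns the recognized text to the front-end page. The approach combines EasyOCR's text recognition capabilities with Spring Boot's flexible web development features into a complete solution.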