
[Model Deployment] Deploying a PaddleOCR Model with OpenVINO (Part 1)

Author: 秋老虎丶_628 | Source: Internet | 2023-06-08 19:26

This post deploys the official PaddleOCR detection model with OpenVINO.

PaddleOCR:https://github.com/PaddlePaddle/PaddleOCR

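PaddleOCR is a very handy OCR toolkit with the following features:

High-quality pretrained PP-OCR models with accurate recognition
- Ultra-lightweight PP-OCRv2 series: detection (3.1M) + direction classifier (1.4M) + recognition (8.5M) = 13.0M
- Ultra-lightweight PP-OCR mobile series: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
- General-purpose PP-OCR server series: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
- Supports mixed Chinese, English, and digit recognition, vertical text, and long text
- Supports multilingual recognition: about 80 languages, including Korean, Japanese, German, and French
PP-Structure document structuring system
- Supports layout analysis and table recognition (including Excel export)
- Supports key information extraction
- Supports DocVQA
A rich set of easy-to-use OCR tools
- Semi-automatic annotation tool PPOCRLabel: fast, efficient data labeling
- Data synthesis tool Style-Text: batch synthesis of images similar to the target scene
Supports user-defined training and offers a wide range of inference and deployment options
Supports quick installation via pip
Runs on Linux, Windows, macOS, and other systems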

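This post uses OpenVINO to deploy the official PaddleOCR detection model and implement text detection. The result looks like this.

Original image:

Detection result: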

I. Model Download

1. Download the inference model

2. Inspect the model

II. OpenVINO Deployment

III. Deployment Results

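I. Model Download
This post deploys the official Chinese-English ultra-lightweight PP-OCRv2 detection model (DBNet). DBNet detects text regions by semantic segmentation; its architecture is illustrated in the paper.

The principle is not covered here; please read the paper for details.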

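1. Download the inference model

PaddleOCR provides many pretrained models. This post deploys the detection model from the "Chinese-English ultra-lightweight PP-OCRv2" set. First download it:

!wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar

Extracting the archive yields the model files, including the inference.pdmodel used below.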
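Where no notebook shell is available, the same download-and-extract step can be sketched in Python with the standard library only. This is a minimal sketch, not part of the original post: the helper names `download` and `extract` are hypothetical, and only the model URL comes from the text above.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the wget command above.
MODEL_URL = "https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar"


def download(url: str, dest: Path) -> Path:
    """Download url to dest unless the file is already present (hypothetical helper)."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> list:
    """Extract a tar archive into out_dir and return its sorted member names."""
    with tarfile.open(archive) as tar:
        names = sorted(tar.getnames())
        tar.extractall(out_dir)
    return names
```

Calling `download(MODEL_URL, Path("ch_PP-OCRv2_det_infer.tar"))` and then `extract(...)` on the returned path would fetch the archive and list what it contains.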

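2. Inspect the model

Open inference.pdmodel in netron to view its structure. Two things matter most:
(a) the model's output, which determines the post-processing;
(b) the input dimensions, which determine the preprocessing.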

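II. OpenVINO Deployment
At deployment time we only need to write the preprocessing and post-processing code. Preprocessing must match what was used in training, so check the corresponding PaddleOCR config file (only the preprocessing and post-processing parts are kept):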

PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
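The part to note here is the mean and standard deviation used by NormalizeImage in preprocessing. In addition, inspecting the model in the previous step showed that its input shape is [?, 3, 960, 960], so a resize step must be added to the preprocessing.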
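To make the shape bookkeeping concrete, the sketch below applies the NormalizeImage and ToCHWImage steps to a dummy image that is already 960 x 960. It is a NumPy-only illustration under stated assumptions: `to_model_input` and the all-white dummy image are hypothetical, and the resize itself is left to `cv2.resize`.

```python
import numpy as np

# Mean and std from the NormalizeImage entry of the config.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_model_input(image_hwc: np.ndarray) -> np.ndarray:
    """Turn an already-resized HWC uint8 image into a 1x3xHxW float32 tensor."""
    x = image_hwc.astype(np.float32) / 255.0    # scale: 1./255.
    x = (x - MEAN) / STD                        # per-channel normalization
    x = x.transpose(2, 0, 1)                    # ToCHWImage: HWC -> CHW
    return np.ascontiguousarray(x[np.newaxis])  # add the batch dimension


# Hypothetical all-white image standing in for a resized photo.
dummy = np.full((960, 960, 3), 255, dtype=np.uint8)
tensor = to_model_input(dummy)
print(tensor.shape, tensor.dtype)  # (1, 3, 960, 960) float32
```

The resulting shape matches the [?, 3, 960, 960] input reported by netron, with the batch dimension set to 1.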

For post-processing, the DBPostProcess class provided by PaddleOCR can be used directly (with minor modifications).
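Two ideas inside DBPostProcess are easy to check by hand: binarizing the probability map with `thresh`, and expanding each shrunk box by a distance of area × unclip_ratio / perimeter. The sketch below reproduces both with NumPy only. The helper names and sample values are hypothetical, and the real class relies on shapely and pyclipper for the polygon work.

```python
import numpy as np


def binarize(prob_map: np.ndarray, thresh: float = 0.3) -> np.ndarray:
    """Pixels strictly above thresh become 1 (text), the rest 0."""
    return (prob_map > thresh).astype(np.uint8)


def polygon_area_and_perimeter(points):
    """Shoelace area and edge-length sum of a closed polygon."""
    pts = np.asarray(points, dtype=np.float64)
    nxt = np.roll(pts, -1, axis=0)  # each vertex's successor
    area = 0.5 * abs(np.sum(pts[:, 0] * nxt[:, 1] - nxt[:, 0] * pts[:, 1]))
    perimeter = np.sum(np.linalg.norm(nxt - pts, axis=1))
    return float(area), float(perimeter)


def unclip_distance(points, unclip_ratio: float = 1.5) -> float:
    """Offset distance used to grow a shrunk text box back out."""
    area, perimeter = polygon_area_and_perimeter(points)
    return area * unclip_ratio / perimeter


# Hypothetical 100 x 20 text box: area 2000, perimeter 240.
box = [(0, 0), (100, 0), (100, 20), (0, 20)]
print(unclip_distance(box, unclip_ratio=1.5))  # 12.5
```

A larger `unclip_ratio` grows the boxes further, which is why the config value (1.5) is passed explicitly rather than relying on the class default.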

The full code and the related commands are given below:

# Command: python predict.py --model_path {path to the inference.pdmodel obtained above} --image_path {image path}
# Example: python predict.py --model_path inference.pdmodel --image_path test.png
import argparse

import cv2
import numpy as np
import pyclipper
from openvino.runtime import Core
from shapely.geometry import Polygon


def normalize(im, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]):
    # Scale to [0, 1], then normalize each channel.
    im = im.astype(np.float32, copy=False) / 255.0
    im -= mean
    im /= std
    return im


def resize(im, target_size=608, interp=cv2.INTER_LINEAR):
    if isinstance(target_size, list) or isinstance(target_size, tuple):
        w = target_size[0]
        h = target_size[1]
    else:
        w = target_size
        h = target_size
    im = cv2.resize(im, (w, h), interpolation=interp)
    return im


class DBPostProcess(object):
    """
    The post process for Differentiable Binarization (DB).
    """

    def __init__(self,
                 thresh=0.3,
                 box_thresh=0.7,
                 max_candidates=1000,
                 unclip_ratio=2.0,
                 use_dilation=False,
                 score_mode="fast",
                 **kwargs):
        self.thresh = thresh
        self.box_thresh = box_thresh
        self.max_candidates = max_candidates
        self.unclip_ratio = unclip_ratio
        self.min_size = 3
        self.score_mode = score_mode
        assert score_mode in [
            "slow", "fast"
        ], "Score mode must be in [slow, fast] but got: {}".format(score_mode)

        self.dilation_kernel = None if not use_dilation else np.array(
            [[1, 1], [1, 1]])

    def boxes_from_bitmap(self, pred, _bitmap, dest_width, dest_height):
        '''
        _bitmap: single map with shape (1, H, W),
                whose values are binarized as {0, 1}
        '''
        bitmap = _bitmap
        height, width = bitmap.shape

        # OpenCV 3 returns three values, OpenCV 4 returns two.
        outs = cv2.findContours((bitmap * 255).astype(np.uint8), cv2.RETR_LIST,
                                cv2.CHAIN_APPROX_SIMPLE)
        if len(outs) == 3:
            img, contours, _ = outs[0], outs[1], outs[2]
        elif len(outs) == 2:
            contours, _ = outs[0], outs[1]

        num_contours = min(len(contours), self.max_candidates)

        boxes = []
        scores = []
        for index in range(num_contours):
            contour = contours[index]
            points, sside = self.get_mini_boxes(contour)
            # NOTE: the source page lost the text between '<' and '>' in two
            # places (here and at the start of get_mini_boxes). The affected
            # lines are restored on the assumption that they match
            # PaddleOCR's DBPostProcess.
            if sside < self.min_size:
                continue
            points = np.array(points)
            if self.score_mode == "fast":
                score = self.box_score_fast(pred, points.reshape(-1, 2))
            else:
                score = self.box_score_slow(pred, contour)
            if self.box_thresh > score:
                continue

            box = self.unclip(points).reshape(-1, 1, 2)
            box, sside = self.get_mini_boxes(box)
            if sside < self.min_size + 2:
                continue
            box = np.array(box)

            # Map coordinates from the prediction map back to the source image.
            box[:, 0] = np.clip(
                np.round(box[:, 0] / width * dest_width), 0, dest_width)
            box[:, 1] = np.clip(
                np.round(box[:, 1] / height * dest_height), 0, dest_height)
            boxes.append(box.astype(np.int16))
            scores.append(score)
        return np.array(boxes, dtype=np.int16), scores

    def unclip(self, box):
        # Expand the shrunk box by distance = area * unclip_ratio / perimeter.
        unclip_ratio = self.unclip_ratio
        poly = Polygon(box)
        distance = poly.area * unclip_ratio / poly.length
        offset = pyclipper.PyclipperOffset()
        offset.AddPath(box, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
        expanded = np.array(offset.Execute(distance))
        return expanded

    def get_mini_boxes(self, contour):
        # Minimum-area rectangle, with corners ordered clockwise from top-left.
        bounding_box = cv2.minAreaRect(contour)
        points = sorted(list(cv2.boxPoints(bounding_box)), key=lambda x: x[0])

        index_1, index_2, index_3, index_4 = 0, 1, 2, 3
        if points[1][1] > points[0][1]:
            index_1 = 0
            index_4 = 1
        else:
            index_1 = 1
            index_4 = 0
        if points[3][1] > points[2][1]:
            index_2 = 2
            index_3 = 3
        else:
            index_2 = 3
            index_3 = 2

        box = [
            points[index_1], points[index_2], points[index_3], points[index_4]
        ]
        return box, min(bounding_box[1])

    def box_score_fast(self, bitmap, _box):
        '''
        box_score_fast: use bbox mean score as the mean score
        '''
        h, w = bitmap.shape[:2]
        box = _box.copy()
        # np.int was removed in NumPy 1.24; use an explicit integer dtype.
        xmin = np.clip(np.floor(box[:, 0].min()).astype(np.int32), 0, w - 1)
        xmax = np.clip(np.ceil(box[:, 0].max()).astype(np.int32), 0, w - 1)
        ymin = np.clip(np.floor(box[:, 1].min()).astype(np.int32), 0, h - 1)
        ymax = np.clip(np.ceil(box[:, 1].max()).astype(np.int32), 0, h - 1)

        mask = np.zeros((ymax - ymin + 1, xmax - xmin + 1), dtype=np.uint8)
        box[:, 0] = box[:, 0] - xmin
        box[:, 1] = box[:, 1] - ymin
        cv2.fillPoly(mask, box.reshape(1, -1, 2).astype(np.int32), 1)
        return cv2.mean(bitmap[ymin:ymax + 1, xmin:xmax + 1], mask)[0]

    def box_score_slow(self, bitmap, contour):
        '''
        box_score_slow: use polygon mean score as the mean score
        '''
        h, w = bitmap.shape[:2]
        contour = contour.copy()
        contour = np.reshape(contour, (-1, 2))

        xmin = np.clip(np.min(contour[:, 0]), 0, w - 1)
        xmax = np.clip(np.max(contour[:, 0]), 0, w - 1)
        ymin = np.clip(np.min(contour[:, 1]), 0, h - 1)
        ymax = np.clip(np.max(contour[:, 1]), 0, h - 1)

        mask = np.zeros((ymax - ymin + 1, xmax - xmin + 1), dtype=np.uint8)

        contour[:, 0] = contour[:, 0] - xmin
        contour[:, 1] = contour[:, 1] - ymin

        cv2.fillPoly(mask, contour.reshape(1, -1, 2).astype(np.int32), 1)
        return cv2.mean(bitmap[ymin:ymax + 1, xmin:xmax + 1], mask)[0]

    def __call__(self, pred, shape_list):
        pred = pred[:, 0, :, :]
        segmentation = pred > self.thresh

        boxes_batch = []
        for batch_index in range(pred.shape[0]):
            src_h, src_w, _, _ = shape_list[batch_index]
            if self.dilation_kernel is not None:
                mask = cv2.dilate(
                    np.array(segmentation[batch_index]).astype(np.uint8),
                    self.dilation_kernel)
            else:
                mask = segmentation[batch_index]
            boxes, scores = self.boxes_from_bitmap(pred[batch_index], mask,
                                                   src_w, src_h)
            boxes_batch.append({'points': boxes})
        return boxes_batch


class Predictor:
    def __init__(self,
                 model_path,
                 target_size=(960, 960),
                 mean=[0.485, 0.456, 0.406],
                 std=[0.229, 0.224, 0.225]):
        self.target_size = target_size
        self.mean = mean
        self.std = std
        self.model_path = model_path
        # Post-processing follows PaddleOCR; values come from the config above.
        self.post_process = DBPostProcess(thresh=0.3,
                                          box_thresh=0.6,
                                          max_candidates=1000,
                                          unclip_ratio=1.5,
                                          use_dilation=False,
                                          score_mode="fast")

    def preprocess(self, image):
        image = resize(image, target_size=self.target_size)
        image = normalize(image, mean=self.mean, std=self.std)
        return image

    def draw_det(self, image, dt_boxes):
        for box in dt_boxes:
            box = box.astype(np.int32).reshape((-1, 1, 2))
            cv2.polylines(image, [box], True, color=(255, 255, 0), thickness=2)
        return image

    def predict(self, image_path):
        image = cv2.imread(image_path)
        image_h, image_w, _ = image.shape
        inputs = self.preprocess(image)
        # HWC -> CHW, then add the batch dimension.
        input_image = np.expand_dims(inputs.transpose(2, 0, 1), 0)

        ie = Core()
        model = ie.read_model(model=self.model_path)
        compiled_model = ie.compile_model(model=model, device_name="CPU")
        input_layer_ir = next(iter(compiled_model.inputs))
        output_layer_ir = next(iter(compiled_model.outputs))
        mask = compiled_model([input_image])[output_layer_ir]

        # Wrap in a list to match the batch dimension (batch size is 1).
        shape_list = [[image_h, image_w, None, None]]
        # DBPostProcess; post-processing follows PaddleOCR.
        boxes_batch = self.post_process(mask, shape_list)
        # Draw the detected boxes.
        image = self.draw_det(image, boxes_batch[0]['points'])
        return image


def parse_args():
    parser = argparse.ArgumentParser(description='Model export.')
    # params of training
    parser.add_argument(
        '--model_path',
        dest='model_path',
        help='The path of pdmodel for export',
        type=str,
        default=None)
    parser.add_argument(
        '--image_path',
        dest='image_path',
        help='The path of image to predict.',
        type=str,
        default=None)
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    model_path = args.model_path
    image_path = args.image_path
    predictor = Predictor(model_path,
                          target_size=(960, 960),
                          mean=[0.485, 0.456, 0.406],
                          std=[0.229, 0.224, 0.225])
    image = predictor.predict(image_path)
    cv2.imwrite("result.png", image)
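Because the image is resized to 960 x 960 before inference, box coordinates found on the prediction map have to be scaled back to the source image, which is why `shape_list` carries the original height and width. The sketch below shows that mapping in isolation. It is an illustration under stated assumptions: `rescale_box`, the sample box, and the 1920 x 1080 image size are hypothetical.

```python
import numpy as np


def rescale_box(box, map_size, image_size):
    """Map box corners from (map_w, map_h) coordinates to (image_w, image_h)."""
    map_w, map_h = map_size
    image_w, image_h = image_size
    out = np.asarray(box, dtype=np.float32).copy()
    # Scale each axis by its own ratio, then clamp to the image bounds.
    out[:, 0] = np.clip(np.round(out[:, 0] / map_w * image_w), 0, image_w)
    out[:, 1] = np.clip(np.round(out[:, 1] / map_h * image_h), 0, image_h)
    return out.astype(np.int32)


# Hypothetical box found on the 960 x 960 prediction map.
box_on_map = [[96, 480], [480, 480], [480, 576], [96, 576]]
print(rescale_box(box_on_map, (960, 960), (1920, 1080)))
```

Since the resize ignores the aspect ratio, the x and y axes are scaled independently, so a square region on the map generally becomes a rectangle on the source image.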

III. Deployment Results
Original image:

Prediction result:
