Preface

For the C# version of these methods, see:

https://www.psvmc.cn/article/2022-07-18-opencv-csharp.html

Installing dependencies

pip install opencv-python==4.5.4.60

Image basics

Displaying an image

import cv2

image = cv2.imread('D:\\Project\\csharp\\z-exam-card-recognize\\ml-resource\\1202_1.png')
cv2.imshow('Image', image)

# Wait for a key press
cv2.waitKey(0)

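Reading from paths that contain Chinese characters

import cv2
import numpy as np

def read_img(filename, mode=cv2.IMREAD_COLOR):
    # Load the file into memory with NumPy first, treating it as raw bytes
    raw_data = np.fromfile(filename, dtype=np.uint8)
    # Decode the in-memory bytes into an image
    img = cv2.imdecode(raw_data, mode)
    return img

This function can replace OpenCV's imread, and it also works with paths that contain Chinese characters.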

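Saving an image

Saving with cv2.imwrite fails when the path contains Chinese characters:

import os
import cv2

page_word_folder = 'D:\\Project\\csharp\\z-exam-card-recognize\\ml-resource\\page_word'
file_path = os.path.join(page_word_folder, "{}_{}.png".format(row + 1, col + 1))
cv2.imwrite(file_path, image)

If the path contains Chinese characters, encode the image in memory and write the bytes to disk instead:

# The file name deliberately contains Chinese characters (行 = row, 列 = column)
file_path = os.path.join(paper_word_folder, f"{row + 1}行_{col + 1}列.png")
# Encode as PNG so the data matches the .png extension
cv2.imencode(".png", word20_img)[1].tofile(file_path)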

Flattening image data

word20Img.flatten()

Getting the image size

height, width = image.shape[:2]
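The two one-liners above are easier to read next to concrete shapes. The arrays below are hypothetical stand-ins for real images; this is only a sketch of how shape and flatten behave for color and grayscale data.

```python
import numpy as np

# Hypothetical images: 20 rows (height) by 30 columns (width)
color = np.zeros((20, 30, 3), dtype=np.uint8)  # 3 channels (BGR)
gray = np.zeros((20, 30), dtype=np.uint8)      # single channel

# shape is (rows, cols[, channels]); slicing with [:2] works for both layouts
height, width = color.shape[:2]
gray_height, gray_width = gray.shape[:2]

# flatten() returns a one-dimensional copy, row by row
flat_color = color.flatten()
flat_gray = gray.flatten()
```

Note that shape lists the height first, while most OpenCV functions that take a size expect (width, height).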

Getting an image as Base64

import cv2
import base64
class CvCommonUtils:
    @staticmethod
    def mat_to_base64(mat):
        # Encode the image as a JPEG byte stream
        _, img_encoded = cv2.imencode(".jpg", mat)
        # Encode the byte stream as a Base64 string
        base64_str = base64.b64encode(img_encoded).decode("utf-8")
        return base64_str

    @staticmethod
    def mat_to_base64_all(mat):
        # Encode the image as a JPEG byte stream
        _, img_encoded = cv2.imencode(".jpg", mat)
        # Encode the byte stream as a Base64 string
        base64_str = base64.b64encode(img_encoded).decode("utf-8")
        # Prepend the data URI prefix so the string can be used directly as an image source
        base64StrAll = f"data:image/jpeg;base64,{base64_str}"
        return base64StrAll

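Releasing memory

When you load an image with OpenCV in Python, the result is a NumPy array (the Python counterpart of cv::Mat), so you normally do not need to free memory explicitly: Python's garbage collector manages it automatically.

If you do need to make sure a resource is released, there are a few things you can do.

For example, use the del statement (for images or matrices).

For an image array, del removes the reference, and the memory is reclaimed once no other references to the array remain:

import cv2

# Read the image
image = cv2.imread('image.jpg')

# Process the image
# ...

# Drop the reference to the image
del image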

Image processing

Binarization

An image must be converted to grayscale before it is binarized.

# Binarize an image
@staticmethod
def binary(image):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    ret, bin_image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_TRIANGLE)
    return bin_image

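Checking whether an image is binary

@staticmethod
def is_binary_image(image, tolerance=1e-5):
    """
    Check whether an image is binary
    :param image: input image
    :param tolerance: tolerance for floating-point precision and noise, default 1e-5
    :return: True if the image is binary, otherwise False
    """
    # Convert multi-channel images to single-channel grayscale
    if len(image.shape) > 2:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Normalize pixel values to the 0-1 range
    image_normalized = image.astype(np.float32) / 255.0

    # Collect the distinct pixel values, rounded to the precision implied by tolerance
    unique_values = np.unique(
        np.round(image_normalized, decimals=int(-np.log10(tolerance)))
    )

    # A binary image has at most two distinct pixel values
    return len(unique_values) <= 2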
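The check above accepts any image with at most two distinct gray levels, whatever their values. If your pipeline needs pixels that are exactly 0 or 255, a stricter variant may be useful. The function names and sample arrays below are assumptions for illustration, not part of the original utility class.

```python
import numpy as np


def is_strictly_binary(gray):
    """Return True when every pixel is exactly 0 or 255."""
    return bool(np.isin(gray, (0, 255)).all())


def has_at_most_two_levels(gray):
    """The looser check: at most two distinct gray levels of any value."""
    return len(np.unique(gray)) <= 2


# Hypothetical sample images
black_white = np.array([[0, 255], [255, 0]], dtype=np.uint8)
two_grays = np.array([[10, 20], [20, 10]], dtype=np.uint8)
```

Here two_grays passes the looser check but fails the strict one, which is the difference between the two definitions.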

Erosion and dilation

# Erosion: grows the black regions
@staticmethod
def eroding(image):
    # Define the erosion kernel
    kernel = np.ones((2, 2), np.uint8)
    # Apply erosion
    eroded_image = cv2.erode(image, kernel, iterations=1)
    return eroded_image

# Dilation: grows the white regions
@staticmethod
def dilate(image):
    # Define the dilation kernel
    kernel = np.ones((2, 2), np.uint8)
    # Apply dilation
    dilated_image = cv2.dilate(image, kernel, iterations=1)
    return dilated_image

# Erosion followed by dilation
@staticmethod
def eroding_dilate(image):
    eroded_image = CvCommonUtils.eroding(image)
    return CvCommonUtils.dilate(eroded_image)

Rotation

@staticmethod
def rotate90_counter(source):
    """
    Rotate 90 degrees counterclockwise
    :param source: input image matrix
    :return: rotated image matrix
    """
    # Rotate 90 degrees counterclockwise
    result_mat = cv2.rotate(source, cv2.ROTATE_90_COUNTERCLOCKWISE)
    return result_mat

@staticmethod
def rotate90(source):
    """
    Rotate 90 degrees clockwise
    :param source: input image matrix
    :return: rotated image matrix
    """
    # Rotate 90 degrees clockwise
    result_mat = cv2.rotate(source, cv2.ROTATE_90_CLOCKWISE)
    return result_mat

@staticmethod
def rotate180(source):
    """
    Rotate 180 degrees
    :param source: input image matrix
    :return: rotated image matrix
    """
    # Rotate 180 degrees
    result_mat = cv2.rotate(source, cv2.ROTATE_180)
    return result_mat

Stitching images

import cv2
import numpy as np


class CvCommonUtils:
    @staticmethod
    def joint_mat(mat_list):
        """
        Stack several images vertically into one image
        :param mat_list: list of images (all with the same number of channels)
        :return: the stitched image
        """
        # Nothing to stitch
        if not mat_list:
            return None

        # A single image is returned as-is
        if len(mat_list) == 1:
            return mat_list[0]

        # Total number of rows and the widest column count after stitching
        total_rows = sum(img.shape[0] for img in mat_list)  # total rows
        max_cols = max(img.shape[1] for img in mat_list)  # widest image

        # Create a blank white result image; shape[2:] is empty for grayscale input
        result = np.full(
            (total_rows, max_cols) + mat_list[0].shape[2:], 255, dtype=np.uint8
        )

        # Copy the images into the result one by one
        row_offset = 0
        for img in mat_list:
            rows, cols = img.shape[:2]
            # Copy the current image into its target region
            result[row_offset : row_offset + rows, 0:cols] = img
            row_offset += rows  # advance the row offset

        return result

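Binarization methods

In text recognition (OCR, Optical Character Recognition), the right thresholding method depends on image quality, background complexity, and lighting conditions.

An image must be converted to grayscale before it is binarized.

Below are several common thresholding methods and the text recognition scenarios each one suits.

Simple thresholding

This is the most basic thresholding method: you specify a threshold by hand, and the image is binarized against it.

cv2.THRESH_BINARY: pixels greater than the threshold are set to the maximum value; all others are set to 0.
cv2.THRESH_BINARY_INV: the inverse of cv2.THRESH_BINARY; pixels greater than the threshold are set to 0, and all others are set to the maximum value.

Use case: images with even lighting and a simple background.

Advantage: simple and direct; works well when foreground and background contrast clearly.

Example code:

ret, bin_image_binary = cv2.threshold(image, 200, 255, cv2.THRESH_BINARY)

Inverted

ret, bin_image_binary_inv = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY_INV)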

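Triangle thresholding

cv2.THRESH_TRIANGLE

Triangle thresholding determines the threshold automatically from the image histogram: it draws a line from the histogram peak to the far end of the histogram and picks the gray level whose bin lies farthest from that line.

When scan quality is poor, this method gave the best recognition results.

Use case: images with a single-peak histogram, such as certain scanned documents or low-contrast images.

Advantage: the threshold is computed automatically; suited to specific scenarios.

Example code:

# Triangle thresholding
ret, bin_image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_TRIANGLE)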

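Otsu thresholding

cv2.THRESH_OTSU

Use case: images with a two-peak (bimodal) histogram, especially scanned documents or images with a clearly separated background and foreground.

Advantage: the best threshold is computed automatically, with no manual tuning.

Example code:

import cv2

# Read the image as grayscale
image = cv2.imread('document.jpg', cv2.IMREAD_GRAYSCALE)

# Otsu thresholding
ret, bin_image_otsu = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Show the image
cv2.imshow('Otsu Thresholding', bin_image_otsu)
cv2.waitKey(0)
cv2.destroyAllWindows()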

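Adaptive thresholding

Adaptive thresholding computes a separate threshold for each region of the image.

It is not recommended for scanner output, where it performs very poorly.

cv2.ADAPTIVE_THRESH_MEAN_C: the threshold is the mean of the neighborhood pixels minus a constant.
cv2.ADAPTIVE_THRESH_GAUSSIAN_C: the threshold is the Gaussian-weighted sum of the neighborhood pixels minus a constant.

Use case: unevenly lit images, such as text in natural scenes or low-quality scanned documents.

Advantage: the threshold adapts to the local lighting conditions, which suits unevenly lit images.

Example code:

import cv2

# Read the image as grayscale
image = cv2.imread('document.jpg', cv2.IMREAD_GRAYSCALE)

# Adaptive thresholding (block size 11, constant 2)
bin_image_mean = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
bin_image_gaussian = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)

# Show the images
cv2.imshow('Adaptive Mean', bin_image_mean)
cv2.imshow('Adaptive Gaussian', bin_image_gaussian)
cv2.waitKey(0)
cv2.destroyAllWindows()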

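Summary

Scans with a largely uniform background: simple thresholding (cv2.THRESH_BINARY or cv2.THRESH_BINARY_INV).
Scans with a two-peak histogram: Otsu thresholding (cv2.THRESH_OTSU) usually works well.
Scans with a single-peak histogram: triangle thresholding (cv2.THRESH_TRIANGLE).
Unevenly lit images: adaptive thresholding (cv2.ADAPTIVE_THRESH_MEAN_C or cv2.ADAPTIVE_THRESH_GAUSSIAN_C) is the better fit.

Note

For exam-paper grading, a fixed threshold with cv2.THRESH_BINARY is still the recommended choice. The other methods compute the threshold automatically, so it varies from scan to scan, and the same paper scanned several times can produce inconsistent results.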

Drawing on images

Drawing rectangles

import cv2
import numpy as np


class CvCommonUtils:
    @staticmethod
    def draw_rectangles(image, regions, color=(0, 0, 255), thickness=2):
        """
        Draw rectangles on an image.

        Parameters:
        - image: an image loaded by OpenCV, for example with cv2.imread().
        - regions: list of regions; each item is a list or tuple of four integers giving the top-left corner, width, and height [x1, y1, width, height].
        - color: rectangle color in BGR order, red (0, 0, 255) by default.
        - thickness: line thickness in pixels, 2 by default.

        Returns:
        - A new image with the given rectangles drawn on it.
        """
        # Work on a copy so the original image stays unchanged
        output_image = image.copy()

        for region in regions:
            if len(region) != 4:
                print("Warning: a region must contain four values; skipping this region.")
                continue
            x1, y1, width, height = map(int, region)
            cv2.rectangle(
                output_image, (x1, y1), (x1 + width, y1 + height), color, thickness
            )

        return output_image

    @staticmethod
    def show_image(image, title=""):
        cv2.imshow(title, image)
        # cv2.waitKey(0)

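Garbled Chinese text when drawing

OpenCV's text rendering does not support Unicode characters, Chinese included, so Chinese text cannot be drawn with it directly.

The Pillow library does support drawing Chinese text, so you can combine Pillow and OpenCV to draw text that contains Chinese.

pipenv install pillow==8.4.0

Put the font file simsun.ttc in the same directory as the script.

Method

import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont

# Method of CvCommonUtils
@staticmethod
def cv2AddChineseText(img, text, position, textColor=(255, 0, 0), textSize=20):
    if isinstance(img, np.ndarray):  # convert OpenCV images (BGR arrays) to PIL images
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    # Create an object that can draw on the image
    draw = ImageDraw.Draw(img)
    # Font file and size
    fontStyle = ImageFont.truetype("simsun.ttc", textSize, encoding="utf-8")
    # Draw the text (textColor is in RGB order here)
    draw.text(position, text, textColor, font=fontStyle)
    # Convert back to OpenCV format
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)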

You can also use a font that ships with the operating system:

import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont


def get_default_font():
    import platform

    system_name = platform.system()

    if system_name == "Windows":
        return "C:/Windows/Fonts/simsun.ttc"  # SimSun
    elif system_name == "Darwin":  # macOS
        return "/System/Library/Fonts/Songti.ttc"  # Songti
    elif system_name == "Linux":
        # Common Linux font paths; adjust them to your system if needed
        font_paths = [
            "/usr/share/fonts/truetype/arphic/uming.ttc",
            "/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc",
        ]
        for path in font_paths:
            try:
                with open(path, "rb"):
                    return path
            except FileNotFoundError:
                continue
        raise FileNotFoundError("No suitable Chinese font file found")
    else:
        raise RuntimeError("Unsupported operating system: " + system_name)


# Create a blank white image
image = np.ones((500, 500, 3), dtype=np.uint8) * 255

# Wrap the OpenCV image in a PIL image that can be drawn on
pil_image = Image.fromarray(image)
draw = ImageDraw.Draw(pil_image)

# Get the default font path
font_path = get_default_font()

# Set the font and the text
font = ImageFont.truetype(font_path, 40)  # font file and size
text = "你好，OpenCV！"

# Draw the text
draw.text((50, 100), text, font=font, fill=(0, 0, 0))

# Convert the PIL image back to an array for OpenCV
# (white canvas and black text, so RGB vs BGR order makes no difference here)
image = np.array(pil_image)

# Show the image
cv2.imshow("Text Image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image recognition

import cv2
import numpy as np


class CvCommonUtils:
    @staticmethod
    def get_mat32(source_image):
        # Size of the target image (the source is expected to be single-channel)
        target_size = (32, 32)

        # Size of the source image
        source_height, source_width = source_image.shape[:2]

        # Height and width of the target image
        target_height, target_width = target_size

        # Scale factor that fits the source inside the target and keeps the aspect ratio
        scale_x = target_width / source_width
        scale_y = target_height / source_height
        scale = min(scale_x, scale_y)

        # Size after scaling (at least one pixel per side)
        new_width = max(1, int(source_width * scale))
        new_height = max(1, int(source_height * scale))

        # Resize the source image
        resized_image = cv2.resize(source_image, (new_width, new_height))

        # Create a blank image with a white background
        target_image = np.ones(target_size, np.uint8) * 255

        # Top-left corner that centers the resized image in the target
        start_x = (target_width - new_width) // 2
        start_y = (target_height - new_height) // 2

        # Paste the resized image onto the target image
        target_image[start_y:start_y + new_height,
                     start_x:start_x + new_width] = resized_image

        return target_image

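    @staticmethod
    def remove_lines(image):
        # Note: the input image is modified in place
        # Size of the image
        rows, cols = image.shape[:2]

        # Indices of the rows and columns detected as ruled lines
        empty_lines_hor = []
        empty_lines_ver = []

        # Walk through every row
        for y in range(rows):
            # Pixel values of the current row
            row = image[y, :]

            # The row counts as a line when more than 90% of its pixels are black
            if np.count_nonzero(row == 0) / row.size > 0.9:
                empty_lines_hor.append(y)

        # Walk through every column
        for x in range(cols):
            # Pixel values of the current column
            col = image[:, x]

            # The column counts as a line when more than 90% of its pixels are black
            if np.count_nonzero(col == 0) / col.size > 0.9:
                empty_lines_ver.append(x)

        # Erase the horizontal lines by painting them white
        for row in empty_lines_hor:
            image[row, :] = 255

        # Erase the vertical lines by painting them white
        for col in empty_lines_ver:
            image[:, col] = 255

        return image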

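    @staticmethod
    def get_line_rect_list(img):
        """
        Get the list of regions that contain content
        :param img: input image
        :return: list of rectangles (x, y, width, height) that contain content
        """
        # Size of the image
        rows, cols = img.shape[:2]

        # Indices of the blank rows
        empty_lines = []

        # Walk through every row
        for y in range(rows):
            # Pixel values of the current row
            row = img[y, :]

            # Count the non-zero (white) pixels in the row
            count = cv2.countNonZero(row)

            # More than 95% white pixels means a blank row; record its index
            if count / cols > 0.95:
                empty_lines.append(y)

        last_num = 0
        num_arr_list = []

        # A gap of at least 5 rows between two blank rows is a band of content
        for line_num in empty_lines:
            if line_num - last_num >= 5:
                num_arr_list.append((last_num, line_num))
            last_num = line_num

        rect_list = []

        # Build the rectangle list, one full-width rectangle per band
        for item in num_arr_list:
            rect_list.append(
                (0, item[0], cols, item[1] - item[0])
            )

        return rect_list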

    @staticmethod
    def find_contours(image):
        """
        Find the contours in an image
        """
        contours, hierarchy = cv2.findContours(
            image, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
        return contours

    @staticmethod
    def is_rect_contained(rect1, rect2):
        """
        Check whether rect1 lies entirely inside rect2 (rectangles are (x, y, width, height))
        """
        # True only when all four edges of rect1 are within rect2
        return (rect1[0] >= rect2[0] and rect1[1] >= rect2[1]
                and (rect1[0] + rect1[2]) <= (rect2[0] + rect2[2])
                and (rect1[1] + rect1[3]) <= (rect2[1] + rect2[3]))

    # Intersection test
    @staticmethod
    def is_intersecting(rect1, rect2):
        """
        Check whether two rectangles intersect
        :param rect1: first rectangle (x, y, width, height)
        :param rect2: second rectangle (x, y, width, height)
        :return: True if the rectangles overlap, otherwise False
        """
        x1, y1, width1, height1 = rect1
        x2, y2, width2, height2 = rect2

        right1 = x1 + width1
        bottom1 = y1 + height1

        right2 = x2 + width2
        bottom2 = y2 + height2

        # They intersect unless one lies completely to one side of the other
        return not (x1 >= right2 or right1 <= x2 or y1 >= bottom2 or bottom1 <= y2)

    @staticmethod
    def get_rect_all_by_img(img):
        """
        Extract rectangular regions from an image
        :param img: input image (grayscale)
        :return: list of extracted rectangles
        """
        # Find the contours
        contours = CvCommonUtils.find_contours(img)
        rect_list = []
        for contour in contours:
            rect = cv2.boundingRect(contour)
            x, y, w, h = rect
            # Keep only boxes within the expected size range
            if 4 < w < 100 and 8 < h < 100:
                rect_list.append(rect)
        return rect_list

    @staticmethod
    def get_rect_by_img(img):
        """
        Extract rectangular regions from an image, dropping boxes nested inside other boxes
        :param img: input image (grayscale)
        :return: list of extracted rectangles
        """
        # Find the contours
        contours = CvCommonUtils.find_contours(img)
        rect_list = []
        for contour in contours:
            rect = cv2.boundingRect(contour)
            x, y, w, h = rect
            # Keep only boxes within the expected size range
            if 6 < w < 60 and 10 < h < 60:
                rect_list.append(rect)

        # Drop every rectangle that is contained in another rectangle
        filtered_list = []
        for current_rect in rect_list:
            is_contained = False
            for other_rect in rect_list:
                if current_rect != other_rect and CvCommonUtils.is_rect_contained(current_rect,
                                                                                  other_rect):
                    is_contained = True
                    break
            if not is_contained:
                filtered_list.append(current_rect)
        return filtered_list

Utility class

import base64
import os

import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont


class CvCommonUtils:

    @staticmethod
    def read_img_by_byte(image_bytes):
        # Convert the byte data to a NumPy array
        nparr = np.frombuffer(image_bytes, np.uint8)
        # Decode the image with cv2.imdecode
        img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
        return img

    @staticmethod
    def read_img(filename, mode=cv2.IMREAD_COLOR):
        # Load the file into memory with NumPy first, treating it as raw bytes
        raw_data = np.fromfile(filename, dtype=np.uint8)
        img = cv2.imdecode(raw_data, mode)  # decode the in-memory bytes into an image
        return img

    @staticmethod
    def read_img_gray(filename):
        # Load the file into memory with NumPy first, treating it as raw bytes
        raw_data = np.fromfile(filename, dtype=np.uint8)
        img = cv2.imdecode(raw_data, cv2.IMREAD_GRAYSCALE)  # decode as grayscale
        return img

    @staticmethod
    def save_img(image, save_path):
        # Pick the encoder from the file extension (default .jpg) so data and extension match;
        # writing through tofile also works with paths that contain Chinese characters
        ext = os.path.splitext(save_path)[1] or ".jpg"
        cv2.imencode(ext, image)[1].tofile(save_path)

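    @staticmethod
    def gray(image):
        """
        Convert an image to grayscale
        """
        return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    @staticmethod
    def is_binary_image(image):
        # Convert to grayscale (not needed if the image already is grayscale)
        if len(image.shape) == 3:
            image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        # Compute the histogram
        hist = cv2.calcHist([image], [0], None, [256], [0, 256])

        # Count how many gray levels actually occur (non-empty histogram bins)
        nonzero_count = np.count_nonzero(hist)

        # Exactly two gray levels in use means the image is binary
        return nonzero_count == 2

    @staticmethod
    def binary(image):
        """
        Binarize an image with a fixed threshold of 200
        """
        # Convert to grayscale only when the input still has color channels;
        # cvtColor with COLOR_BGR2GRAY raises an error on single-channel input
        if len(image.shape) == 3:
            image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        ret, bin_image = cv2.threshold(image, 200, 255, cv2.THRESH_BINARY)
        return bin_image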

    # Erosion: grows the black regions
    @staticmethod
    def eroding(image):
        # Define the erosion kernel
        kernel = np.ones((2, 2), np.uint8)
        # Apply erosion
        eroded_image = cv2.erode(image, kernel, iterations=1)
        return eroded_image

    # Dilation: grows the white regions
    @staticmethod
    def dilate(image):
        # Define the dilation kernel
        kernel = np.ones((2, 2), np.uint8)
        # Apply dilation
        dilated_image = cv2.dilate(image, kernel, iterations=1)
        return dilated_image

    # Erosion followed by dilation
    @staticmethod
    def eroding_dilate(image):
        eroded_image = CvCommonUtils.eroding(image)
        return CvCommonUtils.dilate(eroded_image)

    @staticmethod
    def sub_img(image, rect):
        """
        Crop a region of the image; rect is (x, y, width, height)
        """
        y_start, y_end = rect[1], rect[1] + rect[3]
        x_start, x_end = rect[0], rect[0] + rect[2]
        return image[y_start:y_end, x_start:x_end]

    @staticmethod
    def resize_image(source, width, height, interpolation=cv2.INTER_LINEAR):
        """
        Resize an image
        :param source: input image matrix
        :param width: target width
        :param height: target height
        :param interpolation: interpolation method, cv2.INTER_LINEAR by default
        :return: resized image matrix
        """
        # Resize the image
        result_mat = cv2.resize(source, (width, height), interpolation=interpolation)
        return result_mat

    @staticmethod
    def rotate90_counter(source):
        """
        Rotate 90 degrees counterclockwise
        :param source: input image matrix
        :return: rotated image matrix
        """
        # Rotate 90 degrees counterclockwise
        result_mat = cv2.rotate(source, cv2.ROTATE_90_COUNTERCLOCKWISE)
        return result_mat

    @staticmethod
    def rotate90(source):
        """
        Rotate 90 degrees clockwise
        :param source: input image matrix
        :return: rotated image matrix
        """
        # Rotate 90 degrees clockwise
        result_mat = cv2.rotate(source, cv2.ROTATE_90_CLOCKWISE)
        return result_mat

    @staticmethod
    def rotate180(source):
        """
        Rotate 180 degrees
        :param source: input image matrix
        :return: rotated image matrix
        """
        # Rotate 180 degrees
        result_mat = cv2.rotate(source, cv2.ROTATE_180)
        return result_mat

    @staticmethod
    def is_smear_card(image, rect):
        """
        Check whether the given region (x, y, width, height) has been filled in
        """
        sub_image = image[rect[1] : rect[1] + rect[3], rect[0] : rect[0] + rect[2]]
        if not CvCommonUtils.is_binary_image(sub_image):
            sub_image = CvCommonUtils.binary(sub_image)
        # White pixels are non-zero, so the black share is (total - count) / total
        count = cv2.countNonZero(sub_image)
        total = rect[2] * rect[3]
        rate = 1.0 * (total - count) / total
        # More than 55% black pixels counts as filled in
        return rate > 0.55

    @staticmethod
    def get_smear_rate(image, rect):
        """
        Get the fill rate (share of black pixels) of the given region, rounded to two decimals
        """
        sub_image = image[rect[1] : rect[1] + rect[3], rect[0] : rect[0] + rect[2]]
        if not CvCommonUtils.is_binary_image(sub_image):
            sub_image = CvCommonUtils.binary(sub_image)
        count = cv2.countNonZero(sub_image)
        total = rect[2] * rect[3]
        rate = 1.0 * (total - count) / total
        return round(rate, 2)

    @staticmethod
    def get_rate(image):
        """
        Get the fill rate (share of black pixels) of the whole image, rounded to two decimals
        """
        if not CvCommonUtils.is_binary_image(image):
            image = CvCommonUtils.binary(image)
        count = cv2.countNonZero(image)
        height, width = image.shape
        total = width * height
        rate = 1.0 * (total - count) / total
        return round(rate, 2)

    @staticmethod
    def get_rate_similarity(rate1, rate2):
        # Absolute difference between two fill rates (smaller means more similar)
        return abs(rate1 - rate2)

    @staticmethod
    def joint_mat(mat_list):
        """
        Stack several images vertically into one image
        :param mat_list: list of images (all with the same number of channels)
        :return: the stitched image
        """
        # Nothing to stitch
        if not mat_list:
            return None

        # A single image is returned as-is
        if len(mat_list) == 1:
            return mat_list[0]

        # Total number of rows and the widest column count after stitching
        total_rows = sum(img.shape[0] for img in mat_list)  # total rows
        max_cols = max(img.shape[1] for img in mat_list)  # widest image

        # Create a blank white result image; shape[2:] is empty for grayscale input
        result = np.full(
            (total_rows, max_cols) + mat_list[0].shape[2:], 255, dtype=np.uint8
        )

        # Copy the images into the result one by one
        row_offset = 0
        for img in mat_list:
            rows, cols = img.shape[:2]
            # Copy the current image into its target region
            result[row_offset : row_offset + rows, 0:cols] = img
            row_offset += rows  # advance the row offset

        return result

    @staticmethod
    def cv2AddChineseText(img, text, position, textColor=(255, 0, 0), textSize=20):
        if isinstance(img, np.ndarray):  # convert OpenCV images (BGR arrays) to PIL images
            img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        # Create an object that can draw on the image
        draw = ImageDraw.Draw(img)
        # Font file and size
        fontStyle = ImageFont.truetype("simsun.ttc", textSize, encoding="utf-8")
        # Draw the text (textColor is in RGB order here)
        draw.text(position, text, textColor, font=fontStyle)
        # Convert back to OpenCV format
        return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)

    @staticmethod
    def write_txt(img, txt, rect):
        # Place the text at the top-left corner of the rectangle
        text_org = (rect[0], rect[1])
        return CvCommonUtils.cv2AddChineseText(img, txt, text_org)

    @staticmethod
    def draw_rectangles(image, regions, color=(0, 0, 255), thickness=2):
        """
        Draw rectangles on an image.

        Parameters:
        - image: an image loaded by OpenCV, for example with cv2.imread().
        - regions: list of regions; each item is a list or tuple of four integers giving the top-left corner, width, and height [x1, y1, width, height].
        - color: rectangle color in BGR order, red (0, 0, 255) by default.
        - thickness: line thickness in pixels, 2 by default.

        Returns:
        - A new image with the given rectangles drawn on it.
        """
        # Work on a copy so the original image stays unchanged
        output_image = image.copy()

        for region in regions:
            if len(region) != 4:
                print("Warning: a region must contain four values; skipping this region.")
                continue
            x1, y1, width, height = map(int, region)
            cv2.rectangle(
                output_image, (x1, y1), (x1 + width, y1 + height), color, thickness
            )

        return output_image

    @staticmethod
    def show_image(image, title="", max_height=960):
        # Scale the image down proportionally when it is taller than max_height
        height, width = image.shape[:2]
        if height > max_height:
            scale = max_height / height
            width = int(width * scale)
            height = int(height * scale)
        result_mat = cv2.resize(image, (width, height), interpolation=cv2.INTER_LINEAR)
        cv2.imshow(title, result_mat)
        # cv2.waitKey(0)

    @staticmethod
    def mat_to_base64(mat):
        # Encode the image as a JPEG byte stream
        _, img_encoded = cv2.imencode(".jpg", mat)
        # Encode the byte stream as a Base64 string
        base64_str = base64.b64encode(img_encoded).decode("utf-8")
        return base64_str

    @staticmethod
    def mat_to_base64_all(mat):
        # Encode the image as a JPEG byte stream
        _, img_encoded = cv2.imencode(".jpg", mat)
        # Encode the byte stream as a Base64 string
        base64_str = base64.b64encode(img_encoded).decode("utf-8")
        # Prepend the data URI prefix so the string can be used directly as an image source
        base64StrAll = f"data:image/jpeg;base64,{base64_str}"
        return base64StrAll
