
Skewed Text Correction with OpenCV

2024-11-17 Source: 个人技术集锦

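This article shows how to correct skewed text (deskewing) with OpenCV, for your reference. The details are as follows.

The input is the skewed text image from the OpenCV examples. The goal is to rotate that skewed text back to horizontal.

The main idea is:

Read the image -> Canny edge detection -> morphological operation -> extract the minimum-area bounding rectangle -> compute the rotation matrix -> apply an affine transform to correct the text image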

Original image:

Extracted minimum-area bounding rectangle region

Corrected image

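## Main APIs involved

### Creating a trackbar

The createTrackbar API creates a slider that changes the value of a variable while the program is running, so the effect on the image can be observed without modifying the program. Here the trackbar adjusts the Canny edge-detection threshold until suitable edges are extracted.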
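The callback contract that createTrackbar relies on (a function of the form `void Foo(int, void*)` plus an optional userdata pointer) can be sketched without a GUI. The following is a minimal illustration, not OpenCV code: `EdgeParams`, `onThresholdChange`, and `simulateSliderMove` are hypothetical names, the slider is simulated by calling the callback directly, and the 1:2 ratio between the two thresholds is borrowed from the Canny call in the complete code. It shows how state could travel through userdata instead of global variables.

```cpp
// Hypothetical sketch: state travels through userdata instead of globals.
// The callback type mirrors the void Foo(int, void*) form described for createTrackbar.
using TrackbarCallback = void (*)(int pos, void* userdata);

struct EdgeParams {
    int lowThreshold = 0;   // would be passed to Canny as threshold1
    int highThreshold = 0;  // would be passed to Canny as threshold2
};

// Callback: recover the typed state from the untyped pointer and update it.
void onThresholdChange(int pos, void* userdata) {
    auto* params = static_cast<EdgeParams*>(userdata);
    params->lowThreshold = pos;
    params->highThreshold = pos * 2;
}

// Stand-in for the GUI: a real trackbar would invoke the callback on every move.
// A null callback means nothing is called, as the API description states.
void simulateSliderMove(TrackbarCallback cb, int pos, void* userdata) {
    if (cb != nullptr) {
        cb(pos, userdata);
    }
}
```

With the real API, the same callback and the address of an `EdgeParams` object would be passed as the last two arguments of createTrackbar.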

```cpp
/**
Clicking the label of each trackbar enables editing the trackbar values manually.

@param trackbarname Name of the created trackbar.
// First parameter: the name of the trackbar.
@param winname Name of the window that will be used as a parent of the created trackbar.
// Name of the window the trackbar is attached to.
@param value Optional pointer to an integer variable whose value reflects the position of the slider. Upon creation, the slider position is defined by this variable.
// The variable that changes when the slider is dragged. It must be defined beforehand; its value at creation time is the initial slider position.
@param count Maximal position of the slider. The minimal position is always 0.
// Maximum position of the slider.
@param onChange Pointer to the function to be called every time the slider changes position. This function should be prototyped as void Foo(int,void\*); , where the first parameter is the trackbar position and the second parameter is the user data (see the next parameter). If the callback is the NULL pointer, no callbacks are called, but only value is updated.
// Callback invoked every time the slider moves. Its form is void Foo(int, void*): the first argument is the trackbar position and the second is the user data (userdata).
@param userdata User data that is passed as is to the callback. It can be used to handle trackbar events without using global variables.
// Data passed to the callback. This article uses global variables, so this parameter can be left at its default.
 */
CV_EXPORTS int createTrackbar(const String& trackbarname, const String& winname,
                              int* value, int count,
                              TrackbarCallback onChange = 0,
                              void* userdata = 0);
```

### Canny edge detection

The Canny edge-detection algorithm is used to extract the edges of the text image.

```cpp
/** @brief Finds edges in an image using the Canny algorithm @cite Canny86 .

The function finds edges in the input image and marks them in the output map edges using the Canny algorithm. The smallest value between threshold1 and threshold2 is used for edge linking. The largest value is used to find initial segments of strong edges. See
<http://en.wikipedia.org/wiki/Canny_edge_detector>

@param image 8-bit input image.
// Input image.
@param edges output edge map; single channels 8-bit image, which has the same size as image .
// Output: a single-channel 8-bit image.
@param threshold1 first threshold for the hysteresis procedure.
// Threshold 1 (when it is the lower of the two, gradients below it are rejected outright).
@param threshold2 second threshold for the hysteresis procedure.
// Threshold 2 (when it is the higher of the two, gradients above it are accepted as edges).
// Values between the two thresholds are kept only if they connect to a strong edge.
@param apertureSize aperture size for the Sobel operator.
// Kernel size of the Sobel operator.
@param L2gradient a flag, indicating whether a more accurate \f$L_2\f$ norm
\f$=\sqrt{(dI/dx)^2 + (dI/dy)^2}\f$ should be used to calculate the image gradient magnitude (L2gradient=true ), or whether the default \f$L_1\f$ norm\f$=|dI/dx|+|dI/dy|\f$ is enough (L2gradient=false ).
// Whether to use the more accurate L2 norm (true) or the default L1 norm (false).

 */
CV_EXPORTS_W void Canny( InputArray image, OutputArray edges,
                         double threshold1, double threshold2,
                         int apertureSize = 3, bool L2gradient = false );
```

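Because the test image is fairly clean, moving the trackbar has little effect on the edge-detection result. For lower-quality images, a preprocessing step can be added to improve the source image first.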
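The role of the two Canny thresholds is easier to see on a one-dimensional signal. The sketch below is an illustration only: `hysteresis1D` is a hypothetical function, it works on a plain row of gradient magnitudes, and it leaves out the gradient computation and non-maximum suppression that the real algorithm performs. The exact comparison operators (strict or non-strict) are also an assumption. A run of samples at or above the low threshold is kept only if at least one sample in the run reaches the high threshold.

```cpp
#include <cstddef>
#include <vector>

// 1-D illustration of hysteresis thresholding (hypothetical helper, not OpenCV code).
// Returns 1 for samples accepted as edges and 0 otherwise.
std::vector<int> hysteresis1D(const std::vector<double>& mag, double low, double high) {
    std::vector<int> edges(mag.size(), 0);
    std::size_t i = 0;
    while (i < mag.size()) {
        if (mag[i] < low) {  // below the low threshold: rejected outright
            ++i;
            continue;
        }
        // Walk one run of consecutive candidates (magnitude >= low).
        const std::size_t start = i;
        bool hasStrong = false;
        while (i < mag.size() && mag[i] >= low) {
            if (mag[i] >= high) {
                hasStrong = true;  // the run contains a strong edge sample
            }
            ++i;
        }
        // Weak samples survive only when linked to a strong one.
        if (hasStrong) {
            for (std::size_t k = start; k < i; ++k) {
                edges[k] = 1;
            }
        }
    }
    return edges;
}
```

For the magnitudes {10, 60, 120, 70, 20, 65, 30} with low = 50 and high = 100, the samples 60, 120, and 70 form a run containing a strong sample and are kept, while the isolated 65 is discarded.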

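### Morphological processing

A morphological operation closes the gaps between characters that remain after Canny edge detection, so that the text becomes a single connected region.

Steps: create a structuring element (getStructuringElement) -> dilate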

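```cpp
Mat src_dilate;
Mat kernel = getStructuringElement(MORPH_RECT, Size(11, 11), Point());
// Create the structuring element. The first argument is its shape (MORPH_RECT for a rectangle,
// MORPH_CROSS for a cross, MORPH_ELLIPSE for an ellipse), the second is its size,
// and the third is the anchor position.
dilate(Canny_edge, src_dilate, kernel, Point(-1, -1), 1, BORDER_DEFAULT);
// Dilate the image. The arguments are: input image, output image, structuring element,
// anchor position, number of iterations, and border extrapolation type.
```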
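To see why dilation joins neighbouring characters into one connected region, here is a minimal sketch of binary dilation with a square structuring element, written in plain C++ rather than with OpenCV. `Grid` and `dilateSquare` are hypothetical names, the kernel size is assumed to be odd with the anchor at its centre, and pixels outside the image are simply ignored, which may differ in detail from how dilate treats borders.

```cpp
#include <vector>

using Grid = std::vector<std::vector<int>>;  // 0 = background, non-zero = foreground

// Binary dilation with a ksize x ksize square structuring element (ksize assumed odd).
// An output pixel becomes 1 if any input pixel under the kernel window is foreground.
Grid dilateSquare(const Grid& in, int ksize) {
    const int rows = static_cast<int>(in.size());
    const int cols = rows > 0 ? static_cast<int>(in[0].size()) : 0;
    const int r = ksize / 2;  // kernel radius around the centre anchor
    Grid out(rows, std::vector<int>(cols, 0));
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            bool hit = false;
            for (int dy = -r; dy <= r && !hit; ++dy) {
                for (int dx = -r; dx <= r && !hit; ++dx) {
                    const int yy = y + dy;
                    const int xx = x + dx;
                    // Positions outside the image are skipped.
                    if (yy >= 0 && yy < rows && xx >= 0 && xx < cols && in[yy][xx] != 0) {
                        hit = true;
                    }
                }
            }
            out[y][x] = hit ? 1 : 0;
        }
    }
    return out;
}
```

For the single row {1, 0, 0, 1, 0, 0, 0} and ksize = 3, the result is {1, 1, 1, 1, 1, 0, 0}: the two foreground pixels grow toward each other and the gap closes. The 11 x 11 kernel in the article has the same effect on the gaps between characters.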

The dilated image looks like this:

### Finding contours

```cpp
vector<vector<Point>> Contours;
// Point sets of the contours
vector<Vec4i> hierarchy;
// Hierarchy relationships between the contours
findContours(src_dilate, Contours, hierarchy, RETR_TREE, CHAIN_APPROX_NONE, Point());
/** @brief Finds contours in a binary image.

The function retrieves contours from the binary image using the algorithm @cite Suzuki85 . The contours are a useful tool for shape analysis and object detection and recognition. See squares.cpp in the OpenCV sample directory.
@note Since opencv 3.2 source image is not modified by this function.

@param image Source, an 8-bit single-channel image. Non-zero pixels are treated as 1's. Zero pixels remain 0's, so the image is treated as binary . You can use #compare, #inRange, #threshold ,#adaptiveThreshold, #Canny, and others to create a binary image out of a grayscale or color one. If mode equals to #RETR_CCOMP or #RETR_FLOODFILL, the input can also be a 32-bit integer image of labels (CV_32SC1).
// Input image.
@param contours Detected contours. Each contour is stored as a vector of points (e.g.
std::vector<std::vector<cv::Point> >).
// Output: the point set of each contour.
@param hierarchy Optional output vector (e.g. std::vector<cv::Vec4i>), containing information about the image topology. It has as many elements as the number of contours. For each i-th contour contours[i], the elements hierarchy[i][0] , hierarchy[i][1] , hierarchy[i][2] , and hierarchy[i][3] are set to 0-based indices in contours of the next and previous contours at the same hierarchical level, the first child contour and the parent contour, respectively. If for the contour i there are no next, previous, parent, or nested contours, the corresponding elements of hierarchy[i] will be negative.
// Hierarchy between contours (next contour, previous contour, first child contour, parent contour).
@param mode Contour retrieval mode, see #RetrievalModes
// Retrieval modes:
// RETR_EXTERNAL retrieves only the outer contours.
// RETR_LIST retrieves all contours without establishing a hierarchy.
// RETR_CCOMP retrieves all contours in a two-level hierarchy (outer boundaries and holes).
// RETR_TREE retrieves all contours and builds the full hierarchy between them.

@param method Contour approximation method, see #ContourApproximationModes
// Approximation method for the contour shape:
// CHAIN_APPROX_NONE applies no approximation and stores every contour point.
// CHAIN_APPROX_SIMPLE compresses horizontal, vertical, and diagonal segments and stores only their end points.
// Other approximation algorithms are also available.
@param offset Optional offset by which every contour point is shifted. This is useful if the contours are extracted from the image ROI and then they should be analyzed in the whole image context.
// Offset applied to every contour point.
 */
CV_EXPORTS_W void findContours( InputArray image, OutputArrayOfArrays contours,
                              OutputArray hierarchy, int mode,
                              int method, Point offset = Point());
```
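The four-element hierarchy entries described above can be explored with a small sketch. It is written in plain C++ so that it stands alone: `HierarchyEntry` is a hypothetical stand-in for Vec4i with the same {next, previous, first child, parent} layout, and `contourDepths` is a hypothetical helper that follows parent links to find how deeply each contour is nested. It assumes a well-formed hierarchy without cycles.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Stand-in for Vec4i: {next, previous, first child, parent}; a negative index means "none".
using HierarchyEntry = std::array<int, 4>;

// Nesting depth of every contour: 0 for top-level contours, 1 for their children, and so on.
std::vector<int> contourDepths(const std::vector<HierarchyEntry>& hierarchy) {
    std::vector<int> depths(hierarchy.size(), 0);
    for (std::size_t i = 0; i < hierarchy.size(); ++i) {
        int parent = hierarchy[i][3];
        while (parent >= 0) {  // climb until a contour without a parent is reached
            ++depths[i];
            parent = hierarchy[static_cast<std::size_t>(parent)][3];
        }
    }
    return depths;
}
```

With RETR_TREE, contours at depth 0 are the outermost ones. Filtering on depth is one possible way to ignore holes inside the dilated text region before choosing the largest rectangle.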

### Minimum-area bounding rectangle

Rotated rectangle object type: RotatedRect

Minimum-area bounding rectangle of a contour: minAreaRect

```cpp
/** @brief Finds a rotated rectangle of the minimum area enclosing the input 2D point set.

The function calculates and returns the minimum-area bounding rectangle (possibly rotated) for a specified point set. Developer should keep in mind that the returned RotatedRect can contain negative indices when data is close to the containing Mat element boundary.
// The function takes the point set of a contour and returns its minimum-area bounding rectangle.
// When the data is close to the image boundary, the returned RotatedRect may contain negative indices.

@param points Input vector of 2D points, stored in std::vector\<\> or Mat
 */
CV_EXPORTS_W RotatedRect minAreaRect( InputArray points );
```
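One practical caveat: the angle stored in a RotatedRect does not by itself say which side of the rectangle it refers to, and the range of angles returned by minAreaRect is reported to differ between OpenCV versions (roughly [-90, 0) in older releases and (0, 90] in newer ones; this is an assumption worth checking against the version in use). Angles that differ by 90 degrees describe the same rectangle with width and height swapped. A minimal sketch of one way to handle this, assuming the text is tilted by less than 45 degrees, is to fold the angle into (-45, 45] before rotating. `normalizeSkewAngle` is a hypothetical helper, not an OpenCV function.

```cpp
#include <cmath>

// Hypothetical helper (not part of OpenCV): fold a rectangle angle in degrees
// into the range (-45, 45], on the assumption that the true skew is small.
double normalizeSkewAngle(double angleDegrees) {
    double a = angleDegrees;
    if (!std::isfinite(a)) {
        return a;  // leave NaN or infinity untouched rather than loop forever
    }
    while (a > 45.0) {
        a -= 90.0;  // e.g. 87 becomes -3
    }
    while (a <= -45.0) {
        a += 90.0;  // e.g. -87 becomes 3
    }
    return a;
}
```

In the article's pipeline, the value of RRect.angle could be passed through such a helper before it is handed to getRotationMatrix2D.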

### Computing the rotation matrix

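```cpp
Mat Rotation = getRotationMatrix2D(center,RRect_degree,1.0);
Mat output;
warpAffine(src,output,Rotation,src.size(),INTER_CUBIC,BORDER_CONSTANT,Scalar(255,255,255));
// Arguments: input image, output image, rotation matrix, output size, interpolation method,
// border mode, and border fill value. Pay attention to the border settings: a white fill
// is used here for the areas exposed by the rotation.
```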
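The 2 x 3 matrix produced by getRotationMatrix2D can be written out by hand. The sketch below follows the formula given in the OpenCV documentation, with alpha = scale * cos(angle) and beta = scale * sin(angle), but it is plain C++ for illustration: `Affine2x3`, `rotationMatrix2D`, and `applyAffine` are hypothetical names, not OpenCV functions.

```cpp
#include <array>
#include <cmath>

using Affine2x3 = std::array<std::array<double, 3>, 2>;

// Rotation by angleDegrees about (cx, cy) with uniform scale:
//   [  alpha  beta  (1 - alpha) * cx - beta * cy ]
//   [ -beta   alpha  beta * cx + (1 - alpha) * cy ]
Affine2x3 rotationMatrix2D(double cx, double cy, double angleDegrees, double scale) {
    const double pi = std::acos(-1.0);
    const double rad = angleDegrees * pi / 180.0;
    const double alpha = scale * std::cos(rad);
    const double beta = scale * std::sin(rad);
    Affine2x3 m{};
    m[0] = {alpha, beta, (1.0 - alpha) * cx - beta * cy};
    m[1] = {-beta, alpha, beta * cx + (1.0 - alpha) * cy};
    return m;
}

// Apply the affine map to a single point (x, y).
std::array<double, 2> applyAffine(const Affine2x3& m, double x, double y) {
    return {m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2]};
}
```

The translation column is what keeps the rotation centre fixed: applying the matrix to (cx, cy) returns (cx, cy) for any angle. With the image origin at the top-left corner, a positive angle corresponds to a counter-clockwise rotation, so a 90 degree rotation about the origin sends (1, 0) to (0, -1).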

## Complete code

```cpp
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/calib3d.hpp>

using namespace std;
using namespace cv;

// Images and window names shared with the trackbar callback
Mat src, gray_src, dst;
const char *IWindow = "InputWindow";
const char *OWindow = "OutputWindow";
int canny_threshold = 100;  // initial trackbar position, used as the lower Canny threshold
int threshold_level = 255;  // maximum trackbar position

void Canny_function(int, void *);

int main(int argc, char **argv) {
    src = imread("D:/Delete/imageTextR.png");
    if (src.empty()) {
        cout << "Could not load image" << endl;
        return -1;
    }
    // Convert to grayscale
    cvtColor(src, gray_src, COLOR_BGR2GRAY);
    namedWindow(IWindow, WINDOW_AUTOSIZE);
    imshow(IWindow, src);
    namedWindow(OWindow, WINDOW_AUTOSIZE);
    createTrackbar("Canny_Threshold", IWindow, &canny_threshold, threshold_level, Canny_function);
    // Run the pipeline once so that a result is shown before the slider is moved
    Canny_function(canny_threshold, nullptr);

    waitKey();
    return 0;
}

void Canny_function(int, void *) {
    // Edge detection with a 1:2 ratio between the two thresholds
    Mat Canny_edge;
    Canny(gray_src, Canny_edge, canny_threshold, canny_threshold * 2, 3, false);
    // Display Canny_edge
    // imshow(OWindow, Canny_edge);

    // Dilate so that the characters merge into one connected region
    Mat src_dilate;
    Mat kernel = getStructuringElement(MORPH_RECT, Size(11, 11), Point());
    dilate(Canny_edge, src_dilate, kernel, Point(-1, -1), 1, BORDER_DEFAULT);
    // Display the dilated image
    imshow(OWindow, src_dilate);

    // Find contours
    vector<vector<Point>> Contours;
    vector<Vec4i> hierarchy;
    findContours(src_dilate, Contours, hierarchy, RETR_TREE, CHAIN_APPROX_NONE, Point());

    // First pass: find the largest minimum-area rectangle
    double MaxAreaRRect = 0;
    int SizeContour = 0;
    for (size_t t = 0; t < Contours.size(); t++) {
        RotatedRect RRect = minAreaRect(Contours[t]);
        double AreaRRect = RRect.size.area();
        MaxAreaRRect = max(MaxAreaRRect, AreaRRect);
    }

    // Second pass: take the angle of the largest rectangle and draw it
    double RRect_degree = 0;
    dst = src.clone(); // clone() makes a deep copy, so drawing on dst does not modify src
    for (size_t t = 0; t < Contours.size(); t++) {
        RotatedRect RRect = minAreaRect(Contours[t]);
        double AreaRRect = RRect.size.area();
        if (AreaRRect == MaxAreaRRect) {
            SizeContour = SizeContour + 1;
            // Rotation angle in degrees
            RRect_degree = RRect.angle;
            // Draw this rectangle
            Point2f vertex[4];
            RRect.points(vertex);
            for (int i = 0; i < 4; i++) {
                line(dst, Point(vertex[i]), Point(vertex[(i + 1) % 4]), Scalar(0, 255, 0), 2, LINE_8);
            }
        }
    }
    cout << "SizeContour : \t " << SizeContour << endl;
    cout << "Rotated Rectangle angle : " << RRect_degree << endl;

    // imshow(OWindow, dst);

    // Rotate about the image centre (floating-point division avoids truncation)
    Point2f center(src.cols / 2.0f, src.rows / 2.0f);
    Mat Rotation = getRotationMatrix2D(center, RRect_degree, 1.0);
    Mat output;
    warpAffine(src, output, Rotation, src.size(), INTER_CUBIC, BORDER_CONSTANT, Scalar(255, 255, 255));

    // Display the corrected image
    namedWindow("Final_Result", WINDOW_AUTOSIZE);
    imshow("Final_Result", output);
}
```

That is all for this article. I hope it helps with your learning, and thank you for your support.
