Hello! This article briefly shows how to use OpenCV, a powerful open-source third-party library, to segment an image into its individual characters.
The libraries to be installed are:
pip install opencv-python
pip install matplotlib
Official OpenCV-Python tutorials (documentation for the Python interface): https://docs.opencv.org/4.5.2/d6/d00/tutorial_py_root.html
Image segmentation
Picture material used in this article (loaded as ./testImage.png in the code below):
First, import the library used:
import cv2
import os
import shutil
from matplotlib import pyplot as plt
1. load pictures
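Note: the image path passed in here must not contain Chinese characters. With such a path cv2.imread typically fails to open the file (most notably on Windows) and returns None, so the following cv2.imshow call reports an error.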
# 1. Load the picture
filepath = './testImage.png'  # image path; it must not contain Chinese characters
img = cv2.imread(filepath)
cv2.imshow('Original img', img)  # display the picture
cv2.waitKey(0)  # keep the window from flashing closed: wait for a key press (0 = wait indefinitely)
2. grayscale the picture
# 2. Convert the color picture to grayscale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('img_gray', img_gray)
cv2.waitKey(0)
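For intuition, the grayscale conversion is a weighted sum of the three channels. The OpenCV documentation gives the weights as Y = 0.299 R + 0.587 G + 0.114 B; the sketch below applies them to a single pixel in plain Python. bgr_to_gray is a hypothetical helper for illustration only, and OpenCV's internal fixed-point rounding may differ by one gray level for some inputs.

```python
def bgr_to_gray(b, g, r):
    """Gray value of one pixel, using the weights documented for COLOR_BGR2GRAY."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# Note the argument order: OpenCV stores pixels as (blue, green, red).
white = bgr_to_gray(255, 255, 255)
pure_blue = bgr_to_gray(255, 0, 0)
pure_red = bgr_to_gray(0, 0, 255)
print(white, pure_blue, pure_red)
```

Blue contributes least to the gray value and green most, which is why a blue character on a white background still ends up clearly darker than the background.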
3. binary image processing
thresh=220 is a manually chosen threshold (estimated by inspecting the pixel values printed by print(img_gray)). With cv2.THRESH_BINARY_INV, pixels whose value is greater than 220 are set to 0, and all other pixels (220 or less) are set to 255.
maxval: the value assigned to the pixels that are not zeroed when THRESH_BINARY or THRESH_BINARY_INV is used. It can be understood as the fill color; for an 8-bit image the range is 0 to 255.
type: the threshold type. cv2.THRESH_BINARY sets pixels greater than the threshold to maxval (255 here) and the rest to 0 (black-and-white binarization). cv2.THRESH_BINARY_INV does the opposite: pixels greater than the threshold become 0 and the rest become maxval (inverted binarization, white and black swapped). Other types exist as well.
# 3. Binarize the picture
# thresh=220: manually chosen threshold (estimated from print(img_gray))
# maxval=255: value given to the pixels that are not zeroed (the fill color, 0~255)
# type=cv2.THRESH_BINARY_INV: pixels > thresh become 0, all other pixels become maxval
#   (cv2.THRESH_BINARY is the opposite: pixels > thresh become maxval, the rest become 0)
ret, img_inv = cv2.threshold(src=img_gray, thresh=220, maxval=255, type=cv2.THRESH_BINARY_INV)
cv2.imshow('img_inv', img_inv)
cv2.waitKey(0)
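The two threshold types differ only in which side of the comparison receives maxval. A minimal plain-Python sketch of the per-pixel rule follows; threshold_pixel is a hypothetical name used for illustration and is not part of OpenCV.

```python
def threshold_pixel(value, thresh, maxval, inverted=False):
    """Apply the THRESH_BINARY (or, if inverted, THRESH_BINARY_INV) rule to one pixel."""
    above = value > thresh  # strictly greater: a pixel equal to thresh is not "above"
    if inverted:
        return 0 if above else maxval
    return maxval if above else 0


row = [100, 220, 221, 255]
binary = [threshold_pixel(v, 220, 255) for v in row]
binary_inv = [threshold_pixel(v, 220, 255, inverted=True) for v in row]
print(binary)
print(binary_inv)
```

The pixel with value 220 shows the boundary case: it is not greater than the threshold, so the inverted rule turns it white (255).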
3.1 Custom threshold
# Threshold comparison: global threshold (v = 127), adaptive mean threshold, adaptive Gaussian threshold
def threshContrast():
    filepath = './testImage.png'
    img = cv2.imread(filepath)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray = cv2.medianBlur(img_gray, 5)
    ret1, th1 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
    th2 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
    th3 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    title = ['original image (Grayscale)', 'Global threshold (v = 127)', 'Adaptive average threshold', 'Adaptive Gaussian threshold']
    images = [img_gray, th1, th2, th3]
    for i in range(4):
        plt.subplot(2, 2, i + 1), plt.imshow(images[i], 'gray')
        # plt.title(title[i])  # disabled: Chinese text cannot be used when drawing with plt
        plt.xticks([]), plt.yticks([])
    plt.show()
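In the adaptive calls above, 11 is the block size and 2 is the constant C: each pixel is compared against the mean of its 11 x 11 neighborhood minus C instead of against one global value. The sketch below re-implements the mean variant in plain Python on a tiny image to make the rule visible. adaptive_mean_threshold is a hypothetical name, borders are handled by repeating the edge pixels, and OpenCV's exact rounding is ignored, so treat it as an illustration rather than a drop-in replacement.

```python
def adaptive_mean_threshold(img, block_size, c, maxval=255):
    """img: list of rows of gray values; block_size: odd neighborhood size."""
    height, width = len(img), len(img[0])
    radius = block_size // 2
    out = []
    for y in range(height):
        out_row = []
        for x in range(width):
            total = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), height - 1)  # repeat edge pixels at the border
                    xx = min(max(x + dx, 0), width - 1)
                    total += img[yy][xx]
            local_thresh = total / (block_size * block_size) - c
            out_row.append(maxval if img[y][x] > local_thresh else 0)
        out.append(out_row)
    return out


tiny = [
    [10, 10, 10, 10, 10],
    [10, 10, 200, 10, 10],
    [10, 10, 10, 10, 10],
]
result = adaptive_mean_threshold(tiny, block_size=3, c=2)
for line in result:
    print(line)
```

Pixels next to the bright spot fall below their raised local mean and turn black, while flat regions stay white because every pixel there exceeds its local mean minus C.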
4. extract contour
img_inv is the image in which the contours are searched;
cv2.RETR_EXTERNAL: retrieve only the outermost (extreme external) contours;
cv2.CHAIN_APPROX_SIMPLE: compress horizontal, vertical and diagonal segments and keep only their end points. For example, an upright rectangular contour is encoded with 4 points.
# 4. Extract contours
# https://docs.opencv.org/4.5.2/d4/d73/tutorial_py_contours_begin.html
# img_inv: the image in which the contours are searched
# cv2.RETR_EXTERNAL: retrieve only the outermost contours
# cv2.CHAIN_APPROX_SIMPLE: compress horizontal, vertical and diagonal segments and keep only
#   their end points, e.g. an upright rectangular contour is encoded with 4 points
contours, hierarchy = cv2.findContours(img_inv, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f'Number of detected contours: {len(contours)}')
print('Contour hierarchy (indices relating the contours to each other):\n', hierarchy)
5. draw a rectangle for the outline
# 5. Find the bounding rectangle of each contour
br = []
cntid = 0
for cnt in contours:
    # cnt is the input contour; x and y are the coordinates of the top-left corner of the
    # bounding rectangle, w is its width and h is its height
    x, y, w, h = cv2.boundingRect(cnt)
    cntid += 1
    print(f'Bounding rectangle of contour No. {cntid}: x={x}, y={y}, w={w}, h={h}')
    br.append((x, y, w, h))
    # img: the picture to draw on (here the original picture); [cnt]: the contour to draw;
    # -1: draw every contour in the list; (0, 0, 255): red (OpenCV uses BGR order); 2: line thickness
    cv2.drawContours(img, [cnt], -1, (0, 0, 255), 2)
    cv2.imshow('cnt', img)
    cv2.waitKey(0)
br.sort()  # sort the tuples in ascending order; x comes first, so the boxes end up ordered left to right
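The process of drawing the outline of each character is shown in the following figure. The contours are not returned in left-to-right order: here they are drawn from right to left, and the order may also jump around, which is why the bounding rectangles are sorted by x coordinate afterwards.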
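Sorting a list of (x, y, w, h) tuples compares x first and falls back to y, w and h only on ties. The sketch below, using made-up rectangle values, contrasts that with sorting on x alone, which keeps tied boxes in their original order.

```python
# Hypothetical bounding rectangles (x, y, w, h), in the order contours might be found.
boxes = [(120, 8, 30, 40), (15, 10, 28, 38), (60, 9, 31, 41), (15, 2, 5, 5)]

lexicographic = sorted(boxes)                      # what list.sort() does by default
by_x_only = sorted(boxes, key=lambda box: box[0])  # explicit: compare x and nothing else

print(lexicographic)
print(by_x_only)
```

For one line of well-separated characters both give the same left-to-right order; the explicit key mainly documents the intent.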
6. split the picture and save it
# 6. Split the picture and save it (the previously binarized picture img_inv is split here)
if not os.path.exists('./imageSplit'):
    os.mkdir('./imageSplit')
else:
    shutil.rmtree('./imageSplit')
    os.mkdir('./imageSplit')
for x, y, w, h in br:
    # print(x, y, w, h)
    # split_image = img_inv[y:y + h, x:x + w]
    # A 2-pixel margin looks better; max(..., 0) keeps the start index from going negative at the image edge
    split_image = img_inv[max(y - 2, 0):y + h + 2, max(x - 2, 0):x + w + 2]
    cv2.imshow('split_image', split_image)
    cv2.waitKey(0)
    save_filepath = './imageSplit/'
    filename = f'{x}.jpg'  # each picture is named after its x coordinate
    cv2.imwrite(save_filepath + filename, split_image)
    print(f'\033[31m{filename} Image segmentation completed!\033[0m')
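Here is the character-by-character splitting process applied to the previously binarized picture (img_inv).
The line of code below does the cropping. The image is indexed as [rows, columns], so y:y + h selects the rows from the top edge of the bounding rectangle to its bottom edge, and x:x + w selects the columns from its left edge to its right edge. The following figure, which illustrates this, is drawn by hand. It's too ugly, hahaha!!!
# split_image = img_inv[y:y + h, x:x + w]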
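When a margin is added around the bounding rectangle, a start index below zero is a pitfall: in both Python lists and NumPy arrays a negative start counts from the end, so the crop silently comes from the wrong place. A minimal sketch of clamping the bounds follows; padded_slice_bounds is a hypothetical helper, and a nested list stands in for the image so the example runs without OpenCV.

```python
def padded_slice_bounds(x, y, w, h, pad, width, height):
    """Return (top, bottom, left, right) for a padded crop, clamped to the image."""
    top = max(y - pad, 0)
    bottom = min(y + h + pad, height)
    left = max(x - pad, 0)
    right = min(x + w + pad, width)
    return top, bottom, left, right


# A 4-row, 6-column stand-in image whose values encode (row, column).
grid = [[10 * r + c for c in range(6)] for r in range(4)]

# A 2 x 2 rectangle in the top-left corner, padded by 2 pixels.
bounds = padded_slice_bounds(0, 0, 2, 2, pad=2, width=6, height=4)
top, bottom, left, right = bounds
crop = [row[left:right] for row in grid[top:bottom]]

# Without clamping, the row slice starts at -2 and returns the LAST two rows.
naive_rows = grid[0 - 2:0 + 2 + 2]
print(bounds, len(crop), len(crop[0]))
print(naive_rows == grid[2:4])
```

Clamping the upper bounds is optional, because slicing past the end is simply truncated, but doing both keeps the intent explicit.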
7. view split pictures
Finally, we view the split images with pyplot, and the job is done.
# 7. Use pyplot to view the split images
imagefile_list = os.listdir('./imageSplit')
imagefile_list.sort(key=lambda x: int(x[:-4]))  # sort numerically by the x coordinate in the file name
for i in range(len(imagefile_list)):
    img = cv2.imread(f'./imageSplit/{imagefile_list[i]}')
    plt.subplot(1, len(imagefile_list), i + 1), plt.imshow(img, 'gray')
    plt.title(imagefile_list[i])
    plt.xticks([]), plt.yticks([])
plt.show()
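The numeric sort key matters because file names are strings and strings compare character by character. The sketch below shows the difference on made-up file names; using os.path.splitext instead of the [:-4] slice is an optional variation that does not depend on the extension being exactly four characters long.

```python
import os

names = ['100.jpg', '20.jpg', '3.jpg']  # hypothetical output files named after x coordinates

by_text = sorted(names)  # string order: '1' < '2' < '3'
by_number = sorted(names, key=lambda name: int(os.path.splitext(name)[0]))

print(by_text)
print(by_number)
```

Only the numeric order matches the left-to-right position of the characters in the original picture.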
8. complete code
import cv2
import os
import shutil
from matplotlib import pyplot as plt

# OpenCV documentation: https://docs.opencv.org/4.5.2/index.html
# OpenCV-Python tutorials: https://docs.opencv.org/4.5.2/d6/d00/tutorial_py_root.html


def imageSplit():
    # 1. Load the picture
    filepath = './testImage.png'  # image path; it must not contain Chinese characters
    img = cv2.imread(filepath)
    cv2.imshow('Original img', img)  # display the picture
    cv2.waitKey(0)  # keep the window from flashing closed: wait for a key press (0 = wait indefinitely)

    # 2. Convert the color picture to grayscale
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imshow('img_gray', img_gray)
    cv2.waitKey(0)

    # 3. Binarize the picture
    # thresh=220: manually chosen threshold (estimated from print(img_gray))
    # maxval=255: value given to the pixels that are not zeroed (the fill color, 0~255)
    # type=cv2.THRESH_BINARY_INV: pixels > thresh become 0, all other pixels become maxval
    #   (cv2.THRESH_BINARY is the opposite: pixels > thresh become maxval, the rest become 0)
    ret, img_inv = cv2.threshold(src=img_gray, thresh=220, maxval=255, type=cv2.THRESH_BINARY_INV)
    cv2.imshow('img_inv', img_inv)
    cv2.waitKey(0)

    # 4. Extract contours
    # https://docs.opencv.org/4.5.2/d4/d73/tutorial_py_contours_begin.html
    # img_inv: the image in which the contours are searched
    # cv2.RETR_EXTERNAL: retrieve only the outermost contours
    # cv2.CHAIN_APPROX_SIMPLE: compress horizontal, vertical and diagonal segments and keep only
    #   their end points, e.g. an upright rectangular contour is encoded with 4 points
    contours, hierarchy = cv2.findContours(img_inv, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    print(f'Number of detected contours: {len(contours)}')
    print('Contour hierarchy (indices relating the contours to each other):\n', hierarchy)

    # 5. Find the bounding rectangle of each contour
    br = []
    cntid = 0
    for cnt in contours:
        # x and y are the coordinates of the top-left corner of the bounding rectangle,
        # w is its width and h is its height
        x, y, w, h = cv2.boundingRect(cnt)
        cntid += 1
        print(f'Bounding rectangle of contour No. {cntid}: x={x}, y={y}, w={w}, h={h}')
        br.append((x, y, w, h))
        # img: the picture to draw on (here the original picture); [cnt]: the contour to draw;
        # -1: draw every contour in the list; (0, 0, 255): red (OpenCV uses BGR order); 2: line thickness
        cv2.drawContours(img, [cnt], -1, (0, 0, 255), 2)
        cv2.imshow('cnt', img)
        cv2.waitKey(0)
    br.sort()  # sort the tuples in ascending order; x comes first, so the boxes end up ordered left to right

    # 6. Split the picture and save it (the previously binarized picture img_inv is split here)
    if not os.path.exists('./imageSplit'):
        os.mkdir('./imageSplit')
    else:
        shutil.rmtree('./imageSplit')
        os.mkdir('./imageSplit')
    for x, y, w, h in br:
        # print(x, y, w, h)
        # split_image = img_inv[y:y + h, x:x + w]
        # A 2-pixel margin looks better; max(..., 0) keeps the start index from going negative at the image edge
        split_image = img_inv[max(y - 2, 0):y + h + 2, max(x - 2, 0):x + w + 2]
        cv2.imshow('split_image', split_image)
        cv2.waitKey(0)
        save_filepath = './imageSplit/'
        filename = f'{x}.jpg'  # each picture is named after its x coordinate
        cv2.imwrite(save_filepath + filename, split_image)
        print(f'\033[31m{filename} Image segmentation completed!\033[0m')
    cv2.destroyAllWindows()  # close all windows

    # 7. Use pyplot to view the split images
    imagefile_list = os.listdir('./imageSplit')
    imagefile_list.sort(key=lambda x: int(x[:-4]))  # sort numerically by the x coordinate in the file name
    for i in range(len(imagefile_list)):
        img = cv2.imread(f'./imageSplit/{imagefile_list[i]}')
        plt.subplot(1, len(imagefile_list), i + 1), plt.imshow(img, 'gray')
        plt.title(imagefile_list[i])
        plt.xticks([]), plt.yticks([])
    plt.show()
    print('\nperfect!!!')


# Threshold comparison: global threshold (v = 127), adaptive mean threshold, adaptive Gaussian threshold
def threshContrast():
    filepath = './testImage.png'
    img = cv2.imread(filepath)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray = cv2.medianBlur(img_gray, 5)
    ret1, th1 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
    th2 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
    th3 = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    title = ['original image (Grayscale)', 'Global threshold (v = 127)', 'Adaptive average threshold', 'Adaptive Gaussian threshold']
    images = [img_gray, th1, th2, th3]
    for i in range(4):
        plt.subplot(2, 2, i + 1), plt.imshow(images[i], 'gray')
        # plt.title(title[i])  # disabled: Chinese text cannot be used when drawing with plt
        plt.xticks([]), plt.yticks([])
    plt.show()


if __name__ == '__main__':
    imageSplit()
    # Threshold comparison
    # threshContrast()