OpenCV-Python notes: GUI and core operations

These notes were transcribed from my old paper notebooks into electronic form for future reference. I am not comfortable typesetting formulas, so I will mostly leave them out; the notes mainly explain things and remind me of pitfalls to watch for.

Python version: python3.7
OpenCV version: 4.3
IDE: PyCharm

image reading

import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

# img=cv.imread("gh.jpg",1)
# # cv.namedWindow("gh",cv.WINDOW_NORMAL)
# cv.imshow("gh.jpg",img)
# k=cv.waitKey(0)
# # cv.destroyAllWindows()
# if k==27:
#     cv.destroyAllWindows()
# else:
#     cv.imwrite("gh_gray.png", img)

# img=cv.imread("gh.jpg",1)
# print(img.shape)
# img=cv.cvtColor(img,cv.COLOR_BGR2RGB)
# plt.imshow(img,cmap="gray",interpolation="bicubic")
# plt.xticks([]),plt.yticks([])
# plt.show()
plt.imshow(img,cmap="gray",interpolation="bicubic")
matplotlib expects the image color space to be **RGB**

img=cv.imread("gh.jpg",1)
cv.imshow("gh.jpg",img)
OpenCV reads and displays the image color space as **BGR**

[Figures: the image as displayed by matplotlib and by OpenCV]

Convert between the two with OpenCV:

img=cv.cvtColor(img,cv.COLOR_BGR2RGB)

Change video frame size and video encoding

cap=cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH,320)
cap.set(cv.CAP_PROP_FRAME_HEIGHT,240)

When displaying frames, pass an appropriate delay to cv.waitKey(). When playing a video file, a delay that is too small makes playback very fast, and a delay that is too large makes it slow (this is how slow motion can be shown). 25 ms is usually fine.
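
A minimal sketch of how the delay could be derived from a frame rate instead of being hard-coded. The helper name frame_delay_ms and the 25 ms fallback are my own choices for illustration, not part of OpenCV.

```python
def frame_delay_ms(fps, default_ms=25):
    """Return a cv.waitKey() delay in milliseconds for the given frame rate."""
    if fps is None or fps <= 0:  # some sources report 0 when the rate is unknown
        return default_ms
    # Never return 0, because cv.waitKey(0) waits for a key press indefinitely.
    return max(1, int(round(1000.0 / fps)))

print(frame_delay_ms(25))
print(frame_delay_ms(0))
```

For a video file the rate can usually be read with cap.get(cv.CAP_PROP_FPS) and passed to this helper; halving the returned rate would give roughly half-speed playback.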

Save the video:

cap=cv.VideoCapture(0)
fourcc=cv.VideoWriter_fourcc(*'DIVX')
out=cv.VideoWriter("output.avi",fourcc,20.0,(640,480))

FourCC is a 4-byte code that specifies the video codec. Which codes are available depends on the platform. The following codecs work fine for me.

In Fedora: DIVX, XVID, MJPG, X264, WMV1, WMV2. (XVID is preferable. MJPG produces large files. X264 produces very small files.)
In Windows: DIVX (to be tested and added)
In OSX: MJPG (.mp4), DIVX (.avi), X264 (.mkv).

The FourCC code for MJPG can be passed in either of two forms:

cv.VideoWriter_fourcc('M','J','P','G')
cv.VideoWriter_fourcc(*'MJPG')
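
To make the "4-byte code" concrete, here is a small pure-Python sketch that packs four characters into one integer, lowest byte first. As far as I know this matches what cv.VideoWriter_fourcc returns, but the helper below is my own illustration, not the library function.

```python
def fourcc(c1, c2, c3, c4):
    """Pack four characters into one 32-bit integer, lowest byte first."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)

code = fourcc(*"MJPG")  # unpacking the string gives the same four arguments
print(code, hex(code))
```

This also shows why the two call forms are equivalent: the star simply unpacks the string into four separate characters.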

video flip

frame=cv.flip(frame,1) # 0: flip up and down, 1: flip left and right, -1: both

draw

#canvas
img=np.zeros((512,512,3),np.uint8)
#line (canvas, start point, end point, color, line width)
cv.line(img,(0,0),(511,511),(255,100,0),2)
#rectangle (canvas, top-left corner, bottom-right corner, color, line width)
cv.rectangle(img,(166,166),(346,346),(0,255,0),3)
#circle (canvas, center, radius, color, line width); a line width of -1 fills the circle
cv.circle(img,(256,256),90,(0,0,255),4)
#ellipse (canvas, center, (half major axis, half minor axis), angle between major axis and horizontal, start angle, end angle, color, line width); a line width of -1 fills the ellipse
cv.ellipse(img,(256,256),(100,50),0,0,270,255,3)

#polygon
pts=np.array([[0,0],[256,256],[512,0]],np.int32)#the vertices as (x, y) pairs
pts=pts.reshape(-1,1,2)#reshape to (number of points, 1, 2)
cv.polylines(img,[pts],True,(0,255,255))#(canvas, list of point arrays, closed or not, color)

#text
font=cv.FONT_HERSHEY_SIMPLEX#choose the font: SIMPLEX is a normal-size sans-serif font
#(canvas, text, bottom-left corner of the text, font, font scale, color, line width, line type)
cv.putText(img,"OpenCV",(170,265),font,1.5,(255,255,255),2,cv.LINE_AA)
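
The reshape in the polygon example is the part I tend to forget, so here is a NumPy-only sketch of what it does to the point array. The -1 lets NumPy work out the number of points.

```python
import numpy as np

pts = np.array([[0, 0], [256, 256], [512, 0]], np.int32)  # three (x, y) vertices
shape_before = pts.shape

pts = pts.reshape(-1, 1, 2)  # -1 is inferred as the number of points
shape_after = pts.shape

print(shape_before, shape_after)
print(pts[1, 0])  # the second vertex is still intact
```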

Drawing with the mouse

drawing=False#True while the left mouse button is held down
mode=False#mode selection: True draws rectangles, False draws circles
ix,iy=-1,-1#starting coordinates
def draw_circle(event,x,y,flags,param):
    global ix,iy,drawing,mode #module-level state shared between calls
    if event==cv.EVENT_LBUTTONDOWN:#left mouse button pressed
        drawing=True#start drawing
        ix,iy=x,y#remember the starting point
    elif event==cv.EVENT_MOUSEMOVE:#mouse moved
        if drawing:
            if mode:
                cv.rectangle(img,(ix,iy),(x,y),(0,255,0),-1)#draw a filled rectangle
            else:
                cv.circle(img,(x,y),5,(0,0,255),-1)#draw a filled circle
    elif event==cv.EVENT_LBUTTONUP:#left mouse button released
        drawing=False#stop drawing
        if mode:
            cv.rectangle(img,(ix,iy),(x,y),(0,255,0),3)#final rectangle outline
        else:
            cv.circle(img,(x,y),5,(0,0,255),3)#final circle outline

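img=np.zeros((512,512,3),np.uint8)#create a canvas
cv.namedWindow("image")#create the window
cv.setMouseCallback("image",draw_circle)#register the mouse callback for this window
while True:
    cv.imshow("image",img)
    k=cv.waitKey(20)&0xFF
    if k==ord("m"):#press m to toggle between rectangle and circle mode
        mode=not mode
    elif k==27:#press Esc to exit
        break
cv.destroyAllWindows()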

Palette & slider

#The fifth argument of cv.createTrackbar is a callback that runs every time the trackbar value changes. The callback always receives one argument, the trackbar position, so a do-nothing function is defined here.
def nothing(x):
    pass
#Create a canvas
img=np.zeros((300,512,3),np.uint8)
#create the window
cv.namedWindow("image")
#Create a trackbar (trackbar name, window name, initial value, maximum value, callback function)
cv.createTrackbar("R","image",0,255,nothing)
cv.createTrackbar("G","image",0,255,nothing)
cv.createTrackbar("B","image",0,255,nothing)
#switch name
switch="0:OFF\n1:ON"
#Create a 0/1 switch
cv.createTrackbar(switch,"image",0,1,nothing)

while(1):
    cv.imshow("image",img)#show canvas
    k=cv.waitKey(1)&0xFF#wait
    if k==27:#exit switch
        break
        
    #Get the current position of the four trackbars
    r=cv.getTrackbarPos("R","image")
    g=cv.getTrackbarPos("G","image")
    b=cv.getTrackbarPos("B","image")
    s=cv.getTrackbarPos(switch,"image")
    #If the switch is 0, keep the canvas black; if it is 1, fill it with the selected color
    if s==0:
        img[:]=0
    else:
        img[:]=[b,g,r]#OpenCV stores the channels in BGR order
cv.destroyAllWindows()

Accessing and modifying individual pixels

px=img[100,100]

Access a single color channel of a pixel (index 0 is blue)

blue=img[100,100,0]

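NumPy is optimized for fast operations on whole arrays, so reading or modifying pixels one at a time through plain indexing is slow. For a single element, item and itemset are the better choice (itemset was removed in NumPy 2.0; with newer NumPy, assign through plain indexing instead).

#access
img.item(100,100,0)
#modify
img.itemset((100,100,0),100)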
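
A small sketch of single-pixel access on a synthetic canvas (made-up data), using only calls that should work regardless of the NumPy version: plain index assignment for writing and item() for reading.

```python
import numpy as np

img = np.zeros((200, 200, 3), np.uint8)  # hypothetical black BGR canvas

img[100, 100] = [255, 128, 0]   # write a whole pixel as (B, G, R)
img[100, 100, 2] = 100          # write a single channel with plain indexing
blue = img.item(100, 100, 0)    # item() returns a plain Python int

print(blue, type(blue).__name__, img[100, 100])
```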

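img.shape returns (rows, columns, channels) for a color image and only (rows, columns) for a grayscale image, so it is a quick way to check whether the image is in color

Total number of elements (rows x columns x channels): img.size

Image data type: img.dtype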
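
A minimal sketch of these three attributes on synthetic arrays. The helper is_color is a name I made up for the shape check described above.

```python
import numpy as np

color = np.zeros((480, 640, 3), np.uint8)  # hypothetical color image
gray = np.zeros((480, 640), np.uint8)      # hypothetical grayscale image

def is_color(image):
    """True when the array has a third axis holding at least 3 channels."""
    return image.ndim == 3 and image.shape[2] >= 3

print(color.shape, color.size, color.dtype, is_color(color))
print(gray.shape, gray.size, gray.dtype, is_color(gray))
```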

copy, paste pixels

#[row start:end (vertical), column start:end (horizontal)]
k=img[100:400,200:500]
#Paste it into the target region; an error is raised if the two regions differ in shape
img[600:900,700:1000]=k
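
The copy and paste above can be tried without an image file. This sketch uses a synthetic canvas (the sizes are my own choice, large enough for the slices used) and also triggers the shape mismatch error on purpose.

```python
import numpy as np

img = np.zeros((1000, 1200, 3), np.uint8)  # hypothetical canvas: 1000 rows, 1200 columns
img[100:400, 200:500] = (0, 255, 0)         # paint a green 300x300 block

patch = img[100:400, 200:500]               # rows 100..399, columns 200..499
img[600:900, 700:1000] = patch              # same 300x300 shape, so the paste works

mismatch = None
try:
    img[600:900, 700:900] = patch           # target is 300x200, so the shapes differ
except ValueError as err:
    mismatch = str(err)
    print("paste failed:", mismatch)
```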

Get all pixels of a single channel

b,g,r=cv.split(img)
#or
b=img[:,:,0]
##Note: cv.split is a costly operation in terms of time compared with NumPy indexing; use it only when necessary.

Zero all pixels of one channel (NumPy's broadcasting mechanism)

img[:,:,2]=0
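
A NumPy-only sketch of why indexing is cheap: the channel slice is a view onto the same memory, and assigning a scalar broadcasts it over every pixel of that channel. The array is synthetic.

```python
import numpy as np

img = np.full((4, 4, 3), 200, np.uint8)  # hypothetical small BGR image

red = img[:, :, 2]                        # a view: no pixel data is copied
is_view = np.shares_memory(red, img)

img[:, :, 2] = 0                          # the scalar 0 is broadcast over the red channel
print(is_view, red.max(), img[:, :, 0].max())
```

Because red is a view, it reflects the zeroing immediately, while the blue channel keeps its old values.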

set border

RED=[255,0,0]#interpreted as RGB by matplotlib below, so the border appears red
img=cv.imread("gh.jpg")
#(image, top width, bottom width, left width, right width, border type)
#repeat the edge pixels: aaaaaa|abcdefgh|hhhhhhh
replicate=cv.copyMakeBorder(img,100,100,100,100,cv.BORDER_REPLICATE)
#mirror at the edge: fedcba|abcdefgh|hgfedcb
reflect=cv.copyMakeBorder(img,100,100,100,100,cv.BORDER_REFLECT)
#mirror at the edge without repeating the edge pixel: gfedcb|abcdefgh|gfedcba
reflect101=cv.copyMakeBorder(img,100,100,100,100,cv.BORDER_REFLECT_101)
#wrap around to the opposite side: cdefgh|abcdefgh|abcdefg
wrap=cv.copyMakeBorder(img,100,100,100,100,cv.BORDER_WRAP)
#fill with a constant value
constant=cv.copyMakeBorder(img,100,100,100,100,cv.BORDER_CONSTANT,value=RED)
plt.subplot(231),plt.imshow(img,"gray"),plt.title("original")
plt.subplot(232),plt.imshow(replicate,"gray"),plt.title("replicate")
plt.subplot(233),plt.imshow(reflect,"gray"),plt.title("reflect")
plt.subplot(234),plt.imshow(reflect101,"gray"),plt.title("reflect101")
plt.subplot(235),plt.imshow(wrap,"gray"),plt.title("wrap")
plt.subplot(236),plt.imshow(constant,"gray"),plt.title("constant")
plt.show()

Result: a 2x3 grid with the original image and the five bordered versions.

image addition

Both images should have the same depth and type, or the second operand can be a scalar, which is added to every pixel.

OpenCV addition is a saturated operation: a result above the maximum value is clipped to the maximum
NumPy addition is a modulo operation: a result above the maximum value wraps around, keeping only the remainder

Prefer the OpenCV functions

image fusion

img1=cv.imread("gh.jpg")
img2=cv.imread("gy.jpg")
print(img1.shape)
#If the two images are not the same size, img2 must be resized to match img1; cv.resize takes (width, height)
img2=cv.resize(img2,(img1.shape[1],img1.shape[0]))
print(img2.shape)
#Blend (image 1, weight of image 1, image 2, weight of image 2, gamma): dst = img1*0.1 + img2*0.9 + gamma
dst=cv.addWeighted(img1,0.1,img2,0.9,0)
cv.imshow("dst",dst)
cv.waitKey(0)

bitwise operations

img1=cv.imread("gh.jpg")
img2=cv.imread("gy.jpg")
#Get the height, width and number of channels of the smaller image
rows,cols,channels=img2.shape
#Take the region of img1 where img2 will be pasted
roi=img1[0:rows,0:cols]
#Convert to grayscale
img2gray=cv.cvtColor(img2,cv.COLOR_BGR2GRAY)
#Threshold to get the mask: pixels brighter than 10 become 255
ret,mask=cv.threshold(img2gray,10,255,cv.THRESH_BINARY)
#Invert the mask
mask_inv=cv.bitwise_not(mask)
#Black out the area of img2 inside the ROI
img1_bg=cv.bitwise_and(roi,roi,mask=mask_inv)
#Keep only the non-black part of img2
img2_fg=cv.bitwise_and(img2,img2,mask=mask)
#Add the two parts, then put the result back into img1
dst=cv.add(img1_bg,img2_fg)
img1[0:rows,0:cols]=dst
cv.imshow("res",img1)
cv.waitKey(0)

Performance measurement and improvement techniques

Call cv.getTickCount() before and after the code being timed and divide the difference by cv.getTickFrequency() to get the elapsed seconds; time.time() works as well. For example:

import time
img1=cv.imread("gy.jpg")
e1=cv.getTickCount()
a=time.time()
for i in range(5,49,2):
    img1=cv.medianBlur(img1,i)
e2=cv.getTickCount()
b=time.time()
e=(e2-e1)/cv.getTickFrequency()
c=b-a
print(e,"seconds",c,"seconds")


Performance optimization techniques

There are several techniques and coding methods to get the most out of Python and Numpy. Only relevant information is noted here, and links to important sources of information are provided. The main thing to note here is that first try to implement the algorithm in a simple way. Once it's running, profile it, find bottlenecks and optimize them.

1. Try to avoid using loops in Python, especially double/triple loops, etc. They are inherently slow.
2. Since Numpy and OpenCV are optimized for vector operations, vectorize the algorithm/code to the maximum extent.
3. Take advantage of cache coherence.
4. Never create a copy of an array unless necessary. Try using views instead. Array copying is an expensive operation.
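
Points 1, 2 and 4 of the list above can be illustrated with NumPy alone. This is a sketch on a tiny synthetic image: the same inversion written as a triple loop and as one vectorized expression, followed by a view compared with a copy.

```python
import numpy as np

img = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)  # tiny synthetic image

# Slow: an explicit triple loop over rows, columns and channels
inverted_loop = np.empty_like(img)
for r in range(img.shape[0]):
    for c in range(img.shape[1]):
        for ch in range(img.shape[2]):
            inverted_loop[r, c, ch] = 255 - img[r, c, ch]

# Fast: one vectorized expression over the whole array
inverted_vec = 255 - img

# A slice is a view (no data copied); .copy() allocates a new array
top_view = img[:2]
top_copy = img[:2].copy()

print(np.array_equal(inverted_loop, inverted_vec))
print(np.shares_memory(top_view, img), np.shares_memory(top_copy, img))
```

On an image this small the difference is invisible, but on a full-size frame the loop version is typically orders of magnitude slower.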

Even after doing all this, if your code is still slow, or unavoidably needs to use large loops, use other libraries like Cython to make it faster.

Tags: Python OpenCV Computer Vision CV

Posted by plutarck on Thu, 30 Jun 2022 21:49:21 +0530