This part of my notes was transcribed from my old paper notebooks into an electronic document for future reference. Since I am not comfortable writing formulas, I will mostly skip them here and focus on explaining, and reminding myself of, the pitfalls that need attention.
Python version: 3.7
OpenCV version: 4.3
IDE: PyCharm
image reading
```python
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

# Read and display with OpenCV
img = cv.imread("gh.jpg", 1)                 # 1 = load as color
cv.namedWindow("gh.jpg", cv.WINDOW_NORMAL)   # resizable window (same name as used in imshow)
cv.imshow("gh.jpg", img)
k = cv.waitKey(0)
if k == 27:                                  # Esc: just close
    cv.destroyAllWindows()
else:                                        # any other key: save a copy, then close
    cv.imwrite("gh_gray.png", img)
    cv.destroyAllWindows()

# Read and display with matplotlib
img = cv.imread("gh.jpg", 1)
print(img.shape)
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)     # matplotlib expects RGB, OpenCV gives BGR
plt.imshow(img, cmap="gray", interpolation="bicubic")
plt.xticks([]), plt.yticks([])
plt.show()
```
Note the color order: OpenCV (`cv.imread` / `cv.imshow`) uses the **BGR** color space, while matplotlib's `plt.imshow` expects **RGB**. Displaying an OpenCV image with matplotlib without converting swaps the red and blue channels.
Convert with an OpenCV call:

```python
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)
```
Change the video frame size and video encoding
```python
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 240)
```
When displaying frames, pass an appropriate wait time to cv.waitKey(). If it is too small the video plays very fast; if it is too large the video plays slowly (which is also how a slow-motion effect is produced). Normally 25 ms is fine.
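For example, a minimal playback loop along these lines works (the filename "video.avi" is just a placeholder, not a file from these notes):

```python
cap = cv.VideoCapture("video.avi")          # "video.avi" is a placeholder filename
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:                             # end of file or read error
        break
    cv.imshow("frame", frame)
    if cv.waitKey(25) & 0xFF == ord('q'):   # ~25 ms per frame; press q to quit
        break
cap.release()
cv.destroyAllWindows()
```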
Save the video:
```python
cap = cv.VideoCapture(0)
fourcc = cv.VideoWriter_fourcc(*'DIVX')
out = cv.VideoWriter("output.avi", fourcc, 20.0, (640, 480))
```
FourCC is a 4-byte code used to specify the video codec. The available codes depend on the platform; the following codecs work for me:
In Fedora: DIVX, XVID, MJPG, X264, WMV1, WMV2. (It is better to use XVID. MJPG will produce large size video. X264 will produce very small size video)
In Windows: DIVX (to be tested and added)
In OSX: MJPG (.mp4), DIVX (.avi), X264 (.mkv).
Pass the FourCC code for MJPG as either of:

```python
cv.VideoWriter_fourcc('M', 'J', 'P', 'G')
# or
cv.VideoWriter_fourcc(*'MJPG')
```
video flip
```python
frame = cv.flip(frame, 1)   # 0: flip vertically (up-down), 1: flip horizontally (left-right)
```
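Putting the pieces together, a sketch of a complete capture-flip-save loop might look like this (camera index 0 and 640x480 frames assumed; adjust to your device):

```python
cap = cv.VideoCapture(0)
fourcc = cv.VideoWriter_fourcc(*'XVID')
out = cv.VideoWriter("output.avi", fourcc, 20.0, (640, 480))
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv.flip(frame, 1)              # mirror left-right before saving
    out.write(frame)                       # frame size must match the VideoWriter size
    cv.imshow("frame", frame)
    if cv.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
out.release()
cv.destroyAllWindows()
```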
draw
```python
# canvas
img = np.zeros((512, 512, 3), np.uint8)
# line(canvas, start point, end point, color, line width)
cv.line(img, (0, 0), (511, 511), (255, 100, 0), 2)
# rectangle(canvas, one corner, opposite corner, color, line width)
cv.rectangle(img, (166, 166), (346, 346), (0, 255, 0), 3)
# circle(canvas, center, radius, color, line width); a line width of -1 fills the circle
cv.circle(img, (256, 256), 90, (0, 0, 255), 4)
# ellipse(canvas, center, (semi-major axis, semi-minor axis), rotation angle,
#         start angle, end angle, color, line width); -1 fills the ellipse
cv.ellipse(img, (256, 256), (100, 50), 0, 0, 270, 255, 3)
# polygon
pts = np.array([[0, 0], [256, 256], [512, 0]], np.int32)   # point set
pts = pts.reshape(-1, 1, 2)                                # reshape to (number of points, 1, 2)
cv.polylines(img, [pts], True, (0, 255, 255))              # (canvas, points, closed, color)
# text
font = cv.FONT_HERSHEY_SIMPLEX   # normal-size sans-serif font
# putText(canvas, text, bottom-left corner of the text, font, scale, color, thickness, line type)
cv.putText(img, "OpenCV", (170, 265), font, 1.5, (255, 255, 255), 2, cv.LINE_AA)
```
draw track
```python
drawing = False   # True while the left button is held down
mode = False      # mode switch: True = rectangle, False = circle
ix, iy = -1, -1

def draw_circle(event, x, y, flags, param):
    global ix, iy, drawing, mode              # global state shared with the main loop
    if event == cv.EVENT_LBUTTONDOWN:         # left button pressed: start drawing
        drawing = True
        ix, iy = x, y                         # remember the starting coordinates
    elif event == cv.EVENT_MOUSEMOVE:         # mouse moved while drawing
        if drawing == True:
            if mode == True:
                cv.rectangle(img, (ix, iy), (x, y), (0, 255, 0), -1)   # filled rectangle
            else:
                cv.circle(img, (x, y), 5, (0, 0, 255), -1)             # filled circle
    elif event == cv.EVENT_LBUTTONUP:         # left button released: stop drawing
        drawing = False
        if mode == True:
            cv.rectangle(img, (ix, iy), (x, y), (0, 255, 0), 3)        # final rectangle outline
        else:
            cv.circle(img, (x, y), 5, (0, 0, 255), 3)                  # final circle outline

img = np.zeros((512, 512, 3), np.uint8)       # create a canvas
cv.namedWindow("image")                       # window name
cv.setMouseCallback("image", draw_circle)     # register the mouse callback
while (1):
    cv.imshow("image", img)
    if cv.waitKey(20) & 0xFF == 27:
        break
cv.destroyAllWindows()
```
Palette & slider
```python
# The fifth argument of cv.createTrackbar is a callback executed every time the
# trackbar value changes; it always receives the trackbar position as its argument.
def nothing(x):
    pass

# Create a canvas
img = np.zeros((300, 512, 3), np.uint8)
# Name the window
cv.namedWindow("image")
# Create sliders: (slider name, window name, default value, maximum value, callback)
cv.createTrackbar("R", "image", 0, 255, nothing)
cv.createTrackbar("G", "image", 0, 255, nothing)
cv.createTrackbar("B", "image", 0, 255, nothing)
# Switch name
switch = "0:OFF\n1:ON"
# Create a 0/1 switch
cv.createTrackbar(switch, "image", 0, 1, nothing)

while (1):
    cv.imshow("image", img)       # show the canvas
    k = cv.waitKey(1) & 0xFF      # wait for a key press
    if k == 27:                   # Esc exits
        break
    # Read the current positions of the four trackbars
    r = cv.getTrackbarPos("R", "image")
    g = cv.getTrackbarPos("G", "image")
    b = cv.getTrackbarPos("B", "image")
    s = cv.getTrackbarPos(switch, "image")
    # If the switch is 0 the canvas stays black; if it is 1, apply the chosen color
    if s == 0:
        img[:] = 0
    else:
        img[:] = [b, g, r]        # BGR order
cv.destroyAllWindows()
```
Access and modify individual pixels
px=img[100,100]
access a single color channel of a pixel
blue=img[100,100,0]
Accessing and modifying individual pixels through plain numpy indexing is relatively slow, so img.item() and img.itemset() are used for scalar access and modification instead:

```python
# access a single value
img.item(100, 100, 0)
# modify a single value
img.itemset((100, 100, 0), 100)
```
img.shape returns (rows, columns, channels) for a color image and only (rows, columns) for a grayscale one, so it is a quick way to check whether an image is in color.
Total number of array elements (pixels × channels): img.size
Image data type: img.dtype
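A quick check, assuming the same color image "gh.jpg" used above:

```python
img = cv.imread("gh.jpg")
print(img.shape)   # (rows, columns, channels); a grayscale image has no channel entry
print(img.size)    # total number of array elements (pixels x channels)
print(img.dtype)   # usually uint8
```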
copy, paste pixels
```python
# [row start:end, column start:end]
k = img[100:400, 200:500]
# Paste it into a target area; if the two areas differ in size, an error is raised
img[600:900, 700:1000] = k
```
Get all pixels of a single channel
```python
b, g, r = cv.split(img)
# or
b = img[:, :, 0]
# Note: cv.split is more expensive than numpy indexing; avoid it unless you really need all channels.
```
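The counterpart for putting the channels back together is cv.merge; a minimal sketch:

```python
# Recombine three single-channel images into one BGR image
img = cv.merge((b, g, r))
```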
Zero out one channel (numpy broadcasting)

```python
img[:, :, 2] = 0   # sets the red channel (index 2 in BGR) to zero
```
set border
```python
RED = [255, 0, 0]
img = cv.imread("gh.jpg")
# copyMakeBorder(image, top, bottom, left, right, border type)
# repeat the edge pixels
replicate = cv.copyMakeBorder(img, 100, 100, 100, 100, cv.BORDER_REPLICATE)
# mirror around the border, edge pixels included
reflect = cv.copyMakeBorder(img, 100, 100, 100, 100, cv.BORDER_REFLECT)
# mirror around the border, but the edge pixels themselves are not repeated
reflect101 = cv.copyMakeBorder(img, 100, 100, 100, 100, cv.BORDER_REFLECT_101)
# wrap around, roughly: fghi|abcdefghi|abcd
wrap = cv.copyMakeBorder(img, 100, 100, 100, 100, cv.BORDER_WRAP)
# fill with a constant value
constant = cv.copyMakeBorder(img, 100, 100, 100, 100, cv.BORDER_CONSTANT, value=RED)

plt.subplot(231), plt.imshow(img, "gray"), plt.title("original")
plt.subplot(232), plt.imshow(replicate, "gray"), plt.title("replicate")
plt.subplot(233), plt.imshow(reflect, "gray"), plt.title("reflect")
plt.subplot(234), plt.imshow(reflect101, "gray"), plt.title("reflect101")
plt.subplot(235), plt.imshow(wrap, "gray"), plt.title("wrap")
plt.subplot(236), plt.imshow(constant, "gray"), plt.title("constant")
plt.show()
```
result:
image addition
Both images should have the same size and type, or the second operand can be a scalar that is added to every pixel by broadcasting.
cv.add is a saturating operation: anything above the maximum value is clipped to the maximum (255 for uint8).
numpy addition is a modulo operation: anything above the maximum wraps around to the remainder.
Prefer the OpenCV operation.
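A minimal sketch of the difference with uint8 values:

```python
x = np.uint8([250])
y = np.uint8([10])
print(cv.add(x, y))   # [[255]] -> 250 + 10 = 260 is clipped to 255 (saturation)
print(x + y)          # [4]     -> 260 % 256 = 4 (wrap-around)
```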
image fusion
img1=cv.imread("gh.jpg") img2=cv.imread("gy.jpg") print(img1.shape) #If the two images are not the same size, they need to be converted to the same size img2.resize((1080, 1920,3)) print(img2.shape) #Fusion, (picture 1, picture 1 weight, picture 2, picture 2 weight, gamma value) dst=cv.addWeighted(img1,0.1,img2,0.9,0) cv.imshow("dst",dst) cv.waitKey(0)
bitwise operations
img1=cv.imread("gh.jpg") img2=cv.imread("gy.jpg") #Get the length and width of the small image and the number of channels rows,cols,channels=img2.shape #Open area for easy sticking roi=img1[0:rows,0:cols] #Convert to black and white img2gray=cv.cvtColor(img2,cv.COLOR_BGR2GRAY) #take mask ret,mask=cv.threshold(img2gray,10,255,cv.THRESH_BINARY) #Invert the mask mask_inv=cv.bitwise_not(mask) #add up img1_bg=cv.bitwise_and(roi,roi,mask=mask_inv) img2_bg=cv.bitwise_and(img2,img2,mask=mask) dst=cv.add(img1_bg,img2_bg) img1[0:rows,0:cols]=dst cv.imshow("res",img1) cv.waitKey(0)
Performance measurement and improvement techniques
Use cv.getTickCount() together with cv.getTickFrequency() to time a piece of code; time.time() can be used the same way. For example:
```python
import time

img1 = cv.imread("gy.jpg")
e1 = cv.getTickCount()
a = time.time()
for i in range(5, 49, 2):
    img1 = cv.medianBlur(img1, i)
e2 = cv.getTickCount()
b = time.time()
e = (e2 - e1) / cv.getTickFrequency()   # tick count -> seconds
c = b - a
print(e, "seconds", c, "seconds")
```
result:
performance optimization techniques
There are several techniques and coding styles for getting the most out of Python and Numpy. Only the relevant points are noted here, with links to the main references. The key rule is to implement the algorithm simply first; once it runs, profile it, find the bottlenecks, and optimize those.
1. Try to avoid using loops in Python, especially double/triple loops, etc. They are inherently slow.
2. Since Numpy and OpenCV are optimized for vector operations, vectorize the algorithm/code as much as possible (see the sketch after this list).
3. Take advantage of cache coherence.
4. Never create a copy of an array unless necessary. Try using views instead. Array copying is an expensive operation.
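As a rough illustration of points 1 and 2, here is a hypothetical timing sketch comparing a Python double loop with the equivalent vectorized numpy expression (image size and values are arbitrary):

```python
import time
import numpy as np

img = np.random.randint(0, 246, (480, 640, 3), dtype=np.uint8)

t0 = time.time()
out_loop = np.empty_like(img)
for y in range(img.shape[0]):           # slow: explicit double loop in Python
    for x in range(img.shape[1]):
        out_loop[y, x] = img[y, x] + 10
t1 = time.time()

out_vec = img + 10                      # fast: one vectorized numpy operation
t2 = time.time()

print("loop:", t1 - t0, "s   vectorized:", t2 - t1, "s")
```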
Even after doing all this, if your code is still slow, or unavoidably needs to use large loops, use other libraries like Cython to make it faster.