Introduction
Recently, while tinkering with a small object-detection project, I needed to capture the desktop screen in real time and run detection on each captured frame, so that targets on the desktop could be detected as they appear. I wrote a short script to do this. In my own tests it was fast enough for my needs, so I am recording it here.
The code
```python
import argparse
import sys
import time

import cv2
import keyboard
import mss
import numpy as np
import win32com.client
import win32con
import win32gui


class ScreenCapture:
    """
    Parameters
    ----------
    screen_frame : Tuple[int, int]
        Screen width and height in pixels (x, y).
    region : Tuple[float, float]
        Fraction of the screen to capture along x and y. (1.0, 1.0) captures
        the full screen; smaller values shrink the capture area, which always
        stays centered on the screen.
    window_name : str
        Title of the display window.
    exit_key : int
        Key code that closes the window, compared against
        cv2.waitKey(1) & 0xFF. Defaults to the ESC key (0x1B).
    """

    def __init__(self, screen_frame=(1920, 1080), region=(0.5, 0.5), window_name='test', exit_key=0x1B):
        self.parser = argparse.ArgumentParser()
        # Two floats, e.g. --region 0.5 0.5 (type=tuple would split the raw string into characters)
        self.parser.add_argument('--region', type=float, nargs=2, default=region,
                                 help='capture range as two fractions X Y; 1.0 1.0 captures the full screen, '
                                      'smaller values shrink the area (always centered on the screen)')
        self.parser_args = self.parser.parse_args()

        self.cap = mss.mss()  # Instantiate mss with its default settings

        self.screen_width = screen_frame[0]  # screen width
        self.screen_height = screen_frame[1]  # screen height
        # Coordinates of the center point of the screen
        self.mouse_x, self.mouse_y = self.screen_width // 2, self.screen_height // 2

        # Capture area: width and height
        self.GAME_WIDTH = int(self.screen_width * self.parser_args.region[0])
        self.GAME_HEIGHT = int(self.screen_height * self.parser_args.region[1])
        # Capture area: origin (top-left corner)
        self.GAME_LEFT = int(self.screen_width // 2 * (1. - self.parser_args.region[0]))
        self.GAME_TOP = int(self.screen_height // 2 * (1. - self.parser_args.region[1]))

        # Display window size
        self.RESIZE_WIN_WIDTH, self.RESIZE_WIN_HEIGHT = self.screen_width // 4, self.screen_height // 4

        self.mointor = {
            'left': self.GAME_LEFT,
            'top': self.GAME_TOP,
            'width': self.GAME_WIDTH,
            'height': self.GAME_HEIGHT
        }

        self.window_name = window_name
        self.Exit_key = exit_key

    def grab_screen_mss(self, monitor):
        # cap.grab captures the image, np.array converts it into an array,
        # and cvtColor converts BGRA to BGR, dropping the alpha channel
        return cv2.cvtColor(np.array(self.cap.grab(monitor)), cv2.COLOR_BGRA2BGR)

    def run(self):
        SetForegroundWindow_f = 0  # 1 if the window should be kept on top
        while True:
            # ctrl+U toggles whether the window is kept on top
            if keyboard.is_pressed('ctrl+U'):
                while keyboard.is_pressed('ctrl+U'):  # wait until the keys are released
                    continue
                if SetForegroundWindow_f == 0:
                    SetForegroundWindow_f = 1
                    time.sleep(1)
                    continue
                else:
                    SetForegroundWindow_f = 0

            img = self.grab_screen_mss(self.mointor)
            cv2.namedWindow(self.window_name, cv2.WINDOW_NORMAL)  # WINDOW_NORMAL scales the image to the window size
            cv2.resizeWindow(self.window_name, self.RESIZE_WIN_WIDTH, self.RESIZE_WIN_HEIGHT)
            cv2.imshow(self.window_name, img)

            if SetForegroundWindow_f == 1:
                shell = win32com.client.Dispatch("WScript.Shell")
                shell.SendKeys('%')  # send ALT first, a common workaround so SetForegroundWindow is permitted
                hwnd = win32gui.FindWindow(None, self.window_name)
                win32gui.SetForegroundWindow(hwnd)
                win32gui.ShowWindow(hwnd, win32con.SW_SHOW)

            if cv2.waitKey(1) & 0xFF == self.Exit_key:  # Default: ESC
                cv2.destroyAllWindows()
                sys.exit("Finish")
```
Code explanation
The implementation uses the mss library to take screenshots and the OpenCV library to display and process the images.
First, the argparse library parses the command-line arguments, which set the size of the capture region.
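As an illustration of the argument-parsing step, here is a minimal sketch of one way to read the region as two floats. The helper name `parse_region` and the range check are assumptions made for this example, not part of the script. Note that argparse applies `type` to the raw command-line string, so `type=tuple` would turn a value such as `0.5,0.5` into a tuple of single characters; `nargs=2` with `type=float` avoids that.

```python
import argparse


def parse_region(argv=None, default=(0.5, 0.5)):
    """Parse --region as two floats in (0, 1] and return them as an (x, y) tuple.

    parse_region is a hypothetical helper written for this example.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--region",
        type=float,
        nargs=2,
        metavar=("X", "Y"),
        default=list(default),
        help="fraction of the screen to capture along x and y; 1.0 1.0 is full screen",
    )
    args = parser.parse_args(argv)
    x, y = args.region
    # Reject values that would give an empty or oversized capture area
    if not (0.0 < x <= 1.0 and 0.0 < y <= 1.0):
        parser.error("--region values must be in the range (0, 1]")
    return (x, y)
```

On the command line this would be invoked as `--region 0.5 0.5`, with the two values separated by a space.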
Then an mss screenshot object, cap, is instantiated.
Next, the screen width and height are stored, and the coordinates of the screen's center point are computed.
After that, the width, height and origin (top-left corner) of the capture area are computed from the region parameter and saved in the variable mointor.
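The region arithmetic can be sketched as a standalone function. This is a simplified version under the assumption that the capture area is always centered. The name `centered_region` is hypothetical, and the origin is computed as half of the leftover space, which is a slightly different formula from the one in the script but gives the same result for typical inputs.

```python
def centered_region(screen_size, region):
    """Return an mss-style monitor dict for a capture area centered on the screen.

    centered_region is a hypothetical helper written for this example.
    """
    screen_w, screen_h = screen_size
    frac_x, frac_y = region
    width = int(screen_w * frac_x)
    height = int(screen_h * frac_y)
    # Half of the leftover space on each axis puts the area in the center
    left = (screen_w - width) // 2
    top = (screen_h - height) // 2
    return {"left": left, "top": top, "width": width, "height": height}
```

For a 1920x1080 screen and a region of (0.5, 0.5), this gives a 960x540 area whose top-left corner is at (480, 270).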
The grab_screen_mss function captures the image with cap.grab, converts it into an array with np.array, and then uses cvtColor to convert BGRA to BGR, which removes the alpha (transparency) channel.
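To show what the BGRA to BGR conversion does to the pixel data, here is a pure-Python sketch that drops every fourth byte from a flat buffer. It is for illustration only: the function name `bgra_to_bgr` is an assumption, and in the script cv2.cvtColor performs the same step on the NumPy array far more efficiently.

```python
def bgra_to_bgr(raw, width, height):
    """Drop the alpha byte of each pixel in a flat BGRA buffer and return flat BGR bytes.

    bgra_to_bgr is a hypothetical helper written for this example.
    """
    if len(raw) != width * height * 4:
        raise ValueError("buffer size does not match width * height * 4")
    bgr = bytearray(width * height * 3)
    bgr[0::3] = raw[0::4]  # blue channel
    bgr[1::3] = raw[1::4]  # green channel
    bgr[2::3] = raw[2::4]  # red channel; raw[3::4] (alpha) is discarded
    return bytes(bgr)
```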
The run function loops continuously and checks whether ctrl+U has been pressed. Each press toggles whether the display window is kept on top.
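The toggle logic can also be written as a small state machine that flips a flag when the key is released, without pausing the capture loop while the key is held. This is a sketch of an alternative, not the approach the script takes. The class name `ReleaseToggle` is hypothetical, and it assumes the caller passes in the current key state on every loop iteration, for example the result of `keyboard.is_pressed('ctrl+u')`.

```python
class ReleaseToggle:
    """Flip a boolean flag each time a key goes from pressed to released.

    ReleaseToggle is a hypothetical helper written for this example.
    """

    def __init__(self):
        self.enabled = False
        self._was_pressed = False

    def update(self, pressed):
        """Feed the current key state and return the (possibly flipped) flag."""
        if self._was_pressed and not pressed:
            # Key was held on the previous call and is now released: toggle
            self.enabled = not self.enabled
        self._was_pressed = pressed
        return self.enabled
```

Because `update` returns immediately, frames keep being captured and displayed while the key is held down.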
It then calls grab_screen_mss to take a screenshot, displays the image with cv2, and sets the size of the display window.
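To put a number on the capture speed on your own machine, one option is to time repeated calls to the grab function. The sketch below is a generic timing helper. The name `measure_fps` is an assumption, and the result depends on the hardware and on the size of the capture area.

```python
import time


def measure_fps(grab, frames=100):
    """Call grab() `frames` times and return the average number of calls per second.

    measure_fps is a hypothetical helper written for this example.
    """
    if frames <= 0:
        raise ValueError("frames must be positive")
    start = time.perf_counter()
    for _ in range(frames):
        grab()
    elapsed = time.perf_counter() - start
    return frames / elapsed if elapsed > 0 else float("inf")
```

With an existing ScreenCapture instance sc, it could be called as `measure_fps(lambda: sc.grab_screen_mss(sc.mointor))`.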
If the window should stay on top, the win32com and win32gui libraries are used to bring it to the foreground.
Finally, cv2.waitKey polls for a key press, and pressing the ESC key exits the program.
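The exit check can be isolated into a small function, which also makes the role of the `& 0xFF` mask explicit. This is a minimal sketch: `is_exit_key` is a hypothetical name, and it relies on cv2.waitKey returning -1 when no key was pressed within the delay.

```python
def is_exit_key(raw_key, exit_key=0x1B):
    """Return True if a cv2.waitKey() result matches exit_key (default: ESC).

    is_exit_key is a hypothetical helper written for this example.
    """
    if raw_key < 0:  # waitKey returns -1 when no key was pressed in time
        return False
    # Keep only the low byte, since some platforms set higher bits in the result
    return (raw_key & 0xFF) == exit_key
```

In the loop it would be used as `if is_exit_key(cv2.waitKey(1), self.Exit_key): ...`.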
Call example
```python
sc = ScreenCapture()
sc.run()
```
Parameter explanation:
screen_frame : Tuple[int, int]
    Screen width and height in pixels (x, y).
region : Tuple[float, float]
    Fraction of the screen to capture along x and y. (1.0, 1.0) captures the full screen; smaller values shrink the capture area, which always stays centered on the screen.
window_name : str
    Title of the display window.
exit_key : int
    Key code that closes the window. The default is the ESC key (0x1B).
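Other
Key code values for the keys on the keyboard. These are Windows virtual-key codes rather than ASCII codes, although the two coincide for some keys, such as ESC (0x1B), the digits and the uppercase letters. cv2.waitKey typically reports character codes, so only the values that coincide with ASCII can be relied on as exit_key. (0x denotes hexadecimal; for example, the code of the DELETE key is 0x2E, which is 46 in decimal.)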
0x1 left mouse button
0x2 right mouse button
0x3 CANCEL key
0x4 middle mouse button
0x8 BACKSPACE key
0x9 TAB key
0xC CLEAR key
0xD ENTER key
0x10 SHIFT key
0x11 CTRL key
0x12 MENU (ALT) key
0x13 PAUSE key
0x14 CAPS LOCK key
0x1B ESC key
0x20 SPACEBAR key
0x21 PAGE UP key
0x22 PAGE DOWN key
0x23 END key
0x24 HOME key
0x25 LEFT ARROW key
0x26 UP ARROW key
0x27 RIGHT ARROW key
0x28 DOWN ARROW key
0x29 SELECT key
0x2A PRINT key
0x2B EXECUTE key
0x2C PRINT SCREEN (SNAPSHOT) key
0x2D INSERT key
0x2E DELETE key
0x2F HELP key
0x90 NUM LOCK key
The A to Z keys have the same codes as the uppercase ASCII letters A to Z:
value description
65 A key
66 B key
67 C key
68 D key
69 E key
70 F key
71 G key
72 H key
73 I key
74 J key
75 K key
76 L key
77 M key
78 N key
79 O key
80 P key
81 Q key
82 R key
83 S key
84 T key
85 U key
86 V key
87 W key
88 X key
89 Y key
90 Z key
The 0 to 9 keys have the same codes as the ASCII digits 0 to 9:
value description
48 0 key
49 1 key
50 2 key
51 3 key
52 4 key
53 5 key
54 6 key
55 7 key
56 8 key
57 9 key
The following constants represent keys on the numeric keypad:
value description
0x60 0 key
0x61 1 key
0x62 2 key
0x63 3 key
0x64 4 key
0x65 5 key
0x66 6 key
0x67 7 key
0x68 8 key
0x69 9 key
0x6A MULTIPLICATION SIGN (*) key
0x6B PLUS SIGN (+) key
0x6C ENTER key
0x6D MINUS SIGN (-) key
0x6E DECIMAL POINT (.) key
0x6F DIVISION SIGN (/) key
The following constants represent function keys:
value description
0x70 F1 key
0x71 F2 key
0x72 F3 key
0x73 F4 key
0x74 F5 key
0x75 F6 key
0x76 F7 key
0x77 F8 key
0x78 F9 key
0x79 F10 key
0x7A F11 key
0x7B F12 key
0x7C F13 key
0x7D F14 key
0x7E F15 key
0x7F F16 key
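Several of the tables above follow a regular pattern, so the codes can be computed instead of looked up. The sketch below assumes a naming scheme invented for this example: single letters and digits for the main keys, `NUM0` to `NUM9` for the numeric keypad, and `F1` to `F16` for the function keys. The function name `key_code` is likewise hypothetical.

```python
def key_code(name):
    """Map a key name such as 'A', '7', 'NUM3' or 'F5' to its code in the tables above.

    key_code and its naming scheme are hypothetical, written for this example.
    """
    name = name.upper()
    # Letters and digits share their codes with the uppercase ASCII characters
    if len(name) == 1 and ("A" <= name <= "Z" or "0" <= name <= "9"):
        return ord(name)
    # Numeric keypad digits start at 0x60
    if name.startswith("NUM") and name[3:].isdecimal() and 0 <= int(name[3:]) <= 9:
        return 0x60 + int(name[3:])
    # Function keys F1 to F16 start at 0x70
    if name.startswith("F") and name[1:].isdecimal() and 1 <= int(name[1:]) <= 16:
        return 0x70 + int(name[1:]) - 1
    raise ValueError(f"unknown key name: {name!r}")
```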