Summary of Selenium usage


from selenium import webdriver: import library

1, Declare browser objects

  • browser = webdriver.Chrome(): Google Chrome
  • browser = webdriver.Firefox(): Firefox
  • browser = webdriver.Edge(): Edge
  • browser = webdriver.Safari(): Apple Safari

2, Basic method

  1. browser.maximize_window(): maximize the window
  2. browser.get(url): request the url in the browser
  3. browser.close(): close the current window
  4. browser.quit(): close the browser
  5. browser.page_source: get the HTML code of the requested url
  6. browser.page_source.find('keyword'): find keyword in the HTML code (returns -1 if absent)
  7. browser.current_url: get the current url
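
The basic calls above can be combined into one small helper. This is only a minimal sketch: `fetch_page` is a hypothetical name, and the function assumes nothing beyond the methods and properties listed in this section.

```python
def fetch_page(browser, url, keyword):
    """Load `url`; return the final URL and whether `keyword` is in the HTML.

    `fetch_page` is a hypothetical helper; `browser` is any WebDriver object.
    """
    browser.maximize_window()
    browser.get(url)
    html = browser.page_source
    # str.find returns the index of the first match, or -1 if there is none.
    found = html.find(keyword) != -1
    return browser.current_url, found


# Usage with a real driver (needs Chrome and a matching driver installed):
#   from selenium import webdriver
#   browser = webdriver.Chrome()
#   try:
#       print(fetch_page(browser, "https://example.com", "Example"))
#   finally:
#       browser.quit()  # close the whole browser, not just the window
```

Calling quit() in a finally block is a design choice of the sketch, so the browser process does not stay open if the page load fails.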

3, Find nodes

(1) Single node

  1. browser.find_element_by_tag_name(): find by tag name
  2. browser.find_element_by_id(): find by id
  3. browser.find_element_by_class_name(): find by class name
  4. browser.find_element_by_name(): find by the name attribute
  5. browser.find_element_by_link_text(): find an <a> link by its text
  6. browser.find_element_by_css_selector(): find by CSS selector (id, class name, tag)
  7. browser.find_element_by_xpath(): find by XPath
  8.  # General method
     from selenium.webdriver.common.by import By
     browser.find_element(By.ID, 'q')


    Basically, 6 or 7 can replace the first 5. The find_element_by_* helpers were deprecated in Selenium 4 and removed in 4.3.0, so on current versions use the general method in 8.

(2) Multiple nodes (return list)

  1. browser.find_elements_by_tag_name()

  2. browser.find_elements_by_id()

  3. browser.find_elements_by_class_name()

  4. browser.find_elements_by_name()

  5. browser.find_elements_by_link_text()

  6. browser.find_elements_by_css_selector()

  7. browser.find_elements_by_xpath()

    Same as the single-node methods, with an s added after element; each returns a list.
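
Because the plural lookups return a list, a lookup that matches nothing gives an empty list instead of raising an exception. A minimal sketch of using that, written against the general find_elements(by, value) form: `first_or_none` and `attribute_values` are hypothetical helper names, and the locator is passed in by the caller.

```python
def first_or_none(browser, by, value):
    """Return the first matching node, or None when nothing matches."""
    nodes = browser.find_elements(by, value)  # empty list if no match
    return nodes[0] if nodes else None


def attribute_values(browser, by, value, attribute):
    """Collect one attribute from every matching node."""
    return [node.get_attribute(attribute)
            for node in browser.find_elements(by, value)]


# Usage with a real driver:
#   from selenium.webdriver.common.by import By
#   links = attribute_values(browser, By.TAG_NAME, "a", "href")
```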

4, Node interaction

(1) input node

  1. input_node.send_keys("keyword"): type text into the input box
  2. input_node.clear(): clear the input box

(2) button node

  • button_node.click(): click the button

5, Node information

  1. node.get_attribute('href/id/name'): get a node attribute
  2. node.text: text inside the node
  3. node.id: node id
  4. node.location: position of the node in the page (x, y)
  5. node.tag_name: tag name
  6. node.size: node size (width, height)
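
As a small illustration of these properties, location and size come back as dicts, so they can be combined directly. The sketch below assumes the usual 'x'/'y' and 'width'/'height' keys; `node_center` and `describe` are hypothetical helper names.

```python
def node_center(node):
    """Return the (x, y) centre of a node from its location and size."""
    loc = node.location   # dict with keys 'x' and 'y'
    size = node.size      # dict with keys 'width' and 'height'
    return (loc["x"] + size["width"] / 2,
            loc["y"] + size["height"] / 2)


def describe(node):
    """Gather the basic properties of a node into one dict."""
    return {
        "tag": node.tag_name,
        "text": node.text,
        "href": node.get_attribute("href"),
        "center": node_center(node),
    }
```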

6, Switch Frame

  1. browser.switch_to.frame('id or name'): switch to a sub page (iframe)
  2. browser.switch_to.parent_frame(): switch back to the parent page
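
It is easy to forget the switch back, after which lookups on the parent page fail. One possible way to pair the two calls is a context manager. This is a sketch only: `in_frame` is a hypothetical name, not part of Selenium.

```python
from contextlib import contextmanager


@contextmanager
def in_frame(browser, frame):
    """Switch into `frame`, then back to the parent page on exit."""
    browser.switch_to.frame(frame)
    try:
        yield browser
    finally:
        # Runs even if the body raises, so later lookups hit the parent page.
        browser.switch_to.parent_frame()


# Usage with a real driver:
#   with in_frame(browser, "iframeResult"):
#       ...  # find nodes inside the iframe here
```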

7, Action chain

from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable')
# The draggable demo lives inside an iframe, so switch into it first
browser.switch_to.frame('iframeResult')
source = browser.find_element(By.ID, 'draggable')
target = browser.find_element(By.ID, 'droppable')
actions = ActionChains(browser)
actions.drag_and_drop(source, target)
actions.perform()
  1. from selenium.webdriver import ActionChains: import the class
  2. actions = ActionChains(browser): pass in the browser object and create the actions object
  3. actions.drag_and_drop(source, target): queue a drag from the source node to the target node
  4. actions.perform(): execute the queued actions

8, Execute JS

  • browser.execute_script('window.scrollTo(0, document.body.scrollHeight)'): execute JS code, here scrolling the page to the bottom (JS names are case-sensitive)
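
On pages that load more content as you scroll, one scroll is often not enough. A minimal sketch that repeats the scroll until the page height stops growing: `scroll_to_bottom`, `pause`, and `max_rounds` are hypothetical names, and the one-second pause is an arbitrary assumption.

```python
import time


def scroll_to_bottom(browser, pause=1.0, max_rounds=20):
    """Scroll down until the page height stops growing; return rounds used."""
    last_height = browser.execute_script("return document.body.scrollHeight")
    for round_no in range(1, max_rounds + 1):
        browser.execute_script(
            "window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazily loaded content time to appear
        height = browser.execute_script("return document.body.scrollHeight")
        if height == last_height:
            return round_no
        last_height = height
    return max_rounds
```

A script that begins with return hands its value back to Python, which is how the height is read here.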

9, Cookies

  1. browser.get_cookies(): get the cookies of the current web page
  2. browser.delete_all_cookies(): delete all cookies
  3. browser.add_cookie(): add a cookie
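
A common use of these three calls is to keep a login between runs. The following is a sketch under assumptions: `save_cookies` and `load_cookies` are hypothetical helpers, the file path is up to the caller, and the browser is assumed to already be on the site the cookies belong to before they are added.

```python
import json


def save_cookies(browser, path):
    """Write the current page's cookies to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(browser.get_cookies(), fh)


def load_cookies(browser, path):
    """Replace the browser's cookies with the ones stored in `path`."""
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    browser.delete_all_cookies()
    for cookie in cookies:
        browser.add_cookie(cookie)  # each cookie is a dict with name/value
    return len(cookies)
```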

10, Delay waiting

When accessing web pages with Selenium, a page may not have finished loading when the next command runs, so you sometimes need to wait until the content you need has loaded.

(1) Implicit waiting

With an implicit wait, if Selenium does not find a node in the DOM, it keeps retrying until the set time has elapsed and then throws an exception saying the node cannot be found.

  • browser.implicitly_wait(10): wait up to 10 s

(2) Explicit wait

Specify the node to find and a maximum wait time. If the node is loaded within that time, it is returned; if not, a timeout exception is thrown.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
wait = WebDriverWait(browser, 10)
# presence_of_element_located takes a single (By, value) tuple
input_node = wait.until(EC.presence_of_element_located((By.ID, 'q')))
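
To make the behaviour concrete, here is a simplified illustration of the polling idea behind an explicit wait. It is not Selenium's implementation: `wait_until` and `WaitTimeout` are hypothetical names, and the 0.5 s poll interval is an assumption chosen for the sketch.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition is still falsy after the timeout."""


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition()` repeatedly; return its first truthy result."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)
```

The condition is always checked at least once, so a node that is already present is returned without any delay.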

11, Exceptions

  1. NoSuchElementException: the node was not found
  2. TimeoutException: the wait timed out

12, Prevent Selenium from being blocked

from selenium import webdriver
from selenium.webdriver import ChromeOptions

option = ChromeOptions()
option.add_experimental_option('excludeSwitches', ['enable-automation'])
# Turn off automation extension information (must be set before the browser is created)
option.add_experimental_option('useAutomationExtension', False)
browser = webdriver.Chrome(options=option)

# Hide navigator.webdriver from scripts on every new page
browser.execute_cdp_cmd('Page.addScriptToEvaluateOnNewDocument', {'source': 'Object.defineProperty(navigator,"webdriver",{get:()=>undefined})'})
browser.get('https://antispider1.scrape.cuiqingcai.com/')

13, Do not pop up browser window

from selenium import webdriver
from selenium.webdriver import ChromeOptions

option = ChromeOptions()
option.add_argument('--headless')
browser = webdriver.Chrome(options=option)

Posted by wyred on Mon, 30 May 2022 18:12:30 +0530